Design and Evaluation of Architectures for Commercial Applications

Post on 08-Jan-2016


Western Research Laboratory

Design and Evaluation of Architectures for Commercial Applications

Luiz André Barroso

Part I: benchmarks

2 UPC, February 1999

Why should architects learn about commercial applications?

Because they are very different from typical benchmarks
Because they are demanding on many interesting architectural features
Because they are driving the sales of mid-range and high-end systems

Shortcomings of popular benchmarks

SPEC
  – uniprocessor-oriented
  – small cache footprints
  – exacerbates impact of CPU core issues
SPLASH
  – small cache footprints
  – extremely optimized sharing
STREAM
  – no real sharing/communication
  – mainly bandwidth-oriented

SPLASH vs. Online Transaction Processing (OLTP)

A typical SPLASH app. has
  – > 3x the issue rate,
  – ~26x fewer cycles spent in memory barriers,
  – 1/4 of the TLB miss ratio,
  – < 1/2 the fraction of cache-to-cache transfers,
  – ~22x smaller instruction cache miss ratio,
  – ~1/2 the L2$ miss ratio
...of an OLTP app.

But the real reason we care? $$$!

Server market:
  – Total: > $50 billion
  – Numeric/scientific computing: < $2 billion
  – Remaining $48 billion?
    – OLTP
    – DSS
    – Internet/Web
Trend is for numerical/scientific to remain a niche

Relevance of server vs. PC market

High profit margins
Performance is a differentiating factor
If you sell the server you will probably sell:
  – the client
  – the storage
  – the networking infrastructure
  – the middleware
  – the service
  – ...

Need for speed in the commercial market

Applications pushing the envelope
  – Enterprise resource planning (ERP)
  – Electronic commerce
  – Data mining/warehousing
  – ADSL servers
Specialized solutions
  – Intel splitting Pentium line into 3 tiers
  – Oracle’s raw iron initiative
  – Network Appliance’s machines

Seminar disclaimer

Hardware-centric approach:
  – target is to build better machines, not better software
  – focus on fundamental behavior, not on software “features”
Stick to general-purpose paradigm
Emphasis on CPU+memory system issues
Lots of things missing:
  – object-relational and object-oriented databases
  – public domain/academic database engines
  – many others

Overview

Day I: Introduction and workloads
  – Background on commercial applications
  – Software structure of a commercial RDBMS
  – Standard benchmarks
    – TPC-B
    – TPC-C
    – TPC-D
    – TPC-W
  – Cost and pricing trends
  – Scaling down TPC benchmarks

Overview (2)

Day II: Evaluation methods/tools
  – Introduction
  – Software instrumentation (ATOM)
  – Hardware measurement & profiling
    – IPROBE
    – DCPI
    – ProfileMe
  – Tracing & trace-driven simulation
  – User-level simulators
  – Complete machine simulators (SimOS)

Overview (3)

Day III: Architecture studies
  – Memory system characterization
  – Out-of-order processors
  – Simultaneous multithreading
  – Final remarks

Background on commercial applications

Database applications:
  – Online Transaction Processing (OLTP)
    – massive number of short queries
    – read/update indexed tables
    – canonical example: banking system
  – Decision Support Systems (DSS)
    – smaller number of complex queries
    – mostly read-only over large (non-indexed) tables
    – canonical example: business analysis

Background (2)

Web/Internet applications
  – Web server
    – many requests for small/medium files
  – Proxy
    – many short-lived connection requests
    – content caching and coherence
  – Web search index
    – DSS with a Web front-end
  – E-commerce site
    – OLTP with a Web front-end

Background (3)

Common characteristics
  – Large amounts of data manipulation
  – Interactive response times required
  – Highly multithreaded by design
    – suitable for large multiprocessors
  – Significant I/O requirements
  – Extensive/complex interactions with the operating system
  – Require robustness and resiliency to failures

Database performance bottlenecks

I/O-bound until recently (Thakkar, ISCA’90)
Many improvements since then
  – multithreading of DB engine
  – I/O prefetching
  – VLM (very large memory) database caching
  – more efficient OS interactions
  – RAIDs
  – non-volatile DRAM (NVDRAM)
Today’s bottlenecks:
  – Memory system
  – Processor architecture

Structure of a database workload

[Figure: three tiers connected in sequence]
  – clients
  – Application server (optional): simple logic checks; formulates and issues DB query
  – Database server: executes query

Who is who in the database market?

DB engine:
  – Oracle is dominant
  – other players: Microsoft, Sybase, Informix
Database applications:
  – SAP is dominant
  – other players: Oracle Apps, PeopleSoft, Baan
Hardware:
  – players: Sun, IBM, HP and Compaq

Who is who in the database market? (2)

Historically, mainly mainframe proprietary OS
Today:
  – Unix: 40%
  – NT: 8%
  – Proprietary: 52%
In two years:
  – Unix: 46%
  – NT: 19%
  – Proprietary: 35%

Overview of an RDBMS: Oracle8

Similar in structure to most commercial engines
Runs on:
  – uniprocessors
  – SMP multiprocessors
  – NUMA multiprocessors*
For clusters or message-passing multiprocessors:
  – Oracle Parallel Server (OPS)

The Oracle RDBMS

Physical structure
  – Control files
    – basic info on the database, its structure and status
  – Data files
    – tables: actual database data
    – indexes: sorted list of pointers to data
    – rollback segments: keep data for recovery upon a failed transaction
  – Log files
    – compressed storage of DB updates

Index files

Critical in speeding up access to data by avoiding expensive scans
The more selective the index, the faster the access
Drawbacks:
  – Very selective indexes may occupy lots of storage
  – Updates to indexed data are more expensive

Files or raw disk devices

Most DB engines can directly access disks as raw devices
Idea is to bypass the file system
Manageability/flexibility somewhat compromised
Performance boost not large (~10-15%)
Most customer installations use file systems

Transactions & rollback segments

Single transaction can access/update many items
Atomicity is required:
  – transaction either happens or not
  – old value of balance(X) is kept in a rollback segment
  – rollback: old values restored, all locks released

Example: bank transfer

    Transaction A (accounts X,Y; value M) {
        read account balance(X)
        subtract M from balance(X)
        add M to balance(Y)
        commit
    }

[Figure: a "failure" marker on the transaction, indicating a failure before commit]
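
The bank-transfer example can be tried out with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for a database engine; the `account` table, the `transfer` function, and the `fail_midway` flag are hypothetical names introduced here for illustration, and SQLite's rollback journal only plays the role that the rollback segment plays in the slide.

```python
import sqlite3

def transfer(conn, x, y, m, fail_midway=False):
    """Move m from account x to account y as one atomic transaction."""
    try:
        cur = conn.cursor()
        # subtract M from balance(X); the engine keeps the old value for undo
        cur.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (m, x))
        if fail_midway:
            raise RuntimeError("simulated failure before commit")
        # add M to balance(Y)
        cur.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (m, y))
        conn.commit()
        return True
    except RuntimeError:
        conn.rollback()  # old values restored, locks released
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("X", 100), ("Y", 50)])
conn.commit()
```

Calling `transfer(conn, "X", "Y", 30, fail_midway=True)` leaves both balances untouched, while the same call without the flag moves the money.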

Transactions & log files

A transaction is only committed after its side effects are in stable storage
Writing all modified DB blocks would be too expensive
  – random disk writes are costly
  – a whole DB block has to be written back
  – no coalescing of updates
Alternative: write only a log of modifications
  – sequential I/O writes (enables NVDRAM optimizations)
  – batching of multiple commits
Background process periodically writes dirty data blocks out

Transactions & log files (2)

When a block is written to disk the log file entries are deleted
If the system crashes:
  – in-memory dirty blocks are lost
Recovery procedure:
  – goes through the log files and applies all updates to the database
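
The commit, background-write, crash, and recovery steps can be illustrated with a toy model. This is a minimal sketch under simplifying assumptions, not how any real engine stores its log: the `TinyDB` class and its method names are hypothetical, "disk" and "log" are plain Python containers standing in for stable storage, and all log entries are dropped at once when dirty blocks are flushed.

```python
class TinyDB:
    """Toy redo-log model: commits go to a sequential log, blocks stay dirty in memory."""

    def __init__(self):
        self.disk = {}    # stable storage: block id -> value
        self.log = []     # stable storage: sequential redo log
        self.cache = {}   # volatile memory: dirty blocks

    def commit(self, updates):
        # Commit by appending one log record; data blocks are not written yet.
        self.log.append(dict(updates))
        self.cache.update(updates)

    def background_write(self):
        # Background process: flush dirty blocks, then delete their log entries.
        self.disk.update(self.cache)
        self.cache.clear()
        self.log.clear()

    def crash(self):
        # In-memory dirty blocks are lost; disk and log survive.
        self.cache.clear()

    def recover(self):
        # Go through the log in order and apply all updates to the database.
        for record in self.log:
            self.disk.update(record)
```

A committed update survives a crash even though its data block was never written, because recovery replays the log.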

Transactions & concurrency control

Many transactions in-flight at any given time
  – Locking of data items is required
Lock granularity:
  [Figure: granularity levels Table, Block, Row, with arrows labeled "concurrency" and "overhead"]
Efficient row-level locking is needed for high transaction throughput

Row-level locking

Each new transaction is assigned a unique ID
A transaction table keeps track of all active transactions
Lock: write ID in directory entry for row
Unlock: remove ID from transaction table
Simultaneous release of all locks

[Figure: a data block whose directory entries hold transaction ID 233, next to a transaction table listing IDs such as 120, 230, 233, 234, 235]
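
The locking scheme can be sketched in a few lines. This is an illustrative toy model, not Oracle's implementation: the `RowLockManager` class and its methods are hypothetical names, and a single data block with a fixed number of rows is assumed.

```python
class RowLockManager:
    """Toy model: one transaction table plus the row directory of one data block."""

    def __init__(self, rows_per_block):
        self.next_txn_id = 1
        self.active = set()                        # transaction table
        self.directory = [None] * rows_per_block   # data block directory

    def begin(self):
        txn_id = self.next_txn_id                  # each transaction gets a unique ID
        self.next_txn_id += 1
        self.active.add(txn_id)
        return txn_id

    def lock(self, txn_id, row):
        holder = self.directory[row]
        if holder is not None and holder != txn_id and holder in self.active:
            return False                           # row is held by a live transaction
        self.directory[row] = txn_id               # lock: write ID in the row's entry
        return True

    def end(self, txn_id):
        # Unlock: removing the ID from the transaction table releases every row
        # it tagged at once; stale directory entries are ignored by lock().
        self.active.discard(txn_id)
```

The design point the sketch tries to show is that releasing locks costs one update to the transaction table, regardless of how many rows the transaction touched.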

Transaction read consistency

A transaction that reads a full table should see a consistent snapshot
For performance, reads shouldn’t lock a table
Problem: intervening writes
Solution: leverage rollback mechanism
  – intervening write saves old value in rollback segment

Oracle: software structure

Server processes
  – actual execution of transactions
DB writer
  – flushes dirty blocks to disk
Log writer
  – writes redo logs to disk at commit time
Process and system monitors
  – misc. activity monitoring and recovery
Processes communicate through SGA and IPC

Oracle: software structure (2)

SGA:
  – shared memory segment mapped by all processes
Block buffer area
  – cache of database blocks
  – larger portion of physical memory
Metadata area
  – where most communication takes place
  – synchronization structures
  – shared procedures
  – directory information

[Figure: System Global Area (SGA) memory layout by increasing virtual address; labeled regions: Fixed region, Block buffer area, Redo buffers, Data dictionary, Shared pool, Metadata area]

Oracle: software structure (3)

Hiding I/O latency:
  – many server processes/processor
  – large block buffer area
Process dynamics:
  – server reads/updates database (allocates entries in the redo buffer pool)
  – at commit time server signals Log writer and sleeps
  – Log writer wakes up, coalesces multiple commits and issues log file write
  – after log is written, Log writer signals suspended servers
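
The commit hand-off between servers and the Log writer can be modeled without real processes. The sketch below is a single-threaded simulation under stated assumptions: the `LogWriter` class, its method names, and the list standing in for the log file are all hypothetical, and sleeping and waking are represented by a waiting list rather than by signals.

```python
class LogWriter:
    """Single-threaded model of commit coalescing (group commit)."""

    def __init__(self):
        self.redo_buffer = []   # redo entries allocated by server processes
        self.waiting = []       # servers sleeping until their commit is durable
        self.log_file = []      # each element represents one log file write

    def add_redo(self, txn_id, change):
        # A server updates the database and allocates a redo buffer entry.
        self.redo_buffer.append((txn_id, change))

    def request_commit(self, txn_id):
        # At commit time the server signals the Log writer and goes to sleep.
        self.waiting.append(txn_id)

    def flush(self):
        # The Log writer coalesces all pending commits into one log file write,
        # then returns the servers that should be woken up.
        if not self.waiting:
            return []
        self.log_file.append(list(self.redo_buffer))
        self.redo_buffer.clear()
        woken, self.waiting = self.waiting, []
        return woken
```

Two commits that arrive before the next flush share a single log write, which is the batching effect the slide describes.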

Oracle: NUMA issues

Single SGA region complicates NUMA localization
Single log writer process becomes a bottleneck
Oracle8 is incorporating NUMA-friendly optimizations
Current large NUMA systems use OPS even on a single address space

Oracle Parallel Server (OPS)

Runs on clusters of SMPs/NUMAs
Layered on top of RDBMS engine
Shared data through disk
Performance very dependent on how well data can be partitioned
Not supported by most application vendors

Running Oracle: other issues

Most memory allocated to block buffer area
Need to eliminate OS double buffering
Best performance attained by limiting process migration
In large SMPs, dedicating one processor to I/O may be advantageous

TPC Database Benchmarks

Transaction Processing Performance Council (TPC)
  – Established about 10 years ago
  – Mission: define representative benchmark standards for vendors (hardware/software) to compare their products
  – Focus on both performance and price/performance
  – Strict rules about how the benchmark is run
  – Only widely used benchmarks

TPC pricing rules

Must include
  – All hardware
    – server, I/O, networking, switches, clients
  – All software
    – OS, any middleware, database engine
  – 5-year maintenance contract
Can include usual discounts
Audited components must be products

TPC history of benchmarks

TPC-A
  – First OLTP benchmark
  – Based on Jim Gray’s Debit-Credit benchmark
TPC-B
  – Simpler version of TPC-A
  – Meant as a stress test of the server only
TPC-C
  – Current TPC OLTP benchmark
  – Much more complex than TPC-A/B
TPC-D
  – Current TPC DSS benchmark
TPC-W
  – New Web-based e-commerce benchmark

The TPC-B benchmark

Models a bank with many branches
1 transaction type: account update
Metrics:
  – tpsB (transactions/second)
  – $/tpsB
Scale requirement:
  – 1 tpsB needs 100,000 accounts

[Figure: schema with tables Branch, Teller, Account and History; each Branch has 10 Tellers and 100,000 Accounts]

    Begin transaction
        Update account balance
        Write entry in history table
        Update teller balance
        Update branch balance
    Commit

TPC-B: other requirements

System must be ACID
  – (A)tomicity
    – transactions either commit or leave the system as if they were never issued
  – (C)onsistency
    – transactions take system from a consistent state to another
  – (I)solation
    – concurrent transactions execute as if in some serial order
  – (D)urability
    – results of committed transactions are resilient to faults

The TPC-C benchmark

Current TPC OLTP benchmark
Moderately complex OLTP
Models a wholesale supplier managing orders
Workload consists of five transaction types
Users and database scale linearly with throughput
Specification was approved July 23, 1992

TPC-C: schema

[Figure: entity-relationship diagram. Legend: each box shows a table name and its cardinality; arrows are one-to-many relationships labeled with their fan-out; secondary indexes are marked. W is the number of warehouses.]

    Table        Cardinality
    Warehouse    W
    District     W*10
    Customer     W*30K
    History      W*30K+
    Order        W*30K+
    New-Order    W*5K
    Order-Line   W*300K+
    Stock        W*100K
    Item         100K (fixed)

One-to-many relationships (fan-out):
  – Warehouse to District: 10
  – District to Customer: 3K
  – Customer to History: 1+
  – Customer to Order: 1+
  – Order to Order-Line: 10-15
  – Order to New-Order: 0-1
  – Warehouse to Stock: 100K
  – Item to Stock: W

TPC-C: transactions

New-order: enter a new order from a customer
Payment: update customer balance to reflect a payment
Delivery: deliver orders (done as a batch transaction)
Order-status: retrieve status of customer’s most recent order
Stock-level: monitor warehouse inventory

TPC-C: transaction flow

[Figure: per-terminal loop of three steps]
1. Select txn from menu (Measure menu Response Time):
     1. New-Order     45%
     2. Payment       43%
     3. Order-Status   4%
     4. Delivery       4%
     5. Stock-Level    4%
2. Input screen (Keying time)
3. Output screen (Measure txn Response Time; Think time)
Go back to 1
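
The menu percentages define a weighted random choice at step 1 of the loop. A minimal sketch of a driver picking transactions with that mix follows; the `TXN_MIX` table and `pick_transactions` function are hypothetical names, and a real TPC-C driver also models keying and think times, which are omitted here.

```python
import random

# Transaction mix from the menu above, in percent.
TXN_MIX = {
    "New-Order": 45,
    "Payment": 43,
    "Order-Status": 4,
    "Delivery": 4,
    "Stock-Level": 4,
}

def pick_transactions(n, seed=0):
    """Draw n transaction types according to the mix (seeded for repeatability)."""
    rng = random.Random(seed)
    names = list(TXN_MIX)
    weights = list(TXN_MIX.values())
    return rng.choices(names, weights=weights, k=n)
```

Since only New-Order transactions count toward tpmC, roughly 45% of the work the driver generates shows up in the reported metric.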

TPC-C: other requirements

Transparency
  – tables can be split horizontally and vertically provided it is hidden from the application
Skew
  – 1% of new-order txn are to a random remote warehouse
  – 15% of payment txn are to a random remote warehouse
Metrics:
  – performance: new-order transactions/minute (tpmC)
  – cost/performance: $/tpmC

TPC-C: scale

Maximum of 12 tpmC per warehouse
Consequently:
  – A quad-Xeon system today (~20,000 tpmC) needs
    – over 1668 warehouses
    – over 1 TB of disk storage!!
That’s a VERY expensive benchmark to run!
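
The warehouse count follows from dividing the target throughput by the per-warehouse cap. The sketch below does that arithmetic using the 12 tpmC limit quoted on the slide; `min_warehouses` and `customer_rows` are hypothetical helper names, and the 30K customers per warehouse comes from the schema slide.

```python
import math

MAX_TPMC_PER_WAREHOUSE = 12       # cap quoted on the slide
CUSTOMERS_PER_WAREHOUSE = 30_000  # Customer cardinality W*30K from the schema

def min_warehouses(tpmc):
    """Smallest number of warehouses that permits the given throughput."""
    return math.ceil(tpmc / MAX_TPMC_PER_WAREHOUSE)

def customer_rows(tpmc):
    """Customer table size implied by that many warehouses."""
    return min_warehouses(tpmc) * CUSTOMERS_PER_WAREHOUSE
```

At exactly 20,000 tpmC this gives 1,667 warehouses and about 50 million customer rows, in the same range as the slide's approximate figure.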

TPC-C: side effects of the skew rules

Very small fraction of transactions go to remote warehouses
Transparency rules allow data partitioning
Consequence:
  – Clusters of powerful machines show exceptional numbers
  – Compaq has current TPC-C record of over 100 KtpmC with an 8-node memory channel cluster
Skew rules are expected to change in the future

The TPC-D benchmark

Current DSS benchmark from TPC
Moderately complex decision support workload
Models a worldwide reseller of parts
Queries ask real-world business questions
17 ad hoc DSS queries (Q1 to Q17)
2 update queries

TPC-D: schema

[Figure: schema diagram; SF is the scale factor]

    Table      Cardinality
    Customer   SF*150K
    LineItem   SF*6000K
    Order      SF*1500K
    Supplier   SF*10K
    Nation     25
    Region     5
    PartSupp   SF*800K
    Part       SF*200K

TPC-D: scale

Unlike TPC-C, scale not tied to performance
Size determined by a Scale Factor (SF)
  – SF = {1,10,30,100,300,1000,3000,10000}
  – SF=1 means a 1GB database size
Majority of current results are in the 100GB and 300GB range
Indices and temporary tables can significantly increase the total disk capacity (3-5x is typical)
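
Table sizes and disk needs at a given scale factor can be estimated with simple arithmetic. The sketch below uses the cardinalities from the schema slide; the function names are hypothetical, and the disk estimate assumes the "3-5x" figure means total capacity is three to five times the raw database size, which is one reading of the slide.

```python
# Rows per unit of scale factor, from the schema slide.
ROWS_PER_SF = {
    "Customer": 150_000,
    "LineItem": 6_000_000,
    "Order": 1_500_000,
    "Supplier": 10_000,
    "PartSupp": 800_000,
    "Part": 200_000,
}
FIXED_ROWS = {"Nation": 25, "Region": 5}  # independent of SF

def table_rows(sf):
    """Row count of every table at scale factor sf."""
    rows = {name: per_sf * sf for name, per_sf in ROWS_PER_SF.items()}
    rows.update(FIXED_ROWS)
    return rows

def disk_estimate_gb(sf, low=3, high=5):
    """(min, max) total disk in GB, assuming SF=1 is 1 GB of raw data."""
    return (sf * low, sf * high)
```

For a 300GB configuration this suggests 1.8 billion LineItem rows and somewhere between 900 GB and 1.5 TB of disk.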

TPC-D example query

Forecasting Revenue Query (Q6)
  – This query quantifies the amount of revenue increase that would have resulted from eliminating company-wide discounts in a given percentage range in a given year. Asking this type of “what if” query can be used to look for ways to increase revenues
  – Considers all line-items shipped in a year
  – Query definition:

    SELECT SUM(L_EXTENDEDPRICE*L_DISCOUNT) AS REVENUE
    FROM LINEITEM
    WHERE L_SHIPDATE >= DATE '[DATE]'
      AND L_SHIPDATE < DATE '[DATE]' + INTERVAL '1' YEAR
      AND L_DISCOUNT BETWEEN [DISCOUNT] - 0.01 AND [DISCOUNT] + 0.01
      AND L_QUANTITY < [QUANTITY]

TPC-D execution rules

Power Test
  – Queries submitted in a single stream (i.e., no concurrency)
  – Each Query Set is a permutation of the 17 read-only queries
  [Figure: timeline: Cache Flush, then Query Set 0 (optional; warm-up, not timed), then the Timed Sequence: UF1, Query Set 0, UF2]
Throughput Test
  – Multiple concurrent query streams
  – Single update stream
  [Figure: Query Set 1, Query Set 2, ..., Query Set N running concurrently, alongside an Updates stream: UF1 UF2, UF1 UF2, ..., UF1 UF2]

TPC-D: metrics

Power Metric (QppD)
  – Geometric Mean
Throughput (QthD)
  – Arithmetic Mean
Both Metrics represent “Queries per Gigabyte Hour”

$$\mathrm{QppD@Size} = \frac{3600 \times SF}{\sqrt[19]{\prod_{i=1}^{17} QI(i,0) \times \prod_{j=1}^{2} UI(j,0)}}$$

where
  – QI(i,0): Timing Interval for Query i, stream 0
  – UI(j,0): Timing Interval for Update j, stream 0
  – SF: Scale Factor

$$\mathrm{QthD@Size} = \frac{S \times 17 \times 3600}{T_S} \times SF$$

where:
  – S: number of query streams
  – T_S: elapsed time of test (in seconds)

TPC-D: metrics (2)

Composite Query-Per-Hour Rating (QphD)
  – The Power and Throughput metrics are combined to get the composite queries per hour.

$$\mathrm{QphD@Size} = \sqrt{\mathrm{QppD@Size} \times \mathrm{QthD@Size}}$$

Reported metrics are:
  – Power: QppD@Size
  – Throughput: QthD@Size
  – Price/Performance: $/QphD@Size
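
The three metrics are straightforward to compute from measured timings. The sketch below assumes the TPC-D definitions: power is 3600 times SF divided by the geometric mean of the 19 timing intervals (17 queries plus 2 updates), throughput is S times 17 times 3600 divided by the elapsed seconds, times SF, and the composite is the geometric mean of the two. The function names are hypothetical.

```python
import math

def qppd(sf, query_secs, update_secs):
    """Power metric: 3600 * SF / geometric mean of the 19 timing intervals."""
    if len(query_secs) != 17 or len(update_secs) != 2:
        raise ValueError("expected 17 query intervals and 2 update intervals")
    intervals = list(query_secs) + list(update_secs)
    # Geometric mean computed through logarithms to avoid overflow.
    geo_mean = math.exp(sum(math.log(t) for t in intervals) / len(intervals))
    return 3600.0 * sf / geo_mean

def qthd(sf, streams, elapsed_secs, queries_per_stream=17):
    """Throughput metric: queries completed per hour, scaled by SF."""
    return streams * queries_per_stream * 3600.0 / elapsed_secs * sf

def qphd(power, throughput):
    """Composite metric: geometric mean of power and throughput."""
    return math.sqrt(power * throughput)
```

Because the power metric uses a geometric mean, halving the time of any one query improves the score by the same factor, whether that query was the longest or the shortest.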

TPC-D: other issues

Queries are complex and long-running
Crucial that DB engine parallelizes queries for acceptable performance
Quality of query parallelizer is the most important factor
Large improvements are still observed from generation to generation of software

The TPC-W benchmark

Just introduced
Represents a business that markets and sells over the Internet
Includes security/authentication
Uses dynamically generated pages (e.g. cgi-bins)
Metric: Web Interactions Per Second (WIPS)
Transactions:
  – Browse, shopping-cart, buy, user-registration, and search

56 UPC, February 1999

A look at current audited TPC-C systems

Leader in price/performance:
  Compaq ProLiant 7000-6/450, MS SQL 7.0, NT
    – 4x 450MHz Xeons, 2MB cache, 4GB DRAM, 1.4 TB disk
    – 22,479 tpmC, $18.84/tpmC
Leader in non-cluster performance:
  Sun Enterprise 6500, Sybase 11.9, Solaris7
    – 24x 336MHz UltraSPARC IIs, 4MB cache, 24 GB DRAM, 4TB disk
    – 53,050 tpmC, $76.00/tpmC
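Since the TPC-C price/performance figure is the total priced system cost divided by tpmC, multiplying the two reported numbers gives the implied total price of each audited configuration. A small Python sketch of that arithmetic, using only the figures on this slide (the function name is illustrative):

```python
def implied_system_price(tpmc: float, dollars_per_tpmc: float) -> float:
    """Total priced system cost implied by throughput and price/performance."""
    return tpmc * dollars_per_tpmc


# (tpmC, $/tpmC) as quoted on the slide.
systems = {
    "Compaq ProLiant 7000-6/450": (22_479, 18.84),
    "Sun Enterprise 6500": (53_050, 76.00),
}

for name, (tpmc, price_perf) in systems.items():
    total = implied_system_price(tpmc, price_perf)
    print(f"{name}: about ${total:,.0f}")
```

This works out to roughly $0.42M for the Compaq system and roughly $4.03M for the Sun system: about 2.4x the throughput for about 9.5x the price.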

57 UPC, February 1999

Audited TPC-C systems: price breakdown

Server sub-component prices

                   $/CPU        $/MB DRAM   $/GB Disk
Compaq ProLiant    $4,816.00    $3.92       $145.33
Sun E6500          $15,375.00   $9.16       $382.03

[Figure: Server Price Breakdown. Stacked bars (0% to 100% of server price) for Compaq ProLiant and Sun E6500; components: Base, CPU, Memory, Disk.]
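To get a feel for how the unit prices translate into absolute component costs, the sketch below multiplies the per-CPU and per-MB prices from the table by the CPU counts and DRAM sizes quoted for the two audited systems. It is only an illustration: the function name is made up, and it assumes 1 GB = 1024 MB, which the slides do not state. Disk is left out because the slides do not say how the quoted capacities were rounded.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes; the slides do not specify


def cpu_and_dram_cost(n_cpus: int, dram_gb: int,
                      price_per_cpu: float, price_per_mb: float):
    """Return (CPU cost, DRAM cost) in dollars for one server configuration."""
    cpu_cost = n_cpus * price_per_cpu
    dram_cost = dram_gb * MB_PER_GB * price_per_mb
    return cpu_cost, dram_cost


# CPU counts and DRAM sizes come from the audited-systems slide.
compaq = cpu_and_dram_cost(4, 4, 4816.00, 3.92)
sun = cpu_and_dram_cost(24, 24, 15375.00, 9.16)

for name, (cpu, dram) in (("Compaq ProLiant", compaq), ("Sun E6500", sun)):
    print(f"{name}: CPUs ${cpu:,.2f}, DRAM ${dram:,.2f}")
```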

58 UPC, February 1999

Using TPC benchmarks for architecture studies

Brute force approach: use full audit-sized system
  Who can afford it?
  How can you run it on top of a simulator?
  How can you explore a wide design space?
Solution: scaling down the size

59 UPC, February 1999

Careful Scaling of Workloads

Identify architectural issue under study
Apply appropriate scaling to simplify monitoring and enable simulation studies
Most scaling experiments on real machines
  simulation-only is not a viable option!
Validation through sanity checks and comparison with audit-sized runs

60 UPC, February 1999

Scaling OLTP

Forget about TPC compliance
Determine lower bound on DB size
  monitor contention for smaller tables/indexes
  DB size will change with number of processors
I/O bandwidth requirements vary with fraction of DB resident in memory
  completely in-memory run: no special I/O requirements
  favor more small disks vs. few large ones
  place all redo logs on a separate disk
  reduce OS double-buffering
Limit number of transactions executed

61 UPC, February 1999

Scaling OLTP (2)

Achieve representative cache behavior
  relevant data structures >> size of hardware caches (metadata area size is key)
  maintain same number of processes/CPU as larger run
Simplify setup by running clients on the server machine
  need to make lighter-weight versions of the clients
Ensure efficient execution
  excessive migration, idle time, OS or application spinning distort metrics
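The two cache-behavior rules above lend themselves to a quick automated sanity check before a scaled run. The Python sketch below is one possible form. Everything in it is hypothetical: the `RunConfig` fields, the function name, and in particular the factor of 10 used to stand in for ">>", which the slides do not quantify.

```python
from dataclasses import dataclass


@dataclass
class RunConfig:
    cpus: int
    server_processes: int
    metadata_bytes: int        # size of the DB metadata area
    cache_bytes_per_cpu: int   # largest hardware cache per CPU


def check_scaled_run(scaled: RunConfig, full: RunConfig,
                     min_ratio: float = 10.0) -> list:
    """Return a list of warnings; an empty list means both checks passed."""
    problems = []
    # Rule 1: relevant data structures must be much larger than the caches.
    total_cache = scaled.cpus * scaled.cache_bytes_per_cpu
    if scaled.metadata_bytes < min_ratio * total_cache:
        problems.append("metadata area is not much larger than the hardware caches")
    # Rule 2: same processes/CPU as the larger run (cross-multiplied to stay in integers).
    if scaled.server_processes * full.cpus != full.server_processes * scaled.cpus:
        problems.append("processes per CPU differ from the full-size run")
    return problems


MB = 1 << 20
full_run = RunConfig(cpus=8, server_processes=64,
                     metadata_bytes=512 * MB, cache_bytes_per_cpu=4 * MB)
good_scaled = RunConfig(cpus=4, server_processes=32,
                        metadata_bytes=256 * MB, cache_bytes_per_cpu=4 * MB)
bad_scaled = RunConfig(cpus=4, server_processes=16,
                       metadata_bytes=64 * MB, cache_bytes_per_cpu=4 * MB)

print(check_scaled_run(good_scaled, full_run))
print(check_scaled_run(bad_scaled, full_run))
```

A check like this only catches configuration mistakes; whether the scaled run actually behaves like the full-size one still has to be validated by measurement.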

62 UPC, February 1999

Scaling DSS

Determine lower bound DB size
  sufficient work in parallel section
Ensure representative cache behavior
  DB >> hardware caches
  maintain same number of processes/CPU as large run
Reduce execution time through sampling
Major difficulty is ensuring representative query plans
DSS results more volatile due to improvements in query optimizers

63 UPC, February 1999

Tuning, tuning, tuning

Ensure scaled workload is running efficiently
Requires a large number of monitoring runs on actual hardware platform
Resembles “black art” on Oracle
Self-tuning features in Microsoft SQL 7.0 are promising
  ability for user overrides is desirable, but missing

64 UPC, February 1999

Does Scaling Work?

65 UPC, February 1999

TPC-C: scaled vs. full size

Breakdown profile of CPU cycles
Platform: 8-proc. AlphaServer 8400

Category        TPC-C, scaled   TPC-C, full-size
1-issue         8%              11%
2-issue         8%              8%
tlb             3%              1%
repl trap       5%              2%
br/pc mispr.    2%              3%
mb              3%              6%
scache hit      17%             22%
bcache hit      30%             20%
bcache miss     24%             27%
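One simple way to quantify how closely the scaled profile tracks the full-size one is to take per-category differences and report the largest gap, plus the total share of cycles that moved between categories. The Python sketch below does this with the percentages from this slide; the choice of metric and the function name are illustrative and do not come from the original study.

```python
# CPU-cycle breakdown in percent, as shown on the slide.
scaled = {"1-issue": 8, "2-issue": 8, "tlb": 3, "repl trap": 5,
          "br/pc mispr.": 2, "mb": 3, "scache hit": 17,
          "bcache hit": 30, "bcache miss": 24}
full = {"1-issue": 11, "2-issue": 8, "tlb": 1, "repl trap": 2,
        "br/pc mispr.": 3, "mb": 6, "scache hit": 22,
        "bcache hit": 20, "bcache miss": 27}


def compare_profiles(a: dict, b: dict):
    """Return (per-category differences a - b, category with largest gap, % of cycles shifted)."""
    if a.keys() != b.keys():
        raise ValueError("profiles must use the same categories")
    diffs = {k: a[k] - b[k] for k in a}
    worst = max(diffs, key=lambda k: abs(diffs[k]))
    # Half the sum of absolute differences: share of cycles that changed category.
    shifted = sum(abs(d) for d in diffs.values()) / 2
    return diffs, worst, shifted


diffs, worst, shifted = compare_profiles(scaled, full)
print(f"largest gap: {worst} ({diffs[worst]:+d} points)")
print(f"cycles shifted between categories: {shifted:.0f}%")
```

For these two profiles the largest gap is in bcache hits (10 points higher in the scaled run), and about 15% of cycles land in a different category overall.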

66 UPC, February 1999

Using simpler OLTP benchmarks:

Although “obsolete”, TPC-B can be used in architectural studies

Category        TPC-C, full-size   TPC-B, scaled
1-issue         11%                7%
2-issue         8%                 6%
tlb             1%                 2%
repl trap       2%                 5%
br/pc mispr.    3%                 2%
mb              6%                 9%
scache hit      22%                16%
bcache hit      20%                16%
bcache miss     27%                37%
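Grouping the categories makes the similarity between the two workloads easier to see. The Python sketch below sums the slide's percentages into three coarse groups; the grouping itself is an assumption made here for illustration (issue cycles, cycles attributed to the cache hierarchy, and everything else) and is not part of the original slides.

```python
# CPU-cycle breakdown in percent, as shown on the slide.
tpcc_full = {"1-issue": 11, "2-issue": 8, "tlb": 1, "repl trap": 2,
             "br/pc mispr.": 3, "mb": 6, "scache hit": 22,
             "bcache hit": 20, "bcache miss": 27}
tpcb_scaled = {"1-issue": 7, "2-issue": 6, "tlb": 2, "repl trap": 5,
               "br/pc mispr.": 2, "mb": 9, "scache hit": 16,
               "bcache hit": 16, "bcache miss": 37}

# Assumed grouping of the slide's categories (illustrative only).
GROUPS = {
    "issue": ("1-issue", "2-issue"),
    "memory": ("scache hit", "bcache hit", "bcache miss"),
    "other": ("tlb", "repl trap", "br/pc mispr.", "mb"),
}


def summarize(profile: dict) -> dict:
    """Sum a per-category profile into the coarse groups defined above."""
    return {group: sum(profile[c] for c in cats) for group, cats in GROUPS.items()}


for name, profile in (("TPC-C, full-size", tpcc_full), ("TPC-B, scaled", tpcb_scaled)):
    print(name, summarize(profile))
```

Under this grouping both workloads spend 69% of their cycles in the cache-hierarchy categories, even though TPC-B shifts more of that time from cache hits to board-cache misses.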

67 UPC, February 1999

Benchmarks wrap-up

Commercial applications are complex, but need to be considered during design evaluation
TPC benchmarks cover a wide range of commercial application areas
Scaled down TPC benchmarks can be used for architecture studies
Architect needs deep understanding of the workload