ExpoLab - Staring into the Abyss: An Evaluation of Concurrency … into... · 2020. 10. 3. ·...
Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
Xiangyao Yu (1), George Bezerra (1), Andrew Pavlo (2), Srinivas Devadas (1), Michael Stonebraker (1)
(1) CSAIL, Massachusetts Institute of Technology
(2) Dept. of Computer Science, Carnegie Mellon University
Published in VLDB 2014
Presenter: Vaibhav Jain

Motivation (1)
• The era of single-core CPU speed-up is over.
• The number of cores on a chip is increasing exponentially.
  - Computation power increases through thread-level parallelism.
  - 1000-core chips are near…
• Xeon Phi (up to 61 cores), Tilera (up to 100 cores)

Motivation (2)
• Is the DBMS ready to be scaled?
  - Most DBMSs still focus on single-threaded performance.
  - Existing work on multi-cores focuses on small core counts.

Objective
• To evaluate transaction processing at 1000 cores.
• Focus on one scalability challenge: concurrency control.
• Discuss the bottlenecks and the improvements needed.

Implementation
• Concurrency Control Schemes
• DBMS Test Bed

Concurrency Control Schemes

Class                      CC Scheme   Description
Two-Phase Locking (2PL)    DL_DETECT   2PL with deadlock detection
                           NO_WAIT     2PL with non-waiting deadlock prevention
                           WAIT_DIE    2PL with wait-and-die deadlock prevention
Timestamp Ordering (T/O)   TIMESTAMP   Basic T/O algorithm
                           MVCC        Multi-version T/O
                           OCC         Optimistic concurrency control
Partitioning               HSTORE      T/O with partition-level locking

Two-Phase Locking (1)
Two-Phase Locking (2)
• Lock conflict
  - DL_DETECT: always wait (deadlock detection).
  - NO_WAIT: always abort (deadlock prevention).
  - WAIT_DIE: wait if older, otherwise abort (deadlock prevention).
• Example systems
  - Ingres, Informix, IBM DB2, MS SQL Server, MySQL (InnoDB)

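The three lock-conflict policies above can be sketched as a single decision function. This is a minimal illustration, not the code of the evaluated DBMS: the names `Action` and `on_lock_conflict` are hypothetical, and it assumes that a smaller timestamp means an older transaction.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    ABORT = "abort"


def on_lock_conflict(scheme: str, requester_ts: int, holder_ts: int) -> Action:
    """Decide what a transaction does when the lock it wants is already held.

    Assumption: a smaller timestamp means an older transaction.
    """
    if scheme == "DL_DETECT":
        # Always wait; a separate deadlock detector breaks any cycles.
        return Action.WAIT
    if scheme == "NO_WAIT":
        # Never wait, so no wait-for cycle can form.
        return Action.ABORT
    if scheme == "WAIT_DIE":
        # Only an older requester may wait for a younger holder.
        return Action.WAIT if requester_ts < holder_ts else Action.ABORT
    raise ValueError(f"unknown scheme: {scheme}")
```

Under WAIT_DIE, waits only ever go from older to younger transactions, which is why no cycle of waiting transactions can form.
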
Timestamp Ordering (T/O) (1)
Each transaction has a unique timestamp indicating the serial order.
1. TIMESTAMP (Basic Timestamp Ordering)
   • A read/write request is rejected if the transaction's timestamp < timestamp of the last write.
2. MVCC (Multi-Version Concurrency Control)
   • Every write operation creates a new timestamped version.
   • For a read operation, the DBMS decides which version it accesses.

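A minimal sketch of both rules, under stated assumptions. The names `Record`, `to_read`, `to_write` and `mvcc_read` are hypothetical. The write check against the last read timestamp is the textbook basic T/O rule and is not stated on the slide. The MVCC read assumes the common policy of returning the newest version written at or before the reader's timestamp.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Record:
    rts: int = 0  # timestamp of the last read
    wts: int = 0  # timestamp of the last write


def to_read(rec: Record, ts: int) -> bool:
    """Basic T/O read: reject if a newer transaction already wrote the record."""
    if ts < rec.wts:
        return False
    rec.rts = max(rec.rts, ts)
    return True


def to_write(rec: Record, ts: int) -> bool:
    """Basic T/O write: reject if a newer transaction already read or wrote it."""
    if ts < rec.wts or ts < rec.rts:
        return False
    rec.wts = ts
    return True


def mvcc_read(versions: List[Tuple[int, str]], ts: int) -> Optional[str]:
    """Return the value of the newest version with write timestamp <= ts.

    `versions` must be sorted by write timestamp.
    """
    idx = bisect_right([wts for wts, _ in versions], ts)
    return versions[idx - 1][1] if idx else None
```

With multiple versions, a late reader is served an older version instead of being rejected, which is the main difference from basic T/O.
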
Timestamp Ordering (T/O) (2)
3. OCC (Optimistic Concurrency Control)
   • Each transaction works in a private workspace.
   • At commit time, if its accesses overlap with those of another transaction, the transaction is aborted and restarted.
   • Advantage: short contention period.
Example systems: Oracle, Postgres, MySQL (InnoDB), SAP HANA, MemSQL, MS Hekaton

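The OCC flow can be sketched with per-key version counters. This is a single-threaded illustration with hypothetical names (`Store`, `Txn`). It assumes validation means "nothing I read has changed since I read it". A real system would have to make validation and write-back atomic.

```python
class Store:
    """Shared database state: values plus a version counter per key."""

    def __init__(self, data=None):
        self.data = dict(data or {})
        self.version = {key: 0 for key in self.data}


class Txn:
    """A transaction that buffers its work in a private workspace."""

    def __init__(self, store: Store):
        self.store = store
        self.reads = {}   # key -> version seen at first read
        self.writes = {}  # key -> buffered new value

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        self.reads.setdefault(key, self.store.version.get(key, 0))
        return self.store.data.get(key)

    def write(self, key, value):
        self.writes[key] = value  # not visible to others until commit

    def commit(self) -> bool:
        # Validation: abort if any key we read was overwritten since.
        for key, seen in self.reads.items():
            if self.store.version.get(key, 0) != seen:
                return False
        # Write phase: publish buffered writes and bump versions.
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.version[key] = self.store.version.get(key, 0) + 1
        return True
```

If two transactions read and then write the same key, the first to commit succeeds and the second fails validation and must restart.
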
H-Store
• The database is divided into disjoint memory subsets called partitions.
• Each partition is protected by locks.
• A transaction acquires the locks of all partitions it needs to access.
• The DBMS assigns it a timestamp and adds it to the lock queues.

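A minimal sketch of partition-level locking with timestamp-ordered queues. `PartitionScheduler` and its methods are hypothetical names. It assumes a transaction may run only when it is at the head of the queue of every partition it touches.

```python
import heapq
import itertools


class PartitionScheduler:
    def __init__(self, num_partitions: int):
        # One timestamp-ordered lock queue per partition.
        self.queues = [[] for _ in range(num_partitions)]
        self._clock = itertools.count(1)

    def submit(self, txn_id: str, partitions) -> int:
        """Assign a timestamp and enqueue the transaction on each partition."""
        ts = next(self._clock)
        for p in partitions:
            heapq.heappush(self.queues[p], (ts, txn_id))
        return ts

    def can_run(self, txn_id: str, partitions) -> bool:
        """True if the transaction heads the queue of all its partitions."""
        return all(
            self.queues[p] and self.queues[p][0][1] == txn_id
            for p in partitions
        )

    def finish(self, txn_id: str, partitions) -> None:
        """Release the partition locks held by a finished transaction."""
        for p in partitions:
            if not self.queues[p] or self.queues[p][0][1] != txn_id:
                raise RuntimeError(f"{txn_id} does not hold partition {p}")
            heapq.heappop(self.queues[p])
```

A multi-partition transaction blocks every partition it touches until it finishes, so single-partition transactions queued behind it cannot start.
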
DBMS Test Bed (1)
Graphite: a CPU simulator that scales up to 1024 cores.
• Application threads are mapped to simulated core threads.
• Simulated threads are mapped to multiple processes on host machines.

DBMS Test Bed (2)
• Implemented a lightweight pthread-based DBMS.
• Allows different concurrency control schemes to be swapped in.
• Ensures there are no bottlenecks other than concurrency control.
• Reports transaction statistics.

General Optimizations
1. Memory allocation: custom malloc with a resizable memory pool for each thread.
2. Lock table: per-tuple locks instead of a centralized lock table.
3. Mutexes: avoid mutexes on the critical path.
   - For 2PL: the centralized deadlock detector.
   - For T/O: allocating unique timestamps.

Scalable 2PL
1. Deadlock detection
   - The deadlock detector is made lock-free by keeping local wait-for graphs.
   - A thread searches for cycles in a partial wait-for graph.
2. Lock thrashing
   - Holding locks until commit creates a bottleneck for concurrent transactions.
   - Timeout threshold: abort a transaction if its wait time exceeds the timeout.

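Both ideas can be sketched in a few lines. The function names and the dictionary representation of the wait-for graph are assumptions made for illustration. The sketch does not model how each thread keeps and shares its local part of the graph.

```python
from typing import Dict, Set


def finds_deadlock(waits_for: Dict[str, Set[str]], start: str) -> bool:
    """Search the (partial) wait-for graph for a cycle through `start`.

    `waits_for[t]` is the set of transactions that `t` is waiting on.
    """
    stack = list(waits_for.get(start, ()))
    seen = set()
    while stack:
        txn = stack.pop()
        if txn == start:
            return True  # we followed wait edges back to ourselves
        if txn in seen:
            continue
        seen.add(txn)
        stack.extend(waits_for.get(txn, ()))
    return False


def should_abort_on_timeout(wait_started: float, now: float, timeout: float) -> bool:
    """Timeout threshold: give up once the wait exceeds `timeout`."""
    return now - wait_started > timeout
```

For example, with `{"A": {"B"}, "B": {"C"}, "C": {"A"}}` a search started from `"A"` finds the cycle, while removing the edge from `"C"` to `"A"` makes the same search return `False`.
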
Scalable T/O
1. Timestamp allocation
   a) Batched atomic addition
      - The manager returns multiple timestamps per request.
   b) CPU clocks
      - Read the logical clock of the core and concatenate it with the thread ID.
      - Requires synchronized clocks.
   c) Hardware counters
      - Physically located at the center of the CPU.

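Options (a) and (b) can be sketched as follows. This is an illustration under assumptions: `BatchedTimestampAllocator`, `clock_timestamp` and `THREAD_ID_BITS` are hypothetical names, a `threading.Lock` stands in for a hardware atomic add, and 10 bits of thread ID is an assumed width (enough for 1024 threads).

```python
import threading


class BatchedTimestampAllocator:
    """Hands out timestamps in batches to cut contention on the counter."""

    def __init__(self, batch_size: int = 8):
        self.batch_size = batch_size
        self._next = 0
        self._lock = threading.Lock()  # stand-in for an atomic add

    def allocate_batch(self) -> range:
        # One synchronized update reserves `batch_size` timestamps.
        with self._lock:
            start = self._next
            self._next += self.batch_size
        return range(start, start + self.batch_size)


THREAD_ID_BITS = 10  # assumption: at most 1024 threads


def clock_timestamp(clock: int, thread_id: int) -> int:
    """Concatenate a core's clock value with the thread ID."""
    if not 0 <= thread_id < (1 << THREAD_ID_BITS):
        raise ValueError("thread_id out of range")
    return (clock << THREAD_ID_BITS) | thread_id
```

Batching divides the number of synchronized updates by the batch size. The clock-based scheme needs no shared counter at all, but timestamps are only ordered correctly across cores if the clocks are synchronized.
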
Evaluation: Read-Only Workload
• 2PL schemes are scalable for read-only benchmarks.
• Timestamp allocation limits scalability.
• Memory copy hurts performance.

Write Intensive (Medium Contention)
• NO_WAIT and WAIT_DIE scale better than the others.
• DL_DETECT is inhibited by lock thrashing.

Write Intensive (High Contention)
• Scaling stops at a small core count (64).
• NO_WAIT has good performance but falls off due to thrashing.
• OCC wins at 1000 cores, as one transaction always commits.

More Analysis
1. Short transactions => low lock contention.
   Longer transactions => timestamp allocation is not a bottleneck.
2. More read transactions => better throughput.
3. Multi-partition transactions => the H-Store scheme performs badly.
   Partitioned workloads => H-Store is the best algorithm.

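Bottlenecks Summary
[Table: one row per concurrency control scheme (DL_DETECT, NO_WAIT, WAIT_DIE, TIMESTAMP, MULTIVERSION, OCC, HSTORE); one column per bottleneck (Waiting (Thrashing), High Abort Rate, Timestamp Allocation, Multi-partition). Cell marks are not preserved.]
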
Summary
All algorithms fail to scale as the core count increases.
• Thrashing limits the scalability of 2PL algorithms.
• Timestamp allocation limits the scalability of T/O algorithms.

Project Ideas
• New concurrency control approaches to tackle the scalability problem.
• Hardware solutions to DBMS bottlenecks that cannot be solved in software.
• Hybrid approach: switch between schemes depending on the workload.

Questions

Thrashing
[Figure: transactions A, B, C, D locking and waiting on tuples x, y, z, u, v]