GNW01: In-Memory Processing for Databases
Transcript of GNW01: In-Memory Processing for Databases
![Page 1: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/1.jpg)
gluent.com 1
In-Memory Execution for Databases
Tanel Poder, a long time computer performance geek
![Page 2: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/2.jpg)
Intro: About me
• Tanel Põder
• Oracle Database Performance geek (18+ years)
• Exadata Performance geek
• Linux Performance geek
• Hadoop Performance geek
• CEO & co-founder:
Expert Oracle Exadata book
(2nd edition is out now!)
Instant promotion
![Page 3: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/3.jpg)
Gluent as a data virtualization layer
[Diagram labels: Gluent, Oracle, Teradata, NoSQL, Big Data Sources, MSSQL, App X, App Y, App Z]
Open Data Formats!
![Page 4: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/4.jpg)
Gluent Advisor
1. Analyzes DB storage use and access patterns for safe offloading
2. 500+ databases analyzed
3. 10+ PB analyzed – 81% offloadable
4. 2-24x query speedup
10 PB
Interested in analyzing your database?
http://gluent.com/whitepapers
![Page 5: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/5.jpg)
Tape is dead, disk is tape, flash is disk, RAM locality is king
Jim Gray, 2006
http://research.microsoft.com/en-us/um/people/gray/talks/flash_is_good.ppt
![Page 6: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/6.jpg)
Seagate Cheetah 15k RPM disk specs
200 MB/sec!
![Page 7: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/7.jpg)
Spinning disk IO throughput
• B-Tree index-walking disk-based RDBMS
• 15000 rpm spinning disks
• ~200 random IOPS per disk
• ~8 kB read per random IO
• 8 kB * 200 IOPS = 1.6 MB/sec per disk
• Full scanning based workloads
• Potentially much more data to access & filter
• Partition pruning, zone maps, storage indexes help to skip data [1]
• Scan only required columns (formats with large chunk sizes)
• Sequential IO rate up to 200 MB/sec per disk
[1] http://www.dbms2.com/2013/05/27/data-skipping/
However, index scans can read only a subset of data
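The per-disk arithmetic on this slide can be checked with a few lines of Python. This is only a sketch using the slide's round numbers; the variable names are made up for illustration and 1 MB is taken as 1000 kB.

```python
# Rough per-disk throughput, using the round numbers from the slide.
RANDOM_IOPS = 200          # ~200 random IOPS per 15k RPM disk
IO_SIZE_KB = 8             # ~8 kB read per random IO
SEQUENTIAL_MB_S = 200.0    # sequential IO rate, up to 200 MB/sec per disk

# Index-walking workload: throughput is IOPS * IO size.
random_mb_s = RANDOM_IOPS * IO_SIZE_KB / 1000

# How many times more data a sequential scan can move per second.
scan_advantage = SEQUENTIAL_MB_S / random_mb_s

print(f"random IO:     {random_mb_s:.1f} MB/sec per disk")
print(f"sequential IO: {SEQUENTIAL_MB_S:.0f} MB/sec per disk")
print(f"ratio:         {scan_advantage:.0f}x")
```

With these numbers a sequential scan moves roughly two orders of magnitude more bytes per second than random 8 kB reads, which is the gap that data skipping and index access are competing against.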
![Page 8: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/8.jpg)
Scanning a bunch of spinning disks can keep your CPUs really busy!
* Not even talking about flash or RAM here!
![Page 9: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/9.jpg)
A simple query bottlenecked by CPU
9 GB scanned, processed in 7 seconds:
~1300 MB/s in PX
~80 MB/s per slave
![Page 10: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/10.jpg)
A complex query bottlenecked by CPU
Complex Query: Much more CPU spent on aggregations, joins.
9 GB processed in 1.5 minutes
9 GB / 90 seconds = ~100 MB/s PX
6 MB/s per slave
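The scan rates quoted for the simple and the complex query follow from the same 9 GB divided by the two elapsed times. A small sketch, assuming 1 GB = 1024 MB; the slave counts it derives are an inference from the per-slave rates and are not stated on the slides.

```python
SCANNED_MB = 9 * 1024      # 9 GB scanned by both queries

def scan_rate_mb_s(seconds: float) -> float:
    """Aggregate scan rate across all parallel execution (PX) slaves."""
    return SCANNED_MB / seconds

simple_rate = scan_rate_mb_s(7)      # simple query: 7 seconds
complex_rate = scan_rate_mb_s(90)    # complex query: 1.5 minutes

# Inferred, not stated on the slides: aggregate rate divided by the quoted
# per-slave rate hints at how many slaves were working.
simple_slaves = simple_rate / 80     # ~80 MB/s per slave
complex_slaves = complex_rate / 6    # ~6 MB/s per slave

print(f"simple:  {simple_rate:.0f} MB/s, about {simple_slaves:.0f} slaves")
print(f"complex: {complex_rate:.0f} MB/s, about {complex_slaves:.0f} slaves")
```

Both queries read the same data, so the drop in MB/s per slave reflects extra CPU work per byte rather than slower storage.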
![Page 11: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/11.jpg)
If disks and storage subsystems are getting so fast, why all the buzz around in-memory database systems?
* Can't we just cache the old database files in RAM?
![Page 12: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/12.jpg)
A simple Data Retrieval test!
• Retrieve 1% of the rows out of an 8 GB table:
SELECT COUNT(*)
     , SUM(order_total)
FROM orders
WHERE warehouse_id BETWEEN 500 AND 510
The Warehouse IDs range between 1 and 999
Test data generated by SwingBench tool
![Page 13: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/13.jpg)
Data Retrieval: Test Results
• Remember, this is a very simple scanning + filtering query:

TESTNAME                   PLAN_HASH   ELA_MS   CPU_MS      LIOS  BLK_READ
------------------------- ---------- -------- -------- --------- ---------
test1: index range scan *   16715356   265203    37438    782858    511231
test2: full buffered */ C  630573765   132075    48944   1013913    849316
test3: full direct path *  630573765    15567    11808   1013873   1013850
test4: full smart scan */  630573765     2102      729   1013873   1013850
test5: full inmemory scan  630573765      155      155        14         0
test6: full buffer cache   630573765     7850     7831   1014741         0

Test 5 & Test 6 run entirely from memory
Source: http://www.slideshare.net/tanelp/oracle-database-inmemory-option-in-action
But why 50x difference in CPU usage?
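The "50x" figure comes from comparing the CPU_MS column of the two tests that ran entirely from memory. A minimal sketch that recomputes it from the table; the short dictionary keys are just labels chosen for this example.

```python
# (ELA_MS, CPU_MS, LIOS, BLK_READ) copied from the test results table.
results = {
    "test1": (265203, 37438, 782858, 511231),    # index range scan
    "test2": (132075, 48944, 1013913, 849316),   # full scan, buffered
    "test3": (15567, 11808, 1013873, 1013850),   # full scan, direct path
    "test4": (2102, 729, 1013873, 1013850),      # full scan, smart scan
    "test5": (155, 155, 14, 0),                  # full in-memory scan
    "test6": (7850, 7831, 1014741, 0),           # full scan from buffer cache
}

CPU_MS = 1  # position of the CPU_MS value in each tuple

# Both tests read zero blocks from disk, so the gap is pure CPU work.
cpu_ratio = results["test6"][CPU_MS] / results["test5"][CPU_MS]

print(f"buffer cache scan used {cpu_ratio:.1f}x the CPU of the in-memory scan")
```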
![Page 14: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/14.jpg)
Tape is dead, disk is tape, flash is disk, RAM locality is king
Jim Gray, 2006
http://research.microsoft.com/en-us/um/people/gray/talks/flash_is_good.ppt
![Page 15: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/15.jpg)
Latency Numbers Every Programmer Should Know

Latency Comparison Numbers
--------------------------
L1 cache reference                           0.5 ns
Branch mispredict                            5   ns
L2 cache reference                           7   ns                      14x L1 cache
Mutex lock/unlock                           25   ns
Main memory reference                      100   ns                      20x L2 cache, 200x L1 cache
Compress 1K bytes with Zippy             3,000   ns        3 us
Send 1K bytes over 1 Gbps network       10,000   ns       10 us
Read 4K randomly from SSD*             150,000   ns      150 us          ~1GB/sec SSD
Read 1 MB sequentially from memory     250,000   ns      250 us
Round trip within same datacenter      500,000   ns      500 us
Read 1 MB sequentially from SSD*     1,000,000   ns    1,000 us    1 ms  ~1GB/sec SSD, 4X memory
Disk seek                           10,000,000   ns   10,000 us   10 ms  20x datacenter roundtrip
Read 1 MB sequentially from disk    20,000,000   ns   20,000 us   20 ms  80x memory, 20X SSD
Send packet CA->Netherlands->CA    150,000,000   ns  150,000 us  150 ms

Source: https://gist.github.com/jboner/2841832
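The multipliers in the right-hand column of the table are plain ratios of the nanosecond values. A short sketch that recomputes a few of them; only a subset of rows is included and the helper name is made up for this example.

```python
# Nanosecond values copied from the latency table (subset).
LATENCY_NS = {
    "L1 cache reference": 0.5,
    "L2 cache reference": 7,
    "Main memory reference": 100,
    "Read 1 MB sequentially from memory": 250_000,
    "Read 1 MB sequentially from SSD": 1_000_000,
    "Read 1 MB sequentially from disk": 20_000_000,
}

def times_slower(slow: str, fast: str) -> float:
    """How many times longer the 'slow' operation takes than the 'fast' one."""
    return LATENCY_NS[slow] / LATENCY_NS[fast]

ram_vs_l1 = times_slower("Main memory reference", "L1 cache reference")
disk_vs_ram = times_slower("Read 1 MB sequentially from disk",
                           "Read 1 MB sequentially from memory")
ssd_vs_ram = times_slower("Read 1 MB sequentially from SSD",
                          "Read 1 MB sequentially from memory")

print(f"main memory vs L1 cache:   {ram_vs_l1:.0f}x")
print(f"1 MB from disk vs memory:  {disk_vs_ram:.0f}x")
print(f"1 MB from SSD vs memory:   {ssd_vs_ram:.0f}x")
```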
![Page 16: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/16.jpg)
CPU = fast
CPU L2/L3 cache in between
RAM = slow
![Page 17: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/17.jpg)
RAM access is the bottleneck of modern computers
Waits for RAM access show up as CPU usage in monitoring tools
Want to wait less? Do it less!
![Page 18: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/18.jpg)
CPU & cache friendly data structures are key!
[Diagram: row-format data block. Block: headers, ITL entries; row directory (#0 offset, #1 offset, #2 offset, #3 offset, …); rows #0 to #8, … (each with hdr and row). Row: hdr byte, lock byte, CC byte, then repeated pairs of col. len and column data.]
• OLTP: Block->Row->Column format
• 8 kB blocks
• Great for writes, changes
• Field-length encoding
• Reading column #100 requires walking through all preceding columns
• Columns (with similar values) not densely packed together
• Not CPU cache friendly for analytics!
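Why field-length encoding forces a walk through the preceding columns can be shown with a toy row format. The sketch below is a simplification for illustration, not Oracle's actual block layout: it assumes a single length byte in front of every column value and no row header, and both function names are hypothetical.

```python
def encode_row(values: list[bytes]) -> bytes:
    """Toy length-prefixed row: [len][data][len][data]... (values < 256 bytes)."""
    out = bytearray()
    for value in values:
        out.append(len(value))   # one length byte per column (simplification)
        out += value
    return bytes(out)

def read_column(row: bytes, col_no: int) -> tuple[bytes, int]:
    """Return (column value, number of length bytes inspected to reach it)."""
    pos = 0
    inspected = 0
    for _ in range(col_no):
        # The only way to find the next column is to read this one's length.
        pos += 1 + row[pos]
        inspected += 1
    length = row[pos]
    inspected += 1
    return row[pos + 1 : pos + 1 + length], inspected

row = encode_row([str(i).encode() for i in range(100)])
first_value, first_cost = read_column(row, 0)     # first column: one step
last_value, last_cost = read_column(row, 99)      # 100th column: 100 steps
print(first_value, first_cost, last_value, last_cost)
```

The cost of reaching a column grows with its position in the row, and every step is a dependent memory read, which is the opposite of what an analytic scan over one column wants.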
![Page 19: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/19.jpg)
Scanning columnar data structures
[Diagram: scanning a column in a row-oriented data block vs. scanning a column in a column-oriented compression unit (col1 … col6)]
Read filter column(s) first. Access only projected columns if matches found.
Reduced memory traffic. More sequential RAM access, SIMD on adjacent data.
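The "filter column first, projected columns only on a match" idea can be sketched with one contiguous array per column. The data values and the function name are invented for this example, and plain Python does not use SIMD; the point is only which memory gets touched.

```python
from array import array

# Column-oriented layout: one contiguous array per column (made-up data).
warehouse_id = array("i", [501, 7, 505, 999, 510, 42])
order_total = array("d", [10.0, 99.0, 20.5, 1.0, 30.0, 5.0])

def count_and_sum(lo: int, hi: int) -> tuple[int, float]:
    """COUNT(*) and SUM(order_total) for warehouse_id BETWEEN lo AND hi."""
    # Step 1: scan only the filter column and remember matching positions.
    matches = [i for i, w in enumerate(warehouse_id) if lo <= w <= hi]
    # Step 2: touch the projected column only at those positions.
    total = sum(order_total[i] for i in matches)
    return len(matches), total

count, total = count_and_sum(500, 510)
print(count, total)
```

Rows that fail the filter never cause a read of `order_total`, and any other columns of the table are not read at all.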
![Page 20: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/20.jpg)
How to measure this stuff?
![Page 21: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/21.jpg)
CPU Performance Counters on Linux

# perf stat -d -p PID sleep 30

 Performance counter stats for process id '34783':

      27373.819908 task-clock                #    0.912 CPUs utilized
    86,428,653,040 cycles                    #    3.157 GHz
    32,115,412,877 instructions              #    0.37  insns per cycle
                                             #    2.39  stalled cycles per insn
     7,386,220,210 branches                  #  269.828 M/sec
        22,056,397 branch-misses             #    0.30% of all branches
    76,697,049,420 stalled-cycles-frontend   #   88.74% frontend cycles idle
    58,627,393,395 stalled-cycles-backend    #   67.83% backend cycles idle
       256,440,384 cache-references          #    9.368 M/sec
       222,036,981 cache-misses              #   86.584 % of all cache refs
       234,361,189 LLC-loads                 #    8.562 M/sec
       218,570,294 LLC-load-misses           #   93.26% of all LL-cache hits
        18,493,582 LLC-stores                #    0.676 M/sec
         3,233,231 LLC-store-misses          #    0.118 M/sec
     7,324,946,042 L1-dcache-loads           #  267.589 M/sec
       305,276,341 L1-dcache-load-misses     #    4.17% of all L1-dcache hits
        36,890,302 L1-dcache-prefetches      #    1.348 M/sec

      30.000601214 seconds time elapsed

Measure what's going on inside a CPU!
Metrics explained in my blog entry:
http://bit.ly/1PBIlde
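The derived figures that perf prints after the `#` are ratios of the raw counters, so they can be recomputed by hand. A minimal sketch using the counter values from the output above; the variable names are chosen for this example.

```python
# Raw counters copied from the perf stat output.
cycles = 86_428_653_040
instructions = 32_115_412_877
stalled_frontend = 76_697_049_420
cache_refs = 256_440_384
cache_misses = 222_036_981
l1_loads = 7_324_946_042
l1_load_misses = 305_276_341

ipc = instructions / cycles                        # instructions per cycle
stalled_per_insn = stalled_frontend / instructions
cache_miss_pct = 100 * cache_misses / cache_refs
l1_miss_pct = 100 * l1_load_misses / l1_loads

print(f"IPC:                     {ipc:.2f}")
print(f"stalled cycles per insn: {stalled_per_insn:.2f}")
print(f"cache miss ratio:        {cache_miss_pct:.2f}%")
print(f"L1 dcache miss ratio:    {l1_miss_pct:.2f}%")
```

An IPC well below 1 together with a high share of stalled cycles is the signature of a process that is counted as "on CPU" while it mostly waits for memory.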
![Page 22: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/22.jpg)
Testing data access path differences on Oracle 12c
SELECT COUNT(cust_valid) FROM customers_nopart c WHERE cust_id > 0
Run the same query on the same dataset stored in different formats/layouts.
Full details: http://blog.tanelpoder.com/2015/11/30/ram-is-the-new-disk-and-how-to-measure-its-performance-part-3-cpu-instructions-cycles/
Test result data: http://bit.ly/1RitNMr
![Page 23: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/23.jpg)
CPU instructions used for scanning/counting 69M rows
![Page 24: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/24.jpg)
Average CPU instructions per row processed
• Knowing that the table has about 69M rows, I can calculate the average number of instructions issued per row processed
![Page 25: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/25.jpg)
CPU cycles consumed (full scans only)
![Page 26: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/26.jpg)
CPU efficiency (Instructions-per-Cycle)
Yes, modern superscalar CPUs can execute multiple instructions per cycle
![Page 27: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/27.jpg)
Reducing memory writes within SQL execution
• Old approach:
1. Read compressed data chunk
2. Decompress data (write data to temporary memory location)
3. Filter out non-matching rows
4. Return data
• New approach:
1. Read and filter compressed columns
2. Decompress only required columns of matching rows
3. Return data
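The difference between the two approaches can be sketched on a dictionary-encoded column. The slides do not say which compression scheme is involved, so dictionary encoding is an assumption made for this illustration, and all data and function names are hypothetical.

```python
# Hypothetical dictionary-encoded column chunk plus one projected column.
dictionary = ["AMS", "LHR", "SFO", "TLL"]   # distinct values of the column
codes = [3, 0, 2, 3, 1, 3, 0]               # one small integer code per row
amounts = [10, 20, 30, 40, 50, 60, 70]      # projected column

def old_approach(wanted: str) -> list[int]:
    # Decompress every row into a temporary list first (N memory writes)...
    decompressed = [dictionary[c] for c in codes]
    # ...then filter out the non-matching rows.
    return [amounts[i] for i, v in enumerate(decompressed) if v == wanted]

def new_approach(wanted: str) -> list[int]:
    # Translate the predicate into a code once, then filter the codes directly.
    if wanted not in dictionary:
        return []
    code = dictionary.index(wanted)
    # Only the projected column of matching rows is materialized.
    return [amounts[i] for i, c in enumerate(codes) if c == code]

print(old_approach("TLL"), new_approach("TLL"))
```

Both functions return the same rows; the second one never writes a decompressed copy of the filter column to memory.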
![Page 28: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/28.jpg)
Memory reads & writes during internal processing
Unit = MB
Read only requested columns
Rows counted from chunk headers
Scan compressed data: few memory writes
![Page 29: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/29.jpg)
Past & Future
![Page 30: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/30.jpg)
Some commercial column store history
• Disk-optimized column stores
• Expressway 103 / Sybase IQ (early '90s)
• MonetDB (early '90s)
• Oracle Hybrid Columnar Compression (disk/OLTP optimized)
• …
• Memory-optimized column stores
• …
• SAP HANA (December 2010)
• IBM DB2 with BLU Acceleration (June 2013)
• Oracle Database 12c with In-Memory Option (July 2014)
• …
* Not addressing memory-optimized OLTP/row-stores here
![Page 31: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/31.jpg)
Future-proof Open Data Formats!
• Disk-optimized columnar data structures
• Apache Parquet
• https://parquet.apache.org/
• Apache ORC
• https://orc.apache.org/
• Memory / CPU-cache optimized data structures
• Apache Arrow
• Not only storage format
• … also a cross-system/cross-platform IPC communication framework
• https://arrow.apache.org/
![Page 32: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/32.jpg)
Future
1. RAM gets cheaper + bigger, not necessarily faster
2. CPU caches get larger
3. RAM blends with storage and becomes non-volatile
4. IO subsystems (flash) get even closer to CPUs
5. IO latencies shrink
6. The latency difference between non-volatile storage and volatile RAM shrinks - new database layouts!
7. CPU cache is king – new data structures needed!
![Page 33: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/33.jpg)
References
• Slides & Video of this presentation:
• http://www.slideshare.net/tanelp
• https://vimeo.com/gluent
• Index range scans vs full scans:
• http://blog.tanelpoder.com/2014/09/17/about-index-range-scans-disk-re-reads-and-how-your-new-car-can-go-600-miles-per-hour/
• RAM is the new disk series:
• http://blog.tanelpoder.com/2015/08/09/ram-is-the-new-disk-and-how-to-measure-its-performance-part-1/
• https://docs.google.com/spreadsheets/d/1ss0rBG8mePAVYP4hlpvjqAAlHnZqmuVmSFbHMLDsjaU/
![Page 34: GNW01: In-Memory Processing for Databases](https://reader031.fdocuments.in/reader031/viewer/2022030305/5870fc591a28ab5f528b5dbf/html5/thumbnails/34.jpg)
Thanks!
http://gluent.com/whitepapers
We are hiring developers & data engineers!!!
http://[email protected]
@tanelpoder