Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Transcript of Storage Systems for big data - HDFS, HBase, and intro to KV Store - Redis
Storage Systems for Big Data
Sameer Tiwari, Hadoop Storage Architect, Pivotal ([email protected], @sameertech)
Storage Hierarchy
- POSIX filesystem (*nix): general-purpose FS
- HDFS: large distributed storage, high aggregate throughput
- HBase: large indexed tables, fast random access, consistent
- Redis: in-memory KV store, extremely fast access
- Other KV store(s)
Hadoop Distributed File System (HDFS)
● History
○ Based on the Google File System paper (2003)
○ Built at Yahoo by a small team
● Goals
○ Tolerance to hardware failure
○ Sequential access as opposed to random
○ High aggregate throughput for large data sets
○ "Write Once Read Many" paradigm
HDFS - Key Components
● Clients (Client1 writing FileA, Client2 writing FileB) issue File.create() to the NameNode (metadata / NN ops) and File.write() to DataNodes (data blocks / DN ops)
● NameNode metadata:
○ FileA: metadata (e.g. size, owner, ...) with block locations AB1:D1, AB1:D3, AB1:D4; AB2:D1, AB2:D3, AB2:D4
○ FileB: metadata (e.g. size, owner, ...) with block locations BB1:D1, BB1:D2, BB1:D4
● DataNodes 1-4, spread across Rack 1 and Rack 2, hold the block replicas (AB1, AB2, BB1), written via replication pipelining
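The replication pipelining above can be sketched as a toy model (the names and structure are illustrative, not the real HDFS code; real HDFS streams packets through a chain of DataNodes, each forwarding to the next):

```python
# Toy model of HDFS replication pipelining: the client sends a block to the
# first DataNode, which stores it and forwards it to the second, and so on.
def pipeline_write(block: bytes, pipeline: list) -> None:
    """Write `block` through a chain of DataNode stores."""
    if not pipeline:
        return
    head, *rest = pipeline
    head["blocks"].append(block)   # persist locally...
    pipeline_write(block, rest)    # ...then forward downstream

datanodes = [{"name": f"DN{i}", "blocks": []} for i in range(1, 4)]
pipeline_write(b"AB1-data", datanodes)
# every DataNode in the pipeline now holds a replica of the block
print([dn["name"] for dn in datanodes if b"AB1-data" in dn["blocks"]])
# → ['DN1', 'DN2', 'DN3']
```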
HDFS - Communication
● Client to NameNode: HDFS Client API over RPC (ClientProtocol)
● Client to DataNode: HDFS Client API using DataNodeProtocol; non-RPC, streaming, heavy buffering
● DataNode to NameNode: RPC (DataNodeProtocol)
○ DN registration: at init time
○ Heartbeat: stats about activity and capacity (every few secs)
○ Block Report: list of blocks (hourly)
○ Block Received: triggered by client upload
● DataNode to DataNode: replication pipelining, streaming
HDFS - NameNode 1 of 4
● The heart of HDFS; typically has lots of memory (~128 GB)
● Hosts two important tables
○ The HDFS namespace: File -> Block mapping; persisted for backup
○ The iNode table: Block -> DataNode mapping; not persisted, re-built from block reports
● HDFS is a journaled file system
○ Maintains a WAL called the edit log
○ The edit log is merged into fsimage at a preset log size
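The edit-log/fsimage pattern can be sketched as follows (a toy model with made-up names and a tiny merge threshold, not the real NameNode code):

```python
# Toy journaled namespace: mutations append to an edit log (WAL) and are
# merged into a checkpoint ("fsimage") once the log reaches a preset size.
class Namespace:
    MERGE_AT = 3  # preset edit-log size that triggers a merge

    def __init__(self):
        self.fsimage = {}   # checkpointed file -> blocks mapping
        self.edit_log = []  # pending journal entries

    def create(self, path, blocks):
        self.edit_log.append(("create", path, blocks))
        if len(self.edit_log) >= self.MERGE_AT:
            self._merge()

    def _merge(self):
        for _, path, blocks in self.edit_log:
            self.fsimage[path] = blocks
        self.edit_log.clear()

ns = Namespace()
ns.create("/a", ["AB1", "AB2"])
ns.create("/b", ["BB1"])
print(ns.fsimage)   # → {} (not merged yet)
ns.create("/c", ["CB1"])
print(ns.fsimage)   # merged after the third edit
```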
HDFS - NameNode 2 of 4
● Can take on 3 roles
● Regular mode: hosts the HDFS namespace
● Backup mode: secondary NN
○ Downloads fsimage regularly
○ Merges changes to the namespace
○ "Secondary" is a misnomer; it's more of a checkpointing server
● Safe mode: at startup
○ A read-only mode
○ Collects data from active DNs
HDFS - NameNode 3 of 4: HA using Quorum Journal Manager (Hadoop 2.0+)
● An Active NN and a Standby NN share edits through a set of JournalNodes
● DataNodes report to both NameNodes; clients are directed to the active one
● A ZK cluster coordinates which NN is active
HDFS - NameNode 4 of 4
● Replication monitor: fixes over/under-replicated blocks
○ Replica modes: corrupt, current, out-of-date, under-construction
● Lease management: during file creation
○ Ensures a single writer (multiple readers are OK)
○ Synchronously checks the active lease
○ Asynchronously checks the entire tree of leases
● Heartbeat monitor: collects DN stats and marks a DN down if no heartbeat is received for ~10 mins
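The heartbeat monitor can be sketched as (illustrative names; the real NameNode tracks much more per DataNode):

```python
# Toy heartbeat monitor: a DataNode is marked down when its last heartbeat
# is older than a timeout (~10 minutes in HDFS).
DEAD_AFTER_SECS = 10 * 60

def down_nodes(last_heartbeat: dict, now: float) -> list:
    """Return DataNodes whose last heartbeat is older than the timeout."""
    return sorted(dn for dn, t in last_heartbeat.items()
                  if now - t > DEAD_AFTER_SECS)

beats = {"DN1": 1000.0, "DN2": 390.0, "DN3": 980.0}
print(down_nodes(beats, now=1000.0))  # → ['DN2'] (610 s since last beat)
```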
HDFS - DataNode
● Typical machine: ~4 TB x 12 disks, JBOD
● Has no idea about HDFS; only knows about blocks
● Serves 2 types of requests
○ NN requests for block create/delete/replicate
○ Block R/W requests from clients
● Maintains only one table
○ Block -> real bytes on the local FS
○ Stored locally and not backed up
○ The DN can re-build this table by scanning its local dir
HDFS - DataNode (contd.)
● Creates a checksum file for each block
● Runs blockScanner() to find corrupt blocks
● DataNode-to-NameNode communication
○ Init: registration
○ Sends a heartbeat to the NN every few secs
○ Block completion: blockReceived()
○ Lets the NN respond with block commands
○ Sends a full block report every hour
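The per-block checksum idea can be sketched as (illustrative; real HDFS checksums fixed-size chunks of each block, not the block as one unit):

```python
import zlib

# Toy per-block checksum: store a CRC32 alongside each block and compare
# on scan, in the spirit of the DataNode's checksum files.
def store_block(data: bytes) -> dict:
    return {"data": bytearray(data), "crc": zlib.crc32(data)}

def block_scanner(block: dict) -> bool:
    """Return True if the block still matches its stored checksum."""
    return zlib.crc32(bytes(block["data"])) == block["crc"]

blk = store_block(b"AB1-contents")
print(block_scanner(blk))   # → True
blk["data"][0] ^= 0xFF      # simulate on-disk corruption
print(block_scanner(blk))   # → False
```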
HDFS - Typical Deployment
● A master switch feeds multiple aggregator switches (Aggregator Switch 1, 2, 3, ...)
● Each aggregator switch serves a group of racks (Rack 1 ... Rack N, 10-20 racks), each with its own TOR (top-of-rack) switch
HDFS - Limitations
● The NN holds the namespace in a single Java process
● A 64 GB heap fits ~250 million files + blocks
○ Federation partially solves the problem
○ Moving the namespace to a KV store is one proposed solution
● Enterprise features are slowly being added
○ Snapshots
○ NFS access
○ Geo-replication
○ Erasure coding to reduce 3X copies to ~1.3X
HDFS - Advanced Concepts
● Support for fadvise readahead and drop-behind
● HDFS takes advantage of multiple disks
○ Individual disk failures do not cause DN failures
○ Spills are parallelized
● Replica and task placement
○ Done by DNSToSwitchMapping.resolve()
○ User-supplied rack topology
○ IP address -> rack id mapping
○ net.topology.* settings in core-site.xml
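The IP-to-rack mapping can be sketched as (a toy resolve() with made-up subnets; real deployments plug a script or table in via the net.topology.* settings):

```python
# Toy rack resolver in the spirit of DNSToSwitchMapping.resolve():
# map each node's IP to a rack id from a user-supplied topology table.
TOPOLOGY = {                 # made-up subnets for illustration
    "10.1.1": "/rack1",
    "10.1.2": "/rack2",
}
DEFAULT_RACK = "/default-rack"

def resolve(ips: list) -> list:
    """Return a rack id for each IP, defaulting when the subnet is unknown."""
    return [TOPOLOGY.get(ip.rsplit(".", 1)[0], DEFAULT_RACK) for ip in ips]

print(resolve(["10.1.1.17", "10.1.2.4", "10.9.9.9"]))
# → ['/rack1', '/rack2', '/default-rack']
```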
HDFS - Advanced Concepts (contd.)
● A couple of tools for performance monitoring
○ Ganglia for HDFS
○ Nagios for general machine health
Storage Hierarchy
- POSIX filesystem (*nix): general-purpose FS
- HDFS: large distributed storage, high aggregate throughput
- HBase: large indexed tables, fast random access, consistent
- Redis: in-memory KV store, extremely fast access
- Other KV store(s)
HBase
● History
○ Based on Google's Bigtable (2006)
○ Built at Powerset (later acquired by Microsoft)
○ Facebook and Yahoo use it extensively (~1000 machines)
● Goals
○ Random R/W access
○ Tables with billions of rows x millions of columns
○ Often referred to as a "NoSQL" data store
○ High-speed ingest rate; Facebook handles ~a billion messages + chats per day
○ Good consistency model
HBase - Key Components
● Master(s), active and backup: NameNode, JobTracker, HMaster
● Slaves, many: DataNode, TaskTracker, HRegionServer
● A ZK cluster coordinates masters and slaves; clients consult it to locate regions
HBase - Data Model
● The Google Bigtable paper (section 2) says:
"A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, column key, and a timestamp; each value in the map is an uninterpreted array of bytes."
Let's break that down over the next few slides...
HBase - Data Model
● Data is stored in tables
● Tables have rows and columns
● That's where the similarity with an RDBMS ends
○ Columns are grouped into column families
● Rows are stored in sorted (increasing) order
○ Implies there is only one primary key
● Rows can be sparsely populated
○ Variable-length rows are common
● The same row can be updated multiple times
○ Each update is stored as a versioned entry
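The "sorted multidimensional map" definition can be sketched as (a toy model; the real HBase storage format and API look nothing like this):

```python
# Toy Bigtable/HBase data model: a map indexed by (row key, column,
# timestamp); values are opaque bytes, and the latest version wins.
table = {}

def put(row, col, value, ts):
    table[(row, col, ts)] = value

def get_latest(row, col):
    """Return the newest version, like a default HBase get()."""
    versions = [(ts, v) for (r, c, ts), v in table.items()
                if r == row and c == col]
    return max(versions)[1] if versions else None

put("com.cnn.www", "contents:html", b"<html>v1", ts=3)
put("com.cnn.www", "contents:html", b"<html>v2", ts=5)
print(get_latest("com.cnn.www", "contents:html"))  # → b'<html>v2'
```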
HBase - Data Model: Conceptual View

| Row Key | Time Stamp | ColumnFamily contents | ColumnFamily anchor |
|---|---|---|---|
| "com.cnn.www" | t9 | | anchor:cnnsi.com = "CNN" |
| "com.cnn.www" | t8 | | anchor:my.look.ca = "CNN.com" |
| "com.cnn.www" | t5 | contents:html = "&lt;html&gt;..." | |
| "com.cnn.www" | t3 | contents:html = "&lt;html&gt;..." | |

● Row key: a byte array, sorted by byte order
● Column = ColumnFamily:Qualifier; e.g. two columns in the "anchor" family, a single column in "contents"; values are byte arrays
● Versions: timestamps in millis
HBase - Data Model: Physical View

| Row Key | Time Stamp | ColumnFamily anchor |
|---|---|---|
| "com.cnn.www" | t9 | anchor:cnnsi.com = "CNN" |
| "com.cnn.www" | t8 | anchor:my.look.ca = "CNN.com" |

| Row Key | Time Stamp | ColumnFamily contents |
|---|---|---|
| "com.cnn.www" | t5 | contents:html = "&lt;html&gt;..." |
| "com.cnn.www" | t3 | contents:html = "&lt;html&gt;..." |
HBase - Table Objects
● A logical table (data: R1-R40) is split into shards called regions, e.g. Region1 (R1-R10), Region2 (R11-R20)
● Regions are hosted by region servers, ~200 regions per server
● Each region keeps a MemStore plus HFiles; HFiles and the HLog/WAL are persisted as HDFS blocks
HBase - Data Model Operations
○ The HTable class offers 4 operations: get, put, delete and scan
○ The first 3 have single and batch modes available

```java
// Scan example
public static final byte[] CF1 = "empData1".getBytes();
public static final byte[] ATTR1 = "empId".getBytes();

HTable htable = new HTable(...); // create an instance of HTable

Scan scan = new Scan();
scan.addColumn(CF1, ATTR1);
scan.setStartRow(Bytes.toBytes("200"));
scan.setStopRow(Bytes.toBytes("500"));
ResultScanner rs = htable.getScanner(scan);
try {
  for (Result r = rs.next(); r != null; r = rs.next()) {
    // do something with it...
  }
} finally {
  rs.close();
}
```
HBase - Data Versioning
○ By default a put() uses the current timestamp, but you can override it
○ Get.setMaxVersions() or Get.setTimeRange()
○ By default a get() returns the latest version, but you can ask for any
○ All data model operations return data in sorted order: Row:CF:Col:Version
○ Delete flavors: delete col+version, delete col, delete column family, delete row
○ Deletes work by creating tombstone markers
○ LIMITATIONS:
■ a delete() masks a put() until a major compaction takes place
■ Major compactions can change get() results
○ All operations are ATOMIC within a row
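The tombstone behavior can be sketched as (a toy model of one row/column; illustrative only, not HBase internals):

```python
# Toy tombstone semantics: deletes write markers that mask older puts
# until a major compaction physically removes both.
cells = []  # list of (timestamp, kind, value) for one row/column

def put(ts, value): cells.append((ts, "put", value))
def delete(ts):     cells.append((ts, "tombstone", None))

def get_latest():
    """Latest cell wins; a tombstone masks the older puts."""
    latest = max(cells)  # highest timestamp
    return None if latest[1] == "tombstone" else latest[2]

def major_compaction():
    """Drop tombstones and the cells they mask."""
    global cells
    dead = [ts for ts, kind, _ in cells if kind == "tombstone"]
    cutoff = max(dead, default=-1)
    cells = [c for c in cells if c[0] > cutoff and c[1] == "put"]

put(3, b"v1"); put(5, b"v2"); delete(7)
print(get_latest())   # → None (the tombstone masks both puts)
major_compaction()
print(cells)          # → [] (puts and tombstone physically removed)
```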
HBase - Read Path
1. Client asks the ZK cluster: where is -ROOT-? Answer: RegionServer1
2. Client asks -ROOT- (the table that keeps track of .META.; rows are .META.,region,key -> regionInfo, server): where is .META.? Answer: RegionServer2
3. Client consults .META. (a table with a row for every region in the system; it never splits; rows are table, startKey, id -> regionInfo, server)
4. .META. returns the region server hosting the row
5. Client issues HTable.get() to that region server, which checks its MemStore and HFiles (HFile-1, HFile-2)
6. Answer: the row
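The ZK -> -ROOT- -> .META. lookup chain can be sketched as (made-up server names; this matches the pre-0.96 HBase layout shown here, where -ROOT- still exists):

```python
# Toy region lookup chain used by both the read and the write path:
# ZK locates -ROOT-, -ROOT- locates .META., .META. locates the region.
zk = {"-ROOT-": "RegionServer1"}
root = {".META.": "RegionServer2"}                 # hosted on RegionServer1
meta = {("myTable", "row-200"): "RegionServer7"}   # hosted on RegionServer2

def locate(table, row):
    """Walk the three-level lookup and return the serving region server."""
    _ = zk["-ROOT-"]            # 1. ZK says where -ROOT- lives
    _ = root[".META."]          # 2. -ROOT- says where .META. lives
    return meta[(table, row)]   # 3-4. .META. maps (table, row) -> server

print(locate("myTable", "row-200"))  # → RegionServer7
```

Clients cache these locations, which is why the extra hops only hurt on the first access.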
HBase - Write Path
● Steps 1-4: the same ZK -> -ROOT- -> .META. lookup locates the region server
● Step 5: the client issues HTable.put(); the region server appends the edit to its HLog/WAL (stored as HDFS blocks) and updates the MemStore
● Step 6: a return code goes back to the client; the MemStore is flushed to HFiles offline
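The WAL-then-MemStore write can be sketched as (a toy model with a tiny flush threshold; illustrative names, not the HRegionServer code):

```python
# Toy write path: append to the WAL first for durability, then update the
# MemStore; an offline flush turns the MemStore into an immutable HFile.
wal, memstore, hfiles = [], {}, []
FLUSH_AT = 2  # flush threshold, tiny for the demo

def hput(row, col, value):
    wal.append((row, col, value))      # durability first
    memstore[(row, col)] = value       # then the in-memory store
    if len(memstore) >= FLUSH_AT:
        hfiles.append(dict(sorted(memstore.items())))  # offline flush
        memstore.clear()
    return 0                           # return code to the client

hput("row-1", "cf:c1", b"v1")
hput("row-2", "cf:c1", b"v2")
print(len(hfiles), len(memstore), len(wal))  # → 1 0 2
```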
HBase - Shell
○ Table metadata: e.g. create/alter/drop/describe table
○ Table data: e.g. put/scan/delete/count row(s)
○ Admin: e.g. flush/rebalance/compact regions, split tables
○ Replication tools: e.g. add/enable/list/start/stop replication
○ Security: e.g. grant/revoke/list user permissions

Shell interaction example:

```
hbase(main):001:0> create 'myTable', 'myColFam1'
0 row(s) in 3.8890 seconds

hbase(main):002:0> put 'myTable', 'row-1', 'myColFam1:col1', 'value-1'
0 row(s) in 0.1840 seconds

hbase(main):003:0> scan 'myTable'
ROW                COLUMN+CELL
 row-1             column=myColFam1:col1, timestamp=1457381922312, value=value-1
1 row(s) in 0.1160 seconds
```
HBase - Advanced Topics
○ Bulk loading
○ Cluster replication
○ Merging and splitting of regions
○ Predicate pushdown using server-side filters
○ Bloom filters
○ Co-processors
○ Snapshots
○ Performance tuning
HBase - What it's not
○ HBase is not for everyone
○ Has no support for
■ SQL
■ Joins
■ Secondary indexes
■ Transactions
■ A JDBC driver
○ Works well with large deployments
○ Requires good working knowledge of the Hadoop ecosystem
HBase - What it's good at
● Strongly consistent reads/writes
● Automatic sharding
● Automatic RegionServer failover
● Supports MapReduce, with HBase as both source and sink
● Works on top of HDFS
● Provides a Java client API and a REST/Thrift API
● Block cache and Bloom filter support
● Web UI and JMX support for operational management
Storage Hierarchy
- POSIX filesystem (*nix): general-purpose FS
- HDFS: large distributed storage, high aggregate throughput
- HBase: large indexed tables, fast random access, consistent
- Redis: in-memory KV store, extremely fast access
- Other KV store(s)
Redis
● An open source, in-memory key-value store with disk persistence
● Originally written at LLOOGG by Salvatore Sanfilippo, ~2009
● Written in ANSI C; runs on most Linux systems
● No external dependencies
● Very small footprint: ~1 MB of memory per instance
● Values can be data structures: String, Hash, Set, Sorted Set
● Compressed in-memory representation of data
● Clients are available in lots of languages: C, C#, Clojure, Scala, Lua, ...
Redis Key Components
● Each Redis instance is a single-threaded server bound to one CPU (CPU-1 ... CPU-N)
● Each pairs a highly optimized network layer with highly optimized memory storage
● Multiple instances share the machine's network and memory
Redis Network Layer
● A typical request/response system: 10K requests means 20K network calls
● If each call takes 1 ms, 20 secs are lost
● Use batching, called pipelining: queue the requests (1, 2, 3, ... 10000) and send one response for 10K requests, saving 10 seconds per 10K calls
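The arithmetic above can be sketched as (this uses the slide's simple cost model of 1 ms per network call, with responses collapsed into one batch):

```python
# Pipelining arithmetic: each network call costs ~1 ms; batching the
# responses roughly halves the calls, and therefore the time lost.
COST_MS = 1  # assumed cost of one network call

def total_ms(requests, pipelined):
    # unpipelined: one request call + one response call per request;
    # pipelined: requests still go out, responses come back as one batch
    calls = requests + (1 if pipelined else requests)
    return calls * COST_MS

print(total_ms(10_000, pipelined=False) / 1000)  # → 20.0 seconds
print(total_ms(10_000, pipelined=True) / 1000)   # → 10.001 seconds
```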
● Bypasses the OS socket layer abstraction
○ Uses low-level epoll(), kqueue(), select() calls
● Low overhead from waiting threads
● Handles close to 10K concurrent clients
Redis Memory Optimizations
● Integer encoding for small values
● Small hashes are converted to arrays
○ Leverages CPU caching
● Uses 32-bit versions when possible
● Leads to 5X to 10X memory savings
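The intuition behind integer encoding can be shown with plain Python object sizes (CPython sizes, not Redis internals; only the ratio is the point):

```python
import sys

# Storing the characters of "12345678" costs more than storing the
# integer it encodes, which is why Redis integer-encodes small values.
as_text = sys.getsizeof("12345678")
as_int = sys.getsizeof(12345678)
print(as_text > as_int)  # → True: the numeric form is smaller
```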
Redis Enterprise Features
● Data is sharded across clusters: Shard 1 -> Cluster 1, Shard 2 -> Cluster 2; the client routes by shard
● Each cluster is a Redis master with slaves (Slave1, Slave2), kept in sync via async replication
Redis WrapUp
● Super fast in-memory KV store
● Provides a CLI
● Typical apps will require client-side coding
● Spills to disk for large data sets, with reduced performance
● The upcoming "cluster" feature will keep 3 copies for HA
Storage Hierarchy
- POSIX filesystem (*nix): general-purpose FS
- HDFS: large distributed storage, high aggregate throughput
- HBase: large indexed tables, fast random access, consistent
- Redis: in-memory KV store, extremely fast access
- Other KV store(s)
Questions?
Storage Systems for Big Data
Sameer Tiwari, Hadoop Storage Architect, Pivotal ([email protected], @sameertech)