Big table
Presented by Kevin Warrick
Manuel Correa
BigTable
BigTable is a distributed storage system for managing structured data
Designed by Google Inc. and published in 2006
BigTable was designed to scale to petabytes of data and thousands of machines
What is BigTable?
Distributed
Column-oriented
Multidimensional
Highly available
High performance
Storage system
Self-managing
What is BigTable?
A SQL database
– No joins
– No query engine
– No types
– No SQL
Normalized schema
Not necessarily a replacement for an RDBMS
BigTable is not...
Google has a lot of data.
The scale of the data is too large even for commercial databases. Commercial databases require expensive hardware.
Google's infrastructure runs on arrays of low-cost commodity hardware, not cutting-edge mainframes.
An internal database solution can be applied across a wide range of Google products.
Absolute control over optimization and customization.
Motivation
BigTable is built on several other innovative, distribution-oriented components.
GFS (Google File System) - backing store
Scheduler - schedules jobs onto machines
Lock service - distributed lock manager for workers
MapReduce - framework for large-scale calculations
Building Blocks
BigTable is a sparse, distributed, persistent, multidimensional sorted map
The map is indexed by row key, column key, and timestamp. Each value is an uninterpreted array of bytes
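The map described above can be sketched in a few lines of Python. This is only an illustration of the data model, not how BigTable stores data: the names `table`, `put`, and `scan` are made up for the example, and the sample rows are invented.

```python
from typing import Dict, Iterator, Tuple

# Key: (row key, column key, timestamp). Value: uninterpreted bytes.
CellKey = Tuple[str, str, int]

# Sparse: only cells that were written exist in the map.
table: Dict[CellKey, bytes] = {}


def put(row: str, column: str, timestamp: int, value: bytes) -> None:
    """Store one version of one cell."""
    table[(row, column, timestamp)] = value


def scan() -> Iterator[Tuple[CellKey, bytes]]:
    """Yield cells ordered by row, then column, then newest timestamp first."""
    for key in sorted(table, key=lambda k: (k[0], k[1], -k[2])):
        yield key, table[key]


# Illustrative data only.
put("com.cnn.www", "contents:", 3, b"<html>v3</html>")
put("com.cnn.www", "contents:", 5, b"<html>v5</html>")
put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")
cells = list(scan())
```

A scan returns the `anchor:cnnsi.com` cell first, then the two `contents:` versions with timestamp 5 before timestamp 3.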
BigTable Model
Example: WebTable
BigTable Model
BigTable data is ordered lexicographically by row key
– The row range of a table is dynamically partitioned. Each partition is called a tablet
– A tablet is the unit of distribution and load balancing
Example: WebTable
– Pages in the same domain are grouped together in contiguous rows
– This makes analysis, search, and data retrieval easier, and helps distribute the data across machines
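A small sketch of how sorted row keys and tablets fit together. The reversed-hostname row key follows the WebTable example in the BigTable paper (the slide only says same-domain pages end up contiguous). The split points in `tablet_starts` and the function names are hypothetical, chosen for the example.

```python
import bisect


def row_key(hostname: str) -> str:
    """Reverse the hostname components so pages of one domain sort together."""
    return ".".join(reversed(hostname.split(".")))


hosts = ["maps.google.com", "www.cnn.com", "www.google.com", "money.cnn.com"]
keys = sorted(row_key(h) for h in hosts)

# Hypothetical tablet boundaries: tablet i holds rows from tablet_starts[i]
# up to (but not including) tablet_starts[i + 1].
tablet_starts = ["", "com.google"]


def tablet_for(key: str) -> int:
    """Return the index of the tablet whose row range contains the key."""
    return bisect.bisect_right(tablet_starts, key) - 1
```

With these inputs, both `cnn.com` hosts sort next to each other and land in tablet 0, while both `google.com` hosts land in tablet 1.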
BigTable Model - Rows
Column keys are grouped together into units called column families
– A column family is the basic unit of access control
– All data within a column family is usually of the same type
– A family must be created before any column key can be added under it
– Column families rarely change. Column keys may change often
– Syntax: family:qualifier
Example: WebTable
– A family anchor with qualifier cnn.com
– anchor:cnn.com anchor:mydomain.com
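The family:qualifier rule can be illustrated with a toy table that rejects writes to a family that has not been created. The `Table` class and its methods are hypothetical names for this sketch, not BigTable's client API.

```python
class Table:
    """Toy table that enforces 'create the family before adding columns'."""

    def __init__(self):
        self.families = set()
        self.cells = {}  # (row key, "family:qualifier") -> bytes

    def create_family(self, family: str) -> None:
        self.families.add(family)

    def put(self, row: str, column: str, value: bytes) -> None:
        # Split "family:qualifier"; the qualifier may be empty.
        family, _, _qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"column family {family!r} does not exist")
        self.cells[(row, column)] = value


web = Table()
web.create_family("anchor")
# New qualifiers can be added freely once the family exists.
web.put("com.cnn.www", "anchor:cnn.com", b"CNN")
web.put("com.cnn.www", "anchor:mydomain.com", b"CNN homepage")
```

Writing to `contents:` on this table would raise `KeyError` until `create_family("contents")` is called.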
BigTable Model – Columns
Each cell in BigTable is indexed by timestamp
– Maintains different versions of the same data
– The most recent version comes first; versions are kept in decreasing timestamp order
– The system implements garbage collection, which takes care of unused versions
Example: WebTable
– The contents column family of a Web page holds different versions
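A minimal sketch of a versioned cell. It assumes one particular garbage-collection policy, keeping only the newest `max_versions` versions; the slide does not say which policy is used. `VersionedCell` is a made-up name for illustration.

```python
class VersionedCell:
    """One cell holding several timestamped versions, newest first."""

    def __init__(self, max_versions: int = 3):
        self.max_versions = max_versions
        self.versions = []  # list of (timestamp, value), decreasing timestamp

    def put(self, timestamp: int, value: bytes) -> None:
        self.versions.append((timestamp, value))
        self.versions.sort(key=lambda tv: tv[0], reverse=True)
        # Assumed GC policy: drop everything beyond the newest max_versions.
        del self.versions[self.max_versions:]

    def get(self, timestamp=None):
        """Return the newest value, or the newest one at or before timestamp."""
        for ts, value in self.versions:
            if timestamp is None or ts <= timestamp:
                return value
        return None


contents = VersionedCell(max_versions=2)
contents.put(1, b"a")
contents.put(3, b"c")
contents.put(2, b"b")
```

After the three writes, only timestamps 3 and 2 remain; the version at timestamp 1 has been collected.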
BigTable Model – Timestamps
The implementation has three major components
– A library that is linked into every client
– One master server
– Many tablet servers
BigTable runs on top of the Google File System
BigTable data is stored in a file format called SSTable. Each SSTable is divided into 64 KB blocks. An SSTable can be loaded into memory
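The block layout can be sketched as follows: sorted key/value pairs are packed into blocks of at most 64 KB, and an index of each block's first key lets a lookup touch a single block. This is an in-memory illustration under simplifying assumptions (entry size is just key length plus value length); `build_sstable` and `lookup` are hypothetical names and this is not the real SSTable file format.

```python
import bisect

BLOCK_SIZE = 64 * 1024  # 64 KB, as stated on the slide


def build_sstable(items):
    """Pack sorted (key, value) pairs into blocks; return (blocks, index)."""
    blocks, index, current, size = [], [], [], 0
    for key, value in sorted(items):
        entry_size = len(key) + len(value)
        if current and size + entry_size > BLOCK_SIZE:
            blocks.append(current)  # current block is full, start a new one
            current, size = [], 0
        if not current:
            index.append(key)  # remember the first key of each block
        current.append((key, value))
        size += entry_size
    if current:
        blocks.append(current)
    return blocks, index


def lookup(blocks, index, key):
    """Find the one block that could hold the key, then search inside it."""
    i = bisect.bisect_right(index, key) - 1
    if i < 0:
        return None
    return dict(blocks[i]).get(key)


items = [(f"row{i:04d}", b"x" * 1000) for i in range(200)]
blocks, index = build_sstable(items)
```

The 200 sample entries of about 1 KB each fill three blocks and part of a fourth.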
BigTable Implementation
Chubby file: provides a namespace to access the root table. This is the first entry point for locating a user table. The service is distributed. The Chubby service is used to:
– Bootstrap the location of BigTable data
– Discover tablet servers
– Finalize tablet server deaths
BigTable Implementation
Root table: contains the access point to the METADATA tablets
METADATA tablets: provide the access point to user tables
The client library caches information about tablet locations
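The lookup chain (Chubby file, root table, METADATA tablet, user tablet) plus the client-side cache can be sketched as below. Every name here is an assumption made for the example: the Chubby path, the tablet and server names, and the end-key layout. A real client caches row ranges rather than individual rows.

```python
import bisect

# Root table: sorted (last row covered, METADATA tablet name). Hypothetical.
ROOT = [("m", "meta-1"), ("\xff", "meta-2")]

# METADATA tablets: sorted (last row covered, tablet server). Hypothetical.
METADATA = {
    "meta-1": [("com.cnn.www", "server-b"), ("m", "server-c")],
    "meta-2": [("\xff", "server-d")],
}

# The Chubby file is the entry point: it leads to the root table.
CHUBBY = {"/hypothetical/root-location": ROOT}


def find(entries, row):
    """Return the value of the first entry whose end key is >= row."""
    ends = [end for end, _ in entries]
    return entries[bisect.bisect_left(ends, row)][1]


class Client:
    def __init__(self):
        self.cache = {}  # row -> tablet server (simplified per-row cache)
        self.walks = 0   # how many times the full hierarchy was walked

    def locate(self, row: str) -> str:
        if row not in self.cache:
            self.walks += 1
            root = CHUBBY["/hypothetical/root-location"]
            meta = find(root, row)
            self.cache[row] = find(METADATA[meta], row)
        return self.cache[row]


client = Client()
first = client.locate("com.cnn.money")
second = client.locate("com.cnn.money")  # served from the cache
```

The second call does not walk the hierarchy again, which is the point of caching locations in the client library.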
BigTable Implementation
BigTable Demo
Questions?
BigTable