Extreme computing Databases and cloud computing · Hive and Pig Stratis D. Viglas . Databases and...
Extreme computing
Databases and cloud computing
Stratis D. Viglas
School of Informatics, University of Edinburgh
Stratis D. Viglas www.inf.ed.ac.uk
Databases and cloud computing Overview
Outline
• Databases and cloud computing
  • Overview
  • Relational databases
  • Relational data processing on Hadoop MR
  • NoSQL databases
  • BigTable
  • Hive and Pig
Where's your data?
• Unprecedented dataset scale
  • Petabyte-scale is ubiquitous (e.g., eBay, Facebook, CERN and scientific data in general)
  • Produced at terabytes per day scale
  • Power law: a few very large and a lot of small datasets
[Figure: power-law curve; axes: scale vs. number of datasets]
• Most datasets are structured
  • Query logs, click logs, sale records, user preferences
• Objective: large-scale data analytics
  • Relational databases meet MapReduce
Relational databases vs. MapReduce
• Designed and optimised for solving different problems
  • Common ground, but also great differences
Relational DBs
• Long- and short-running queries
• Read and write workloads
• Transactional semantics (ACID)
• Fixed schema, integrity constraints
• 35 years of tools, extensions, data types
• SQL for declarative query processing, query optimisation
MapReduce
• Cluster-based data processing, fault tolerance
• No schema; up to the application to interpret data
• Imperative paradigm
• No standard query language; as long as it maps to the MR dataflow
• Programmer has complete control
Typical database workloads
• Online transaction processing (OLTP)
  • Real-time, low latency, highly-concurrent
  • Relatively small set of fixed transactional queries
  • Data access pattern: random reads, updates, writes (involving relatively small amounts of data)
• Online analytical processing (OLAP)
  • Batch workloads, less concurrency
  • Complex long-running analytical queries, often ad-hoc
  • Data access pattern: table scans, large amounts of data involved per query
• Typically, organisations use two DB instances
  • OLTP frontend → OLAP backend
  • Frontend optimised for transactions, backend optimised for analytics
Databases and cloud computing Relational databases
Three basic building blocks
• Attribute (aka field)
  • A (name, value) pair, e.g., (SID, 123-ABC)
• Tuple (aka record, row)
  • A set of attributes, e.g., (SID: 123-ABC, Name: Mary Jones, ..., Year: 4)
• Relation (aka table)
  • A set of tuples with the same schema

SID      Name        ...  Year
123-ABC  Mary Jones  ...  4
456-DEF  John Smith  ...  3
...      ...         ...  ...
999-XYZ  Jack Black  ...  4
Data manipulation
• Isolate a subset of a single relation: selection (σ), projection (π)
• Set operations: intersection, union, cross product, set difference
• More complex operations: joins (⋈), semi-joins, ...

Example operators: σ_{year=3}, π_{name}

Student
SID      Name        Year
123-ABC  Mary Jones  4
456-DEF  John Smith  3
999-XYZ  Jack Black  4

Course
CID   Name            Year
ADBS  Adv. Databases  4
QSX   Querying XML    4

Student ⋈_{student.year = course.year} Course (the matching pairs of Student × Course)
SID      Name        Year  CID   Name            Year
123-ABC  Mary Jones  4     ADBS  Adv. Databases  4
123-ABC  Mary Jones  4     QSX   Querying XML    4
999-XYZ  Jack Black  4     ADBS  Adv. Databases  4
999-XYZ  Jack Black  4     QSX   Querying XML    4
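The three operators can be sketched in a few lines of plain Python over the Student and Course relations of this slide. This is only an illustration: the function names `select`, `project` and `join` are made up here, and the elided "..." columns are left out.

```python
students = [
    {"sid": "123-ABC", "name": "Mary Jones", "year": 4},
    {"sid": "456-DEF", "name": "John Smith", "year": 3},
    {"sid": "999-XYZ", "name": "Jack Black", "year": 4},
]
courses = [
    {"cid": "ADBS", "name": "Adv. Databases", "year": 4},
    {"cid": "QSX", "name": "Querying XML", "year": 4},
]

def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection: keep only the named attributes, dropping duplicate tuples."""
    seen, out = set(), []
    for t in relation:
        row = tuple((a, t[a]) for a in attributes)
        if row not in seen:
            seen.add(row)
            out.append(dict(row))
    return out

def join(left, right, predicate):
    """Theta-join: the pairs of the cross product that satisfy the predicate."""
    return [(l, r) for l in left for r in right if predicate(l, r)]

names_year3 = project(select(students, lambda t: t["year"] == 3), ["name"])
same_year = join(students, courses, lambda s, c: s["year"] == c["year"])
```

Here `names_year3` evaluates π_{name}(σ_{year=3}(Student)) and `same_year` evaluates the join on year.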
What can we do with MapReduce?
• MapReduce is a dataflow framework
  • But writing a Java program to compute an average is time-consuming, verbose, and kind of dumb
  • Can't ask an analyst to do that; can't ask an IT department to implement one on demand
• Lessons from relational DBs
  • Declarative query processing: specify what should be retrieved, not how
  • Leave it to the system to optimise processing
  • High-level data models and processing languages
• Other options, revisiting database issues and more tailored for large-scale distribution
  • NoSQL: non-relational approaches to storing and retrieving data
  • BigTable (and HBase): a different data and physical model, more tailored towards large-scale analytics and distributed processing
  • Hive and Pig: data processing languages
Databases and cloud computing Relational data processing on Hadoop MR
Selections and projections
• Basically free in MR
  • Scan input and process it during the map phase
    • For selections, test predicate; for projections, drop fields
  • No need for a reduce phase
• Only limited by how quickly HDFS can stream data
  • Computational load is minimal; network I/O is the highest cost
  • Compression also helps
• Kind of like using a nuclear bomb to kill a mosquito
  • For example, selections are usually evaluated through indexes
  • The key is not to identify which parts of the input satisfy the predicate, but to avoid reading the irrelevant parts in the first place
• In a schema-less world, however, it makes sense
  • Evaluating σ_{age>25}(T) differs depending on whether we know that every t ∈ T has an attribute age in the 4th position, or we don't know which position it is in, or whether all rows have it
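A minimal sketch of a map-only selection and projection, written in plain Python rather than against the Hadoop API. The tab-separated record layout, the position of age and the name `map_select_project` are assumptions made for illustration; the point is that a schema-less scan has to guard against rows that lack the attribute.

```python
from typing import Iterable, Iterator, Sequence

def map_select_project(lines: Iterable[str],
                       age_pos: int = 3,
                       keep: Sequence[int] = (0, 1)) -> Iterator[str]:
    """Map-only job: emit the projected fields of every row with age > 25."""
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Schema-less input: the attribute may be missing or not a number.
        if len(fields) <= age_pos or not fields[age_pos].isdigit():
            continue
        if int(fields[age_pos]) > 25:                     # selection
            yield "\t".join(fields[i] for i in keep)      # projection

rows = ["1\tAnn\tUK\t31", "2\tBob\tUK\t19", "3\tCy", "4\tDee\tGR\t26"]
result = list(map_select_project(rows))
```

No reduce function is needed: each mapper writes its output directly.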
Sorting
• One of the most fundamental operations in any type of data processing
  • MapReduce will always sort input to reducers by group key
  • Values within a group are arbitrarily sorted
• What if we want to sort by value also?
  • For example, k → (v1, r), (v3, r2), (v4, r), ...
• Easy way out: store values in memory and sort them
  • Does not scale; what if the elements of a group do not fit in memory?
Secondary sorting
• Working solution: value to key conversion
  • Also known as secondary sorting
  • Form a composite intermediate key and let the framework do the sorting
  • Key becomes (k, v) pair and not simply k
• Before: k → (v1, r), (v8, r2), (v4, r), (v3, r) ...
  • Values from the same group arrive in arbitrary order
• After:
  (k, v1) → (v1, r)
  (k, v3) → (v3, r)
  (k, v4) → (v4, r)
  (k, v8) → (v8, r)
  ...
  • Values from the same group arrive in sorted order
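The value-to-key conversion can be simulated in plain Python, with a list sort standing in for the framework's shuffle. This is a sketch, not Hadoop code: the function name `secondary_sort` is made up, and in Hadoop itself the partitioner and the grouping comparator would additionally have to consider only k, so that all composite keys of one group reach the same reduce call.

```python
from itertools import groupby
from operator import itemgetter

def secondary_sort(pairs):
    """pairs: iterable of (k, (v, r)). Returns {k: [(v, r), ...]} ordered by v."""
    # Map side: value-to-key conversion, the intermediate key becomes (k, v).
    composite = [((k, v), (v, r)) for k, (v, r) in pairs]
    # Stand-in for the shuffle: the framework sorts on the composite key.
    composite.sort(key=itemgetter(0))
    # Group on the natural key k only, as a grouping comparator would.
    return {k: [value for _, value in group]
            for k, group in groupby(composite, key=lambda kv: kv[0][0])}

out = secondary_sort([("k", (8, "r2")), ("k", (1, "r")),
                      ("k", (4, "r")), ("k", (3, "r"))])
```

The reducer never buffers a group: values already arrive in order, which is what makes the approach scale.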
Aggregation
• Type of query MapReduce has been designed for
  • In SQL: select url, avg(time) from visits group by url
• Easy to perform in MapReduce
  • Map over records, use grouping attribute(s) as the key (url in the previous example)
  • MapReduce will automatically group values by keys
  • Compute the aggregate (average in the example) in reduce phase
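The SQL query above maps onto a map and a reduce function as follows. The sketch uses plain Python with a dictionary standing in for the shuffle; the names `map_visit`, `reduce_avg` and `run` and the (url, time) record layout are assumptions for illustration.

```python
from collections import defaultdict

def map_visit(record):
    """Emit the grouping attribute (url) as key and the time as value."""
    url, time = record
    yield url, time

def reduce_avg(url, times):
    """Compute the aggregate for one group."""
    times = list(times)
    return url, sum(times) / len(times)

def run(records):
    groups = defaultdict(list)            # stand-in for shuffle and grouping
    for record in records:
        for key, value in map_visit(record):
            groups[key].append(value)
    return dict(reduce_avg(k, vs) for k, vs in sorted(groups.items()))

averages = run([("a.com", 10), ("b.com", 4), ("a.com", 20)])
```

Note that an average cannot be pre-aggregated by simply averaging partial averages; a combiner would have to carry (sum, count) pairs instead.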
Relational joins
• The join operation is ubiquitous in DB query processing
  • Any single query with two or more sources will need to have a join (even in the form of a cross product)
• Any DBMS spends most of its time evaluating joins
  • Probably the most optimised physical operator
  • Radically different join evaluation algorithms
    • More so when moving to a distributed environment
• MR comes with its own join algorithms
  • Pretty far from a distributed or parallel DB join algorithm
• Choosing a join algorithm is not straightforward
  • The choice might depend on the size of the input, its properties, available memory
Reduce-side join
• Group by join key
• Map over both sets of tuples
• Tag each tuple with an input identifier
  • So we can identify where each tuple came from
• Emit tuple as value with join key as the intermediate key
• Runtime brings together tuples sharing the same key
• Perform actual join in reducer
  • Similar to a sort-merge join in relational database terminology
Reduce-side join
• In this example, assume |R| < |S|
• Everything takes place in the reducer
• Buffer R groups in main memory
• Scan corresponding S partition forward to compute join pairs
• What if groups don't fit in memory?
[Figure: a reducer group containing tuples R6, R9, R12 and S5, S3, S10, S8; one set is kept in main memory, the other is scanned forward and cross-referenced with it]
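A sketch of the reduce-side join in plain Python, with a dictionary standing in for the shuffle. The function name `reduce_side_join`, the tuple layout (join key first) and the "R"/"S" tags are assumptions for illustration.

```python
from collections import defaultdict

def reduce_side_join(R, S, key=lambda t: t[0]):
    # Map phase: tag each tuple with its input and emit it under the join key.
    shuffled = defaultdict(list)
    for tag, relation in (("R", R), ("S", S)):
        for t in relation:
            shuffled[key(t)].append((tag, t))
    # Reduce phase: per key, buffer the R tuples and stream the S tuples past them.
    out = []
    for k in sorted(shuffled):
        r_buffer = [t for tag, t in shuffled[k] if tag == "R"]
        for tag, s in shuffled[k]:
            if tag == "S":
                out.extend((r, s) for r in r_buffer)
    return out

pairs = reduce_side_join([(1, "r1"), (2, "r2")],
                         [(1, "s1"), (1, "s2"), (3, "s3")])
```

The buffer holds only the R tuples of one key at a time, which is why the smaller input is the one to buffer; a group that still exceeds memory needs a different strategy.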
Map-side join
[Figure: tuples of R (R1, R2, R3, R4, R5, R8, R10) and S (S1, S3, S5, S6, S9, S11, S12); scan forward to compute join]
• Relational merge-join
  • If both inputs are sorted on join key, the join can be computed in one sequential scan
  • Partition and sort both inputs in parallel
  • Partition inputs consistently in terms of ranges
    • E.g., 0-30, 31-60, 61-90, ...
• If both inputs are already partitioned, join can be computed in the Map phase
  • Reduce phase not necessary
• Keep inputs pre-partitioned on the join key
  • Similar to clustering (or even indexing) in relational databases
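The merge step that each mapper would run over one pair of matching partitions can be sketched as below. This is plain Python under stated assumptions: both inputs are lists of (key, payload) tuples already sorted on the key, and the name `merge_join` is illustrative.

```python
def merge_join(R, S):
    """Join two inputs sorted on the join key with forward scans only."""
    out, i, j = [], 0, 0
    while i < len(R) and j < len(S):
        if R[i][0] < S[j][0]:
            i += 1                       # R is behind: advance R
        elif R[i][0] > S[j][0]:
            j += 1                       # S is behind: advance S
        else:
            k, run_start = R[i][0], j
            # Pair every R tuple with key k against the run of S tuples with key k.
            while i < len(R) and R[i][0] == k:
                j = run_start
                while j < len(S) and S[j][0] == k:
                    out.append((R[i], S[j]))
                    j += 1
                i += 1
    return out

joined = merge_join([(1, "a"), (3, "b"), (3, "c"), (5, "d")],
                    [(3, "x"), (3, "y"), (4, "z"), (5, "w")])
```

With unique join keys on at least one side, each input is read exactly once; duplicate keys on both sides only re-read the current run of S.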
In-memory join
• Scenario: two relations R and S where |R| ≪ |S| and R fits into main memory
• Typical case: a key-foreign key join in normalised schemata, or a facts-dimensions join in a data warehouse
• MapReduce implementation is a variant of map-side join, based on replication (no need for a reduce phase)
  1 Distribute R to all workers
  2 Run map phase over S, each mapper loads R in memory and builds a hash table for it
  3 For every s ∈ S probe hash table for R for matches and output each matching ⟨s, r⟩, r ∈ R pair
• If R does not fit into main memory
  • Divide it into n subsets R_i, i = 1, 2, 3, ..., n, such that each R_i fits in main memory
  • R ⋈ S = ⋃_{i=1}^{n} (R_i ⋈ S)
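The build and probe steps, and the fallback for an R that does not fit in memory, can be sketched in plain Python. The names `in_memory_join` and `chunked_join` and the tuple layout (join key first) are assumptions for illustration; in a real job the body of `in_memory_join` would run inside each mapper over its split of S.

```python
from collections import defaultdict

def in_memory_join(R, S, key=lambda t: t[0]):
    """Build a hash table on R, then probe it once for every tuple of S."""
    table = defaultdict(list)
    for r in R:
        table[key(r)].append(r)
    for s in S:
        for r in table.get(key(s), ()):
            yield s, r

def chunked_join(R, S, n):
    """R does not fit in memory: join each of (at most) n chunks R_i with S."""
    size = max(1, -(-len(R) // n))       # ceiling of len(R) / n
    for start in range(0, len(R), size):
        yield from in_memory_join(R[start:start + size], S)

R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (2, "z"), (9, "q")]
whole = list(in_memory_join(R, S))
parts = list(chunked_join(R, S, 2))
```

Both calls produce the same set of pairs; the chunked variant pays for it with one pass over S per chunk.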
Which join to use when?
• If there is enough memory to hold the smaller relation, use in-memory join
• If both inputs are sorted and pre-partitioned consistently on the join attributes, use map-side join
• If map-side joins are not applicable, use a reduce-side join since it is the most general and always applicable
Databases and cloud computing NoSQL databases
Motivation
• Two potential bottlenecks with RDBMSs
  1 Schema rigidity: not optimised for evolving and/or non-uniform schemata
  2 Scale-out: sharding and partitioning work great but are hard to get right
• Three driving application scenarios
  1 In the majority of (Web) applications we only need a key-value interface
    • The rest of the information is relatively free-form
  2 Data consistency is not critical
    • Critical data will be managed by a persistent transactional engine
  3 Automatic scale-out
    • Adding both data and hardware should be transparent
• Hence, NoSQL stores were introduced¹
¹ Term evolution: started as 'No means no', became 'No means not-only', now 'NewSQL' is picking up traction.
Assumptions and use-cases
• Datasets
  • Data does not fit in one server or a single rack, and SANs (Storage Area Networks²) are too expensive
  • Data partitioning is imperative
• Reliability
  • System must be continuously available to serve data
  • Machines and disks will fail; data and availability should not be compromised
  • Data replication is imperative
• Performance and trade-offs
  • Commodity boxes and disks
  • Good performance and availability on straightforward setups
² Dedicated network that provides consolidated access to block-level storage.
Classification
• Key-value stores
  • Basic association maps
• (Wide-)Column stores
  • Each key is associated with a large number of attributes (columns)
  • Provide a relational-like interface
  • BigTable is the typical example
• Document-centric stores
  • Semi-structured documents (used to be XML, the hip new kid is JSON)
  • Implementation is usually coupled with a high-level dataflow engine (e.g., MapReduce)
• Graph databases
  • Programming language constructs mapped to persistent objects
  • Focus is on object interconnections as opposed to lookups
  • Do not typically scale as well
  • More RDBMS-like in their use-cases
Distributed hash tables
• Started from peer-to-peer systems and file-sharing
  • A lot of your favourite P2P applications work this way
  • Optimised for binary objects
• Evolved into a general distributed way of storing and retrieving key-value associations
• Best-known example: Chord
  • Most other implementations are some permutation of its algorithms
  • Caters for dynamic data, node joining and leaving, and fault tolerance
  • Provides performance guarantees
Distributed hash table basics
[Figure: a common hash domain onto which data items are mapped by h_d(d.id) and nodes by h_n(n.id); each data item is assigned to its closest successor node]
• Data is assigned to network peers
  • Hash functions are applied on the identifiers of both data and peers
  • Hash functions have a common domain (typical domain size is 2^160 values)
• The closest successor of a data item in the domain becomes responsible for the item
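The closest-successor rule can be sketched with the standard library alone. Assumptions for illustration: SHA-1 (whose 160-bit output gives a domain of 2^160 values) is used for both data and peer identifiers, and the names `h` and `responsible_peer` are made up here.

```python
import bisect
import hashlib

def h(identifier: str) -> int:
    """Map an identifier to a point in the common 160-bit domain."""
    return int.from_bytes(hashlib.sha1(identifier.encode()).digest(), "big")

def responsible_peer(item_id: str, peer_ids):
    """Return the peer whose hash is the closest successor of the item's hash."""
    ring = sorted((h(p), p) for p in peer_ids)
    points = [point for point, _ in ring]
    # First peer at or after the item's position, wrapping around the domain.
    index = bisect.bisect_left(points, h(item_id)) % len(ring)
    return ring[index][1]

peers = ["n1", "n2", "n3", "n4"]
owner = responsible_peer("item-42", peers)
```

When a peer joins or leaves, only the items between it and its predecessor change owner, which is what makes the scheme attractive for dynamic networks.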
The Chord ring
[Figure: eight peers n1, ..., n8 arranged on a ring]
• N peers ordered on a ring
• Peer n maintains an i-connection to the 2^i mod N positions ahead of it on the ring
• Any peer can locate the peer responsible for any data item in log N hops
• Specialised protocol for peers joining and leaving the network
  • Normal operation: only predecessors and successors affected
  • Heartbeat messages can test the liveness of a peer
• Data is replicated across a node's successors
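The logarithmic hop count can be checked on a simplified model: a fully populated ring where peers sit at positions 0 to N-1 and each keeps a link to the peers 2^i places ahead. This is an idealisation for illustration, not the full Chord protocol; `finger_table` and `route` are hypothetical names.

```python
def finger_table(p: int, n: int) -> list:
    """Positions that peer p links to: 2^i places ahead, for every 2^i < n."""
    return [(p + (1 << i)) % n for i in range((n - 1).bit_length())]

def route(src: int, dst: int, n: int) -> list:
    """Greedy routing: always take the farthest link that does not overshoot."""
    path, current = [src], src
    while current != dst:
        remaining = (dst - current) % n
        candidates = [f for f in finger_table(current, n)
                      if (f - current) % n <= remaining]
        current = max(candidates, key=lambda f: (f - current) % n)
        path.append(current)
    return path

path = route(1, 0, 8)
```

Every hop covers the largest power of two that still fits in the remaining distance, so the number of hops is bounded by the number of bits of N-1.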
Variants and usage
• Every node in the system can serve a request, so long as it knows where to propagate it
  • Pure Chord implementation uses a progressive propagation algorithm: send the request to the "farthest" node in the identifier space to which there is an immediate connection
  • Variants include consistent hashing (only one potential location for a hash) or a directory service
• Amazon's offerings (SimpleDB and DynamoDB), and LinkedIn's Project Voldemort³ are the typical examples
³ http://project-voldemort.com
What about consistency?
• We need to have a consistent view of the sequence of updates
  • Say data item x is available at nodes m and n
  • Client a updates the copy at m; some time t passes
  • Client b reads the copy from n; what value does it read?
• Strict consistency: the system should always return the last write
  • Either a single node is responsible for each individual data item
  • Or there is a distributed transaction protocol in place (e.g., 2-phase commit)
  • Neither option scales well
  • Remember the CAP theorem?
• Eventual consistency: as time t → ∞ all nodes will eventually have the latest version
  • You would never run a production database with this consistency level, but it's good enough for your list of Facebook friends
Wide-column stores
Row-stores
• Great for locality of access: row read/write is a single I/O
• Bad if only interested in a small subset of columns
[Figure: rows stored one after the other: (John, 25, student), (Julia, 30, entrepreneur), (Justin, 18, joke), ...]
Column-stores
• Single-column data stored sequentially
• Single-row scans are problematic
[Figure: columns stored one after the other: (John, Julia, Justin, ...), (25, 30, 18, ...), (student, entrepreneur, joke, ...)]
Column families and locality groups
• Columns exist by themselves, but can be organised into independent families (or locality groups)
• Row-based within a group
• Column-based across groups
[Figure: the same data split into a multi-column family and a single-column family]
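The trade-off between the two layouts shows up even in a toy in-memory model. The sketch below uses the three example records of this slide; the variable names and the `read_row` helper are illustrative only.

```python
rows = [("John", 25, "student"),
        ("Julia", 30, "entrepreneur"),
        ("Justin", 18, "joke")]

# Row-store: the fields of one row are adjacent, so a row is one contiguous read.
row_store = [value for row in rows for value in row]

# Column-store: the values of one column are adjacent.
column_store = [list(column) for column in zip(*rows)]

def read_row(columns, i):
    """Reassembling a single row touches every column."""
    return tuple(column[i] for column in columns)

ages = column_store[1]            # a single-column scan reads one list only
julia = read_row(column_store, 1)
```

A column family sits in between: the columns of a family would be stored as rows of that family, while separate families are stored apart.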
Document stores
• Assume there is some structure associated with the dataset
• The dataset is a document
  • Arbitrarily nested key-value sets
  • Embedded into the document
• The database is a collection of such documents, indexed by key in a B-tree
  • Different portions of the B-tree at different nodes
  • Collection partitioned and replicated at document level
• Ability to index on document attributes

{'user_id': objectid('123456789'),
 'line_items': [
   {'sku': 'jc_123', 'name': 'The best CD ever', 'price': 1099},
   {'sku': 'mi_0', 'name': 'Paper maps for iOS 6', 'price': 395}
 ],
 'shipping': {
   'street': 'Princes street',
   'city': 'Edinburgh',
   'country': 'UK',
   'note': 'First bench on left'
 },
 'subtotal': 1494,
 'tax': 268,
 'total': 1763
}
(De)normalisation
• The first thing you were taught in your undergraduate database course: schema design
• Normalisation is central to this notion: keep separate things separate
• For instance: students taking courses
  • If all information about all courses students take is inlined into their records, then what happens if a course changes information?
  • Must update all student records referring to that course
  • Local changes are not localised
• NoSQL systems usually argue for denormalisation
  • Related things will be retrieved together
  • Updates will be infrequent
Data design for NoSQL
• Data design is not based on functional dependencies, as in relational databases
• Workload-driven design
  • Figure out the use cases and appropriately design your data
  • In the previous example, if the workload usually requests the students and the courses they take, then embed course list in student record
  • If the workload requests the students taking specific courses, then embed student records in courses
  • If both, use both
• Query languages and queries are not as expressive in NoSQL stores
  • Or rather, if the intention is to retrieve anything other than what the representation was designed for, you're in trouble
Embedded documents

{title: "Schema design",
 content: "A long post on schema design for NoSQL DBs",
 comments: [
   {username: 'noob',
    text: 'How do you add nested comments?'
   },
   {username: 'expert',
    text: 'Hit ctrl+enter at the end of your comment.'
   },
   {username: 'noob',
    text: 'Thanks!'
   }
 ]
}
Arbitrarily nested embedded documents

{title: "Schema design",
 content: "A long post on schema design for NoSQL DBs",
 comments: [
   {username: 'noob',
    text: 'How do you add nested comments?',
    comments: [
      {username: 'expert',
       text: 'Hit ctrl+enter at the end of your comment.',
       comments: [
         {username: 'noob',
          text: 'Thanks!'
         }
       ]
      }
    ]
   }
 ]
}
How much should we (de)normalise?
• One extreme is complete normalisation, the other extreme is complete denormalisation
• More denormalised design
  • Larger document size
  • Harder and inefficient updates
  • More complex representation
  • Faster queries
• More normalised design
  • Maximum flexibility
  • Maximum update-ability
  • Simplified representation
  • More complicated and potentially slower queries
• Most NoSQL databases cannot do joins
  • And cannot really do much apart from path queries and selections
• The answer is, as usual, "it depends"
  • With NoSQL databases, data design plays a central role
  • No clean interface between conceptual and physical design and querying, as with relational databases
Databases and cloud computing BigTable
A different data model
• BigTable's data model is not relational
• A table is "a sparse, distributed, persistent multidimensional sorted map"
• The map is indexed by a triplet
  • (row:string, column:string, time:int64)
  • row and column are keys, time is a timestamp
• BigTables are mutable at the row level
  • Support for insertions, deletions, lookups
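A toy, single-machine version of such a map can be built on a sorted list of (row, column, time) keys. This is only a sketch of the data model: the class name `TinyTable` and its methods are invented for illustration and say nothing about how BigTable is implemented.

```python
import bisect

class TinyTable:
    """Sorted map indexed by (row, column, time); newest version first per cell."""

    def __init__(self):
        self._keys = []     # sorted list of (row, column, -time)
        self._values = {}

    def put(self, row, column, time, value):
        key = (row, column, -time)      # negated time sorts newest first
        if key not in self._values:
            bisect.insort(self._keys, key)
        self._values[key] = value

    def get(self, row, column):
        """Return the most recent version of a cell, or None if it is absent."""
        i = bisect.bisect_left(self._keys, (row, column, float("-inf")))
        if i < len(self._keys) and self._keys[i][:2] == (row, column):
            return self._values[self._keys[i]]
        return None

    def scan(self, start_row, end_row):
        """Yield all cells with start_row <= row < end_row, in key order."""
        i = bisect.bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, column, neg_time = self._keys[i]
            yield row, column, -neg_time, self._values[self._keys[i]]
            i += 1

table = TinyTable()
table.put("com.cnn.www", "contents:", 3, "<html>v3")
table.put("com.cnn.www", "contents:", 5, "<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, "CNN")
```

The map is sparse in the sense that a row stores only the cells that were actually written.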
Rows and columns in more detail
[Figure: row com.cnn.www with columns contents: (three versions "<html>..." at timestamps t3, t5, t6), anchor:cnnsi.com ("CNN" at t9) and anchor:my.look.ca ("CNN.com" at t8)]
• Rows are maintained in sorted lexicographic order
  • Applications can exploit this property for efficient row scans
  • Row ranges dynamically partitioned into tablets
• Columns grouped into column families
  • Column key = family:qualifier
  • Column families provide locality hints
  • Unbounded number of columns per table
Building blocks: SSTable
• The smallest and most basic building block
• Persistent immutable map from keys to values
  • Stored in GFS
  • Sequence of disk blocks with a (persistent) index for lookup
  • Memory-mapped for fast operation
• Two supported operations
  • Given a key, look up the value associated with it
  • Iterate over key/value pairs within a given key range
[Figure: an SSTable as a sequence of 64kB blocks followed by an index]
Building blocks: Tablets and Tables
• Dynamically partitioned range of rows
• Built from multiple SSTables
[Figure: a tablet (start: aardvark, end: apple) built from two SSTables, each a sequence of 64kB blocks with an index]
• Multiple tablets make up a table
• SSTables can be shared between tablets
[Figure: two tablets (aardvark to apple, applepie to boat) over four SSTables, one of which is shared]
Notes on the architecture
• Similar to GFS
  • Single master server, multiple tablet servers
• BigTable master
  • Assigns tablets to tablet servers
  • Detects addition and expiration of tablet servers
  • Balances tablet server load
  • Handles garbage collection
  • Handles schema evolution
• BigTable tablet servers
  • Each tablet server manages a set of tablets
    • Typically between ten and a thousand tablets
    • Each 100-200 MB by default
  • Handles read and write requests to the tablets
  • Splits tablets when they grow too large
Location dereferencing
[Figure: lookup hierarchy: Chubby file (master file) → root tablet (1st metadata level) → other metadata tablets → user tables 1, ..., n]
• Chubby: replicated, persistent lock service; maintains tablet server locations
• Root tablet: root of the metadata tree
• At most three levels in the metadata hierarchy
• B-tree like structure, indexed by table identifier and end row
Tablet assignment
• Master keeps track of
  • Set of live tablet servers
  • Assignment of tablets to tablet servers
  • Unassigned tablets
• Each tablet is assigned to one tablet server at a time
  • Tablet server maintains an exclusive lock on a file in Chubby
  • Master monitors tablet servers and handles assignment
• Changes to tablet structure
  • Table creation/deletion (master initiated)
  • Tablet merging (master initiated)
  • Tablet splitting (tablet server initiated)
Tablet serving and I/O flow
[Figure: writes go to the tablet log (in GFS) and to the memtable (in memory); reads are served from the memtable and the SSTables (in GFS)]
• Write operations are logged (in redo records)
• Recent updates kept sorted in main memory
• Memtable and SSTables are merged to serve the read request
Tablet management
• Minor compaction
  • Converts the memtable into an SSTable
  • Reduces memory footprint and log traffic on restart
• Merging compaction
  • Reads the contents of a few SSTables and the memtable, and writes out a new SSTable
  • Reduces number of SSTables
• Major compaction
  • Merging compaction that results in only one SSTable
  • No deletion records, only live data
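The write path, the merged read and two of the compactions can be mimicked with dictionaries. This is a teaching sketch under loud assumptions: the class `Tablet`, the `TOMBSTONE` sentinel for deletion records and the use of plain dicts as stand-ins for SSTables and the tablet log are all invented here.

```python
TOMBSTONE = object()   # hypothetical marker for a deletion record

class Tablet:
    def __init__(self):
        self.log = []         # redo records (stand-in for the tablet log)
        self.memtable = {}    # recent updates, held in memory
        self.sstables = []    # newest first; treated as immutable once written

    def write(self, key, value):
        self.log.append((key, value))     # log first, then apply in memory
        self.memtable[key] = value

    def delete(self, key):
        self.write(key, TOMBSTONE)

    def read(self, key):
        # Merged view: the memtable shadows SSTables, newer SSTables shadow older.
        for source in [self.memtable, *self.sstables]:
            if key in source:
                value = source[key]
                return None if value is TOMBSTONE else value
        return None

    def minor_compaction(self):
        """Freeze the memtable into a new SSTable; its log is no longer needed."""
        if self.memtable:
            frozen = dict(sorted(self.memtable.items(), key=lambda kv: kv[0]))
            self.sstables.insert(0, frozen)
            self.memtable, self.log = {}, []

    def major_compaction(self):
        """Rewrite everything into one SSTable that holds only live data."""
        self.minor_compaction()
        merged = {}
        for sstable in reversed(self.sstables):   # oldest first, newer overwrite
            merged.update(sstable)
        live = {k: v for k, v in sorted(merged.items(), key=lambda kv: kv[0])
                if v is not TOMBSTONE}
        self.sstables = [live]

tablet = Tablet()
tablet.write("a", 1)
tablet.write("b", 2)
tablet.minor_compaction()
tablet.write("a", 3)
tablet.delete("b")
```

At this point a read of "a" is answered from the memtable and a read of "b" hits the deletion record; only a major compaction physically removes "b".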
Databases and cloud computing Hive and Pig
High-level data processing
• Hive: data warehousing application in Hadoop
  • Query language is HQL, variant of SQL
  • Tables stored on HDFS as flat files
  • Developed by Facebook, now open source
• Pig: large-scale data processing system
  • Scripts are written in Pig Latin, a dataflow language
  • Developed by Yahoo!, now open source
  • Roughly 1/3 of all Yahoo! internal jobs
• Common idea
  • Provide higher-level language to facilitate large-data processing
  • Higher-level language is compiled to Hadoop jobs
Hive: background and components
• Started at Facebook⁴
  • Data was collected by nightly cron jobs into Oracle DB
  • Extract-transform-load (ETL) via hand-coded Python
  • Grew from 10s of GBs (2006) to 1TB/day new data (2007), now 10x that
• Shell: allows interactive queries
• Driver: session handles, fetch, execute
• Compiler: parse, plan, optimize
• Execution engine: DAG of stages (MR, HDFS, metadata processing)
• Metastore: schema, location in HDFS, SerDe
⁴ It had to be good for something apart from wasting my PhD students' time
Logical and physical models
• Tables
  • Typed columns (int, float, string, boolean)
  • Also: list, map
• Partitions
  • For example, range-partition tables by date
• Buckets
  • Hash partitions within ranges (useful for sampling, join optimization)
• Metastore
  • Database: namespace containing a set of tables
  • Holds table definitions (column types, physical layout)
  • Holds partitioning information
  • Can be stored in Derby, MySQL, and many other relational databases
Hive processing
• Hive uses HQL, a declarative query language close to SQL
• HQL statements are translated into a syntax tree
• Syntax tree is compiled into an execution plan of MapReduce jobs, executed by Hadoop

SELECT s.word, s.freq, k.freq
FROM shakespeare s JOIN bible k ON (s.word = k.word)
WHERE s.freq >= 1 AND k.freq >= 1
ORDER BY s.freq DESC
LIMIT 10;

[Figure: HQL query → abstract syntax tree → MapReduce plan (a sequence of map and reduce stages)]
Pig and Pig Latin
• Similar idea to Hive, but more tailored towards efficiency and a DB-like setting
• Script interface to deploy MapReduce jobs
  • Maintains schema and performs type checking
• Rudimentary optimiser to translate Pig scripts into an efficient physical dataflow
  • Sequence of one or more MapReduce jobs
  • Exploit heuristics and cost model to reduce intermediate data
• Dataflow is scheduled and executed
  • Runtime tracks job progress and any errors
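Example Pig Latin script
-- Load the visit log.
Visits = load '/data/visits' as (user, url, time);
-- Canonicalize is assumed to be a user-defined function registered elsewhere.
Visits = foreach Visits
         generate user, Canonicalize(url) as url, time;
Pages = load '/data/pages' as (url, pagerank);
-- Equi-join visits with page ranks on url.
VP = join Visits by url, Pages by url;
UserVisits = group VP by user;
-- After a group, the grouping key is available under the name "group".
UserPageranks = foreach UserVisits
                generate group as user, AVG(VP.pagerank) as avgpr;
-- Compare against a number rather than the string '0.5'.
GoodUsers = filter UserPageranks by avgpr > 0.5;
store GoodUsers into '/data/good_users';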
Java vs. Pig Latin
[Figure: two bar charts comparing Hadoop and Pig: lines of code (axis 20 to 180) and development time in minutes (axis 50 to 300)]
• Performance on par with raw Hadoop
• But with 1/20 of the lines of code
• And with 1/16 of the development time