Omid Efficient Transaction Management and Incremental Processing for HBase Copyright © 2013 Yahoo!...

Post on 14-Dec-2015

216 views 0 download

Tags:

Transcript of Omid Efficient Transaction Management and Incremental Processing for HBase Copyright © 2013 Yahoo!...

OmidEfficient Transaction Management and

Incremental Processing for HBase

Copyright © 2013 Yahoo! All rights reserved. No reproduction or distribution allowed without express written permission.

Daniel Gómez Ferro

21/03/2013

2Omid - Yahoo! Research

About me

Research engineer at Yahoo!

Main Omid developer

Apache S4 committer

Contributor to:• ZooKeeper• HBase

3Omid - Yahoo! Research

Outline

Motivation

Goal: transactions on HBase

Architecture

Examples

Future work

4Omid - Yahoo! Research

Motivation

Traditional DBMS:• SQL• Transactions• Scalable

NoSQL stores:• SQL• Transactions• Scalable

Some use cases require • Scalability• Consistent updates

5Omid - Yahoo! Research

Motivation

Requested feature• Support for transactions in HBase is a common request

Percolator• Google implemented transactions on BigTable

Incremental Processing• MapReduce latencies measured in hours• Reduce latency to seconds• Requires transactions

6Omid - Yahoo! Research

Incremental Processing

State0 State1SharedState

OfflineMapReduce

OnlineIncremental Processing

Fault toleranceScalability

Fault toleranceScalability

Shared state

Using Percolator, Google dramatically reduced crawl-to-index time

Transactions on HBase

8Omid - Yahoo! Research

Omid

‘Hope’ in Persian

Optimistic Concurrency Control system• Lock free• Aborts transactions at commit time

Implements Snapshot Isolation

Low overhead

http://yahoo.github.com/omid

9Omid - Yahoo! Research

Snapshot Isolation

Transactions read from their own snapshot

Snapshot:• Immutable• Identified by creation time• Values committed before creation time

Transactions conflict if two conditions apply:A) Overlap in time

B) Write to the same row

10Omid - Yahoo! Research

Transaction conflicts

r1 r2 r3 r4Time

Transaction B overlaps with A and C• B ∩ A = ∅• B ∩ C = { r4 }• B and C conflict

Transaction D doesn’t conflict

Write set:

r3 r4

r2 r4

r1 r3

Id:

A

B

C

D

Time overlap:

Omid Architecture

12

OmidClientsOmid

Clients

Architecture

1. Centralized server

2. Transactional metadata replicated to clients

3. Store transaction ids in HBase timestamps

HBaseRegionServers

Status Oracle (SO)

HBaseRegionServers

HBaseRegionServers

HBaseRegionServers

(1) (2)

(3)

Omid - Yahoo! Research

OmidClients

SO

13Omid - Yahoo! Research

Status Oracle

Limited memory• Evict old metadata• … maintaining consistency guarantees!

Keep LowWatermark • Updated on metadata eviction• Abort transactions older than LowWatermark• Necessary for Garbage Collection

14Omid - Yahoo! Research

Transactional Metadata

Status Oracle

Row LastCommitTS

r1 T20

… …

StartTS CommitTS

T10 T20

… …

Aborted

T2 T18 …

LowWatermark

T4

15Omid - Yahoo! Research

Metadata Replication

Transactional metadata is replicated to clients

Status Oracle not in critical path• Minimize overhead

Clients guarantee Snapshot Isolation• Ignore data not in their snapshot• Talk directly to HBase• Conflicts resolved by the Status Oracle

Expensive, but scalable up to 1000 clients

16

OmidClientsOmid

Clients

Architecture recap

HBaseRegionServers

HBaseRegionServers

HBaseRegionServers

HBaseRegionServers

Omid - Yahoo! Research

OmidClients

SO

Row LastCommitTS

r1 T20

… …

StartTS CommitTS

T10 T20

… …

Aborted

T2 T18 …

LowWatermark

T4

Status Oracle

17Omid - Yahoo! Research

Architecture + Metadata

Status Oracle

R LC

r1 20

ST CT

10 20

Abort

LowW

ST CT

10 20

Abort

LowW

Client

Conflict detection(not replicated)

HBase

row TS val

r1 10 hi

HBase values tagged with Start TimeStamp

18Omid - Yahoo! Research

Comparison with Percolator

Omid

ClientsHBase

StatusOracle

Percolator

ClientsBigTable

TS Server

19Omid - Yahoo! Research

Simple API

TransactionManager:Transaction begin();

void commit(Transaction t) throws RollbackException;

void rollback(Transaction t);

TTable:Result get(Transaction t, Get g);

void put(Transaction t, Put p);

ResultScanner getScanner(Transaction t, Scan s);

Example Transaction

21Omid - Yahoo! Research

ST CT

10 20

Abort

LowW

Client

Transaction Example

Status Oracle

R LC

r1 20

ST CT

10 20

Abort

LowW

HBase

row TS val

r1 10 hi

>> |

22Omid - Yahoo! Research

Status Oracle

ST CT

10 20

Abort

LowW

Client

Transaction Example

R LC

r1 20

ST CT

10 20

Abort

LowW

>> t = TM.begin()Transaction { ST: 25 }>>

t ST: 25

|

HBase

row TS val

r1 10 hi

23Omid - Yahoo! Research

Status Oracle

ST CT

10 20

Abort

LowW

Client

Transaction Example

R LC

r1 20

ST CT

10 20

Abort

LowW

>> t = TM.begin()Transaction { ST: 25 }>> TTable.get(t, ‘r1’)‘hi’>>

t ST: 25

|

HBase

row TS val

r1 10 hi

24Omid - Yahoo! Research

Status Oracle

ST CT

10 20

Abort

LowW

Client

Transaction Example

R LC

r1 20

ST CT

10 20

Abort

LowW

>> t = TM.begin()Transaction { ST: 25 }>> TTable.get(t, ‘r1’)‘hi’>> TTable.put(t, ‘r1’, ‘bye’)>>

t ST: 25R: {r1}

|

HBase

row TS val

r1 10 hi

r1 25 bye

25Omid - Yahoo! Research

Status Oracle

ST CT

10 20

Abort

LowW

Client

Transaction Example

R LC

r1 30

ST CT

25 30

Abort

LowW

20

>> TTable.get(t, ‘r1’)‘hi’>> TTable.put(t, ‘r1’, ‘bye’)>> TM.commit(t)Committed{ CT: 30 }>>

t ST: 25R: {r1}

|

HBase

row TS val

r1 10 hi

r1 25 bye

26Omid - Yahoo! Research

Status Oracle

ST CT

10 20

Abort

LowW

Client

Transaction Example

Abort

15

>> t_oldTransaction{ ST: 15 }>> TT.put(t_old, ‘r1’, ‘yo’)>> TM.commit(t_old)Aborted>>

t_oldST: 15R: {r1}HBase

row TS val

r1 15 yo

r1 25 bye

R LC

r1 30

ST CT

25 30

LowW

|

Centralized Approach

27Omid - Yahoo! Research

28Omid - Yahoo! Research

Omid Performance

Centralized server = bottleneck?• Focus on good performance• In our experiments, it never became the bottleneck

512 rows/t 1K TPS10 ms

2 rows/t100K TPS

5 ms

29Omid - Yahoo! Research

Fault Tolerance

Omid uses BookKeeper for fault tolerance• BookKeeper is a distributed Write Ahead Log

Recovery: reads log’s tail (bounded time)

Downtime < 25 s

Example Use Case

News Recommendation System

31Omid - Yahoo! Research

News Recommendation System

Model-based recommendations• Similar users belong to a cluster• Each cluster has a trained model

New article• Evaluate against clusters• Recommend to users

User actions• Train model• Split cluster

32Omid - Yahoo! Research

Transactions Advantages

All operations are consistent• Updates and queries

Needed for a recommendation system?• We avoid corner cases• Implementing existing algorithms is straightforward

Problems we solve• Two operations concurrently splitting a cluster• Query while cluster splits• …

33Omid - Yahoo! Research

Example overview

Index

Articles & User actions

Queries

IndexDataStoreWor

kers

Updates

34Omid - Yahoo! Research

Other use cases

Update indexes incrementally• Original Percolator use case

Generic graph processing

Adapt existing transactional solutions to HBase

Future Work

36Omid - Yahoo! Research

Current Challenges

Load isn’t distributed automatically

Complex logic requires big transactions• More likely to conflict• More expensive to retry

API is low level for data processing

37Omid - Yahoo! Research

Future work

Framework for Incremental Processing

• Simpler API

• Trigger-based, easy to decompose operations

• Auto scalable

Improve user friendliness• Debug tools, metrics, etc

Hot failover

Questions?

All time contributorsBen Reed

Flavio Junqueira

Francisco Pérez-Sorrosal

Ivan Kelly

Matthieu Morel

Maysam Yabandeh

Partially supported byCumuloNimbo project (ICT-257993)

funded by the European Commission

Try it!yahoo.github.com/omid

Backup

39Omid - Yahoo! Research

40Omid - Yahoo! Research

Commit Algorithm

Receive commit request for txn_i

Set CommitTimestamp(txn_i) <= new timestamp

For each modified row:• if LastCommit(row) > StartTimestamp(txn_i) => abort

For each modified row:• Set LastCommit(row) <= CommitTimestamp(txn_i)