Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available...

Post on 24-May-2020

53 views 0 download

Transcript of Omid, Reloaded Scalable and Highly-Available Transaction ... · Scalable and Highly-Available...

Omid, ReloadedScalable and Highly-Available Transaction Processing

Ohad Shacham Francisco PerezSorrosal

Edward Bortnikov Eshcar Hillel

Idit Keidar Ivan Kelly Sameer ParanjpyeMatthieu Morel

Dynamic indexing at Yahoo!

Real-TimeSearch Index

Indexing Pipeline

Content Processing Serving

Que

ries

Crawl feed

Document store

Web indexing pipeline

Crawl DocprocLink

Analysis Queue

Crawlschedule

Content

QueueLinks

STORM

HBase

Transactional API over HBaseSnapshot Isolation ConsistencyScalable, Highly AvailableDatabase-neutral

Apache incubator project

Omid – transactions for key-value store

Transactions and snapshot isolation

Aborts only on write-write conflicts

Read point Write point

begin commitread(x) write(y) write(x) read(y)

Architecture

Transaction ManagerClient

Begin/Commit

Datastore Datastore Datastore Committable

Results/Timestamp

Atomic commitGet commit info

Read/Write

Conflict Detection

Transaction ManagerClient

Begin

Datastore Datastore Datastore Committable

t1

Write (k1, v1, t1)Write (k2, v2, t1)

Read (k’, last committed t’ < t1)

(k1, v1, t1) (k2, v2, t1)

Running exampletr = t1

Transaction ManagerClient

Commit: t1, {k1, k2}

Datastore Datastore Datastore Committable

t2

(k1, v1, t1) (k2, v2, t1)

Write (t1, t2)

(t1, t2)

Running exampletr = t1tc = t2

Transaction ManagerClient

Datastore Datastore Datastore Committable

Read (k1, t3)

(k1, v1, t1) (k2, v2, t1) (t1, t2)

Read (t1)

Running example

tr = t3

Transaction ManagerClient

Datastore Datastore Datastore Committable

t2

(t1, t2)(k1, v1, t1, t2) (k2, v2, t1, t2)

Remove (t1)

Post committr = t1tc = t2

Update commit

cells

Datastore(k1, v1, t1, t2)

Transaction Manager

Datastore Datastore Committable

Read (k1, t3)

(k2, v2, t1, t2)

Commit cells

Client

tr = t3

High availability

Transaction ManagerClient

Begin/Commit/Abort

Datastore Datastore Datastore Committable

Results/Timestamp

Atomic commitGet commit info

Read/Write

Single point of failure

Transaction Manager

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

Architecture with HA

Transaction Manager

Split brain

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

Avoid Split Brain

Transaction Manager

Our HA solution

Transaction ManagerClient

Datastore Datastore Datastore Committable

Recovery state

ABORT

Infrequent access

§ No runtime overhead in mainstream execution• Minor overhead after failover

§ Leases for leader election• Local lease status check before/after writing to Commit Table

High availability

050

100150200250300350400450500550

Omid1 Omid1 Non Durable

Omid Omid Non Durable

Tps

* 103

0

500

1000

1500

2000

2500

document inversion duplicate detection out-link processing in-link processing stream to runtime

Late

ncy

(ms)

Commit + CT update BeginCompute ReadUpdate

0

50

100

150

200

250

15 300 500 2000 8000 45000 55000 70000

Tim

e (m

s)

Txn / Sec[Batch size]

Queueing (avg)HBASE write (avg)Network and CD (avg)

[10000][10000][240][240][50][25][1] [10000]

§ OLTP on-top-of HBase

§ Integrating Omid as Phoenix transaction processing engine

§ Augment semantics to support secondary index creation

§ Augment Omid for concurrent transaction execution

Integration with Apache Phoenix

Summary

Transactions – important use case in stream processing

Snapshot Isolation – scalable consistency model

Omid – an apache incubator TP system for HBaseBattle-tested, Web-scale, HA