Databus - LinkedIn's Change Data Capture Pipeline
Databus
LinkedIn’s Change Data Capture Pipeline
Databus Team @ LinkedIn
Sunil Nagaraj
http://www.linkedin.com/in/sunilnagaraj
Eventbrite, May 07 2013
Talking Points
– Motivation and Use-Cases
– Design Decisions
– Architecture
– Sample Code
– Performance
– Databus at LinkedIn
– Review
The Consequence of Specialization in Data Systems
Data consistency is critical!
Data flow is essential.

Two Ways
1. Extract changes from the database commit log
– Tough, but possible
– Consistent!
2. Application code dual-writes to the database and a pub-sub system
– Easy on the surface
– Consistent?
Change Extract: Databus
[Diagram: updates go to the primary data store; Databus carries the data
change events through standardization stages into a search index, a graph
index, and read replicas.]
Example: External Indexes
Description
– Full-text and faceted search over profile data
Requirements
– Timeline consistency
– Guaranteed delivery
– Low latency
– User-space visibility
[Diagram: members update their skills on linkedin.com; change events flow
through Databus into the People Search Index, which serves search results to
recruiters on recruiter.linkedin.com.]
A brief history of Databus
2006-2010 : Databus became an established and vital piece of infrastructure for consistent data flow from Oracle
2011 : Databus (V2) addressed scalability and operability issues
2012 : Databus supported change capture from Espresso
2013 : Open Source Databus – https://github.com/linkedin/databus
Databus Eco-system: Participants
[Diagram: change data flows from the primary data store (source) through
Databus change data capture into the change event stream, and on as events
to consumer applications.]
Source
• Supports transactions
Databus
• Extracts the changed data of committed transactions
• Transforms it into 'user-space' events
• Preserves atomicity
Consumer
• Receives change events quickly
• Preserves consistency with the source
Databus Eco-System : Realities
[Diagram: consumers vary (a fast consumer wants the changes of the last 5
seconds, a new consumer wants the changes since last week, a slow consumer
wants every change) while source schemas evolve.]
• The source cannot be burdened by 'long look-back' extracts
• Applications cannot all be forced to move to the latest version of the schema at once
Key Design Decisions : Semantics
Change data capture uses logical clocks attached to the source (SCN)
– The change data stream is ordered by SCN
– Simplifies data portability: the change stream is f(SourceState, SCN)
Applications are idempotent
– At-least-once delivery
– Progress is tracked reliably (SCN)
– Timeline consistency
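The combination of at-least-once delivery and idempotent, SCN-tracking consumers can be sketched as follows. This is purely illustrative (the class and method names are not the actual Databus API): a consumer that skips any event at or below its last-applied SCN applies each change exactly once even when the stream redelivers after a restart.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch (not the Databus API): an idempotent consumer that
// tracks progress via the last-applied SCN, so redelivered events under
// at-least-once delivery are skipped rather than applied twice.
public class IdempotentConsumer {
    private long lastAppliedScn = -1;           // would be persisted in practice
    private final List<String> applied = new ArrayList<>();

    // Each change event carries the logical clock (SCN) of its transaction.
    public void onEvent(long scn, String changeData) {
        if (scn <= lastAppliedScn) {
            return;                             // duplicate delivery: ignore
        }
        applied.add(changeData);                // apply the change exactly once
        lastAppliedScn = scn;                   // then advance the checkpoint
    }

    public long getLastAppliedScn() { return lastAppliedScn; }
    public List<String> getApplied() { return applied; }

    public static void main(String[] args) {
        IdempotentConsumer c = new IdempotentConsumer();
        c.onEvent(100, "update A");
        c.onEvent(101, "update B");
        c.onEvent(101, "update B");             // redelivery after a restart
        System.out.println(c.getApplied().size() + " events applied, SCN=" + c.getLastAppliedScn());
    }
}
```

Because the checkpoint is the SCN of the source's own logical clock, the consumer can also reason about how far behind the source it is.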
Key Design Decisions : Systems
Isolate fast consumers from slow consumers
– Workload separation between online (recent), catch-up (old), and bootstrap (all)
Isolate sources from consumers
– Schema changes
– Physical layout changes
– Speed mismatch
Schema-awareness
– Compatibility checks
– Filtering at the change stream
The Components of Databus
[Diagram: the Relay runs change capture against the DB and holds change data
in an in-memory event buffer; online changes flow to the Databus client
inside each application. The Bootstrap service (log store plus snapshot
store) consumes the same online changes, serving a consistent snapshot to
new applications and older changes to slow applications. Shared metadata
ties the components together.]
Change Data Capture
Contains the logic to extract changes from the source from a specified SCN.
Implementations
– Oracle: trigger-based; commit ordering; special instrumentation required
– MySQL: custom-storage-engine based
EventProducer API
– start(SCN) // capture changes from the specified SCN
– SCN getSCN() // return the latest SCN
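The EventProducer contract above is small: begin capture from a given SCN and report the latest SCN seen. A minimal sketch, with method names taken from the slide but all surrounding types illustrative rather than the real Databus interfaces:

```java
// Sketch of the change-capture contract from the slide (start(SCN),
// getSCN()); the toy in-memory implementation below is illustrative only.
public class EventProducerSketch {

    interface EventProducer {
        void start(long scn);   // capture changes from the specified SCN
        long getSCN();          // return the latest SCN captured so far
    }

    // Toy producer that simply advances its SCN as committed "changes" arrive,
    // in commit order, starting from the requested SCN.
    static class InMemoryProducer implements EventProducer {
        private long scn = -1;
        public void start(long fromScn) { scn = fromScn; }
        public long getSCN() { return scn; }
        void onCommit(long commitScn) { scn = Math.max(scn, commitScn); }
    }

    public static void main(String[] args) {
        InMemoryProducer p = new InMemoryProducer();
        p.start(10);            // resume capture from SCN 10
        p.onCommit(12);
        System.out.println("latest SCN: " + p.getSCN());
    }
}
```

A real implementation (triggers for Oracle, a custom storage engine for MySQL) does the hard part: reconstructing committed changes from the source in commit order.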
MySQL : Change Data Capture
[Diagram: a MySQL master replicates to a MySQL slave via MySQL replication;
the Relay attaches over a TCP channel.]
• MySQL replication takes care of
– bin-log parsing
– the protocol between master and slave
– handling restarts
• The Relay
– provides a TCP protocol interface to push events
– controls and manages the MySQL slave
Publish – Subscribe API
[Diagram: change data capture extracts (src, SCN) from the DB and publishes
into the in-memory event buffer; consumers subscribe with (src, SCN).]
EventBuffer
– startEvents() // e.g. new txn
– appendEvent(DbusEvent, ...) // DbusEvent(enc(schema, changeData), src, pk)
– endEvents(SCN) // e.g. end of txn; commit
– rollbackEvents() // abort this window
Consumer
– register(source, Callback)
– onStartConsumption() // once
– onStartDataEventSequence(SCN)
– onStartSource(src, Schema)
– onDataEvent(DbusEvent e, ...)
– onEndSource(src, Schema)
– onEndDataEventSequence(SCN)
– onRollback(SCN)
– onStopConsumption() // once
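The publish side of this API gives transactions their atomicity: events are appended into a window that becomes visible only at endEvents(SCN), or vanishes on rollback. A hedged sketch of that write protocol, using the method names above but an invented in-memory buffer rather than the real Databus EventBuffer:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of the event-buffer write protocol from the slide:
// events of one transaction are appended inside a window that is made
// visible atomically at endEvents(SCN), or discarded by rollbackEvents().
public class EventBufferSketch {
    private final List<String> committed = new ArrayList<>(); // visible to consumers
    private final List<String> window = new ArrayList<>();    // current txn window
    private long lastScn = -1;

    public void startEvents() { window.clear(); }             // e.g. new txn
    public void appendEvent(String event) { window.add(event); }
    public void endEvents(long scn) {                         // end of txn: commit window
        committed.addAll(window);
        window.clear();
        lastScn = scn;
    }
    public void rollbackEvents() { window.clear(); }          // abort this window

    public List<String> getCommitted() { return committed; }
    public long getLastScn() { return lastScn; }

    public static void main(String[] args) {
        EventBufferSketch buf = new EventBufferSketch();
        buf.startEvents();
        buf.appendEvent("member:42 updated");
        buf.endEvents(100);                                   // committed txn
        buf.startEvents();
        buf.appendEvent("member:43 updated");
        buf.rollbackEvents();                                 // aborted txn leaves no trace
        System.out.println(buf.getCommitted().size() + " events at SCN " + buf.getLastScn());
    }
}
```

The consumer-side callbacks mirror this: onStartDataEventSequence/onEndDataEventSequence bracket a transaction window, and onRollback tells the consumer to discard partial work.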
The Databus Change Event Stream
• Provides APIs to obtain change events; a query specifies the logical clock (SCN) and the source: 'get change events greater than SCN'
• Filtering at the source is possible
– MOD and RANGE filter functions applied to the primary key of the event
• Batching/chunking guarantees progress
• Does not contain consumer state; contains references to metadata and schemas
• Implementation
– HTTP server exposing a REST API
– Persistent connections to clients
• Served by the Relay (online changes, from the in-memory event buffer) and by the Bootstrap service (log store and snapshot store)
Meta-data Management
Event definition, serialization, and transport
– Avro
Oracle, MySQL
– The table schema generates the Avro definition
Schema evolution
– Only backwards-compatible changes are allowed
– Applications are isolated from changes in the source schema
– Many versions of a source may be in use by applications, but only one version (the latest) of the change stream exists
The Databus Relay
[Diagram: the Relay hosts change capture and the in-memory event buffer,
along with database schemas and source metadata, behind the SCN store API.]
• Encapsulates the change capture logic and the change event stream
• Source-aware and schema-aware
• Multi-tenant: multiple event buffers representing the change events of different databases
• Optimizations
– An index on SCN quickly locates the physical offset in the event buffer
– The SCN per source is stored locally for efficient restarts
– Large event buffers are possible (> 2 GB)
Scaling Databus Relay
Peer relays, independent
• Each relay reads directly from the source DB
• Load on the source DB increases with each additional relay instance
Relays in a leader-follower cluster
• Only the leader reads from the DB; followers read from the leader
• Leadership is assigned dynamically
• There is a small period of stream unavailability during leadership transfer
The Bootstrap Service
Bridges the continuum between stream and batch systems
• A catch-all for slow and new consumers
• Isolates the source instance from large scans
• The snapshot store has to be seeded once from the database
• Optimizations
– Periodic merge
– Filtering pushed down to the store
– Catch-up versus full bootstrap
• Guaranteed progress for consumers via chunking
• Multi-tenant: can contain data from many different databases
• Implementations
– Database (MySQL)
– Raw files
[Diagram: online changes from the Relay feed the Bootstrap service's log
store and snapshot store, which serve bootstrap consumers.]
The Databus Client Library
• The glue between the Databus change stream and the business logic in the consumer
• Switches between relay and bootstrap as needed
• Optimizations
– Change events are written with a batch write API, without deserialization
• Periodically persists the SCN for lossless recovery
• Built-in support for parallelism
– Consumers need to be thread-safe
– Useful for scaling large batch processing (bootstrap)
[Diagram: the client reads from the Databus change stream into a local event
buffer; a dispatcher iterates over the events and invokes callbacks on
stream and bootstrap consumers, persisting progress via the SCN store API.]
Databus Applications
[Diagram: an application hosts consumers S1, S2, ..., Sn behind the Databus
client, each attached to its own change stream.]
• Applications can process multiple independent change streams
• Failure of one stream won't affect the others
• Different logic and configuration settings are possible for bootstrap and online consumption
• Processing can be tied to a particular version of a schema
• The SCN persisted by the client library can be overridden
Scaling Applications - I
[Diagram: the change stream is partitioned by i = pk MOD N; one client
application handles partitions i = 0..k-1, another handles i = k..N-1.]
• Databus clients consume partitioned streams
• Partitioning strategy: range or hash
• The partitioning function is applied at the source
• The number of partitions (N) and the list of partitions (i) are specified statically in configuration
– Not easy to add or remove nodes: this needs a configuration change on all nodes
• Client nodes are uniform: any node can process any partition(s)
• Clients distribute the processing load
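The MOD partitioning above can be sketched as a simple filter: an event with primary key pk belongs to partition pk MOD N, and each node statically accepts a contiguous range of partitions. The class and parameter names here are illustrative, not the actual Databus filter configuration:

```java
// Illustrative sketch of MOD partitioning: the filter is applied at the
// change stream, and each client node subscribes to a fixed range of the
// N partitions, where an event maps to partition pk MOD N.
public class ModPartitionFilter {
    private final int numPartitions;    // N, fixed in configuration
    private final int minPartition;     // this node handles [min, max]
    private final int maxPartition;

    public ModPartitionFilter(int numPartitions, int minPartition, int maxPartition) {
        this.numPartitions = numPartitions;
        this.minPartition = minPartition;
        this.maxPartition = maxPartition;
    }

    // True if the event with this primary key belongs to this node.
    public boolean accepts(long primaryKey) {
        long partition = Math.floorMod(primaryKey, (long) numPartitions);
        return partition >= minPartition && partition <= maxPartition;
    }

    public static void main(String[] args) {
        // Two nodes split N = 4 partitions: node A takes 0..1, node B takes 2..3.
        ModPartitionFilter nodeA = new ModPartitionFilter(4, 0, 1);
        ModPartitionFilter nodeB = new ModPartitionFilter(4, 2, 3);
        System.out.println(nodeA.accepts(5));  // 5 mod 4 = 1, so node A
        System.out.println(nodeB.accepts(7));  // 7 mod 4 = 3, so node B
    }
}
```

The static scheme on this slide bakes (N, min, max) into every node's configuration; the dynamic scheme on the next slide keeps N fixed but lets a cluster manager reassign the partition ranges as nodes join and leave.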
Scaling Applications - II
[Diagram: the Databus stream is partitioned by i = pk mod N; the N
partitions are distributed evenly amongst 'm' client nodes, with dynamically
allocated partitions, and SCNs are written to a central location.]
• Databus clients consume partitioned streams
• Partitioning strategy: MOD
• The partition function is applied at the source
• The number of partitions (N) and the cluster name are specified statically in configuration
• Easy to add or remove nodes
– Dynamic redistribution of partitions
– Fault tolerance for client nodes
Databus: Current Implementation
• OS: Linux; written in Java; runs on Java 6
• All components have HTTP interfaces
• Databus client: Java
– Other language bindings are possible
– All communication with the change stream is via HTTP
• Libraries
– Netty, for HTTP clients and servers
– Avro, for serialization of change events
– Helix, for cluster awareness
Sample Code: Simple Application
Sample Code - Consumer
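The code from these two slides is not preserved in this transcript. Below is a hedged reconstruction of the shape of a simple consumer, using the callback names listed in the Publish-Subscribe API slide; the interface and signatures are simplified stand-ins, and the real open-source client API (github.com/linkedin/databus) differs in detail:

```java
// Illustrative sketch of a Databus consumer; the interface below is a
// simplified stand-in for the client library's callback API, not the
// actual open-source signatures.
public class SampleConsumer {

    interface DatabusConsumer {
        void onStartConsumption();                    // once
        void onStartDataEventSequence(long scn);      // begin txn window
        void onDataEvent(String src, byte[] payload); // one change event
        void onEndDataEventSequence(long scn);        // txn window complete
        void onStopConsumption();                     // once
    }

    static class LoggingConsumer implements DatabusConsumer {
        int eventCount = 0;

        public void onStartConsumption() { System.out.println("consumption started"); }
        public void onStartDataEventSequence(long scn) { /* reset per-txn state */ }
        public void onDataEvent(String src, byte[] payload) {
            eventCount++;                             // business logic goes here
        }
        public void onEndDataEventSequence(long scn) {
            // Safe point: the whole transaction window has been delivered,
            // so progress (the SCN) may be persisted here.
        }
        public void onStopConsumption() { System.out.println("consumption stopped"); }
    }

    public static void main(String[] args) {
        // The real client library would drive these callbacks after
        // register(source, callback); here we simulate one txn window.
        LoggingConsumer c = new LoggingConsumer();
        c.onStartConsumption();
        c.onStartDataEventSequence(100);
        c.onDataEvent("member_profile", new byte[]{1, 2});
        c.onEndDataEventSequence(100);
        c.onStopConsumption();
        System.out.println("events: " + c.eventCount);
    }
}
```

In the real library the application only implements the callbacks and registers them; the dispatcher decides when each is invoked and whether events come from the relay or the bootstrap service.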
Databus Performance : Relay
• The relay saturates the network with low CPU utilization
– CPU utilization increases with more clients
– An increased poll interval (higher consumer latency) reduces CPU utilization
• Scales to hundreds of consumers (client instances)
Performance: Relay Throughput (5/10/13)
[Chart not included in the transcript.]
Databus Performance : Consumer
• Latency is primarily governed by the poll interval
• Low overhead of the library in event fetch
• Spikes in latency occur when the network saturates at the relay
• Scaling the number of consumers
– Use partitioned consumption (filtering at the relay): this reduces network utilization, with some increase in latency due to filtering
– Increase the poll interval and tolerate higher latencies
Performance: Consumer Throughput
[Chart not included in the transcript.]
Performance: End-End Latency
[Chart not included in the transcript.]
Databus Bootstrap : Performance
• Should we serve from the catch-up store or the snapshot store?
– It depends on where the traffic pattern falls in the spectrum from 'all updates' to 'all inserts'
– Tune the service depending on the fraction of updates and inserts
• Favour snapshot-based serving for update-heavy traffic
Bootstrap Performance: Snapshot vs Catch-up
[Chart not included in the transcript.]
Databus Service
[Diagram: managed Oracle and Espresso change event streams.]
• The Databus change stream is a managed service
• Applications discover/look up the coordinates of sources
• Multi-tenant, chained relays
• Many sources can be bootstrapped from SCN 0 (the beginning of time)
• Automated change stream provisioning is a work in progress
Databus at LinkedIn
Databus at LinkedIn : Monitoring
Available out of the box as JMX MBeans
Metrics for health
– The lag between the update time at the DB and the time at which the change was received by the application
– The time of last contact with the change event stream and the source
Metrics for capacity planning
– Event rate and size
– Request rate
– Threads and connections
Databus at LinkedIn: The Good
Source isolation: bootstrap benefits
– Typically, data is extracted from sources just once (seeding)
– The bootstrap service is used during the launch of new applications
– The primary data store is not subject to unpredictable high loads from lagging applications
Common data format
– Avro offers ease of use, flexibility, and performance improvements (larger retention periods for change events in the relay)
Partitioned stream consumption
– Applications are horizontally scaled to hundreds of instances
Databus at LinkedIn: Operational Niggles
Oracle change capture performance bottlenecks
– Complex joins
– BLOBs and CLOBs
– Contention on the trigger table driven by high update rates
Bootstrap: snapshot store seeding
– Consistent snapshot extraction from large sources
Semi-automated change stream provisioning
Quick Review
Specialization in data systems
– The CDC pipeline is a first-class infrastructure citizen, up there with stores and indexes
Source independence
– Change capture logic can be plugged in
Use of SCN, an external clock attached to the source
– Makes the change stream more 'portable'
– Easy for applications to reason about consistency with the source
The pub-sub API supports the atomicity semantics of transactions
Bootstrap service
– Isolates the source from abusive scans
– Serves both streaming and batch use-cases
Questions
Additional Slides
The Timeline Consistent Data Flow problem
Databus: First Attempt (2007)
Issues
– Source database pressure caused by slow consumers
– Brittle serialization