Bitly // Data Driven NYC // November 2014
Transcript of Bitly // Data Driven NYC // November 2014
DISTRIBUTION
Messages
NSQ
Worker A
Worker A
Worker A
Worker B
All Worker A instances share the workload and, in aggregate, process a single copy of every message
Scale out Data Processing
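The distribution idea on this slide can be sketched in a few lines of Python. This is a toy in-memory model, not NSQ itself and not Bitly's code: the `Channel` class, the `worker_a` name, and the round-robin delivery order are all assumptions made for illustration. The point it shows is that several workers reading from the same channel split one stream of messages between them.

```python
from collections import deque
from itertools import cycle


class Channel:
    """Toy stand-in for a single channel: each message goes to exactly one consumer."""

    def __init__(self, name):
        self.name = name
        self.queue = deque()
        self.consumers = []

    def subscribe(self, handler):
        self.consumers.append(handler)

    def put(self, message):
        self.queue.append(message)

    def drain(self):
        # Round-robin delivery is an assumption of this sketch, chosen to keep
        # the output deterministic; a real broker decides delivery differently.
        if not self.consumers:
            return
        workers = cycle(self.consumers)
        while self.queue:
            next(workers)(self.queue.popleft())


def run_demo(n_messages=6, n_workers=3):
    """Attach n_workers 'Worker A' instances to one channel and return what each saw."""
    channel = Channel("worker_a")  # hypothetical channel name
    seen_per_worker = [[] for _ in range(n_workers)]
    for seen in seen_per_worker:
        channel.subscribe(seen.append)
    for i in range(n_messages):
        channel.put(f"msg-{i}")
    channel.drain()
    return seen_per_worker


if __name__ == "__main__":
    for index, seen in enumerate(run_demo()):
        print(f"Worker A #{index}: {seen}")
```

Adding another Worker A instance only means one more `subscribe` call on the same channel; the total number of messages processed stays the same, while each worker handles a smaller share.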
DECOUPLING
Worker A and Worker B each get a copy of all the messages
Messages
NSQ
Worker A
Worker B
Publish / Subscribe
AKA Multicast
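The publish/subscribe behaviour can be sketched with a similar toy model. Again this is an illustrative in-memory stand-in rather than NSQ's actual implementation; the `Topic` class and the `decodes`, `worker_a`, and `worker_b` names are hypothetical. It shows that every channel attached to a topic receives its own copy of each published message.

```python
from collections import defaultdict


class Topic:
    """Toy stand-in for a topic that fans each message out to every channel."""

    def __init__(self, name):
        self.name = name
        self.channels = defaultdict(list)  # channel name -> that channel's own queue

    def channel(self, channel_name):
        # Creating a channel is just asking for it by name.
        return self.channels[channel_name]

    def publish(self, message):
        # Multicast: append one copy of the message to every channel's queue.
        for queue in self.channels.values():
            queue.append(message)


def run_demo():
    topic = Topic("decodes")               # hypothetical topic name
    worker_a = topic.channel("worker_a")   # hypothetical channel names
    worker_b = topic.channel("worker_b")
    for i in range(3):
        topic.publish(f"msg-{i}")
    return worker_a, worker_b


if __name__ == "__main__":
    queue_a, queue_b = run_demo()
    print("Worker A channel:", queue_a)
    print("Worker B channel:", queue_b)
```

Because each channel keeps its own queue, a slow or stopped Worker B does not affect what Worker A receives, which is the decoupling the slide describes.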
IN PRACTICE @ Bitly
Bitly’s Data Science team wants to research the correlation between where a brand’s audience is active and conversion.
Can you set them up to access our data?
IN PRACTICE @ Bitly
NSQ
Metrics
Archive to Disk
Realtime Data Analysis
HDFS for Offline Analysis
Decoupling independent data needs makes this easy to solve
ENRICHMENT
Raw Decode
{ .... "bitly_user_hash_identifier": "1xTDx93", "LongURL": "http://espn.com/", "timestamp": "1416331248", … }
Annotated Decode
{ .... "bitly_user_hash_identifier": "1xTDx93", "LongURL": "http://espn.com/", "timestamp": "1416331248", "Geo_region": "NY", "Topic": "news, sports", … }
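The step from a raw decode to an annotated decode can be sketched as a small function. This is a minimal illustration under stated assumptions, not Bitly's enrichment code: the `annotate` function, the stub lookups, and the `TOPICS_BY_HOST` table are hypothetical, and the slide does not say which fields the geo or topic values are derived from. Only the field names and sample values come from the slide.

```python
from urllib.parse import urlparse

# Sample raw decode, using only the fields visible on the slide.
RAW_DECODE = {
    "bitly_user_hash_identifier": "1xTDx93",
    "LongURL": "http://espn.com/",
    "timestamp": "1416331248",
}

# Hypothetical topic table keyed by host name, for illustration only.
TOPICS_BY_HOST = {"espn.com": "news, sports"}


def stub_geo_lookup(raw):
    # Placeholder: a real lookup would use data not shown on the slide.
    return "NY"


def stub_topic_lookup(long_url):
    host = urlparse(long_url).hostname
    return TOPICS_BY_HOST.get(host, "unknown")


def annotate(raw, geo_lookup, topic_lookup):
    """Return a copy of the raw decode with Geo_region and Topic added."""
    annotated = dict(raw)  # copy, so the raw message is left untouched
    annotated["Geo_region"] = geo_lookup(raw)
    annotated["Topic"] = topic_lookup(raw["LongURL"])
    return annotated


if __name__ == "__main__":
    print(annotate(RAW_DECODE, stub_geo_lookup, stub_topic_lookup))
```

Returning a new dictionary rather than mutating the input keeps the raw message intact, so one consumer can read the raw stream while another reads the annotated one.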
INTEGRATION
NSQ
NSQ
NSQ
NSQ
Bitly Brand Tools Customers R&D
In House DMP
Third party analytics
Marketing Cloud