Puppy playdate
Uploaded by: austin-ouyang
Category: Data & Analytics
Transcript of Puppy playdate
DEMO
Provide analytics to help dog owners determine the best times and locations to walk their dog
Santa Cruz: April? Fridays?
San Mateo: Tuesdays?
Santa Clara: Sundays?
Alameda: October?
Data pipeline
Inputs: user messages and meet-up requests from User 1, User 2, ..., User N
Stages: Data Ingestion, Distributed File System, Real-time Processing, Datastore, User Interface
Layers: Batch and Real-time Layer, Serving Layer
Data ingestion
Inputs: user messages and meet-up requests from User 1, User 2, ..., User N
{"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"], "creatorID": 15854, "senderID": 844090, "rank": 0, "messageID": 622878, "message": "Let's meet up at 2PM today!"}
Message topic (JSON format)
Downstream: real-time processing, distributed file storage
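A minimal sketch of reading one such message, in plain Python with only the standard library. It assumes the "timestamp" array is ordered [year, month, day, hour, minute, second] and the "county" array is [county name, state abbreviation]; both readings are inferred from the sample above, and parse_message is a hypothetical helper name, not part of the original pipeline.

```python
import json
from datetime import datetime

# Sample message copied from the slide.
RAW = (
    '{"timestamp": [2015, 1, 23, 13, 55, 0], '
    '"county": ["Dundy County", "NE"], '
    '"creatorID": 15854, "senderID": 844090, "rank": 0, '
    '"messageID": 622878, "message": "Let\'s meet up at 2PM today!"}'
)


def parse_message(raw: str) -> dict:
    """Decode one JSON message into a flat record (hypothetical helper)."""
    record = json.loads(raw)
    # Assumption: county is [name, state] and timestamp is [Y, M, D, h, m, s].
    county, state = record["county"]
    return {
        "state": state,
        "county": county,
        "sent_at": datetime(*record["timestamp"]),
        "message_id": record["messageID"],
        "text": record["message"],
    }


parsed = parse_message(RAW)
```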
Batch and real-time layer
Components: Distributed File System, Real-time Processing, Datastore
{"timestamp": [2015, 1, 23, 13, 55, 0], "county": ["Dundy County", "NE"], "creatorID": 15854, "senderID": 844090, "rank": 0, "messageID": 622878, "message": "Let's meet up at 2PM today!"}
( (state, county), 1 )
( (state, county, year+month), 1 )
( (state, county, year+month+day), 1 )
( (state, county), json(message) )
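The four key-value pairs can be sketched in plain Python as below. This is an illustration, not the deck's actual job: emit_pairs is a hypothetical name, and the integer encodings (YYYYMM for year+month, YYYYMMDD for year+month+day) are an assumption based on the DATE(INT) values shown in the serving-layer tables.

```python
import json


def emit_pairs(message: dict) -> list:
    """Turn one decoded message into the four (key, value) pairs (hypothetical helper)."""
    # Assumption: timestamp starts with [year, month, day]; county is [name, state].
    year, month, day = message["timestamp"][:3]
    county, state = message["county"]
    return [
        ((state, county), 1),
        ((state, county, year * 100 + month), 1),                 # e.g. 201501
        ((state, county, year * 10000 + month * 100 + day), 1),   # e.g. 20150123
        ((state, county), json.dumps(message)),
    ]


sample = {
    "timestamp": [2015, 1, 23, 13, 55, 0],
    "county": ["Dundy County", "NE"],
    "creatorID": 15854,
    "senderID": 844090,
    "rank": 0,
    "messageID": 622878,
    "message": "Let's meet up at 2PM today!",
}
pairs = emit_pairs(sample)
```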
reduceByKey(_ + _) on the count pairs sums the 1s for each key, producing per-key message counts.
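To make the reduce step concrete, here is a plain-Python stand-in for a reduce-by-key over (key, value) pairs. It is not the Spark call itself, only a single-machine sketch of the same idea; reduce_by_key is a hypothetical helper and the sample pairs are invented for illustration.

```python
from operator import add


def reduce_by_key(pairs, func):
    """Combine all values that share a key using func (single-machine sketch)."""
    reduced = {}
    for key, value in pairs:
        if key in reduced:
            reduced[key] = func(reduced[key], value)
        else:
            reduced[key] = value
    return reduced


# Illustrative pairs keyed by (state, county, year+month); not real data.
sample_pairs = [
    (("CA", "Santa Clara County", 201502), 1),
    (("CA", "Santa Clara County", 201502), 1),
    (("CA", "San Mateo County", 201502), 1),
]
counts = reduce_by_key(sample_pairs, add)
```

Passing add here plays the role of the `_ + _` function: each key ends up with the sum of its 1s.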
Serving layer
Components: Datastore, User Interface
Sources: real-time processing, batch processing
Tables: by_county_rt_msgs, by_county_day, by_county_month
Legend: partition key, clustering column, value

by_county_rt_msgs
STATE (VARCHAR)   COUNTY (VARCHAR)      DATE (INT)   TIME (INT)   MESSAGE (VARCHAR)
CA                Santa Clara County    20150205     124523       "JSON_msg"

by_county_day
STATE (VARCHAR)   COUNTY (VARCHAR)      DATE (INT)   COUNT (INT)
CA                Santa Clara County    20150205     72

by_county_month
STATE (VARCHAR)   COUNTY (VARCHAR)      DATE (INT)   COUNT (INT)
CA                Santa Clara County    201502       2361
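A rough sketch of how one message could be shaped into a by_county_rt_msgs row. It assumes DATE is encoded as YYYYMMDD and TIME as HHMMSS, which matches the sample values 20150205 and 124523 but is not stated on the slide; to_rt_row is a hypothetical helper, and no particular datastore client is implied.

```python
import json


def to_rt_row(message: dict) -> tuple:
    """Build a (STATE, COUNTY, DATE, TIME, MESSAGE) row (hypothetical helper)."""
    # Assumption: timestamp is [year, month, day, hour, minute, second].
    year, month, day, hour, minute, second = message["timestamp"]
    county, state = message["county"]
    date_int = year * 10000 + month * 100 + day       # YYYYMMDD
    time_int = hour * 10000 + minute * 100 + second   # HHMMSS
    return (state, county, date_int, time_int, json.dumps(message))


row = to_rt_row({
    "timestamp": [2015, 1, 23, 13, 55, 0],
    "county": ["Dundy County", "NE"],
    "creatorID": 15854,
    "senderID": 844090,
    "rank": 0,
    "messageID": 622878,
    "message": "Let's meet up at 2PM today!",
})
```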
Austin Ouyang
Previous employment: Dynetics, Inc., RF Systems Engineer
Education: MS Biomedical Engineering (University of Texas Southwestern); BS Electrical Engineering (University of Illinois at Urbana-Champaign)
Hobbies: algorithmic futures trading, rock climbing, and cycling
Contact: [email protected]
GitHub: http://github.com/aouyang1