Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex...

27
Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017

Transcript of Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex...

Page 1: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Little Big DataAnalyze your own data with the Elastic Stack

inovex Meetup Köln, 18.09.2017

Page 2: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Fun:MarathonTriathlonLactateGarmin

Work (can be fun, too):SearchElastic StackCrawlingGeo Data

Wolfgang SchochData Management & Analyticsinovex [email protected]

Page 3: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

You create a lot of data every day

Page 4: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Some of them on purpose

Page 5: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

So many devices

Page 6: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

So many possibilitiesVendor Apps Third-Party Apps

Page 7: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Not enough?

Page 8: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Reclaim your data!

Page 9: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Just 10 clicks to create a cool biking tour

Thanks to billions of tracks uploaded by millions of users.

Page 10: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Motivation

• Explore your data

• Discover new patterns

• Combine data from different sources

• Customize. Everything.

• Most important: Why not?

Page 11: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Example: Garmin Forerunner 920xt

• GPS Multisport Watch

• Integrates in Garmin Connect online portal and mobile app

• Writes natively .fit

• Integrates in Garmin Connect

• Other formats (tcx, gpx) available via Garmin Connect

Page 12: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Mission

• Pull whole history of activities from Garmin Connect

• Normalize and transform it as needed

• Load it into Elasticsearch

• Visualize the hard work in Kibana

• Gain new insights

Page 13: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Why Elasticsearch?

• The Swiss Army knife for search and analytics

• My everyday technology

• Easy to use

Page 14: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Export activities as CSV

Page 15: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

CSV Export

Fixed Columns

Fixed Length

Page 16: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Export Single Activity

• Original is binary .fit

• GPX is XML, contains only route

• TCX is XML, contains everything

Page 17: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

TCX export works, but…

• there is no bulk export (I created my own with some Python magic)

• can be a huge amount of data (contains GPS data)

• The X in TCX stands for XML, and nobody likes XML (except machines)

Page 18: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

I ❤ XML

Page 19: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Why not use the REST API?

• Did, was every expert should do: googled for help and advice

• Found a couple of semi-official APIs from Garmin

• Endpoint for bulk export of activities

Page 20: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Normalize & Transform

• JSON is a good starting point

• Different units for attributes like speed or cadence per activity type

• Select relevant attributes

Page 21: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Load into Elasticsearch

• Straight forward via _bulk API

• Dynamic mapping is a good starting point

• Define minimal template, e.g. to cover GeoPoints

Page 22: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Kibana Dashboard

Page 23: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Add another data source

• Discover relations between different data

• Gain insights

• This is where the fun begins

Page 24: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Example: Weight Data from Withings• Export CSV from vendor portal

• Import with Logstash into Elasticsearch

• Play with Timelion and Visual Builder

Page 25: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Learnings

• Everyone produces a lot of interesting data every single day

• It is always interesting to have a close look to that

• Start with a small amount of data, avoid confusion

• Try to find answers on questions like „Is there a relation between X and Y?“

Page 26: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Next Steps

• Use Spark / Zeppelin to analyze and preprocess data

• Try to build heat maps on my preferred GPS tracks

Page 27: Little Big Data - inovex.de · Little Big Data Analyze your own data with the Elastic Stack inovex Meetup Köln, 18.09.2017. Fun: Marathon Triathlon Lactate Garmin Work (can be fun,

Questions?