Building a Graph-based Analytics Platform

Transcript
Page 1: Building a Graph-based Analytics Platform

(graphs)-[:are]->(everywhere)

Building a graph-based analytics platform

© All Rights Reserved 2014 | Neo Technology, Inc.

@kennybastani

Neo4j Developer Evangelist

Page 2: Building a Graph-based Analytics Platform

Using Meetup as an example use case

Meetup.com is a valuable source of data for understanding trends around products or brands.

Understanding demand is key for delivering compelling content at meetups.

It sounded like a great use case for Neo4j.

Page 3: Building a Graph-based Analytics Platform

The Problem

Track meetup group growth over time.

Apply tags to meetup groups and report combined growth of all groups over time.

Page 4: Building a Graph-based Analytics Platform

Questions

Page 5: Building a Graph-based Analytics Platform

Question #1

Given a start date and an end date, what is the time series that plots the membership growth of a given meetup group?

Page 6: Building a Graph-based Analytics Platform

Question #2

Given a start date, an end date, and a combination of tags, what is the time series that plots the combined membership growth of all meetup groups with those tags?

Page 7: Building a Graph-based Analytics Platform

Question #3

How do you generate the JSON data of a time series for a basic JS line chart plugin?

Page 8: Building a Graph-based Analytics Platform

The Goal

Page 9: Building a Graph-based Analytics Platform

The GraphGist Project

The GraphGist project is a way to quickly build a graph-based proof of concept on Neo4j.

I started with a GraphGist.

Neo4j for Graph Analytics: Meetup.com Example

Page 10: Building a Graph-based Analytics Platform

Graph Data Model

Page 11: Building a Graph-based Analytics Platform

How are groups connected?

Page 12: Building a Graph-based Analytics Platform

How are locations connected?

Page 13: Building a Graph-based Analytics Platform

How are tags/topics connected?

Page 14: Building a Graph-based Analytics Platform

How are stats connected?

Page 15: Building a Graph-based Analytics Platform

How are days connected?

Page 16: Building a Graph-based Analytics Platform

How are weeks connected?

Page 17: Building a Graph-based Analytics Platform

How are months connected?

Page 18: Building a Graph-based Analytics Platform

How are years connected?

Page 19: Building a Graph-based Analytics Platform

Tackling Time in Neo4j

How do you implement a time series in Neo4j?

Store a timestamp on every node that represents a unit of time. Selecting a time series by traversal alone can be costly. Expose a REST API that accepts a standard date format and converts it to an integer, so you can select a range of dates in your Neo4j Cypher query.
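
The date-to-integer conversion described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function names, the `Day` label, and the `timestamp` property are assumptions, and the parameter syntax in the query text depends on the Neo4j version (`$from` in recent versions, `{from}` in older ones).

```javascript
// Hypothetical helper: convert a YYYY-MM-DD string to an integer
// timestamp (milliseconds since the Unix epoch at midnight UTC).
function toTimestamp(isoDate) {
  const match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(isoDate);
  if (!match) {
    throw new Error("Expected YYYY-MM-DD, got: " + isoDate);
  }
  const year = Number(match[1]);
  const month = Number(match[2]);
  const day = Number(match[3]);
  const timestamp = Date.UTC(year, month - 1, day);
  const check = new Date(timestamp);
  // Reject dates that roll over, such as 2014-02-30.
  if (check.getUTCMonth() !== month - 1 || check.getUTCDate() !== day) {
    throw new Error("Not a valid calendar date: " + isoDate);
  }
  return timestamp;
}

// Build a parameterized range query. The Cypher text is illustrative only:
// the Day label and timestamp property are assumed, not the project's schema.
function buildRangeQuery(from, to) {
  const params = { from: toTimestamp(from), to: toTimestamp(to) };
  if (params.from > params.to) {
    throw new Error("Start date must not be after end date");
  }
  const query = [
    "MATCH (d:Day)",
    "WHERE d.timestamp >= $from AND d.timestamp <= $to",
    "RETURN d ORDER BY d.timestamp",
  ].join("\n");
  return { query, params };
}

console.log(buildRangeQuery("2014-01-01", "2014-01-31").params);
```

Passing the integers as query parameters, rather than concatenating them into the query text, keeps the query string fixed no matter what the client sends.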

Page 20: Building a Graph-based Analytics Platform

Scale it up!

It started with a GraphGist, and then I said, “Why not? Let’s build something cool using Neo4j.”

Page 21: Building a Graph-based Analytics Platform

Challenges

I decided to take my GraphGist and make a full platform.

There were some challenges.

Page 22: Building a Graph-based Analytics Platform

Challenge #1

How do I get historical Meetup group statistics for all groups?

Page 23: Building a Graph-based Analytics Platform

Challenge #2

How do I handle the data import on a daily basis?

Page 24: Building a Graph-based Analytics Platform

Challenge #3

What kind of reports do I want to create? What do I want to know about Meetup groups?

Page 25: Building a Graph-based Analytics Platform

Challenge #4

How do I safely expose Neo4j to a client-side charting control?

Page 26: Building a Graph-based Analytics Platform

Ask Questions

I decided to start asking some questions about my data model.

Page 27: Building a Graph-based Analytics Platform

What do I want to know?

Assuming I had as much historical Meetup data as I pleased, what kind of questions would I want to ask about that data?

How would I want to present it?

Page 28: Building a Graph-based Analytics Platform

What’s the combined growth percent of Meetup groups having a certain topic?

Page 29: Building a Graph-based Analytics Platform

What’s the cumulative growth of Meetup groups with a specific topic?

Page 30: Building a Graph-based Analytics Platform

What’s the relative growth of Meetup groups with a topic for a date range?

Page 31: Building a Graph-based Analytics Platform

How many groups does a topic have relative to others?

Page 32: Building a Graph-based Analytics Platform

What’s the growth percent of all groups for a topic in a location for a date range?

Page 33: Building a Graph-based Analytics Platform

How do I give users a clean set of controls to filter and search?

Page 34: Building a Graph-based Analytics Platform

Scaling it up

Designing a graph-based analytics platform using Node.js and Neo4j

Page 35: Building a Graph-based Analytics Platform

Architecture

Front-end web-based dashboard in Node.js and Bootstrap

REST API via Neo4j Swagger in Node.js

Data import services in Node.js

Data storage in Neo4j graph database

Page 36: Building a Graph-based Analytics Platform

Applications

Analytics REST API (Node.js): Web

Dashboard (Node.js): Web

Analytics Data Import Scheduler (Node.js): Console

Page 37: Building a Graph-based Analytics Platform

Neo4j (JVM): Graph Data Storage

REST API (Node.js): Web App; Analytical Queries; Retrieves Report Data

Dashboard (Node.js): Web App; Presentation, Filtering; Visualizes Report Data

Import Scheduler (Node.js): Polls Meetup API

Diagram flow labels: Filter, Query, Import

Page 38: Building a Graph-based Analytics Platform

Analytics Dashboard

Page 39: Building a Graph-based Analytics Platform

Analytics REST API

Page 40: Building a Graph-based Analytics Platform

Data Import Scheduler

Page 41: Building a Graph-based Analytics Platform

REST API

The REST API is a fork of Neo4j Swagger. Swagger is a specification and complete framework implementation for describing, producing, consuming, and visualizing RESTful web services.

Page 42: Building a Graph-based Analytics Platform

Demo

http://meetup-analytics-api.herokuapp.com/

Page 43: Building a Graph-based Analytics Platform

Swagger

The REST API module of this project is based on a fork of Swagger.

Page 44: Building a Graph-based Analytics Platform

The Neo4j Swagger Project

The Swagger project was modified to use Neo4j as its data source. The REST API module of this project extends the Neo4j Swagger project.

Page 45: Building a Graph-based Analytics Platform

REST API Methods

Get Weekly Growth

Get Monthly Growth

Get Monthly Growth By Tag

Get Monthly Growth By Location

Get Cities

Get Countries

Get Group Count By Tag

Page 46: Building a Graph-based Analytics Platform

Get Weekly Growth

Gets the weekly growth percent of meetup groups as a time series. Returns a set of data points containing the week of the year, the meetup group name, and membership count.
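
The slides do not give the growth formula, so the following is only a sketch under one common assumption: growth percent is the week-over-week relative change in membership count. The function name and the data point shape are hypothetical.

```javascript
// Week-over-week growth percent for one group's membership counts.
// Assumes `points` is sorted by week and shaped like
//   { week: 14, group: "Graph DB SF", members: 100 }
function weeklyGrowthPercent(points) {
  const growth = [];
  for (let i = 1; i < points.length; i++) {
    const previous = points[i - 1].members;
    const current = points[i].members;
    // Growth from zero members is undefined, so report null.
    const percent =
      previous === 0 ? null : ((current - previous) / previous) * 100;
    growth.push({ week: points[i].week, group: points[i].group, percent });
  }
  return growth;
}

console.log(
  weeklyGrowthPercent([
    { week: 1, group: "Example Group", members: 100 },
    { week: 2, group: "Example Group", members: 150 },
    { week: 3, group: "Example Group", members: 300 },
  ])
);
```

The first week has no predecessor, so the result has one fewer point than the input.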

Page 47: Building a Graph-based Analytics Platform

Get Monthly Growth

Gets the monthly growth percent of meetup groups as a time series. Returns a set of data points containing the month of the year, the meetup group name, and membership count.

Page 48: Building a Graph-based Analytics Platform

Get Monthly Growth By Tag

Gets the monthly growth percent of meetup group tags as a time series. Returns a set of data points containing the month of the year, the meetup group tag name, and membership count.
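
Combining membership across all groups that share a tag can be sketched as a per-month sum. This is an in-memory illustration with a hypothetical row shape; in the platform itself this aggregation would presumably happen in the Cypher query.

```javascript
// Sum membership across all groups carrying a tag, one total per month.
// Each row is assumed to look like:
//   { month: "2014-01", group: "Graph DB SF", tags: ["neo4j"], members: 100 }
function combinedMembersByTag(rows, tag) {
  const totals = new Map();
  for (const row of rows) {
    if (!row.tags.includes(tag)) continue;
    totals.set(row.month, (totals.get(row.month) || 0) + row.members);
  }
  return [...totals.entries()]
    // YYYY-MM strings sort chronologically when compared as text.
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
    .map(([month, members]) => ({ month, tag, members }));
}

const sampleRows = [
  { month: "2014-02", group: "Group A", tags: ["neo4j"], members: 120 },
  { month: "2014-01", group: "Group A", tags: ["neo4j"], members: 100 },
  { month: "2014-01", group: "Group B", tags: ["neo4j", "nosql"], members: 50 },
  { month: "2014-02", group: "Group B", tags: ["neo4j", "nosql"], members: 80 },
  { month: "2014-01", group: "Group C", tags: ["nosql"], members: 500 },
];
console.log(combinedMembersByTag(sampleRows, "neo4j"));
```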

Page 49: Building a Graph-based Analytics Platform

Get Monthly Growth By Location

Gets the monthly growth percent of meetup group locations and tags as a time series. Returns a set of data points containing the month of the year, the meetup group tag name, the city, and membership count.

Page 50: Building a Graph-based Analytics Platform

Get Cities

Gets a list of cities that meetup groups reside in. Returns a distinct list of cities for typeahead.

Page 51: Building a Graph-based Analytics Platform

Get Countries

Gets a list of countries that meetup groups reside in. Returns a distinct list of countries for typeahead.
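
A distinct, sorted list for a typeahead control, whether of cities or countries, can be sketched like this. The function name, the `city` and `country` field names, and the prefix filter are assumptions for illustration.

```javascript
// Return the distinct values of one field across all groups, optionally
// narrowed to values starting with what the user has typed so far.
function distinctValues(groups, field, prefix = "") {
  const needle = prefix.trim().toLowerCase();
  const seen = new Set();
  for (const group of groups) {
    const value = group[field];
    if (typeof value === "string" && value.toLowerCase().startsWith(needle)) {
      seen.add(value);
    }
  }
  return [...seen].sort();
}

const sampleGroups = [
  { name: "Group A", city: "San Francisco", country: "US" },
  { name: "Group B", city: "London", country: "GB" },
  { name: "Group C", city: "San Jose", country: "US" },
  { name: "Group D", city: "London", country: "GB" },
];
console.log(distinctValues(sampleGroups, "city", "san"));
console.log(distinctValues(sampleGroups, "country"));
```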

Page 52: Building a Graph-based Analytics Platform

Get Group Count By Tag

Gets a count of groups by tag. Returns a list of tags and the number of groups per tag.
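
Counting groups per tag is a simple tally. The sketch below works on an in-memory array with a hypothetical shape; it only illustrates the result format described above, not how the project queries Neo4j.

```javascript
// Count how many groups carry each tag. Each group is assumed to look like
//   { name: "Graph DB SF", tags: ["neo4j", "nosql"] }
function groupCountByTag(groups) {
  const counts = new Map();
  for (const group of groups) {
    // A Set guards against a tag listed twice on the same group.
    for (const tag of new Set(group.tags)) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return [...counts.entries()]
    .map(([tag, count]) => ({ tag, count }))
    // Largest tags first; ties broken alphabetically.
    .sort((a, b) => b.count - a.count || (a.tag < b.tag ? -1 : 1));
}

const taggedGroups = [
  { name: "Group A", tags: ["neo4j", "nosql"] },
  { name: "Group B", tags: ["neo4j"] },
  { name: "Group C", tags: ["nosql", "nosql", "hadoop"] },
];
console.log(groupCountByTag(taggedGroups));
```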

Page 53: Building a Graph-based Analytics Platform

Analytics Dashboard

The dashboard is a web application that uses client-side JavaScript to call the Neo4j Swagger REST API and populate a series of interactive chart controls with data. It uses Bootstrap for the front-end styles and Highcharts for the charting controls.
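
Turning time series data points into chart input can be sketched as below. The function name and the data point shape are hypothetical. The returned object uses the Highcharts option names `chart.type`, `title.text`, `xAxis.categories`, and `series` (each entry with `name` and `data`). How the dashboard actually builds its charts is not shown in the slides.

```javascript
// Convert flat data points into a line chart options object.
// Each point is assumed to look like:
//   { month: "2014-01", name: "neo4j", value: 150 }
function toLineChartOptions(title, points) {
  const categories = [...new Set(points.map((p) => p.month))].sort();
  const byName = new Map();
  for (const p of points) {
    if (!byName.has(p.name)) byName.set(p.name, new Map());
    byName.get(p.name).set(p.month, p.value);
  }
  const series = [...byName.entries()].map(([name, values]) => ({
    name,
    // null marks a month with no data for this series.
    data: categories.map((m) => (values.has(m) ? values.get(m) : null)),
  }));
  return {
    chart: { type: "line" },
    title: { text: title },
    xAxis: { categories },
    series,
  };
}

const chartPoints = [
  { month: "2014-01", name: "neo4j", value: 150 },
  { month: "2014-02", name: "neo4j", value: 200 },
  { month: "2014-02", name: "graphs", value: 40 },
];
console.log(JSON.stringify(toLineChartOptions("Tag growth", chartPoints), null, 2));
```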

Page 54: Building a Graph-based Analytics Platform

Demo

http://meetup-analytics-dashboard.herokuapp.com/

Page 55: Building a Graph-based Analytics Platform

Reports

Meetup Tag Growth %

Cumulative Meetup Growth

Category Growth %

Groups By Tag

Meetup Tag Growth By Location

Page 56: Building a Graph-based Analytics Platform

Meetup Tag Growth %

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#meetup-tag-growth-

Page 57: Building a Graph-based Analytics Platform

Cumulative Meetup Growth

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#cumulative-meetup-growth

Page 58: Building a Graph-based Analytics Platform

Category Growth %

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#category-growth-

Page 59: Building a Graph-based Analytics Platform

Groups By Tag

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#group-count-by-tag

Page 60: Building a Graph-based Analytics Platform

Meetup Tag Growth By Location

Page 61: Building a Graph-based Analytics Platform

Filter & Search

Page 62: Building a Graph-based Analytics Platform

Data Import Scheduler

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md#data-import-scheduler
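
A once-a-day import can be sketched with plain Node.js timers. This is only an illustration of the daily polling idea: `msUntilNextRun`, `scheduleDaily`, and the `importTask` callback are hypothetical names, and the linked documentation describes how the project's scheduler actually works.

```javascript
// Milliseconds from `now` until the next occurrence of `hourUtc`:00 UTC.
function msUntilNextRun(now, hourUtc) {
  const next = new Date(
    Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate(), hourUtc)
  );
  // If today's slot has already passed (or is right now), use tomorrow's.
  if (next <= now) next.setUTCDate(next.getUTCDate() + 1);
  return next - now;
}

// Run `importTask` once a day at the given UTC hour.
// Returns a function that cancels the pending run.
function scheduleDaily(importTask, hourUtc) {
  let timer = null;
  const arm = () => {
    timer = setTimeout(async () => {
      try {
        await importTask();
      } catch (err) {
        // A failed import should not stop future runs.
        console.error("Import failed:", err);
      }
      arm(); // schedule the following day's run
    }, msUntilNextRun(new Date(), hourUtc));
  };
  arm();
  return () => clearTimeout(timer);
}

console.log(msUntilNextRun(new Date(Date.UTC(2014, 0, 1, 10)), 12));
```

Recomputing the delay before each run, instead of using a fixed 24-hour interval, keeps the schedule from drifting when an import takes a long time.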

Page 63: Building a Graph-based Analytics Platform

GitHub Repository

https://github.com/kbastani/meetup-analytics

Page 64: Building a Graph-based Analytics Platform

Full Documentation

https://github.com/kbastani/meetup-analytics/blob/master/docs/DOCS.md

Page 65: Building a Graph-based Analytics Platform


(Next steps)

Page 66: Building a Graph-based Analytics Platform


Get in touch

Twitter: @kennybastani

LinkedIn: /in/kennybastani

Email: [email protected]