[Hack.Hydrosphere] Project TIDE

Visual DataFlow Constructor


Team Q-shke-Q

Nikita: front-end, JavaScript (GoJS) coder, hates bad docs

Iskandar: back-end, Apache Spark guru, Python coder

Bulat: team lead, loves Scala and doesn’t know it

Alexey: front-end, loves Node.js, team’s DJ

Meet Bob

Data scientist. His boss told him to learn big data because it’s cool and trendy.

How it was

Apache Spark, Mist, Web app

Hmm. Okay I guess?

But...

Apache Spark, Mist, Web app

'A data scientist is someone who is better at statistics than any software engineer and better at software engineering than any statistician.' https://www.quora.com/Do-data-scientists-code

Scala, Python, code

WTH is this?

Our solution is

Apache Spark, Mist, TIDE

Feels good man

TIDE

Visual jobs constructor for Apache Spark

Provides an abstraction over code to make life simpler

You still need to write code snippets in Python

Word count

Word count
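A minimal sketch of what a word count job computes, written in plain Python so it runs without a Spark cluster. The helper names `flat_map`, `reduce_by_key` and `word_count` are made up for illustration; the comments name the pySpark RDD methods they stand in for. This is not the code TIDE generates, only the same three steps spelled out.

```python
def flat_map(func, records):
    """Plain-Python stand-in for pySpark's RDD.flatMap."""
    return [item for record in records for item in func(record)]


def reduce_by_key(func, pairs):
    """Plain-Python stand-in for pySpark's RDD.reduceByKey."""
    result = {}
    for key, value in pairs:
        result[key] = func(result[key], value) if key in result else value
    return result


def word_count(lines):
    # Step 1: split every line into words (RDD.flatMap in pySpark).
    words = flat_map(lambda line: line.split(), lines)
    # Step 2: pair each word with the number 1 (RDD.map in pySpark).
    pairs = [(word, 1) for word in words]
    # Step 3: add up the ones per word (RDD.reduceByKey in pySpark).
    return reduce_by_key(lambda a, b: a + b, pairs)


counts = word_count(["to be or not to be", "to do"])
print(counts)  # {'to': 3, 'be': 2, 'or': 1, 'not': 1, 'do': 1}
```

Each of the three steps is the kind of short Python snippet a user would presumably attach to one node of the diagram.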

Sum
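Assuming the Sum example is a job that adds up numbers read from text input, the same idea can be sketched in plain Python. `parse_numbers` and `total` are hypothetical names, and `functools.reduce` stands in for pySpark's `RDD.reduce`.

```python
from functools import reduce


def parse_numbers(lines):
    """Turn whitespace-separated text into floats (flatMap plus map in pySpark)."""
    return [float(token) for line in lines for token in line.split()]


def total(numbers):
    """Fold the numbers with addition; the 0.0 start value covers empty input."""
    return reduce(lambda a, b: a + b, numbers, 0.0)


print(total(parse_numbers(["1 2 3", "4.5"])))  # 10.5
```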

How does it work?

Objects

AST

pySpark

DB

Mist
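The pipeline above (diagram objects in, pySpark code out) can be sketched roughly as follows. Everything here is an assumption made for illustration: the `Node` shape, the node kinds, and the `order_nodes` and `compile_flow` helpers are hypothetical, and the slides do not describe the project's real AST stage. The sketch emits source text directly and only uses Python's standard `ast.parse` to check that the result is syntactically valid. The emitted calls (`sc.textFile`, `flatMap`, `map`, `reduceByKey`, `saveAsTextFile`) are real pySpark RDD methods.

```python
import ast
from dataclasses import dataclass


@dataclass
class Node:
    """One box in the visual diagram (hypothetical shape)."""
    id: str
    kind: str  # "source", "flatMap", "map", "reduceByKey" or "sink"
    arg: str   # a path for source/sink, a Python snippet otherwise


def order_nodes(nodes, edges):
    """Walk a linear chain of (from_id, to_id) edges from the first node on."""
    by_id = {node.id: node for node in nodes}
    next_of = dict(edges)
    targets = set(next_of.values())
    starts = [node.id for node in nodes if node.id not in targets]
    if len(starts) != 1:
        raise ValueError("expected exactly one starting node")
    chain, seen, current = [], set(), starts[0]
    while current is not None:
        if current in seen:
            raise ValueError("the data flow contains a cycle")
        seen.add(current)
        chain.append(by_id[current])
        current = next_of.get(current)
    return chain


def compile_flow(nodes, edges):
    """Translate the ordered nodes into pySpark source text, one line each."""
    lines = []
    for node in order_nodes(nodes, edges):
        if node.kind == "source":
            lines.append(f"rdd = sc.textFile({node.arg!r})")
        elif node.kind == "sink":
            lines.append(f"rdd.saveAsTextFile({node.arg!r})")
        elif node.kind in ("flatMap", "map", "reduceByKey"):
            lines.append(f"rdd = rdd.{node.kind}({node.arg})")
        else:
            raise ValueError(f"unknown node kind: {node.kind}")
    source = "\n".join(lines)
    ast.parse(source)  # raises SyntaxError if a user snippet is malformed
    return source


nodes = [
    Node("n1", "source", "input.txt"),
    Node("n2", "flatMap", "lambda line: line.split()"),
    Node("n3", "map", "lambda word: (word, 1)"),
    Node("n4", "reduceByKey", "lambda a, b: a + b"),
    Node("n5", "sink", "counts"),
]
edges = [("n1", "n2"), ("n2", "n3"), ("n3", "n4"), ("n4", "n5")]
source = compile_flow(nodes, edges)
print(source)
```

The generated text assumes a SparkContext named `sc` already exists wherever the job runs. Submitting the job through Mist and storing diagrams in the DB are left out of this sketch.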

Roadmap

● Auth system
● Custom data node
● R code snippets support
● DataFrames instead of RDD
● Job as a subprocedure
● Jobs management
● UI/UX improvements
● Static analysis of data flow
● Rewrite backend in Scala (maybe)
● … just for now

Back-end
● websockets
● Flask
● Data flow to pySpark compiler

Front-end
● websockets
● GoJS
● materialize.css
● jQuery

Thank you for your attention!