AI&Big Data Club, speaker Dmitry Spodarets: "What tools...

Post on 20-Jan-2017

What tools data scientists are using?

Dmitry Spodarets

AI&BigData Club

Who am I

Dmitry Spodarets

• Founder and CEO at FlyElephant

• PhD candidate at Odessa National University

• Lecturer at Odessa Polytechnic University

• Organizer of technical conferences about AI, BigData, HPC, JS, Web Technologies …

FlyElephant

We automate Data Science and Engineering Simulation and help teams to work efficiently.

Computing resources

Ready-to-use computing infrastructure

Collaboration & Sharing

Fast Deployment

Expert Community

Data Science Tools Survey

Datasets

[Bar chart of dataset sizes; bar values not recoverable from the transcript. Size categories: less than 1 MB, 1.1 to 10 MB, 11 to 100 MB, 101 MB to 1 GB, 1.1 to 10 GB, 11 to 100 GB, 101 GB to 1 TB, 1.1 to 10 TB, 11 to 100 TB, 101 TB to 1 PB, 1.1 to 10 PB, 11 to 100 PB, over 100 PB. Axis scale: 0 to 70.]

Tools for collecting data

Python 45

R 26

Spark 18

SQL 15

Excel 13

Kafka 11

Pandas 10

custom 8

Hadoop 5

Numpy 5

SAS 5

Tools for storing data

PostgreSQL 37

csv 31

MySQL 21

Hadoop 16

Excel 15

HDFS 15

MongoDB 15

My Server 12

Oracle 11

Hive 8

Programming languages

Python 151

R 88

SQL 37

Java 32

Scala 22

bash 17

C++ 17

JavaScript 15

C# 13

vba 8

C 6

Libraries

Pandas 88

Numpy 68

scikit-learn 48

scipy 26

dplyr 20

matplotlib 20

ggplot2 15

keras 14

SPARK 13

xgboost 13

Tensorflow 12

Tools for the visualization of data

matplotlib 66

seaborn 33

ggplot2 26

Excel 22

Tableau 22

R 19

ggplot 14

plotly 13

bokeh 12

d3 11

Cloud services

aws 77

none 41

azure 25

google 24

digital ocean 9

OpenStack 7

Watson 1

Computing power

NVIDIA DGX-1 Deep Learning Supercomputer: 170/3 TFLOPS (GPU FP16 / CPU FP32)

NVIDIA Tesla P100: ~5 TeraFLOPS

Intel Xeon Phi processor: ~3 TeraFLOPS

FPGA

Dmitry Spodarets

d.spodarets@flyelephant.net

www.flyelephant.net