Page 1

© 2016 IBM Corporation

Perform End-to-End Data Analysis in the Cloud

Building an IoT Ecosystem with Arduino and Bluemix

Dale Mumper

Open Source Analytics Solution Engineer - Industrial

[email protected]

Page 2

Disclaimer

© Copyright IBM Corporation 2016. All rights reserved.

U.S. Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

THE INFORMATION CONTAINED IN THIS PRESENTATION IS PROVIDED FOR INFORMATIONAL PURPOSES ONLY. WHILE EFFORTS WERE MADE TO VERIFY THE COMPLETENESS AND ACCURACY OF THE INFORMATION CONTAINED IN THIS PRESENTATION, IT IS PROVIDED “AS IS” WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED. IN ADDITION, THIS INFORMATION IS BASED ON IBM'S CURRENT PRODUCT PLANS AND STRATEGY, WHICH ARE SUBJECT TO CHANGE BY IBM WITHOUT NOTICE. IBM SHALL NOT BE RESPONSIBLE FOR ANY DAMAGES ARISING OUT OF THE USE OF, OR OTHERWISE RELATED TO, THIS PRESENTATION OR ANY OTHER DOCUMENTATION. NOTHING CONTAINED IN THIS PRESENTATION IS INTENDED TO, NOR SHALL HAVE THE EFFECT OF, CREATING ANY WARRANTIES OR REPRESENTATIONS FROM IBM (OR ITS SUPPLIERS OR LICENSORS), OR ALTERING THE TERMS AND CONDITIONS OF ANY AGREEMENT OR LICENSE GOVERNING THE USE OF IBM PRODUCTS AND/OR SOFTWARE.

IBM's statements regarding its plans, directions, and intent are subject to change or withdrawal without notice at IBM's sole discretion. Information regarding potential future products is intended to outline our general product direction and it should not be relied on in making a purchasing decision. The information mentioned regarding potential future products is not a commitment, promise, or legal obligation to deliver any material, code or functionality. Information about potential future products may not be incorporated into any contract. The development, release, and timing of any future features or functionality described for our products remains at our sole discretion.

IBM, the IBM logo, ibm.com, Information Management, DB2, DB2 Connect, DB2 OLAP Server, pureScale, System Z, Cognos, solidDB, Informix, Optim, InfoSphere, and z/OS are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the Web at “Copyright and trademark information” at www.ibm.com/legal/copytrade.shtml

Other company, product, or service names may be trademarks or service marks of others.

Page 3

Agenda

Bio

Solution Overview

Bluemix Overview

Sensor Board

Node-RED

Cloudant

dashDB

Data Science Experience

Watson Analytics

Page 4

Bio

Dale Mumper

IBM Open Source Analytics Solution Engineer

Consultant and analytics leader for over 20 years

Background in physics and math

Certifications

- Cloudera Certified Administrator for Apache Hadoop - CCAH

- Cloudera Certified Developer for Apache Hadoop - CCDH

- Microsoft MCSE – Data Platform

- Microsoft MCSE – Business Intelligence

- Oracle Certified Professional - OCP

Page 5

IoT Defined

“The network of physical devices, vehicles, buildings and other items embedded with electronics, software, sensors, actuators and network connectivity that enables these objects to collect and exchange data.”

“The infrastructure of the information society.”

“Every object, device and every familiar part of the traditional home, is being equipped with smart circuitry.”

“With a trillion sensors embedded in the environment—all connected by computing systems, software and services—it will be possible to hear the heartbeat of the Earth, impacting human interaction with the globe as profoundly as the Internet has revolutionized communications.”

Page 6

IoT Market Drivers

USD 157.05 Billion in 2016

USD 661.74 Billion by 2021

Compound Annual Growth Rate (CAGR) of 33%

Impacting all industries

Industry leaders admit they lack a “clear perspective” on the business opportunities afforded in the IoT arena – the trend remains nascent

2020 could see 30 Billion devices on the global net

Supplier Attention – open source software and open source hardware, development tool kits, major vendor support

Technological Advances – ARM Cortex (1/10 the power usage), miniaturized sensors, declining component costs, faster bandwidth

Increasing Demand – demand for 1st gen. will increase as costs decline and next generations become more advanced; very price sensitive

Emerging Standards – semiconductor, hardware, networking and software companies have joined with a number of industry associations and academic consortiums; common APIs
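The growth rate quoted on this slide can be cross-checked against the two market-size figures. The sketch below is only an arithmetic check, assuming the 2016 and 2021 values are the start and end points of a five-year period; the function name is illustrative.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.33 means 33%)."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market-size figures quoted on the slide, in billions of USD.
rate = cagr(157.05, 661.74, 2021 - 2016)
print(f"Implied CAGR: {rate:.1%}")
```

Under that assumption the implied rate is roughly 33%, consistent with the CAGR stated on the slide.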

Page 7

Phone Sensor Demo

Step 1

• Take out your phone

• Go to the URL on the card

• Write down the Device ID

d:quickstart:phonesensor<Device ID>

Step 2

• ibm.biz/iotqstart

• Enter Device ID

Step 3

• Explore

• Move Phone

Tilt

Rotate

Slow vs. Fast

Page 8

Environmental Recorder – ER1

Indoor Environmental Monitoring

• Measures and sends data

Temperature (from three different sensors)

Humidity

Air Pressure

Light Levels

LEDs provide operational feedback

• Connects to a local wifi network

Synchronizes time from an NTP source

Gets the real IP address and determines geolocation from IP address

Asks nearest weather station for local forecast

Connects to an MQTT broker and sends data

Page 9

Use Case for the ER1

Sleep Therapy

Room Monitoring

Remote Property

Easily add sensors and capabilities

• UV and IR Sensor

• Distance (Ultrasonics and Laser)

• Motion

• Shock

• Vibration

• Rotation

• Tension and Flex

• Soil and Moisture

• GPS Module

• LTE Cellular Wi-Fi

• Solar Power and Battery

Page 10

Bill of Materials

Arduino MKR1000

• Atmel ATSAMW25 SoC

SAMD21 Cortex M0+ ARM MCU

WINC1500 2.4GHz 802.11 b/g/n Wi-Fi

3.3V

256KB Flash

32KB SRAM

Full-Speed USB w/Embedded Host

Sensors

• Adafruit DS3231

• Adafruit SHT31-D

• Adafruit TSL2561

• Adafruit BMP183

• Adafruit Neopixels

Parts

• LED

• 220ohm resistor

• Full-sized breadboard

• USB A/MicroB Cable

• Jumper Wires, 3”, MM

• Jumper Wires, 6”, MM

Vendors

• adafruit.com

• arduino.cc

• element14.com

• digikey.com

Page 11

IoT Analytics Ecosystem

IoT + Runtime + Cloudant + dashDB + Spark

REST (HTTP/s) API

IBM dashDB

Schema Discovery

IoT Platform

MQTT

Spark Connector

Page 12

Arduino MKR1000

Combines the Arduino Zero and a Wi-Fi Shield at a Great Price Point

Atmel SAMD21 Cortex-M0+

• 3.3V

• 256KB Flash

• 32KB SRAM

• Clock Speed 48MHz

8 Digital I/O Pins

• 4 with PWM (pulse width modulated)

6 Analog Input Pins

1 Analog Output Pin

USB connection

Reset button

Wi-Fi

Encryption

Li-Po Battery Charger

1. MCU and Memory

2. Wi-Fi

3. Small Form Factor

4. Lower Cost

Page 13

SHT31-D Sensor

Sensor made by Sensirion

• 2.5 x 2.5 x 0.9 mm³

• temperature range of –40°C to 90°C

• ±2% relative humidity and ±0.3°C accuracy

PCB Board made by Adafruit

• 3V and 5V compliant

• I2C interface

Power Pins

• Vin

2.5 to 5VDC (Volts Direct Current)

• GND

Common Ground

I2C Logic Pins

• SCL

I2C clock

• SDA

I2C data pin
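The SHT31-D returns 16-bit raw readings over I2C that the sketch has to turn into physical units. The following is a minimal sketch of that conversion, assuming the linear formulas published in the Sensirion datasheet; the function names are illustrative and are not taken from the ER1 sketch or the Adafruit library.

```python
def sht31_temperature_c(raw: int) -> float:
    """Convert a 16-bit raw temperature reading to degrees Celsius."""
    return -45.0 + 175.0 * raw / 65535.0


def sht31_humidity_pct(raw: int) -> float:
    """Convert a 16-bit raw humidity reading to percent relative humidity."""
    return 100.0 * raw / 65535.0


def c_to_f(temp_c: float) -> float:
    """Celsius to Fahrenheit, as needed for the paired tempC/tempF fields."""
    return temp_c * 9.0 / 5.0 + 32.0


# Mid-scale raw values, chosen only for illustration.
print(sht31_temperature_c(0x8000), sht31_humidity_pct(0x8000))
print(c_to_f(28))
```

The same Celsius to Fahrenheit step applies to every sensor in the kit that reports temperature.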

Page 14

TSL2561 Sensor

Sensor made by ams AG

• Light-to-digital converter

• 188 µlux to 88,000 lux

• Infrared and Full Spectrum diodes

PCB Board made by Adafruit

• 3V and 5V compliant

• I2C interface

Power Pins

• Vin

2.5 to 5VDC (Volts Direct Current)

• GND

Common Ground

I2C Logic Pins

• SCL

I2C clock

• SDA

I2C data pin

Page 15

Adafruit DS3231 Real-Time Clock (RTC)

Chip made by Maxim Integrated

• DS3231 Real-Time Clock (RTC)

• Temperature-compensated crystal oscillator and crystal

• Long-term accuracy

PCB Board made by Adafruit

• I2C interface

• Optional battery maintains time

Power Pins

• Vin

• GND

I2C Logic Pins

• SCL - I2C clock

• SDA - I2C data pin

Page 16

BMP183 Sensor

Sensor made by Bosch

• 300 to 1100hPa (+9000m to -500m)

• Enhanced GPS, navigation, weather, vert. velocity

PCB Board made by Adafruit

• 3V and 5V compliant

• SPI interface

Power Pins

• Vin

2.5 to 5VDC (Volts Direct Current)

• GND

Common Ground

SPI Logic Pins

• SCK - Clock

• SDO - Serial Data OUT

• SDI - Serial Data IN

• CS - Chip Select
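The pressure range on this slide is paired with an altitude range (300 to 1100 hPa, roughly +9000 m to -500 m). A common way to relate the two is the international barometric formula used in Bosch pressure-sensor datasheets. The sketch below assumes that formula and a standard sea-level pressure of 1013.25 hPa; it is not necessarily the calculation the ER1 sketch performs.

```python
SEA_LEVEL_HPA = 1013.25  # standard atmosphere, an assumed reference value


def pressure_to_altitude_m(pressure_hpa: float,
                           sea_level_hpa: float = SEA_LEVEL_HPA) -> float:
    """Estimate altitude in metres from barometric pressure in hPa."""
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))


for hpa in (300.0, 1013.25, 1100.0):
    print(f"{hpa:8.2f} hPa -> {pressure_to_altitude_m(hpa):8.1f} m")
```

Because local sea-level pressure changes with the weather, passing a current reference value as `sea_level_hpa` gives a better estimate than the fixed default.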

Page 17

NeoPixels == TOTALLY COOL

Ring

Jewel

Strips

Stick

Matrix

Page 18

Turning Sensors into an IoT Device (ER1)

Sensors, Clock and LEDs in Review

Wi-Fi Connectivity

NTP Client

Time and Data Handling

C/C++ Style Floating Point Operations

HTTP Client

MQTT Client

JSON Parsing

ER1 Sketch Version 3.50

• Expects to find the IBM_CLASS 2.4GHz, WPA wireless network

Already has the SSID and the password in the sketch

• Defaults to using the IBM Watson IoT Platform in Quickstart Mode

• Sketch automatically determines the Device ID from the MAC

See your laminated MKR1000 card in your student kit
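The slide says the sketch derives the Device ID from the MAC address. One plausible derivation, shown here as an assumption rather than the ER1 sketch's actual code, is to strip the separators and lowercase the hex digits. The client-ID helper assumes the `d:<org>:<deviceType>:<deviceId>` layout documented for the Watson IoT Platform, with `quickstart` as the organization.

```python
import string


def device_id_from_mac(mac: str) -> str:
    """Strip separators from a MAC address and lowercase it (assumed scheme)."""
    cleaned = "".join(ch for ch in mac if ch not in ":-. ").lower()
    if len(cleaned) != 12 or any(ch not in string.hexdigits for ch in cleaned):
        raise ValueError(f"not a 48-bit MAC address: {mac!r}")
    return cleaned


def quickstart_client_id(device_type: str, device_id: str) -> str:
    """Build an MQTT client ID for Quickstart mode (assumed layout)."""
    return f"d:quickstart:{device_type}:{device_id}"


device_id = device_id_from_mac("F8:F0:05:F5:F8:DB")  # example MAC for illustration
print(device_id)
print(quickstart_client_id("MKR1000", device_id))
```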

Page 19

IaaS

• Virtual Servers

• Bare Metal Servers

• Network

• Storage

• Load Balancers

PaaS

• Database

• Web Server

• Development Tools

• Runtime Containers

SaaS

• eMail

• CRM

• Games

• Virtual Desktop

Cloud Service Models

Page 20

Who Does What?

On-Premise

Applications

Data

Runtime

Middleware

OS

Virtualization

Servers

Storage

Networking

Managed by Client Managed by Provider

IaaS

Applications

Data

Runtime

Middleware

OS

Virtualization

Compute

Storage

Networking

PaaS

Applications

Data

Runtime

Middleware

OS

Virtualization

Compute

Storage

Networking

SaaS

Applications

Data

Runtime

Middleware

OS

Virtualization

Compute

Storage

Networking

Page 21

IBM Cloud

Page 22

Bluemix is an open-standard, cloud-based platform for building, managing, and running applications of all types (web, mobile, big data, new smart devices…)

Go Live in Seconds

Zero to running in one click. Development plans deploy in seconds. Enterprise plans deploy in 1-2 days.

DevOps

Development, monitoring, deployment, and logging tools allow the developer to run the entire application.

APIs and Services

A catalog of IBM, third party, and open source API services allow the developer to stitch an application together in minutes.

On-Premises Integration

Build hybrid environments. Connect to on-premises assets plus other public and private clouds.

Flexible Pricing

Sign up in minutes. Pay as you go and subscription models offer choice and flexibility.

Layered Security

IBM secures the platform and infrastructure and provides you with the tools to secure your apps.

IBM Bluemix

Page 23

Demo – Bluemix Overview

Page 24

We Are Here

MQTT

Page 25

This Is Our Destination

IoT + Runtime + Cloudant + dashDB + Spark

REST (HTTP/s) API

IBM dashDB

Schema Discovery

IoT Platform

MQTT

Spark Connector

Page 26

IBM Watson IoT Starter Platform

1. Catalog > Boilerplates > Internet of Things Platform Starter

2. Fill in Name: <Name of App Here>

3. CREATE

Application is created and staged

• http://<hostname>.mybluemix.net

• Creates a Node.js SDK Container

• Creates a Cloudant NoSQL Database

Page 27

• Browser-based UI for creating flows of events

• Deploying action in a light-weight runtime

• Based upon Node.js

• Event-driven, non-blocking model

• Flows stored as JSON, so super easy to share

• Large library available today

• Suitable for server, network, edge and mobile device placement

• Open source project on GitHub

• IBM is a major contributor

• Benefits

• Rapid Development

• Simple to use with JSON

• Simple REST API

• Simple MQTT messaging

• Contributor Nodes

• Simple to use other services

Node-RED

A visual tool for wiring the Internet of Things

Page 28

MQTT

Machine-to-Machine (M2M)/“Internet of Things” (IoT)

• Lightweight connectivity protocol for publish/subscribe messaging transport

• Small code footprint, limited bandwidth, low power usage

• Minimized packets and efficient distribution to multiple receivers

MQTT v3.1.1 now an OASIS Standard

• Invented by Dr. Andy Stanford-Clark (IBM) and Arlen Nipper (Eurotech)

• MQ Telemetry Transport (ISO/IEC PRF 20922)

MQTT Broker/Servers

• IBM WebSphere MQ Telemetry, MessageSight, Integration Bus

• Mosquitto, Eclipse Paho, Eurotech Everyware Device Cloud, emqttd, Xively, Moquette, Yunba.io, m2m.io, RabbitMQ, Apache ActiveMQ, HiveMQ

MQTT Client Methods

• Connect, Disconnect, Subscribe, Unsubscribe, Publish
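The publish/subscribe model above can be illustrated without a network. The sketch below is a toy in-memory dispatcher, not a real MQTT broker or client library: it implements only subscribe, unsubscribe and publish, plus topic-filter matching with the `+` (single-level) and `#` (multi-level) wildcards defined in MQTT 3.1.1. All class and function names are hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List


def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":          # multi-level wildcard: matches everything below
            return True           # (and the parent level itself)
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


class ToyBroker:
    """In-memory stand-in for a broker; no sessions, QoS or networking."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Callable[[str, str], None]]] = defaultdict(list)

    def subscribe(self, topic_filter: str, callback: Callable[[str, str], None]) -> None:
        self._subs[topic_filter].append(callback)

    def unsubscribe(self, topic_filter: str) -> None:
        self._subs.pop(topic_filter, None)

    def publish(self, topic: str, payload: str) -> int:
        """Deliver payload to every matching subscriber; return delivery count."""
        delivered = 0
        for topic_filter, callbacks in list(self._subs.items()):
            if topic_matches(topic_filter, topic):
                for callback in callbacks:
                    callback(topic, payload)
                    delivered += 1
        return delivered


received = []
broker = ToyBroker()
broker.subscribe("iot-2/type/+/id/+/evt/status/fmt/json",
                 lambda topic, payload: received.append((topic, payload)))
count = broker.publish("iot-2/type/MKR1000/id/f8f005f5f8db/evt/status/fmt/json",
                       '{"d": {}}')
print(count, received)
```

One filter with `+` in the device-type and device-ID positions is enough to receive status events from every board in a workshop.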

Page 29

msg.payload

{
  "topic": "iot-2/type/MKR1000/id/f8f005f5f8db/evt/status/fmt/json",
  "payload": {
    "d": {
      "IBM_IoT_Workshop": "Arduino_MKR1000",
      "recordType": "sensorsRead",
      "DS3231_epoch": 1471003668,
      "DS3231_date": "08-13-2016",
      "DS3231_time": "13:07:48",
      "DS3231_tempC": 28,
      "DS3231_tempF": 82.4,
      "SHT31_tempC": 27.72,
      "SHT31_tempF": 81.94,
      "SHT31_humidity": 45.32,
      "TSL2561_lux": 9,
      "BMP183_hPa": 1004.22,
      "BMP183_tempC": 28.08,
      "BMP183_tempF": 82.55,
      "BMP183_altStatic": 78.98,
      "BMP183_altComputed": 68.09,
      "local_IP": "192.168.0.170",
      "mac_addr": "f8f005f5f8db"
    }
  },
  "deviceId": "f8f005f5f8db",
  "deviceType": "MKR1000",
  "eventType": "status",
  "format": "json",
  "_msgid": "4a43bc63.b5bc44"
}
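The `deviceType`, `deviceId`, `eventType` and `format` fields of the message mirror the segments of the topic string. A minimal sketch of pulling them out of the topic and reading sensor values from the nested `d` object follows; the message is a trimmed copy of the one above, and the function name is illustrative.

```python
import json


def parse_event_topic(topic: str) -> dict:
    """Split an iot-2/type/<t>/id/<i>/evt/<e>/fmt/<f> topic into named fields."""
    parts = topic.split("/")
    if (len(parts) != 9 or parts[0] != "iot-2"
            or parts[1::2] != ["type", "id", "evt", "fmt"]):
        raise ValueError(f"unexpected topic layout: {topic!r}")
    return {
        "deviceType": parts[2],
        "deviceId": parts[4],
        "eventType": parts[6],
        "format": parts[8],
    }


# Trimmed copy of the message shown on the slide.
message = json.loads("""
{
  "topic": "iot-2/type/MKR1000/id/f8f005f5f8db/evt/status/fmt/json",
  "payload": {"d": {"recordType": "sensorsRead",
                    "SHT31_tempC": 27.72, "SHT31_humidity": 45.32}},
  "deviceId": "f8f005f5f8db",
  "deviceType": "MKR1000"
}
""")

fields = parse_event_topic(message["topic"])
readings = message["payload"]["d"]
print(fields["deviceId"], readings["SHT31_tempC"], readings["SHT31_humidity"])
```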

Page 30

Demo – Node-RED

Page 31

ER1 Message Payloads

deviceStart

ipapiFetch

localWeather

sensorRead

badJSON

These are all placed into one NoSQL database
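Because every payload type lands in the same database, a consumer has to tell them apart by a field inside the document. The sketch below assumes that field is `recordType`, as in the sample sensor message; the documents are made-up stand-ins, not real ER1 output.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_by_record_type(documents: Iterable[dict]) -> Dict[str, List[dict]]:
    """Bucket mixed documents by their recordType field."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for doc in documents:
        groups[doc.get("recordType", "unknown")].append(doc)
    return dict(groups)


# Hypothetical documents, for illustration only.
docs = [
    {"recordType": "deviceStart", "mac_addr": "f8f005f5f8db"},
    {"recordType": "sensorsRead", "SHT31_tempC": 27.72},
    {"recordType": "sensorsRead", "SHT31_tempC": 27.80},
    {"note": "document without a recordType"},
]
groups = group_by_record_type(docs)
print({name: len(items) for name, items in groups.items()})
```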

Page 32

deviceStart

Page 33

ipapiFetch

Page 34

localWeather

Page 35

sensorRead

Page 36

Powerful DBaaS Operational NoSQL JSON store

Master-less architecture for maximum scalability & availability

Advanced APIs

REST (HTTPS) API

Replication & synchronization

Geo-load balancing

Incremental MapReduce indexes

Military-grade Geospatial indexes

Lucene full-text search

Offline access to mobile apps & data

A fully-managed NoSQL database layer that can be developed & deployed in days

Cloudant – NoSQL Database as a Service

Cloudant delivers a fully-managed database in service to the Analytics, App, and API economy

Spark Integration (Spark SQL)

dashDB Integration (Analytics)
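Cloudant's REST (HTTPS) API follows the CouchDB convention that a `POST` to the database URL with a JSON body creates a document. The sketch below only builds such a request with the standard library and does not send it; the account URL and database name are placeholders, and authentication is deliberately left out.

```python
import json
import urllib.request


def build_create_document_request(base_url: str, database: str,
                                  document: dict) -> urllib.request.Request:
    """Prepare (but do not send) a POST that would store one JSON document."""
    url = f"{base_url.rstrip('/')}/{database}"
    body = json.dumps(document).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


# Placeholder account URL and database name, for illustration only.
request = build_create_document_request(
    "https://example-account.cloudant.com/",
    "iot_events",
    {"recordType": "sensorsRead", "SHT31_tempC": 27.72},
)
print(request.get_method(), request.full_url)
```

Sending the request would be a matter of passing it to `urllib.request.urlopen` once credentials are attached.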


Page 37

Demo – Cloudant

Page 38

Edge to Warehouse

Cloudant sits on the Edge of Cloud

• Fast, minimal latency, scalable

• Transactional

• Not the place for long-term storage

• Not the place for analytics

Move IoT data to a warehouse

• Basic business intelligence

• Connect to other sources of data

• The start of analytics journey

dashDB on Bluemix

• Data Warehouse as a Service

Page 39

IBM dashDB – Analytics Warehouse as a Service

For apps that need:

• Elastic scalability

• High availability

• Data model flexibility

• Data mobility

• Text search

• Geospatial

Available as:

• Fully managed DBaaS

• On-premises private cloud

• Hybrid architecture

BLU Acceleration

Netezza In-Database Analytics

Cloudant NoSQL Integration

In-database analytics capabilities for best performance atop a fully-managed warehouse

dashDB MPP

Fully-managed data warehouse on cloud

Choice of SoftLayer or Amazon Web Services

BLU Acceleration columnar technology + Netezza in-database analytics

BLU in-memory processing, data skipping, actionable compression, parallel vector processing, “Load & Go” administration

Netezza predictive analytic algorithms

Fully integrated RStudio & R language

Oracle compatibility

Massively Parallel Processing (MPP)

On disk data encryption and secure connectivity

for

Analytics

Page 40

Demo – dashDB

Page 41

Replicating Cloudant JSON Data into dashDB

Cloudant’s Schema Discovery Process (SDP) translates JSON documents into a schema (or set of tables) that dashDB understands

SDP maintains continuous synchronization from Cloudant to dashDB
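To make the idea of translating JSON documents into a table schema concrete, the sketch below flattens nested objects into column names and guesses a SQL type per column. It is an illustration of the general technique only; it is not Cloudant's actual Schema Discovery Process, and the type names and naming scheme are assumptions.

```python
from typing import Any, Dict, Iterable

SQL_TYPES = {bool: "BOOLEAN", int: "BIGINT", float: "DOUBLE", str: "VARCHAR"}


def flatten(document: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Turn nested objects into flat column names, e.g. {"d": {"x": 1}} -> d_x."""
    columns: Dict[str, Any] = {}
    for key, value in document.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            columns.update(flatten(value, prefix=name + "_"))
        else:
            columns[name] = value
    return columns


def infer_schema(documents: Iterable[Dict[str, Any]]) -> Dict[str, str]:
    """Pick one SQL type per column, widening when documents disagree."""
    schema: Dict[str, str] = {}
    for doc in documents:
        for column, value in flatten(doc).items():
            sql_type = SQL_TYPES.get(type(value), "VARCHAR")
            previous = schema.get(column)
            if previous is None or previous == sql_type:
                schema[column] = sql_type
            elif {previous, sql_type} == {"BIGINT", "DOUBLE"}:
                schema[column] = "DOUBLE"   # integers widen to floating point
            else:
                schema[column] = "VARCHAR"  # fall back to text on any other clash
    return schema


docs = [
    {"deviceId": "f8f005f5f8db", "d": {"SHT31_tempC": 27.72, "TSL2561_lux": 9}},
    {"deviceId": "f8f005f5f8db", "d": {"TSL2561_lux": 9.5, "recordType": "sensorsRead"}},
]
schema = infer_schema(docs)
for column, sql_type in sorted(schema.items()):
    print(f"{column:20s} {sql_type}")
```

A column such as `TSL2561_lux` that arrives as an integer in one document and a float in another ends up as a floating-point column, which is the kind of decision any schema-discovery step has to make.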

Page 42

Demo – Replication and SQL

Page 43

Tailored Experiences For Users Collaborating Together

Data Engineer (Data Connect): Architects how data is organized & ensures operability

Data Scientist (Data Science Experience): Gets deep into the data to draw hidden insights for the business

Business Analyst (Watson Analytics): Works with data to apply insights to the business strategy

App Developer (Bluemix): Plugs into data and models & writes code to build apps

Workflow steps shown in the diagram (grouped as INPUT, ANALYSIS, OUTPUT): Ingest data; Transform: clean; Create and build model; Evaluate; Deliver and deploy model; Communicate results; Understand problem and domain; Explore and understand data; Transform: shape

Page 44

What is a “Notebook”?

Pen and Paper

Pen and paper has long provided the rich experience that scientists need to document progress through notes and drawings:

– Expressive

– Cumulative

– Collaborative

Notebooks

Notebooks are the digital equivalent of the “pen and paper” lab notebook, enabling data scientists to document reproducible analysis:

– Markdown and visualization

– Iterative exploration

– Easy to share

Page 45

Web-Based Notebooks…

Notebooks:

“interactive computational environment, in which you can combine code execution, rich text, mathematics, plots and rich media”

Jupyter

• Based on IPython

• Supports multiple interpreters

• Python, Scala, R

Zeppelin

• Apache incubator project

• Supports multiple interpreters

• Python, Scala, others

Data Scientist

&

Notebooks

Page 46

Introducing the Data Science Experience - DSX

Currently in Public Beta

http://datascience.ibm.com

Learn: Built-in learning to get started or go the distance with advanced tutorials

Create: The best of open source and IBM value-add to create state-of-the-art data products

Collaborate: Community and social features that provide meaningful collaboration

Powered by

Page 47

IBM Data Science Experience

Core Attributes of the Data Science Experience

Powered by IBM DataWorks in the Cloud

Community

• Find tutorials and datasets

• Connect with Data Scientists

• Ask questions

• Read articles and papers

• Fork and share projects

Open Source

• Code in Scala/Python/R/SQL

• Jupyter and Zeppelin* Notebooks

• RStudio IDE and Shiny apps

• Apache Spark

• Your favorite libraries

IBM Added Value

• Data Shaping/Pipeline UI*

• Auto-data preparation and modeling*

• Advanced Visualizations*

• Model management and deployment*

• Documented Model APIs*

• Spark as a Service

* DSX product roadmap items

Page 48

Demo – Data Science Experience

Page 49

IBM Watson Analytics - Smart Data Discovery in the Cloud

Designed to support the business professional’s analytics process so it’s easy to engage with and find meanings and patterns in your data in minutes.

Data prep made easy

Guided exploration

Understand outcomes

Share insights

All the benefits of advanced analytics without the complexity

Page 50

Demo – Watson Analytics

Page 51

IBM investment into Apache Spark

Foster Community

• Educate 1M+ data scientists and engineers via online courses

• Sponsor AMPLab, creators and evangelists of Spark

Infuse the Portfolio

• Integrate Spark throughout portfolio

• 3,500 employees working on Spark-related topics

• Spark however customers want it – standalone, platform or products

Contribute to the Core

• Launch Spark Technology Center (STC), 300 engineers

• Open source SystemML

• Partner with Databricks

"It's like Spark just got blessed by the enterprise rabbi."

Ben Horowitz, Andreessen Horowitz

Source: https://www-03.ibm.com/press/us/en/pressrelease/47107.wss

Page 52

IBM has the largest investment in Spark of any company in the world

IBM Spark

IBM Spark Technology Center

• Launched in June of 2015

• Goal to hire 300 Engineers

• Goal to contribute to the Apache Spark community

• Contributed SystemML technology to Apache community

• STC continues to grow...

IBM Contributes to core Apache Spark Project

www.spark.tc

Page 53

http://www.spark.tc/blog/

IBM driving SQL and Machine Learning innovation

Page 54

Big Data University

http://bigdatauniversity.com/

Foster Community - Free Education

Page 55

Sign up to learn more!

Webinars

Meetups

Hands-on Labs

Learning Resources

Email

Twitter: @data_gurus

http://ibm.biz/datagurus

Page 56

Raffle!

Fill out the paper form and drop it in the box.

Two books being given away!

Page 57

Dale Mumper Open Source Analytics Solution Engineer - Industrial

[email protected]