Sabrina Kirstein @ RapidMiner

Developing Extensions for RapidMiner …rapidly, November 17th, 2014, Sabrina Kirstein

Transcript of Sabrina Kirstein @ RapidMiner

Page 1: Sabrina Kirstein @ RapidMiner

Developing Extensions for RapidMiner …rapidly

November 17th, 2014

Sabrina Kirstein

Page 2: Sabrina Kirstein @ RapidMiner

RapidMiner Company Overview

Easy-to-use, blazing fast, and very easy to integrate with any IT infrastructure

Support from a thriving community of contributors creating new extensions and applications

Processes designed in RapidMiner can be one-click deployed to RapidMiner Server or RapidMiner Cloud

A unique Marketplace for independent developers to publish their innovative extensions

RapidMiner delivers the power of predictive analytics to business users. No programming required.

More than 60 connectors (incl. SAP, Hadoop, and cloud connectors such as Twitter and Zapier) allow easy access to structured and unstructured data.

Page 3: Sabrina Kirstein @ RapidMiner

RapidMiner History

Cloud
• Cloud
• Hadoop

Business Source
• Commercial Editions
• Community Editions
• Client and Server

Open Source
• Command Line
• Initial Workbench

Open Source
• Complete Workbench
• Community Extensions
• Marketplace

Community Growth

2007: 5,000
2010: 30,000
2013: 150,000
2014: 250,000

Page 4: Sabrina Kirstein @ RapidMiner

RapidMiner Metrics

60+ Employees Worldwide

100+ Active Developers

600+ Customers in over 50 Countries

40,000+ Downloads per Month

35,000+ Active Deployments with over 250,000 Users

Page 5: Sabrina Kirstein @ RapidMiner

Product Overview

Page 6: Sabrina Kirstein @ RapidMiner

RapidMiner Studio

• With access to over 1500 operators, the Java-based visual environment of RapidMiner allows for rapid data mining process development

Visual Process Design Environment

Page 7: Sabrina Kirstein @ RapidMiner

Accelerators

Wizard

• Selection of data and label (e.g. churn) column.

• The label column contains missing values where the label is unknown – those will be predicted

Results

• Predictions (individuals, churn predictions)

• Descriptive model

• Model accuracy and lift chart

Page 8: Sabrina Kirstein @ RapidMiner

RapidMiner Cloud Repository & Execution

Page 9: Sabrina Kirstein @ RapidMiner

RapidMiner Server

RapidMiner Server provides enterprise-wide process development and process-to-web-service conversion, with dynamic dashboards and data visualizations.

Page 10: Sabrina Kirstein @ RapidMiner

Extensions and the Marketplace

http://marketplace.rapidminer.com

Page 11: Sabrina Kirstein @ RapidMiner

Existing Extensions

Edda – Extensions for Binominal Text Classification

Instance selection and Prototype based rules

RapidMiner Finance and Economics Extension

Multimedia Mining Extension

Page 13: Sabrina Kirstein @ RapidMiner

Linked Open Data Extension

• Assume a rating system for books giving us an ISBN number and a rating from 1 to 5

• Goal: Predict the popularity of new books

Page 14: Sabrina Kirstein @ RapidMiner

Linked Open Data Extension

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ontology: <http://dbpedia.org/ontology/>
select distinct ?book ?author ?isbn ?country ?abstract ?pages ?language
where {
  ?book rdf:type ontology:Book .
  ?book ontology:author ?author .
  ?book ontology:abstract ?abstract .
  ?book ontology:isbn ?isbn .
  ?book ontology:numberOfPages ?pages .
  ?book ontology:language ?language .
  ?book ontology:country ?country .
}
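
Outside RapidMiner, the same query could be sent to a SPARQL endpoint over plain HTTP. The sketch below only builds the GET request URI following the standard SPARQL protocol convention (`query` parameter, URL-encoded). The endpoint `https://dbpedia.org/sparql` and the class and method names are assumptions for illustration; the slides do not say which endpoint the Linked Open Data Extension uses or how it issues requests.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Builds a GET request URI for a SPARQL endpoint. Sketch only: no request is sent. */
final class BookQueryUri {

    // Assumed endpoint (public DBpedia service); not named in the slides.
    static final String ENDPOINT = "https://dbpedia.org/sparql";

    // The query from the slide, one clause per line.
    static final String BOOK_QUERY = String.join("\n",
            "PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>",
            "PREFIX ontology: <http://dbpedia.org/ontology/>",
            "select distinct ?book ?author ?isbn ?country ?abstract ?pages ?language",
            "where {",
            "  ?book rdf:type ontology:Book .",
            "  ?book ontology:author ?author .",
            "  ?book ontology:abstract ?abstract .",
            "  ?book ontology:isbn ?isbn .",
            "  ?book ontology:numberOfPages ?pages .",
            "  ?book ontology:language ?language .",
            "  ?book ontology:country ?country .",
            "}");

    /** URL-encodes the query text and appends it as the "query" parameter. */
    static URI buildUri(String endpoint, String sparql) {
        String encoded = URLEncoder.encode(sparql, StandardCharsets.UTF_8);
        return URI.create(endpoint + "?query=" + encoded);
    }

    public static void main(String[] args) {
        // Prints the URI; fetching it is left to the caller.
        System.out.println(buildUri(ENDPOINT, BOOK_QUERY));
    }
}
```

The resulting rows (ISBN, author, page count, and so on) would then be joined with the ratings on the ISBN to give the learner additional attributes.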

Page 15: Sabrina Kirstein @ RapidMiner

Linked Open Data Extension

……

Page 16: Sabrina Kirstein @ RapidMiner

Text-/Web-Mining Extensions

Page 17: Sabrina Kirstein @ RapidMiner

Multimedia Mining Extension

Page 18: Sabrina Kirstein @ RapidMiner

WhiBo Extension

Page 19: Sabrina Kirstein @ RapidMiner

MLWizard Extension

1. Define data location

2. Evaluation of different models

Page 20: Sabrina Kirstein @ RapidMiner

MLWizard Extension

3. Load the best model

4. The process will be designed for you

Page 21: Sabrina Kirstein @ RapidMiner

How to extend RapidMiner Studio

Page 22: Sabrina Kirstein @ RapidMiner

How to extend RapidMiner Studio

git clone https://github.com/rapidminer/rapidminer-extension-tutorial.git
gradle installExtension

• Live Demo:

– Extension skeleton

– Operators

– Special data objects

– Advanced Extension elements

– Accelerators

• Documentation

http://www.rapidminer.com/documentation

Page 23: Sabrina Kirstein @ RapidMiner

How to integrate RapidMiner

• By web services:

Web Service API

1. Export process as a web service in RM Server

2. Select output format (JSON, XML, PNG, …)

3. Either:

• HTTP POST to that URL

• Read process results from HTTP response

or

• <iframe> into other website
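
The HTTP POST variant of step 3 can be sketched with the JDK's built-in HTTP client (Java 11+). Everything specific here is an assumption for illustration: the service URL is supplied by the caller, the form-encoded parameter `customer_id` is hypothetical, no authentication is configured, and the `Accept` header merely mirrors a JSON output format chosen in step 2. The actual URL and parameters depend on how the process was exported.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Calls a process that was exported as a web service. Sketch under stated assumptions. */
final class WebServiceCall {

    /** Builds the POST request; nothing is sent yet. */
    static HttpRequest buildRequest(String serviceUrl, String formBody) {
        return HttpRequest.newBuilder()
                .uri(URI.create(serviceUrl))
                // Assumed: parameters are passed as a form-encoded body.
                .header("Content-Type", "application/x-www-form-urlencoded")
                // Assumed: JSON was selected as the output format.
                .header("Accept", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(formBody))
                .build();
    }

    /** Sends the request and returns the response body (the process results). */
    static String send(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected HTTP status: " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java WebServiceCall.java <service-url>");
            return;
        }
        // "customer_id" is a hypothetical process parameter.
        HttpRequest request = buildRequest(args[0], "customer_id=42");
        System.out.println(send(HttpClient.newHttpClient(), request));
    }
}
```

Keeping request construction separate from sending makes the request easy to inspect or log before any network traffic happens.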

Page 24: Sabrina Kirstein @ RapidMiner

How to integrate RapidMiner

• OEM:

Java

1. RapidMiner can be easily invoked

2. Call RapidMiner.init()

3. Use the code: create processes, run processes or transform data

Page 25: Sabrina Kirstein @ RapidMiner

THANK YOU

RapidMiner USA

RapidMiner, Inc. (Headquarters)
10 Fawcett St
Cambridge, MA 02138
United States

E-mail [email protected]
Phone +1 - 617 - 401 - 7708
Fax +1 - 617 - 401 - 7709

RapidMiner Germany

RapidMiner GmbH
Stockumer Str. 475
44227 Dortmund
Germany

E-mail [email protected]
Phone +49 - 231 - 425 786 9-0
Fax +49 - 231 - 425 786 9-9

RapidMiner UK

RapidMiner Ltd.
Quatro House, Frimley Road
Camberley GU16 7ER
United Kingdom

E-mail [email protected]
Phone +44 1276 804 426
Fax +1 - 617 - 401 - 7709

RapidMiner Hungary

RapidMiner Kft
Ipar utca 5
1095 Budapest
Hungary

E-mail [email protected]
Phone +44 1276 804 426
Fax +1 - 617 - 401 - 7709

www.rapidminer.com