Slide deck: More than Reporting - Data Analysis with Oracle R Enterprise - DOAG Development and DOAG SIG...
Copyright © 2014 Oracle and/or its affiliates. All rights reserved.
More than Reporting: Data Analysis with Oracle R Enterprise
Dr. Nadine Schöne, Sales Consultant, Oracle Direct, Sales Consulting
Dr. Michael Haupt, Principal Member of Technical Staff, Oracle Labs, Virtual Machine Research Group
September 25, 2014

Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle’s products remains at the sole discretion of Oracle.

Agenda
1. More than standard reporting?
2. Advanced data analysis
3. R and Oracle R Enterprise (ORE)
4. Demo
5. Benefits
6. Outlook: more performance for R

More than Standard Reporting?

Reporting

Advanced Data Analysis

Sensor Data Analysis I
200,000 households
3 years
1 measurement per hour
5.256 billion measurements (26,280 measurements per customer)
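
The volume figures on this slide follow from simple arithmetic. The sketch below recomputes them, assuming 365-day years (leap days ignored) and exactly one reading per household per hour; the function and constant names are illustrative and not taken from the slides.

```python
# Recompute the data volume from the slide's inputs.
# Assumption: 365-day years, one reading per household per hour.
HOUSEHOLDS = 200_000
YEARS = 3
DAYS_PER_YEAR = 365
HOURS_PER_DAY = 24


def measurements_per_household(years: int = YEARS) -> int:
    """Number of hourly readings collected for one household."""
    return years * DAYS_PER_YEAR * HOURS_PER_DAY


def total_measurements(households: int = HOUSEHOLDS, years: int = YEARS) -> int:
    """Number of hourly readings collected across all households."""
    return households * measurements_per_household(years)


if __name__ == "__main__":
    print(f"{measurements_per_household():,} readings per household")
    print(f"{total_measurements() / 1e9:.3f} billion readings in total")
```

Under these assumptions the result is 26,280 readings per household and 5.256 billion readings overall.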

Sensor Data Analysis II
10 s per model
200,000 households ➔ 200,000 models
One model after another: about 23 days + 4 hours
With Oracle R Enterprise: 4.3 hours
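
The runtime comparison can be checked with a few lines of arithmetic. This is a minimal sketch that reads the 23-day figure as fitting the models one after another and the 4.3-hour figure as the Oracle R Enterprise run; that reading of the slide layout is an assumption, and the speedup factor is derived here rather than stated in the deck.

```python
# Back-of-the-envelope check of the model-fitting runtimes.
SECONDS_PER_MODEL = 10   # from the slide
MODELS = 200_000         # one model per household
ORE_HOURS = 4.3          # from the slide; assumed to be the ORE runtime


def sequential_seconds(models: int = MODELS, per_model: int = SECONDS_PER_MODEL) -> int:
    """Total time when the models are fitted strictly one after another."""
    return models * per_model


def split_days_hours(seconds: int) -> tuple[int, float]:
    """Split a duration in seconds into whole days plus remaining hours."""
    days, rest = divmod(seconds, 86_400)
    return days, rest / 3_600


def implied_speedup(ore_hours: float = ORE_HOURS) -> float:
    """Ratio of the sequential runtime to the reported ORE runtime."""
    return sequential_seconds() / (ore_hours * 3_600)


if __name__ == "__main__":
    days, hours = split_days_hours(sequential_seconds())
    print(f"sequential: {days} days + {hours:.1f} hours")
    print(f"implied speedup: {implied_speedup():.0f}x")
```

The sequential total of 2,000,000 seconds is 23 days plus roughly 3.6 hours, which the slide rounds to 4 hours. The implied speedup of about 129x suggests the models were fitted in parallel, although the deck does not state the degree of parallelism.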

R Screenshots

Oracle Advanced Analytics
Wide range of in-database data mining and statistical functions
• Data Understanding & Visualization
– Summary & Descriptive Statistics
– Histograms, scatter plots, box plots, bar charts
– R graphics: 3-D plots, link plots, special R graph types
– Cross tabulations
– Tests for Correlations (t-test, Pearson’s, ANOVA)
– Selected Base SAS equivalents
• Data Selection, Preparation and Transformations
– Joins, Tables, Views, Data Selection, Data Filter, SQL time windows, Multiple schemas
– Sampling techniques
– Re-coding, Missing values
– Aggregations
– Spatial data
– R to SQL transparency and push down
• Classification Models
– Logistic Regression (GLM)
– Naive Bayes
– Decision Trees
– Support Vector Machines (SVM)
– Neural Networks (NNs)
• Regression Models
– Multiple Regression (GLM)
– Support Vector Machines
• Clustering
– Hierarchical K-means
– Orthogonal Partitioning
– Expectation Maximization
• Anomaly Detection
– Special case Support Vector Machine (1-Class SVM)
• Associations / Market Basket Analysis
– Apriori algorithm
• Feature Selection and Reduction
– Attribute Importance (Minimum Description Length)
– Principal Components Analysis (PCA)
– Non-negative Matrix Factorization
– Singular Value Decomposition
• Text Mining
– Most OAA algorithms support unstructured data (e.g. customer comments, email, abstracts)
• Transactional Data
– Most OAA algorithms support transactional data (e.g. purchase transactions, repeated measures over time)
• R packages: ability to run open source
– Broad range of R CRAN packages can be run as part of database process via R to SQL transparency and/or via Embedded R mode
* included in every Oracle Database
Descriptive data analysis & visualization
Classification & regression models
Clustering
Use of open source R packages
Data preparation & transformations

Key Topics for Enterprise Data Analytics
1. Scalability
2. Performance
3. Development & production

R and Oracle R Enterprise (ORE)

Aspects of Conventional R/Database Interaction
R logo © R Foundation, from http://www.r-project.org

"Collaborative Execution" Model
[Diagram with three numbered components]
1. User R engine (desktop): R engine, other R packages, Oracle R Enterprise packages
2. Database compute engine: Oracle DB with user tables; arrows labeled "SQL" and "Results"
3. R engine(s) managed by the Oracle DB: R engine, other R packages, Oracle R Enterprise packages; arrows labeled "R" and "Results"
Captions:
• Post-processing of the results
• Analyses that are not available in the Oracle DB
• Execution in collaboration with the Oracle DB
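
The difference between pulling all rows into a client session and letting the database compute the result can be illustrated without any Oracle software. The sketch below is only an analogy using Python's built-in sqlite3 module; it does not show the Oracle R Enterprise API, and the table name `readings`, its columns, and the sample values are hypothetical.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table with a few made-up readings per household."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE readings (household_id INTEGER, kwh REAL)")
    rows = [(h, float(h + m)) for h in range(1, 4) for m in range(4)]
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    return conn


def mean_client_side(conn: sqlite3.Connection) -> dict[int, float]:
    """Conventional interaction: fetch every row, then aggregate in the client."""
    sums: dict[int, float] = {}
    counts: dict[int, int] = {}
    for household_id, kwh in conn.execute("SELECT household_id, kwh FROM readings"):
        sums[household_id] = sums.get(household_id, 0.0) + kwh
        counts[household_id] = counts.get(household_id, 0) + 1
    return {h: sums[h] / counts[h] for h in sums}


def mean_in_database(conn: sqlite3.Connection) -> dict[int, float]:
    """Push-down: the database aggregates and returns one row per household."""
    query = "SELECT household_id, AVG(kwh) FROM readings GROUP BY household_id"
    return dict(conn.execute(query).fetchall())


if __name__ == "__main__":
    conn = build_demo_db()
    print(mean_client_side(conn))
    print(mean_in_database(conn))
```

Both functions return the same means, but the second transfers three result rows instead of all twelve readings. With billions of readings, that difference in data movement is what the push-down approach is meant to avoid.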

Oracle's R Technologies
• Oracle R Distribution
• ROracle
• Oracle R Enterprise
• Oracle R Advanced Analytics for Hadoop
Freely available to the R community

Demo

Benefits

Benefits I
5,881 R packages

Benefits II
Integration
Performance & scalability
High-performance enterprise predictive analytics applications
Low total cost of ownership

Outlook: More Performance for R

FastR
• Reimplementation of R in Java
– Uses Graal (compiler) and Truffle (AST interpreter)
– Dynamic compilation, scaling on heterogeneous architectures
– Contributors: Oracle Labs (Germany, USA, Austria), JKU Linz, Purdue University, TU Dortmund
[Figure: node rewriting in an AST interpreter. The AST interpreter starts with uninitialized nodes; node rewriting for profiling feedback produces an AST interpreter with rewritten nodes; compilation using partial evaluation produces compiled code. Node transitions legend: U = Uninitialized, I = Integer, D = Double, S = String, G = Generic.]
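
The node-rewriting idea can be sketched as a toy addition node that specializes itself according to the operand types it has seen. This is an illustrative Python model, not FastR or Truffle code: the class name `AddNode` and its single-letter states are hypothetical, the String state is omitted, and a real implementation replaces node objects in the tree and compiles the stabilized tree via partial evaluation.

```python
class AddNode:
    """Toy addition node that rewrites itself based on observed operand types."""

    # Specialization order: Uninitialized -> Integer -> Double -> Generic.
    # A node only ever moves towards the more general state.
    ORDER = ["U", "I", "D", "G"]

    def __init__(self) -> None:
        self.state = "U"

    @staticmethod
    def _required_state(left, right) -> str:
        """Least general state that can handle this pair of operands."""
        # Exact type checks on purpose: bool is a subclass of int in Python.
        if type(left) is int and type(right) is int:
            return "I"
        if all(type(v) in (int, float) for v in (left, right)):
            return "D"
        return "G"

    def execute(self, left, right):
        needed = self._required_state(left, right)
        if self.ORDER.index(needed) > self.ORDER.index(self.state):
            self.state = needed  # the "node rewriting" step
        if self.state == "I":
            return left + right                # integer fast path
        if self.state == "D":
            return float(left) + float(right)  # double path
        return left + right                    # generic fallback


if __name__ == "__main__":
    node = AddNode()
    for args in [(1, 2), (1.5, 2), (1, 2), ("a", "b")]:
        result = node.execute(*args)
        print(args, "->", result, "state:", node.state)
```

Once a node has reached a stable state, a compiler can assume that state and emit code for just that case, which is the role the figure assigns to compilation using partial evaluation.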

“R is a powerful and interesting tool for data analysis! ORE brings R into a scalable DB engine (solving problems of data management, analysis and scalability). We actually can obtain information and added value from not so actively used data.”
– Stefano Alberto Russo, Researcher at CERN Openlab

Further Information
ORE discussion forum: https://community.oracle.com/community/developer/english/business_intelligence/data_warehousing/r
Oracle Advanced Analytics: http://www.oracle.com/technetwork/database/options/advanced-analytics/index.html
ORE blog: https://blogs.oracle.com/R/
FastR: https://bitbucket.org/allR/fastR
Graal/Truffle: https://wiki.openjdk.java.net/display/Graal/Main

Contact
Dr. Nadine Schöne | Sales Consultant
Email: [email protected]
Tel: +49 331 200 7190
ORACLE Deutschland B.V. & Co. KG
Schiffbauergasse 14
14467 Potsdam