SAP PPT Template - ScaDS Stefan Bäuerle, Jonathan Dees, SAP August, 2017 SAP HANA and SAP Vora Data...

39
CUSTOMER Stefan Bäuerle, Jonathan Dees, SAP August, 2017 SAP HANA and SAP Vora Data Management for Enterprises

Transcript of SAP PPT Template - ScaDS Stefan Bäuerle, Jonathan Dees, SAP August, 2017 SAP HANA and SAP Vora Data...

CUSTOMER

Stefan Bäuerle, Jonathan Dees, SAP

August, 2017

SAP HANA and SAP VoraData Management for Enterprises

2CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Introduction

SAP HANA

▪ Overview, Scenarios

▪ InMemory & Multi-Model

SAP Vora & SAP DataHub

▪ Motivation

▪ SAP Vora Scale-Out Architecture and Challenges

▪ SAP DataHub Architecture and Challenges

Summary

Agenda

Introduction

4CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP is the world leader in enterprise applications in terms of software and software-related service

revenue. Based on market capitalization, we are the world’s third largest independent software

manufacturer.

335,000+Customers in more than 180

countries

87,100+ Employees in 130+

countries

€22.06bnTotal Revenue (IFRS) in

FY2016 (preliminary)

87%Of Forbes Global 2000 are

SAP customers

45 YearsOf history and innovation

100+ Innovation and development

centers

15,000+SAP partner companies

globally

125 Mil.Subscribes in our cloud user

base

*Source: http://go.sap.com/corporate/en/company.html

5CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Big Data & Internet-of-Things

▪ Big Data for Enterprises

▪ Combining big data (devices,

social) with Enterprise data

Bi-Modal IT

▪ Managing two coherent styles for

work

– Focus on predictability,

– Focus on exploration

Digital Transformation

▪ Customer Experience

▪ Operational Processes

▪ Business Models

▪ Paradigm shift in data processing

▪ Mobile, cloud, social, big data

Key Trends

6CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP HANA Platform & Database | SAP Vora

INCN

FR

VN

UK NL

ES

DECA

KRUS

Strong Enterprise-level Data Management Background

SAP HANA

SAP Vora

SAP IQ

SAP ASE

SAP MaxDB

SAP Event Stream Proc.

SAP SQL Anywhere

SAP Data Services

SAP Replication Server

SAP Information Steward

SAP Power Designer

SAP Enterprise Architect

SAP HANA

8CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Free IT to Focus on Innovation

Leverage ALL Data for Innovation

Discover Deeper Insight to Drive Innovation

Develop Solutions to Power Innovation

Adopt Innovation at The Pace of Your Business

SAP HANA: The Modern Data Platform Innovation Journey

S/4HANA

Vora

BW/4HANA

SAP HANA,

express edition

SAP BW on

SAP HANA

SAP Business

Suite on SAP

HANA

SAP HANA

Enterprise

Cloud & SAP

Cloud Platform

Powering the

Digital Core

Transform Big Data

Analytics

Next Gen Data

Warehouse

THE Modern

Innovation Platform

Start for free, fast,

and grow

9INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

TECHNOLOGY

TREND(S)

Super Computing

Smarter World

BENEFITS

Cost reduction

Customer satisfaction

Sustainability/CSR

CUSTOMER SITUATION AND PAIN POINTS

thousand taxi cabs, 7,000 buses and 1 million private cars running Nanjing,

once the capital of China, is one of the top 20 cities in China with a

population over 8 million and a GDP of $141.7 billion (with a growth rate of

10% in 2014). It is a highly influential city for all of China.

• The overall traffic volume is immense. Within the city there are about

10,000 g in the city road network.

• Economically, congestions can have a huge impact. For example,

Beijing claimed a 12 billion US$ loss due to congestions in 2013.

• Increasing urbanization makes it difficult for city like Nanjing to control

traffic.

NEED FOR CHANGE

• Plan and optimize Nanjing traffic regulation.

• Proactively manage different traffic and transportation scenarios

to provide better services to citizens and help better urban

planning.

• Manage next phase of sustained growth with the help of

advanced technology from SAP for big data analysis and

predictive analytics, mobile computing and cloud-based

solutions

SITUATION AFTER DIGITIZATION• Together with SAP, the city of Nanjing developed a next-generation

solution, Smart Traffic, which includes the use of sensors and RFID

chips to generate continuous data streams about the status of

transportation systems across the city.

• Adopting smart traffic control technology to crunch the 20 billion data

points captured in the city every year has helped the city to produce

actionable information for predictively responding to traffic congestion.

• Smart Traffic solutions, like electronic tolling, traffic management,

parking guidance & reservation systems, navigation support, real-time

traffic updates and passenger display systems help understand the city’s

dynamic travel patterns and determine the impact estimation to

eventually mitigate congestions or car accidents.

TANGIBLE RESULTS

Smart Traffic Control enables cities to optimize traffic-light

controls and free up additional car lanes during the rush hour to

alleviate congestion.

With SAP’s new approach for zoning -Traffic Analysis Zone

(TAZ) the city is able to create one digital map that allows for a

holistic view on the as-is situation on the spot and generates

recommendations for traffic planning.

The map can tell exactly how many vehicles arrive or leave a

certain city district in the early morning hours. This information

is then used to predict the after-work traffic.

Concrete numbers to be expected.

SOURCES

Smart Cities – How SAP

Helps Clear Traffic Jams

Connected Cars Rev Up

For A Revolution

Nanjing Video

Smart Traffic Innovations

for Cities and

Passenger Transit from

SAP

Smart Traffic in the City of Nanjing

10INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Smart Traffic in the City of Nanjing

Platform used to develop the

application

SAP HANA Platform

Processor primarily used in the

deployment

Intel® Xeon® processor E7

Technical impact of the solution1. The solution processes over 100 million

records of FCD (Floating Car Data) and

FDD (Fix Device Data) every day

2. The solution calculates Traffic

Performance Index, Road Speed and

Road Volume every 5 minutes based on

the data volume in point 1

3. City management and 0.8 million citizen

uses the result of our solution

11INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Zalando

Supporting the Demands of a Business Transformation with SAP HANA

Company Zalando SE

HeadquartersBerlin, Germany

IndustryRetail

Products and ServicesClothing and fashion accessories

Employees10,000

Revenue~€3 billion

SourcesTransformation Study

Objectives

Support transformation to become Europe’s leading online fashion platform

Increase financial processing capacity to handle the demands of the platform business

Develop a smart data strategy

Why SAP

Financial processes that have run with SAP software for many years

Resolution

Migrated a contract accounts receivable and payable (FI-CA) solution from SAP that was running on Oracle to

the SAP HANA platform

Became the first SAP customer in the world to run this solution on SAP HANA

Was selected as a finalist for an SAP HANA Innovation Award

Benefits

Increased financial processing capacity by an order of magnitude

Greatly accelerated monthly closing

Increased employee productivity by eliminating runtime issues

Improved customer and employee satisfaction as well as attractiveness as an employer

With the migration to SAP HANA, the system is now ready to handle the new demands. It can conservatively

process at least 10 times the current volume

40xAcceleration of dunning run performance

25%Reduction in invoice processing time

660 millionDocuments manageable in real time

SAP HANA Analytic ContentData Marts

“With SAP HANA, we have substantially accelerated all financial processes from dunning through invoice

processing to closing activities. Our staff is now much more confident in their day-to-day work, including

interacting with customers.”

Martina Hilzinger, Head of SAP Competence Center, Zalando SE

12INTERNAL© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Objectives Build a real-time supply chain Break down information communication barriers between different enterprises, and

integrate them to enable real-time information sharing and workflow coordination Synchronize collected data in real time via the Internet of Things for production facilities

and detection equipment to enable real-time quality control and progress tracking

Why SAP Good existing partnership with SAP Unmatched performance of the SAP HANA platform

Resolution Implemented SAP HANA Integrated SAP HANA directly with R database Launched the project in three stages: quality management and control, creating an order

system and instant-messaging platform, and logistics management

Benefits Significant cost reduction Greatly improved efficiency Greater quality management and control

3%–5%Improvement in process capability and process variation

2%–3%Increase in product-test

pass rate

30%Reduction in quality cost

0Production delays caused by quality problems

“SAP HANA is a high-performance computing platform. Without it, there is nothing we can do to

extend the supply chain, no matter how superb our ideas and quality-control procedures are.”

Xiangguo Dong, IT Director and Project Lead, Advanced Micro-Fabrication Equipment Inc.

49098 (17/01) This content is approved by the customer and may not be altered under any circumstances.

AMEC

Creating a Real-Time Supply Chain by breaking down Interenterprise Barriers

Company

Advanced Micro-Fabrication

Equipment Inc.

Headquarters

Shanghai, China

IndustryIndustrial machinery and components –semiconductors

Products and Services

Primo D-RIE dielectric tech

tools for production of nano-

scale chips, Primo TSV etch

tools for production of 3D

chips, metal organic chemical

vapor deposition tools for

semiconductor lighting and

power-device chip production

Employees600

Web Site

www.amec-inc.com

13CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP HANA Platform: The platform for all applications

Simplify, accelerate, innovate

DATABASE SERVICES

Web Server JavaScript

Graphic Modeler

Data Virtualization ETL & Replication

Columnar OLTP+OLAP

Multi-Core & Parallelization

Advanced Compression

Multi-tenancy Multi-Tier Storage

Graph Predictive Search

DataQuality

SeriesData

Business Functions

SAP Vora, Hadoop & Spark Integration

Streaming Analytics

Application Lifecycle Management

High Availability &Disaster Recovery

OpennessDataModeling

Admin &Security

Remote Data Sync

Spatial

Text Analytics

Fiori UX

ALM

</>

APPLICATION SERVICES INTEGRATION & QUALITY SERVICESPROCESSING SERVICES

SAP, ISV and Custom Applications

All Devices

OLTP + OLAP ONE Open Platform ONE Copy of the Data

S A P H A N A P L A T F O R M

MachineLearning

{.}JSON

14CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP HANA Platform: Database Services

Breakthrough innovations

Turns data into real-time information

No database tuning required for complex and ad hoc queries

Run Transactions and Analytics together on one system and one copy of data

Ready for Cloud, Hybrid, or On-premise deployment

Not limited by the size of memory

DATABASE SERVICES

Columnar OLTP+OLAP

Multi-Core & Parallelization

Advanced Compression

Multi-tenancy Multi-Tier Storage

High Availability &Disaster Recovery

OpennessDataModeling

Admin & Security

APPLICATION SERVICES INTEGRATION & QUALITY SERVICESPROCESSING SERVICES

S A P H A N A P L A T F O R M

15CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

DATABASE SERVICES

SAP HANA Platform: Database Services

Comprehensive advanced data processing and analytics

Run applications with dramatically different datatype characteristics in the same system

Optimize graph, planning, and rules applications on the same data

Empower your business via built-in predictive analytics, business functions, and data quality

PROCESSING SERVICES

Graph Predictive Search SeriesData

Business Functions

Streaming Analytics

Spatial Text Analytics

INTEGRATION & QUALITY SERVICES

APPLICATION SERVICES

S A P H A N A P L A T F O R M

MachineLearning

{.}JSON

16CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

DATABASE SERVICES

APPLICATION SERVICES

SAP HANA Platform: Application Services

Web server and database in one system reducing data movements

ALM</>

Web Server JavaScript Graphic Modeler

Application Lifecycle Management

Fiori UX

INTEGRATION &QUALITY SERVICES

PROCESSING SERVICES

Deliver consumer-grade User Experiences for any device, automatically

Support standards and languages that developers already know how to use

– Java, Java Script, C++, Node.JS, SQL, JSON, ADO.NET, JDBC, ODBC, OData, HTML5, MDX, XML/A

Built-in tools to develop, version-control, bundle, transport, and install applications

S A P H A N A P L A T F O R M

17CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

DATABASE SERVICES

APPLICATION SERVICES

INTEGRATION & QUALITY SERVICES

PROCESSING SERVICES

SAP HANA Platform: Integration Services

Data from any source for a complete view of the business

Data Virtualization

ELT & Replication

DataQuality

SAP Vora, Hadoop & Spark Integration

Remote Data Sync

Access information stored in data silos while keeping the data in place

Replicate and move any type of data in real-time to the cloud and on-premise when necessary

Capture and analysis live data streams and route to appropriate storage or dashboard

Synchronize data between HANA and thousands of remote databases (SQL Anywhere, UltraLite)

Multiple access points from HANA to BigData: thru SAP Vora, Spark, Hive, HDFS & MapReduce

S A P H A N A P L A T F O R M

18CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Live Earth Data: Analysis & Insight Through Machine Learning

SAP HANA Demo: Predicting Landslides

19CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Graph▪ A schema-flexible and scalable data

store for storage, processing, combination, exploration, and analysis of irregularly structured data with complex and dynamic relationships

▪ Provides property graph model of Vertices & Edges

▪ Powerful declarative data query and manipulation language for property graphs

▪ Imperative API for developing application-specific graph algorithms directly in the database engine

▪ Cypher for Pattern matching

Document Store▪ Enterprise ready Document Store to

handle JSON documents natively

▪ Full ACID support

▪ Extended processing, e.g. joins between collections, aggregations

▪ Multi-Model Cross-processing, e.g. Joins

▪ High performance (even for analytics)

▪ SQL-like syntax

Multi-Model Data Processing

Graph

NoSQL

Document Store

{.}

Advanced Processing

Search

SeriesData

Streaming Analytics

Spatial

Text Analytics

SAP HANA Platform: NoSQL & Multi-Model Data Processing Capabilities

SAP Vora & SAP DataHub

21CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Big Data? More than a buzzword

Huge Data Volume

Terabytes to multiple Petabyte,

low value data

High Data Processing

Velocity

High speed streaming from a huge

number of different sources

Increasing Data Variety

Diverse sources, many data types:

like text, graph, image, custom..

Data quality / data trust

systematic data errors,

format issues, missing structure,

missing entries, alignment issues,

how can data/source be trusted…

22CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Modern landscapes: an example

Big DataRaw data Enterprise data

Data scientist Data analyst

DW admin/developer

IT department Businessuser

ORACLESAP ERP

Developer

Key challenges and questions

• Diverse technologies

• Diverse consumers

• Diverse requirements

• How can huge and ever-

growing data volumes be

handled?

• Rapidly changing landscape –

where is the next opportunity?

Another set of silos

is not the answer!SAP

S/4HANA

23CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Finalize the vision: flow-based big data management

REFINE STRUCTURE MODEL CONSUMECOLLECT

Data Lake Enterprise Data Warehouse

Apps

Cust.

Data

Distributed Big

Data Management

Master

data

IoT

data

Data

prov.

10101

$

#123

%&?§

JOIN, LOCK-UP,

FILTER, CLEANSE

SCHEMA, CONNECTIONS,

CORRELATION, TYPES

AGGREGATE, FEDERATE,

MODEL, TRANSFORM

ANALYZE, EVENTS,

TRIGGER

24CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SDA

Kubernetes Cluster

DevicesDevices

SAP VoraSAP Data Hub

Pipeline Engine

Applications

Data Ingestion Pipeline

Op1 Op 2 Op 3 Op 4

Toward a cloud-ready and efficient big data processing Architecture

SparkCustom

CodeRules

Cluster Orchestration

Edge / Device Management

External / Cloud Storage

DevicesDevices

DevicesDevices

IoT Service

Message Broker

SAP HANA

In-memory store

Predictive

Maintenance

Real-time Demand/

Supply Forecast

SPARKInterface

Collection

Fraud

DetectionBrand

Sentiment

Product

RecommendationPersonalized

Care

Processing Engine

Engine Engine

Engine Engine

REFINE

STRUCTURE MODEL

CONSUME

COLLECT

25CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Manage Big Data with SAP Vora

SAP VORA SAP HANA

In-memory store

Data lake or cloud provider

Cloud Storage

Spark extensions

Currency conversion

Unit supportHierarchies

Table metadata

Persist

Kubernetes Cluster

Intelligent cluster management

Partition assignment Transaction mgmt Node controlControl

Metadata

In-memory processing engines

Relational Time series

GraphDocuments

Compute

Disk-based

Engine

Abstract

Disk-based

EngineDisk-based

Engine

Disk-based

Engine

Disk-based

EngineStore

Distributed LogSAP HANA

wire (SDA)

Spark

controler

Model

26CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SQL Extensions for (non)-relational data

One holistic experience => Integrate into one single language (SQL)

Graph Engine

create graph mygraph;

select a.name, m.title from actor a, movie m using graph mygraph where m in a.plays_in;

Document Store

create collection mydocs;

insert into mydocs { name: “james”, age: 42 };

select { age: avg(age), name: name } from mydocs group by name;

Timeseries

create table myseries (ts timestamp, i int ) series ( period for timeseries ts start … end ... equidistant increment by 1 second splinerror 1.2 percent);

select median(j) from t1 where period between timestamp ... and timestamp ...;

Relational Main Memory/Disk Store

create table mytab( i int ) store on disk;create table mytab( i int ) store in memory;

select i, count(*) from mytab group by i;

28CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Example: Join Query (1)

• PARTSUPP partitioned: 4 hosts by partkey

• PART partitioned: 6 hosts with ranges by partkey

LOGICAL_TABLE_ACCESSext_type_= LOGICAL

order_source_= 0rel_id_= 0

TABLEschema_id_= 0

table_id_= 0schema_name_= TPCH

table_name_= PARTSUPPsource_type_= TABLE

mat_type_= TABLEexact_size_= 8000

estimated_size_= (8000, 8000, 8000)max_size_= 8000

partition_type_= RANGE

table_

COLUMNschema_id_= 0

table_id_= 0col_id_= 1

name_= PS_PARTKEYis_asc_sorted_= trueexact_null_count_= 0

estimated_null_count_= (0, 0, 0)

columns_[0]

FIX_TYPEsql_type_= INTEGERvalue_type_= INT16

data_type_CONST_INT

value_= 1

min_

CONST_INTvalue_= 2000

max_

data_type_ data_type_

LOGICAL_TABLE_ACCESSext_type_= LOGICAL

order_source_= 1rel_id_= 1

TABLEschema_id_= 0

table_id_= 1schema_name_= TPCH

table_name_= PARTsource_type_= TABLE

mat_type_= TABLEexact_size_= 2000

estimated_size_= (2000, 2000, 2000)max_size_= 2000

partition_type_= RANGE

table_

COLUMNschema_id_= 0

table_id_= 1col_id_= 0

name_= P_PARTKEYis_unique_= true

is_asc_sorted_= trueexact_null_count_= 0

estimated_null_count_= (0, 0, 0)

columns_[0]

FIX_TYPEsql_type_= INTEGERvalue_type_= INT16

data_type_CONST_INT

value_= 1

min_

CONST_INTvalue_= 2000

max_

data_type_ data_type_

LOGICAL_JOINext_type_= LOGICALjoin_type_= INNER

l_child_ r_child_

COMPcomp_type_= EQ

join_pred_

LOGICAL_FIELDrel_id_= 0col_id_= 1

FIX_TYPEsql_type_= INTEGERvalue_type_= INT16

data_type_

LOGICAL_FIELDrel_id_= 1col_id_= 0

FIX_TYPEsql_type_= INTEGERvalue_type_= INT16

data_type_

l_child_ r_child_

FIX_TYPEsql_type_= BOOLEAN

value_type_= INT8

data_type_

LOGICAL_PROJECText_type_= LOGICAL

rel_id_= 2mat_type_= RESULT

child_

LOGICAL_FIELDrel_id_= 1col_id_= 0

exprs_[0]

data_type_

LOGICAL_CURSORext_type_= LOGICAL

child_

NAMED_EXPRname_= P_PARTKEY

exprs_[0]

LOGICAL_FIELDrel_id_= 2col_id_= 0

data_type_

data_type_

child_

SELECT p_partkey FROM partsupp JOIN part ON ps_partkey = p_partkey

30CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Challenges (1): Scalable, Fault Tolerant Algorithms

• Abstraction exchange operator (1:n, n:1, n:m) good enough? (compare: MPI)

• Amdahls law: A single non-scalable part prevents overall scalability (order of magnitudes)

=> Make everything scalable

• Examples

• Order by

• Join

• Aggregation

• Window Functions

• Common Table Expressions, Recursive

• Table Functions

• Non-Relational Operators (for Graph, ... )

• Custom Functions

31CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Data Partitioning - Schemes

• Different partitioning types:

hash, range, block, and interval

• Multi-dimensions partitioning

• Consistent hashing for partition to node

assignment

• Split partitions for node utilization

• Load pre-partitioned files

32CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Challenges (2): Data Partitioning

• Incremental partition creation (for range and interval)

• Statistical facts of new data

• Multi-dimensions partitioning

• Outliers processing

• Graph partitioning

• Changing partition criteria while staying high available

(no table lock)

33CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SDA

Kubernetes Cluster

DevicesDevices

SAP VoraSAP Data Hub

Pipeline Engine

Applications

Data Ingestion Pipeline

Op1 Op 2 Op 3 Op 4

Toward a cloud-ready and efficient big data processing Architecture

SparkCustom

CodeRules

Cluster Orchestration

Edge / Device Management

External / Cloud Storage

DevicesDevices

DevicesDevices

IoT Service

Message Broker

SAP HANA

In-memory store

Predictive

Maintenance

Real-time Demand/

Supply Forecast

SPARKInterface

Collection

Fraud

DetectionBrand

Sentiment

Product

RecommendationPersonalized

Care

Processing Engine

Engine Engine

Engine Engine

REFINE

STRUCTURE MODEL

CONSUME

COLLECT

34CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP Data HubIntegrate, orchestrate and manage diverse big data infrastructures

Existing Systems

Monitoring

Orchestration

Data Management &

Preparation

Hadoop

Cloud Storage

Machine Learning

SAP

Vora

Distr ibuted Data Systems

SAP

HANAData Driven App

Data Driven App

Data Driven App

SAP Data Hub

SAP Cloud Platform

Data Flow

Pipeline Engine

35CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP Datahub Pipeline Engine

• Data driven execution

• Operators with In- and Output ports

• Data transfer via NATS messaging

• Easy and flexible to define

(Python, Scala, Go, C++, …)

• Run and scale in docker container

• Comes with existing operators and other frameworks

(Tensorflow, Spark, R-lang, SAP Vora, SAP HANA,…)

Flow Engine

Flow Editor

Flow Repository

Op

Lib Lib Lib Lib

Op Op Op

Flow Image Composer

Docker Image

Docker Image

Run

Flow Execution

36CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Challenge (3): SAP Pipeline Engine

• Operators stateless

• + easier scalable and deployment

• - makes certain implementations more complicated

• Efficient and flexible type specification between operators

• support runtime + compile time

• support inheritance

• multiple languages

• detect errors early

• Flexible edges without interfering with scalability

Summary

38CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

SAP Data Hub

SAP HANA

▪ Digital foundation for next-generation applications using latest technology innovations

▪ Keeping typically enterprise data and other hot data

▪ Data access layer to additional data like big data handled by e.g. SAP Vora

SAP Vora

▪ Big Data management infrastructure for the enterprise

▪ Operating on clusters – providing different engines and storage options

SAP HANA & SAP Vora

▪ Integrated data management across enterprise Data and Big Data/IoT data

SAP Data Hub

▪ Combination of core SAP product and platform engagements

▪ The foundation: SAP HANA Platform + SAP VORA

▪ Simplifying Enterprise and Big Data processes across distributed data landscapes

Key Take-aways

Spark / Hadoop / HDFS

SAP Vora…

Optional: Raw storage: Swift / S3

HANA

InMemory

Dynamic Tiering

39CUSTOMER© 2017 SAP SE or an SAP affiliate company. All rights reserved. ǀ

Selected list of research topics

▪ Accelerators

– General: Use specific hardware technologies to improve SQL processing

– FPGA

▫ Accelerating text search algorithms using FPGAs

▫ Leveraging NVMe and FPGAs as a cold store in SAP HANA

– GPU

▫ Re-visiting GPU-based database kernel acceleration with latest advancements in GPU architectures

– SGX: Evaluation of Flexibility and Scalability of Intel® SGX

▪ Aging / Data tiering

– Efficient Query Processing on Hot and Cold Data in In-Memory Databases

▪ Machine Learning

▪ Scalable Distributed Algorithms

▪ Efficient Partitioning

▪ Dataflow Processing Framework

Research Areas

Thank you.

Contact Information:

Stefan Baeuerle: [email protected]

Jonathan Dees [email protected]