IBM Storage Reference Architecture for AI applied to ...on- Transparent HDFS OpenStack Cinder Glance


  • date post

    04-Jul-2020

  • © 2018 IBM Corporation

    Frank Kraemer, IBM Systems Architect, mailto:kraemerf@de.ibm.com, Nvidia GTC 2018, 10/2018

    IBM Storage Reference Architecture for AI applied to Autonomous Driving (AD)

  • Autonomous Driving = See + Think + Act

    The automotive industry has to solve this highly complex problem.

    https://autoware.ai/

  • Automotive Sensor Setup for AD

    Each data source: ~2 Gbit/s

    Sensor sets: ~30 Gbit/s

    Data collection volume: ~12-15 TB/h

    http://currencyobserver.com/2017/12/global-automotive-sensors-market-2017-2022/
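    The figures on this slide can be cross-checked with a simple unit conversion. The sketch below assumes decimal units (1 TB = 1000 GB) and a sustained rate; the function name is illustrative and not taken from the slides.

```python
def gbit_per_s_to_tb_per_h(rate_gbit_s: float) -> float:
    """Convert a sustained rate in Gbit/s to decimal terabytes per hour."""
    gbyte_per_s = rate_gbit_s / 8        # 8 bits per byte
    gbyte_per_h = gbyte_per_s * 3600     # 3600 seconds per hour
    return gbyte_per_h / 1000            # 1000 GB per TB (decimal units)


# ~30 Gbit/s is the sensor-set rate quoted on the slide.
sensor_set_tb_per_h = gbit_per_s_to_tb_per_h(30)
print(sensor_set_tb_per_h)  # 13.5
```

    A sustained 30 Gbit/s therefore corresponds to 13.5 TB/h, which falls inside the quoted 12-15 TB/h range.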

  • Automotive Industry generates large amounts of data

    ▪ Storage of data (sensor / video) is very costly.

    ▪ Handling of this data is difficult, e.g. due to the high bandwidth required.

    ▪ For testing purposes, sensor / video data is much more complex than discrete bus signals, electronic values, etc.

    Sensor / video data must be synchronously captured, stored, modified and executed with other testing data such as CAN, FlexRay, Radar, LiDAR, HiSonic, etc. The most common formats are: ADTF v2/3 (digitalwerk), RTMaps (Intempora), MDF4 and ROS/rosbag.

    Sources: Images from https://www.youtube.com/watch?v=4jW0fJ80VG8 https://www.youtube.com/watch?v=dhEgD6ZFlQE https://www.youtube.com/watch?t=21&v=39QMYkx89j0
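    One way to picture the requirement that video and bus data be handled synchronously is a timestamp-ordered merge of the individual recordings. The sketch below is only an illustration: the `Sample` type, the stream contents and the nanosecond timestamps are hypothetical, and it does not read ADTF, RTMaps, MDF4 or rosbag files.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Sample(NamedTuple):
    """Hypothetical record: one time-stamped message from one source."""
    timestamp_ns: int
    source: str
    payload: bytes


def merge_by_timestamp(*streams: Iterable[Sample]) -> Iterator[Sample]:
    """Merge streams that are each already sorted by timestamp."""
    return heapq.merge(*streams, key=lambda s: s.timestamp_ns)


# Made-up example data: two camera frames and three CAN messages.
camera = [Sample(0, "camera", b"frame0"), Sample(40_000_000, "camera", b"frame1")]
can = [
    Sample(10_000_000, "can", b"\x01"),
    Sample(20_000_000, "can", b"\x02"),
    Sample(50_000_000, "can", b"\x03"),
]

merged = list(merge_by_timestamp(camera, can))
for sample in merged:
    print(sample.timestamp_ns, sample.source)
```

    The merge is lazy, so it only needs one pending sample per stream in memory, which matters when each recording is far larger than RAM.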

  • Data Management for ADAS/AD development and test is challenging

    Test Drives

    50-70 TB / day / car

    R&D Labs: tagging

    R&D Labs: developing & testing & (re-)simulation & AI training

    ▪ >5 PB of data for each car project

    ▪ 300-500 PB data in total

    > 200h / 1h driving

    o Europe

    o USA

    o China

    o Japan

    o Asia

    o Africa

    Training Data as a Service (TDaaS)
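    To put the volumes on this slide in proportion, the sketch below combines the 50-70 TB per day per car with the ~12-15 TB/h collection rate quoted on the sensor slide, and with the 5 PB lower bound per car project. It assumes decimal units (1 PB = 1000 TB); the function names are illustrative.

```python
def hours_of_recording(daily_tb: float, rate_tb_per_h: float) -> float:
    """Hours of recording needed to produce daily_tb at a given rate."""
    return daily_tb / rate_tb_per_h


def car_days_to_reach(project_tb: float, daily_tb: float) -> float:
    """Number of car-days of test driving needed to collect project_tb."""
    return project_tb / daily_tb


# Least data at the fastest rate, most data at the slowest rate.
shortest_h = hours_of_recording(50, 15)
longest_h = hours_of_recording(70, 12)

# 5 PB = 5000 TB (decimal units), the lower bound per car project.
fewest_days = car_days_to_reach(5000, 70)
most_days = car_days_to_reach(5000, 50)

print(f"{shortest_h:.1f} to {longest_h:.1f} hours of recording per car per day")
print(f"{fewest_days:.0f} to {most_days:.0f} car-days to reach 5 PB")
```

    Under these assumptions, a car records for roughly three to six hours a day, and a single project crosses 5 PB after about 70 to 100 car-days.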

  • The IBM AD Solution Approach

    1. How to implement & operate an efficient storage, workflow and management system?

    „The Data Foundation“

    2. How to distribute data globally within an enterprise and partners?

    3. How to preserve digital data for decades with optimized costs?

    4. How to analyze sensor and video data with fast analytics and modern BigData tools?

    5. How to run Machine Learning (ML) and AI training with Nvidia GPU technology at scale?

    6. How to do efficient IT workload and resource scheduling?

    IBM Analytics HDFS: Hortonworks HDP, DSX, Spark,…

    IBM AREMA

    IBM High-Speed WAN File Transfer: IBM Aspera / Mass Data Migration / Cloud

    IBM Spectrum Computing

    IBM Object Storage (COS)

    IBM ‘Cold’ Archiving: IBM Spectrum Protect / Cold / Low Cost / Tape

    IBM Enterprise-Class AI: Power9 AC922, PowerAI, AI Vision

    IBM Spectrum Discover (MetaOcean)

  • The IBM storage architecture based on Spectrum Scale, COS and Tape

    • Tiering from flash, to disk, to tape, to cloud.

    • Cloud appears as external storage pool.

    • Auto tiering & migration.

    • High performance read/write operations.

    • Public cloud-ready.

    • Support of multi-cloud environments.

    [Figure: Transparent Cloud Tiering. Labels: ICP, AWS S3, Azure, IBM Cloud, Private Cloud; Replicated, Compressed, Encrypted, Integrity Validated; Backup, DR, Tiering, Archive, Data sharing.]

    IBM Spectrum Scale (HOT)

    • File based storage with Object & HDFS support

    • High-end I/O performance

    • Information Lifecycle Management (ILM)

    • Sub-microsecond access time

    IBM Cloud Object Storage (S3) (WARM)

    • Site fault tolerant

    • Geo-dispersed and worldwide scale

    • Easy to deploy

    • Millisecond access time

    IBM Spectrum Archive & Tape (COLD)

    • Lowest TCO

    • Tape ILM target, especially frozen archive

    • Long-term retention, access time in minutes

    • Access as files via LTFS

    • Reduced floor space requirements and energy consumption

    • Up to 260 PB native capacity in a single tape library
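    The HOT / WARM / COLD split above can be read as a placement rule driven by how recently data was used. The sketch below is plain Python, not the Spectrum Scale policy language, and the two age thresholds are hypothetical values chosen only for illustration; the slides do not state any.

```python
from datetime import timedelta

# Hypothetical thresholds; the slides do not specify when data moves tiers.
WARM_AFTER = timedelta(days=30)
COLD_AFTER = timedelta(days=365)


def choose_tier(time_since_last_access: timedelta) -> str:
    """Pick a storage tier from the time elapsed since the last access."""
    if time_since_last_access >= COLD_AFTER:
        return "COLD"   # IBM Spectrum Archive & Tape: access time in minutes
    if time_since_last_access >= WARM_AFTER:
        return "WARM"   # IBM Cloud Object Storage: millisecond access time
    return "HOT"        # IBM Spectrum Scale: fastest tier


for days in (1, 90, 800):
    print(days, choose_tier(timedelta(days=days)))
```

    In practice the rule would also weigh file size, project state and retention requirements; age alone is just the simplest criterion to show.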

  • Building-block "HOT" High Performance I/O File Storage

    [Figure: building block powered by IBM Spectrum Scale. The labels recoverable from the diagram are grouped below.]

    Core: GLOBAL Namespace; automated data placement and data migration

    Clients: client workstations, users, containers and applications; HPC & HTC compute farm; traditional applications; new gen applications; DGX / AC922

    Access: File (POSIX, NFS, SMB); Block (iSCSI); Object (Swift, S3); Analytics (Transparent HDFS); OpenStack (Cinder, Glance, Manila)

    Storage: Flash, NVMe, Disk, Tape, Shared Nothing Cluster (FPO), JBOD/JBOF, ESS, Spectrum Scale RAID

    Data services: Compression, Encryption, File Audit Logging, Immutability

    Multi-site and cloud: Worldwide File Data Distribution (AFM) with Site A, Site B, Site C; AFM-DR with DR Site; Transparent Cloud Tier (TCT); Transparent Cloud; Cloud Data Sharing; S3 Data Cloud

    Management: Management API, Advanced GUI, RESTful API

  • IBM Analytics & Hortonworks (HDP) / Hadoop

    https://developer.ibm.com/dwblog/2017/ibm-hortonworks-expand-partnership-help-businesses-accelerate-data-driven-decision-making/

    Automotive Customer Use Case:

    ➢ A major automotive OEM was experiencing significant difficulties and costs in storing and processing huge volumes of video, radar and lidar files on a legacy Network Attached Storage (NAS) system.

    ➢ The data is necessary for the development of autonomous vehicle machine learning algorithms.

    ➢ Today, the OEM stores multiple petabytes of video and binary data in an HDP data lake, aiming to grow to tens of petabytes.

    ➢ Dramatically reduced data management costs and improved user productivity.

    ➢ Provided foundation for Autonomous Driving research.

    ➢ IBM Reference customer for Spectrum Scale and HDP.

  • 2nd Generation IBM Elastic Storage Server (ESS) Family

    [Figure: ESS models arranged along a Capacity axis and a Speed axis. The rack drawings show ESS 5U84 Storage enclosures, EXP3524 enclosures and System x3650 M4 servers.]

    Model GL1S: 1 enclosure, 9U; 82 NL-SAS, 2 SSD

    Model GL2S: 2 enclosures, 12U; 166 NL-SAS, 2 SSD

    Model GL4S: 4 enclosures, 20U; 334 NL-SAS, 2 SSD

    Model GL6S: 6 enclosures, 28U; 502 NL-SAS, 2 SSD

    Model GS1S: 24 SSD

    Model GS2S: 48 SSD

    Model GS4S: 96 SSD

    Model GH14S: 1 × 2U24 SSD enclosure, 4 × 5U84 HDD enclosures; 334 NL-SAS, 24 SSD

    Model GH24S: 2 × 2U24 SSD enclosures, 4 × 5U84 HDD enclosures; 334 NL-SAS, 48 SSD

    Throughput labels in the figure, in the order extracted: 36 GB/s, 12 GB/s, 24 GB/s; 40 GB/s, 14 GB/s, 26 GB/s; 6 GB/s; 38 GB/s, 40 GB/s
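    The NL-SAS counts listed for the GL models follow a regular pattern. The sketch below checks them against one possible reading: each 5U84 enclosure provides 84 drive slots, and the two SSDs of each system occupy two of those slots. That reading is an assumption inferred from the numbers, not something the slide states.

```python
SLOTS_PER_5U84 = 84    # assumption: "5U84" means 84 drive slots in 5U
SSDS_PER_SYSTEM = 2    # from the slide: every GL model lists 2 SSD


def nl_sas_drives(enclosures: int) -> int:
    """NL-SAS drive count if the SSDs take two of the enclosure slots."""
    return enclosures * SLOTS_PER_5U84 - SSDS_PER_SYSTEM


# Enclosure count -> NL-SAS count, as listed on the slide for GL1S to GL6S.
slide_counts = {1: 82, 2: 166, 4: 334, 6: 502}

for enclosures, listed in slide_counts.items():
    computed = nl_sas_drives(enclosures)
    print(enclosures, listed, computed, listed == computed)
```

    All four GL models match this pattern, and the 334 NL-SAS drives listed for the GH models equal the four-enclosure figure.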

  • Presentation at ATZ Live 04/2018 in Wiesbaden, Germany

    „Artificial Intelligence is key to understanding Sensor Data“

    „Relevant data is needed to finalize the Software Development.“

    Dr. Michael Hafner, Head of Automated Driving and Active Safety at Mercedes-Benz, talks about sensors, safety, and the road map that developers are following.

    https://www.daimler.com/innovation/autonomous-driving/expert-interview.html

  • Workload and data flow for AI is complex

    Traditional Business Data

    Sensor Data

    Data from collaboration partners

    Data from mobile app and social media

    Le