e2e Hana Hadoop Technical Demo

  • 7/27/2019 e2e Hana Hadoop Technical Demo

    DOCUMENT CLASSIFICATION: INTERNAL

    General Information: HANA, HADOOP, Data Services 4.1

    Authors: Lakshmi Narasimhan (I040723)

    Date Last Updated: 8/24/2012

    DEMO SCRIPT

    E2E HANA HADOOP TECHNICAL DEMO

    COPYRIGHT 2012 SAP AG. ALL RIGHTS RESERVED.

    No part of this publication may be reproduced or transmitted in any form or for any purpose without the express

    permission of SAP AG. The information contained herein may be changed without prior notice. Some software

    products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.

    Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.

    IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10,

    System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400, S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5,

    POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect,

    RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix

    are trademarks or registered trademarks of IBM Corporation. Linux is the registered trademark of Linus Torvalds in the

    U.S. and other countries.

    Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe

    Systems Incorporated in the United States and/or other countries.

    Oracle is a registered trademark of Oracle Corporation.

    UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.

    Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered

    trademarks of Citrix Systems, Inc.

    HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium,

    Massachusetts Institute of Technology.

    Java is a registered trademark of Sun Microsystems, Inc.

    JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and

    implemented by Netscape.

    SAP, R/3, xApps, xApp, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP

    products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of

    SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned

    are the trademarks of their respective companies. Data contained in this document serves informational purposes only.

    National product specifications may vary.

    These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated

    companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP

    Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group

    products and services are those that are set forth in the express warranty statements accompanying such products and

    services, if any. Nothing herein should be construed as constituting an additional warranty.

    SAP AG 2011 / INTERNAL / SCENARIO ID:

    1. Demo Story:

    This is a technical demo that shows the end-to-end (E2E) Hadoop-HANA integration process using Data Services.

    It is a shorter version of the Real Time Big Data Retail POS HANA HADOOP Integration

    Scenario, but focused on the end-to-end technical process of the Hadoop Map/Reduce job and its integration

    with HANA using Data Services 4.1. Instead of 90 TB of weblogs, this technical demo works with 130 MB of weblogs.

    The demo has two parts:

    i. Hadoop Map/Reduce Job converting the weblogs to structured data in HDFS

    ii. DS 4.1 which loads the data from HDFS to HANA
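Part i can be pictured as a map step that extracts fields from each raw log line, followed by a reduce step that groups them into structured records. The Python sketch below only illustrates that idea; it is not the demo's actual Map/Reduce job, which this script does not show. The Apache Combined Log Format, the `/item/<id>` URL convention, the use of the client host as a session key, and the function names are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# ASSUMPTION: the weblogs use the Apache Combined Log Format. The real
# layout of access.log is not documented in this demo script.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+'
)

# ASSUMPTION: item pages look like /item/<id> (hypothetical convention).
ITEM_PATTERN = re.compile(r'/item/(\w+)')


def map_line(line):
    """Map step: emit (session_key, item_id) pairs for one raw log line."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []  # skip lines that do not fit the assumed layout
    item = ITEM_PATTERN.match(match.group('path'))
    if item is None:
        return []  # not an item page
    # The client host stands in for a session key in this sketch.
    return [(match.group('host'), item.group(1))]


def reduce_pairs(pairs):
    """Reduce step: group item ids per session into tab-separated records."""
    grouped = defaultdict(list)
    for key, item_id in pairs:
        grouped[key].append(item_id)
    return [f"{key}\t{','.join(items)}" for key, items in sorted(grouped.items())]


def run_job(lines):
    """Run the map step over all lines, then reduce the emitted pairs."""
    pairs = [pair for line in lines for pair in map_line(line)]
    return reduce_pairs(pairs)
```

In a real Hadoop job the map and reduce steps run distributed across the cluster; here they run in one process so the data flow is easy to follow.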

    http://iwdfvm3337.wdf.sap.corp:8010/xads.asp?ydemodatabase/extern.htm?scenarioid=010000008316

    2. Client Tools required for demo:

    1. FileZilla - http://filezilla-project.org/download.php?type=client

    Download the zip file and extract it to your local desktop. Required to show the weblogs and

    the output of the Hadoop Map/Reduce job.

    2. PuTTY - http://www.putty.org/. Required to run the Hadoop Map/Reduce job.

    3. HANA Studio; Revision 26 is recommended for this demo. Required to show the data in

    HANA.

    3. System details and User Access:

    i. HANA (requires HANA Studio):

    Host: usphlhana06b.phl.sap.corp

    Instance Number: 13

    UID/PASSWORD: HADOOP/Welcome123

    Alternatively, import the landscape XML file and update the password to the one given above.

    hadoop_hana_landscape.xml

    ii. Hadoop (requires PuTTY):

    Host: usphlvm1939.phl.sap.corp

    UID:

    PWD:

    iii. HDFS (requires FileZilla):

    Host Name: usphlvm1939.phl.sap.corp

    Username:

    Password:

    Port: 22222

    4. Demo Preparation:

    You may need to delete the data (and only the data) in the HANA table using HANA Studio before you execute the DS

    job.

    Tip: Before the demo, set SYSTEM as the filter for Catalog and ITEM_SESSIONS2 as the filter for tables.

    Right-click the ITEM_SESSIONS2 table in the SYSTEM schema > Delete. In the pop-up, select Delete All Rows.
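The same cleanup can also be scripted instead of clicked through in HANA Studio. The following is a minimal sketch, assuming a Python DB-API 2.0 connection to the HANA system (for example one opened with SAP's hdbcli driver); the function names are hypothetical and not part of the demo. It removes all rows from the table but keeps the table itself, matching Delete All Rows.

```python
def quote_identifier(name):
    """Quote a SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_delete_all(schema, table):
    """Build a DELETE statement that removes every row but keeps the table."""
    return f"DELETE FROM {quote_identifier(schema)}.{quote_identifier(table)}"


def delete_all_rows(connection, schema="SYSTEM", table="ITEM_SESSIONS2"):
    """Run the DELETE through a DB-API 2.0 connection and commit it.

    ASSUMPTION: `connection` is an already opened DB-API connection whose
    user is allowed to delete from the table.
    """
    cursor = connection.cursor()
    try:
        cursor.execute(build_delete_all(schema, table))
    finally:
        cursor.close()
    connection.commit()
```

Calling `build_delete_all("SYSTEM", "ITEM_SESSIONS2")` returns the statement text, which can also be pasted into a SQL console.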

    5. Run the demo:

    PART 1: Hadoop Map/Reduce Process

    1. Simulated weblogs (in HDFS), one week's worth of data, 130 MB:

    Log in with an FTP client such as FileZilla to show the location of the weblogs:

    /user/i040723/sessionLogDemo/access.log

    If required, copy the file to your local desktop to show the unstructured data in the weblogs.

    2. MapReduce job. The script that runs it is on the Linux file system:

    /usr/local/mrJob/run.sh

    Log in to the Hadoop server using PuTTY to execute this job.

    This Hadoop job converts the unstructured data (access.log) and creates an output file in HDFS.

    Wait until the job has finished.

    3. Output in HDFS (this file is on HDFS, not on the Linux file system), 475340 structured records:

    /user/i040723/sessionLogDemo/sessionItems/affinity/part-r-00000
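To confirm the record count before moving on to Part 2, the output file can be streamed from HDFS and counted on the Hadoop server. This is a minimal sketch, assuming the `hadoop` command is on the PATH and that the job writes one record per line; neither is stated in this script, and the function names are hypothetical.

```python
import subprocess

HDFS_OUTPUT = "/user/i040723/sessionLogDemo/sessionItems/affinity/part-r-00000"
EXPECTED_RECORDS = 475340  # figure quoted in the demo script


def count_records(lines):
    """Count records in an iterable of lines (ASSUMPTION: one per line)."""
    return sum(1 for _ in lines)


def count_hdfs_records(path=HDFS_OUTPUT):
    """Stream an HDFS file with `hadoop fs -cat` and count its lines."""
    proc = subprocess.Popen(["hadoop", "fs", "-cat", path],
                            stdout=subprocess.PIPE)
    try:
        total = count_records(proc.stdout)
    finally:
        proc.stdout.close()
    if proc.wait() != 0:
        raise RuntimeError(f"hadoop fs -cat failed for {path}")
    return total


if __name__ == "__main__":
    found = count_hdfs_records()
    print(f"{found} records found, {EXPECTED_RECORDS} expected")
```

Streaming the file avoids copying the output to the Linux file system just to count it.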

    PART 2: Run Data Services Job

    To load the structured records from HDFS into HANA, run the Data Services job:

    1. Launch the DS Management Console:

    http://ideshana04:8080/DataServices/launch/logon.do?LOGOUT=true

    2. Enter User Name/Password hadoop/welcome and click Log On.

    3. Click Administrator.

    4. On the Status tab, click the HANA repository.

    5. Select Batch Job Configuration and click Execute.

    6. On the Execute Batch Job page, click Execute.

    7. Open the job logs to check whether the job completed successfully.

    8. Wait for 2 minutes for the job to complete successfully

    9. Show the results in HANA Studio

    10. Right-click the ITEM_SESSIONS2 table and choose Open Data Preview.

    11. This data can be easily consumed by BusinessObjects (BO) client tools for analysis.
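As an optional cross-check after the data preview, the number of rows loaded into HANA can be compared with the 475340 records reported for the HDFS output. This is a minimal sketch, assuming a Python DB-API 2.0 connection to HANA and assuming the Data Services job loads every record without filtering, which this script does not state. The function names are hypothetical.

```python
EXPECTED_ROWS = 475340  # record count the script reports for the HDFS output


def count_rows(connection, schema="SYSTEM", table="ITEM_SESSIONS2"):
    """Return the row count of a table via a DB-API 2.0 connection."""
    cursor = connection.cursor()
    try:
        cursor.execute(f'SELECT COUNT(*) FROM "{schema}"."{table}"')
        return cursor.fetchone()[0]
    finally:
        cursor.close()


def load_is_complete(connection, expected=EXPECTED_ROWS):
    """True when the table holds exactly the expected number of rows.

    ASSUMPTION: the DS job loads each HDFS record as one HANA row.
    """
    return count_rows(connection) == expected
```

A mismatch usually means the table was not emptied during demo preparation or the job has not finished yet.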