e2e Hana Hadoop Technical Demo
7/27/2019 e2e Hana Hadoop Technical Demo
DOCUMENT CLASSIFICATION: INTERNAL
General Information: HANA
HADOOP
Data Services 4.1
Authors: Lakshmi Narasimhan (I040723)
Date Last Updated: 8/24/2012
DEMO SCRIPT
E2E HANA HADOOP
TECHNICAL DEMO
COPYRIGHT 2012 SAP AG. ALL RIGHTS RESERVED.
No part of this publication may be reproduced or transmitted in any form or for any purpose without the express
permission of SAP AG. The information contained herein may be changed without prior notice. Some software
products marketed by SAP AG and its distributors contain proprietary software components of other software vendors.
Microsoft, Windows, Excel, Outlook, and PowerPoint are registered trademarks of Microsoft Corporation.
IBM, DB2, DB2 Universal Database, System i, System i5, System p, System p5, System x, System z, System z10,
System z9, z10, z9, iSeries, pSeries, xSeries, zSeries, eServer, z/VM, z/OS, i5/OS, S/390, OS/390, OS/400, AS/400,
S/390 Parallel Enterprise Server, PowerVM, Power Architecture, POWER6+, POWER6, POWER5+, POWER5,
POWER, OpenPower, PowerPC, BatchPipes, BladeCenter, System Storage, GPFS, HACMP, RETAIN, DB2 Connect,
RACF, Redbooks, OS/2, Parallel Sysplex, MVS/ESA, AIX, Intelligent Miner, WebSphere, Netfinity, Tivoli and Informix
are trademarks or registered trademarks of IBM Corporation. Linux is the registered trademark of Linus Torvalds in the
U.S. and other countries.
Adobe, the Adobe logo, Acrobat, PostScript, and Reader are either trademarks or registered trademarks of Adobe
Systems Incorporated in the United States and/or other countries.
Oracle is a registered trademark of Oracle Corporation.
UNIX, X/Open, OSF/1, and Motif are registered trademarks of the Open Group.
Citrix, ICA, Program Neighborhood, MetaFrame, WinFrame, VideoFrame, and MultiWin are trademarks or registered
trademarks of Citrix Systems, Inc.
HTML, XML, XHTML and W3C are trademarks or registered trademarks of W3C, World Wide Web Consortium,
Massachusetts Institute of Technology.
Java is a registered trademark of Sun Microsystems, Inc.
JavaScript is a registered trademark of Sun Microsystems, Inc., used under license for technology invented and
implemented by Netscape.
SAP, R/3, xApps, xApp, SAP NetWeaver, Duet, PartnerEdge, ByDesign, SAP Business ByDesign, and other SAP
products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of
SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned
are the trademarks of their respective companies. Data contained in this document serves informational purposes only.
National product specifications may vary.
These materials are subject to change without notice. These materials are provided by SAP AG and its affiliated
companies ("SAP Group") for informational purposes only, without representation or warranty of any kind, and SAP
Group shall not be liable for errors or omissions with respect to the materials. The only warranties for SAP Group
products and services are those that are set forth in the express warranty statements accompanying such products and
services, if any. Nothing herein should be construed as constituting an additional warranty.
SAP AG 2011 / INTERNAL / SCENARIO ID:
1. Demo Story:
This is a technical demo that shows the E2E Hadoop HANA integration process using Data Services.
It is a shorter version of the Real Time Big Data Retail POS HANA HADOOP Integration
Scenario, focused on the end-to-end technical process of the Hadoop Map/Reduce job and its integration
with HANA using Data Services 4.1. Instead of 90TB of weblogs, this technical demo uses 130MB of weblogs.
The demo has two parts:
i. A Hadoop Map/Reduce job that converts the weblogs to structured data in HDFS
ii. A DS 4.1 job that loads the data from HDFS into HANA
http://iwdfvm3337.wdf.sap.corp:8010/xads.asp?ydemodatabase/extern.htm?scenarioid=010000008316
2. Client Tools required for demo:
1. FileZilla - http://filezilla-project.org/download.php?type=client
Download the zip file and extract it to your local desktop. Required to show the weblogs and
the output of the Hadoop Map/Reduce job.
2. PuTTY - http://www.putty.org/. Required to run the Hadoop Map/Reduce job.
3. HANA Studio, Revision 26 recommended for this demo. Required to show the data in
HANA.
3. System details and User Access:
i. HANA (requires HANA Studio):
Host: usphlhana06b.phl.sap.corp
Instance Number: 13
UID/PASSWORD: HADOOP/Welcome123
Or just import the XML and update the password mentioned above.
hadoop_hana_landscape.xml
ii. HADOOP (requires PuTTY):
Host: usphlvm1939.phl.sap.corp
UID:
PWD:
iii. HDFS (requires FileZilla):
Host Name: usphlvm1939.phl.sap.corp
Username:
Password:
Port: 22222
4. Demo Preparation:
You may need to delete the data (only the rows, not the table itself) in the HANA table using HANA Studio
before you execute the DS job.
Tip: Set SYSTEM as the filter for Catalog and ITEM_SESSIONS2 as the filter for tables before the demo.
Right-click the ITEM_SESSIONS2 table in the SYSTEM schema > Delete. In the pop-up, select Delete All Rows.
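The same cleanup can also be thought of as a SQL statement rather than a context-menu action. The following is a minimal sketch, not part of the original demo: the helper names `quote_identifier` and `build_delete_all_rows` are hypothetical, and the sketch only builds the statement text. Running it against HANA would need a SQL client, which is not shown here, and it is an assumption that Delete All Rows in HANA Studio behaves like a plain DELETE.

```python
def quote_identifier(name: str) -> str:
    """Wrap an SQL identifier in double quotes, doubling any embedded quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_delete_all_rows(schema: str, table: str) -> str:
    """Return a DELETE statement that removes every row but keeps the table."""
    return f"DELETE FROM {quote_identifier(schema)}.{quote_identifier(table)}"


# Schema and table names are the ones used in this demo script.
statement = build_delete_all_rows("SYSTEM", "ITEM_SESSIONS2")
print(statement)
```

Quoting the identifiers keeps the upper-case schema and table names exactly as they appear in the catalog.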
5. Run the demo:
PART 1: Hadoop Map/Reduce Process
1. Simulated weblogs (in HDFS), 1 week's worth of data, 130MB:
Log in with an FTP client such as FileZilla to show the weblogs location:
/user/i040723/sessionLogDemo/access.log
If required, copy the file to your local desktop to show the unstructured data in the weblogs.
2. MapReduce job. The script to run it is (this script is on the Linux file system):
/usr/local/mrJob/run.sh
Log in to the Hadoop server using PuTTY to execute this job.
This Hadoop job converts the unstructured data (access.log) and creates an output file in HDFS.
Wait until the job has finished.
3. Output in HDFS (this file is on HDFS, not on the Linux file system), 475340 structured
records:
/user/i040723/sessionLogDemo/sessionItems/affinity/part-r-00000
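To make the idea behind Part 1 concrete, here is a small Python sketch of a map step and a reduce step that turn raw weblog lines into structured, tab-separated records. It is an illustration only and not the logic of run.sh: the log layout (Apache Common Log Format), the choice of the request path as the key, the function names, and the sample lines are all assumptions, since the actual access.log format and job code are not shown in this script.

```python
import re
from collections import defaultdict

# Assumed Apache Common Log Format; the real access.log layout may differ.
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)


def map_line(line):
    """Map step: emit (path, 1) for a parseable line, nothing otherwise."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return []
    return [(match.group("path"), 1)]


def reduce_counts(pairs):
    """Reduce step: sum the values emitted for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def run_job(lines):
    """Apply the map step to every line, reduce, and format the records."""
    pairs = [pair for line in lines for pair in map_line(line)]
    totals = reduce_counts(pairs)
    return [f"{key}\t{count}" for key, count in sorted(totals.items())]


# Invented sample lines, used only to exercise the sketch.
sample = [
    '10.0.0.1 - - [24/Aug/2012:10:00:00 +0000] "GET /item/42 HTTP/1.1" 200 512',
    '10.0.0.2 - - [24/Aug/2012:10:00:05 +0000] "GET /item/42 HTTP/1.1" 200 512',
    '10.0.0.1 - - [24/Aug/2012:10:00:09 +0000] "GET /item/7 HTTP/1.1" 200 128',
    'not a log line',
]
for record in run_job(sample):
    print(record)
```

Lines that do not match the assumed format are dropped in the map step, which is one common way to keep malformed input out of the structured output.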
PART 2: Run Data Services Job
To load the structured records from HDFS into HANA, run the Data Services job:
1. Launch the DS Management Console:
http://ideshana04:8080/DataServices/launch/logon.do?LOGOUT=true
2. User Name/Password: hadoop/welcome -> Log On
3. Click Administrator
4. Click the HANA repository on the Status tab
5. Select Batch Job Configuration and click Execute
6. On the Execute Batch Job page, click Execute
7. Open the job logs to check whether the job completed successfully
8. Wait for 2 minutes for the job to complete successfully
9. Show the results in HANA Studio
10. Right-click the ITEM_SESSIONS2 table and select Open Data Preview
11. This data can be easily consumed by BO client tools for analysis.
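Besides the data preview, one way to confirm the load is to compare the table's row count with the 475340 records reported for the HDFS output. The sketch below is an assumption-laden illustration: `count_rows` and `load_is_complete` are hypothetical helpers that work with any Python DB-API cursor, an in-memory sqlite3 database stands in for HANA so that the example runs on its own, and it assumes the Data Services job loads every record unchanged.

```python
import sqlite3

EXPECTED_RECORDS = 475340  # record count reported for the HDFS output file


def count_rows(cursor, schema, table):
    """Return the number of rows in schema.table using a DB-API cursor."""
    cursor.execute(f'SELECT COUNT(*) FROM "{schema}"."{table}"')
    return cursor.fetchone()[0]


def load_is_complete(cursor, schema, table, expected):
    """True when the table holds exactly the expected number of rows."""
    return count_rows(cursor, schema, table) == expected


# Stand-in database: sqlite3 replaces HANA here purely so the sketch is runnable.
connection = sqlite3.connect(":memory:")
connection.execute("ATTACH DATABASE ':memory:' AS \"SYSTEM\"")
connection.execute('CREATE TABLE "SYSTEM"."ITEM_SESSIONS2" (item TEXT)')
connection.executemany(
    'INSERT INTO "SYSTEM"."ITEM_SESSIONS2" VALUES (?)',
    [("a",), ("b",), ("c",)],
)
cursor = connection.cursor()
print(count_rows(cursor, "SYSTEM", "ITEM_SESSIONS2"))
print(load_is_complete(cursor, "SYSTEM", "ITEM_SESSIONS2", EXPECTED_RECORDS))
```

Against the real system, the cursor would come from a HANA connection instead, and the second call would be expected to return True only after the job has finished.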