© 2009 Wipro Ltd - Confidential ETL TESTING Handling Heterogeneous Data Formats Rajasimman Selvaraj Simanchal Sahu Tithi Mukherjee


© 2009 Wipro Ltd - Confidential

ETL TESTING Handling Heterogeneous Data Formats

Rajasimman Selvaraj, Simanchal Sahu, Tithi Mukherjee


Agenda

1 ETL – Basic Concept

2 SOURCE & TARGET SYSTEMS

3 Interpretation of Mapping Document

4 Creation of DSN

5 GENERAL CASES OF DATA COMPARISON


ETL – Basic Concept


ETL is the automated and auditable process of acquiring data from heterogeneous source systems. It involves one or more of the sub-processes listed below:

• Data extraction
• Data transportation
• Data transformation
• Data consolidation
• Data integration
• Data cleaning
• Data loading


• A source system can be any application or data store that creates or stores data and acts as a data source for other systems. This topic is covered in detail later.

• Automation is critical; without it, the very purpose of ETL is defeated. ETL is of little use if its processes have to be scheduled, executed or monitored manually.

• Extraction is the first major step in the physical implementation of ETL, and it triggers the downstream processes. Once data is extracted it has to be transported to the target, because the physical location of the source system may differ from that of the target warehouse.

• Data cleansing is essential because data pulled from various source systems can contain unwanted data, unprintable characters, extra blank spaces, etc. Left in place, these can produce incorrect results when the data is loaded into the data warehouse.

• Transformation is the series of tasks that prepares the data for loading into the warehouse. Once the data is secured, its format and structure become the concern, because it will usually not be in the form the target needs. For example, the grain or the data types might be different. The data cannot be used as it is; rules and functions have to be applied to transform it.

• One of the purposes of ETL is to consolidate data in a central repository, that is, to bring it to one logical or physical place. Data can be consolidated from similar systems, different subject areas, etc.

• ETL must support integration of data coming from multiple sources and at different times. This has to be a seamless operation, so that it avoids overwriting existing data, creating duplicate data or, even worse, failing to load the data into the target at all.

• Loading is critical to integration and consolidation. The loading process decides how data is added to the warehouse, or whether it is simply rejected. Inserts, updates and deletes are executed at this step. What happens to the existing data? Should the old data be deleted because of new information, or archived? Should the new data be treated as an addition to the existing data?

• Data should be loaded with great care. Does that mean the data loaded into the warehouse is incorrect? What is the confidence level in the data? Only a data auditing process can establish that confidence level. This auditing normally happens after the data is loaded.
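The extraction, cleansing, transformation and loading steps described above can be illustrated with a very small sketch. It is not the tooling used in the presentation: it assumes a hypothetical two-column CSV extract, uses Python's standard `sqlite3` module as a stand-in warehouse, and picks "replace on duplicate key" as just one of the loading policies the text asks about.

```python
import csv
import io
import sqlite3

# Hypothetical source extract: padded values, a non-printable character, a duplicate key.
RAW_EXTRACT = (
    "vndr_key,str_addr\n"
    "0000100001,  12 Main St \n"
    "0000100002,PO\x07 Box 9\n"
    "0000100001,14 Main St\n"
)


def extract(text):
    """Extraction: read the delimited source into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def cleanse(value):
    """Cleansing: drop non-printable characters and surrounding blanks."""
    return "".join(ch for ch in value if ch.isprintable()).strip()


def transform(rows):
    """Transformation: apply the cleansing rule to every field."""
    return [{name: cleanse(value) for name, value in row.items()} for row in rows]


def load(conn, rows):
    """Loading: insert new keys and overwrite existing ones (one possible policy)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS vendor (vndr_key TEXT PRIMARY KEY, str_addr TEXT)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO vendor (vndr_key, str_addr) VALUES (:vndr_key, :str_addr)",
        rows,
    )
    conn.commit()


def run_pipeline():
    """Run extract -> transform -> load and return the loaded rows for auditing."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(RAW_EXTRACT)))
    return conn.execute(
        "SELECT vndr_key, str_addr FROM vendor ORDER BY vndr_key"
    ).fetchall()
```

Reading the loaded rows back, as `run_pipeline` does, is the simplest form of the post-load audit mentioned in the last bullet.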


A generic pictorial representation of the ETL process (figure not reproduced in this transcript).


SOURCE & Target SYSTEMS


SOURCE SYSTEMS

• SAP
• RDBMS
  – Oracle
  – SQL Server
  – DB2
  – Teradata
• FLAT FILES
  – .TSV
  – .TXT
  – .CSV
• MS-ACCESS
  – .MDB
  – Temporary Storage


TARGET SYSTEMS

• RDBMS
  – Oracle
  – Teradata
  – SQL Server
• FLAT FILE


Interpretation of Mapping Document


A mapping document is an Excel sheet that acts as a reference document for the testing team. The team uses it to understand the data flow, and the test scripts are prepared on the basis of that understanding.

A mapping document generally provides the following information:

• Details of the source and target systems (location, connection, etc.)
• Details of the source and target tables involved
• Attributes of the source and target fields (field name, data type, size, etc.)
• Dependencies between source systems/tables for fetching the source data
• All transformation rules to be applied to the data before it is loaded into the target tables
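To make the link between a mapping document and the test scripts concrete, here is a minimal sketch of how mapping rows could be turned into a pair of source and target queries. The `MappingRow` structure, its field names and the sample rows are hypothetical; a real mapping sheet will have its own columns and would first have to be read from Excel.

```python
from dataclasses import dataclass


@dataclass
class MappingRow:
    source_table: str
    source_field: str
    target_table: str
    target_field: str
    rule: str = ""  # SQL expression applied to the source field; empty means a straight move


# Hypothetical rows, as they might be read from the mapping spreadsheet.
MAPPING = [
    MappingRow("VNDR", "VNDR_KEY", "MSS_VENDOR_MAST_STG", "VNDR_NUM"),
    MappingRow("VNDR", "STR_ADDR", "MSS_VENDOR_MAST_STG", "STR_ADDR", "LTRIM(RTRIM(STR_ADDR))"),
]


def build_queries(rows):
    """Return (source_query, target_query) for rows sharing one source and one target table."""
    src_cols = ", ".join(f"{r.rule or r.source_field} AS {r.target_field}" for r in rows)
    tgt_cols = ", ".join(r.target_field for r in rows)
    return (
        f"SELECT {src_cols} FROM {rows[0].source_table}",
        f"SELECT {tgt_cols} FROM {rows[0].target_table}",
    )
```

Aliasing each source expression to its target field name keeps the two result sets column-compatible, which is what the comparison methods later in the deck rely on.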


A sample mapping sheet (image not reproduced in this transcript).


Creating a Data Source Name (DSN)


Creation of DSN

Step 1: Go to START > RUN. Type odbcad32 and click OK.

Step 2: The ODBC Data Source Administrator opens. Select the System DSN tab and click the ADD button. Another window, “Create a New Data Source”, opens.

Step 3: Select “SQL Server” or “Microsoft ODBC for Oracle” from the list. Click OK. A small window opens.

Step 4: Enter any name in the “Data Source Name” text field. Enter your user name for that database. Enter the server name exactly as it is given in the tnsnames.ora file. Click OK.
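Once a system DSN exists, a test script can connect through it by name instead of repeating server details. The sketch below is only an illustration: it assumes the third-party `pyodbc` package is installed, and the DSN name, credentials and table name passed in are placeholders supplied by the caller.

```python
def dsn_connection_string(dsn, user, password):
    """Build an ODBC connection string that refers to a system DSN by name."""
    return f"DSN={dsn};UID={user};PWD={password}"


def count_rows(dsn, user, password, table):
    """Connect through the DSN and return the row count of a table.

    The table name is interpolated directly, so pass trusted names only.
    """
    import pyodbc  # third-party package, assumed to be installed

    conn = pyodbc.connect(dsn_connection_string(dsn, user, password))
    try:
        return conn.cursor().execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    finally:
        conn.close()
```

A record count on source and target is usually the first sanity check before any row-level comparison.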


General Cases of Data Comparison


General Cases of Data Migration

Case-1:

• Source: Oracle
• Target: Oracle
• Other Tools: Edit Plus, Beyond Compare

Executing the source query in PL/SQL Developer (screenshot not reproduced in this transcript).

Executing the target query in PL/SQL Developer (screenshot not reproduced in this transcript).

Methods of comparing the data

• Excel macro or third-party tool verification

SRC:

SELECT * FROM (
    SELECT VNDR_KEY,
           NVL2(STR_ADDR,
                LTRIM(RTRIM(STR_ADDR)),
                NVL2(PO_BOX, 'PO Box' || ' ' || LTRIM(RTRIM(PO_BOX)), LTRIM(RTRIM(STR_ADDR)))) AS STR_ADDR
    FROM DW_R0001_T.VNDR V
    WHERE VNDR_KEY >= '0000100000'
      AND VNDR_KEY < '0000400000'
      AND DEL_F IS NULL
    ORDER BY VNDR_KEY
) SRC_VALUE

TGT:

SELECT * FROM (
    SELECT VNDR_NUM, STR_ADDR
    FROM AMB_CARE_T.MSS_VENDOR_MAST_STG
    ORDER BY VNDR_NUM
) TGT_VALUE


• Using MINUS

SELECT * FROM (
    SELECT VNDR_KEY,
           NVL2(STR_ADDR,
                LTRIM(RTRIM(STR_ADDR)),
                NVL2(PO_BOX, 'PO Box' || ' ' || LTRIM(RTRIM(PO_BOX)), LTRIM(RTRIM(STR_ADDR)))) AS STR_ADDR
    FROM DW_R0001_T.VNDR V
    WHERE VNDR_KEY >= '0000100000'
      AND VNDR_KEY < '0000400000'
      AND DEL_F IS NULL
    ORDER BY VNDR_KEY
) SRC_VALUE
MINUS
SELECT * FROM (
    SELECT VNDR_NUM, STR_ADDR
    FROM AMB_CARE_T.MSS_VENDOR_MAST_STG
    ORDER BY VNDR_NUM
) TGT_VALUE
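A single MINUS only reports rows that are in the source but not in the target, so it is worth running the difference in both directions. The sketch below shows the idea on toy data. It uses Python's built-in `sqlite3` as a stand-in engine, where the operator is spelled EXCEPT rather than MINUS; the `src` and `tgt` tables and their rows are invented for the illustration.

```python
import sqlite3


def set_difference(conn, left_sql, right_sql):
    """Rows returned by left_sql that right_sql does not return."""
    return conn.execute(f"{left_sql} EXCEPT {right_sql}").fetchall()


def demo():
    """Build two small tables and compare them in both directions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE src (vndr_key TEXT, str_addr TEXT);
        CREATE TABLE tgt (vndr_num TEXT, str_addr TEXT);
        INSERT INTO src VALUES ('0000100001', '12 Main St'), ('0000100002', 'PO Box 9');
        INSERT INTO tgt VALUES ('0000100001', '12 Main St'), ('0000100003', '7 Hill Rd');
        """
    )
    src_sql = "SELECT vndr_key, str_addr FROM src"
    tgt_sql = "SELECT vndr_num, str_addr FROM tgt"
    missing_in_target = set_difference(conn, src_sql, tgt_sql)
    extra_in_target = set_difference(conn, tgt_sql, src_sql)
    return missing_in_target, extra_in_target
```

Both result sets being empty is the pass condition; either one being non-empty points to rows that need investigation.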


• Using Full Outer Join

SELECT SRC.VNDR_KEY AS SRC_VNDR_KEY,
       TGT.VNDR_NUM AS TGT_VNDR_NUM,
       SRC.STR_ADDR AS SRC_STR_ADDR,
       TGT.STR_ADDR AS TGT_STR_ADDR
FROM (
    SELECT VNDR_KEY,
           NVL2(STR_ADDR,
                LTRIM(RTRIM(STR_ADDR)),
                NVL2(PO_BOX, 'PO Box' || ' ' || LTRIM(RTRIM(PO_BOX)), LTRIM(RTRIM(STR_ADDR)))) AS STR_ADDR
    FROM DW_R0001_T.VNDR V
    WHERE VNDR_KEY >= '0000100000'
      AND VNDR_KEY < '0000400000'
      AND DEL_F IS NULL
    ORDER BY VNDR_KEY
) SRC
FULL OUTER JOIN (
    SELECT VNDR_NUM, STR_ADDR
    FROM AMB_CARE_T.MSS_VENDOR_MAST_STG
    ORDER BY VNDR_NUM
) TGT
ON  (LTRIM(RTRIM(SRC.VNDR_KEY)) = TGT.VNDR_NUM
     OR (LTRIM(RTRIM(SRC.VNDR_KEY)) IS NULL AND TGT.VNDR_NUM IS NULL))
AND (LTRIM(RTRIM(SRC.STR_ADDR)) = TGT.STR_ADDR
     OR (LTRIM(RTRIM(SRC.STR_ADDR)) IS NULL AND TGT.STR_ADDR IS NULL))
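The same full-outer-join idea can be sketched outside the database, for instance on two exported result sets. This is an illustration rather than the method used in the slides: it assumes each row is a (key, value) pair with unique keys, pairs the rows by key, and treats NULL on both sides as a match, as the ON clause above does.

```python
def full_outer_compare(source_rows, target_rows):
    """Pair rows by key; report keys present on one side only or with differing values."""
    source = dict(source_rows)
    target = dict(target_rows)
    report = []
    for key in sorted(source.keys() | target.keys()):
        one_sided = key not in source or key not in target
        # None == None is True here, so NULL on both sides counts as a match.
        if one_sided or source[key] != target[key]:
            report.append((key, source.get(key), target.get(key)))
    return report


# Hypothetical data: one changed address, one row present only in the target.
SOURCE = [("0000100001", "12 Main St"), ("0000100002", "PO Box 9"), ("0000100004", None)]
TARGET = [
    ("0000100001", "12 Main St"),
    ("0000100002", "PO Box 90"),
    ("0000100003", "7 Hill Rd"),
    ("0000100004", None),
]
```

Unlike a set difference, the report shows the source and target values side by side, which makes it easier to see what changed.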


Case-2:

• Source: RDBMS/Flat file
• Target: SQL Server
• Other Tools: Edit Plus, Beyond Compare, VIM Editor


Importing an Oracle table into Access

Step 1: Click the “NEW” menu. A window named “NEW TABLE” opens. Select the “Import Table” option and click OK.

Step 2: A window named “Import” opens. Select “ODBC Data Sources” from the “Files of Type” drop-down list and click OK.

Step 3: The “Select Data Source” window opens. Select the “Machine Data Source” tab, select the name of the data source from which you want to import a table, and click OK.

Step 4: A login window opens. Enter your login credentials for that database and click OK.

Step 5: A window named “Import Objects” opens. Select the table from the list and click OK. The table starts importing into MS Access.


Importing a flat file into Access

After you click the Import Table menu in MS Access, the import screen appears. Select the appropriate text file and click “IMPORT”.

Select the appropriate radio button depending on whether the text file is delimited or fixed width.

Select the radio button that matches the delimiter used in the flat file. For flat files whose first record holds the column names, select the “First Row Contains Field Names” check box.

Select the option to store the data in a new table.

Click each column and enter the field name and data type in the designated text fields. Click Finish when you are done, and the table is imported.
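The choices made in the import wizard (delimiter, whether the first row holds field names, new table) map directly onto a few parameters. The following sketch mirrors them with Python's `csv` and `sqlite3` modules instead of Access; the function name, the fallback column names and the all-TEXT column typing are assumptions made for the illustration.

```python
import csv
import io
import sqlite3


def import_flat_file(conn, table, text, delimiter, first_row_has_names=True):
    """Load delimited text into a new table; every column is stored as TEXT."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if first_row_has_names:
        names, data = rows[0], rows[1:]
    else:
        # Placeholder column names when the file has no header record.
        names = [f"Field{i + 1}" for i in range(len(rows[0]))]
        data = rows
    columns = ", ".join(f'"{name}" TEXT' for name in names)
    placeholders = ", ".join("?" for _ in names)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)
    return len(data)
```

Loading everything as text first and converting types afterwards avoids silent changes such as leading zeros being dropped from keys like '0000100001'.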


Exporting a table from Access to SQL Server

Step 1: Right-click the table name in MS Access and select Export from the pop-up menu. A browse window opens.

Step 2: Select “ODBC Databases” from the “Save as type” combo box. A small window with a table name text field opens. Keep the table name as it is, or rename it by entering the desired name in the text field. Click OK.

Step 3: A “Select Data Source” window opens. Go to the “Machine Data Source” tab, select the SQL Server DSN from the list and click OK. The table starts transferring from MS Access to SQL Server.

Step 4: Log in to SQL Server and refresh the database to which you exported the table. The table now appears in SQL Server.


Methods of comparing the data

• Using EXCEPT

SELECT * FROM (
    SELECT VNDR_KEY,
           CASE WHEN STR_ADDR IS NOT NULL THEN LTRIM(RTRIM(STR_ADDR))
                WHEN PO_BOX IS NOT NULL THEN 'PO Box' + ' ' + LTRIM(RTRIM(PO_BOX))
                ELSE LTRIM(RTRIM(STR_ADDR))
           END AS STR_ADDR
    FROM DW_R0001_T.VNDR V
    WHERE VNDR_KEY >= '0000100000'
      AND VNDR_KEY < '0000400000'
      AND DEL_F IS NULL
) SRC_VALUE
EXCEPT
SELECT * FROM (
    SELECT VNDR_NUM, STR_ADDR
    FROM AMB_CARE_T.MSS_VENDOR_MAST_STG
) TGT_VALUE

Note: EXCEPT works in SQL Server 2005 but does not work in SQL Server 2000. SQL Server has no NVL2 function or || concatenation operator and does not allow ORDER BY in a derived table without TOP, so the source expression is written with CASE and +.


• Using Full Outer Join

SELECT SRC.VNDR_KEY AS SRC_VNDR_KEY,
       TGT.VNDR_NUM AS TGT_VNDR_NUM,
       SRC.STR_ADDR AS SRC_STR_ADDR,
       TGT.STR_ADDR AS TGT_STR_ADDR
FROM (
    SELECT VNDR_KEY,
           CASE WHEN STR_ADDR IS NOT NULL THEN LTRIM(RTRIM(STR_ADDR))
                WHEN PO_BOX IS NOT NULL THEN 'PO Box' + ' ' + LTRIM(RTRIM(PO_BOX))
                ELSE LTRIM(RTRIM(STR_ADDR))
           END AS STR_ADDR
    FROM DW_R0001_T.VNDR V
    WHERE VNDR_KEY >= '0000100000'
      AND VNDR_KEY < '0000400000'
      AND DEL_F IS NULL
) SRC
FULL OUTER JOIN (
    SELECT VNDR_NUM, STR_ADDR
    FROM AMB_CARE_T.MSS_VENDOR_MAST_STG
) TGT
ON  (LTRIM(RTRIM(SRC.VNDR_KEY)) = TGT.VNDR_NUM
     OR (LTRIM(RTRIM(SRC.VNDR_KEY)) IS NULL AND TGT.VNDR_NUM IS NULL))
AND (LTRIM(RTRIM(SRC.STR_ADDR)) = TGT.STR_ADDR
     OR (LTRIM(RTRIM(SRC.STR_ADDR)) IS NULL AND TGT.STR_ADDR IS NULL))


Case-3:

• Source: SAP
• Target: Oracle
• Other Tools: Edit Plus, Beyond Compare, VIM Editor


The following SAP screens are walked through in order (screenshots not reproduced in this transcript):

• Login screen
• Querying for the table you need to fetch data from
• Finding the total number of records in the table
• Choosing the fields to be displayed (by default, all the fields are checked)
• Unchecking the fields which are not required
• Number of rows and columns to be fetched
• Displaying the output
• Selecting the records
• Downloading or exporting data from SAP
• Selecting the format of the downloaded data
• Selecting the location for saving the downloaded data
• Downloaded data displayed in spreadsheet format

Note: If the result set is more than 65535 records, the data can be saved in Unconverted format, and it is saved as a “.txt” file.


Methods of comparing the data

There are two methods of comparing the SAP data (source) with the Oracle data (target):

1. If the record count is less than 65535, download both the SAP data and the Oracle data in .xls format. The data in the Excel sheets can then be compared with an Excel macro or a third-party tool such as Beyond Compare.

2. Otherwise, download the SAP data in unconverted format (.txt) and import or link it into MS Access. Import or link the target (Oracle) table into MS Access as well, and compare the source and target data there.
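The choice between the two methods, and the parsing of the unconverted text export, can be sketched as follows. The layout assumed here (pipe-delimited cells with dashed ruler lines, each data line ending in a pipe) is an assumption about what the unconverted download looks like and should be checked against a real export; the sample field names and values are illustrative only.

```python
XLS_ROW_LIMIT = 65535  # record limit quoted in the text above


def choose_method(record_count):
    """Pick the comparison route based on the record count."""
    return "spreadsheet" if record_count < XLS_ROW_LIMIT else "unconverted_text"


def parse_unconverted(text):
    """Split an assumed pipe-delimited export into (header, rows)."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip dashed rulers and blank lines
        # Drop the empty strings produced by the leading and trailing pipes.
        rows.append([cell.strip() for cell in line.split("|")[1:-1]])
    return rows[0], rows[1:]


# Illustrative sample of the assumed layout.
SAMPLE_EXPORT = """\
-------------------------
|LIFNR     |STRAS       |
-------------------------
|0000100001|12 Main St  |
|0000100002|            |
-------------------------
"""
```

Stripping each cell removes the padding that the fixed-width layout adds, which would otherwise show up as false mismatches against the trimmed target values.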


Q&A

QUESTIONS ??
