PC8LID 20061204 Large for Printing

Informatica® PowerCenter® 8 Level I Developer Student Guide Version - PC8LID 20061113

Informatica® PowerCenter® 8

Level I Developer

Student Guide

Version - PC8LID 20061113

Informatica PowerCenter 8 Level I Developer Student Guide

Version 8.1

November 2006

Copyright (c) 1998–2006 Informatica Corporation. All rights reserved. Printed in the USA.

This software and documentation contain proprietary information of Informatica Corporation and are provided under a license agreement containing restrictions on use and disclosure and are also protected by copyright law. Reverse engineering of the software is prohibited. No part of this document may be reproduced or transmitted in any form, by any means (electronic, photocopying, recording or otherwise) without prior consent of Informatica Corporation.

Use, duplication, or disclosure of the Software by the U.S. Government is subject to the restrictions set forth in the applicable software license agreement and as provided in DFARS 227.7202-1(a) and 227.7702-3(a) (1995), DFARS 252.227-7013(c)(1)(ii) (OCT 1988), FAR 12.212(a) (1995), FAR 52.227-19, or FAR 52.227-14 (ALT III), as applicable. The information in this document is subject to change without notice. If you find any problems in the documentation, please report them to us in writing. Informatica Corporation does not warrant that this documentation is error free. Informatica, PowerMart, PowerCenter, PowerChannel, PowerCenter Connect, MX, and SuperGlue are trademarks or registered trademarks of Informatica Corporation in the United States and in jurisdictions throughout the world. All other company and product names may be trade names or trademarks of their respective owners.

Portions of this software are copyrighted by DataDirect Technologies, 1999-2002.

Informatica PowerCenter products contain ACE (TM) software copyrighted by Douglas C. Schmidt and his research group at Washington University and University of California, Irvine, Copyright (c) 1993-2002, all rights reserved.

Portions of this software contain copyrighted material from The JBoss Group, LLC. Your right to use such materials is set forth in the GNU Lesser General Public License Agreement, which may be found at http://www.opensource.org/licenses/lgpl-license.php. The JBoss materials are provided free of charge by Informatica, “as-is”, without warranty of any kind, either express or implied, including but not limited to the implied warranties of merchantability and fitness for a particular purpose.

Portions of this software contain copyrighted material from Meta Integration Technology, Inc. Meta Integration® is a registered trademark of Meta Integration Technology, Inc.

This product includes software developed by the Apache Software Foundation (http://www.apache.org/). The Apache Software is Copyright (c) 1999-2005 The Apache Software Foundation. All rights reserved.

This product includes software developed by the OpenSSL Project for use in the OpenSSL Toolkit and redistribution of this software is subject to terms available at http://www.openssl.org. Copyright 1998-2003 The OpenSSL Project. All Rights Reserved.

The zlib library included with this software is Copyright (c) 1995-2003 Jean-loup Gailly and Mark Adler.

The Curl license provided with this Software is Copyright 1996-200, Daniel Stenberg, <[email protected]>. All Rights Reserved.

The PCRE library included with this software is Copyright (c) 1997-2001 University of Cambridge. Regular expression support is provided by the PCRE library package, which is open source software, written by Philip Hazel. The source for this library may be found at ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre.

InstallAnywhere is Copyright 2005 Zero G Software, Inc. All Rights Reserved.

Portions of the Software are Copyright (c) 1998-2005 The OpenLDAP Foundation. All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted only as authorized by the OpenLDAP Public License, available at http://www.openldap.org/software/release/license.html.

This Software is protected by U.S. Patent Numbers 6,208,990; 6,044,374; 6,014,670; 6,032,158; 5,794,246; 6,339,775 and other U.S. Patents Pending.

DISCLAIMER: Informatica Corporation provides this documentation “as is” without warranty of any kind, either express or implied, including, but not limited to, the implied warranties of non-infringement, merchantability, or use for a particular purpose. The information provided in this documentation may include technical inaccuracies or typographical errors. Informatica could make improvements and/or changes in the products described in this documentation at any time without notice.

Table of Contents

List of Figures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Preface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix

About This Guide . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx

Course Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx

Audience . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx

Document Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xx

Other Informatica Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Obtaining Informatica Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Visiting Informatica Customer Portal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Visiting the Informatica Web Site . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Visiting the Informatica Developer Network . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Visiting the Informatica Knowledge Base . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Obtaining Informatica Professional Certification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi

Providing Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii

Obtaining Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii

Unit 1: Data Integration Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Lesson 1-1. Introducing Informatica . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

Lesson 1-2. Data Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

Lesson 1-3. Mappings and Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

Lesson 1-4. Tasks and Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

Lesson 1-5. Metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

Unit 2: PowerCenter Components and User Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Lesson 2-1. PowerCenter Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11

Lesson 2-2. PowerCenter Client Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

Designer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Workflow Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

Unit 2 Lab: Using the Designer and Workflow Manager . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Step 1: Launch the Designer and Log Into the Repository . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Step 2: Navigate Folders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Step 3: Navigating the Designer Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Step 4: Create and Save Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

Step 5: Launch the Workflow Manager . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Step 6: Navigating the Workflow Manager Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Step 7: Workflow Manager Task Toolbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Step 8: Database Connection Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

Unit 3: Source Qualifier . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Lesson 3-1. Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Source Qualifier Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

Datatype Conversion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

Lesson 3-2. Velocity Methodology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

Lab Project . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

Architecture and Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

Unit 3 Lab A: Load Payment Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Section 1: Pass-Through Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Step 1: Launch the Designer and Review the Source and Target Definitions . . . . . . . . . . . . . . 38

Step 2: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

Step 3: Create a Workflow and a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Step 4: Run the Workflow and Monitor the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

Lesson 3-3. Source Qualifier Joins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

Unit 3 Lab B: Load Product Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Section 2: Homogeneous Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Step 1: Import the Source Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

Step 2: Import the Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Step 3: Create the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Step 4: Create the Session and Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Step 5: Run the Workflow and Monitor the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

Lesson 3-4. Source Pipelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

Unit 3 Lab C: Load Dealership and Promotions Staging Table . . . . . . . . . . . . . . . . . . . . 59

Section 3: Two Pipeline Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Step 1: Import the Source Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Step 2: Import the Target Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Step 3: Create the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

Step 4: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Unit 4: Expression, Filter, File Lists, and Workflow Scheduler . . . . . . . . . . . . . . . . . . . . 67

Lesson 4-1. Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

Lesson 4-2. Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

Lesson 4-3. File Lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Lesson 4-4. Workflow Scheduler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Run Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

Unit 4 Lab: Load the Customer Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Step 1: Create a Flat File Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Step 2: Create a Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

Step 3: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Step 4: Create a Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Step 5: Create an Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Step 6: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Step 7: Schedule a Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

Unit 5: Joins, Features and Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

Lesson 5-1. Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

Joiner Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

Join Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Joiner Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

Lesson 5-2. Shortcuts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

Unit 5 Lab A: Load Sales Transaction Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Step 1: Create a Flat File Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Step 2: Create a Relational Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Step 3: Create a Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Step 4: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

Step 5: Create a Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

Step 6: Link the Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

Step 7: Create a Workflow and Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112

Step 8: Start the Workflow and View Results in the Workflow Monitor . . . . . . . . . . . . . . . . 113

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Unit 5 Lab B: Features and Techniques I . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Open a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Feature 1: Auto Arrange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Feature 2: Remove Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Feature 3: Revert to Saved . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Feature 4: Link Path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Feature 5: Propagating Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Feature 6: Autolink by Name and Position . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Feature 7: Moving Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Feature 8: Shortcut to Port Editing from Normal View . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Feature 9: Create Transformation Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Feature 10: Scale-to-Fit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

Feature 11: Object Shortcuts and Copies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Feature 12: Copy Objects Within and Between Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . 130

Unit 6: Lookups and Reusable Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

Lesson 6-1. Lookup Transformation (Connected) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133

Lesson 6-2. Reusable Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137

Unit 6 Lab A: Load Employee Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

Step 1: Create a Flat File Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

Step 2: Create a Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

Step 3: Create a Reusable Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

Step 4: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Step 5: Create a Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Step 6: Add a Reusable Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

Step 7: Link Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

Step 8: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

Unit 6 Lab B: Load Date Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Step 1: Create a Flat File Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Step 2: Create a Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Step 3: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Step 4: Create a Workflow and a Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

Step 5: Run the Workflow and Monitor the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

Unit 7: Debugger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

Lesson 7-1. Debugging Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 165

Unit 7 Lab: Using the Debugger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

Step 1: Copy and Inspect the Debug Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

Step 2: Step Through the Debug Wizard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172

Step 3: Use the Debugger to Locate the Error . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

Step 4: Fix the Error and Confirm the Data is Correct . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176

Unit 8: Sequence Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

Lesson 8-1. Sequence Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179

Unit 8 Lab: Load Date Dimension Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

Step 1: Create a Shortcut to a Shared Relational Source Table . . . . . . . . . . . . . . . . . . . . . . . 187

Step 2: Create a Shortcut to a Shared Relational Target Table . . . . . . . . . . . . . . . . . . . . . . . 187

Step 3: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187

Step 4: Create a Sequence Generator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

Step 5: Link the Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Step 6: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

Unit 9: Lookup Caching, More Features and Techniques . . . . . . . . . . . . . . . . . . . . . . . 193

Lesson 9-1. Lookup Caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194

Unit 9 Lab A: Load Promotions Dimension Table (Lookup and Persistent Cache) . . . . 197

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

Step 1: Create a Shortcut to a Shared Relational Source Table . . . . . . . . . . . . . . . . . . . . . . . 200

Step 2: Create a Shortcut to Shared Relational Target Table . . . . . . . . . . . . . . . . . . . . . . . . 200

Step 3: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

Step 4: Create Lookups for the Start and Expiry Date Keys . . . . . . . . . . . . . . . . . . . . . . . . . 200

Step 5: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205

Unit 9 Lab B: Features and Techniques II . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

Open a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

Feature 1: Find in Workspace . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

Feature 2: View Object Dependencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209

Feature 3: Compare Objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210

Feature 4: Overview Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213

Unit 10: Sorter, Aggregator and Self-Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

Lesson 10-1. Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215

Sorter Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217

Lesson 10-2. Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218

Aggregator Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221

Lesson 10-3. Active and Passive Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

Lesson 10-4. Data Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223

Lesson 10-5. Self-Join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224

Unit 10 Lab: Reload the Employee Staging Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

Step 1: Copy an Existing Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232

Step 2: Examine Source Data to Determine a Key for Self-Join . . . . . . . . . . . . . . . . . . . . . . 232

Step 3: Prepare the New Mapping for Modification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

Step 4: Create a Sorter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233

Step 5: Create a Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234

Step 6: Create an Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

Step 7: Create a Joiner Transformation for the Self-Join . . . . . . . . . . . . . . . . . . . . . . . . . . . 235

Step 8: Get Salaries from the Lookup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236

Step 9: Connect the Joiner and Lookup to the Target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237

Step 10: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240

Unit 11: Router, Update Strategy and Overrides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

Lesson 11-1. Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243

Lesson 11-2. Update Strategy Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

Lesson 11-3. Expression Default Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 248

Lesson 11-4. Source Qualifier Override . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 249

Lesson 11-5. Target Override . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251

Lesson 11-6. Session Task Mapping Overrides . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252

Unit 11 Lab: Load Employee Dimension Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

Step 1: Copy the Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

Step 2: Edit the Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259

Step 3: Create a Router Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

Step 4: Create an Update Strategy for INSERTS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260

Step 5: Create Lookup to DIM_DATES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261

Step 6: Link upd_INSERTS and lkp_DIM_DATES_INSERTS to Target DIM_EMPLOYEE_INSERTS . . . 262

Step 7: Create an Update Strategy for UPDATES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

Step 8: Create Second Lookup to DIM_DATES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262

Step 9: Link upd_UPDATES and lkp_DIM_DATES_UPDATES to Target DIM_EMPLOYEE_UPDATES . . . 262

Step 10: Link ERRORS Router Group to DIM_EMPLOYEES_ERR . . . . . . . . . . . . . . . . . . 263

Step 11: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266

Step 12: Prepare, Run, and Monitor the Second Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266

Unit 12: Dynamic Lookup and Error Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

Lesson 12-1. Dynamic Lookup Cache . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 271

Lesson 12-2. Error Logging . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276

Error Log Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

Log Row Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

Log Source Row Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279

Unit 12 Lab: Load Customer Dimension Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 281

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Step 1: Create a Relational Source Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Step 2: Create a Relational Target Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Step 3: Create a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Step 4: Create a Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 284

Step 5: Create a Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

Step 6: Create an Update Strategy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 286

Step 7: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 287

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 289

Error Log Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

Reference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290

Unit 13: Unconnected Lookup, Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . 293

Lesson 13-1. Unconnected Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 293

Connected versus Unconnected Lookup Transformations . . . . . . . . . . . . . . . . . . . . . . . . . . 296

Joins versus Lookups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

Lesson 13-2. System Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297

Lesson 13-3. Mapping Parameters and Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299

Unit 13 Lab: Load Sales Fact Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 305

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

Step 1: Create an Internal Relationship Between Two Source Tables . . . . . . . . . . . . . . . . . . . 309

Step 2: Create a Mapping Parameter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309

Step 3: Create an Unconnected Lookup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 310

Step 4: Add Unconnected Lookup Test to Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

Step 5: Create Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 311

Step 6: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314

Data Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 316

Unit 14: Mapplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

Lesson 14-1. Mapplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

Mapplets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 319

Mapping Input Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 321

Mapping Output Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 323

Unit 14 Lab: Create a Mapplet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 327

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

Step 1: Create the Mapplet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328

Step 2: Add Mapplet to Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 329

Step 3: Create and Run the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 330

Unit 15: Mapping Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

Lesson 15-1. Designing Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333

High Level Process Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

Mapping Specifics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334

Unit 15 Workshop: Load Promotions Daily Aggregate Table . . . . . . . . . . . . . . . . . . . . . 337

Workshop Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

Sources and Targets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

Mapping Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 338

Workflow Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 341

Run Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 342

Unit 16: Workflow Variables and Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

Lesson 16-1. Link Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345

Lesson 16-2. Workflow Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346

Lesson 16-3. Assignment Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348

Lesson 16-4. Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 349

Lesson 16-5. Email Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350

Unit 16 Lab: Load Product Weekly Aggregate Table . . . . . . . . . . . . . . . . . . . . . . . . . . . 353

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

Step 1: Copy the Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

Step 2: Copy the Existing Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355

Step 3: Create the Assignment Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 356

Step 4: Create the Decision Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

Step 5: Create the Session Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358

Step 6: Create the Email Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359

Step 7: Start the Workflow and Monitor the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361

Unit 17: More Tasks and Reusability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365

Lesson 17-1. Event Wait Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365

Pre-Defined Event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 366

User-Defined Event . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367

Lesson 17-2. Event Raise Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 368

Lesson 17-3. Command Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 369

Lesson 17-4. Reusable Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371

Lesson 17-5. Reusable Session Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 371

Lesson 17-6. Reusable Session Configurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372

Lesson 17-7. pmcmd Utility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 373

Unit 18: Worklets and More Tasks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375

Lesson 18-1. Worklets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375

Lesson 18-2. Timer Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 378

Lesson 18-3. Control Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379

Unit 18 Lab: Load Inventory Fact Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 383

Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Step 1: Copy the Mappings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Step 2: Create a Worklet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Step 3: Create a Timer Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 385

Step 4: Create an Email Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386

Step 5: Create a Control Task . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387

Step 6: Create the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 388

Step 7: Start the Workflow and Monitor the Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 390

Unit 19: Workflow Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

Lesson 19-1. Designing Workflows . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393

Workflow Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

Workflow Specifics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 394

Unit 19 Workshop: Load All Staging Tables in Single Workflow . . . . . . . . . . . . . . . . . . . . . . . . 397

Workshop Details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397

Unit 20: Beyond This Course. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401

List of Figures

Figure 2-1. Navigator Window . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Figure 2-2. DEV_SHARED Folder and Subfolders . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

Figure 2-3. Designer Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

Figure 2-4. DEV_SHARED Target subfolder . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

Figure 2-5. Student folder with new objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Figure 2-6. Application Toolbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Figure 2-7. Task Toolbar Default Position . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Figure 2-8. Task Toolbar After Being Moved . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

Figure 2-9. Relational Connection Browser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

Figure 3-1. Normal view of the payment flat file definition displayed in the Source Analyzer . . . . . . . . . . . . . 38

Figure 3-2. Mapping with Source and Target Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

Figure 3-3. Normal view of the completed mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

Figure 3-4. Completed Session Task Target Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

Figure 3-5. Completed Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

Figure 3-6. Successful Run of a Workflow Depicted in the Task View of the Workflow Monitor . . . . . . . . . . 43

Figure 3-7. Properties for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

Figure 3-8. Source/Target Statistics for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

Figure 3-9. Data Preview of the STG_PAYMENT Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

Figure 3-10. Source Definitions with a PK/FK Relationship Displayed in the Source Analyzer . . . . . . . . . . . . 52

Figure 3-11. Normal View of the Completed Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Figure 3-12. Generated SQL for the m_Stage_Product Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

Figure 3-13. Properties of the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Figure 3-14. Source/Target Statistics for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Figure 3-15. Data Preview of the STG_PRODUCT Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

Figure 3-16. Normal view of the promotions flat file definition displayed in the Source Analyzer . . . . . . . . . . 62

Figure 3-17. Iconic View of the Completed Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Figure 3-18. Properties of the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

Figure 3-19. Source/Target Statistics for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

Figure 3-20. Data Preview of the STG_DEALERSHIP Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

Figure 3-21. Data Preview of the STG_PROMOTIONS Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

Figure 4-1. Source Analyzer View of the customer_layout Flat File Definition . . . . . . . . . . . . . . . . . . . . . . . . 82

Figure 4-2. Target Designer View of the STG_CUSTOMERS Table Relational Definition. . . . . . . . . . . . . . . 83

Figure 4-3. Mapping with Source and Target Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

Figure 4-4. Mapping with Newly Added Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

Figure 4-5. Properties Tab of the Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Figure 4-6. Completed Properties Tab of the Filter Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

Figure 4-7. Filter Transformation Linked to the Expression Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 86

Figure 4-8. Sample Expression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

Figure 4-9. Iconic View of the Completed Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

Figure 4-10. Session Task Source Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Figure 4-11. Contents of the customer_list.txt File List . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

Figure 4-12. Properties for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Figure 4-13. Source/Target Statistics for the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92

Figure 4-14. Data Preview of the STG_CUSTOMERS Target Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

Figure 4-15. General Properties for the Workflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

Figure 4-16. Customized Repeat Selections . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Figure 4-17. Completed Schedule Options. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

Figure 5-1. Normal View of the Heterogeneous Sources, Source Qualifiers and Target . . . . . . . . . . . . . . . . . 108

Figure 5-2. Joiner Transformation Button . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

Figure 5-3. Normal View of Heterogeneous Sources Connected to a Joiner Transformation . . . . . . . . . . . . . 109

Figure 5-4. Edit View of the Ports Tab for the Joiner Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

Figure 5-5. Edit View of the Condition Tab for Joiner Transformation Without a Condition . . . . . . . . . . . . 110

Figure 5-6. Edit View of the Condition Tab for the Joiner Transformation with Completed Condition . . . . 111

Figure 5-7. Normal View of Completed Mapping Heterogeneous Sources Not Displayed . . . . . . . . . . . . . . . 112

Figure 5-8. Task Details of the Completed Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Figure 5-9. Source/Target Statistics for the Session Run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

Figure 5-10. Data Preview of the STG_TRANSACTIONS Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

Figure 5-11. View of an Unorganized Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Figure 5-12. Arranged View of a Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118

Figure 5-13. Iconic View of an Arranged Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Figure 5-14. Selecting Multiple Links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

Figure 5-15. Designer Warning Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120

Figure 5-16. Selecting the forward link path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Figure 5-17. Highlighted forward link path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

Figure 5-18. Highlighted link path going forward and backward . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Figure 5-19. Selecting to propagate the attributes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

Figure 5-20. Propagation attribute dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

Figure 5-21. Autolink dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Figure 5-22. Defining a prefix in the autolink dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Figure 5-23. Expression after the AGE port has been moved . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Figure 5-24. Click and drag method of moving ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Figure 5-25. Creating a transformation using the menu . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Figure 5-26. Create Transformation dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

Figure 5-27. Normal View of the Newly Created Aggregator Transformation . . . . . . . . . . . . . . . . . . . . . . . 128

Figure 5-28. Zoom options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 128

Figure 5-29. Navigator window in the Designer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129

Figure 6-1. Source Analyzer view of the employees_layout flat file definition . . . . . . . . . . . . . . . . . . . . . . . . 144

Figure 6-2. Target Designer view of the STG_EMPLOYEES relational table definition . . . . . . . . . . . . . . . . 144

Figure 6-3. Transformation edit dialog box showing how to make a transformation reusable . . . . . . . . . . . . 145

Figure 6-4. Question box letting you know the action is irreversible . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

Figure 6-5. Transformation edit dialog box of a reusable transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 145

Figure 6-6. Navigator window depicting the Transformations node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Figure 6-7. Partial mapping with source and target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146

Figure 6-8. Transformation Toolbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Figure 6-9. Lookup Transformation table location dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Figure 6-10. Dialog box 1 of the 3 step Flat File Import Wizard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147

Figure 6-11. Normal view of the newly created Lookup Transformation . . . . . . . . . . . . . . . . . . . . . . . . . . . 148

Figure 6-12. Lookup Transformation condition box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149

Figure 6-13. Source properties for the employee_list file list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150

Figure 6-14. Task Details of the completed session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Figure 6-15. Source/Target Statistics of the completed session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151

Figure 6-16. Data Preview of the STG_EMPLOYEES target table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152

Figure 6-17. Mapping with Source and Target definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159

Figure 6-18. Completed Mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160

Figure 6-19. Task Details of the completed session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161

Figure 6-20. Source/Target Statistics for the session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162

Figure 6-21. Data preview of the STG_DATES table - screen 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163

Figure 6-22. Data preview of the STG_DATES table - screen 2 scrolled right . . . . . . . . . . . . . . . . . . . . . . . 163

Figure 7-1. Debug Session creation dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

Figure 7-2. Debug Session connections dialog box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173

Figure 7-3. Designer while running a Debug Session . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174

Figure 7-4. Customize Toolbars Dialog Box . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

Figure 7-5. Debugger Toolbar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175

Figure 8-1. Expanded view of m_DIM_DATES_LOAD . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

Figure 8-2. Sequence Generator Transformation icon . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188

Figure 8-3. Normal view of the sequence generator NEXTVAL port connected to a target column . . . . . . . . 188

Figure 8-4. Normal view of connected ports to the target . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189

Figure 8-5. Task Details of the completed session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

Figure 8-6. Source/Target statistics for the session run . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190

Figure 8-7. Data Preview of the DIM_DATES table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191

Figure 9-1. m_DIM_PROMOTIONS_LOAD mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200

Figure 9-2. m_DIM_DATES from the previous lab that populated the DIM_DATES table . . . . . . . . . . . . . 201

Figure 9-3. Select Lookup Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201

Figure 9-4. Lookup Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202

Figure 9-5. m_DIM_PROMOTIONS_LOAD completed mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203

Figure 9-6. Task Details of the completed session run

Figure 9-7. Source/Target Statistics of the completed session run

Figure 9-8. Data Preview of the DIM_PROMOTIONS target table

Figure 9-9. Preview files created when Persistent Cache is set on Lookup Transformation

Figure 9-10. Find in workspace dialog box

Figure 9-11. View Dependencies dialog box

Figure 9-12. Transformation compare objects dialog box

Figure 9-13. Compare Transformation objects Properties details

Figure 9-14. Target comparison dialog box

Figure 9-15. Column differences between two target tables

Figure 10-1. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD mapping

Figure 10-2. Employee_central.txt

Figure 10-3. Renaming an instance of a Reusable Transformation

Figure 10-4. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD after most links removed

Figure 10-5. Sorter Transformation Icon on Toolbar

Figure 10-6. Aggregator Transformation Icon on Toolbar

Figure 10-7. Partial mapping flow depicting the flow from the Sorter to the Filter to the Aggregator

Figure 10-8. Split data stream joined back together

Figure 10-9. Iconic view of the completed self-join mapping

Figure 10-10. Source properties for the employee_list.txt file list

Figure 10-11. Task Details of the completed session run

Figure 10-12. Source/Target Statistics of the completed session run

Figure 10-13. Data preview of the self-join of Managers and Employees in the STG_EMPLOYEES target table - screen 1

Figure 10-14. Data preview of the STG_EMPLOYEES target table - screen 2 scrolled right

Figure 11-1. Mapping copy Target Dependencies dialog box

Figure 11-2. Iconic view of the m_DIM_EMPLOYEES_MAPPING

Figure 11-3. Router Groups

Figure 11-4. Update Strategy set to INSERT

Figure 11-5. Iconic view of the completed mapping

Figure 11-6. Source Filter Value

Figure 11-7. Writers section of Target schema


Figure 11-8. Task Details of the completed session run

Figure 11-9. Source/Target Statistics

Figure 11-10. Data Results for DIM_EMPLOYEES

Figure 11-11. Data Results for the Error Flat File (Located on the Machine Hosting the Integration Service Process)

Figure 11-12. Task Details tab results for second run

Figure 11-13. Source/Target Statistics for second run

Figure 11-14. Data preview showing updates to the target table

Figure 12-1. Port tab view of a dynamic Lookup

Figure 12-2. Port to Port Association

Figure 12-3. Iconic View of the Completed Mapping

Figure 12-4. Error Log Choice Screen

Figure 12-5. Task Details of the Completed Session Run

Figure 12-6. Source/Target Statistics for the Session Run

Figure 12-7. Data preview of the DIM_CUSTOMERS table

Figure 12-8. Flat file error log

Figure 13-1. Source Analyzer view of the STG_TRANSACTIONS and STG_PAYMENT tables

Figure 13-2. Declare Parameters and Variables screen

Figure 13-3. Parameter entry

Figure 13-4. Lookup Ports tab showing input, output and return ports checked/unchecked

Figure 13-5. Aggregator ports with Group By ports checked

Figure 13-6. Finished Aggregator

Figure 13-7. Aggregator to Target Links

Figure 13-8. Iconic view of the completed mapping

Figure 13-9. Task Details of the completed session run

Figure 13-10. Source/Target Statistics of the completed session run

Figure 13-11. Data Preview of the FACT_SALES target table

Figure 14-1. Mapplet Designer view of mplt_AGG_SALES

Figure 14-2. Mapplet Designer view of MPLT_AGG_SALES with Input and Output transformations

Figure 14-3. Iconic view of the m_FACT_SALES_LOAD_MAPPLET_xx mapping

Figure 15-1. Source table definition

Figure 15-2. Target table definition

Figure 15-3. Task Details of the completed session run

Figure 15-4. Source/Target Statistics of the completed session run

Figure 15-5. Data Preview of the FACT_PROMOTIONS_AGG_DAILY table

Figure 16-1. Workflow variable declaration

Figure 16-2. Link condition testing if a session run was successful

Figure 16-3. Assignment Task expression declaration

Figure 16-4. Decision Task Expression

Figure 16-5. Link condition testing for a Decision Task condition of TRUE

Figure 16-6. Email Task Properties

Figure 16-7. Completed Workflow

Figure 16-8. Gantt chart view of the completed workflow run

Figure 16-9. View Workflow Variables

Figure 16-10. Value of the $$WORKFLOW_RUNS variable after first run

Figure 16-11. Gantt chart view of the completed workflow run after the weekly load runs

Figure 16-12. Task Details of the completed session run

Figure 18-1. Timer Task Relative time setting

Figure 18-2. Email Task Properties Tab

Figure 18-3. Control Task Properties Tab

Figure 18-4. Completed Worklet


Figure 18-5. Completed Workflow

Figure 18-6. Gantt chart view of the completed workflow run

Figure 18-7. Gantt chart view of the completed workflow run


Preface

Welcome to the PowerCenter 8 Level I Developer course. Data integration is a large undertaking with many potential areas of concern. The PowerCenter infrastructure will greatly assist you in your data integration efforts and alleviate much of your risk. This course will prepare you for that challenge by teaching you the most commonly used components of the product.

You will build a small data warehouse using PowerCenter to extract from source tables and files, transform the data, load it into a staging area and finally into the data warehouse. The instructor will teach you about mappings, transformations, sources, targets, workflows, sessions, workflow tasks, connections and the Velocity methodology.


About This Guide

Course Objectives

Welcome to the PowerCenter 8 Level I Developer course.

After completing this course, you should be able to:

♦ Use PowerCenter developer tools to:

  ♦ Create and debug mappings

  ♦ Create, run, monitor and troubleshoot workflows

♦ Design basic mappings and workflows

Audience

This course is designed for data integration and data warehousing implementers. You should be familiar with data integration and data warehousing terminology and with using Microsoft Windows.

Document Conventions

This guide uses the following formatting conventions:

If you see… | It means… | Example

> | Indicates a submenu to navigate to. | Click Repository > Connect. In this example, you should click the Repository menu or button and choose Connect.

boldfaced text | Indicates text you need to type or enter. | Click the Rename button and name the new source definition S_EMPLOYEE.

UPPERCASE | Database tables and column names are shown in all UPPERCASE. | T_ITEM_SUMMARY

italicized text | Indicates a variable you must replace with specific information. | Connect to the Repository using the assigned login_id.

Note: | The following paragraph provides additional facts. | Note: You can select multiple objects to import by using the Ctrl key.

Tip: | The following paragraph provides suggested uses or a Velocity best practice. | Tip: The m_ prefix for a mapping name is…


Other Informatica Resources

In addition to the student guides, Informatica provides these other resources:

♦ Informatica Documentation

♦ Informatica Customer Portal

♦ Informatica web site

♦ Informatica Developer Network

♦ Informatica Knowledge Base

♦ Informatica Professional Certification

♦ Informatica Technical Support

Obtaining Informatica Documentation

You can access Informatica documentation from the product CD or online help.

Visiting Informatica Customer Portal

As an Informatica customer, you can access the Informatica Customer Portal site at http://my.informatica.com. The site contains product information, user group information, newsletters, access to the Informatica customer support case management system (ATLAS), the Informatica Knowledge Base, and access to the Informatica user community.

Visiting the Informatica Web Site

You can access Informatica’s corporate web site at http://www.informatica.com. The site contains information about Informatica, its background, upcoming events, and how to locate your closest sales office. You will also find product information, as well as literature and partner information. The services area of the site includes important information on technical support, training and education, and implementation services.

Visiting the Informatica Developer Network

The Informatica Developer Network is a web-based forum for third-party software developers. You can access the Informatica Developer Network at the following URL:

http://devnet.informatica.com

The site contains information on how to create, market, and support customer-oriented add-on solutions based on interoperability interfaces for Informatica products.

Visiting the Informatica Knowledge Base

As an Informatica customer, you can access the Informatica Knowledge Base at http://my.informatica.com. The Knowledge Base lets you search for documented solutions to known technical issues about Informatica products. It also includes frequently asked questions, technical white papers, and technical tips.

Obtaining Informatica Professional Certification

You can take, and pass, exams provided by Informatica to obtain Informatica Professional Certification. For more information, go to:

http://www.informatica.com/services/education_services/certification/default.htm


Providing Feedback

Email any comments on this guide to [email protected].

Obtaining Technical Support

There are many ways to access Informatica Technical Support. You can call or email your nearest Technical Support Center listed in the following table, or you can use our WebSupport Service.

Use the following email addresses to contact Informatica Technical Support:

[email protected] for technical inquiries

[email protected] for general customer service requests

WebSupport requires a user name and password. You can request a user name and password at http://my.informatica.com.

North America / South America

Informatica Corporation Headquarters
100 Cardinal Way
Redwood City, California 94063
United States

Toll Free: 877 463 2435

Standard Rate:
United States: 650 385 5800

Europe / Middle East / Africa

Informatica Software Ltd.
6 Waltham Park
Waltham Road, White Waltham
Maidenhead, Berkshire SL6 3TN
United Kingdom

Toll Free: 00 800 4632 4357

Standard Rate:
Belgium: +32 15 281 702
France: +33 1 41 38 92 26
Germany: +49 1805 702 702
Netherlands: +31 306 022 797
United Kingdom: +44 1628 511 445

Asia / Australia

Informatica Business Solutions Pvt. Ltd.
301 & 302 Prestige Poseidon
139 Residency Road
Bangalore 560 025
India

Toll Free:
Australia: 00 11 800 4632 4357
Singapore: 001 800 4632 4357

Standard Rate:
India: +91 80 5112 5738


Unit 1: Data Integration Concepts

After completing this unit, you should be able to:

♦ Describe Informatica Corporation and its place in the data integration marketplace

♦ Define basic data integration terms and concepts

Lesson 1-1. Introducing Informatica

Informatica provides data integration tools for both batch and real-time applications.


Informatica is affiliated with many standards organizations, including:

♦ Integration Consortium. www.eaiindustry.org

♦ Object Management Group (OMG). www.omg.org

♦ Common Warehouse Metamodel (CWM). www.omg.org/cwm

♦ Enterprise Grid Alliance. www.gridalliance.org


♦ Global Grid Forum (GGF). www.gridforum.org

♦ XML.org. www.xml.org

♦ Web Services Interoperability Organization. www.ws-i.org

♦ Supply-Chain Council. www.supply-chain.org

♦ Carnegie-Mellon Software Engineering Institute (SEI). www.sei.cmu.edu

♦ APICS Educational and Research Foundation. www.apics.org

♦ Shared Services and Business Process Outsourcing Association (SBPOA). www.sharedxpertise.org

Additional resources about Informatica can be found on the following websites:

♦ www.informatica.com—provides information on Professional Services and Education Services

♦ my.informatica.com—provides access to Technical Support, product documentation, Velocity methodology, knowledge base, and mapping templates

♦ devnet.informatica.com—the Informatica Developers Network offers discussion forums, web seminars, and technical papers.


Lesson 1-2. Data Integration

Traditionally, data integration is a batch process that extracts, transforms and loads (ETL) data from transactional systems into data warehouses.

The ETL process can be imagined as an assembly line.
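The assembly-line picture can be sketched in a few lines of Python. This is only an illustration of the extract, transform and load stages, not PowerCenter code. The function names, the sample rows and the in-memory warehouse list are all assumptions made for this example.

```python
# Illustrative only: a toy extract -> transform -> load pipeline.
# The row layout and all names here are hypothetical, not PowerCenter APIs.

def extract(source_rows):
    """Read rows from a 'transactional' source, one at a time."""
    for row in source_rows:
        yield dict(row)  # copy so later stages do not alter the source


def transform(rows):
    """Clean each row: tidy the name and convert the amount to a number."""
    for row in rows:
        row["name"] = row["name"].strip().upper()
        row["amount"] = float(row["amount"])
        yield row


def load(rows, target):
    """Append the transformed rows to the target and return the row count."""
    count = 0
    for row in rows:
        target.append(row)
        count += 1
    return count


source = [
    {"name": " alice ", "amount": "10.50"},
    {"name": "bob", "amount": "3"},
]
warehouse = []
loaded = load(transform(extract(source)), warehouse)
print(loaded, warehouse)
```

Each stage hands rows to the next, much as stations on an assembly line pass along a part. The source rows themselves are left unchanged.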


Informatica PowerCenter is deployed for a variety of batch and real-time data integration purposes:

♦ Data Migration. ERP consolidation, legacy conversion, new application implementation, system consolidation

♦ Data Synchronization. Application integration, business to business data transfer

♦ Data Warehousing. Business intelligence reporting, data marts, data mart consolidation, operational data stores

♦ Data Hubs. Master data management; reference data hubs; single view of customer, product, supplier, employee, etc.

♦ Business Activity Monitoring. Business process improvement, real-time reporting

Informatica partners with Composite Software for Enterprise Information Integration (EII): on-the-fly federated views and real-time reporting of information spread across multiple data sources, without moving the data into a centralized repository.

Lesson 1-3. Mappings and Transformations

Mappings


Transformations

Transformations change the data they receive.

PowerCenter includes the following types of transformations:

♦ Passive: The number of rows exiting the transformation is the same as the number of rows entering it.

♦ Active: The number of rows exiting the transformation may not be the same as the number of rows entering the transformation.
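The difference between the two types can be illustrated with plain Python functions. This is a minimal sketch rather than PowerCenter code, and the function names and sample rows are hypothetical.

```python
# Illustrative only: plain-Python stand-ins for the two transformation types.

def passive_uppercase(rows):
    """Passive: every input row produces exactly one output row."""
    return [{**row, "city": row["city"].upper()} for row in rows]


def active_filter(rows, min_sales):
    """Active: rows failing the condition are dropped, so the count may change."""
    return [row for row in rows if row["sales"] >= min_sales]


rows = [
    {"city": "Austin", "sales": 120},
    {"city": "Boston", "sales": 40},
    {"city": "Denver", "sales": 75},
]

passive_out = passive_uppercase(rows)
active_out = active_filter(rows, min_sales=50)
print(len(rows), len(passive_out), len(active_out))
```

Here the passive function returns three rows for three input rows, while the filter returns only the two rows that meet the condition.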


Commonly used PowerCenter transformations include:

♦ Source Qualifier - reads sources

♦ Filter - filters data conditionally

♦ Sorter - sorts data

♦ Expression - performs logical/mathematical functions on data

♦ Aggregator - calculates sums, averages, maximums and minimums

♦ Joiner - joins two data flows

♦ Lookup - looks up a corresponding value from a table or flat file
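For readers who think in code, three of these transformations have rough equivalents in ordinary Python. The sketch below is only an analogy. The order rows, the store_names dictionary and all variable names are invented for the example, and in PowerCenter these steps are configured graphically rather than written by hand.

```python
# Illustrative only: rough Python analogues of three transformations.
# The data and names are hypothetical.
from collections import defaultdict

orders = [
    {"store_id": 2, "amount": 30.0},
    {"store_id": 1, "amount": 20.0},
    {"store_id": 2, "amount": 50.0},
]
store_names = {1: "North", 2: "South"}  # stands in for a lookup table

# Sorter: order the rows by a key.
sorted_orders = sorted(orders, key=lambda row: row["store_id"])

# Aggregator: group by store_id and sum the amounts.
totals = defaultdict(float)
for row in sorted_orders:
    totals[row["store_id"]] += row["amount"]

# Lookup: fetch the corresponding store name for each aggregated row.
report = [
    {"store": store_names.get(store_id, "UNKNOWN"), "total": total}
    for store_id, total in totals.items()
]
print(report)
```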


Lesson 1-4. Tasks and Workflows

Tasks

A task is an executable set of actions, functions, or commands.

A sequence of tasks defines the runtime behavior of a data integration process.


Workflows

A workflow is a set of ordered tasks that describe runtime ETL processes. Tasks can be sequenced serially, in parallel and conditionally. Each linked icon represents a task.
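The idea of tasks joined by conditional links can be sketched as a small Python model. This is an assumption-laden illustration, not how the Integration Service is implemented. The run_workflow helper, the task names and the link conditions are all hypothetical, and branches that could run in parallel are simply run one after another here.

```python
# Illustrative only: a workflow as ordered tasks joined by conditional links.
# Task names and the run_workflow helper are hypothetical.

def run_workflow(tasks, links, start):
    """Run tasks from `start`, following each link whose condition passes.

    tasks: name -> callable returning True (succeeded) or False (failed)
    links: name -> list of (next_task_name, condition) pairs, where
           condition takes the previous task's status and returns a bool
    """
    executed = []
    pending = [start]
    while pending:
        name = pending.pop(0)
        status = tasks[name]()
        executed.append((name, status))
        for next_name, condition in links.get(name, []):
            if condition(status):
                pending.append(next_name)
    return executed


def succeeded(status):
    return status


def failed(status):
    return not status


tasks = {
    "s_load_staging": lambda: True,
    "s_load_dim": lambda: True,
    "s_load_fact": lambda: True,
    "email_failure": lambda: True,
}

links = {
    # Two "succeeded" links model branches that could run in parallel;
    # the "failed" link is followed only when the staging load fails.
    "s_load_staging": [
        ("s_load_dim", succeeded),
        ("s_load_fact", succeeded),
        ("email_failure", failed),
    ],
}

history = run_workflow(tasks, links, "s_load_staging")
print([name for name, _ in history])
```

If the first task returned False instead, only the email_failure task would follow it.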


Lesson 1-5. Metadata

Metadata, which means “data about data,” is information that describes data. Common contents of metadata include the source or author of a dataset, how the dataset should be accessed, and its limitations.
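As a small illustration, the Python sketch below keeps a description of a dataset separate from the dataset itself. The field names and values are hypothetical and do not represent the PowerCenter repository schema.

```python
# Illustrative only: metadata kept separately from the data it describes.
# The field names are hypothetical, not a PowerCenter repository schema.

dataset = [
    {"CUST_ID": 101, "CUST_NAME": "Acme"},
    {"CUST_ID": 102, "CUST_NAME": "Globex"},
]

metadata = {
    "name": "CUSTOMERS",
    "source": "sales order-entry system",      # source or author of the dataset
    "access": "relational table, read-only",   # how it should be accessed
    "limitations": "refreshed nightly, not in real time",
    "columns": {"CUST_ID": "integer", "CUST_NAME": "string"},
}


def undocumented_columns(rows, meta):
    """Return column names present in the data but missing from the metadata."""
    seen = {column for row in rows for column in row}
    return sorted(seen - set(meta["columns"]))


print(undocumented_columns(dataset, metadata))
```

The metadata says nothing about any individual customer. It describes where the data comes from, how to reach it and what shape it has.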

Unit 1 Quiz

Match the terms and explanations:

1. ETL

2. Mapping

3. Workflow

4. Metadata

5. Transformation

6. Task

a. An executable set of actions, functions or commands

b. Defines data and processes

c. Generates or manipulates data

d. Logically defines the ETL process

e. A collection of ordered tasks

f. Extract, transform and load data


Unit 2: PowerCenter Components and User Interface

After completing this unit, you should be able to:

♦ Name the main PowerCenter components and describe their user interfaces

Lesson 2-1. PowerCenter Architecture

The PowerCenter architecture includes the following elements:

♦ Sources—Can be relational tables or heterogeneous files (flat files, VSAM files and XML)

Note: XML is an advanced topic and is not covered in this course.

♦ Targets—Can be relational tables or heterogeneous files

♦ Integration Service—The engine that performs all of the extract, transform and load logic

♦ Repository Service—Manages connectivity to the metadata repositories that contain mapping and workflow definitions

♦ Repository Service Process—Multi-threaded process that retrieves, inserts and updates repository metadata

♦ Repository—Contains all of the metadata needed to run ETL processes

♦ Client Tools—Desktop tools used to populate the repository with metadata, execute workflows on the Integration Service, monitor the workflows and manage the repository


Lesson 2-2. PowerCenter Client Tools

Client tools run on Microsoft Windows.

All tools access the repository through the Repository Service.

The Workflow Manager and Workflow Monitor also connect to the Integration Service.

Each client application has its own interface. The interfaces have toolbars, a navigation window to the left, a workspace to the right, and an output window at the bottom.


Designer


Within the Designer, you can display transformations in the following views:

♦ Iconized. Shows the transformation in relation to the rest of the mapping. This also minimizes the screen space needed to display a mapping.

♦ Normal. Shows the flow of data through the transformation. This view is typically used when copying/linking ports to other objects.

♦ Edit. Shows transformation ports and properties; allows editing. This view is used to add, edit, or delete ports and to change any of the transformation attributes or properties.


Workflow Manager


In the Workflow Manager, you can display tasks in the following views:

♦ Iconized (Session task example)

♦ Edit (Session task example)


Unit 2 Lab: Using the Designer and Workflow Manager

Business Purpose

You have been asked to learn how to use Informatica PowerCenter in order to more efficiently accomplish your ETL objectives and automate the development process. Because you have limited or no prior exposure to this software, this exercise will serve to orient you to the basic development interfaces.

Technical Description

PowerCenter includes two development applications: the Designer, which you will use to create mappings, and the Workflow Manager, which you will use to create and start workflows. This exercise is designed to serve as your first hands-on experience with PowerCenter and to supplement the instructor demonstrations. You will import source and target definitions from a shortcut folder into your own folder.

Goals

♦ Learn how to navigate the repository folder structure.

♦ Understand the purpose of the tools accessed from the Designer and Workflow Manager.

♦ Create and save source and target shortcuts.

♦ Learn how to access and edit the database connection objects.

Duration

30 minutes


Instructions

Step 1: Launch the Designer and Log Into the Repository

1. Launch the Designer client application from the desktop icon. If no desktop icon is present, select Start > Programs > Informatica PowerCenter … > Client > PowerCenter Designer.

2. Maximize the Designer window.

Note: The Navigator window on the left side should resemble Figure 2-1. However, you may see additional or fewer repositories, depending on your classroom environment.

Figure 2-1. Navigator Window

3. Log into the PC8_DEV repository with the user name studentxx, where xx represents your student number as assigned by the instructor. The password is the same as the user name. Passwords are always case-sensitive.

Tip: The user name to log into the repository is an application-level user name. It allows PowerCenter to admit you to the repository with a specific set of application privileges. It is not a database user name.

Step 2: Navigate Folders

1. Double-click the folder DEV_SHARED. This opens the folder and shows you the subfolders associated with it. Figure 2-2 shows the Navigator:

Figure 2-2. DEV_SHARED Folder and Subfolders


Note: The DEV_SHARED folder icon has a small blue arm holding it. This icon denotes that DEV_SHARED is a shortcut folder. As you will see later in this lab, objects dragged from a shortcut folder into an open folder create shortcuts to those objects.

2. Expand some of the subfolders to see the objects they hold.

Note that some subfolders are empty. When a new object, such as a target definition, is created within a folder, it automatically goes into the appropriate subfolder.

Note: Within the Sources subfolder, the source objects are organized under individual “nodes” (branches in the hierarchy), such as FlatFile, ODBC_EDW, etc. These are based on the type of source and the Data Source Name that was used to import the source definition (more on this later). Very Important: You will need to click on these source nodes to locate source definitions that may be “hiding” from view.

Each PowerCenter application, such as the Designer, shows only subfolders related to the objects that can be created and modified by that application. For example, in the Designer you only see subfolders for sources, targets, mappings, etc.

3. Double-click on your individual student folder.

For the remainder of the class, you will create and modify objects in this folder. Some pre-made objects have been provided as well.

Note: Your student folder is now the “open” folder. Only one folder at a time can be open. The DEV_SHARED folder is now “expanded.” This distinction is important, as you will see later in this lab.

Tip: Technically, all folders are “shared” with all users who have the appropriate folder permissions, regardless of whether they have the blue arm or not. Do not confuse repository folders with the directories visible in Windows Explorer. The folders are PowerCenter repository objects and are not related to Windows directories.

Tip: Subfolders are created and managed automatically. Users cannot create, delete, nest, or rename subfolders.

Step 3: Navigating the Designer Tools

1. Select the menu option Tools > Source Analyzer. The workspace to the right of the Navigator window changes to an empty space.

Note: The small toolbar directly to the right of the Navigator window, at the top, holds the five Designer tools. Each tool allows you to create and modify one specific type of object, such as sources. Figure 2-3 shows the Designer tools with the first tool (the Source Analyzer) selected.

Figure 2-3. Designer Tools


2. Using the left mouse button, click each of the five tools in turn.

The name of each tool is displayed in the upper left corner of the workspace when that tool is active.

Note: The main menu bar (very top of your screen) changes depending on which tool is active. Because these menus are context-sensitive to which tool is active, you must already be in the appropriate tool to create or modify a specific type of object.

♦ The Source Analyzer tool is used to create or modify source objects. They may be relational, flat file, XML or COBOL sources.

♦ The Target Designer tool is used to create or modify target objects. They may be relational, flat file, or XML. It does not matter whether these targets are part of an actual data warehouse.

♦ The Transformation Developer tool is used to create or modify reusable transformations. Non-reusable transformations are created directly in a mapping or mapplet. This distinction will be covered later in the class.

♦ The Mapplet Designer tool is used to create or modify mapplets.

♦ The Mapping Designer tool is used to create or modify mappings.

Step 4: Create and Save Shortcuts

1. Ensure that the Target Designer is active and that your student folder is open.

2. To see which folder is active, choose View > Workbook to display the PowerCenter Client in Workbook view.

The PowerCenter Client displays tabs for each folder at the bottom of the Main window:

Important: In order to copy/shortcut any object into a folder, the destination folder (the folder you are adding to) must be the open folder. If the destination folder is not open, the copy/shortcut will not work.

PowerCenter Client shows tabs in Workbook view.


3. In the DEV_SHARED folder, expand the Targets subfolder by clicking on the + sign to the left of the subfolder. Figure 2-4 shows the Navigator window:

Figure 2-4. DEV_SHARED Target subfolder

4. Drag and drop the STG_PAYMENT target from the Navigator into the Target Designer workspace.

You will see the confirmation message, “Create a shortcut to the target table STG_PAYMENT?”

5. Click Yes at the confirmation message.

6. Expand the Targets subfolder in your student folder. Note that you have added a shortcut to the STG_PAYMENT staging target table to your own folder.

7. Open the Source Analyzer tool in your student folder.

8. In the DEV_SHARED folder, expand the Sources subfolder and expand the FlatFile container.

9. In your folder, add shortcuts to the two source definitions listed below:

♦ PROMOTIONS

♦ PAYMENT

Tip: PowerCenter shortcuts are “pointers” to the original object. They can be used but they cannot be modified as shortcuts. The original object can be modified, and any changes will immediately affect all shortcuts to that object.

10. Confirm that your student folder appears similar to Figure 2-5:

11. Use the menu option Repository > Save to save these objects in your student folder.
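The "pointer" behavior of the shortcuts created in this step can be sketched in a few lines of Python. This is only an illustration of the idea, not how the repository stores shortcuts; the `TargetDefinition` and `Shortcut` classes are hypothetical names invented for the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TargetDefinition:
    """An original repository object, e.g. a target in a shared folder."""
    name: str
    columns: list = field(default_factory=list)


class Shortcut:
    """Read-only pointer to an original object held in another folder."""

    def __init__(self, original):
        # Bypass our own __setattr__ guard to store the reference once.
        object.__setattr__(self, "_original", original)

    def __getattr__(self, attr):
        # Every read is forwarded to the original, so changes show up at once.
        return getattr(self._original, attr)

    def __setattr__(self, attr, value):
        raise AttributeError("a shortcut cannot be modified; edit the original")


stg_payment = TargetDefinition("STG_PAYMENT", ["PAYMENT_ID"])
shortcut = Shortcut(stg_payment)

# Changing the original is immediately visible through the shortcut.
stg_payment.columns.append("PAYMENT_TYPE_DESC")
print(shortcut.columns)
```

Trying to assign to `shortcut.name` raises an error in this sketch, mirroring the rule that a shortcut can be used but not edited.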

Step 5: Launch the Workflow Manager

1. Left-click the toolbar icon for the Workflow Manager shown in Figure 2-6. This toolbar is usually above the Navigator window.

2. Confirm that the Workflow Manager launches and you are automatically logged into the repository the same way as you were in the Designer.

3. Maximize the Workflow Manager application.

Figure 2-5. Student folder with new objects

Tip: You should periodically save changes to the repository when using the Designer or the Workflow Manager. The keyboard shortcut Ctrl+S can also be used. There is no “auto-save” feature.

Figure 2-6. Application Toolbar

Tip: Avoid having two or more “instances” of the same PowerCenter application (such as the Workflow Manager) running on a machine at the same time. There is no benefit in doing this, and it can result in confusion when editing objects.

Workflow Manager Button

4. Browse through the various folders and subfolders in the Workflow Manager Navigator window as you did in the Designer. Note that only subfolders for the objects that can be created with the Workflow Manager are present: Tasks, Sessions, Worklets, and Workflows.

Note: Although a session object is a type of task, it gets its own subfolder because you will typically have many more sessions than the other types of tasks. Only reusable sessions will appear in the Sessions subfolder. Likewise, only reusable tasks (except for sessions) will appear in the Tasks subfolder.

Step 6: Navigating the Workflow Manager Tools

1. Select the menu option Tools > Task Developer.

Just as in the Designer, you will see the workspace clear itself and a toolbar appear to the right of the Navigator window. The idea is the same as with the Designer, except there are three tools instead of five.

2. With your left mouse button, alternately toggle between the three tools.

Note that the name of each tool is displayed in the upper left corner of the workspace when that tool is active. Note also the context-sensitive menus, as we did in the Designer.

♦ The Task Developer tool is used to create or modify reusable tasks.

♦ The Worklet Designer tool is used to create or modify worklets.

♦ The Workflow Designer tool is used to create or modify workflows.

Step 7: Workflow Manager Task Toolbar

The Workflow Manager is equipped with a toolbar that shows an icon for each type of task that can be created. This toolbar is visible by default, but its default location is at the top right-hand corner of the screen. We will move the toolbar to a more central location.

1. Locate the “vertical stripe” at the far left-hand side of the toolbar, as shown in Figure 2-7:

Figure 2-7. Task Toolbar Default Position

2. With your left mouse button, drag the toolbar toward the left and drop it in a convenient location so that all of the buttons are visible. The top of your Workflow Manager should appear similar to Figure 2-8:

Step 8: Database Connection Objects

Later in the class, we will create sessions that read from database source tables and write to database target tables. To open a connection to each database, PowerCenter needs the database login and the designation (i.e., connection string, database name, or server name).

Instead of requiring the user to type this information each time a session is created, PowerCenter allows us to create reusable and sharable database connection objects. These objects contain properties describing one database connection. The objects can be associated with multiple sessions to describe either source, target, or lookup connections.
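As a rough mental model, a connection object is a named record that many sessions refer to instead of each carrying its own login details. The Python sketch below illustrates that idea only; the `RelationalConnection` and `Session` classes and the connect strings are hypothetical, while the connection and user names are the ones used in this class.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RelationalConnection:
    name: str
    db_type: str
    user: str
    connect_string: str  # hypothetical value; not given in the lab text


# One shared registry of connection objects, not tied to any folder.
connections = {
    c.name: c
    for c in (
        RelationalConnection("NATIVE_TRANS", "Oracle", "sdbu", "trans_db"),
        RelationalConnection("NATIVE_STG07", "Oracle", "tdbu07", "stg_db"),
    )
}


@dataclass
class Session:
    name: str
    source_connection: str = ""
    target_connection: str = ""

    def resolve(self, role):
        """Look up the connection object a session refers to by name."""
        key = self.source_connection if role == "source" else self.target_connection
        return connections[key]


# Two sessions reuse the same target connection object.
s1 = Session("s_m_Stage_Product_07", "NATIVE_TRANS", "NATIVE_STG07")
s2 = Session("s_m_Stage_Payment_Type_07", target_connection="NATIVE_STG07")
print(s1.resolve("target") is s2.resolve("target"))
```

Because sessions hold only the connection name, changing a login in one place would affect every session that uses it.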

1. In the Workflow Manager, select the menu option Connections > Relational.

Figure 2-8. Task Toolbar After Being Moved

You will see the Relational Connection Browser similar to Figure 2-9.

Note: Each connection object is organized under a database type.

2. Double-click on the NATIVE_TRANS connection object to display its properties.

3. You will not have write privileges. Click OK.

Note: The connection NATIVE_TRANS will log into the database with the user name sdbu. The connection object will be shared among the students in the class.

4. Double-click on any of the other objects that have your student number. The NATIVE_STG07 connection, for example, will have the user name tdbu07. These are the individual student connections to be used to read from and write to your individual staging tables and the enterprise data warehouse (EDW) tables.

It’s intuitive to create additional connection objects. Experiment if you have extra time.

Figure 2-9. Relational Connection Browser

Tip: Database connection objects are not associated with a specific folder.

Unit 2 Quiz

Match the terms and explanations:

1. Integration Service
2. Repository
3. Repository Service
4. Administration Console
5. Repository Manager
6. Designer
7. Workflow Manager
8. Workflow Monitor
9. Repository Service Process

a. Centralized management of repository connections
b. Perform domain and service administrative tasks
c. Monitor and control workflows
d. Create and start workflows
e. ETL processing engine
f. Multi-threaded process that retrieves/inserts/updates repository metadata
g. Collection of tables that contains PowerCenter metadata
h. Create mapping objects
i. Perform repository security

Unit 3: Source Qualifier

When you have completed this unit, you should be able to:

♦ Describe when and how to use:

♦ Source Qualifier transformation

♦ Source Qualifier join

♦ Create and run pass-through mappings

Lesson 3-1. Source Qualifier Transformation

Source Qualifier Transformation

Type

Active (may change number of rows)

Description

A Source Qualifier transformation:

♦ Selects records from flat file and relational table sources. Only those fields or columns used in the mapping are selected, based on the output connections.

♦ Converts the data from the source’s native datatype to the most compatible PowerCenter transformation datatype.

♦ Generates a SQL query for relational sources.

♦ Can perform homogeneous joins between relational tables on the same database.

Business Purpose

The use of a Source Qualifier is a product requirement; other types of sources require equivalent transformations (XML Source Qualifier, etc.). It provides an efficient way to filter input fields/columns and to perform homogeneous joins.

Properties

The following table describes the Source Qualifier transformation properties:

Property    Description

Sql Query    Allows you to override the default SQL query that PowerCenter creates at runtime.

User Defined Join    Allows you to specify a join that replaces the default join created by PowerCenter.

Source Filter    Allows you to create a where clause that will be inserted into the SQL query that is generated at runtime. The “where” portion of the statement is not required. For example: Table1.ID = Table2.ID

Number of Sorted Ports    PowerCenter will insert an order by clause in the generated SQL query. The order by will be on the number of ports specified, from the top down. For example, in the sq_Product_Product_Cost Source Qualifier, if the number of sorted ports = 2, the order by will be: ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID.

Tracing Level    Specifies the amount of detail written to the session log.

Select Distinct    Allows you to select distinct values only.

Pre SQL    Allows you to specify SQL that will be run prior to the pipeline being run. The SQL will be run using the connection specified in the session task.

Post SQL    Allows you to specify SQL that will be run after the pipeline has been run. The SQL will be run using the connection specified in the session task.
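To see how these properties shape the query, the sketch below assembles a SELECT statement from a few of them. It is an illustration under stated assumptions, not the SQL PowerCenter actually generates: `build_query` and its parameters are hypothetical names, and the real statement may be formatted differently.

```python
def build_query(columns, tables, user_join="", source_filter="",
                sorted_ports=0, select_distinct=False):
    """Assemble a SELECT statement from Source Qualifier-style properties."""
    select = "SELECT DISTINCT" if select_distinct else "SELECT"
    sql = f"{select} {', '.join(columns)} FROM {', '.join(tables)}"
    # The join condition and the source filter both end up in the WHERE clause.
    conditions = [c for c in (user_join, source_filter) if c]
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    # Order by the first N connected ports, counted from the top down.
    if sorted_ports:
        sql += " ORDER BY " + ", ".join(columns[:sorted_ports])
    return sql


query = build_query(
    columns=["PRODUCT.PRODUCT_ID", "PRODUCT.GROUP_ID", "PRODUCT_COST.PRODUCT_COST"],
    tables=["PRODUCT", "PRODUCT_COST"],
    user_join="PRODUCT.PRODUCT_ID = PRODUCT_COST.PRODUCT_CODE",
    sorted_ports=2,
)
print(query)
```

With `sorted_ports=2` the statement ends in `ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID`, matching the Number of Sorted Ports example in the table.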

Datatype Conversion

Data can be converted from one datatype to another by:

♦ Passing data between ports with different datatypes

♦ Passing data from an expression to a port

♦ Using transformation functions

♦ Using transformation arithmetic operators

Supported conversions are:

♦ Numeric datatypes <=> Other numeric datatypes

♦ Numeric datatypes <=> String

♦ Date/Time <=> Date or String (to convert from string to date, the string must be in the default PowerCenter date format MM/DD/YYYY HH24:MI:SS)

Similarly, when writing to a target the Integration Service converts the data to the target’s native datatype.

For further information, see the PowerCenter Client Help > Index > port-to-port data conversion.
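The string-to-date rule can be tried out in Python as a rough analogy. The sketch assumes that the `strptime` pattern `%m/%d/%Y %H:%M:%S` corresponds to MM/DD/YYYY HH24:MI:SS; the helper names are hypothetical and the Integration Service's own conversion logic is not shown here.

```python
from datetime import datetime
from decimal import Decimal

# Python equivalent of the MM/DD/YYYY HH24:MI:SS pattern.
DEFAULT_DATE_FORMAT = "%m/%d/%Y %H:%M:%S"


def string_to_date(text):
    """String -> date/time; fails unless the text matches the default format."""
    return datetime.strptime(text, DEFAULT_DATE_FORMAT)


def date_to_string(value):
    """Date/time -> string in the same default format."""
    return value.strftime(DEFAULT_DATE_FORMAT)


loaded = string_to_date("12/04/2006 17:30:00")
print(date_to_string(loaded))

# Numeric <=> numeric and numeric <=> string conversions.
total = Decimal("3") + 2
as_text = str(Decimal("12.50"))
print(total, as_text)
```

A string in any other layout, such as `2006-12-04`, raises a `ValueError` in this sketch, which is the analogue of a failed string-to-date conversion.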

Lesson 3-2. Velocity Methodology

In labs, we will use Informatica's Velocity methodology.

This methodology includes:

♦ Templates

♦ Mapping specification templates

♦ Source to target field matrix

♦ Naming conventions

♦ Object type prefixes: m_, exp_, agg_, wf_, s_, …

♦ Best practices
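The prefix conventions are easy to apply mechanically. The sketch below builds names the way the labs in this guide do, with an m_ prefix for mappings, an s_ prefix plus the mapping name for sessions, and a two-digit student suffix; the helper functions themselves are hypothetical and not part of Velocity.

```python
def mapping_name(description, student_number):
    """m_ prefix plus a descriptive name and the lab's two-digit student suffix."""
    return f"m_{description}_{student_number:02d}"


def session_name(mapping):
    """A session is named s_ followed by the name of the mapping it runs."""
    return f"s_{mapping}"


mapping = mapping_name("Stage_Payment_Type", 7)
print(mapping, session_name(mapping))
```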

Velocity covers the entire data integration project life cycle:

Phase 1: Manage

Phase 2: Architect

Phase 3: Design

Phase 4: Build

Phase 5: Test

Phase 6: Deploy

Phase 7: Operate

For more information, see http://devnet.informatica.com (requires registration).

Lab Project

The Mersche Motors data model consists of the following star schemas. The labs predominantly use the Sales star schema.

Data is moved first to the staging area and from there to the data warehouse and target flat files.

The labs can source from flat files and/or a relational database.

Source Tables and Files

The source system has the following relational tables:

DEALERSHIP

PRODUCT

PRODUCT_COST

The source system has the following flat files:

customer_layout

dates

employees_layout

inventory

payment

promotions

sales_transactions

Staging Area

The staging area has the following tables:

STG_CUSTOMERS

STG_DATES

STG_DEALERSHIP

STG_EMPLOYEES

STG_INVENTORY

STG_PAYMENT

STG_PRODUCT

STG_PROMOTIONS

STG_TRANSACTIONS

Data Warehouse

The data warehouse has the following tables:

DIM_CUSTOMERS

DIM_DATES

DIM_DEALERSHIP

DIM_EMPLOYEES

DIM_PAYMENT

DIM_PRODUCT

DIM_PROMOTIONS

FACT_INVENTORY

FACT_PRODUCT_AGG_DAILY

FACT_PRODUCT_AGG_WEEKLY

FACT_PROMOTIONS_AGG_DAILY

FACT_SALES

Architecture and Connectivity

Architecture

The labs use the following architecture and connections:

Integration Service: PC_IService

Repository Name: PC8_DEV

Folders: Student 01 - 20

User Names: student01 - 20

Passwords: student01 - 20

Connectivity

ODBC Connections:

Source Tables: ODBC_TRANS

Staging Area: ODBC_STG (01 - 20)

Data Warehouse: ODBC_EDW (01 - 20)

Native Connections:

Source Tables: NATIVE_TRANS

Staging Area: NATIVE_STG (01 - 20)

Data Warehouse: NATIVE_EDW (01 - 20)

Relational Source: sdbu with password sdbu

Relational Targets: tdbu01 - 20

Passwords: tdbu01 - 20

Unit 3 Lab A: Load Payment Staging Table

Section 1: Pass-Through Mapping

Business Purpose

The staging area of the Mersche Motors data warehouse contains a table that assigns payment type descriptions for each payment ID. Because these descriptions may change, the table must be synchronized daily with the corresponding data located in the operational system. The operational system administrator uses a simple flat file to record and edit these descriptions.

Technical Description

PowerCenter will source from a delimited flat file and insert the data into a database table without performing data transformations. In order to avoid duplicate records in subsequent loads, we will configure PowerCenter to truncate the target table before each load.
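The effect of truncating the target before each load can be demonstrated with a small stand-in built on Python's `sqlite3` module. This is a sketch only: the sample rows are made up, the `load_payments` helper is hypothetical, and SQLite's `DELETE` stands in for the truncate that PowerCenter issues against the real target.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for payment.txt (comma-delimited).
PAYMENT_FILE = "1,Cash\n2,Check\n3,Credit\n4,Debit\n5,Finance\n"


def load_payments(conn, text, truncate_target):
    """Pass rows straight from the flat file to STG_PAYMENT, untransformed."""
    if truncate_target:
        # SQLite has no TRUNCATE statement, so DELETE stands in for it here.
        conn.execute("DELETE FROM STG_PAYMENT")
    rows = list(csv.reader(io.StringIO(text)))
    conn.executemany("INSERT INTO STG_PAYMENT VALUES (?, ?)", rows)
    conn.commit()


def row_count(conn):
    return conn.execute("SELECT COUNT(*) FROM STG_PAYMENT").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STG_PAYMENT (PAYMENT_ID INTEGER, PAYMENT_TYPE_DESC TEXT)"
)

load_payments(conn, PAYMENT_FILE, truncate_target=True)
load_payments(conn, PAYMENT_FILE, truncate_target=True)
with_truncate = row_count(conn)      # the second load replaced the first

load_payments(conn, PAYMENT_FILE, truncate_target=False)
without_truncate = row_count(conn)   # rows from both loads are now present
print(with_truncate, without_truncate)
```

With truncation the table holds one copy of the file after every run; without it, each daily load would add another copy of the same five rows.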

Goals

♦ Open the Designer Tools and switch between Workspaces

♦ Create a simple pass-through mapping

♦ Create a Session task to run the mapping and configure connectivity

♦ Create a Workflow to run the Session task

♦ Run the Workflow and monitor the results

Duration

35 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_Stage_Payment_Type_xx

Source System: Flat file

Target System: Oracle Table

Initial Rows: 5

Rows/Load: 5

Short Description: Simple pass-through mapping, comma-delimited flat-file to Oracle table

Load Frequency:

Preprocessing: Target truncate

Post Processing:

Error Strategy: Default

Reload Strategy:

Unique Source Fields: PAYMENT_ID, PAYMENT_TYPE_DESC

SOURCES

Tables

Table Name    Schema/Owner    Selection/Filter

Files

File Name    File Location    Fixed/Delimited    Additional File Info

payment.txt    C:\pmfiles\SrcFiles    Delimited    Comma delimiter

TARGETS

Tables    Schema Owner TDBUxx

Table Name    Update    Delete    Insert    Unique Key

STG_PAYMENT    X

HIGH LEVEL PROCESS OVERVIEW

Source -> Target

PROCESSING DESCRIPTION (DETAIL)

This is a pass-through mapping with no data transformation.

SOURCE TO TARGET FIELD MATRIX

Target Table    Target Column    Datatype    Source Table    Source Column    Datatype    Expression    Default Value if Nulls    Data Issues/quality

STG_PAYMENT    Payment_id    Number(3,0)    PAYMENT    Payment_id    Decimal(3,0)

STG_PAYMENT    Payment_type_desc    Varchar2(20)    PAYMENT    Payment_type_desc    String(10)

Velocity Best Practice: This is the Velocity Source to Target Field Matrix. It is displayed here for your reference. In future labs we will be using a shortened version of the matrix.

Instructions

Step 1: Launch the Designer and Review the Source and Target Definitions

1. If necessary, launch the Designer application by selecting Start > Programs > Informatica PowerCenter … > Client > PowerCenter Designer.

2. Log into the PC8_DEV repository with the user name studentxx and password studentxx where xx represents your student number as assigned by the instructor.

3. Open your student folder by double-clicking on it.

4. Open the Source Analyzer by selecting the menu option Tools > Source Analyzer.

5. Drag the Shortcut_to_payment source file from the Sources subfolder into the Source Analyzer workspace.

Confirm that your source definition appears the same as displayed in Figure 3-1. You may have to drag the box wider to see the Length column.

6. Open the Target Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:

7. Click the name of the “Shortcut_to_STG_PAYMENT” target table definition and drag it from the Targets subfolder into the Target Designer workspace.

8. Review the target definition.

Step 2: Create a Mapping

1. Open the Mapping Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:

Tip: If an instance of the Designer is already running on your workstation, do not launch another instance. It is unnecessary and potentially confusing to run more than one instance per workstation.

Figure 3-1. Normal view of the payment flat file definition displayed in the Source Analyzer

2. Select the menu option Mappings > Create.

a. Delete the default mapping name and enter the name m_Stage_Payment_Type_xx, where xx refers to your student number.

b. Click OK.

3. Perform the following steps in the Navigator window:

a. Expand the Sources subfolder.

b. Expand the FlatFile node.

c. Drag-and-drop the source Shortcut_to_payment into the mapping.

4. Expand the Targets subfolder, and drag-and-drop the target Shortcut_to_STG_PAYMENT into the mapping.

5. Select the menu option View > Navigator.

This will temporarily remove the Navigator window from view in order to increase your mapping screen space.

Your mapping should appear as displayed in Figure 3-2.

6. Select the Source Qualifier transformation.

a. Drag-and-drop the port PAYMENT_ID from the Source Qualifier to the PAYMENT_ID field in the target definition.

b. Drag-and-drop the Source Qualifier port PAYMENT_TYPE_DESC to the PAYMENT_TYPE_DESC field in the target definition.

7. Right-click in a blank area within the mapping and choose the menu option Arrange All.

Velocity Best Practice: The m_ as a prefix for a mapping name is specified in the Informatica Velocity Best Practices. Mapping names should be clear and descriptive so that others can immediately understand the purpose of the mapping. Velocity suggests either the name of the targets being accessed or a meaningful description of the function of the mapping.

Figure 3-2. Mapping with Source and Target Definitions

Tip: When linking ports in a mapping as described above, ensure that the tip of your mouse cursor is touching a letter in the name or datatype or any property for the port you are dragging.

Your mapping should appear the same as displayed in Figure 3-3.

8. Type Ctrl+S to save your work to the repository.

9. Confirm that your Output Window displays the following message:

*******Mapping m_Stage_Payment_Type is VALID *******

mapping m_Stage_Payment_Type inserted.

-----------------------------------------------------

Step 3: Create a Workflow and a Session Task

1. Launch the Workflow Manager by clicking on the respective icon in the toolbar. The icon is shown highlighted below:

2. Open the Workflow Designer by clicking the respective icon in the toolbar. The icon is shown highlighted below:

3. Select the menu option Workflows > Create.

a. Delete the default Workflow name and enter wkf_Load_STG_PAYMENT_xx (xx refers to your student number).

b. Click OK.

4. Select the menu option Tasks > Create.

a. Select Session from the “Select the task type to create” drop-box.

b. Enter the Session name s_m_Stage_Payment_Type_xx (xx refers to your student number).

Figure 3-3. Normal view of the completed mapping

Tip: The Output Window displays messages about the results of an action taken in the Designer.

Velocity Best Practice: The wkf_ as a prefix for a Workflow name is specified in the Informatica Velocity Methodology.

Velocity Best Practice: The s_ as a prefix for a session name is specified in the Informatica Velocity Methodology. The Velocity recommendation for a session name is s_mappingname.

c. Click the Create button.

d. The Mappings list box shows the mappings saved in your folder. Select the m_Stage_Payment_Type_xx mapping and click OK.

e. Click Done.

5. Drag the newly created session to the middle of the screen.

6. Double-click on the session task that you just created to open it in edit mode.

a. Select the Mapping tab (not the Properties tab).

b. Select the Source Qualifier icon SQ_Shortcut_to_payment (in the Session properties navigator window).

c. In the Properties area scroll down and confirm the source file name and location.

i. Source file directory: $PMSourceFileDir\

ii. Source filename: payment.txt.

d. Select the target Shortcut_to_STG_PAYMENT (in the Session properties navigator window).

e. Using the Connections list box, select the NATIVE_STGXX connection object, where XX represents your student number assigned by the instructor.

f. In the Properties area, confirm that the load type is Bulk.

g. In the Properties area, scroll down until the property Truncate target table option is visible. Select the check-box.

Tip: If you select the session icon from the task toolbar instead of using the Tasks > Create menu option, the client tool will name the session for you, using the correct Velocity standard name: s_mappingname. It will also place the session where you click in the workspace.

Tip: When the Integration Service process runs on UNIX or Linux, the filename is case sensitive.

Tip: Setting the load type to bulk will use the target RDBMS bulk loading facility.

Your session task information should appear similar to that displayed in Figure 3-4.

h. Click OK.

7. Type Ctrl+S to save your work to the repository.

8. Confirm that your Output Window displays the message:

*******Workflow wkf_Load_STG_PAYMENT is INVALID*******

Workflow wkf_Load_STG_PAYMENT inserted

------------------------------------------------------

9. Click the Link Tasks icon in the Tasks Toolbar shown below.

10. Holding down the left mouse button, drag from the Start Task to the s_m_Stage_Payment_Type_xx Session Task and release the mouse. This will establish a link from the Start Task to the Session Task.

Figure 3-4. Completed Session Task Target Properties

Tip: For this section you have created a non-reusable session within the workflow. This session exists only within the context of the workflow.

Your workflow should appear as displayed in Figure 3-5.

11. Type Ctrl+S to save your work to the repository.

Confirm that your Output Window displays the message:

...Workflow wkf_Load_STG_PAYMENT tasks validation completed with no errors.

******* Workflow wkf_Load_STG_PAYMENT is VALID *******

Workflow wkf_Load_STG_PAYMENT updated.

---------------------------------------
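In this step the workflow was reported as INVALID while the session was not yet linked, and VALID after the link from the Start Task was drawn. One way to picture such a check is a reachability test over the links. The sketch below is a simplified assumption about the rule, not the Workflow Manager's actual validation, which covers more than this; `unreachable_tasks` is a hypothetical helper.

```python
def unreachable_tasks(tasks, links, start="Start"):
    """Return the tasks that cannot be reached from the Start task."""
    reached, pending = {start}, [start]
    while pending:
        current = pending.pop()
        for source, destination in links:
            if source == current and destination not in reached:
                reached.add(destination)
                pending.append(destination)
    return sorted(set(tasks) - reached)


tasks = ["Start", "s_m_Stage_Payment_Type_xx"]

# Before step 10: no link, so the session is never reached.
before_link = unreachable_tasks(tasks, links=[])

# After step 10: the link from the Start Task makes the session reachable.
after_link = unreachable_tasks(
    tasks, links=[("Start", "s_m_Stage_Payment_Type_xx")]
)
print(before_link, after_link)
```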

Step 4: Run the Workflow and Monitor the Results

1. Right-click a blank area of the workspace near the Workflow and select Start Workflow.

2. If the Workflow Monitor is already open, the workflow and session will display automatically. However, if the Monitor has just been opened:

a. Right click on the PC8_DEV repository and choose Connect.

b. Log in with your studentxx id and password.

c. Right click on PC_IService and choose Connect.

d. Right click on your Studentxx folder and choose Open.

e. Right click on wkf_Load_STG_PAYMENT_xx and select Open Latest 20 Runs.

3. Maximize the Workflow Monitor.

Note there are two tabs above the Output window: Gantt Chart and Task View.

4. Select Task View.

Your information should appear similar to what is displayed in Figure 3-6.

5. To view the details of the session task:

a. Right-click the task name and select “Get run properties”

b. Scroll down through the Task Details window.

Figure 3-5. Completed Workflow

Figure 3-6. Successful Run of a Workflow Depicted in the Task View of the Workflow Monitor

The task details should appear in the lower-right, as displayed in Figure 3-7.

6. Select the Source/Target Statistics tab. Expand the node for the source and target. Note that for the Source and Target objects in the mapping, there is a count of the rows in various categories, such as Applied Rows (success), Affected (transformed), and Rejected, as well as an estimated throughput speed.

7. Right-click the Session again and select Get Session Log.

a. The session log will be displayed.

b. Review the log and note the variety of information it shows.

c. Close the Session Log.

8. Select the Gantt Chart tab.

Note that the Workflow and the Session are displayed within a horizontal timeline.

Figure 3-7. Properties for the Completed Session Run

Figure 3-8. Source/Target Statistics for the Completed Session Run

Data Results

In the Designer, you can view the data that was loaded into the target.

1. Right-click on the STG_Payment target definition.

2. Select Preview Data.

3. Set the ODBC Data Source drop-box to the ODBC_STG Data Source Name.

4. Enter the user name tdbuxx, where xx represents your student number as assigned by the instructor.

5. Enter the password tdbuxx and click the Connect button.

Your data should appear as displayed in Figure 3-9.

Figure 3-9. Data Preview of the STG_PAYMENT Target Table

Lesson 3-3. Source Qualifier Joins

A Source Qualifier can join data from multiple relational tables on the same database (homogeneous join) if the tables have a primary key-foreign key relationship defined in the Source Analyzer. These columns do not have to be keys on the source database, but they should be indexed for best performance.

The join is performed on the source database at runtime (when SQL generated by the Source Qualifier executes). Joining data in a Source Qualifier allows the Integration Service to read data in multiple tables in a single pass, which can improve session performance.

If there is no PK/FK relationship, you can specify a User Defined Join. Enter the join condition in the Source Qualifier properties, e.g. TableA.EmployeeID = TableB.EmployeeID. By default you get an inner join; use a SQL Query override to specify other join types.

Example

A business sells a high volume of products and updates the Product Dimension table on a regular basis. To update the dimension table, a join of the PRODUCT and PRODUCT_COST tables is required. Since the source tables are from the same database and have a key relationship, only a single Source Qualifier transformation is needed.

Note the primary key-foreign key relationship between the PRODUCT_ID field of the PRODUCT table and the PRODUCT_CODE field of the PRODUCT_COST table.
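The difference between the default inner join and another join type can be seen with a small experiment using Python's `sqlite3` module as a stand-in for the source database. The rows are made up and the column lists are cut down; only the table names and the PRODUCT_ID/PRODUCT_CODE key relationship come from this lesson.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PRODUCT_DESC TEXT);
    CREATE TABLE PRODUCT_COST (
        PRODUCT_CODE INTEGER REFERENCES PRODUCT (PRODUCT_ID),
        PRODUCT_COST REAL
    );
    -- Made-up rows; product 3 deliberately has no cost record.
    INSERT INTO PRODUCT VALUES (1, 'Sedan'), (2, 'Coupe'), (3, 'Wagon');
    INSERT INTO PRODUCT_COST VALUES (1, 18000.0), (2, 21000.0);
""")

# Default behaviour: an inner join on the key relationship.
inner_rows = conn.execute("""
    SELECT PRODUCT.PRODUCT_ID, PRODUCT_COST.PRODUCT_COST
    FROM PRODUCT, PRODUCT_COST
    WHERE PRODUCT.PRODUCT_ID = PRODUCT_COST.PRODUCT_CODE
    ORDER BY PRODUCT.PRODUCT_ID
""").fetchall()

# Any other join type has to be written out, as a SQL Query override would be.
outer_rows = conn.execute("""
    SELECT PRODUCT.PRODUCT_ID, PRODUCT_COST.PRODUCT_COST
    FROM PRODUCT
    LEFT OUTER JOIN PRODUCT_COST
        ON PRODUCT.PRODUCT_ID = PRODUCT_COST.PRODUCT_CODE
    ORDER BY PRODUCT.PRODUCT_ID
""").fetchall()

print(len(inner_rows), len(outer_rows))
```

The inner join drops the product that has no cost row, while the outer join keeps it with a null cost. Which behavior is wanted depends on the target being loaded.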

Performance Considerations

For relational sources, the number of rows processed can be reduced by using SQL override and adding a “WHERE” clause or by the use of the “Source Filter” attribute if not all rows are required. Also, the default SQL generated by the Source Qualifier can be customized to improve performance.
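The row reduction from a source filter can be illustrated with `sqlite3` standing in for the source database. The data is made up (48 rows spread over four hypothetical groups) and `rows_read` is a hypothetical helper; the point is only that the filter is applied by the database before any rows reach the reader.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (PRODUCT_ID INTEGER, GROUP_ID INTEGER)")
# Made-up data: 48 products spread over 4 groups.
conn.executemany(
    "INSERT INTO PRODUCT VALUES (?, ?)",
    [(i, i % 4) for i in range(1, 49)],
)


def rows_read(source_filter=""):
    """Count the rows the source database would hand back to the reader."""
    sql = "SELECT PRODUCT_ID, GROUP_ID FROM PRODUCT"
    if source_filter:
        # The filter text is supplied without the WHERE keyword.
        sql += " WHERE " + source_filter
    return len(conn.execute(sql).fetchall())


unfiltered = rows_read()
filtered = rows_read("PRODUCT.GROUP_ID = 2")
print(unfiltered, filtered)
```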

Tip: To improve performance, only connect those ports necessary to produce the final output.

Unit 3 Lab B: Load Product Staging Table

Section 2: Homogeneous Join

Business Purpose

There are two Oracle tables that together contain vital information about the products sold by Mersche Motors. You need to combine the data from both tables into a single staging table that can be used as a source of data for the data warehouse.

Technical Description

PowerCenter will define a homogeneous join between the two Oracle source tables. The source database server will perform an inner join on the tables based on a join statement automatically generated by the Source Qualifier. The join set will be loaded into the staging table.

Goals

♦ Import relational source definitions

♦ View relationships between relational sources

♦ Use a Source Qualifier to define a homogeneous join and view the statement

Duration

30 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_Stage_Product_xx

Source System: Oracle Tables

Target System: Oracle Table

Initial Rows: 48

Rows/Load: 48

Short Description: This mapping joins the product table and the product cost table and loads data to the staging area

Load Frequency: Once

Preprocessing: Target truncate

Post Processing:

Error Strategy: Default

Reload Strategy:

Unique Source Fields: PRODUCT.PRODUCT_ID, PRODUCT_COST.PRODUCT_CODE

SOURCES

Tables

Table Name    Schema/Owner    Selection/Filter

PRODUCT    SDBU    N/A

PRODUCT_COST    SDBU    N/A

TARGETS

Tables    Schema Owner TDBUxx

Table Name    Update    Delete    Insert    Unique Key

STG_PRODUCT    X

HIGH LEVEL PROCESS OVERVIEW

Source 1, Source 2 -> Target

PROCESSING DESCRIPTION (DETAIL)

This is a simple mapping that joins the PRODUCT table to the PRODUCT_COST to obtain cost information for each product. The product details along with the cost details are loaded into the staging area.

SOURCE TO TARGET FIELD MATRIX

Target Table Target Column Source Table Source Column Expression Default Value if Null

STG_PRODUCT PRODUCT_ID PRODUCT PRODUCT_ID

STG_PRODUCT GROUP_ID PRODUCT GROUP_ID

STG_PRODUCT PRODUCT_DESC PRODUCT PRODUCT_DESC

STG_PRODUCT GROUP_DESC PRODUCT GROUP_DESC

STG_PRODUCT DIVISION_DESC PRODUCT DIVISION_DESC

STG_PRODUCT SUPPLIER_DESC PRODUCT_COST SUPPLIER_DESC

STG_PRODUCT COMPONENT_ID PRODUCT_COST COMPONENT_ID

STG_PRODUCT PRODUCT_COST PRODUCT_COST PRODUCT_COST

Instructions

Step 1: Import the Source Definitions

1. Open the Source Analyzer workspace in the Designer.

a. Right click in the workspace and select Clear All.

b. Choose the menu option Sources > Import from Database.

i. Set the ODBC Data Source drop-box to the ODBC_TRANS Data Source Name.

ii. Enter the user name sdbu.

iii. Tab down into Owner name and confirm that it defaults to the user name entered above.

iv. Enter the password sdbu and click the Connect button.

v. Expand the node in the Select tables area, and expand the TABLES node.

vi. Import the relational tables PRODUCT and PRODUCT_COST.

2. Save your work.

Your Source Analyzer should appear as displayed in Figure 3-10.

Tip: You can select multiple objects for simultaneous import by using the Ctrl key.

Figure 3-10. Source Definitions with a PK/FK Relationship Displayed in the Source Analyzer

Tip: The arrow connecting the keys PRODUCT_ID and PRODUCT_CODE denotes a relationship stored in the Informatica repository. By default, referential integrity (primary to foreign key) relationships defined on a database are imported when all of the tables in the relationship are imported. The arrow head is on the Primary key end (Parent / Independent / ‘one’ end) of the relationship.

Tip: It is generally not good practice to create two different tables with the same primary key. In sound database design, this kind of ‘vertical partitioning’ of a table (splitting its columns across two tables that share a key) is usually justified only for security or performance reasons. Separating products from their product costs meets neither criterion. These two tables do, however, give you a very good example of using a homogeneous join in a mapping.


Step 2: Import the Relational Target Definition

1. Open the Target Designer.

a. Right click in the workspace and select Clear All.

b. Choose the menu option Targets > Import from Database.

i. Connect using the ODBC Data Source ODBC_STG, the user name tdbuxx and the password tdbuxx, where xx represents your student number.

ii. Import the relational target definition STG_PRODUCT.

2. Save your work.

Step 3: Create the Mapping

1. Open the Mapping Designer.

2. If a mapping is visible in the workspace, close it by choosing the menu option Mappings > Close.

3. Create a new mapping named m_Stage_Product_xx. For further details about how to do this, see Step 2, “Create a Mapping” on page 38.

4. Choose the menu option Tools > Options.

a. Select the Tables tab.

i. Set the Tools drop-box at the top to Mapping Designer.

ii. Uncheck the check-box Create Source Qualifiers when opening Sources.

iii. Click OK.

5. Add the source definitions PRODUCT and PRODUCT_COST to the mapping. You may need to display the navigator window by selecting the menu option View > Navigator.

6. Create a Source Qualifier transformation by clicking on the appropriate icon in the transformation toolbar and then clicking in the workspace. The icon is shown highlighted below:

7. In the Select Sources for Source Qualifier Transformation dialog-box, confirm that both sources are selected and click OK.

8. Double-click the Source Qualifier to enter edit mode.

9. Click the rename button and change the name to sq_Product_Product_Cost.

10. Add the target definition STG_PRODUCT to the mapping.

11. Link each of the output ports in the Source Qualifier to the input port in the target that has the same name (e.g., PRODUCT_ID linked to PRODUCT_ID). Note: Do not link PRODUCT_CODE.

Tip: The check-box described above allows you to specify whether a Source Qualifier transformation will be created automatically every time a Source definition is added to the mapping. Generally, this option is turned off when it is desired to add several relational Sources to the mapping and create a single Source Qualifier to join them.


12. Link the COST port to the PRODUCT_COST port.

13. Save your mapping and confirm that it is valid. Note that the PRODUCT_CODE port in the Source Qualifier is intended to be unlinked, as it is not required in the target.

Confirm that your mapping appears the same as displayed in Figure 3-11.

14. Edit the Source Qualifier.

a. Click on the Properties tab.

b. Open the SQL Query Editor by clicking the arrow in the SQL Query property.

c. Click the Generate SQL button. Note that the join statement can now be previewed, and that it is an inner join. Also note that the PRODUCT_CODE column is not in the SELECT statement; this is because the column is not linked in the mapping and is not needed.

Your SQL Editor should appear as displayed in Figure 3-12.

15. Click OK twice.

Figure 3-11. Normal View of the Completed Mapping

Figure 3-12. Generated SQL for the m_Stage_Product Mapping


16. Save your work.

Step 4: Create the Session and Workflow

1. From the Workflow Manager application, open the Workflow Designer tool.

2. If a Workflow is visible in the workspace, close it by choosing the menu option Workflows > Close.

3. Create a new Workflow named wkf_Stage_Product_xx.

For further details about how to do this, see Step 3, “Create a Workflow and a Session Task” on page 40.

4. Create a new Session by clicking on the appropriate icon in the task toolbar and then clicking in the workspace. The icon is shown highlighted below:

Select the mapping m_Stage_Product_xx for the Session.

5. Edit the Session.

a. In the Mapping tab:

i. Set the relational source connection object property to NATIVE_TRANS.

ii. Set the relational target connection object property to NATIVE_STGxx where xx is your student number.

iii. Check the property Truncate target table option in the target properties.

iv. In the Properties area, confirm that the load type is Bulk.

6. Link the Start task to the Session task.

For further details about how to do this, see Step 3, “Create a Workflow and a Session Task” on page 40.

7. Right click in the workspace and select Arrange > Horizontal.

8. Save your work.

Step 5: Run the Workflow and Monitor the Results

1. Start the workflow.

Tip: It is generally not a good practice to save the generated SQL unless there is a need to override it. If you cancel out of the SQL editor, then at runtime the session creates what is called the 'default query', which is based on the ports and their links in the mapping. If you click OK and leave some SQL in the editor window, you have overridden the default query. From then on, any time you link a new port out of the Source Qualifier, you have to go back in and regenerate the SQL.

Tip: The relationship between PRODUCT_ID and PRODUCT_CODE was used to generate the inner join statement. If you desire to join two source tables on two columns that are not keys, you may establish a relationship between them by dragging the foreign key to the primary key column in the Source Analyzer. You may also modify the join statement to make it an outer join.
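The difference between the generated inner join and an outer join can be pictured with a small Python sketch. It uses an in-memory SQLite database in place of the lab's Oracle schema. The sample rows and column types are made up for illustration, and the SQL only approximates the kind of statement the SQL Editor generates; it is not a copy of it.

```python
import sqlite3

# In-memory stand-in for the source schema (sample data is hypothetical).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE PRODUCT (PRODUCT_ID INTEGER PRIMARY KEY, PRODUCT_DESC TEXT);
    CREATE TABLE PRODUCT_COST (
        PRODUCT_CODE INTEGER REFERENCES PRODUCT (PRODUCT_ID),
        COST REAL
    );
    INSERT INTO PRODUCT VALUES (1, 'Sedan'), (2, 'Coupe'), (3, 'Wagon');
    INSERT INTO PRODUCT_COST VALUES (1, 18000.0), (2, 21000.0);
""")

# Inner join on the key relationship. PRODUCT_CODE is used only in the
# join condition and is not selected, because nothing downstream needs it.
inner_rows = conn.execute("""
    SELECT PRODUCT.PRODUCT_ID, PRODUCT.PRODUCT_DESC, PRODUCT_COST.COST
    FROM PRODUCT, PRODUCT_COST
    WHERE PRODUCT_COST.PRODUCT_CODE = PRODUCT.PRODUCT_ID
    ORDER BY PRODUCT.PRODUCT_ID
""").fetchall()

# Outer join variant: products without a cost row are kept, with a NULL cost.
outer_rows = conn.execute("""
    SELECT PRODUCT.PRODUCT_ID, PRODUCT.PRODUCT_DESC, PRODUCT_COST.COST
    FROM PRODUCT LEFT OUTER JOIN PRODUCT_COST
      ON PRODUCT_COST.PRODUCT_CODE = PRODUCT.PRODUCT_ID
    ORDER BY PRODUCT.PRODUCT_ID
""").fetchall()

print(inner_rows)
print(outer_rows)
```

In this sketch the product without a cost row disappears from the inner join but survives the outer join with an empty cost.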


Confirm that your Task Details appear the same as displayed in Figure 3-13.

2. Confirm that your Source/Target Statistics appear the same as displayed in Figure 3-14.

3. Using the Preview Data option in the Designer, confirm that your target data appears the same as displayed in Figure 3-15. Be sure to login with user tdbuxx.

Figure 3-13. Properties of the Completed Session Run

Figure 3-14. Source/Target Statistics for the Completed Session Run

Figure 3-15. Data Preview of the STG_PRODUCT Target Table


Lesson 3-4. Source Pipelines


Unit 3 Lab C: Load Dealership and Promotions Staging Table

Section 3: Two Pipeline Mapping

Business Purpose

Two staging tables, Dealership and Promotions, must be loaded: one from a relational table and one from a flat file.

Technical Description

Both loads use simple pass-through logic, as in Lab A, so we will combine them into one mapping. Even though two sources and two targets are involved, only one Session is required to run this mapping.

Goals

♦ Import a fixed-width flat file definition

♦ Define two data flows within one mapping

Duration

20 minutes


Velocity Deliverable: Mapping Specifications

SOURCES

TARGETS

Mapping Name m_Dealership_Promotions_xx

Source System Flat file, Oracle Table Target System Oracle Table

Initial Rows 6, 31 Rows/Load 6, 31

Short Description Simple pass-through mapping with 2 pipelines. One pipeline extracts from a flat file and loads to an Oracle table. The second pipeline extracts from an Oracle table and loads an Oracle table.

Load Frequency

Preprocessing Target truncate

Post Processing

Error Strategy Default

Reload Strategy

Unique Source

Fields

DEALERSHIP.DEALERSHIP_ID, PROMOTIONS.PROMO_ID

Tables

Table Name Schema/Owner Selection/Filter

DEALERSHIP SDBU N/A

Files

File Name File Location Fixed/Delimited Additional File Info

promotions.txt C:\pmfiles\SrcFiles Fixed

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

STG_DEALERSHIP X

STG_PROMOTIONS X


HIGH LEVEL PROCESS OVERVIEW

PROCESSING DESCRIPTION (DETAIL)

This is a pass-through mapping with no data transformation.

SOURCE TO TARGET FIELD MATRIX

Target Table Target Column Source Table Source Column Expression Default Value if Null

STG_DEALERSHIP DEALERSHIP_ID DEALERSHIP DEALERSHIP_ID

STG_DEALERSHIP DEALERSHIP_MANAGER_ID DEALERSHIP DEALERSHIP_MANAGER_ID

STG_DEALERSHIP DEALERSHIP_DESC DEALERSHIP DEALERSHIP_DESC

STG_DEALERSHIP DEALERSHIP_LOCATION DEALERSHIP DEALERSHIP_LOCATION

STG_DEALERSHIP DEALERSHIP_STATE DEALERSHIP DEALERSHIP_STATE

STG_DEALERSHIP DEALERSHIP_REGION DEALERSHIP DEALERSHIP_REGION

STG_DEALERSHIP DEALERSHIP_COUNTRY DEALERSHIP DEALERSHIP_COUNTRY

STG_PROMOTIONS PROMO_ID PROMOTIONS PROMO_ID

STG_PROMOTIONS PROMO_DESC PROMOTIONS PROMO_DESC

STG_PROMOTIONS PROMO_TYPE PROMOTIONS PROMO_TYPE

STG_PROMOTIONS START_DATE PROMOTIONS START_DATE

STG_PROMOTIONS EXPIRY_DATE PROMOTIONS EXPIRY_DATE

STG_PROMOTIONS PROMO_COST PROMOTIONS PROMO_COST

STG_PROMOTIONS DISCOUNT PROMOTIONS DISCOUNT

Source 1

Source 2

Target 1

Target 2


Instructions

Step 1: Import the Source Definitions

1. Import the relational source definition DEALERSHIP. (DO NOT use a shortcut.) For further details about how to do this, see Step 1, “Import the Source Definitions” on page 52.

2. Edit the source definition for the promotions.txt file in the Source Analyzer.

a. Click the Advanced button in the lower right of the edit box.

b. Make sure that the number of bytes to skip between records is set to 2.

Confirm that your promotions source definition appears the same as displayed in Figure 3-16.

Step 2: Import the Target Definitions

Import the relational target definitions STG_DEALERSHIP and STG_PROMOTIONS. For further details about how to do this, see Step 2, “Import the Relational Target Definition” on page 53.

Step 3: Create the Mapping

1. Create a mapping named m_Dealership_Promotions_xx.

a. Make sure that the option to Create Source Qualifiers when Opening Sources is checked (on). For further details about how to do this, see Step 3, “Create the Mapping” on page 53.

b. Add the Dealership and Promotions source definitions to the mapping.

c. Confirm that a Source Qualifier was created for each.

d. Add the STG_DEALERSHIP and STG_PROMOTIONS target definitions to the mapping.

e. Link the appropriate Source Qualifier ports to the target ports.

2. Save the mapping and confirm that it is valid.

3. Right-click in a blank area within the mapping and choose the menu option Arrange All Iconic.

Note: A fixed-width flat file has bytes at the end of each row that represent a carriage return and/or a line feed. Depending on the system on which the file was created, you need to skip the appropriate number of bytes. If you don't, your result set will be offset by 1 or 2 bytes. For files created on a mainframe, set the value to 0; for UNIX/Linux, set the value to 1; for all others, set the value to 2.
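The effect of the skip setting can be illustrated with a minimal Python sketch. The function name, the field widths and the sample data are all hypothetical; the code only imitates the idea of reading fixed-width records and skipping a set number of bytes between them, and makes no claim about how PowerCenter reads the file internally.

```python
def split_fixed_width(data, widths, skip):
    """Cut a byte string into fixed-width records, skipping `skip` bytes after each."""
    record_len = sum(widths)
    stride = record_len + skip
    rows = []
    for start in range(0, len(data), stride):
        record = data[start:start + record_len]
        if len(record) < record_len:
            break  # ignore a trailing partial record
        fields, pos = [], 0
        for width in widths:
            fields.append(record[pos:pos + width].decode("ascii").rstrip())
            pos += width
        rows.append(fields)
    return rows


# Two 14-byte records, each followed by a carriage return and a line feed.
sample = b"P01Spring Sale\r\nP02Summer Deal\r\n"

good = split_fixed_width(sample, [3, 11], skip=2)  # terminator skipped
bad = split_fixed_width(sample, [3, 11], skip=0)   # terminator read as data

print(good)
print(bad)
```

With skip=0 the second record starts two bytes early, so every field after the first row is shifted, which is the offset the note warns about.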

Figure 3-16. Normal view of the promotions flat file definition displayed in the Source Analyzer


Confirm that your mapping appears as displayed in Figure 3-17.

Step 4: Create and Run the Workflow

1. Create a workflow named wkf_Load_Stage_Dealership_Promotions_xx.

2. Create a Session Task named s_m_Dealership_Promotions_xx that uses the mapping m_Dealership_Promotions_xx.

3. Edit the Session.

a. Set the database connection objects for the sources and targets in the Session. Note that both of the relational target database connections need to be set separately. For further details about how to do this, see “Create a Workflow and a Session Task” on page 40 and “Create the Session and Workflow” on page 55.

b. Confirm that the source location information for the Promotions flat file is set correctly. For further details about how to do this, see “Create a Workflow and a Session Task” on page 40.

c. Check the property Truncate target table option in the target properties.

4. Complete the Workflow, save it, and run it.

Confirm that your Task Details appear the same as displayed in Figure 3-18.

Figure 3-17. Iconic View of the Completed Mapping

Figure 3-18. Properties of the Completed Session Run


Confirm that your Source/Target Statistics appear the same as displayed in Figure 3-19.

5. Preview the target data with user tdbuxx. It should appear the same as Figure 3-20 and Figure 3-21:

Figure 3-19. Source/Target Statistics for the Completed Session Run

Figure 3-20. Data Preview of the STG_DEALERSHIP Target Table


Figure 3-21. Data Preview of the STG_PROMOTIONS Target Table


Unit 4: Expression, Filter, File Lists, and Workflow Scheduler

When you have completed this unit, you should be able to:

♦ Describe these features:

♦ Expression transformation

♦ Filter transformation

♦ File list

♦ Workflow scheduler

♦ Use these features in a mapping or workflow

Lesson 4-1. Expression Transformation

Type

Passive (does not change the number of rows).

Description

The Expression transformation lets you modify individual ports of a single row (or columns within a single row). It also lets you add and suppress ports. It cannot perform aggregation across multiple rows (use the Aggregator transformation).

Business Purpose

You can modify ports using logical and arithmetic operators or built-in functions for:

♦ Character manipulation (concatenate, truncate, etc.)

♦ Datatype conversion (to char, to date, etc.)


♦ Data cleansing (check nulls, replace string, etc.)

♦ Data manipulation (round, truncate, etc.)

♦ Numerical calculations (exponential, power, log, modulus, etc.)

♦ Scientific calculations (sine, cosine, etc.)

♦ Special (lookup, decode, etc.)

♦ Test (for spaces, number, etc.)

For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. You can use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

Expression Editor Interface, Variables, and Validation

The Expression Editor Interface (shown below) helps the developer to construct an expression. Expressions can include numeric and logical operators, functions, ports, and variables.

The Expression Editor provides:

♦ Numeric and arithmetic/logical operator keypads.

♦ Functions tab for built-in functions.

♦ Ports tab for port values.

♦ Variables tab for mapping and system variables.


Expressions resolve to a single value of a specific datatype. For example, the expression LENGTH('HELLO WORLD') / 2 returns a value of 5.5. The LENGTH function counts every character in the string, including the blank space, and returns 11.
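The same arithmetic can be checked outside PowerCenter. The short Python sketch below is only an analogy: Python's len stands in for LENGTH, and the variable names are arbitrary.

```python
text = "HELLO WORLD"
length = len(text)    # counts every character, including the space
result = length / 2   # true division, so the result is a decimal value

print(length, result)
```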

Tip: Highlighting a function and pressing F1 will launch the online help and open it at the highlighted function section.


Variables and Scope

A transformation variable is created by creating a port and selecting the V check box. When V is checked, the I and O check boxes are grayed out. This indicates that a variable port is neither an input nor an output port.


When a record is processed, the expression is evaluated and the result is assigned to the variable port. The result must be compatible with the datatype selected for the port; otherwise, an error is generated.

The variable persists across the entire set of records that traverse the transformation. It may be used or modified anywhere in the set of data that is being processed.
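A rough way to picture a variable that persists across rows is a running counter. The Python sketch below is an analogy only: the generator, the name v_change_count and the sample item names are hypothetical, and Python's str.title() stands in loosely for a title-case function.

```python
def clean_item_names(item_names):
    """Yield (cleaned_name, running_change_count) for each incoming row."""
    v_change_count = 0  # plays the role of a variable port: its value survives between rows
    for name in item_names:
        cleaned = name.title()
        if cleaned != name:
            v_change_count += 1  # the previous row's value is still available here
        yield cleaned, v_change_count


rows = list(clean_item_names(["brake pad", "Oil Filter", "WIPER BLADE"]))
print(rows)
```

Each row sees the count left behind by the rows before it, which is what makes a running total of changes possible.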

Example 1

Check, Clean and Record Errors

Suppose that we want to:

♦ Clean Up Item Name: The Accounts Receivable department is tired of generating reports with an inconsistent set of Item Names. Some are in UPPERCASE while others are in lower case; still others are in mixed case. They would like to see all of the data in Title Case. They would also like a count of how many changes have been made.

♦ Missing Data: The Systems and Application group is concerned that occasionally some incomplete data is sent to end users. They would like to tag each such record as an error and be able to report on and investigate the data where critical fields are missing.

♦ Invalid Dates: Due to application issues, dates are occasionally not valid. The AR department, as well as the auditors, is very concerned about this issue. They want every record with a bad date tagged and reported on.

♦ Invalid Numbers: The Sales Department is concerned that they occasionally see non-numeric data in a sales discount report where they expect to see numeric data. Find all errors and tag the records.


The mapping could use the following functions and expressions:

Example 2

Calculate Sales Discounting and Inventory Days

Suppose we want to calculate:

♦ Discount Tracking: Sales Management would like to compare the amount of the suggested sell price to the actual sell price to determine the level of discounting. They plan to do this via a report. They would like a field developed that calculates sales discount.

♦ Days in Inventory: The Sales and Marketing departments would like to be able to determine how long an item was in inventory.

The mapping could use the following functions and expressions.

Performance Considerations

Multiple identical conversions of the same data should be avoided.

Ports that do not need modification should bypass the Expression transformation to save buffers.

Example 1 functions and expressions:

REQ 1
Functions Used: INITCAP
Notes: The INITCAP function will place the text in title case.
Expression: INITCAP(ITEM_NAME)

REQ 2
Functions Used: ISNULL, LENGTH, IS_SPACES
Notes: ISNULL will check for a NULL, while IS_SPACES will look for a string that consists entirely of spaces. If Length = 0, then the string is empty.
Expression: ISNULL (port_name) OR LENGTH (port_name) = 0 OR IS_SPACES(port_name)

REQ 3
Functions Used: IS_DATE
Notes: The IS_DATE function will check the input and determine if the date is valid.
Expression: IS_DATE ("03/01/2005","MM/DD/YYYY")

REQ 4
Functions Used: IS_NUMBER
Notes: The IS_NUMBER function will check the input and determine if the number is valid.
Expression: IS_NUMBER ("3.1415")
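To make the intent of the Example 1 checks concrete, here is a minimal Python sketch of comparable tests. The helper names are hypothetical, and the Python built-ins are only approximations of the PowerCenter functions: for instance, float() accepts some inputs (such as "inf") that a stricter numeric test might reject, and str.title() may not capitalize exactly as INITCAP does.

```python
from datetime import datetime


def is_missing(value):
    """Rough analogue of ISNULL(x) OR LENGTH(x) = 0 OR IS_SPACES(x)."""
    return value is None or len(value) == 0 or value.isspace()


def is_date(value, fmt="%m/%d/%Y"):
    """Rough analogue of IS_DATE(value, 'MM/DD/YYYY')."""
    try:
        datetime.strptime(value, fmt)
        return True
    except (TypeError, ValueError):
        return False


def is_number(value):
    """Rough analogue of IS_NUMBER(value)."""
    try:
        float(value)
        return True
    except (TypeError, ValueError):
        return False


title_name = "WIPER blade".title()  # title-case clean-up, similar in spirit to INITCAP
print(title_name, is_missing("   "), is_date("02/30/2005"), is_number("3.1415"))
```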

Example 2 functions and expressions:

REQ 1
Functions Used: Arithmetic
Notes: This is an Arithmetic expression.
Expression: ( ( MSRP - ACTUAL ) / MSRP ) * 100

REQ 2
Functions Used: TO_DATE, DATE_DIFF
Notes: TO_DATE will convert a string into an Informatica Internal Date and the DATE_DIFF function will calculate the difference between two dates in the units specified in the format operand. Here, the difference is returned in days because the format is 'DD'.
Expression: DATE_DIFF ( TO_DATE (SOLD_DT, 'MM/DD/YYYY'),TO_DATE (INVENTORY_DT, 'MM/DD/YYYY'),'DD' )
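The two Example 2 calculations can be worked through with made-up numbers. The Python sketch below is illustrative only: the function names and sample values are hypothetical, and datetime.strptime is used as a stand-in for TO_DATE with the 'MM/DD/YYYY' format.

```python
from datetime import datetime


def discount_percent(msrp, actual):
    """( ( MSRP - ACTUAL ) / MSRP ) * 100"""
    return (msrp - actual) / msrp * 100


def days_in_inventory(sold_dt, inventory_dt, fmt="%m/%d/%Y"):
    """Whole days between the inventory date and the sold date."""
    sold = datetime.strptime(sold_dt, fmt)
    received = datetime.strptime(inventory_dt, fmt)
    return (sold - received).days


discount = discount_percent(20000, 15000)
days = days_in_inventory("03/01/2005", "02/01/2005")
print(discount, days)
```

With these sample values, the item sold at a 25 percent discount after 28 days in inventory.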


Lesson 4-2. Filter Transformation

Type

Active.

Description

The Filter transformation allows rows which meet the filter condition to pass through the transformation. Rows which do not meet the filter condition are skipped.

Business Purpose

A business may choose not to process records that fail a data quality criterion, such as records containing a null value in a field that would cause a target constraint violation. It may also choose to eliminate records whose date field values will not provide useful data.

Example 1

Existing customer dimension records need to be updated to reflect changes to columns like address. However, only existing customer records are to be updated. The following example uses a Lookup to verify that the customer exists and a Filter to skip records that do not have an existing customer id (MSTR_CUST_ID). An Update Strategy tags the records that pass the filter condition for update.
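The lookup, filter and tag-for-update sequence described in this example can be sketched in plain Python. Everything here is hypothetical: the dictionary stands in for the lookup table, and the field names cust_key and action are invented for illustration.

```python
# Stand-in for the lookup table: source key -> MSTR_CUST_ID (made-up data).
existing_customers = {"C100": 1, "C200": 2}


def rows_to_update(incoming_rows):
    """Keep only rows for existing customers and tag them for update."""
    tagged = []
    for row in incoming_rows:
        mstr_cust_id = existing_customers.get(row["cust_key"])  # lookup step
        if mstr_cust_id is None:                                # filter step
            continue  # no existing customer, so the row is skipped
        # update-strategy step: mark the surviving row as an update
        tagged.append({**row, "MSTR_CUST_ID": mstr_cust_id, "action": "UPDATE"})
    return tagged


incoming = [
    {"cust_key": "C100", "address": "1 Main St"},
    {"cust_key": "C999", "address": "9 Elm St"},
]
updates = rows_to_update(incoming)
print(updates)
```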


Performance Considerations

Filter out records that do not meet the selection criterion as early as possible in a mapping to reduce the number of rows processed and shorten run time. In fact, any active transformation that decreases the number of rows should be placed as early as possible in the mapping to reduce the total number of rows passing through it and improve performance. (Note that some active transformations, such as the Normalizer and the Router, can increase the number of rows.)


Lesson 4-3. File Lists

In a Session task, you can set the source instance to point to a file list (list of flat files or XML files).

♦ The session processes each file in turn.

♦ The properties of all files must match the source definition.

♦ Variables and wild cards are not allowed.

♦ All of the files must exist.

Sample file list:

d:\data\eastern_trans.txt

e:\data\midwest_trans.txt

f:\data\canada_trans.txt
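The rules above (one path per line, each file read in turn, every file must exist) can be sketched in Python. The function name and the temporary sample files are hypothetical; this is a model of the behaviour described here, not of how the Integration Service implements it.

```python
import os
import tempfile


def read_file_list(list_path):
    """Read every file named in a file list, in order, and return all rows."""
    with open(list_path) as handle:
        paths = [line.strip() for line in handle if line.strip()]
    # All of the files must exist before any of them is processed.
    missing = [path for path in paths if not os.path.isfile(path)]
    if missing:
        raise FileNotFoundError(f"files named in the list are missing: {missing}")
    rows = []
    for path in paths:  # each file is processed in turn
        with open(path) as source:
            rows.extend(line.rstrip("\n") for line in source)
    return rows


# Build two small sample files and a file list in a temporary folder.
with tempfile.TemporaryDirectory() as folder:
    names = ["eastern_trans.txt", "midwest_trans.txt"]
    for name in names:
        with open(os.path.join(folder, name), "w") as f:
            f.write(f"row from {name}\n")
    list_path = os.path.join(folder, "file_list.txt")
    with open(list_path, "w") as f:
        f.write("\n".join(os.path.join(folder, n) for n in names) + "\n")
    all_rows = read_file_list(list_path)

print(all_rows)
```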


Lesson 4-4. Workflow Scheduler

Workflows can be scheduled to run at regular intervals.

Run Options


♦ Run on Integration Service Initialization: runs the workflow each time the Integration Service initializes, and then schedules it based on the other options.

♦ Run on demand: runs the workflow only when it is started manually.

♦ Run continuously: runs the workflow in a continuous mode. When the workflow finishes, it starts again from the beginning.


Unit 4 Lab: Load the Customer Staging Table

Business Purpose

The staging area for Mersche Motors data warehouse has a customer contacts table. Mersche Motors receives new data from their regional sales offices daily in the form of three text files. The text files are identical. For processing simplicity, Mersche Motors will be making use of the PowerCenter ability to read a list of files from a single source. The mapping that will do this will run on a nightly schedule at midnight.

Technical Description

PowerCenter will source from a file list. This file list contains the names of three delimited flat files from the regional sales offices. All rows with a customer number of 99999 will need to be filtered out. There are a number of columns whose data will need to be reformatted; this will involve substrings, concatenation and decodes. The target will be truncated until the mapping is fully tested.

Goals

♦ Create a Filter transformation to eliminate unwanted rows from a flat file source

♦ Create an Expression transformation to reformat incoming rows before they are written to a target

♦ Use the DECODE function as a small lookup to replace values for incoming data before writing to target

♦ Create a session task that will accept a file list as a source

♦ Create a workflow that can run on a schedule

Duration

60 Minutes


Velocity Deliverable: Mapping Specifications

SOURCES

TARGETS

HIGH LEVEL PROCESS OVERVIEW

Mapping Name m_Stage_Customer_Contacts_xx

Source System Flat files Target System Oracle Table

Initial Rows 6184 Rows/Load 6177

Short Description Flat file list (customer_east.txt, customer_west.txt, customer_central.txt) comma delimited files that need to be filtered and reformatted before they are loaded into the target table.

Load Frequency Scheduled run every night at midnight.

Preprocessing

Post Processing

Error Strategy Default

Reload Strategy

Unique Source

Fields

Files

File Name File Location Fixed/Delimited Additional File Info

customer_central.txt, customer_east.txt, customer_west.txt (definition in customer_layout.txt) C:\pmfiles\SrcFiles Delimited These 3 comma delimited flat files will be read into the session using a filelist named customer_list.txt. The layout of the flat files can be found in customer_layout.txt

customer_list.txt C:\pmfiles\SrcFiles NA File list

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

STG_CUSTOMERS X

Source -> Filter -> Expression -> Target


PROCESSING DESCRIPTION (DETAIL)

Customer inquiries are captured using customer_no 99999. The mapping will filter out the customer inquiries. This mapping will reformat the customer names, gender and telephone number columns.

SOURCE TO TARGET FIELD MATRIX

Target Table Target Column Source Table Source Column Expression Default Value if Null

STG_CUSTOMER CUST_ID customer_layout CUSTOMER_NO Any customer numbers that are equal to 99999 are considered customer inquiries and not part of the sales transactions.

STG_CUSTOMER CUST_NAME customer_layout FIRSTNAME, LASTNAME The CUST_NAME column is a concatenation of the firstname and lastname ports.

STG_CUSTOMER CUST_ADDRESS customer_layout ADDRESS

STG_CUSTOMER CUST_CITY customer_layout CITY

STG_CUSTOMER CUST_STATE customer_layout STATE

STG_CUSTOMER CUST_ZIP_CODE customer_layout ZIP

STG_CUSTOMER CUST_COUNTRY customer_layout COUNTRY

STG_CUSTOMER CUST_PHONE customer_layout PHONE_NUMBER The CUST_PHONE is a reformat of the PHONE_NUMBER column. The PHONE_NUMBER column is in the format of 9999999999 and needs to be reformatted to (999) 999-9999.

STG_CUSTOMER CUST_GENDER customer_layout GENDER The CUST_GENDER is derived from the decoding of the GENDER column. The GENDER column is a 1 character column that contains either 'M' (male) or 'F' (female). Any other values will resolve to 'UNK'.

STG_CUSTOMER CUST_AGE_GROUP customer_layout AGE The CUST_AGE_GROUP is derived from the decoding of AGE column. The valid age groups are less than 20, 20 to 29, 30 to 39, 40 to 49, 50 to 60 and GREATER than 60.

STG_CUSTOMER CUST_INCOME customer_layout INCOME

STG_CUSTOMER CUST_E_MAIL customer_layout EMAIL

STG_CUSTOMER CUST_AGE customer_layout AGE
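The reformatting rules in the field matrix can be prototyped in a few lines of Python before they are built as expressions. This is a sketch under stated assumptions: the function names are hypothetical, the space between first and last name is an assumption, and the output labels (such as 'MALE', 'FEMALE' and the age group strings) are invented, since the matrix specifies only the 'UNK' value and the group boundaries.

```python
def format_name(first, last):
    """Concatenate first and last name (the separating space is an assumption)."""
    return f"{first} {last}"


def format_phone(phone_number):
    """Reformat 9999999999 as (999) 999-9999."""
    return f"({phone_number[:3]}) {phone_number[3:6]}-{phone_number[6:]}"


def decode_gender(gender):
    """'M' and 'F' are decoded; any other value resolves to 'UNK'."""
    return {"M": "MALE", "F": "FEMALE"}.get(gender, "UNK")  # labels are assumed


def age_group(age):
    """Map an age to one of the six groups listed in the matrix (labels are assumed)."""
    if age < 20:
        return "LESS THAN 20"
    if age <= 29:
        return "20 TO 29"
    if age <= 39:
        return "30 TO 39"
    if age <= 49:
        return "40 TO 49"
    if age <= 60:
        return "50 TO 60"
    return "GREATER THAN 60"


print(format_name("Ann", "Lee"), format_phone("3035551234"),
      decode_gender("X"), age_group(60))
```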


Instructions

Step 1: Create a Flat File Source Definition

1. Launch the Designer client tool.

2. Log into the PC8_DEV repository with the user name studentxx, where xx represents your student number as assigned by the instructor.

3. Open your student folder.

4. Import the customer_layout.txt flat file definition. This file is located in the c:\pmfiles\SrcFiles directory. If the file is located in a different directory, your instructor will specify.

Ensure that the following parameters are selected:

♦ Import field names from first line.

♦ Comma delimited flat file.

♦ Text Qualifier is Double quotes.

♦ Format of the Date field is Datetime.

5. Confirm that your source definition appears as displayed in Figure 4-1.

6. Save your work to the repository.

Step 2: Create a Relational Target Definition

1. Import the STG_CUSTOMERS table found in your tdbuxx schema.

Tip: Only one flat file definition is required when using a file list as a source in PowerCenter. All the files that make up the file list must have an identical layout for PowerCenter to process the file list successfully.

Figure 4-1. Source Analyzer View of the customer_layout Flat File Definition


2. Confirm that your target definition appears the same as displayed in Figure 4-2.

Step 3: Create a Mapping

1. Create a new mapping named m_Stage_Customer_Contacts_xx.

2. Add the customer_layout flat file source to the mapping.

3. Add STG_CUSTOMERS target to the mapping.

Your mapping will appear similar to Figure 4-3.

Step 4: Create a Filter Transformation

1. Select the Filter transformation tool button located on the Transformation tool bar and place it in the workspace between the Source Qualifier and the Target. The icon is shown highlighted below:

Figure 4-2. Target Designer View of the STG_CUSTOMERS Table Relational Definition

Figure 4-3. Mapping with Source and Target Definitions


Your mapping will appear similar to Figure 4-4.

2. Link the following ports from the Source Qualifier to the Filter:

♦ CUSTOMER_NO

♦ FIRSTNAME

♦ LASTNAME

♦ ADDRESS

♦ CITY

♦ STATE

♦ ZIP

♦ COUNTRY

♦ PHONE_NUMBER

♦ GENDER

♦ INCOME

♦ EMAIL

♦ AGE

3. Edit the Filter transformation.

a. Rename it to fil_Customer_No_99999.

b. Select the Properties tab.

Figure 4-4. Mapping with Newly Added Filter Transformation

Your display will appear similar to Figure 4-5.

4. Click the dropdown arrow for the Filter Condition Transformation Attribute to activate the Expression Editor.

5. Remove the TRUE condition from the Expression Editor.

6. Enter the following expression: CUSTOMER_NO != 99999 OR ISNULL(CUSTOMER_NO)

7. Validate your work, then click OK to return to the Properties of the Filter transformation.

The Properties will appear as displayed in Figure 4-6.

8. Click OK to return to the normal view of the mapping object.
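As a rough illustration of what the filter condition does, the following Python sketch applies the same test to a few made-up rows. The row values and the passes_filter helper are hypothetical, and None stands in for a NULL CUSTOMER_NO; this is not PowerCenter code.

```python
# Sketch of the filter condition CUSTOMER_NO != 99999 OR ISNULL(CUSTOMER_NO).
# Rows and helper name are illustrative only; None stands in for NULL.

def passes_filter(customer_no):
    """Return True when the row should continue down the pipeline."""
    return customer_no is None or customer_no != 99999

rows = [
    {"CUSTOMER_NO": 10001, "LASTNAME": "Smith"},
    {"CUSTOMER_NO": 99999, "LASTNAME": "Test"},
    {"CUSTOMER_NO": None, "LASTNAME": "Unknown"},
]

kept = [row for row in rows if passes_filter(row["CUSTOMER_NO"])]
for row in kept:
    print(row)
```

In this sketch only the row with customer number 99999 is dropped; the explicit NULL test keeps rows whose customer number is missing.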

Step 5: Create an Expression Transformation

1. Create an Expression transformation directly after the Filter transformation. Select the Expression transformation tool button located on the Transformation tool bar and place it in the workspace directly after the Filter. The icon is shown highlighted below:

Figure 4-5. Properties Tab of the Filter Transformation

Figure 4-6. Completed Properties Tab of the Filter Transformation

2. Select the following ports from the Filter transformation and pass them to the Expression transformation:

♦ FIRSTNAME

♦ LASTNAME

♦ PHONE_NUMBER

♦ GENDER

♦ AGE

Your mapping will appear similar to Figure 4-7.

3. Edit the Expression transformation object.

a. Rename it exp_Format_Name_Gender_Phone.

b. Change the port type to input for all of the ports except AGE. (AGE should remain an input/output port.)

c. Prefix each of these input only ports with IN_.

d. Create a new output port after the AGE port by positioning the cursor on the AGE port and clicking the add icon.

♦ Port Name = OUT_CUST_NAME

♦ Datatype = String

♦ Precision = 41

♦ Expression = IN_FIRSTNAME ||' ' ||IN_LASTNAME

Figure 4-7. Filter Transformation Linked to the Expression Transformation

Velocity Best Practice: Prefixing input only ports with IN_ and output ports with OUT_ is a Velocity best practice. This makes it easier to tell what the ports are without having to go into the transformation.

Tip: The OUT_CUST_NAME port will concatenate the FIRSTNAME and LASTNAME ports into a single string. Do not use the CONCAT function to concatenate in expressions. Use || to achieve concatenation. The CONCAT function is only available for backwards compatibility.

4. Create a new output port after the OUT_CUST_NAME port.

♦ Port Name = OUT_CUST_PHONE

♦ Datatype = String

♦ Precision = 14

♦ Expression = '(' || SUBSTR(TO_CHAR(IN_PHONE_NUMBER),1,3) || ') ' || SUBSTR(TO_CHAR(IN_PHONE_NUMBER),4,3) ||'-' || SUBSTR(TO_CHAR(IN_PHONE_NUMBER),7,4)

5. Create a new output port after the OUT_CUST_PHONE port.

♦ Port Name = OUT_GENDER

♦ Datatype = String

♦ Precision = 6

♦ Expression = DECODE(IN_GENDER,

'M', 'MALE',

'F', 'FEMALE',

'UNK')

Tip: The OUT_CUST_PHONE expression uses a technique known as nesting functions. The TO_CHAR function is nested inside the SUBSTR function. The TO_CHAR function is performed first. The SUBSTR function is then performed against the return value from TO_CHAR.

Tip: The DECODE function used in the OUT_GENDER expression can be used to replace nested IIF functions or small static lookup tables. The DECODE expression returns the value MALE if the incoming port GENDER is equal to M, FEMALE if GENDER equals F, or UNK if GENDER equals anything besides F or M.
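For readers more familiar with a general-purpose language, the three output-port expressions created so far can be approximated in Python as below. This is only a sketch: the function names are hypothetical, the phone number is assumed to be a 10-digit number, and NULL handling is ignored.

```python
# Python analogue of the OUT_CUST_NAME, OUT_CUST_PHONE and OUT_GENDER expressions.
# Function names are hypothetical; the lab's expressions remain the reference.

def cust_name(first_name, last_name):
    # IN_FIRSTNAME || ' ' || IN_LASTNAME
    return first_name + " " + last_name

def cust_phone(phone_number):
    # TO_CHAR runs first, then three SUBSTR calls
    # (1-based positions in the expression, 0-based slices here).
    digits = str(phone_number)
    return "(" + digits[0:3] + ") " + digits[3:6] + "-" + digits[6:10]

def gender_label(gender):
    # DECODE(IN_GENDER, 'M', 'MALE', 'F', 'FEMALE', 'UNK')
    return {"M": "MALE", "F": "FEMALE"}.get(gender, "UNK")

print(cust_name("Ann", "Lee"))
print(cust_phone(5551234567))
print(gender_label("X"))
```

The formatted phone string is 14 characters long, which is consistent with the precision of 14 given for OUT_CUST_PHONE.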

6. Create a new output port after the OUT_GENDER port.

♦ Port Name = OUT_AGE_GROUP

♦ Datatype = String

♦ Precision = 20

♦ Expression = Write an expression using the DECODE function that assigns the appropriate age group label to each customer based on their age. Use the online help to see details about DECODE. If you have not successfully created the DECODE statement after 5 minutes, refer to the References section at the end of the lab for the solution. The valid age ranges and age groups are displayed in the table below. The format of the DECODE statement follows the table.

Age Range              Age Group Text
AGE < 20               LESS THAN 20
AGE >= 20 AND <= 29    20 TO 29
AGE >= 30 AND <= 39    30 TO 39
AGE >= 40 AND <= 49    40 TO 49
AGE >= 50 AND <= 60    50 TO 60
AGE > 60               GREATER THAN 60

Figure 4-8. Sample Expression

7. Save your work.

8. Connect the following ports from the Expression transformation to the target table:

AGE → CUST_AGE

OUT_CUST_NAME → CUST_NAME

OUT_CUST_PHONE → CUST_PHONE_NMBR

OUT_GENDER → CUST_GENDER

OUT_AGE_GROUP → CUST_AGE_GROUP

9. Connect the following ports from the Filter transformation to the target table:

CUSTOMER_NO → CUST_ID

ADDRESS → CUST_ADDRESS

CITY → CUST_CITY

STATE → CUST_STATE

ZIP → CUST_ZIP_CODE

COUNTRY → CUST_COUNTRY

INCOME → CUST_INCOME

EMAIL → CUST_E_MAIL

10. Save your work.

11. Verify that your mapping is valid.

12. Right-click in the workspace and select Arrange All Iconic.

Figure 4-9. Iconic View of the Completed Mapping

Step 6: Create and Run the Workflow

1. Launch the Workflow Manager and sign into your assigned folder.

2. Open the Workflow Designer tool and create a new workflow named wkf_Stage_Customer_Contacts_xx.

3. Create a Session task using the session task tool button.

4. Select m_Stage_Customer_Contacts_xx from the Mapping list box, and click OK.

5. Link the Start object to the s_m_Stage_Customer_Contacts_xx session task.

6. Edit the s_m_Stage_Customer_Contacts_xx session.

7. Under the Mapping tab:

a. Select SQ_customer_layout located under the Sources folder in the navigator window.

b. Confirm that Source file directory is set to $PMSourceFileDir\.

c. In Properties > Attribute > Source filename type in customer_list.txt.

d. In Properties > Attribute > Source filetype click the dropdown arrow and change the default from Direct to Indirect.

Tip: The source instance you are reading is known as a file list. It is a list of files that PowerCenter appends together and treats as one source file. The file named in Properties | Attribute | Source filename is a text file that contains the list of text files to be read as individual sources. To create a file list, open a blank text file with an application such as Notepad and type the name of each text file to be read on a separate line. You may precede each file name with directory path information. If you don't provide directory path information, PowerCenter assumes the files are located in the same location as the file list file.

Tip: When you use the file list feature in PowerCenter, you must set Properties | Attribute | Source filetype to Indirect. The default is Direct. To change the setting, click the dropdown arrow and select the value you want to use.
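The file list idea described in the tips above can be sketched in Python as follows. This is an analogy only, not how PowerCenter reads files: the read_file_list helper, the file names, and the sample rows are all made up, and the data files are assumed to be comma delimited with no header line.

```python
import csv
import os
import tempfile

def read_file_list(list_path):
    """Yield rows from every file named in list_path, as one combined stream."""
    base_dir = os.path.dirname(list_path)
    with open(list_path, newline="") as list_file:
        names = [line.strip() for line in list_file if line.strip()]
    for name in names:
        # A name without an absolute path is resolved relative to the file list.
        path = name if os.path.isabs(name) else os.path.join(base_dir, name)
        with open(path, newline="") as data_file:
            yield from csv.reader(data_file)

# Build a tiny, made-up file list in a temporary directory.
work_dir = tempfile.mkdtemp()
for name, line in [("east.txt", "1,Ann"), ("west.txt", "2,Bob")]:
    with open(os.path.join(work_dir, name), "w") as f:
        f.write(line + "\n")
with open(os.path.join(work_dir, "list.txt"), "w") as f:
    f.write("east.txt\nwest.txt\n")

all_rows = list(read_file_list(os.path.join(work_dir, "list.txt")))
print(all_rows)
```

Because every listed file is parsed by the same reader, the sketch also shows why all files in a file list need the same layout.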

Your screen should appear similar to Figure 4-10.

The file list file used in this exercise lists three text files which are found in the default location of the file list file, $PMSourceFileDir\. Figure 4-11 displays the contents of customer_list.txt.

e. Select STG_CUSTOMERS located under the Target folder in the navigator window.

♦ Set the relational target connection object property to NATIVE_STGxx, where xx is your student number.

♦ Check the property Truncate target table option in the target properties.

8. Save your work.

9. Check Validate messages to ensure your workflow is valid.

10. Start the workflow.

Figure 4-10. Session Task Source Properties

Figure 4-11. Contents of the customer_list.txt File List

11. Review the session properties.

Your information should appear as displayed in Figure 4-12.

12. Review the Source/Target Statistics. Your statistics should be the same as displayed in Figure 4-13.

13. If your session failed or had errors, troubleshoot and correct them by reviewing the session log and making any necessary changes to your mapping or workflow.

Figure 4-12. Properties for the Completed Session Run

Figure 4-13. Source/Target Statistics for the Completed Session Run

Data Results

Preview the target data from the Designer. Your data should appear as displayed in Figure 4-14.

Observe the CUST_PHONE_NMBR, CUST_GENDER, and CUST_AGE_GROUP columns. These columns were transformed by the Expression transformation. Scroll down and review them to verify that you wrote your expressions correctly.

Step 7: Schedule a Workflow

1. After debugging has been completed, run the workflow a final time for an initial table load.

2. Open the session task for the mapping and ensure the truncate table property is checked.

3. Save any changes to the repository.

Figure 4-14. Data Preview of the STG_CUSTOMERS Target Table

4. Select Workflows > Edit. This will display the screen seen in Figure 4-15.

5. Select the Scheduler tab.

6. Select the Edit Scheduler command button.

7. Type sch_Stage_Customers_Contacts_xx in the Name text box.

8. Select the Schedule tab.

a. Clear the Run on demand check box.

b. Select the Customized Repeat radio button and click the Edit button.

i. Select Week(s) from the Repeat every dropdown box.

ii. Check the Monday, Tuesday, Wednesday, Thursday, Friday Weekly check boxes.

iii. Select the Run once radio button in the Daily frequency group.

Figure 4-15. General Properties for the Workflow

Your customized options should appear the same as displayed in Figure 4-16.

iv. Click OK.

c. Set the Start Date in the Start options group to tomorrow's date.

d. Set the Start Time to 00:01.

e. Select the Forever radio button in the End options group.

Your schedule options will appear similar to the one displayed in Figure 4-17.

9. Click OK twice.

10. Save your changes to the repository.

Figure 4-16. Customized Repeat Selections

Figure 4-17. Completed Schedule Options

11. Right click in the workspace and select Schedule Workflow.

12. Check the Workflow Monitor to confirm that the workflow has been scheduled.
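To check your understanding of the schedule settings (weekly repeat on Monday through Friday, run once a day, starting tomorrow at 00:01, no end date), the Python sketch below lists the run times those settings describe. It is an illustration only; the next_runs helper and the sample date are assumptions and have nothing to do with how the scheduler is implemented.

```python
from datetime import date, datetime, time, timedelta

def next_runs(today, count):
    """Return the next `count` run times: weekdays at 00:01, starting tomorrow."""
    runs = []
    day = today + timedelta(days=1)           # Start Date = tomorrow
    while len(runs) < count:                  # "Forever": no end date
        if day.weekday() < 5:                 # Monday=0 ... Friday=4
            runs.append(datetime.combine(day, time(0, 1)))
        day += timedelta(days=1)
    return runs

# 7 December 2006 is a Thursday, chosen arbitrarily for the example.
for run in next_runs(date(2006, 12, 7), 3):
    print(run)
```

Starting from a Thursday, the first run falls on Friday and the weekend is skipped before the Monday run.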

References

1. Decode Statement

DECODE(TRUE,

AGE < 20, 'LESS THAN 20',

AGE >= 20 AND AGE <= 29, '20 TO 29',

AGE >= 30 AND AGE <= 39, '30 TO 39',

AGE >= 40 AND AGE <= 49, '40 TO 49',

AGE >= 50 AND AGE <= 60, '50 TO 60',

AGE > 60, 'GREATER THAN 60')
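The DECODE(TRUE, ...) form above tests each condition in order and returns the value paired with the first one that is true. A minimal Python sketch of the same first-match logic follows; the AGE_GROUPS table and age_group function are hypothetical names, and returning None when nothing matches is an assumption standing in for a DECODE with no default value.

```python
# First-match evaluation, mirroring DECODE(TRUE, cond1, val1, cond2, val2, ...).
AGE_GROUPS = [
    (lambda age: age < 20, "LESS THAN 20"),
    (lambda age: 20 <= age <= 29, "20 TO 29"),
    (lambda age: 30 <= age <= 39, "30 TO 39"),
    (lambda age: 40 <= age <= 49, "40 TO 49"),
    (lambda age: 50 <= age <= 60, "50 TO 60"),
    (lambda age: age > 60, "GREATER THAN 60"),
]

def age_group(age):
    """Return the label of the first condition that holds for `age`."""
    for condition, label in AGE_GROUPS:
        if condition(age):
            return label
    return None  # no default is listed in the DECODE statement

for sample in (19, 20, 60, 61):
    print(sample, age_group(sample))
```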

Unit 5: Joins, Features and Techniques

When you have completed this unit, you should be able to:

♦ Define heterogeneous joins

♦ Use the Joiner transformation in a mapping

♦ Use Designer features and techniques

Lesson 5-1. Joiner Transformation

Joins combine data from different records (rows). Joins select rows from two different pipelines based on a relationship between the data, e.g. matching customer ID. One source is designated the Master, the other Detail.

Type

Active.

Description

The Joiner transformation combines fields from two data sources into a single combined data source based on one or more common fields, also known as the join condition.

Business Purpose

A business has data from two different systems that needs to be combined to get the desired results.

Example

A business has sales transaction data on a flat file and product data on a relational table. The company needs to join the sales transaction to the product table to get some product information. We need to use the Joiner transformation to accomplish this task.

Joiner Properties

Join Types

There are four types of joins supported by the Joiner transformation: Normal, Master Outer, Detail Outer, and Full Outer.
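The Joiner transformation's join types (Normal, Master Outer, Detail Outer, and Full Outer) differ only in which unmatched rows they keep. The Python sketch below is one way to picture that difference. The join function, its join_type strings, and the sample rows are hypothetical; missing values are simply left out of the result rows rather than set to NULL.

```python
def join(master, detail, key, join_type="normal"):
    """Join two lists of dicts on `key`; join_type decides which unmatched rows survive."""
    master_by_key = {}
    for m in master:
        master_by_key.setdefault(m[key], []).append(m)

    result, matched_keys = [], set()
    for d in detail:
        matches = master_by_key.get(d[key], [])
        if matches:
            matched_keys.add(d[key])
            result.extend({**m, **d} for m in matches)
        elif join_type in ("master outer", "full outer"):
            result.append(dict(d))                    # keep unmatched detail row
    if join_type in ("detail outer", "full outer"):
        for k, rows in master_by_key.items():
            if k not in matched_keys:
                result.extend(dict(m) for m in rows)  # keep unmatched master rows
    return result

master = [{"id": 1, "cost": 10}, {"id": 2, "cost": 20}]
detail = [{"id": 1, "qty": 5}, {"id": 3, "qty": 7}]
for jt in ("normal", "master outer", "detail outer", "full outer"):
    print(jt, join(master, detail, "id", jt))
```

In this sketch a master outer join keeps every detail row, a detail outer join keeps every master row, a full outer join keeps both, and a normal join keeps only the matches.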

Joiner Cache

How it Works

♦ There are two types of cache memory, index and data cache.

♦ All rows from the master source are loaded into cache memory.

♦ The index cache contains all port values from the master source where the port is specified in the join condition.

♦ The data cache contains all port values not specified in the join condition.

♦ After the cache is loaded the detail source is compared row by row to the values in the index cache.

♦ Upon a match the rows from the data cache are included in the stream.

Key Point

If there is not enough memory specified in the index and data cache properties, the overflow will be written out to disk.

Performance Considerations

The master source should be the source that will take up the least amount of space in cache. Another performance consideration would be the sorting of data prior to the Joiner transformation (discussed later).
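The caching steps described above can be sketched in Python as a build phase over the master rows followed by a probe phase over the detail rows. This is a simplified in-memory model with hypothetical function names and sample data; it does not model memory limits or overflow to disk.

```python
def build_caches(master_rows, join_keys):
    """Load every master row: join-key values go to the index cache, the rest to the data cache."""
    index_cache = {}   # join-key values -> positions in the data cache
    data_cache = []    # port values not used in the join condition
    for row in master_rows:
        key = tuple(row[k] for k in join_keys)
        index_cache.setdefault(key, []).append(len(data_cache))
        data_cache.append({k: v for k, v in row.items() if k not in join_keys})
    return index_cache, data_cache

def probe(detail_rows, detail_keys, index_cache, data_cache):
    """Compare each detail row to the index cache and emit a joined row on a match."""
    for row in detail_rows:
        key = tuple(row[k] for k in detail_keys)
        for position in index_cache.get(key, []):
            yield {**row, **data_cache[position]}

master = [{"PRODUCT_ID": 1, "PRODUCT_COST": 100.0},
          {"PRODUCT_ID": 2, "PRODUCT_COST": 250.0}]
detail = [{"PRODUCT": 1, "QUANTITY": 2}, {"PRODUCT": 9, "QUANTITY": 1}]

index_cache, data_cache = build_caches(master, ["PRODUCT_ID"])
joined = list(probe(detail, ["PRODUCT"], index_cache, data_cache))
print(joined)
```

Since every master row is held in the caches while the detail rows stream past, the sketch also suggests why the smaller source is the better choice for the master.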

Lesson 5-2. Shortcuts

Unit 5 Lab A: Load Sales Transaction Staging Table

Business Purpose

Mersche Motors receives sales transaction data from their regional sales offices in the form of a text file. The sales transaction data needs to be loaded to the staging table daily.

Technical Description

PowerCenter will source from a flat file and relational table. A Joiner transformation is used to create one dataflow that is then written to a relational target. The flat file is missing one field the staging table needs—the cost of each product. This value can be read from the STG_PRODUCT table. Each row of the source file contains a value named Product. This value has an identical corresponding value in the STG_PRODUCT table PRODUCT_ID column. Use the Joiner transformation to join the flat file to the relational table (heterogeneous join) and then write the results to the STG_TRANSACTIONS table.

Goals

♦ Create a Joiner transformation and use it to join two data streams from two different Source Qualifiers.

♦ Select the master side of the join.

♦ Specify a join condition.

Duration

30 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_STG_TRANSACTIONS_xx
Source System: Flat file, Oracle table
Target System: Oracle table
Initial Rows: 5474, 48
Rows/Load: 5474
Short Description: Flat file and Oracle table will be joined into one source data stream, which will be written to an Oracle target table.
Load Frequency: Daily
Preprocessing: Non-Truncating Target Append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields: PRODUCT_ID

SOURCES

Tables
Table Name     Schema/Owner    Selection/Filter
STG_PRODUCT    TDBUxx

Files
File Name                 File Location          Fixed/Delimited    Additional File Info
sales_transactions.txt    C:\pmfiles\SrcFiles    Delimited          Comma delimiter

TARGETS

Tables (Schema Owner: TDBUxx)
Table Name Update Delete Insert Unique Key
STG_TRANSACTIONS X

HIGH LEVEL PROCESS OVERVIEW

Flat File Source and Relational Source → Joiner Transformation → Relational Target

PROCESSING DESCRIPTION (DETAIL)

The flat file source has all the required detail except for product cost. It will be joined with the relational source into one data stream allowing access to the product cost for each sales transaction. The relational table will have fewer columns to join on as well as fewer rows. The columns to join on are PRODUCT_ID and PRODUCT. The single data stream will then be written to a staging table.

Also, there are a number of fields in the source file that need further scrutiny during the import process.
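The processing described above can be pictured with a small Python sketch that joins a delimited text source to a relational table. Everything in it is a stand-in: the in-memory text replaces sales_transactions.txt (with only three of its columns), an in-memory SQLite table replaces the Oracle STG_PRODUCT table, the sample values are invented, and a normal join is assumed.

```python
import csv
import io
import sqlite3

# Made-up stand-in for the flat file source.
flat_file = io.StringIO("TRANSACTION_ID,PRODUCT,QUANTITY\n1001,7,2\n1002,8,1\n")

# Made-up stand-in for the STG_PRODUCT relational source.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE STG_PRODUCT (PRODUCT_ID INTEGER, PRODUCT_COST REAL)")
conn.executemany("INSERT INTO STG_PRODUCT VALUES (?, ?)",
                 [(7, 15000.0), (8, 22000.0)])

# Relational side: look up PRODUCT_COST by PRODUCT_ID.
cost_by_product = dict(conn.execute(
    "SELECT PRODUCT_ID, PRODUCT_COST FROM STG_PRODUCT"))

# Flat file side: stream the rows and pick up the cost where PRODUCT matches.
staged = []
for row in csv.DictReader(flat_file):
    product = int(row["PRODUCT"])
    if product in cost_by_product:      # normal join: unmatched rows drop out
        staged.append({
            "TRANSACTION_ID": int(row["TRANSACTION_ID"]),
            "PRODUCT_ID": product,
            "SALES_QTY": int(row["QUANTITY"]),
            "UNIT_COST": cost_by_product[product],
        })
print(staged)
```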

SOURCE TO TARGET FIELD MATRIX

Target Table | Target Column | Source File | Source Column | Expression | Default Value if Null

STG_TRANSACTIONS CUST_ID Sales_transactions CUST_NO

STG_TRANSACTIONS PRODUCT_ID Sales_transactions PRODUCT

STG_TRANSACTIONS DEALERSHIP_ID Sales_transactions DEALERSHIP

STG_TRANSACTIONS PAYMENT_DESC Sales_transactions PAYMENT_DESC

STG_TRANSACTIONS PROMO_ID Sales_transactions PROMO_ID

STG_TRANSACTIONS DATE_ID Sales_transactions DATE_ID

STG_TRANSACTIONS TRANSACTION_DATE Sales_transactions TRANSACTION_DATE

STG_TRANSACTIONS TRANSACTION_ID Sales_transactions TRANSACTION_ID

STG_TRANSACTIONS EMPLOYEE_ID Sales_transactions EMPLOYEE_ID

STG_TRANSACTIONS TIME_KEY Sales_transactions TIME_KEY

STG_TRANSACTIONS SELLING_PRICE Sales_transactions SELLING PRICE

STG_TRANSACTIONS UNIT_COST STG_PRODUCT PRODUCT_COST

STG_TRANSACTIONS DELIVERY_CHARGES Sales_transactions DELIVERY CHARGES

STG_TRANSACTIONS SALES_QTY Sales_transactions QUANTITY

STG_TRANSACTIONS DISCOUNT Sales_transactions DISCOUNT

STG_TRANSACTIONS HOLDBACK Sales_transactions HOLDBACK

STG_TRANSACTIONS REBATE Sales_transactions REBATE

Instructions

Step 1: Create a Flat File Source Definition

1. Launch the Designer client tool (if it is not already running) and open your student folder.

2. Open the Source Analyzer tool.

3. Import the sales_transactions.txt comma-delimited flat file.

4. Ensure that the Transaction Date field has a Datatype of Datetime.

5. Save the repository.

Step 2: Create a Relational Source Definition

1. Verify you are in the Source Analyzer tool, and import the STG_PRODUCT table found in your tdbuxx schema. Use ODBC_STG as the ODBC data source.

Note that you are importing the table as a source definition, even though it is in your “target” (tdbuxx) schema.

2. Save the repository.

Step 3: Create a Relational Target Definition

1. Open the Target Designer tool.

2. Import the STG_TRANSACTIONS table found in your tdbuxx schema.

Step 4: Create a Mapping

1. Open the Mapping Designer tool.

2. Create a new mapping named m_STG_TRANSACTIONS_xx.

3. Add the sales_transactions flat file source to the new mapping.

4. Add the STG_PRODUCT relational source to the new mapping.

5. Add the STG_TRANSACTIONS relational target to the new mapping.

Your mapping should appear similar to Figure 5-1.

Figure 5-1. Normal View of the Heterogeneous Sources, Source Qualifiers and Target

Step 5: Create a Joiner Transformation

1. Select the Joiner transformation icon located on the Transformation tool bar with a single left click. Figure 5-2 shows the Joiner transformation button:

2. Create a new Joiner transformation.

3. Select all the ports from the SQ_sales_transactions object and copy/link them to the Joiner transformation.

4. Select only the PRODUCT_ID and PRODUCT_COST ports from SQ_STG_PRODUCT object and copy them to the Joiner transformation.

Your mapping should be similar to Figure 5-3.

5. Edit the Joiner transformation.

a. Rename it to jnr_Sales_Transaction_To_STG_PRODUCT.

b. Select the Ports tab.

c. Set the Master (M) property to the STG_PRODUCT ports.

Figure 5-2. Joiner Transformation Button

Figure 5-3. Normal View of Heterogeneous Sources Connected to a Joiner Transformation

Tip: Which ports should be the Master? Use the source that is the smaller, in rows and bytes, if the data is not sorted. If the source data is sorted, use the source with the fewest number of join column duplicates.

d. Uncheck the output check box for PRODUCT_ID.

e. Rename the PRODUCT_ID port to IN_PRODUCT_ID.

6. Select the Condition tab.

a. Click the Add a new condition button. Figure 5-5 displays the Add a new condition button as selected.

The Master drop down box will default to IN_PRODUCT_ID.

Figure 5-4. Edit View of the Ports Tab for the Joiner Transformation

Figure 5-5. Edit View of the Condition Tab for Joiner Transformation Without a Condition

b. Select the Detail drop down box and set it to PRODUCT. Your condition should be the same as displayed in Figure 5-6.

Figure 5-6. Edit View of the Condition Tab for the Joiner Transformation with Completed Condition

Tip: The Joiner transformation can support multiple port conditions to create a join. If you need multiple port conditions simply click the Add a new condition button to add the other ports that make up the multiple port condition.

c. Click OK.

7. Save the repository.

Step 6: Link the Target Table

1. Link the following ports from the Joiner transformation to the corresponding columns in the target object. Example: JOINER PORT → TARGET COLUMN

CUST_NO → CUST_ID

PRODUCT → PRODUCT_ID

DEALERSHIP → DEALERSHIP_ID

PAYMENT_DESC → PAYMENT_DESC

PROMO_ID → PROMO_ID

DATE_ID → DATE_ID

TRANSACTION DATE → TRANSACTION_DATE

TRANSACTION_ID → TRANSACTION_ID

EMPLOYEE_ID → EMPLOYEE_ID

TIME_KEY → TIME_KEY

SELLING PRICE → SELLING_PRICE

PRODUCT_COST → UNIT_COST

DELIVERY CHARGES → DELIVERY_CHARGES

QUANTITY → SALES_QTY

DISCOUNT → DISCOUNT

HOLDBACK → HOLDBACK

REBATE → REBATE
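One way to think about the port links listed above is as a rename table from Joiner ports to target columns. The Python sketch below encodes exactly those links; the to_target_row helper and the sample row are hypothetical, and ports missing from the sample row are simply mapped to None.

```python
# Joiner port -> STG_TRANSACTIONS column, as listed in this step.
PORT_TO_COLUMN = {
    "CUST_NO": "CUST_ID",
    "PRODUCT": "PRODUCT_ID",
    "DEALERSHIP": "DEALERSHIP_ID",
    "PAYMENT_DESC": "PAYMENT_DESC",
    "PROMO_ID": "PROMO_ID",
    "DATE_ID": "DATE_ID",
    "TRANSACTION DATE": "TRANSACTION_DATE",
    "TRANSACTION_ID": "TRANSACTION_ID",
    "EMPLOYEE_ID": "EMPLOYEE_ID",
    "TIME_KEY": "TIME_KEY",
    "SELLING PRICE": "SELLING_PRICE",
    "PRODUCT_COST": "UNIT_COST",
    "DELIVERY CHARGES": "DELIVERY_CHARGES",
    "QUANTITY": "SALES_QTY",
    "DISCOUNT": "DISCOUNT",
    "HOLDBACK": "HOLDBACK",
    "REBATE": "REBATE",
}

def to_target_row(joiner_row):
    """Rename the Joiner output ports to their target column names."""
    return {column: joiner_row.get(port) for port, column in PORT_TO_COLUMN.items()}

sample = {"CUST_NO": 5, "PRODUCT": 7, "QUANTITY": 2, "PRODUCT_COST": 9.5}
print(to_target_row(sample))
```

Note that IN_PRODUCT_ID does not appear in the table: it is an input-only port used for the join condition, so it is not linked to the target.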

Figure 5-7 displays the ports linked to their corresponding columns.

2. Save the repository.

3. Verify your mapping is valid in the Output window. If the mapping is not valid, correct the invalidations that are displayed in the message.

Step 7: Create a Workflow and Session Task

1. Launch the Workflow Manager application (if it is not already running) and log into the repository and your student folder.

2. Open the Workflow Designer tool and create a new workflow named wkf_STG_TRANSACTIONS_xx.

3. Add a new Session task using the session task icon.

4. Select m_STG_TRANSACTIONS_xx from the Mapping list box and click OK.

5. Link the Start object to the s_m_STG_TRANSACTIONS_xx session task object.

6. Edit the s_m_STG_TRANSACTIONS_xx session task.

a. Select the Mapping tab.

b. Select SQ_sales_transactions located under the Sources folder in the Mapping navigator.

c. Confirm that Properties | Attribute | Source file directory is set to $PMSourceFileDir\

d. In Properties | Attribute | Source filename verify that sales_transactions.txt is displayed. The file extension (.txt) must be present.

e. Select SQ_STG_PRODUCT located under the Sources folder in the navigator window.

Set the Connections | Type to your assigned Native_STGxx connection object.

f. Select STG_TRANSACTIONS located under the Target folder in the navigator window.

Set the Connections | Type to your assigned Native_STGxx connection object.

Check the Truncate target table option checkbox.

7. Save the repository.

Figure 5-7. Normal View of Completed Mapping (Heterogeneous Sources Not Displayed)

8. Check Validate messages to ensure your workflow is valid. If you received an invalid message, correct the problem(s) and re-validate/save.

Step 8: Start the Workflow and View Results in the Workflow Monitor

1. Start the workflow.

2. Confirm that the Workflow Monitor application launches automatically.

3. Maximize the Workflow Monitor.

4. Double-click the session with your left mouse button and view the Task Details window.

Your information should appear similar to Figure 5-8.

5. Select the Transformation Statistics tab. Your statistics should be similar to Figure 5-9.

6. If your session failed or had an error proceed to the next step.

7. Right-click the Session again and select Get Session Log.

8. Search the session log for the error messages that caused your session to have issues. Read the messages and correct the problem. Rerun your workflow to test your fixes. Ask your instructor for help if you get stuck.

Figure 5-8. Task Details of the Completed Session Run

Figure 5-9. Source/Target Statistics for the Session Run

Data Results

Preview the target data from the Designer. Your data should appear the same as displayed in Figure 5-10.

Figure 5-10. Data Preview of the STG_TRANSACTIONS Table

Unit 5 Lab B: Features and Techniques I

Business Purpose

The management wants to increase the efficiency of the PowerCenter Developers.

Technical Description

This lab details the use of 12 PowerCenter Designer features. Each of these features will increase the efficiency of any developer who knows how to use it. At the discretion of the instructor, this lab can also be completed as a demonstration.

Goals

♦ Auto Arrange

♦ Remove Links

♦ Revert to Saved

♦ Link Path

♦ Propagating Ports

♦ Autolink by Name and Position

♦ Moving Ports

♦ Shortcut to Port Editing from Normal View

♦ Create Transformation Methods

♦ Scale-To-Fit

♦ Object Shortcuts and Copies

♦ Copy Objects Within and Between Mappings

Duration

50 minutes

Instructions

Open a Mapping

In the Designer tool:

1. In your student folder, open mapping m_Stage_Customer_Contacts_xx.

Feature 1: Auto Arrange

The Designer includes an Arrange feature that will reorganize objects in the Workspace in one simple step. This aids in readability and analysis of the mapping flow and can be applied to certain paths through a mapping associated with specific target definitions.

In a couple of clicks, this feature can take a mapping that looks like Figure 5-11.

And change it to look like Figure 5-12.

Figure 5-11. View of an Unorganized Mapping

Figure 5-12. Arranged View of a Mapping

1. Choose Layout > Arrange All Iconic or right-click in the Workspace and select Arrange All Iconic.

Figure 5-13. Iconic View of an Arranged Mapping

2. Choose Layout > Arrange All or right-click in the Workspace and select Arrange All.

3. Type Ctrl+S to save.

Tip: Notice the mapping would not save. When only formatting changes are made, it is not considered a change. Another change must be made to the Repository in order for the formatting to be saved.

Feature 2: Remove Links

Click-and-drag the pointer over the blue link lines that are between exp_Format_Name_Gender_Phone and STG_CUSTOMERS.

Figure 5-14. Selecting Multiple Links

Tip: By default, each selected link changes in color from blue to red. If any other objects (e.g., transformations) were selected along with the links, redo the process.

Press the Delete key to remove the connections. Ensure no icons are deleted.

Feature 3: Revert to Saved

In the Source Analyzer, Target Designer, and Transformation Developer, individual objects may be reverted.

1. Select Edit > Revert to Saved.

2. Select Yes to proceed.

3. Edit the exp_Format_Name_Gender_Phone expression.

In the Ports tab, select the OUT_CUST_NAME port and click the Delete button.

4. Similarly, delete the AGE port.

5. Edit the SQ_customer_layout Source Qualifier and remove the AGE port.

6. Select only the SQ_customer_layout Source Qualifier and choose Edit > Revert to Saved.

The same dialog box appears - all changes must be reverted.

7. Select Yes to proceed.

Notice all changes were reverted, not just the changes made to the SQ_customer_layout.

Tip: While editing an object in the Designer, if unwanted changes are made there is a way to revert to a previously saved version - undoing the changes since the last save. The Revert to Saved feature works with the following objects: sources, targets, transformations, mapplets and mappings.

Tip: For mappings, Revert to Saved reverts all changes to the mapping, not just selected objects. Only the active mapping in the workspace is reverted.

Figure 5-15. Designer Warning Box

Feature 4: Link Path

Tracing link paths allows the developer to highlight the path of a port either forward or backward through an entire mapping or mapplet.

If the class is doing this lab as a follow-along exercise, do a Revert to Saved so that everyone is synchronized.

1. Ensure that the mapping is in the arranged normal view.

2. Right-click on CUSTOMER_NO in the SQ_customer_layout Source Qualifier and choose Select Link Path > Forward.

Notice how the path for CUSTOMER_NO, from SQ_customer_layout all the way to STG_CUSTOMERS, is highlighted in red.

Figure 5-16. Selecting the forward link path

Figure 5-17. Highlighted forward link path


3. Right-click on the OUT_CUST_NAME port in the exp_Format_Name_Gender_Phone and choose Select Link Path > Both.

Notice how the OUT_CUST_NAME port's path not only shows where it proceeds to the STG_CUSTOMERS target definition, but also from its origin all the way back to the customer_layout source definition. Both the IN_FIRSTNAME and IN_LASTNAME are used in the formula to produce OUT_CUST_NAME, so both links are highlighted in red.
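Conceptually, tracing a link path is a graph traversal over port-to-port links. The following Python sketch is only an illustration of that idea, not how the Designer is implemented. The edge list is an assumption made up for the example (including the target port name CUST_NAME and the dependency edges inside the expression); only the object names come from the lab.

```python
from collections import defaultdict

# Each edge is (from_port, to_port); a port is (object name, port name).
# Assumed links for illustration only; the real mapping has more ports.
EDGES = [
    (("customer_layout", "FIRSTNAME"), ("SQ_customer_layout", "FIRSTNAME")),
    (("customer_layout", "LASTNAME"), ("SQ_customer_layout", "LASTNAME")),
    (("SQ_customer_layout", "FIRSTNAME"), ("fil_Customer_No_99999", "FIRSTNAME")),
    (("SQ_customer_layout", "LASTNAME"), ("fil_Customer_No_99999", "LASTNAME")),
    (("fil_Customer_No_99999", "FIRSTNAME"), ("exp_Format_Name_Gender_Phone", "IN_FIRSTNAME")),
    (("fil_Customer_No_99999", "LASTNAME"), ("exp_Format_Name_Gender_Phone", "IN_LASTNAME")),
    # Dependencies inside the expression: both inputs feed OUT_CUST_NAME.
    (("exp_Format_Name_Gender_Phone", "IN_FIRSTNAME"), ("exp_Format_Name_Gender_Phone", "OUT_CUST_NAME")),
    (("exp_Format_Name_Gender_Phone", "IN_LASTNAME"), ("exp_Format_Name_Gender_Phone", "OUT_CUST_NAME")),
    (("exp_Format_Name_Gender_Phone", "OUT_CUST_NAME"), ("STG_CUSTOMERS", "CUST_NAME")),
]


def trace(start, edges, direction="forward"):
    """Return every port reachable from `start` going forward, backward, or both."""
    forward, backward = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        forward[src].append(dst)
        backward[dst].append(src)
    graphs = {"forward": [forward], "backward": [backward], "both": [forward, backward]}[direction]

    highlighted = set()
    for graph in graphs:
        visited, stack = {start}, [start]
        while stack:  # plain depth-first search
            for nxt in graph[stack.pop()]:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append(nxt)
        highlighted |= visited
    highlighted.discard(start)
    return highlighted
```

Calling `trace(("exp_Format_Name_Gender_Phone", "OUT_CUST_NAME"), EDGES, "both")` returns the target port as well as both the FIRSTNAME and LASTNAME ports all the way back to the source definition, which mirrors what the lab highlights in red.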

Feature 5: Propagating Ports

1. Edit SQ_customer_layout and change CUSTOMER_NO to CUST_NO and change the Precision to 10.

2. Click OK.

3. Right-click on CUST_NO in the SQ_customer_layout transformation and select Propagate Attributes.

Figure 5-18. Highlighted link path going forward and backward

Tip: When a port name, datatype, precision, scale, or description is changed, those changes can be propagated to the rest of the mapping.

Figure 5-19. Selecting to propagate the attributes


4. Under Attributes to propagate, choose Name and Precision with a Direction of Forward.

5. Choose Preview.

Notice the arrow between SQ_customer_layout and fil_Customer_No_99999 turns green. The green arrow indicates the places where a change would be made. Why is there only one change?

6. Select Propagate.

Was a change made in the filter?

7. Click Close.

8. Edit SQ_customer_layout and change GENDER to CUST_GENDER and change the Precision to 7.

9. Click OK.

10. Right-click on CUST_GENDER in the SQ_customer_layout transformation and select Propagate Attributes.

a. Under Attributes to propagate, choose Name and Precision with a direction of Forward.

b. Select Preview.

Notice the green arrows? What will be changed?

c. Select Propagate.

d. Edit exp_Format_Name_Gender_Phone and open the Expression Editor for OUT_GENDER. Notice the expression now contains CUST_GENDER.

e. Close the Propagate dialog box.
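The idea behind forward propagation can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the Designer's actual algorithm: the port keys (`sq`, `fil`, `exp`), the starting precision values, and the rule that every linked downstream port is updated are all hypothetical, and the sketch does not model the Designer's rules for where propagation stops or how expressions are rewritten.

```python
from dataclasses import dataclass


@dataclass
class Port:
    name: str
    precision: int


def propagate_forward(ports, links, start, attributes):
    """Copy the chosen attributes from the start port to every port linked downstream.

    ports: dict of key -> Port; links: dict of key -> list of downstream keys.
    Returns the keys that were changed, in the order they were visited.
    """
    source = ports[start]
    changed, visited, pending = [], {start}, [start]
    while pending:
        current = pending.pop()
        for nxt in links.get(current, []):
            if nxt in visited:
                continue
            visited.add(nxt)
            for attribute in attributes:
                setattr(ports[nxt], attribute, getattr(source, attribute))
            changed.append(nxt)
            pending.append(nxt)
    return changed


# Hypothetical state after renaming GENDER to CUST_GENDER in the Source Qualifier.
ports = {
    "sq": Port("CUST_GENDER", 7),
    "fil": Port("GENDER", 6),
    "exp": Port("GENDER", 6),
}
links = {"sq": ["fil"], "fil": ["exp"]}
changed = propagate_forward(ports, links, "sq", ["name", "precision"])
```

The returned list plays the role of the Preview step: it names the ports that would be touched before anything is saved.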

Figure 5-20. Propagation attribute dialog box

Feature 6: Autolink by Name and Position

Tip: Developers can automatically link ports by name in the Designer. Use any of the following options to automatically link by name:

♦ Link by name

♦ Link by name and prefix

♦ Link by name and suffix

The Designer adds links between input and output ports that have the same name. Linking by name is case insensitive. Link by name when using the same port names across transformations.

1. Revert to Saved to reset the mapping.


2. Remove the links between the exp_Format_Name_Gender_Phone and the STG_CUSTOMERS target definition.

3. Right-click in the white space inside the mapping. Choose Autolink.

The Autolink dialog box opens.

4. Select the exp_Format_Name_Gender_Phone transformation from the From Transformation drop-down menu; then highlight the STG_CUSTOMERS transformation in the To Transformations box.

Figure 5-21. Autolink dialog box

Tip: Only one transformation may be selected in the From Transformation box and one to many transformations may be selected in the To Transformations box. For objects that contain groups, such as Router transformations or XML targets, select the group name from the To Transformations list.


5. Click OK.

Notice that nothing happened. Look carefully at exp_Format_Name_Gender_Phone and STG_CUSTOMERS: none of the port names match exactly, so autolink by name does not work in this situation. Would autolink by position work?

6. Select Layout > Autolink.

7. Select the exp_Format_Name_Gender_Phone transformation from the From Transformation drop-down menu; then highlight the STG_CUSTOMERS transformation in the To Transformations box.

8. Select the Name radio button.

9. Click More to view the options for entering prefixes and suffixes. Note the button toggles to become the Less button.

10. Type OUT_ in the From Transformation Prefix field.

11. Click OK.

Notice that only the OUT_CUST_NAME port was linked. It is the only port whose name matches a target port once the OUT_ prefix is taken into account.

Tip: When autolinking by name, the Designer adds links between ports that have the same name, case insensitive. The Designer can also link ports based on defined prefixes or suffixes. Adding prefixes and/or suffixes to port names helps identify each port's purpose. For example, a suggested best practice is to use the prefix “OUT_” when the port is derived from input ports that were modified as they passed through the transformation. Without this feature, Autolink would skip over the names that don't match and force the developer to manually link the desired ports.
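The matching rule described in this tip can be sketched as follows. This is a minimal illustration, assuming the prefix or suffix is stripped from the From port name before a case-insensitive comparison; the target port names are assumptions chosen so that the result matches the lab, and the function is not the Designer's implementation.

```python
def autolink_by_name(from_ports, to_ports, from_prefix="", from_suffix=""):
    """Pair From ports with To ports whose names match, ignoring case.

    The optional prefix/suffix is removed from each From port name before comparing.
    """
    targets = {name.lower(): name for name in to_ports}
    prefix, suffix = from_prefix.lower(), from_suffix.lower()
    links = []
    for port in from_ports:
        key = port.lower()
        if prefix and key.startswith(prefix):
            key = key[len(prefix):]
        if suffix and key.endswith(suffix):
            key = key[:-len(suffix)]
        if key in targets:
            links.append((port, targets[key]))
    return links


# Port names on the expression side follow the lab; the target names are assumed.
FROM_PORTS = ["OUT_CUST_NAME", "OUT_CUST_PHONE", "OUT_GENDER", "OUT_AGE_GROUP"]
TO_PORTS = ["CUST_NAME", "CUST_PHONE_NMBR", "CUST_GENDER", "CUST_AGE_GROUP"]

no_prefix = autolink_by_name(FROM_PORTS, TO_PORTS)
with_prefix = autolink_by_name(FROM_PORTS, TO_PORTS, from_prefix="OUT_")
```

With these assumed names, `no_prefix` is empty (nothing matches exactly), while `with_prefix` contains the single pair OUT_CUST_NAME to CUST_NAME, as observed in steps 5 and 11.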

Figure 5-22. Defining a prefix in the autolink dialog box


Feature 7: Moving Ports

1. Revert to Saved to reset the mapping.

2. Open the exp_Format_Name_Gender_Phone and click on the Ports tab.

3. Single-click on the AGE port and move it up to the top using the Up arrow icon found in the upper right corner of the toolbar.

The results will look like Figure 5-23:

4. Single-click on the number to the left of the IN_PHONE_NUMBER port.

5. Single-click and hold the left mouse button and note the faint square that appears at the bottom of the pointer.

6. Move IN_PHONE_NUMBER directly below AGE.

7. Click Cancel to discard the changes.

Feature 8: Shortcut to Port Editing from Normal View

1. Resize or scroll down until the AGE port appears in the exp_Format_Name_Gender_Phone.

2. Double-click on the AGE port.

3. Notice you are now in the Ports tab.

4. Delete the AGE port.

Figure 5-23. Expression after the AGE port has been moved

Figure 5-24. Click and drag method of moving ports

Tip: There is a shortcut to go from the Normal View directly to the port desired in the Edit View - this is especially useful in transformation objects that have dozens of ports.


Feature 9: Create Transformation Methods

1. Revert to Saved to reset the mapping.

2. On the Transformation toolbar, find the Aggregator Transformation button and single-click.

3. Move the mouse into the Workspace. The cursor changes to crosshairs.

4. Single-click in the workspace where you want to place the transformation.

The selected transformation appears in the desired location of the Workspace and the cursor changes back to an arrow.

5. Select Transformation > Create.

6. Select the Aggregator from the drop-down list.

7. Enter the name agg_TargetTableName and click Create.

Tip: When the mouse pointer hovers over a transformation icon in the toolbar, the name of the transformation object appears momentarily.

Figure 5-25. Creating a transformation using the menu

Figure 5-26. Create Transformation dialog box


8. Click on the Done button and the new transformation appears in the Workspace.

Feature 10: Scale-to-Fit

1. Revert to Saved to reset the mapping.

There are features to change the magnification of the contents of the Workspace. Use the toolbar or the Layout menu options to set zoom levels. The toolbar has the following zoom options:

2. Click on the Zoom out 10% button on the toolbar.

3. Click anywhere in the Workspace and the mapping will zoom out by 10% each time the mouse is clicked.

4. Keep clicking until the mapping is small enough to fit within the window.

5. Click on the Zoom in 10% button on the toolbar.

6. Click anywhere in the Workspace and the mapping will zoom in by 10% each time the mouse is clicked.

Figure 5-27. Normal View of the Newly Created Aggregator Transformation

Figure 5-28. Zoom options

Tip: The Zoom out 10% button uses a selected point as the center point from which to decrease the current magnification in 10 percent increments.

Tip: The Zoom in 10% button increases the current magnification of a selected rectangular area. The degree of magnification depends on the size of the area selected, the Workspace size, and the current magnification.


7. Toggle off the Zoom in 10% button.

8. Click on the Scale to Fit button in the toolbar.

Another way to do this is to select Layout > Scale to Fit.

Feature 11: Object Shortcuts and Copies

This feature allows you to create a shortcut to an object (a source or target definition, a mapping, etc.) in any folder. A shortcut is a “pointer” to the original object. If the object is edited, all shortcuts inherit the changes. The shortcut itself cannot be edited.

1. Select and double-click the DEV_SHARED folder. Note that the folder name in the Navigator window is now bold. This means that the folder is open.

2. Open your student folder by either double clicking on it or by right-clicking on it and selecting open.

Note that the DEV_SHARED folder is no longer bold (open) but it remains expanded so you can see the subfolders.

3. Open the Mapping Designer and close any mapping that is in the workspace.

4. Expand the Mappings subfolder in the DEV_SHARED folder.

5. Click and drag the m_Stage_Customer_Contacts mapping to the Mapping Designer workspace and release the mouse button.

Figure 5-29. Navigator window in the Designer

Tip: Only one folder at a time can be open. Any number of folders can be expanded so that the subfolders and objects are visible. As we will see below, it is important to distinguish between expanded folders and the open folder.


You will see the following confirmation message:

6. Click Yes.

7. Save the changes to the repository.

Note that your folder now has a shortcut to the mapping. Select the menu option Mappings > Edit to see how the shortcut location is displayed.

8. Open the Filter transformation in edit mode. Note that all properties are grayed-out and not editable. A shortcut can never be edited directly.

9. Perform the same click-and-drag operation with the same mapping, only this time press the [Ctrl] key after you have begun to drag the mapping. Note that this creates a copy of the mapping instead of a shortcut.

10. Click No in the Copy Confirmation message box.

We will now learn how to copy an object within the same folder. The instructions below are to copy a mapping but the same procedure can be used for any other object.

1. In the Navigator window, select any mapping in your folder.

2. Press Ctrl+C on your keyboard, followed immediately by Ctrl+V.

3. Click Yes in the Copy Confirmation message box.

The Copy Wizard will be displayed.

4. The red x on the mapping indicates a conflict. Choose Rename for the conflict resolution.

5. Click the Edit button. If desired, you can supply your own new name to the mapping to replace the “1” added by the Designer. Mappings within a folder must have unique names.

6. Click Next, then Finish.

Feature 12: Copy Objects Within and Between Mappings

You may find that you would like to duplicate a given set of transformations within a mapping or a mapplet, preserving the data flow between them. This technique may prove useful if you know that you will need to use the logic contained in the transformations in other mappings or mapplets.

1. Use the Arrange All Iconic feature on the m_Stage_Customer_Contacts_xx mapping.

Tip: The destination folder (the folder you are placing the copy or shortcut into) must be the open folder. The origin folder that contains the original object will be expanded.

Tip: A common error when copying objects within a folder is to use the mouse to move the cursor from the object to the workspace after copying the object with Ctrl+C. This is unnecessary and will cause the copy operation to fail.


2. Use your left mouse button to draw a rectangle that encloses the Filter and the Expression transformations. These objects will then appear selected.

3. Press Ctrl+C on your keyboard, followed immediately by Ctrl+V.

Note that both transformations have been copied into the mapping, including the data flow between the input and output ports. They have been automatically renamed with a “1” on the end of their names.

4. Open another mapping in the Mapping Designer. It does not matter which mapping is used, provided it is not a shortcut.

5. Press Ctrl+V. The transformations are copied into the open mapping.

6. Close your folder but do not save the changes.

Tip: The copy objects within and between mappings feature can be used only within a single folder.


Unit 6: Lookups and Reusable Transformations

When you have completed this unit, you should be able to

♦ Describe these features:

♦ Lookup transformation

♦ Reusable transformations

♦ Use these features in a mapping.

Lesson 6-1. Lookup Transformation (Connected)


Type

Passive.

Description

A Lookup transformation allows the inclusion of additional information in the transformation process from an external database or flat file source. In SQL terms, a Lookup transformation may be thought of as a “sub-query”. The basic Lookup transformation types are connected, unconnected, and dynamic.

Properties

We will discuss only some of the properties in this section. The remaining properties will be discussed in other sections.


Option | Lookup Type | Description
Lookup SQL Override | Relational | Overrides the default SQL statement to query the lookup table. Use only with the lookup cache enabled.
Lookup Table Name | Relational | Specifies the name of the table from which the transformation looks up and caches values.
Lookup Policy on Multiple Match | Flat File, Relational | Determines what happens when the Lookup transformation finds multiple rows that match the lookup condition. You can select the first or last row returned from the cache or lookup source, or report an error.
Lookup Condition | Flat File, Relational | Displays the lookup condition you set in the Condition tab.
Connection Information | Relational | Specifies the database containing the lookup table. You can select the exact database connection or you can use the $Source or $Target variable. If you use one of these variables, the lookup table must reside in the source or target database you specify when you configure the session. If you select the exact database connection, you can also specify what type of database connection it is.
Source Type | Flat File, Relational | Indicates that the Lookup transformation reads values from a relational database or a flat file.
Tracing Level | Flat File, Relational | Sets the amount of detail included in the session log when you run a session containing this transformation.
Datetime Format | Flat File | If you do not define a datetime format for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can enter any datetime format. The default is MM/DD/YYYY HH24:MI:SS.


Business Purpose

A business may bring in data from various sources, but additional data from local sources may be needed, such as product codes, dates, and names.

Example

In the following example, an insurance company pays commissions on each new policy; however, clerical errors may cause duplicate policies to be submitted. The goal is to check submitted policies against the current list and reject the policies that are duplicates.

A policy number is passed to a connected Lookup transformation, which checks the current policy table for the pre-existence of the policy. If the policy number exists, the matching policy number is returned; if it does not exist, a “null” value is returned. The return value is used in the “Group Filter Condition” of the Router transformation. The Router filter condition is “ISNULL (POLICY_NO1)” and is based on the return value from the Lookup transformation “POLICY_NO” port, NOT the value from the Source Qualifier. Rows from the source that have no match (null return) in the lookup table meet the filter condition and pass to the new (POLICY_NEW) target. All other rows go to the Router Default group and are passed to the reject (ROLICIES_REJ) target.
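The lookup-then-route logic of this example can be approximated in Python as shown below. This is a sketch of the data flow only; the policy numbers are made up, and `lookup_policy` and `route` are hypothetical helper names, not PowerCenter functions.

```python
# Hypothetical contents of the current policy table.
CURRENT_POLICIES = {"P-1001", "P-1002"}


def lookup_policy(policy_no):
    """Connected lookup: return the matching policy number, or None for 'null'."""
    return policy_no if policy_no in CURRENT_POLICIES else None


def route(submitted):
    """Split submitted policies into (new, rejected) based on the lookup return."""
    new, rejected = [], []
    for policy_no in submitted:
        policy_no1 = lookup_policy(policy_no)  # value returned by the Lookup
        if policy_no1 is None:                 # ISNULL(POLICY_NO1): no match found
            new.append(policy_no)              # goes to the new-policy target
        else:
            rejected.append(policy_no)         # Default group: duplicate, rejected
    return new, rejected


new_rows, rejected_rows = route(["P-1001", "P-2000", "P-1002", "P-3000"])
```

Note that the test is applied to the value returned by the lookup, not to the incoming policy number, which is the distinction the text stresses.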

Performance Considerations

All rows pass through a connected Lookup, so performance may degrade when Lookups are executed for rows that do not need them. Caching a very large table may require a large amount of memory.

Option | Lookup Type | Description
Thousand Separator | Flat File | If you do not define a thousand separator for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can choose no separator, a comma, or a period. The default is no separator.
Decimal Separator | Flat File | If you do not define a decimal separator for a particular field in the lookup definition or on the Ports tab, the Integration Service uses the properties defined here. You can choose a comma or a period decimal separator. The default is period.
Case-Sensitive String Comparison | Flat File | If selected, the Integration Service uses case-sensitive string comparisons when performing lookups on string columns. Note: For relational lookups, the case-sensitive comparison is based on the database support.
Null Ordering | Flat File | Determines how the Integration Service orders null values. You can choose to sort null values high or low. By default, the Integration Service sorts null values high. Note: For relational lookups, null ordering is based on the database support.
Sorted Input | Flat File | Indicates whether or not the lookup file data is sorted.


Lesson 6-2. Reusable Transformations

You can create a reusable transformation in:

♦ Transformation Developer

♦ Mapping Designer and then ‘promote.’

Reusable transformations are listed in the Transformations node of the Navigator.

Drag and drop them in any mapping to make a shortcut and then override the properties as needed.

Key Points

♦ You can also copy them as non-reusable by depressing the Ctrl key while dragging.

♦ You can edit ports only in the Transformation Developer.

♦ Instances dynamically inherit changes.

♦ Source Qualifier transformations cannot be reusable.

♦ Changing reusable transformations can invalidate mappings.


Unit 6 Lab A: Load Employee Staging Table

Business Purpose

Information about Mersche Motors employees is saved to three text files each day. We must read each of these files individually and load them to the staging area. The files do not contain employee salary information, so we must find each employee's salary. We also must reformat some of the other fields.

Technical Description

We have three text files coming in daily with employee information that we would like to put into a file list. We need to find a salary for each employee, concatenate first name and last name, change the format of age and phone number and add a load date.

Goals

♦ Practice using Reusable Expression and Lookup to Flat File

Duration

45 Minutes


Velocity Deliverable: Mapping Specifications

Mapping Name: m_STG_EMPLOYEES_xx
Source System: Flat file | Target System: Oracle table
Initial Rows: 109 | Rows/Load: 109
Short Description: File list will be read, source data will be reformatted, a load date will be added and salary information for each employee will be added.
Load Frequency: Daily
Preprocessing: Target Append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

SOURCES

Files

File Name | File Location | Fixed/Delimited | Additional File Info
employees_central.txt, employees_east.txt, employees_west.txt (definition in employees_layout.txt) | C:\pmfiles\SrcFiles | Delimited | These 3 comma delimited flat files will be read into the session using a filelist employees_list.txt. The layout of the flat files can be found in employees_layout.txt.
employees_list.txt | C:\pmfiles\SrcFiles | File list |

TARGETS

Tables (Schema Owner: TDBUxx)

Table Name Update Delete Insert Unique Key
STG_EMPLOYEES X

LOOKUPS

Lookup Name: lkp_salary
Table: salaries.txt | Location: C:\pmfiles\LkpFiles
Match Condition(s): EMPLOYEE_ID = IN_EMPLOYEE_ID
Filter/SQL Override:


HIGH LEVEL PROCESS OVERVIEW

[Figure: process diagram with the objects Source, Expression, Lookup, and Target]

PROCESSING DESCRIPTION (DETAIL)

The mapping will read from three flat files contained in a file list. For each employee id, we will find the corresponding salary. First name and last name will be concatenated. Age and phone number will be re-formatted before loading into the STG_EMPLOYEES Oracle table.


SOURCE TO TARGET FIELD MATRIX

Target Table | Target Column | Data type | Source File | Source Column | Expression | Default Value if Null
STG_EMPLOYEES | EMPLOYEE_ID | number(p,s) | employees_layout | EMPLOYEE_ID | |
STG_EMPLOYEES | EMPLOYEE_NAME | varchar2 | employees_layout | Derived | Concatenate First Name and Last Name |
STG_EMPLOYEES | EMPLOYEE_ADDRESS | varchar2 | employees_layout | ADDRESS | |
STG_EMPLOYEES | EMPLOYEE_CITY | varchar2 | employees_layout | CITY | |
STG_EMPLOYEES | EMPLOYEE_STATE | varchar2 | employees_layout | STATE | |
STG_EMPLOYEES | EMPLOYEE_ZIP_CODE | number(p,s) | employees_layout | ZIP_CODE | |
STG_EMPLOYEES | EMPLOYEE_COUNTRY | varchar2 | employees_layout | COUNTRY | |
STG_EMPLOYEES | EMPLOYEE_PHONE_NMBR | varchar2 | employees_layout | Derived | The PHONE_NUMBER column is in the format of 9999999999 and needs to be reformatted to (999) 999-9999. |
STG_EMPLOYEES | EMPLOYEE_FAX_NMBR | varchar2 | employees_layout | FAX_NUMBER | |
STG_EMPLOYEES | EMPLOYEE_EMAIL | varchar2 | employees_layout | EMAIL | |
STG_EMPLOYEES | EMPLOYEE_GENDER | varchar2 | employees_layout | Derived | GENDER is currently either M or F. It needs to be Male, Female or UNK |
STG_EMPLOYEES | AGE_GROUP | varchar2 | employees_layout | Derived | The CUST_AGE_GROUP is derived from the decoding of AGE column. The valid age groups are less than 20, 20 to 29, 30 to 39, 40 to 49, 50 to 60 and Greater than 60 |
STG_EMPLOYEES | NATIVE_LANG_DESC | varchar2 | employees_layout | NATIVE_LANGUAGE | |
STG_EMPLOYEES | SEC_LANG_DESC | varchar2 | employees_layout | SECOND_LANGUAGE | |
STG_EMPLOYEES | TER_LANG_DESC | varchar2 | employees_layout | THIRD_LANGUAGE | |
STG_EMPLOYEES | POSITION_TYPE | varchar2 | employees_layout | POSITION_TYPE | |
STG_EMPLOYEES | REGIONAL_MANAGER | varchar2 | employees_layout | REGIONAL_MANAGER | |
STG_EMPLOYEES | DEALERSHIP_ID | number(p,s) | employees_layout | DEALERSHIP_ID | |
STG_EMPLOYEES | DEALERSHIP_MANAGER | varchar2 | employees_layout | DEALERSHIP_MANAGER | |
STG_EMPLOYEES | EMPLOYEE_SALARY | number(p,s) | employees_layout | Derived | A Salary field for each Employee ID can be found in salaries.txt. |
STG_EMPLOYEES | HIRE_DATE | date | employees_layout | HIRE_DATE | |
STG_EMPLOYEES | DATE_ENTERED | date | employees_layout | DATE_ENTERED | |
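The derived columns in the matrix can be illustrated with a short Python sketch. These functions only restate the rules from the Expression column; they are not the expressions used in the reusable transformation. The space between first and last name, the exact age-group label text, and the assumption that the phone number always has exactly ten digits are choices made for this example.

```python
def employee_name(first_name, last_name):
    """Concatenate first and last name (the separating space is an assumption)."""
    return f"{first_name} {last_name}"


def format_phone(digits):
    """Reformat 9999999999 as (999) 999-9999; assumes exactly ten digits."""
    return f"({digits[0:3]}) {digits[3:6]}-{digits[6:10]}"


def decode_gender(code):
    """Map M/F to Male/Female; anything else becomes UNK."""
    return {"M": "Male", "F": "Female"}.get(code, "UNK")


def age_group(age):
    """Decode AGE into the age groups listed in the matrix (label text assumed)."""
    if age < 20:
        return "Less than 20"
    if age <= 29:
        return "20 to 29"
    if age <= 39:
        return "30 to 39"
    if age <= 49:
        return "40 to 49"
    if age <= 60:
        return "50 to 60"
    return "Greater than 60"
```

For example, `format_phone("3035551234")` gives `(303) 555-1234`, and `decode_gender("X")` gives `UNK`.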


Instructions

Step 1: Create a Flat File Source Definition

1. Launch the Designer client tool (if it is not already running) and log into the PC8_DEV repository.

2. Import employees_layout.txt comma delimited flat file into your student folder. Make sure that you import the field names from the first line.

3. Save the repository.

Your source definition should look the same as displayed in Figure 6-1.

Step 2: Create a Relational Target Definition

1. In the Target Designer, import the STG_EMPLOYEES table.

2. Save the repository.

Your target definition should look the same as Figure 6-2.

Step 3: Create a Reusable Transformation

1. Open the mapping m_Stage_Customer_Contacts_xx.

Figure 6-1. Source Analyzer view of the employees_layout flat file definition

Figure 6-2. Target Designer view of the STG_EMPLOYEES relational table definition

Velocity Best Practice: A Velocity Design best practice is to use as many reusable transformations as possible. This decreases development time and keeps the mappings consistent.


2. Edit exp_Format_Name_Gender_Phone and check the Make reusable box on the Transformation tab.

3. Click Yes when you see the popup box.

4. Review the Transformation dialog box. What differences do you now see?

5. Select the Ports tab. Can you change anything here? Why are you unable to make changes?

6. Open the Transformation Developer by clicking the respective icon in the toolbar.


Figure 6-3. Transformation edit dialog box showing how to make a transformation reusable

Figure 6-4. Question box letting you know the action is irreversible

Tip: Converting a transformation to reusable is irreversible. The transformation will now be saved in the Transformations node within the Navigator window and will be available as a stand-alone object to drag into any mapping as a shortcut.

Figure 6-5. Transformation edit dialog box of a reusable transformation


7. From the Navigator Window, locate the Transformations node in your respective student folder.

8. Drag exp_Format_Name_Gender_Phone into the Transformation Developer workspace.

9. Edit exp_Format_Name_Gender_Phone and add the prefix re_ to rename it to re_exp_Format_Name_Gender_Phone_Load_Date.

10. Select the Ports tab.

a. Change the name of the OUT_CUST_NAME port to OUT_NAME.

b. Change the name of the OUT_CUST_PHONE port to OUT_PHONE.

c. Click OK.

11. Save the repository.

Step 4: Create a Mapping

1. Open the Mapping Designer by clicking the respective icon in the toolbar.

2. Create a new mapping named m_STG_EMPLOYEES_xx.

3. Add employees_layout.txt flat file source to the new mapping.

4. Add STG_EMPLOYEES relational target to the new mapping. Your mapping should appear similar to Figure 6-7.

Figure 6-6. Navigator window depicting the Transformations node

Velocity Best Practice: It is a Velocity recommendation that reusable transformations use the prefix re. Shortcuts should have the prefix sc (or SC if you prefer).

Figure 6-7. Partial mapping with source and target


Step 5: Create a Lookup Transformation

1. Select the Lookup transformation tool bar button located on the Transformations tool bar with a single left click. The selected icon in Figure 6-8 identifies the Lookup tool button.

2. Move your mouse pointer into the Mapping Designer Workspace and single click your left mouse button. This will create a new Lookup Transformation.

3. Choose Import > From Flat File for the location of the Lookup Table.

4. Locate the c:\pmfiles\LkpFiles directory and select the file salaries.txt. If the file is located in a different directory, your instructor will specify.

5. The Flat File Import Wizard will appear. Confirm that the Delimited option button is selected.

6. Select the Import field names from first line check box. Your Wizard should appear similar to Figure 6-10.

7. Click Next.

Figure 6-8. Transformation Toolbar

Figure 6-9. Lookup Transformation table location dialog box

Figure 6-10. Dialog box 1 of the 3 step Flat File Import Wizard


8. Confirm that only the Comma check box under Delimiters is selected.

9. Select the No quotes option button under Text Qualifier.

10. Click Next.

11. Confirm that the field names are displayed under Column Information. These were imported from the first line of the file.

12. Click Finish.

13. Confirm that your Lookup Transformation appears as displayed in Figure 6-11.

14. Drag and drop EMPLOYEE_ID from SQ_employees_layout to the new Lookup Transformation.

15. Edit the Lookup Transformation.

Rename it to lkp_salaries.

16. Click on the Ports tab.

Rename EMPLOYEE_ID1 to IN_EMPLOYEE_ID.

17. Uncheck the output port for IN_EMPLOYEE_ID.

18. Select the Condition tab.

19. Select the Add a new condition button. PowerCenter will choose the first lookup port and the first input port automatically.

Figure 6-11. Normal view of the newly created Lookup Transformation

Velocity Best Practice: Velocity naming conventions specify to name Lookup transformations lkp_LOOKUP_TABLE_NAME.

Velocity Best Practice: It is a Velocity best practice to prefix all input ports to an Expression or Lookup with IN_.


Your condition should look similar to Figure 6-12.

20. Click OK.

21. Save the repository.
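What the Lookup configured in Step 5 does can be pictured with a small Python sketch: the delimited lookup file is read into a cache keyed on EMPLOYEE_ID, and each incoming IN_EMPLOYEE_ID is matched against it. The file contents below are made up, and the helper names are hypothetical; the column names EMPLOYEE_ID and SALARY are taken from the lab.

```python
import csv
import io

# Made-up stand-in for salaries.txt: comma delimited, field names on the first line.
SALARIES_TXT = """EMPLOYEE_ID,SALARY
101,52000
102,61000
"""


def build_cache(text):
    """Read the lookup file once and key the rows on EMPLOYEE_ID."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["EMPLOYEE_ID"]: row["SALARY"] for row in reader}


def lookup_salary(cache, in_employee_id):
    """Apply the condition EMPLOYEE_ID = IN_EMPLOYEE_ID; None means no match."""
    return cache.get(in_employee_id)


cache = build_cache(SALARIES_TXT)
found = lookup_salary(cache, "101")
missing = lookup_salary(cache, "999")
```

In this sketch a later row with the same EMPLOYEE_ID would overwrite an earlier one; in the Lookup transformation that behavior is governed by the Lookup Policy on Multiple Match property.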

Step 6: Add a Reusable Expression Transformation

In the Navigator Window, under Transformations node, click and drag re_exp_Format_Name_Gender_Phone_Load_Date into the Mapping Designer Workspace.

Step 7: Link Transformations

1. Link the following ports from SQ_employees_layout to the STG_EMPLOYEES target:

EMPLOYEE_ID → EMPLOYEE_ID
ADDRESS → EMPLOYEE_ADDRESS
CITY → EMPLOYEE_CITY
STATE → EMPLOYEE_STATE
ZIP_CODE → EMPLOYEE_ZIP_CODE
COUNTRY → EMPLOYEE_COUNTRY
FAX_NUMBER → EMPLOYEE_FAX_NMBR
EMAIL → EMPLOYEE_EMAIL
NATIVE_LANGUAGE → NATIVE_LANG_DESC
SECOND_LANGUAGE → SEC_LANG_DESC
THIRD_LANGUAGE → TER_LANG_DESC
POSITION_TYPE → POSITION_TYPE
DEALERSHIP_ID → DEALERSHIP_ID
REGIONAL_MANAGER → REGIONAL_MANAGER
DEALERSHIP_MANAGER → DEALERSHIP_MANAGER
HIRE_DATE → HIRE_DATE
DATE_ENTERED → DATE_ENTERED

2. Save the repository.

3. Link the following ports from lkp_SALARIES to STG_EMPLOYEES:

SALARY → EMPLOYEE_SALARY

Figure 6-12. Lookup Transformation condition box

4. Link the following ports from SQ_employees_layout to re_exp_Format_Name_Gender_Phone_Load_Date:

FIRSTNAME → IN_FIRSTNAME
LASTNAME → IN_LASTNAME
PHONE_NUMBER → IN_PHONE_NUMBER
GENDER → IN_GENDER
AGE → AGE

5. Link the following ports from re_exp_Format_Name_Gender_Phone_Load_Date to STG_EMPLOYEES:

OUT_NAME → EMPLOYEE_NAME
OUT_PHONE → EMPLOYEE_PHONE_NMBR
OUT_GENDER → EMPLOYEE_GENDER
OUT_AGE_GROUP → AGE_GROUP

6. Save the repository.

Step 8: Create and Run the Workflow

1. Launch the Workflow Manager client and sign into your assigned folder.

2. Open the Workflow Designer tool and create a new workflow named wkf_STG_EMPLOYEES_xx.

3. Create a session task using the session task tool button.

4. Select m_STG_EMPLOYEES_xx from the Mapping list box and click OK.

5. Link the Start object to the s_m_STG_EMPLOYEES_xx session task object.

6. Edit the s_m_STG_EMPLOYEES_xx session.

7. Under the Mapping tab:

♦ Confirm that Source file directory is set to $PMSourceFileDir\.

♦ In Properties | Attribute | Source filename, type employees_list.txt.

♦ In Properties | Attribute | Source filetype, click the drop-down arrow and change the default from Direct to Indirect.

Your Mapping | Source | Properties | Attributes should be the same as Figure 6-13.

Figure 6-13. Source properties for the employee_list file list

♦ Select STG_EMPLOYEES located under the Target folder in the navigator window.

♦ Set the relational target connection object property to NATIVE_STGxx where xx is your student number.

♦ Check the Truncate target table option in the target properties.

♦ Select lkp_salaries from the Transformations folder in the navigator window.

♦ Verify the Lookup source file directory is $PMLookupFileDir\.

♦ Type salaries.txt in the Lookup filename.

8. Save the repository.

9. Check Validate messages to ensure your workflow is valid.

10. Start the workflow.

11. Review the Task Details.

12. Review the Source/Target Statistics.

Figure 6-14. Task Details of the completed session run

Figure 6-15. Source/Target Statistics of the completed session run

13. Use the Preview Data feature in the Designer to view the data results.

Figure 6-16. Data Preview of the STG_EMPLOYEES target table

Note: not all rows and columns are shown.
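Setting Source filetype to Indirect in Step 8 tells the session that employees_list.txt is a file list: it names the files to read rather than containing the data itself. A minimal Python sketch of that idea follows. The file names employees_east.txt and employees_west.txt, their contents, and the helper names are hypothetical, and the sketch assumes every listed file has the same layout with field names on the first line.

```python
import csv
import os
import tempfile

def read_indirect(list_path):
    """Read every file named in a file list and return the combined rows."""
    base_dir = os.path.dirname(list_path)
    rows = []
    with open(list_path, newline="") as file_list:
        for line in file_list:
            name = line.strip()
            if not name:
                continue  # ignore blank lines in the list
            with open(os.path.join(base_dir, name), newline="") as src:
                rows.extend(csv.DictReader(src))
    return rows

def demo():
    """Build a throwaway file list and two data files, then read them."""
    with tempfile.TemporaryDirectory() as src_dir:
        samples = {
            "employees_east.txt": "EMPLOYEE_ID,CITY\n1001,Boston\n",
            "employees_west.txt": "EMPLOYEE_ID,CITY\n2001,Seattle\n",
        }
        for name, text in samples.items():
            with open(os.path.join(src_dir, name), "w", newline="") as f:
                f.write(text)
        list_path = os.path.join(src_dir, "employees_list.txt")
        with open(list_path, "w", newline="") as f:
            f.write("\n".join(samples) + "\n")
        return read_indirect(list_path)

rows = demo()
print(len(rows))
```

With the filetype left at Direct, the equivalent would be to parse employees_list.txt itself as data, which is why the setting matters for this lab.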

Unit 6 Lab B: Load Date Staging Table

Business Purpose

The date staging area in the operational data store must be loaded with one record for each date covered in the data marts. Each date must be described with the date attributes used in the data mart, such as the month name, quarter number, whether the date is a weekday or a weekend, and so forth.

Technical Description

To load the date staging area, we will use Informatica date functions and variables to transform a date value and date id. The raw dates are in a flat file.

Goals

♦ Copy an Expression transformation to convert a string date to various descriptive date columns.

♦ Use the Expression Editor to create or view expressions and become familiar with date function syntax.

♦ Understand the evaluation sequence of input, output, and variable ports.

♦ Learn how to use variable ports.

Duration

30 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_STG_DATES_xx
Source System: Flat file
Target System: Oracle table
Initial Rows: 4019
Rows/Load: 4019
Short Description: A text file will run through an expression to do date manipulation and load to our date staging area.
Load Frequency: Once
Preprocessing: Target append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

SOURCES

Files

File Name | File Location | Fixed/Delimited | Additional File Info
dates.txt | C:\pmfiles\SrcFiles | Delimited | Comma delimiter

TARGETS

Tables
Schema Owner:
Table Name Update Delete Insert Unique Key
STG_DATES X

HIGH LEVEL PROCESS OVERVIEW

Source → Expression → Target

PROCESSING DESCRIPTION (DETAIL)

This mapping will generate the date staging table from the dates text file. The Expression transformation is used to derive the different date values.

SOURCE TO TARGET FIELD MATRIX

Target Table | Target Column | Source File | Source Column | Expression | Default Value if Null | Data Issues/Quality
STG_DATES | DATE_ID_LEGACY | dates.txt | DATE_ID
STG_DATES | DATE_VALUE | dates.txt | derived | Reformat the DATE column to MM/DD/YYYY
STG_DATES | DAY_OF_MONTH | dates.txt | derived | The current day of the current month. EX - TUESDAY
STG_DATES | MONTH_NUMBER | dates.txt | derived | The month number of the year.
STG_DATES | YEAR_VALUE | dates.txt | derived | The year for each record.
STG_DATES | DAY_OF_WEEK | dates.txt | derived | The day of the week for the record.
STG_DATES | DAY_NAME | dates.txt | derived | The name of the day for the record.
STG_DATES | MONTH_NAME | dates.txt | derived | The month name for the record.
STG_DATES | DAY_OF_YEAR | dates.txt | derived | The day number of the year for the record. EX - 1-365
STG_DATES | MONTH_OF_YEAR | dates.txt | derived | The month number of the year for the record.
STG_DATES | WEEK_OF_YEAR | dates.txt | derived | The week number of the year for the record.
STG_DATES | DAY_OVERALL | dates.txt | derived | The day number overall.
STG_DATES | WEEK_OVERALL | dates.txt | derived | The week number overall.
STG_DATES | MONTH_OVERALL | dates.txt | derived | The month number overall.
STG_DATES | YEAR_OVERALL | dates.txt | derived | The year number overall.
STG_DATES | HOLIDAY_INDICATOR | dates.txt | derived | This flag will tell us whether the record date is a holiday.
STG_DATES | WORKDAY_INDICATOR | dates.txt | derived | This flag will tell us whether the record date is a workday.
STG_DATES | WEEKDAY_INDICATOR | dates.txt | derived | This flag will tell us whether the record date is a weekday.
STG_DATES | WEEKEND_INDICATOR | dates.txt | derived | This flag will tell us whether the record date is a weekend.
STG_DATES | QUARTER_OF_YEAR | dates.txt | derived | The quarter number of the year.
STG_DATES | SEASON | dates.txt | derived | The current season.
STG_DATES | LAST_DAY_IN_MONTH | dates.txt | derived | Flag to indicate the current date is last day of the month.
STG_DATES | LAST_DAY_IN_QUARTER | dates.txt | derived | Flag to indicate the current date is last day of the quarter.
STG_DATES | LAST_DAY_IN_YEAR | dates.txt | derived | Flag to indicate the current date is the last day of the year.

Instructions

Step 1: Create a Flat File Source Definition

1. Launch the Designer (if it is not already running) and connect to the PC8_DEV repository.

2. Open your student folder.

3. Import the dates.txt comma delimited flat file source using the Flat File Wizard. Make sure that you import the field names from the first line. Note: Treat date as a string - it will be converted in the mapping.

4. Save the repository.

Step 2: Create a Relational Target Definition

1. Import the STG_DATES table using the Target Designer.

2. Save the repository.

Step 3: Create a Mapping

1. Create a new mapping named m_STG_DATES_xx.

2. Add the dates flat file source to the mapping.

3. Add the STG_DATES target to the mapping.

Your mapping should appear similar to Figure 6-17.

4. Expand the DEV_SHARED folder.

5. Expand the Transformations subfolder.

a. Select the re_exp_STG_DATES transformation.

b. With your left mouse button, drag the transformation toward your mapping but DO NOT DROP IT.

c. Hold down the Ctrl key.

d. Drop the transformation into the mapping.

e. If a Copy Confirmation message box appears, click “Yes.”

Note: If the confirmation box says “Shortcut” instead of “Copy”, try again and make sure that you hold down the Ctrl key continuously as you drop the transformation into the mapping.

Figure 6-17. Mapping with Source and Target definitions

6. Link the two output ports on the Source Qualifier to the two input ports on the Expression transformation, matching the names.

7. Use the “Autolink” feature to link the output ports in the Expression transformation to the corresponding fields in the target definition - by Position.

8. Save the mapping and confirm it is valid.

Your mapping will appear the same as in Figure 6-18.

9. Edit the Expression transformation and click on the Ports tab.

10. Examine the structure of the Expression transformation ports and expressions.

Note that the DATE_ID is an integer that is passed directly to the target table unchanged.

The input port DATE supplies a string that describes an individual date, such as 'May 20, 2005'. The variable ports will process that string in various ways in order to extract a specific descriptor, such as the day of the week, the quarter, the month, whether the date is a holiday, etc. These descriptors will later be used in the data warehouse to group and filter report data.

11. Examine some of the variable port expressions and see if you can determine how they work. You can use PowerCenter Help to view the syntax for any function. If you wish, ask your instructor for clarification on any of the expressions.

Note that variable ports cannot be output ports, so a separate set of output ports is used at the bottom of the transformation in order to output the data to the target. Most of these output ports simply call a variable port.

Variable ports were used in this transformation because they will be resolved one at a time, top to bottom. In this case, some of the later expressions are dependent on the results of the earlier expressions.

Figure 6-18. Completed Mapping

Tip: Informatica evaluates ports in the following order: input and input/output ports first, then variable ports, and then output ports. Variable ports are evaluated from top to bottom, so a variable must appear below any variable it depends on.
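The top-to-bottom evaluation of variable ports described above can be sketched in Python, where each assignment plays the role of a variable port and later assignments reuse earlier ones. This is an illustration only: the names prefixed v_, the Y/N flag values, and the specific derivations are assumptions and are not the actual expressions in re_exp_STG_DATES, which use PowerCenter date functions. The sketch assumes the input string looks like 'May 20, 2005' and that month names are in English.

```python
from datetime import datetime

def derive_date_attributes(date_string):
    """Derive descriptive columns from a string such as 'May 20, 2005'."""
    # "Variable ports": evaluated in order, each may use the ones above it.
    v_date = datetime.strptime(date_string, "%B %d, %Y")
    v_month_number = v_date.month
    v_quarter = (v_month_number - 1) // 3 + 1   # depends on v_month_number
    v_day_name = v_date.strftime("%A")
    v_is_weekend = v_day_name in ("Saturday", "Sunday")  # depends on v_day_name

    # "Output ports": most simply pass a variable through to the target.
    return {
        "DATE_VALUE": v_date.strftime("%m/%d/%Y"),
        "DAY_OF_MONTH": v_date.day,
        "MONTH_NUMBER": v_month_number,
        "YEAR_VALUE": v_date.year,
        "DAY_NAME": v_day_name,
        "DAY_OF_YEAR": v_date.timetuple().tm_yday,
        "QUARTER_OF_YEAR": v_quarter,
        "WEEKEND_INDICATOR": "Y" if v_is_weekend else "N",
    }

row = derive_date_attributes("May 20, 2005")
print(row["DATE_VALUE"], row["DAY_NAME"], row["QUARTER_OF_YEAR"])
```

Moving the v_quarter line above v_month_number would fail in Python; in a transformation the analogous mistake is a variable port that reads another variable before that variable has been evaluated for the current row.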

Step 4: Create a Workflow and a Session Task

1. Launch the Workflow Manager application (if it's not already running) and connect to the PC8_DEV repository.

2. Open your student folder.

3. Create a new workflow named wkf_Load_STG_DATES_xx.

4. Create a session named s_m_STG_DATES_xx that uses the m_STG_DATES_xx mapping.

5. Edit the session you just created.

a. Select the Mapping tab.

b. Select the Source Qualifier icon SQ_dates.

c. In the Properties area scroll down and confirm the source file name and location. Ensure that the Source Filename property value includes the .txt extension.

d. Select the target STG_DATES.

e. Select your appropriate target connection object.

f. Select the option “Truncate target table”.

6. Complete the workflow by linking the Start task to the session task.

7. Save the repository.

Step 5: Run the Workflow and Monitor the Results

1. Start the workflow.

2. Maximize the Workflow Monitor and select the Task View.

3. Review the Task Details.

Your information should appear the same as in Figure 6-19.

Figure 6-19. Task Details of the completed session run

4. Review the Source/Target Statistics.

Figure 6-20. Source/Target Statistics for the session run

Data Results

Use the Preview Data feature in the Designer to view the data results.

Your results should appear similar to those in Figure 6-21 through Figure 6-22.

Figure 6-21. Data preview of the STG_DATES table - screen 1

Figure 6-22. Data preview of the STG_DATES table - screen 2 scrolled right

Unit 7: Debugger

After completing this unit, you should be able to:

♦ Describe the Debugger

♦ Use the Debugger to troubleshoot a mapping problem

Lesson 7-1. Debugging Mappings

The Debugger is a wizard-driven Designer tool that runs a test session.

The Integration Service must be running before you start a debug session.

1. Start the Debugger. A spinning Debugger Mode icon is displayed; it stops spinning when the Integration Service is ready.

2. Choose an existing session or define a one-time debug session. Options:

♦ Load or discard target data

♦ Save debug environment for later use

3. Monitor the Debugger:

♦ Output window - view Debug or Session log.

♦ Transformation Instance Data window - view transformation data.

♦ Target Instance window - view target data.

4. Move through the session - menu options include:

♦ Next Instance. Runs until it reaches the next transformation or satisfies a breakpoint condition.

♦ Step to Instance. Runs until it reaches the selected transformation instance or satisfies a breakpoint condition.

♦ Show current instance. Displays the current instance in the Transformation Instance window.

♦ Continue. Runs until it satisfies a breakpoint condition.

♦ Break now. Pauses wherever it is currently processing.

5. Modify data and breakpoints. When the Debugger pauses, you can:

♦ Change data

♦ Change variable values

♦ Add or change breakpoints

Unit 7 Lab: Using the Debugger

Business Purpose

The m_STG_DATES_DEBUG mapping contains at least one error that results in bad data loaded into the target table. This error must be found and corrected so the data warehouse project will be successful.

Technical Description

The Debugger will be used to track down the cause of the error or errors.

Objectives

♦ Use the Debug Wizard.

♦ Use the Debug Toolbar.

Duration

30 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_STG_DATES_DEBUG
Source System: Flat file
Target System: Oracle table
Initial Rows: 4019
Rows/Load: 4019
Short Description: A text file is run through an Expression transformation to do date manipulation and load our date staging area.
Load Frequency: Once
Preprocessing: Target append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

Sources

Files

File Name | File Location | Fixed/Delimited | Additional File Info
dates.txt | C:\pmfiles\SrcFiles | Delimited | Comma delimiter

Targets

Tables
Schema Owner:
Table Name Update Delete Insert Unique Key
STG_DATES_VIEW X

High Level Process Overview

Source → Expression → Target

Processing Description (Detail)

This mapping will generate the date staging table from the dates text file.

Source To Target Field Matrix

Target Table | Target Column | Source File | Source Column | Expression | Default Value if Null
STG_DATES_VIEW | DATE_ID_LEGACY | dates.txt | DATE_ID
STG_DATES_VIEW | DATE_VALUE | dates.txt | derived | Reformat to MM/DD/YYYY
STG_DATES_VIEW | DAY_OF_MONTH | dates.txt | derived | Current day of current month
STG_DATES_VIEW | MONTH_NUMBER | dates.txt | derived | Month number of the year
STG_DATES_VIEW | MONTH_NAME | dates.txt | derived | Name of the month
STG_DATES_VIEW | YEAR_VALUE | dates.txt | derived | Year for each record

Instructions

Step 1: Copy and Inspect the Debug Mapping

1. Expand the DEV_SHARED folder.

a. Locate the mapping m_STG_DATES_DEBUG and copy it to your folder.

b. If a source or target conflict occurs choose Reuse.

2. Save the Repository.

3. Open the mapping in your workspace.

a. Get an overall idea what kind of processing is being done.

b. Read each of the expressions in the Expression transformation. Note that the mapping is a simplified version of the one used in Unit 6 Lab B.

You have been told only that there is an “error” in the data being written to the target, without any further clarification as to the nature of the error.

Step 2: Step Through the Debug Wizard

1. Press your F9 key. This invokes the Debug Wizard. The first page of the Debug Wizard is informational. Please read it.

2. Press the Next button.

Tip: Many mapping errors can be found by carefully inspecting the mapping - without using the Debugger. However, if the error cannot be located in a timely fashion in this manner, the Debugger will assist you by showing the actual data passing through the transformation ports. In order to properly use the Debugger, you must first understand the logic of the mapping.

Tip: The Debugger requires a valid mapping and session to run; it cannot help you determine why a mapping is invalid. The Designer Output Window will show you the reason(s) why a mapping is invalid.

Your Wizard should appear similar to Figure 7-1 below. Accept the default setting - Create a debug session instance for this mapping, and press the Next button.

The next page of the Wizard allows you to set connectivity properties. This information is familiar to you from creating sessions, except that here it is a subset of the regular session options and is formatted somewhat differently.

3. Set the Target Connection Value to your target schema database connection object. The debugger data will be discarded in a later step so this value will be ignored.

4. Select the Properties tab at the bottom. Your Wizard should appear as in Figure 7-2 below.

♦ Ensure that the Source Filename property value includes the .txt extension. In this lab, verify that you enter dates.txt.

♦ Ensure that the Target load type property value is set to Normal.

Figure 7-1. Debug Session creation dialog box

Figure 7-2. Debug Session connections dialog box

5. Press the Next button.

6. We will not be overriding transformation properties, so press Next again.

7. Accept the defaults on the Session Configuration Wizard page and press Next.

8. The final Wizard page allows us to choose whether or not to discard the target data (the default) and choose which target data to view. Accept the defaults here as well.

9. When you press the Finish button, a Debug session will be created and it will initialize, opening the required database connections. No data will be read until we are ready to view it.

Step 3: Use the Debugger to Locate the Error

When the Debug Wizard Finish button is pressed, the appearance of the Designer interface will change, and it will likely require some minor adjustment to make it more readable. Note that three window panes are visible at the bottom third of the screen. Adjust the horizontal dividers with your mouse until what you see resembles Figure 7-3.

1. Set the Target Instance and Instance drop-boxes as shown in Figure 7-3 as well.

Note: The term instance is sometimes used as a synonym for transformation.

As mentioned earlier, the Debug session is initialized at this point but no data is read. We will manually control the debugger so we can easily review the data values and spot the error. The debugger can be controlled via the Designer menu, via hotkeys (described in the menu), or with the Debug Toolbar. We will use the toolbar.

Figure 7-3. Designer while running a Debug Session

2. The Debug Toolbar is not visible by default. To make it visible, select the menu option Tools > Customize. You will see the dialog box shown in Figure 7-4.

3. Select the Debugger toolbar.

4. Click OK.

The Debug Toolbar is short. When it is undocked, it appears as in Figure 7-5. If you cannot see it right away, look for the red “stop sign” on the right.

5. You can cause one row of data to be read by the Source Qualifier by pressing the third toolbar button - tooltip Next Instance.

Note that some data is shown in the Instance window.

6. Toggle the Instance drop-box to the Expression transformation. The data has not yet gone that far.

Note: No data available means null in the Debugger.

7. Press the fourth toolbar button - tooltip “Step to Instance.”

Note that one more row has been read, and the first row has been “pushed” into the Expression transformation and the Target table.

8. Press the Next Instance toolbar button (third) several times. Note that each time it is pressed, one more row is read and one more row (the row that was read from the previous press) is loaded into the target. The Instance window jumps between the Source Qualifier and the Expression (i.e., it follows the row).

Figure 7-4. Customize Toolbars Dialog Box

Figure 7-5. Debugger Toolbar

Tip: If you cannot find the Debugger Toolbar after using the menu option to select it, another toolbar has shifted it off the screen. Rearrange the other docked toolbars until you can see it.

9. Press the Step to Instance toolbar button (fourth) several times. Note that it also causes one row to be read and written, but the Instance window shows only the data in one transformation - the one chosen in the drop-box.

10. Examine the data being sent to the target. What is the error? Hint: compare the values with the actual date being read from the source file.

Now that you are familiar with the basics of operating the Debugger, locate the cause of the error.

Step 4: Fix the Error and Confirm the Data is Correct

When you have found the error, you will not be able to fix it while the Debugger is running (try it). The mapping properties are grayed-out because there is an “in-use” lock on the mapping.

1. Stop the Debugger by pressing the second toolbar button. Press Yes.

2. Fix the mapping error.

3. Save the Repository.

4. Restart the Debug Wizard as in Step 2. Note that your Debug session properties (such as connectivity) have been saved locally, making it easier for you to invoke the Debugger again if needed.

5. Confirm that the data being sent to the target is now correct.

Unit 8: Sequence Generator

After completing this unit, you should be able to:

♦ Describe the Sequence Generator transformation

♦ Use the Sequence Generator transformation in a mapping

Lesson 8-1. Sequence Generator Transformation

Type

Passive.

Description

The Sequence Generator transformation generates unique numeric values that can be used to create keys. The values are sequential but not guaranteed to be contiguous. The Sequence Generator is an output-only transformation with two output ports, NEXTVAL and CURRVAL. Typically, you connect the NEXTVAL port to generate a new key. When NEXTVAL is connected to multiple targets, each target receives its own sequential values, so the same row gets a different key in each target. To use the same value for each target, pass the output of the Sequence Generator to an Expression transformation before connecting it to the targets.
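The difference between connecting NEXTVAL directly to several targets and routing it through an Expression transformation first can be sketched in Python. This is a conceptual model only: the class and function names are hypothetical, and real key allocation across targets is decided by the Integration Service, so the exact values each target receives may differ from this sketch.

```python
import itertools

class SequenceSketch:
    """Toy stand-in for a Sequence Generator with start value 1, increment 1."""

    def __init__(self, start=1, increment=1):
        self._counter = itertools.count(start, increment)

    def nextval(self):
        """Return the next value; every call consumes a new number."""
        return next(self._counter)

def load_direct(rows):
    """NEXTVAL drawn separately for each target: keys differ per target."""
    seq = SequenceSketch()
    target_a = [(seq.nextval(), row) for row in rows]
    target_b = [(seq.nextval(), row) for row in rows]
    return target_a, target_b

def load_via_expression(rows):
    """NEXTVAL drawn once per row, then the same key is passed to both targets."""
    seq = SequenceSketch()
    keyed = [(seq.nextval(), row) for row in rows]  # the "Expression" step
    return list(keyed), list(keyed)

direct_a, direct_b = load_direct(["x", "y"])
shared_a, shared_b = load_via_expression(["x", "y"])
print(direct_a, direct_b)
print(shared_a, shared_b)
```

In the first case no key appears in both targets; in the second, each row carries one key into both targets, which is what a shared surrogate key requires.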

Properties

For more details, see the online help.

Business Purpose

A business receives customer information which is used to update a data warehouse customer dimension table with a customer history. A sequence generator is used to create surrogate keys to maintain referential integrity within the dimension table since a customer may have duplicate entries.

Example

The following example shows a partial mapping where the sequence generator is used to generate a new key for the Dates dimension table.

Performance Considerations

Place the Sequence Generator transformation as close to the target as possible in a mapping. Otherwise, the mapping carries the generated sequence numbers through transformations that never use them.

Unit 8 Lab: Load Date Dimension Table

Business Purpose

The Mersche Motors data warehouse has a date dimension table that needs to be loaded. The date dimension needs to be loaded before any of the other dimension tables.

Technical Description

PowerCenter will extract the dates from a shared relational table and load them into a shared relational table. All columns in the source table have matching columns in the target table. A primary key for the target table will be assigned using the Sequence Generator transformation.

Goals

♦ Create sources and targets based on shortcuts

♦ Create a Sequence Generator transformation

♦ Create unique integer primary key values using the NEXTVAL port

Duration

20 Minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_DIM_DATES_LOAD_xx
Source System: Oracle Table
Target System: Oracle Table
Initial Rows: 4019
Rows/Load: 4019
Short Description: Source relational table will be directly loaded into a relational target. The primary key for the target table will be assigned by a sequence generator.
Load Frequency: Once
Preprocessing:
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

SOURCES

Tables

Table Name | Schema/Owner | Selection/Filter
STG_DATES | TDBUxx

TARGETS

Tables
Schema Owner: TDBUxx
Table Name Update Delete Insert Unique Key
DIM_DATES X DATE_KEY

HIGH LEVEL PROCESS OVERVIEW

Relational Source and Sequence Generator → Relational Target

PROCESSING DESCRIPTION (DETAIL)

The Sequence Generator transformation will be used to assign unique integer values for the DATE_KEY field as rows are passed from the STG_DATES table to the DIM_DATES table.

SOURCE TO TARGET FIELD MATRIX

Target Table | Target Column | Source File | Source Column | Expression | Default Value if Null
DIM_DATES | DATE_KEY | STG_DATES | derived | NEXTVAL from Sequence Generator
DIM_DATES | DATE_VALUE | STG_DATES | DATE_VALUE
DIM_DATES | DATE_ID_LEGACY | STG_DATES | DATE_ID_LEGACY
DIM_DATES | DATE_OF_MONTH | STG_DATES | DATE_OF_MONTH
DIM_DATES | MONTH_NUMBER | STG_DATES | MONTH_NUMBER
DIM_DATES | YEAR_VALUE | STG_DATES | YEAR_VALUE
DIM_DATES | DAY_OF_WEEK | STG_DATES | DAY_OF_WEEK
DIM_DATES | DAY_NAME | STG_DATES | DAY_NAME
DIM_DATES | MONTH_NAME | STG_DATES | MONTH_NAME
DIM_DATES | DAY_OF_YEAR | STG_DATES | DAY_OF_YEAR
DIM_DATES | MONTH_OF_YEAR | STG_DATES | MONTH_OF_YEAR
DIM_DATES | WEEK_OF_YEAR | STG_DATES | WEEK_OF_YEAR
DIM_DATES | DAY_OVERALL | STG_DATES | DAY_OVERALL
DIM_DATES | WEEK_OVERALL | STG_DATES | WEEK_OVERALL
DIM_DATES | MONTH_OVERALL | STG_DATES | MONTH_OVERALL
DIM_DATES | YEAR_OVERALL | STG_DATES | YEAR_OVERALL
DIM_DATES | HOLIDAY_INDICATOR | STG_DATES | HOLIDAY_INDICATOR
DIM_DATES | WORKDAY_INDICATOR | STG_DATES | WORKDAY_INDICATOR
DIM_DATES | WEEKDAY_INDICATOR | STG_DATES | WEEKDAY_INDICATOR
DIM_DATES | WEEKEND_INDICATOR | STG_DATES | WEEKEND_INDICATOR
DIM_DATES | QUARTER_OF_YEAR | STG_DATES | QUARTER_OF_YEAR
DIM_DATES | SEASON | STG_DATES | SEASON
DIM_DATES | LAST_DAY_IN_MONTH | STG_DATES | LAST_DAY_IN_MONTH
DIM_DATES | LAST_DAY_IN_QUARTER | STG_DATES | LAST_DAY_IN_QUARTER
DIM_DATES | LAST_DAY_IN_YEAR | STG_DATES | LAST_DAY_IN_YEAR

Instructions

Step 1: Copy a Shared Relational Source Table

1. Expand the DEV_SHARED folder and locate the source definition STG_DATES in the ODBC_STG node. Notice that this STG_DATES object is a source, while the STG_DATES that you have already used is a target.

2. Ensure that your student folder is open.

3. Copy the STG_DATES source definition from the DEV_SHARED folder into your student folder.

4. Save your work.

Step 2: Create a Shortcut to a Shared Relational Target Table

1. In the DEV_SHARED folder, locate the target DIM_DATES.

2. Make sure your student folder is open.

3. Highlight the target and drag it to your student folder.

4. Rename the target as SC_DIM_DATES.

5. Save your work.

You will now be able to see the SC_DIM_DATES shortcut in your own student folder.

Step 3: Create a Mapping

1. Create a new mapping named m_DIM_DATES_LOAD_xx.

2. Add the STG_DATES relational source to the new mapping.

3. Add the SC_DIM_DATES relational target to the new mapping.

4. Expand the mapping objects.

Velocity Best Practice: The SC_ prefix is the Velocity Best Practice naming convention for shortcut objects.

Your mapping should appear similar to Figure 8-1.

Step 4: Create a Sequence Generator Transformation

1. From the Transformation toolbar, select the Sequence Generator transformation icon.

2. Position the Sequence Generator transformation before the target.

3. From the Sequence Generator transformation select the NEXTVAL port and link it to the DATE_KEY column of the SC_DIM_DATES target.

4. Rename the sequence generator seq_DIM_DATES_DATE_KEY.

Figure 8-1. Expanded view of m_DIM_DATES_LOAD

Figure 8-2. Sequence Generator Transformation icon

Tip: You can create approximately two billion primary or foreign key values with the Sequence Generator transformation by connecting the NEXTVAL port to the desired transformation or target and using the widest range of values (1 to 2147483647) with the smallest interval (1).
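The "approximately two billion" figure in the tip can be checked with a little arithmetic. The sketch below is only an illustration of that count, not PowerCenter code; the function name `count_sequence_values` is hypothetical, and the start, end, and increment values are the ones quoted in the tip.

```python
# Values quoted in the tip: range 1 to 2147483647, interval 1.
START_VALUE = 1
END_VALUE = 2_147_483_647
INCREMENT_BY = 1


def count_sequence_values(start, end, increment):
    """Count the values start, start + increment, ... that do not exceed end."""
    if increment <= 0 or end < start:
        return 0
    return (end - start) // increment + 1


total = count_sequence_values(START_VALUE, END_VALUE, INCREMENT_BY)
print(total)                # 2147483647
print(total == 2**31 - 1)   # True: the largest signed 32-bit integer
```

A larger interval shrinks the number of available keys proportionally, which is why the tip pairs the widest range with the smallest interval.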

Figure 8-3. Normal view of the sequence generator NEXTVAL port connected to a target column

5. Select the Properties tab and observe the properties available in the sequence generator.

a. Check the “Reset” Attribute Value.

b. Describe the following properties. Use the Help system to find the answers.

♦ Increment by:________________________________________________

♦ Current value:________________________________________________

6. Click the OK button to return to the Normal view of the sequence generator.

7. Save your work.

Step 5: Link the Target Table

1. Link all the ports from the Source Qualifier transformation to the corresponding columns in the target object using Autolink by name. See Figure 8-4.

2. Save your work.

3. Verify your mapping is valid in the Output window. If the mapping is not valid, correct the invalidations that are displayed in the message.

Step 6: Create and Run the Workflow

1. Launch the Workflow Manager (if not already running) and connect to the repository and open your student folder.

2. From Workflow Designer create a new workflow named wkf_DIM_DATES_LOAD_xx.

3. Use the Session task icon and create a new Session task.

4. Associate the m_DIM_DATES_LOAD_xx mapping to the new session task.

5. Link the Start object to the s_m_DIM_DATES_LOAD_xx session task object.

Figure 8-4. Normal view of connected ports to the target

6. Edit the s_m_DIM_DATES_LOAD_xx session task and set the following options in the Mapping tab:

♦ Select SQ_STG_DATES from the Sources folder in the navigator window.

♦ Set the Connections Value to your assigned NATIVE_STGxx connection value.

♦ Select SC_DIM_DATES from the Target folder in the navigator window.

♦ Set the Connections Value to your assigned NATIVE_EDWxx connection value.

♦ Set the Target Load type to Normal.

♦ Check the Truncate target table option in the target properties.

7. Save your work.

8. Check Validate messages to ensure your workflow is valid. If you received an invalid message, correct the problem(s) and re-validate/save.

9. Start the workflow.

10. Review the Task Details. Your information should appear similar to Figure 8-5.

11. Select the Source/Target Statistics tab. Your statistics should be similar to Figure 8-6.

Figure 8-5. Task Details of the completed session run

Figure 8-6. Source/Target statistics for the session run

Data Results

Preview the target data from the Designer. Your data should appear similar to Figure 8-7.

Figure 8-7. Data Preview of the DIM_DATES table

Unit 8 Quiz

1. When would you use the Sequence Generator transformation?

2. What is the relationship between the values of CURRVAL and NEXTVAL?

Unit 9: Lookup Caching, More Features and Techniques

After completing this unit, you should be able to:

♦ Define Persistent Lookup cache

♦ Use Persistent Lookup caching in a mapping

♦ Use additional Designer features and techniques

Lesson 9-1. Lookup Caching

Description

The Lookup transformation allows you to cache the lookup table in memory. This is the default.

Properties

This section will discuss the cache related properties. Dynamic cache will be discussed in a later module.

Option | Lookup Type | Description

Lookup Caching Enabled | Flat File, Relational | Indicates whether the Integration Service caches lookup values during the session.

Lookup Cache Directory Name | Flat File, Relational | Specifies the directory used to build the lookup cache files when you configure the Lookup transformation to cache the lookup source. Also used to save the persistent lookup cache files when you select the Lookup Persistent option. By default, the Integration Service uses the $PMCacheDir directory configured for the Integration Service process.

For more detailed information refer to the online help.

Lookup Cache

How it Works

♦ There are two types of cache memory: the index cache and the data cache.

♦ The index cache contains all port values from the lookup table where the port is specified in the lookup condition.

♦ The data cache contains all port values from the lookup table that are not in the lookup condition and that are specified as “output” ports.

♦ After the cache is loaded, values from the Lookup input port(s) that are part of the lookup condition are compared to the index cache.

♦ On a match, the matching rows from the cache are added to the data stream.
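The split between index cache and data cache can be pictured with a small in-memory model. This is a conceptual sketch only, assuming a plain dictionary and list; it does not reflect how the Integration Service lays out its cache files. The sample rows and the function names `build_cache` and `lookup` are hypothetical.

```python
from collections import defaultdict

# Hypothetical lookup table rows (not taken from the lab data).
LOOKUP_ROWS = [
    {"DATE_VALUE": "2006-01-01", "DATE_KEY": 1, "SEASON": "Winter"},
    {"DATE_VALUE": "2006-01-02", "DATE_KEY": 2, "SEASON": "Winter"},
]


def build_cache(rows, condition_ports, output_ports):
    """Load condition ports into an index cache and output ports into a data cache."""
    index_cache = defaultdict(list)  # condition values -> positions in the data cache
    data_cache = []                  # output port values for each cached row
    for row in rows:
        key = tuple(row[port] for port in condition_ports)
        index_cache[key].append(len(data_cache))
        data_cache.append({port: row[port] for port in output_ports})
    return index_cache, data_cache


def lookup(index_cache, data_cache, input_values):
    """Compare the input values to the index cache and return the matching rows."""
    positions = index_cache.get(tuple(input_values), [])
    return [data_cache[i] for i in positions]


index_cache, data_cache = build_cache(
    LOOKUP_ROWS, condition_ports=["DATE_VALUE"], output_ports=["DATE_KEY", "SEASON"]
)
print(lookup(index_cache, data_cache, ["2006-01-02"]))
```

In this model, removing a port from `output_ports` shrinks the data cache, which mirrors the advice to turn off unused output ports.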

Key Point

If there is not enough memory specified in the index and data cache properties, the overflow will be written out to disk.

Performance Considerations

Lookup caching typically improves performance because the Integration Service need not execute an external read request for each lookup. However, this is true only if loading the lookup cache takes less time than the external read requests would take. To reduce the amount of cache required, turn off or delete any unused output ports in the Lookup transformation. You can also index the lookup file to speed the retrieval time. You can use WHERE clauses in the SQL override to minimize the amount of data written to cache.
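The break-even condition described above can be written as a simple cost comparison. The sketch below is a rough model under stated assumptions: every timing figure is a made-up placeholder, and the function names are hypothetical. Real sessions should be measured rather than estimated this way.

```python
def cached_cost(cache_load_seconds, rows, seconds_per_cache_probe):
    """One-time cost of building the cache plus a cheap in-memory probe per row."""
    return cache_load_seconds + rows * seconds_per_cache_probe


def uncached_cost(rows, seconds_per_external_read):
    """One external read request per row that needs a lookup."""
    return rows * seconds_per_external_read


def caching_pays_off(cache_load_seconds, rows, seconds_per_cache_probe,
                     seconds_per_external_read):
    """True when the cached approach is estimated to be faster."""
    return (cached_cost(cache_load_seconds, rows, seconds_per_cache_probe)
            < uncached_cost(rows, seconds_per_external_read))


# Placeholder timings: 10 s to load the cache, 0.01 ms per probe, 2 ms per external read.
print(caching_pays_off(10.0, 100_000, 0.00001, 0.002))  # many rows need a lookup
print(caching_pays_off(10.0, 100, 0.00001, 0.002))      # only a few rows need a lookup
```

With these placeholder numbers the first call favors caching and the second does not, because the fixed cache load time is spread over far fewer lookups.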

Option | Lookup Type | Description

Lookup Cache Persistent | Flat File, Relational | Indicates whether the Integration Service uses a persistent lookup cache.

Lookup Data Cache Size | Flat File, Relational | Indicates the maximum size the Integration Service allocates to the data cache in memory. When the Integration Service cannot store all the data cache data in memory, it pages to disk as necessary.

Lookup Index Cache Size | Flat File, Relational | Indicates the maximum size the Integration Service allocates to the index cache in memory. When the Integration Service cannot store all the index cache data in memory, it pages to disk as necessary.

Cache File Name Prefix | Flat File, Relational | Use only with persistent lookup cache. Specifies the file name prefix to use with persistent lookup cache files.

Recache From Lookup Source | Flat File, Relational | Use only with the lookup cache enabled. When selected, the Integration Service rebuilds the lookup cache from the lookup source when it first calls the Lookup transformation instance. If you use a persistent lookup cache, it rebuilds the persistent cache files before using the cache. If you do not use a persistent lookup cache, it rebuilds the lookup cache in memory before using the cache.

Rule Of Thumb

Cache if the number (and size) of records in the lookup table is small relative to the number of mapping rows requiring a lookup.

Unit 9 Lab A: Load Promotions Dimension Table (Lookup and Persistent Cache)

Business Purpose

Mersche Motors runs a number of promotions that begin and end on certain dates. The promotions are stored in the promotions dimension table. This table stores the start and expiry dates as date keys that reference the date dimension table.

Technical Description

The DIM_PROMOTIONS table requires start and expiration date keys. These exist in the DIM_DATES table that was populated in the previous lab. To obtain these date keys, which were created by the sequence generator, it will be necessary to perform a Lookup to the DIM_DATES table in the EDW database. The DIM_DATES table changes infrequently so it will be loaded into cache in a persistent state. The lookup cache will be used often by other Mappings that load Dimension tables.

Goals

Understand how to configure and use a persistent Lookup cache.

Duration

25 minutes

Velocity Deliverable: Mapping Specifications

Mapping Name m_DIM_PROMOTIONS_LOAD_xx

Source System Oracle Table

Target System Oracle Table

Initial Rows 6

Rows/Load 6

Short Description Promotion data is run through the mapping and a lookup must be performed to the DIM_DATES table to acquire the date keys for the start date and expiration date in the DIM_PROMOTIONS table.

Load Frequency To be determined

Preprocessing DIM_DATES must be loaded

Post Processing

Error Strategy Default

Reload Strategy

Unique Source Fields

SOURCES

Tables

Table Name Schema/Owner Selection/Filter

STG_PROMOTIONS TDBUxx None

TARGETS

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

DIM_PROMOTIONS X PROMO_ID

LOOKUPS

Lookup Name lkp_START_DATE_KEY

Table DIM_DATES

Location EDW

Match Condition(s) DIM_DATES.DATE_VALUE = STG_PROMOTIONS.START_DATE

Filter/SQL Override

Lookup Name lkp_EXPIRY_DATE_KEY

Table DIM_DATES

Location EDW

Match Condition(s) DIM_DATES.DATE_VALUE = STG_PROMOTIONS.EXPIRY_DATE

Filter/SQL Override

HIGH LEVEL PROCESS OVERVIEW

[Diagram: boxes labeled Source, Lookup, Lookup, and Target]

PROCESSING DESCRIPTION (DETAIL)

This mapping will populate the DIM_PROMOTIONS table with data. In order to successfully populate the DIM_PROMOTIONS table there must be two Lookups to the DIM_DATES table to acquire values for the START_DK and EXPIRY_DK date keys. Students will need to determine which columns to use for the condition in the Lookup Transformation.

SOURCE TO TARGET FIELD MATRIX

Note: This lab requires the successful completion of the Unit 8 Lab.

Target Table | Target Column | Source Table | Source Column | Expression | Default Value if Null

DIM_PROMOTIONS PROMO_ID STG_PROMOTIONS PROMO_ID

DIM_PROMOTIONS PROMO_DESC STG_PROMOTIONS PROMO_DESC

DIM_PROMOTIONS PROMO_TYPE STG_PROMOTIONS PROMO_TYPE

DIM_PROMOTIONS START_DK DIM_DATES DATE_KEY

DIM_PROMOTIONS EXPIRY_DK DIM_DATES DATE_KEY

DIM_PROMOTIONS PROMO_COST STG_PROMOTIONS PROMO_COST

DIM_PROMOTIONS DISCOUNT STG_PROMOTIONS DISCOUNT

Instructions

Step 1: Create a Shortcut to a Shared Relational Source Table

1. In the Source Analyzer, create a shortcut to the STG_PROMOTIONS source table from the DEV_SHARED > Sources > ODBC_STG folder.

2. Rename the shortcut to SC_STG_PROMOTIONS.

3. Save your work.

Step 2: Create a Shortcut to Shared Relational Target Table

1. In the Target Designer, create a shortcut to the DIM_PROMOTIONS target table from the DEV_SHARED > Targets folder.

2. Rename the shortcut to SC_DIM_PROMOTIONS.

Note: If the SC_DIM_DATES target table is not displayed in the Target Designer, drag it in from the Targets folder in your student folder. Notice the primary key-foreign key relationships.

3. Save your work.

Step 3: Create a Mapping

1. Create a new mapping named m_DIM_PROMOTIONS_LOAD_xx.

2. Add the source definition shortcut SC_STG_PROMOTIONS to the mapping.

3. Add the target definition shortcut SC_DIM_PROMOTIONS to the mapping.

4. Arrange transformations appropriately and Autolink the ports “By Name” between:

♦ SQ_SC_STG_PROMOTIONS and SC_DIM_PROMOTIONS.

5. Save your work. It should look like the mapping in Figure 9-1.

Step 4: Create Lookups for the Start and Expiry Date Keys

1. Examine Figure 9-1.

2. In Figure 9-1, compare START_DATE and EXPIRY_DATE in SQ_SC_STG_PROMOTIONS to START_DK and EXPIRY_DK in the SC_DIM_PROMOTIONS target table. Notice that these two ports are not connected and that the datatypes are different. The target requires key values (number), not dates.

In what table do these Date Key values exist? _________________________.

Figure 9-1. m_DIM_PROMOTIONS_LOAD mapping

3. Examine Figure 9-2.

The date dimension table (DIM_DATES) was populated by the previous lab. The DATE_KEY was generated by the seq_DIM_DATES_DATE_KEY Sequence Generator transformation, and DATE_VALUE has a datatype of date/time.

4. To acquire the value for the START_DK in the DIM_PROMOTIONS target you need to perform a Lookup on the DIM_DATES table.

You will base the Lookup Condition on the ____________________ port from SQ_SC_STG_PROMOTIONS Source Qualifier and the ____________________ column in the DIM_DATES Lookup table.

5. Similarly, to acquire the value for the EXPIRY_DK in the DIM_PROMOTIONS target you will need a second Lookup on the DIM_DATES table.

You will base the Lookup Condition on the ____________________ port from SQ_SC_STG_PROMOTIONS Source Qualifier and the ____________________ column in the DIM_DATES Lookup table.

6. Add a Lookup Transformation to the mapping based on the SC_DIM_DATES (shortcut to DIM_DATES) target table.

Figure 9-2. m_DIM_DATES from the previous lab that populated the DIM_DATES table

Figure 9-3. Select Lookup Table

7. Rename the Lookup Transformation to lkp_START_DATE_KEY.

8. Click OK.

9. Click Yes to acknowledge the “Look up condition is empty” message. You will define the condition shortly.

10. Drag and drop the START_DATE port from the SQ_SC_STG_PROMOTIONS Source Qualifier to an empty port in the lkp_START_DATE_KEY transformation.

11. Make START_DATE input only.

12. Rename START_DATE to IN_START_DATE.

13. Define the Lookup Condition to look like Figure 9-4:

14. On the Properties tab, verify the following values:

♦ Lookup Table Name = DIM_DATES (default).

♦ Lookup Caching Enabled = Checked (default).

♦ Lookup Cache Persistent = Checked (needs to be set).

♦ Cache File Name Prefix = LKPSTUxx (where xx is your student number).

15. Link the DATE_KEY port from the lkp_START_DATE_KEY transformation to the START_DK port in the SC_DIM_PROMOTIONS target.

16. Save your work.

Note: Notice that this transformation has many ports. We could have unchecked the Output column on all ports except the ones that we need, but since this Lookup transformation will be persistent, doing so would have limited its usefulness for other mappings that might leverage it.

The lkp_START_DATE_KEY transformation will not retrieve values for EXPIRY_DK because the lookup conditions will be different.

17. Create a second Lookup transformation called lkp_EXPIRY_DATE_KEY by selecting the lkp_START_DATE_KEY transformation and pressing Ctrl+C and Ctrl+V.

18. Make the changes necessary to the Lookup to ensure that the EXPIRY_DATE finds the proper DATE_KEY.

a. Rename it to lkp_EXPIRY_DATE_KEY.

b. Rename port IN_START_DATE to IN_EXPIRY_DATE.

c. Verify the Lookup Condition is correct.

19. Link the EXPIRY_DATE port from SQ_SC_STG_PROMOTIONS Source Qualifier to the IN_EXPIRY_DATE port in the lkp_EXPIRY_DATE_KEY transformation.

Figure 9-4. Lookup Condition

20. Link the DATE_KEY port from the lkp_EXPIRY_DATE_KEY transformation to the EXPIRY_DK port in the SC_DIM_PROMOTIONS target.

21. Save your work.
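The effect of the two lookups built in this step can be sketched outside PowerCenter. The following is an illustrative model only: the sample rows are invented, `load_promotion` is a hypothetical helper, and returning `None` when no date matches is an assumption standing in for a NULL lookup result.

```python
from datetime import date

# Invented sample data; the real tables hold the lab's own rows.
DIM_DATES = [
    {"DATE_KEY": 1, "DATE_VALUE": date(2006, 1, 1)},
    {"DATE_KEY": 2, "DATE_VALUE": date(2006, 1, 2)},
    {"DATE_KEY": 3, "DATE_VALUE": date(2006, 1, 3)},
]
STG_PROMOTIONS = [
    {"PROMO_ID": 10, "START_DATE": date(2006, 1, 1), "EXPIRY_DATE": date(2006, 1, 3)},
]

# Both lookups read the same table, so a single cache keyed on DATE_VALUE serves both.
date_key_by_value = {row["DATE_VALUE"]: row["DATE_KEY"] for row in DIM_DATES}


def load_promotion(promo):
    """Resolve the start and expiry dates of one promotion to date keys."""
    return {
        "PROMO_ID": promo["PROMO_ID"],
        # lkp_START_DATE_KEY: DIM_DATES.DATE_VALUE = STG_PROMOTIONS.START_DATE
        "START_DK": date_key_by_value.get(promo["START_DATE"]),
        # lkp_EXPIRY_DATE_KEY: DIM_DATES.DATE_VALUE = STG_PROMOTIONS.EXPIRY_DATE
        "EXPIRY_DK": date_key_by_value.get(promo["EXPIRY_DATE"]),
    }


for promo in STG_PROMOTIONS:
    print(load_promotion(promo))
```

The two dictionary reads differ only in the input port they use, which is why the second Lookup transformation can be created as a copy of the first with a renamed input port.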

Step 5: Create and Run the Workflow

1. Launch the Workflow Manager and sign into your assigned folder.

2. Create a new Workflow named wkf_DIM_PROMOTIONS_LOAD_xx.

3. Create a new Session task using the mapping m_DIM_PROMOTIONS_LOAD_xx.

4. Edit the s_m_DIM_PROMOTIONS_LOAD_xx session task.

5. In the Mapping tab:

a. Select SQ_SC_STG_PROMOTIONS located under the Sources folder in the navigator window.

b. Set the Connections > Type to your assigned NATIVE_STGxx connection object.

c. Select SC_DIM_PROMOTIONS located under the Target folder in the navigator window.

d. Set the Connections > Type to your assigned NATIVE_EDWxx connection object.

e. Ensure that the Target load type is set to Normal.

6. Complete the workflow by linking the Start and Session tasks and save your work.

7. Run the workflow.

8. Review the Task Details.

Figure 9-5. m_DIM_PROMOTIONS_LOAD completed mapping

Your information should appear similar to Figure 9-6.

9. Select the Source/Target Statistics tab. Your statistics should appear similar to Figure 9-7.

Figure 9-6. Task Details of the completed session run

Figure 9-7. Source/Target Statistics of the completed session run

Data Results

Preview the target data. The results should be similar to Figure 9-8. Note the values for START_DK and EXPIRY_DK.

By setting the Lookup Cache Persistent property on the Lookup transformations, two files were created in the cache file directory defined for the Integration Service process. See Figure 9-9. Note that in this lab, these files are on the Integration Service process machine, not your local computer. Also note that the names correspond to the name you entered in the Cache File Name Prefix Lookup property. To view these files, you will need to map to the file system on the Integration Service process machine. Verify that the files have a timestamp similar to when you ran the above workflow.
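The idea behind a persistent cache, building it once and reusing the saved files on later runs, can be sketched as follows. This is a simplified model, not the Integration Service's behavior: it writes a single JSON file, whereas the lab produces two cache files, and the function `get_lookup_cache`, the `.json` extension, and the `recache` flag (loosely modeled on the Recache From Lookup Source property) are all assumptions made for illustration.

```python
import json
import os
import tempfile


def get_lookup_cache(cache_dir, prefix, load_from_source, recache=False):
    """Reuse a saved cache file if one exists; otherwise build it and save it."""
    path = os.path.join(cache_dir, prefix + ".json")
    if os.path.exists(path) and not recache:
        with open(path) as cache_file:
            return json.load(cache_file), "reused"
    cache = load_from_source()  # the expensive read of the lookup source
    with open(path, "w") as cache_file:
        json.dump(cache, cache_file)
    return cache, "built"


def read_dim_dates():
    """Stand-in for reading the lookup table; the values are invented."""
    return {"2006-01-01": 1, "2006-01-02": 2}


with tempfile.TemporaryDirectory() as cache_dir:
    _, first_run = get_lookup_cache(cache_dir, "LKPSTUxx", read_dim_dates)
    _, second_run = get_lookup_cache(cache_dir, "LKPSTUxx", read_dim_dates)
    _, forced = get_lookup_cache(cache_dir, "LKPSTUxx", read_dim_dates, recache=True)
    print(first_run, second_run, forced)  # built reused built
```

Checking the file timestamp, as the lab suggests, corresponds to confirming whether a run took the "built" or the "reused" path.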

Figure 9-8. Data Preview of the DIM_PROMOTIONS target table

Figure 9-9. Preview files created when Persistent Cache is set on Lookup Transformation

Unit 9 Lab B: Features and Techniques II

Business Purpose

The management wants to increase the efficiency of the PowerCenter Developers.

Technical Description

This lab will detail the use of 4 PowerCenter Designer features. Each of these features will increase the efficiency of any developer who knows how to use them appropriately. At the discretion of the instructor, this lab can also be completed as a demonstration.

Goals:

Use the following features:

♦ Find Within Workspace

♦ View Object Dependencies

♦ Compare Objects

♦ Overview Window

Duration

15 minutes

Instructions

Open a Mapping

In the Designer tool, perform the following steps:

1. In your Studentxx folder, select the m_Stage_Customer_Contacts_xx mapping and drag it into the Mapping Designer.

Feature 1: Find in Workspace

When using this feature you can perform a string search for the name of an object, table, column, or port for all the transformations in a mapping currently open in the Mapping Designer or workspace. This feature can also be used in the Source Analyzer, Target Designer, Mapplet Designer, and Transformation Developer.

1. Select the Find in workspace toolbar icon.

2. Type the word “customer” in the Find What text box.

3. Click Find Now.

Your results should appear as in Figure 9-10.

Figure 9-10. Find in workspace dialog box

Note: In the Find in workspace feature, the term “fields” can mean columns in sources or targets or ports in transformations. The term “table” can mean a source or target definition or a transformation.

Feature 2: View Object Dependencies

By viewing object dependencies in the Designer a user can learn which objects may be affected by making changes to source or target definitions, mappings, mapplets, or transformations (reusable or non-reusable). Direct and indirect dependencies are shown. Object dependencies can also be viewed from the Workflow Manager and the Repository Manager. The Repository Manager will show any of the supported dependencies between a wide range of objects within the repository. See the Repository Guide for a complete list.
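The difference between direct and indirect dependencies can be illustrated with a small graph walk. This is a conceptual sketch and does not query a repository: the "used by" edges and object names below are invented, and `find_dependents` is a hypothetical helper.

```python
from collections import deque

# Invented "is used by" edges; the object names are illustrative only.
USED_BY = {
    "source:promotions": ["mapping:m_example"],
    "mapping:m_example": ["session:s_m_example"],
    "session:s_m_example": ["workflow:wkf_example"],
}


def find_dependents(obj, used_by):
    """Return (direct, indirect) dependents of obj using a breadth-first walk."""
    direct = list(used_by.get(obj, []))
    seen = set(direct) | {obj}
    queue = deque(direct)
    indirect = []
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, []):
            if dependent not in seen:
                seen.add(dependent)
                indirect.append(dependent)
                queue.append(dependent)
    return direct, indirect


direct, indirect = find_dependents("source:promotions", USED_BY)
print(direct)    # ['mapping:m_example']
print(indirect)  # ['session:s_m_example', 'workflow:wkf_example']
```

In this model, a change to the source definition touches the mapping directly and reaches the session and workflow only through it.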

1. Select the flat-file source definition promotions in the Navigator window.

2. Right-click and select Dependencies.

You will see the Dependencies dialog box as shown in Figure 9-11.

3. Click OK.

You will see the View Dependencies window, which will show detailed information about each of the dependencies found. Browse through this window, noting that some of the information relates to Team-Based Development (version control) properties like Version, Timestamp, and Version Comments.

Note: By clicking the Save button on the toolbar, the dependencies can be saved as an .htm file for future reference.

4. Experiment by viewing the dependencies of other objects.

Velocity Best Practice: By using the Velocity Methodology object naming conventions (such as transformation type prefixes) it will be easier to locate the found objects in the workspace. For example, in Figure 9-10 we know that SQ_customer_layout is a Source Qualifier and fil_Customer_No_99999 is a filter.

Figure 9-11. View Dependencies dialog box

Tip: Dependencies can also be viewed by right-clicking on an object directly in a workspace, such as a source definition in the Mapping Designer or the Source Analyzer.

Feature 3: Compare Objects

This feature allows you to compare all of the ports and properties of any two objects within a mapping or mapplet.

1. Open the m_DIM_PROMOTIONS_LOAD_xx mapping.

2. Right-click the Lookup transformation lkp_START_DATE_KEY and select Compare Objects.

3. For the Instance 2 drop-box, select the Lookup transformation lkp_EXPIRY_DATE_KEY. This is the object we wish to compare with the Lookup transformation. Your screen should appear as Figure 9-12.

4. Click Compare.

Figure 9-12. Transformation compare objects dialog box

5. Browse the tabs in the Transformations window that appears. Select the Properties tab, and what you see should be similar to Figure 9-13.

We will now learn how to compare objects that are in different folders.

6. Open the target definition STG_DATES in the Target Designer.

7. Right-click the target and select Compare Objects.

Figure 9-13. Compare Transformation objects Properties details

Note: A great deal of comparative information is displayed in the tabs. All differences will appear in red. Ports that are highlighted in yellow indicate a difference in the expression which may not be easily visible in this view.

8. The Select Targets dialog box allows you to choose a comparison object in another folder. Click the Browse button for Target 2 and select the DIM_DATES table in the DEV_SHARED folder. Your screen should appear as Figure 9-14.

9. Click Compare.

10. Browse the information in the various tabs. Note that this method can quickly tell you the differences, if any, between two objects in two different folders. See Figure 9-15.

Figure 9-14. Target comparison dialog box

Figure 9-15. Column differences between two target tables

Tip: In order to compare objects across folders, both folders must be open.

Feature 4: Overview Window

The Overview window is useful when a large mapping is “zoomed in” on your screen so you can work on the individual transformations, but the zoom level makes it difficult to scroll into a different section of the mapping because you cannot see where you are scrolling to. The Overview window has been described as a “bird’s eye view” of the mapping, enabling you to see your position relative to the entire structure.

1. In the Mapping Designer, set the zoom level to 100-percent.

2. Click the Toggle Overview Window toolbar button.

3. The Overview window will appear in the upper-right hand corner of your screen. Use your left mouse button to drag the dotted rectangle to a different location within the mapping. If you were searching for a target or a source in a large and complex mapping, this feature would make it faster to locate.

Tip: Selected mapping objects appear red in the Overview window.

Unit 10: Sorter, Aggregator and Self-Join

After completing this unit, you should be able to:

♦ Describe the following features:

♦ Sorter transformations

♦ Aggregator transformations

♦ Active and passive transformations

♦ Data concatenations

♦ Self-Joins

♦ Use these features in mappings

Lesson 10-1. Sorter Transformation

Type

Active.

Description

The Sorter transformation sorts the incoming data based on one or more key values. The sort order can be ascending, descending, or mixed. The Sorter transformation attribute “Distinct” provides a facility to remove duplicates from the input rows.
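A mixed sort order with optional duplicate removal can be sketched in a few lines. This is an illustrative model, not the Sorter's implementation: the function `sort_rows`, its `sort_keys` parameter, and the sample rows are hypothetical, and duplicate detection here compares whole rows.

```python
def sort_rows(rows, sort_keys, distinct=False):
    """Sort rows on several keys; sort_keys is a list of (port, ascending) pairs,
    most significant key first."""
    if distinct:
        # Drop rows that are identical on every port, keeping the first occurrence.
        seen = set()
        unique = []
        for row in rows:
            fingerprint = tuple(sorted(row.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(row)
        rows = unique
    result = list(rows)
    # Python's sort is stable, so sorting on the least significant key first
    # and the most significant key last yields the combined order.
    for port, ascending in reversed(sort_keys):
        result.sort(key=lambda row: row[port], reverse=not ascending)
    return result


ROWS = [
    {"ITEM": "B", "QTY": 1},
    {"ITEM": "A", "QTY": 2},
    {"ITEM": "A", "QTY": 5},
    {"ITEM": "A", "QTY": 2},
]
# ITEM ascending, QTY descending, duplicates removed.
for row in sort_rows(ROWS, [("ITEM", True), ("QTY", False)], distinct=True):
    print(row)
```

Because the duplicate check looks at every port, it behaves like a sort in which all ports take part in the key.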

Properties

Option | Description

Sorter Cache Size | The Integration Service uses the Sorter Cache Size property to determine the maximum amount of memory it can allocate to perform the sort operation. The Integration Service passes all incoming data into the Sorter transformation before it performs the sort operation. You can specify any amount between 1 MB and 4 GB for the Sorter cache size.

Case Sensitive | The Case Sensitive property determines whether the Integration Service considers case when sorting data. When you enable the Case Sensitive property, the Integration Service sorts uppercase characters higher than lowercase characters.

Work Directory | The directory that the Integration Service uses to create temporary files while it sorts data. After the Integration Service sorts the data, it deletes the temporary files.

Distinct | You can configure the Sorter transformation to treat output rows as distinct. If you configure the Sorter transformation for distinct output rows, the Mapping Designer configures all ports as part of the sort key.

Tracing Level | Sets the amount of detail included in the session log when you run a session containing this transformation.

Null Treated Low | Enable this property if you want the Integration Service to treat null values as lower than any other value when it performs the sort operation.

Transformation Scope | Specifies how the Integration Service applies the transformation logic to incoming data:

- Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.

- All Input. Applies the transformation logic on all incoming data. When you choose All Input, PowerCenter drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.

Business Purpose

A business may aggregate data on records received from relational sources (Databases) or flat files with related records in random order. Sorting the records prior to passing them on to an Aggregator transformation may improve the overall performance of the aggregation task.

Example

In the following example Gross Profit and Profit Margin are calculated for each item sold. To improve performance of this session a Sorter transformation is added prior to the Aggregator transformation. The Aggregator “Sorted Input” property must be checked to notify the Aggregator to expect input in sort order.
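Why sorted input helps can be seen in a streaming sketch: once rows arrive grouped by item, each group can be finished and emitted as soon as the next item starts, without holding every group in memory. The code below is a conceptual model only. The sample rows are invented, `aggregate_sorted` is a hypothetical helper, and the formulas (gross profit = revenue - cost, profit margin = gross profit / revenue) are assumptions, since the guide does not define them here.

```python
from itertools import groupby
from operator import itemgetter

# Invented sample rows arriving in random item order.
SALES = [
    {"ITEM": "B", "REVENUE": 50.0, "COST": 40.0},
    {"ITEM": "A", "REVENUE": 100.0, "COST": 60.0},
    {"ITEM": "A", "REVENUE": 100.0, "COST": 90.0},
]


def aggregate_sorted(rows):
    """Expects rows already sorted by ITEM; yields one result per item."""
    for item, group in groupby(rows, key=itemgetter("ITEM")):
        revenue = cost = 0.0
        for row in group:
            revenue += row["REVENUE"]
            cost += row["COST"]
        gross_profit = revenue - cost
        margin = gross_profit / revenue if revenue else None
        yield {"ITEM": item, "GROSS_PROFIT": gross_profit, "PROFIT_MARGIN": margin}


sorted_rows = sorted(SALES, key=itemgetter("ITEM"))  # plays the role of the Sorter
results = list(aggregate_sorted(sorted_rows))
for result in results:
    print(result)
```

If the rows were passed in unsorted, `groupby` would start a new group every time the item changed and produce split results, which is the analogue of checking “Sorted Input” without actually sorting the data.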

Sorter Cache

How It Works

♦ If the cache size specified in the properties exceeds the available amount of memory on the Integration Service process machine, the Integration Service fails the session.

♦ All of the incoming data is passed into cache memory before the sort operation is performed.

♦ If the amount of incoming data is greater than the specified cache size, the Integration Service temporarily stores the data in the Sorter transformation work directory.

Key Points

The Integration Service requires disk space of at least twice the amount of incoming data when storing data in the work directory.

Performance Considerations

Using a Sorter transformation may improve performance over an “Order By” clause in a SQL override in an aggregate session when the source is a database, because the source database may not be tuned with the buffer sizes needed for a database sort.

Lesson 10-2. Aggregator Transformation

Type

Active.

Description

The Aggregator transformation calculates aggregates, such as sums and minimum or maximum values, across multiple groups of rows. The Aggregator transformation can apply expressions to its ports; however, those expressions are applied to a group of rows, unlike the Expression transformation, which applies calculations on a row-by-row basis only. Aggregate functions are created in output ports only. Grouping requirements are set using the Aggregator Group By ports.

Properties

Option: Description

Cache Directory: Local directory where the Integration Service creates the index and data cache files.

Tracing Level: Amount of detail displayed in the session log for this transformation.

Sorted Input: Indicates input data is presorted by groups. Select this option only if the mapping passes sorted data to the Aggregator transformation.

Aggregator Data Cache Size: Data cache size for the transformation. Default cache size is set to Auto.

Aggregator Index Cache Size: Index cache size for the transformation. Default cache size is set to Auto.

Transformation Scope: Specifies how the Integration Service applies the transformation logic to incoming data:

- Transaction. Applies the transformation logic to all rows in a transaction. Choose Transaction when a row of data depends on all rows in the same transaction, but does not depend on rows in other transactions.

- All Input. Applies the transformation logic on all incoming data. When you choose All Input, the PowerCenter drops incoming transaction boundaries. Choose All Input when a row of data depends on all rows in the source.

Business Purpose

A business may want to calculate gross profit or profit margins based on items sold or summarize weekly, monthly or quarterly sales activity.

Example

The following example calculates units sold (OUT_UNITS_SOLD), revenue (OUT_REVENUE) and cost (OUT_COST) for each promotion id by date.
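As a rough illustration of what such an Aggregator computes, the Python sketch below groups rows by promotion id and date and sums three measures. Only the three OUT_ port names come from the text; the input field names (`PROMO_ID`, `DATE_ID`, `UNITS_SOLD`, `REVENUE`, `COST`) and the function name are assumptions made for the example.

```python
from collections import defaultdict


def aggregate_sales(rows):
    """Return one output row per (PROMO_ID, DATE_ID) group."""
    totals = defaultdict(
        lambda: {"OUT_UNITS_SOLD": 0, "OUT_REVENUE": 0.0, "OUT_COST": 0.0}
    )
    for row in rows:
        # The group by ports form the key of each group.
        group = totals[(row["PROMO_ID"], row["DATE_ID"])]
        group["OUT_UNITS_SOLD"] += row["UNITS_SOLD"]
        group["OUT_REVENUE"] += row["REVENUE"]
        group["OUT_COST"] += row["COST"]
    # Output is produced only after every input row has been aggregated.
    return [
        {"PROMO_ID": promo, "DATE_ID": date, **sums}
        for (promo, date), sums in totals.items()
    ]
```

Each distinct combination of the two group by values yields exactly one output row, regardless of how many input rows share it.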

Aggregator Cache

How It Works

♦ There are two types of cache memory: index cache and data cache.

♦ All rows are loaded into cache before any aggregation takes place.

♦ The index cache contains group by port values.

♦ The data cache contains the values of variable ports and connected output ports:

- Non group by input ports used in non-aggregate output expressions.

- Non group by input/output ports.

- Local variable ports.

- Ports containing aggregate functions (multiply by three).

♦ One output row is returned for each unique occurrence of the group by ports.

Key Points

♦ If there is not enough memory specified in the index and data cache properties, the overflow will be written out to disk.

♦ No rows are returned until all of the rows have been aggregated.

♦ Checking the sorted input attribute will bypass caching.

♦ You enable automatic memory settings by configuring a value for the Maximum Memory Allowed for Auto Memory Attributes or the Maximum Percentage of Total Memory Allowed for Auto Memory Attributes. If the value is set to zero for either of these attributes, the Integration Service disables automatic memory settings and uses default values.

Performance Considerations

Aggregator performance can be increased by sorting the input data in the same order as the Aggregator Group By ports before the aggregation. The Aggregator Sorted Input property must then be checked. Relational source data can be sorted using an “order by” clause in the Source Qualifier override. Flat file source data can be sorted using an external sort application or the Sorter transformation. Cache size is also important in assuring optimal performance in the Aggregator. Make sure that your cache size settings are large enough to accommodate all of the data. If they are not, the overflow is written to disk, which slows performance.
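The benefit of sorted input can be sketched in Python: when rows arrive ordered by the group by key, each group can be completed and emitted as soon as the key changes, so only one group is held in memory at a time. This is an illustrative sketch with hypothetical names (`aggregate_sorted`, `key_field`, `value_field`), not PowerCenter's implementation.

```python
from itertools import groupby
from operator import itemgetter


def aggregate_sorted(rows, key_field, value_field):
    """Yield (key, total) pairs from rows already sorted by key_field.

    In this sketch, input that is not actually sorted causes the same key
    to be emitted more than once.
    """
    for key, group in groupby(rows, key=itemgetter(key_field)):
        # The group ends when the key changes, so it can be emitted at once.
        yield key, sum(row[value_field] for row in group)
```

Because the function is a generator, a downstream consumer can start receiving totals before the whole input has been read.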

Lesson 10-3. Active and Passive Transformations

Passive transformations operate on one row at a time AND preserve the number of rows. Examples: Expression, Lookup, Sequence Generator.

Active transformations operate on groups of rows AND/OR change the number of rows. Examples: Source Qualifier, Filter, Joiner, Aggregator.
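The distinction can be shown with two small Python functions. They are only an analogy under assumed field names (`FIRST`, `LAST`, `POSITION_TYPE`): the first behaves like a passive transformation, the second like an active one.

```python
def passive_expression(rows):
    """Passive: works on one row at a time and preserves the row count."""
    return [{**row, "FULL_NAME": f"{row['FIRST']} {row['LAST']}"} for row in rows]


def active_filter(rows):
    """Active: the number of output rows can differ from the input."""
    return [row for row in rows if row["POSITION_TYPE"] == "MANAGER"]
```

Calling `passive_expression` on N rows always returns N rows, while `active_filter` may return anywhere from zero to N.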

Lesson 10-4. Data Concatenation

Data concatenation brings together different pieces of the same record (row). Data concatenation works only if combining branches of the same source pipeline. For example, one branch has a customer ID and the other branch has the customer name. But if either branch contains an active transformation, the correspondence between the branches no longer exists.
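A minimal Python sketch of the idea follows. It models the lost correspondence simply as a difference in row counts, which is an assumption of this example; the function name `concatenate` is hypothetical.

```python
def concatenate(branch_a, branch_b):
    """Recombine two branches of the same pipeline row by row."""
    if len(branch_a) != len(branch_b):
        # An active transformation in either branch can change the row
        # count, so row N of one branch no longer matches row N of the other.
        raise ValueError("branches no longer correspond row for row")
    return [{**a, **b} for a, b in zip(branch_a, branch_b)]
```

When both branches are passive, the pieces of each record line up by position and can simply be merged.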

Lesson 10-5. Self-Join

Description

The Joiner transformation combines fields from two data sources into a single combined data source based on one or more common fields, also known as the join condition. However, when the values to be combined are located within the same pipeline, a self-join provides a solution. The two pipelines being joined need to be sorted in the same order.

Business Purpose

A business may have to extract data from a single employee master table with employee data such as names, title, salary and reporting department and create a new table showing only those employees whose salary is greater than the average salary for the department.

Example

The following example loads employee data with appropriate links to the data records for the employees’ managers.

Key Points

♦ The inputs to the Joiner from the single source must separate into two data streams.

♦ For self-joins between two branches of the same pipeline:

- Add a transformation between the Source Qualifier and the Joiner in at least one branch of the pipeline.

- Pre-sort the data by the join key.

- Configure the Joiner to accept sorted input.

♦ For self-joins between records from the same source, create two instances of the source and join the pipelines from each source.

Performance Considerations

There is a performance benefit in a self-join since it requires both the master and detail sides to be sorted.

Unit 10 Lab: Reload the Employee Staging Table

Business Purpose

Mersche Motors employee data has been loaded into the STG_EMPLOYEES table but after validating the data it was determined that data was missing. Although the lookup to the salaries.txt file and Workflow were successful the developer noticed that there is no data in the DEALERSHIP_MANAGER column of the target table. By leveraging the previous mapping that initially loaded the Employee data, the developer must put the Dealership Manager's full name in the DEALERSHIP_MANAGER column.

Technical Description

We will copy the m_STG_EMPLOYEES_xx mapping created in a previous lab and modify it to derive the Manager Name and load it into the DEALERSHIP_MANAGER column of the STG_EMPLOYEES table. To do this we will have to split the data into two streams. One stream will have all employee records and the other will have only manager records; the two streams will then be joined back together using the manager records as the master. On the Manager stream we will filter on the POSITION_TYPE column for MANAGER records and relate them back to the SALESREP records using the DEALERSHIP_ID. This is possible because there is only one Manager per dealership. We will also need to maintain the Lookup with respect to the salaries.txt file to ensure that salary data is still populated.

Goals

♦ Leverage an existing Mapping to solve a data integrity issue

♦ Split the data stream and use a self-join to bring it back together

♦ Copy and modify an existing reusable Expression transformation

Duration

70 Minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx

Source System: Flat File

Target System: Oracle Table

Initial Rows: 109

Rows/Load: 109

Short Description: File list will be read, source data will be reformatted and salary information for each employee will be added. Determine the names of the Managers and populate the DEALERSHIP_MANAGER column.

Load Frequency: Daily

Preprocessing Target Append

Post Processing:

Error Strategy: Default

Reload Strategy:

Unique Source Fields: DEALERSHIP_ID, EMPLOYEE_ID

SOURCES

Files

| File Name | File Location | Fixed/Delimited | Additional File Info |
|---|---|---|---|
| employees_central.txt, employees_east.txt, employees_west.txt (Definition in employees_layout.txt) | C:\pmfiles\SrcFiles | Delimited | These 3 comma delimited flat files will be read into the session using a filelist employees_list.txt. The layout of the flat files can be found in employees_layout.txt. |
| employees_list.txt | C:\pmfiles\SrcFiles | File list | |

TARGETS

Tables (Schema Owner: TDBUxx)

| Table Name | Update | Delete | Insert | Unique Key |
|---|---|---|---|---|
| STG_EMPLOYEES | | | X | |

LOOKUPS

Lookup Name: lkp_salary

Table: salaries.txt

Location: C:\pmfiles\LkpFiles

Match Condition(s): EMPLOYEE_ID = IN_EMPLOYEE_ID

Filter/SQL Override:

HIGH LEVEL PROCESS OVERVIEW

Diagram: Source, Expression, Sorter, Joiner, Target (first row); Filter, Aggregator (second row); Lookup.

PROCESSING DESCRIPTION (DETAIL)

The mapping will read from three flat files contained in a file list. A reusable Expression transformation needs to be copied and modified to receive the employee id from the source. The data will be sorted by dealership id and then the data stream will be split. The master data flow (bottom) will group by DEALERSHIP_ID in the Aggregator, so that the Aggregator outputs only 1 row for each unique set of group by ports. The two data flows will be brought back together with a self-join based on dealership id, thereby enabling the mapping to retrieve the dealership manager for each record. A lookup to a salary text file will retrieve the salary information for each employee. The data will then be loaded into the STG_EMPLOYEES table.
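The processing description can be sketched end to end in Python. This is an analogy for the Sorter, Filter, Aggregator and Joiner steps, not generated mapping logic. Two assumptions are made and marked in the comments: the last manager row in each group supplies the name, and the join keeps only rows whose dealership has a manager. The function name is hypothetical; the field names follow the lab.

```python
from itertools import groupby
from operator import itemgetter


def add_dealership_manager(employees):
    """Self-join employee rows to the name of their dealership manager."""
    # Sorter: order all rows by DEALERSHIP_ID.
    sorted_rows = sorted(employees, key=itemgetter("DEALERSHIP_ID"))

    # Master branch, Filter: keep only manager rows.
    managers = [r for r in sorted_rows if r["POSITION_TYPE"] == "MANAGER"]

    # Master branch, Aggregator: one output row per DEALERSHIP_ID.
    # Assumption: the last row of each group supplies the manager name.
    manager_by_dealership = {}
    for dealership_id, group in groupby(managers, key=itemgetter("DEALERSHIP_ID")):
        manager_by_dealership[dealership_id] = list(group)[-1]["EMPLOYEE_NAME"]

    # Joiner: MANAGER_DEALERSHIP_ID = EMPLOYEE_DEALERSHIP_ID.
    # Assumption: rows without a matching manager are dropped.
    return [
        {**row, "DEALERSHIP_MANAGER": manager_by_dealership[row["DEALERSHIP_ID"]]}
        for row in sorted_rows
        if row["DEALERSHIP_ID"] in manager_by_dealership
    ]
```

The detail side keeps every employee row, including the managers themselves, so each manager row also receives its own name in DEALERSHIP_MANAGER.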

SOURCE TO TARGET FIELD MATRIX

| Target Table | Target Column | Data type | Source File | Source Column | Expression | Default Value if Null |
|---|---|---|---|---|---|---|
| STG_EMPLOYEES | EMPLOYEE_ID | number(p,s) | employees_layout | EMPLOYEE_ID | | |
| STG_EMPLOYEES | EMPLOYEE_NAME | varchar2 | employees_layout | Derived | Concatenate First Name and Last Name | |
| STG_EMPLOYEES | EMPLOYEE_ADDRESS | varchar2 | employees_layout | ADDRESS | | |
| STG_EMPLOYEES | EMPLOYEE_CITY | varchar2 | employees_layout | CITY | | |
| STG_EMPLOYEES | EMPLOYEE_STATE | varchar2 | employees_layout | STATE | | |
| STG_EMPLOYEES | EMPLOYEE_ZIP_CODE | number(p,s) | employees_layout | ZIP_CODE | | |
| STG_EMPLOYEES | EMPLOYEE_COUNTRY | varchar2 | employees_layout | COUNTRY | | |
| STG_EMPLOYEES | EMPLOYEE_PHONE_NMBR | varchar2 | employees_layout | Derived | The PHONE_NUMBER column is in the format of 9999999999 and needs to be reformatted to (999) 999-9999. | |
| STG_EMPLOYEES | EMPLOYEE_FAX_NMBR | varchar2 | employees_layout | FAX_NUMBER | | |
| STG_EMPLOYEES | EMPLOYEE_EMAIL | varchar2 | employees_layout | EMAIL | | |
| STG_EMPLOYEES | EMPLOYEE_GENDER | varchar2 | employees_layout | Derived | GENDER is currently either M or F. It needs to be Male, Female or UNK | |
| STG_EMPLOYEES | AGE_GROUP | varchar2 | employees_layout | Derived | The CUST_AGE_GROUP is derived from the decoding of AGE column. The valid age groups are less than 20, 20 to 29, 30 to 39, 40 to 49, 50 to 60 and Greater than 60 | |
| STG_EMPLOYEES | NATIVE_LANG_DESC | varchar2 | employees_layout | NATIVE_LANGUAGE | | |
| STG_EMPLOYEES | SEC_LANG_DESC | varchar2 | employees_layout | SECOND_LANGUAGE | | |
| STG_EMPLOYEES | TER_LANG_DESC | varchar2 | employees_layout | THIRD_LANGUAGE | | |
| STG_EMPLOYEES | POSITION_TYPE | varchar2 | employees_layout | POSITION_TYPE | | |
| STG_EMPLOYEES | REGIONAL_MANAGER | varchar2 | employees_layout | REGIONAL_MANAGER | | |
| STG_EMPLOYEES | DEALERSHIP_ID | number(p,s) | employees_layout | DEALERSHIP_ID | | |
| STG_EMPLOYEES | DEALERSHIP_MANAGER | varchar2 | employees_layout | DEALERSHIP_MANAGER | Concatenated FIRSTNAME and LASTNAME of the manager. The employee records are split apart and then joined back together based on DEALERSHIP_ID | |
| STG_EMPLOYEES | EMPLOYEE_SALARY | number(p,s) | employees_layout | Derived | A Salary field for each Employee ID can be found in salaries.txt. | |
| STG_EMPLOYEES | HIRE_DATE | date | employees_layout | HIRE_DATE | | |
| STG_EMPLOYEES | DATE_ENTERED | date | employees_layout | DATE_ENTERED | | |
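The derived columns in the matrix (phone format, gender decode, age group) can be sketched as plain Python functions. These are illustrations of the stated rules, not the expressions used in the reusable transformation; the function names and the exact wording of the age group labels are assumptions.

```python
def format_phone(phone_number):
    """Reformat '9999999999' as '(999) 999-9999'."""
    digits = str(phone_number)
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"


def decode_gender(gender):
    """Map M/F to Male/Female; anything else becomes UNK."""
    return {"M": "Male", "F": "Female"}.get(gender, "UNK")


def age_group(age):
    """Decode AGE into one of the six groups listed in the matrix."""
    if age < 20:
        return "Less than 20"
    if age <= 29:
        return "20 to 29"
    if age <= 39:
        return "30 to 39"
    if age <= 49:
        return "40 to 49"
    if age <= 60:
        return "50 to 60"
    return "Greater than 60"
```

Note that the fifth group runs from 50 to 60 inclusive, as stated in the matrix, so an age of 60 falls in "50 to 60" rather than in the last group.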

Instructions

Step 1: Copy an Existing Mapping

1. Launch the Designer and sign into your assigned folder.

2. Locate the mapping m_STG_EMPLOYEES_xx in the Navigator window.

3. Copy it and rename it m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx.

4. Open m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx in the Mapping Designer to make it the current mapping for editing.

5. Save your work.

Step 2: Examine Source Data to Determine a Key for Self-Join

Figure 10-2 shows the employees_central.txt file. Some columns are not in view or hidden.

Which of these columns can we use to determine Manager records?

Answer: ________________________

Which of these columns can we use for a self-join condition to obtain the Dealership Manager name for the employee records?

Answer: ________________________

Figure 10-1. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD mapping

Figure 10-2. employees_central.txt

Step 3: Prepare the New Mapping for Modification

Many of the links will need to be removed in order to build the self-join.

1. Right-click and select Arrange all and expand the Source Qualifier, Lookup and Target large enough to view all the ports and links.

2. Remove all the links to the lkp_salaries transformation and all of the links to the STG_EMPLOYEES target.

3. Rename the re_exp_Format_Name_Gender_Phone_Load_Date reusable transformation instance to exp_Format_Name_Gender_Phone_Load_Date_Mgr. Notice that the instance name changes, but the name of the reusable transformation that this expression is an instance of stays the same.

4. Save your work and notice that the mapping is now invalid.

Your mapping should look similar to Figure 10-4 if you Arrange all Iconic.

Figure 10-3. Renaming an instance of a Reusable Transformation

Figure 10-4. m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD after most links removed

Step 4: Create a Sorter Transformation

1. Add a Sorter transformation to the mapping and name it srt_EMPLOYEES_DEALERSHIP_ID_DESC.

Figure 10-5. Sorter Transformation Icon on Toolbar

2. Select the following ports from SQ_employees_layout and drag them into the Sorter Transformation:

♦ DEALERSHIP_ID

♦ EMPLOYEE_ID

♦ ADDRESS

♦ CITY

♦ STATE

♦ ZIP_CODE

♦ COUNTRY

♦ FAX_NUMBER

♦ EMAIL

♦ NATIVE_LANGUAGE

♦ SECOND_LANGUAGE

♦ THIRD_LANGUAGE

♦ POSITION_TYPE

♦ REGIONAL_MANAGER

♦ HIRE_DATE

♦ DATE_ENTERED

3. Select all the output ports from the exp_FORMAT_NAME_GENDER_PHONE_LOAD_DATE_MGR transformation and drag them into the srt_EMPLOYEES_DEALERSHIP_ID_DESC transformation.

4. Edit the Sorter transformation.

a. On the DEALERSHIP_ID port check the checkbox in the 'Key' column to define the sort column.

b. Rename the following ports:

♦ OUT_NAME to EMPLOYEE_NAME

♦ OUT_PHONE to EMPLOYEE_PHONE

♦ OUT_GENDER to EMPLOYEE_GENDER

♦ OUT_AGE_GROUP to AGE_GROUP

5. Save your work.

Step 5: Create a Filter Transformation

The source file contains sales representatives and managers. This stream of the mapping will only contain managers.

1. Create a Filter transformation named fil_MANAGERS.

2. Link the following ports from srt_EMPLOYEES_DEALERSHIP_ID_DESC transformation to the fil_MANAGERS transformation:

♦ DEALERSHIP_ID

♦ EMPLOYEE_NAME

♦ POSITION_TYPE

3. Set the filter condition to only allow 'MANAGER' position types.

4. Save your work.

Step 6: Create an Aggregator Transformation

The filtered source data may contain multiple entries for a manager. The Aggregator transformation can be used to eliminate duplicate manager records.

1. Create an Aggregator transformation named agg_MANAGERS.

2. Link the following ports from fil_MANAGERS transformation to the agg_MANAGERS transformation:

♦ DEALERSHIP_ID

♦ EMPLOYEE_NAME

3. Edit the Aggregator.

♦ On the DEALERSHIP_ID port, check the checkbox in the 'Group By' column.

♦ Under the Properties tab, check the 'Sorted Input' checkbox.

4. Save your work.

Tip: By making the DEALERSHIP_ID the group by port, the Aggregator will return one row for each unique DEALERSHIP_ID. This will remove the duplicate manager rows.

The mapping depicting the Sorter to Filter to Aggregator flow should be the same as Figure 10-7.

Figure 10-6. Aggregator Transformation Icon on Toolbar

Figure 10-7. Partial mapping flow depicting the flow from the Sorter to the Filter to the Aggregator

Step 7: Create a Joiner Transformation for the Self-Join

1. Create a Joiner transformation and name it jnr_MANAGERS_EMPLOYEES.

2. On the Properties tab, check the Sorted Input property.

3. Click OK on the Edit Transformations dialog and then click Yes on the “Join Condition is empty…” dialog. The join condition will be set shortly.

4. Link all ports from the agg_MANAGERS transformation into the jnr_MANAGERS_EMPLOYEES Joiner transformation.

5. Link all ports from the srt_EMPLOYEES_DEALERSHIP_ID_DESC transformation to the jnr_MANAGERS_EMPLOYEES transformation.

6. Edit the jnr_MANAGERS_EMPLOYEES transformation:

a. Rename the two ports linked from the Aggregator transformation as follows:

♦ DEALERSHIP_ID to MANAGER_DEALERSHIP_ID

♦ EMPLOYEE_NAME to MANAGER_NAME

b. Ensure that both ports have checks under the “M” column defining them as the Master record.

c. Rename the following ports linked from the Sorter transformation:

♦ DEALERSHIP_ID1 to EMPLOYEE_DEALERSHIP_ID

♦ EMPLOYEE_NAME1 to EMPLOYEE_NAME (remove the '1')

d. Add the following join condition:

MANAGER_DEALERSHIP_ID = EMPLOYEE_DEALERSHIP_ID

7. Save your work.

Review Figure 10-8 to verify your work.

Step 8: Get Salaries from the Lookup

1. Link the EMPLOYEE_ID port from the jnr_MANAGERS_EMPLOYEES transformation to the IN_EMPLOYEE_ID port in the lkp_salaries Lookup transformation.

Figure 10-8. Split data stream joined back together

Step 9: Connect the Joiner and Lookup to the Target

1. Link the following ports from jnr_MANAGERS_EMPLOYEES to the STG_EMPLOYEES target:

MANAGER_NAME --> DEALERSHIP_MANAGER

EMPLOYEE_DEALERSHIP_ID --> DEALERSHIP_ID

EMPLOYEE_ID --> EMPLOYEE_ID

EMPLOYEE_NAME --> EMPLOYEE_NAME

ADDRESS --> EMPLOYEE_ADDRESS

CITY --> EMPLOYEE_CITY

STATE --> EMPLOYEE_STATE

ZIP_CODE --> EMPLOYEE_ZIP_CODE

COUNTRY --> EMPLOYEE_COUNTRY

EMPLOYEE_PHONE --> EMPLOYEE_PHONE_NUMBER

FAX_NUMBER --> EMPLOYEE_FAX_NUMBER

EMAIL --> EMPLOYEE_EMAIL

NATIVE_LANGUAGE --> NATIVE_LANG_DESC

SECOND_LANGUAGE --> SEC_LANG_DESC

THIRD_LANGUAGE --> TER_LANG_DESC

POSITION_TYPE --> POSITION_TYPE

REGIONAL_MANAGER --> REGIONAL_MANAGER

HIRE_DATE --> HIRE_DATE

EMPLOYEE_GENDER --> EMPLOYEE_GENDER

AGE_GROUP --> AGE_GROUP

DATE_ENTERED --> DATE_ENTERED

Tip: Some ports can be auto-linked by name; the rest must be linked manually.

2. Link the SALARY port from the lkp_salaries transformation to the EMPLOYEE_SALARY port in the STG_EMPLOYEES target.

3. Save your work.

Step 10: Create and Run the Workflow

1. Launch the Workflow Manager and sign into your assigned folder.

2. Create a new workflow named wkf_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx

3. Create a session task using the m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx mapping.

4. Link the Start task to the s_m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx session task.

5. Edit session s_m_STG_EMPLOYEES_DEALERSHIP_MGR_LOAD_xx.

a. In the Mapping tab, confirm that Source file directory is set to $PMSourceFileDir\.

b. In Properties > Attribute > Source filename, type employees_list.txt and change the Source filetype property from Direct to Indirect.

The properties should look similar to Figure 10-10.

c. Select STG_EMPLOYEES located under the Target folder in the Mapping navigator.

i. Set the relational target connection object property to NATIVE_STGxx where xx is your student number.

ii. Check the Truncate target table option in the target properties. This must be set because the data loaded in a previous lab needs to be replaced.

d. Select lkp_salaries from the Transformations folder on the mapping tab and verify the following property values:

♦ Lookup source file directory = $PMLookupFileDir\.

♦ Lookup source filename = salaries.txt.

6. Save your work.

Figure 10-9. Iconic view of the completed self-join mapping.

Figure 10-10. Source properties for the employees_list.txt file list

7. Start the workflow.

8. Review the Task Details.

9. Review the Source/Target Statistics.

Figure 10-11. Task Details of the completed session run

Figure 10-12. Source/Target Statistics of the completed session run
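Step 10 sets the Source filetype to Indirect so that employees_list.txt is treated as a list of files to read rather than as data. The idea can be sketched in Python as follows. The sketch assumes that the file list holds one bare file name per line and that every listed file sits in the same directory as the list; `read_indirect` is a hypothetical name.

```python
from pathlib import Path


def read_indirect(source_dir, file_list_name):
    """Yield the data lines of every file named in a file list."""
    source_dir = Path(source_dir)
    for name in (source_dir / file_list_name).read_text().splitlines():
        name = name.strip()
        if not name:
            continue  # skip blank lines in the file list
        with open(source_dir / name) as source_file:
            for line in source_file:
                yield line.rstrip("\n")
```

The rows of all listed files arrive as one continuous stream, which is why a single source definition can serve the three regional employee files.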

Data Results

Preview the target data from the Designer. Your data should appear the same as displayed in Figure 10-13 through Figure 10-14.

Note: Not all rows and columns are shown.

Figure 10-13. Data preview of the self-join of Managers and Employees in the STG_EMPLOYEES target table - screen 1

Figure 10-14. Data preview of the STG_EMPLOYEES target table - screen 2 scrolled right

Unit 11: Router, Update Strategy and Overrides

After completing this Unit, you should be able to:

♦ Describe the following features:

- Router transformations

- Update Strategy transformations

- Port default values

- Source Qualifier overrides

- Target update overrides

- Session task mapping overrides

♦ Use these features in mappings and workflows

Lesson 11-1. Router Transformation

Type

Active.

Description

The Router transformation is similar to the Filter transformation because it passes row data that meet the Router Group filter condition to the downstream transformation or target. The Router transformation has a single input group and one or more output groups with each output group representing a filter condition.

Use of router transformations can result in improved performance compared to performing the same logic with multiple filter transformations.

Business Purpose

A business may receive records that must be directed to specific targets. The records are “routed” to each target based on conditions on one or more record (row) fields.

Example

In the following example a business receives sales results based on responses to coupons featured in the local newspapers, magazines, and at their website. Each record is loaded into different target tables based on a promotion code.

In the example the “DEFAULT” group routes rows that do not meet any of the group filters to an exception table. This would capture a record where a promo code (PROMO_ID) was incorrectly entered or a new code that has not been included in a filter group.

Performance Considerations

When splitting row data based on field values, a Router transformation has a performance advantage over multiple Filter transformations because a row is read once into the input group and then evaluated once for each output group, whereas using multiple Filter transformations requires the same row data to be duplicated for each Filter transformation.
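The single-read behavior can be sketched in Python. This is a simplified model with hypothetical names (`route`, `groups`); the group conditions are ordinary Python predicates standing in for the group filter conditions.

```python
def route(rows, groups):
    """Send each row to every group whose condition it meets.

    groups maps a group name to a predicate. Rows that satisfy no
    condition go to the DEFAULT group.
    """
    outputs = {name: [] for name in groups}
    outputs["DEFAULT"] = []
    for row in rows:  # each row is read once
        matched = False
        for name, condition in groups.items():  # and tested against every group
            if condition(row):
                outputs[name].append(row)
                matched = True
        if not matched:
            outputs["DEFAULT"].append(row)
    return outputs
```

For example, `route(rows, {"NEWSPAPER": lambda r: r["PROMO_ID"] == 1, "WEBSITE": lambda r: r["PROMO_ID"] == 3})` would place a row with an unexpected PROMO_ID in the DEFAULT output. The promotion code values here are invented for illustration.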

Lesson 11-2. Update Strategy Transformation

Type

Active.

Description

The Update Strategy transformation “tags” a row with the appropriate DML (data manipulation language) for the PowerCenter writer to apply to a relational target. Each row can be “tagged” with one of the following flags (the DD label stands for Data Driven):

DD_INSERT - tags a row for insert to a target

DD_UPDATE - tags a row for update to a target

DD_DELETE - tags a row for delete to a target

DD_REJECT - tags a row for reject

Business Purpose

A business process may require more than a single DML action on a target table. A target table may require historical information dealing with previous entries. Rows written to a target table, based on one or more criteria, may have to be inserted, updated or deleted. The Update Strategy transformation can be applied to meet this requirement.

Example

In the following example a business wants to maintain the “MASTER_CUSTOMER” table with current information. Using a set of Filter transformations along with previous mapping objects, two data paths have been developed, one for inserts (DD_INSERT) with the addition of a sequence number for new records and one for updates (DD_UPDATE) to update existing records with new information.

Note: For the row tags DD_DELETE and DD_UPDATE, the table definition in a mapping must have a key identified, otherwise the session created from that mapping will fail. Rows tagged with DD_REJECT will be passed on to the next transformation or target and subsequently placed in the appropriate “bad file” if the “Forward Rejected Rows” attribute is “checked” (default). If the attribute is “un-checked” then reject rows will be skipped.

Performance Considerations

The Update Strategy transformation performance can vary depending on the number of updates and inserts. In some cases there may be a performance benefit to split a mapping with updates and inserts into two mappings and sessions, one mapping with inserts and the other with updates.
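Row tagging can be sketched in Python as a function that returns one of the DD_ flags for each row. The sketch decides between insert and update by checking whether the key already exists in the target, which is one common pattern and not necessarily the logic of the example mapping. The key field `CUSTOMER_ID` and the function name are hypothetical, and the numeric values assigned to the constants should be treated as an assumption to verify against your PowerCenter version.

```python
# Assumed numeric values for the row flags (verify for your version).
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3


def tag_row(row, existing_keys, key_field="CUSTOMER_ID"):
    """Return the flag the writer should apply to this row."""
    if row.get(key_field) is None:
        return DD_REJECT  # no usable key: reject the row
    if row[key_field] in existing_keys:
        return DD_UPDATE  # key already in the target: update
    return DD_INSERT  # new key: insert
```

A caller would build `existing_keys` from the target table (for example through a lookup) before tagging the incoming rows.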

Lesson 11-3. Expression Default Values

Lesson 11-4. Source Qualifier Override

Properties

Property: Description

SQL Query: Allows you to override the default SQL query that PowerCenter creates at runtime.

User Defined Join: Allows you to specify a user defined join.

Source Filter: Allows you to create a where clause that will be inserted into the SQL query that is generated at runtime. The “where” portion of the statement is not required. For example: Table1.ID = Table2.ID.

Number of Sorted Ports: PowerCenter will insert an order by clause in the generated SQL query. The order by will be on the number of ports specified, from the top down. For example, in the sq_Product_Product_Cost Source Qualifier, if the number of sorted ports = 2, the order by will be: ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID.

Tracing Level: Specifies the amount of detail written to the session log.

Select Distinct: Allows you to select distinct values only.

Pre SQL: Allows you to specify SQL that will be run prior to the pipeline being run. The SQL will be run using the connection specified in the session task.

Post SQL: Allows you to specify SQL that will be run after the pipeline has been run. The SQL will be run using the connection specified in the session task.

Output is Deterministic: Source or transformation output that does not change between session runs when the input data is consistent between runs. When you configure this property, the Integration Service does not stage source data for recovery if transformations in the pipeline always produce repeatable data.

Output is Repeatable: Source or transformation output that is in the same order between session runs when the order of the input data is consistent. When output is deterministic and output is repeatable, the Integration Service does not stage source data for recovery.
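How the Source Filter, Select Distinct and Number of Sorted Ports properties shape the generated query can be sketched in Python. This is only a model of the behavior described in the table, for a single source table; the function name and parameters are hypothetical, and the real generated SQL may differ in detail.

```python
def build_source_query(table, ports, source_filter=None, sorted_ports=0, distinct=False):
    """Assemble a SELECT statement from Source Qualifier style settings."""
    columns = ", ".join(f"{table}.{port}" for port in ports)
    query = f"SELECT {'DISTINCT ' if distinct else ''}{columns} FROM {table}"
    if source_filter:
        # The filter is given without the WHERE keyword.
        query += f" WHERE {source_filter}"
    if sorted_ports:
        # Order by the first N ports, from the top down.
        order = ", ".join(f"{table}.{port}" for port in ports[:sorted_ports])
        query += f" ORDER BY {order}"
    return query
```

Calling it with the table PRODUCT, ports starting with PRODUCT_ID and GROUP_ID, and `sorted_ports=2` produces a statement ending in `ORDER BY PRODUCT.PRODUCT_ID, PRODUCT.GROUP_ID`, matching the example in the table.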

Lesson 11-5. Target Override

By default, target tables are updated based on key values. You can change this in target properties:

1. Update Override

2. Generate SQL

3. Edit UPDATE WHERE clause with non-key items
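The effect of editing the WHERE clause can be sketched with a small Python helper that assembles an UPDATE statement whose WHERE clause uses columns of your choice instead of the key. The `:TU.` prefix for target ports is an assumption here and should be compared with the statement that Generate SQL produces in your own Designer; the function, table and column names are hypothetical.

```python
def build_update_override(table, set_columns, where_columns):
    """Build an UPDATE statement that matches rows on where_columns."""
    set_clause = ", ".join(f"{col} = :TU.{col}" for col in set_columns)
    where_clause = " AND ".join(f"{col} = :TU.{col}" for col in where_columns)
    return f"UPDATE {table} SET {set_clause} WHERE {where_clause}"
```

For instance, `build_update_override("T_SALES", ["TOTAL_SALES"], ["EMP_NAME"])` matches target rows on the non-key column EMP_NAME.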

Lesson 11-6. Session Task Mapping Overrides

You can override some mapping attributes in the Session task Mapping tab.

Examples

♦ Source readers: Turn a relational source into a flat file

♦ User-defined join: Modify a homogeneous join in the Source Qualifier

♦ Source filters: Add a filter to the Source Qualifier

♦ Target writers: Turn a relational target into a flat file


Unit 11 Lab: Load Employee Dimension Table

Business Purpose

The Mersche Motors data warehouse employee table is updated on a daily basis. Source rows from the staging area need to be tested to see if a row already exists in the dimension table. Rows need to be tagged for update or insert accordingly. Any rows containing bad data will need to be written to an error file.

Technical Description

Rows from the STG_EMPLOYEES table need to be loaded into the DIM_EMPLOYEES table. Before loading the rows, EMPLOYEE_ID needs to be tested for NULL values. Invalid rows need to be written to an error file. Valid rows need to be tested to see if they already exist in DIM_EMPLOYEES and tagged for either INSERT or UPDATE accordingly. Finally, any rows sent to the DIM_EMPLOYEES table need to get valid dates from DIM_DATES.

Goals

♦ Use of Update Strategy to tag rows for INSERT or UPDATE.

♦ Use of the Router transformation to conditionally route rows to different target instances.

♦ Source Qualifier Session property override.

♦ Using the Default Values option for NULL data replacement.

♦ Overriding Target writer option.

Duration

60 minutes


Velocity Deliverable: Mapping Specifications

Mapping Name m_DIM_EMPLOYEES_LOAD_xx

Source System Oracle Table

Target System Oracle Table, Flat File

Initial Rows 88, 21

Rows/Load 85, 21, 3, 0

Short Description Move data from staging table to the dimension target table with error rows written to a flat file. Lookups required for date entries and to target table to test existing rows.

Load Frequency Daily

Preprocessing Target Append/Update

Post Processing

Error Strategy Null employee_id rows written to error file

Reload Strategy

Unique Source Fields EMPLOYEE_ID

SOURCES

Tables

Table Name Schema/Owner Selection/Filter

STG_EMPLOYEES TDBUxx SQ override for daily loads only

TARGETS

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

DIM_EMPLOYEES X X EMPLOYEE_ID

Files

File Name File Location Fixed/Delimited Additional File Info

dim_employees_err1.outt C:\pmfiles\TgtFiles Fixed Based on DIM_EMPLOYEES definition

LOOKUPS

Lookup Name lkp_DIM_EMPLOYEES_EMPLOYEE_ID

Table DIM_EMPLOYEES Location TDBUxx

Match Condition(s) STG_EMPLOYEES.EMPLOYEE_ID = DIM_EMPLOYEES.EMPLOYEE_ID

Filter/SQL Override


Lookup Name lkp_DIM_DATES_INSERTS

Table DIM_DATES Location TDBUxx

Match Condition(s) STG_EMPLOYEES.DATE_ENTERED = DIM_DATES.DATE_VALUE

Filter/SQL Override Reuse persistent cache from previous lab

Lookup Name lkp_DIM_DATES_UPDATES

Table DIM_DATES Location TDBUxx

Match Condition(s) STG_EMPLOYEES.DATE_ENTERED = DIM_DATES.DATE_VALUE

Filter/SQL Override Reuse persistent cache from previous lab

HIGH LEVEL PROCESS OVERVIEW

[Figure: process flow diagram. Nodes: Relational Source, Expression, Lookup, Router; Update Strategy and Lookup feeding Relational Target (Inserts); Update Strategy and Lookup feeding Relational Target (Updates); Flat File Target (Errors).]

PROCESSING DESCRIPTION (DETAIL)

The DIM_EMPLOYEES table needs to be loaded from the STG_EMPLOYEES table. STG_EMPLOYEES holds two days' worth of data, 01/02/2003 and 01/03/2003. The second day contains corrections to some of the first day's data. The mapping needs to be executed twice, and a manual Source Qualifier override will be required for both runs.

The DIM_EMPLOYEES and DIM_DATES tables will be used as Lookup tables. Any rows with a null value for employee_id need to be routed to an error file. Substitute the NULL employee_id with 99999 using the default value option.


SOURCE TO TARGET FIELD MATRIX

Target Table Target Column Source Table Source Column Expression Default Value if Null

DIM_EMPLOYEES EMPLOYEE_ID STG_EMPLOYEES EMPLOYEE_ID

DIM_EMPLOYEES EMPLOYEE_NAME STG_EMPLOYEES EMPLOYEE_NAME

DIM_EMPLOYEES EMPLOYEE_ADDRESS STG_EMPLOYEES EMPLOYEE_ADDRESS

DIM_EMPLOYEES EMPLOYEE_CITY STG_EMPLOYEES EMPLOYEE_CITY

DIM_EMPLOYEES EMPLOYEE_STATE STG_EMPLOYEES EMPLOYEE_STATE

DIM_EMPLOYEES EMPLOYEE_ZIP_CODE STG_EMPLOYEES EMPLOYEE_ZIP_CODE

DIM_EMPLOYEES EMPLOYEE_COUNTRY STG_EMPLOYEES EMPLOYEE_COUNTRY

DIM_EMPLOYEES EMPLOYEE_PHONE_NMBR STG_EMPLOYEES EMPLOYEE_PHONE_NMBR

DIM_EMPLOYEES EMPLOYEE_FAX_NMBR STG_EMPLOYEES EMPLOYEE_FAX_NMBR

DIM_EMPLOYEES EMPLOYEE_EMAIL STG_EMPLOYEES EMPLOYEE_EMAIL

DIM_EMPLOYEES EMPLOYEE_GENDER STG_EMPLOYEES EMPLOYEE_GENDER

DIM_EMPLOYEES AGE_GROUP STG_EMPLOYEES AGE_GROUP

DIM_EMPLOYEES NATIVE_LANG_DESC STG_EMPLOYEES NATIVE_LANG_DESC

DIM_EMPLOYEES SEC_LANG_DESC STG_EMPLOYEES SEC_LANG_DESC

DIM_EMPLOYEES TER_LANG_DESC STG_EMPLOYEES TER_LANG_DESC

DIM_EMPLOYEES POSITION_TYPE STG_EMPLOYEES POSITION_TYPE

DIM_EMPLOYEES DEALERSHIP_ID STG_EMPLOYEES DEALERSHIP_ID

DIM_EMPLOYEES REGIONAL_MANAGER STG_EMPLOYEES REGIONAL_MANAGER

DIM_EMPLOYEES DEALERSHIP_MANAGER STG_EMPLOYEES DEALERSHIP_MANAGER

DIM_EMPLOYEES INSERT_DK DIM_DATES DATE_KEY Lookup to DIM_DATES table matching the date entered column from the STG_EMPLOYEES to the date value column in the DIM_DATES table

DIM_EMPLOYEES UPDATE_DK DIM_DATES DATE_KEY Lookup to DIM_DATES table matching the date entered column from the STG_EMPLOYEES to the date value column in the DIM_DATES table


Instructions

Step 1: Copy the Mapping

1. Launch the Designer and open your assigned folder.

2. Copy the m_DIM_EMPLOYEES_LOAD partial mapping from the DEV_SHARED folder to your student folder and rename it to m_DIM_EMPLOYEES_LOAD_xx.

3. Click Yes when the Target Dependencies dialog box comes up.

4. Save your work.

Step 2: Edit the Expression Transformation

1. Open the mapping m_DIM_EMPLOYEES_LOAD_xx.

Your mapping should appear similar to Figure 11-2.

2. Edit the exp_NULL_EMPLOYEE_ID Expression transformation and add a Default value of 99999 to the EMPLOYEE_ID port.

3. Click the button to validate the default entry, and then click OK.

Figure 11-1. Mapping copy Target Dependencies dialog box

Figure 11-2. Iconic view of the m_DIM_EMPLOYEES_MAPPING


4. Save your work.

Step 3: Create a Router Transformation

The Router transformation is going to be used to determine which rows will be inserted, updated or sent to the error file. This will be done by checking the value of the EMPLOYEE_ID port.

1. Add a Router transformation to the mapping.

2. Drag both ports from lkp_DIM_EMPLOYEES_EMPLOYEE_ID into the Router.

3. Drag all ports except EMPLOYEE_ID from exp_NULL_EMPLOYEE_ID to the Router.

4. Edit the Router transformation.

a. Rename the Router to rtr_DIM_EMPLOYEES.

b. In the Groups tab add 3 new groups using the Add new group icon:

i. Name the first group INSERTS.

ii. Add the Group filter condition:

ISNULL(EMPLOYEE_ID) AND IN_EMPLOYEE_ID != 99999

iii. Name the second group UPDATES.

iv. Add the Group Filter Condition:

NOT ISNULL(EMPLOYEE_ID) AND IN_EMPLOYEE_ID != 99999

v. Name the third group ERRORS.

vi. Add the Group Filter Condition:

IN_EMPLOYEE_ID = 99999

Router should look similar to Figure 11-3.
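
The three group filter conditions above can be read as the following Python sketch. It assumes that EMPLOYEE_ID is the value returned by lkp_DIM_EMPLOYEES_EMPLOYEE_ID (NULL when the employee is not yet in the dimension) and that IN_EMPLOYEE_ID is the source value after the default of 99999 has replaced NULL. The function name is hypothetical, and Python's `None` stands in for NULL.

```python
DEFAULT_EMPLOYEE_ID = 99999  # default value set on the Expression port


def route(in_employee_id, lookup_employee_id):
    """Return the names of the Router groups whose filter condition is true."""
    if in_employee_id is None:
        in_employee_id = DEFAULT_EMPLOYEE_ID  # default value replaces NULL
    groups = []
    # ISNULL(EMPLOYEE_ID) AND IN_EMPLOYEE_ID != 99999
    if lookup_employee_id is None and in_employee_id != DEFAULT_EMPLOYEE_ID:
        groups.append("INSERTS")
    # NOT ISNULL(EMPLOYEE_ID) AND IN_EMPLOYEE_ID != 99999
    if lookup_employee_id is not None and in_employee_id != DEFAULT_EMPLOYEE_ID:
        groups.append("UPDATES")
    # IN_EMPLOYEE_ID = 99999
    if in_employee_id == DEFAULT_EMPLOYEE_ID:
        groups.append("ERRORS")
    return groups


print(route(101, None))   # new employee
print(route(101, 101))    # employee already in DIM_EMPLOYEES
print(route(None, None))  # NULL source id, replaced by 99999
```

Each condition is tested independently, so the conditions are written to be mutually exclusive; otherwise a row could be sent to more than one group.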

Step 4: Create an Update Strategy for INSERTS

1. Add an Update Strategy transformation named upd_INSERTS to the mapping.

Figure 11-3. Router Groups


2. In the Router, scroll down to the INSERTS Router group and drag all ports, except EMPLOYEE_ID1 and HIRE_DATE1, to the upd_INSERTS Update Strategy transformation.

3. Edit the upd_INSERTS Update Strategy transformation:

a. Rename the IN_EMPLOYEE_ID1 port to EMPLOYEE_ID.

b. In the Properties tab:

i. Select the Update Strategy Expression Value box. Delete the 0 and enter DD_INSERT. See Figure 11-4.

Step 5: Create Lookup to DIM_DATES

1. Create a Lookup transformation named lkp_DIM_DATES_INSERTS that references the SC_DIM_DATES target table.

2. Pass DATE_ENTERED1 from upd_INSERTS to lkp_DIM_DATES_INSERTS.

3. Edit the lkp_DIM_DATES_INSERTS Lookup transformation:

a. Clear the Output checkmark on every port except DATE_KEY.

b. Rename the DATE_ENTERED1 port to IN_DATE_ENTERED.

c. Create the condition DATE_VALUE = IN_DATE_ENTERED. Ensure that you use DATE_VALUE and not DATE_KEY.

d. In the Properties tab set the following values:

♦ Lookup cache persistent = Checked (needs to be set)

♦ Cache File Name Prefix = LKPSTUxx (where xx is your student number)
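
Conceptually, the cached lookup configured above behaves like a dictionary keyed on DATE_VALUE that returns DATE_KEY. The Python sketch below illustrates only that idea; the DIM_DATES rows are made up, and it does not model how PowerCenter stores or reuses a persistent cache file.

```python
from datetime import date

# Hypothetical DIM_DATES rows as (DATE_KEY, DATE_VALUE); the values are made up.
dim_dates = [(1, date(2003, 1, 2)), (2, date(2003, 1, 3))]

# Build the cache once, keyed on the condition column DATE_VALUE.
cache = {date_value: date_key for date_key, date_value in dim_dates}


def lookup_date_key(in_date_entered):
    """Return DATE_KEY where DATE_VALUE = IN_DATE_ENTERED, or None if no match."""
    return cache.get(in_date_entered)


print(lookup_date_key(date(2003, 1, 2)))
```

Keying the cache on DATE_KEY instead would never match an incoming date, which is why the step stresses using DATE_VALUE in the condition.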

Figure 11-4. Update Strategy set to INSERT


Step 6: Link upd_INSERTS and lkp_DIM_DATES_INSERTS to Target DIM_EMPLOYEES_INSERTS

1. Link the DATE_KEY port from lkp_DIM_DATES_INSERTS to the INSERT_DK column in the DIM_EMPLOYEES_INSERTS target.

2. Right click anywhere in the workspace and select Autolink…

3. Select upd_INSERTS from the From transformation drop box and DIM_EMPLOYEES_INSERTS from the To transformation box. Select the More button and enter a '1' for From Transformation Suffix.

4. Click OK.

5. Iconize the upd_INSERTS, lkp_DIM_DATES_INSERTS and DIM_EMPLOYEES_INSERTS transformations.

6. Save your work.

Step 7: Create an Update Strategy for UPDATES

1. Create an Update Strategy transformation named upd_UPDATES.

2. In the Router, scroll down to the UPDATES Router group and drag all ports, except IN_EMPLOYEE_ID3 and HIRE_DATE3, to the upd_UPDATES Update Strategy transformation.

3. Edit the upd_UPDATES Update Strategy transformation.

4. In the Properties tab, select the Update Strategy Expression Value box. Delete the 0 and enter DD_UPDATE.

Step 8: Create Second Lookup to DIM_DATES

1. Right click on the existing lkp_DIM_DATES_INSERTS Lookup transformation and select Copy.

2. Move the cursor to the workspace, right click, and select Paste.

3. Link DATE_ENTERED3 from upd_UPDATES to IN_DATE_ENTERED in the new Lookup transformation.

4. Edit the new Lookup transformation:

a. Rename the new Lookup lkp_DIM_DATES_UPDATES.

b. Ensure the Lookup condition is: DATE_VALUE = IN_DATE_ENTERED.

Step 9: Link upd_UPDATES and lkp_DIM_DATES_UPDATES to Target DIM_EMPLOYEES_UPDATES

1. From lkp_DIM_DATES_UPDATES, link DATE_KEY to UPDATE_DK in DIM_EMPLOYEES_UPDATES.

2. Right click anywhere in the workspace and select Autolink.

3. Select upd_UPDATES from the From transformation drop box and DIM_EMPLOYEES_UPDATES from the To transformation box. Select the More button and enter a '3' for From transformation Suffix.

4. Click OK.


5. Iconize the upd_UPDATES, lkp_DIM_DATES_UPDATES and DIM_EMPLOYEES_UPDATES transformations.

6. Save your work.

Step 10: Link ERRORS Router Group to DIM_EMPLOYEES_ERR

Using Autolink…

1. Select the ERRORS group of rtr_DIM_EMPLOYEES from the From Transformation drop down box and DIM_EMPLOYEES_ERR from the To Transformation box. Select the More>> button and enter a '4' for From Transformation Suffix.

2. Click OK.

3. Delete the link for EMPLOYEE_ID4 and link instead IN_EMPLOYEE_ID4.

4. Save your work and ensure the mapping is VALID.

5. Select Arrange All Iconic. The mapping should look similar to Figure 11-5:

Step 11: Create and Run the Workflow

The first thing that we need to do is to run a pre-created workflow that loads three dimension tables.

1. Launch the Workflow Manager and sign into your assigned folder.

2. Locate and run the wkf_U11_Preload_DIM_PAYMENT_DEALERSHIP_PRODUCT_xx workflow. Make sure that it completed successfully and that all rows were successful.

3. Create a workflow named wkf_DIM_EMPLOYEES_LOAD_xx.

4. Add a new Session task named s_m_DIM_EMPLOYEES_LOAD_xx, using the m_DIM_EMPLOYEES_LOAD_xx mapping.

5. Link the Start task to the new Session task.

Figure 11-5. Iconic view of the completed mapping


6. Edit the Session task Mapping tab:

a. Select the node in the navigation window.

i. Change all DB Connection values that relate to the target tables (DIM) to NATIVE_EDWxx.

ii. Change all DB Connection values that relate to the source tables (STG) to NATIVE_STGxx.

iii. Change the $Target connection value to NATIVE_EDWxx as well. (This will take care of the three lookup tables pointing to $Target.)

b. In the Mapping tab navigator window:

i. Click on SQ_STG_EMPLOYEES.

ii. Scroll down in the Properties section window to the Source Filter attribute.

iii. Add the Source Filter condition: DATE_ENTERED = '01/02/2003'

c. Click on the target DIM_EMPLOYEES_ERR.

i. Under the Writers section: Change Relational Writer to File Writer. The error handling specifications want error rows written to a file, not a table.

Figure 11-6. Source Filter Value

Tip: It is sometimes easier to add a quick Source filter in the Session than to go back and modify the mapping, save it, refresh the session, save it, and then run the workflow. Overrides entered in the session take precedence over the entries in the mapping until the override is deleted. If the source is a shortcut, make sure the prefix to the table name is deleted before saving the filter.
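
The effect of the Source Filter can be pictured as a WHERE clause appended to the query issued for the source. The Python sketch below illustrates that idea under simplifying assumptions: the SELECT list is shortened to two columns, and the helper name is hypothetical. It is not the Integration Service's actual SQL generation.

```python
def apply_source_filter(select_sql, source_filter):
    """Append a source filter to a generated SELECT as a WHERE clause."""
    source_filter = source_filter.strip()
    if not source_filter:
        return select_sql  # no filter set: the query is unchanged
    return f"{select_sql} WHERE {source_filter}"


# Shortened column list, for illustration only.
base = "SELECT EMPLOYEE_ID, DATE_ENTERED FROM STG_EMPLOYEES"
day_one = apply_source_filter(base, "DATE_ENTERED = '01/02/2003'")
print(day_one)
```

For the second run of this lab, only the date literal in the filter changes; the mapping itself is untouched.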


ii. In the Properties Attribute, rename the Output filename to include your student number.

7. Save your work and start the workflow.

8. Review the Task Details and Source/Target statistics. They should be the same as displayed in Figure 11-8 and Figure 11-9.

Figure 11-7. Writers section of Target schema

Tip: To create a flat file as a target instead of the original table, simply change the Writers type from Relational to File. A fixed width flat file based on the format of the target definition will be created automatically. The properties of this file can also be altered by the user.
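
To make the idea of a fixed-width file concrete, the Python sketch below renders a row using a list of column widths. The widths, the sample row, and the truncate-then-pad behavior are assumptions chosen for illustration; the real file layout comes from the target definition and the file properties set in the session.

```python
def format_fixed_width(row, layout):
    """Render one row as a fixed-width record; `layout` is (column, width) pairs."""
    fields = []
    for column, width in layout:
        value = "" if row.get(column) is None else str(row[column])
        # Assumption: values are truncated to the width, then padded with spaces.
        fields.append(value[:width].ljust(width))
    return "".join(fields)


layout = [("EMPLOYEE_ID", 6), ("EMPLOYEE_NAME", 12)]  # hypothetical widths
record = format_fixed_width(
    {"EMPLOYEE_ID": 99999, "EMPLOYEE_NAME": "Pat Lee"}, layout
)
print(repr(record))
```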

Figure 11-8. Task Details of the completed session run

Figure 11-9. Source/Target Statistics


Data Results

Preview the DIM_EMPLOYEES target data from the Designer. Your data should appear similar to Figure 11-10.

Scroll all the way to the right to confirm that the INSERT_DK column was populated and the UPDATE_DK column was not.

Also, you may want to review the three rows that were written to the error file. See the instructor for the location of the files. If the Integration Service process runs on UNIX, you may need special permission from your administrator to see the files.

Step 12: Prepare, Run, and Monitor the Second Run

1. Edit the s_m_DIM_EMPLOYEES_LOAD_xx session task.

2. In the Mapping tab, click SQ_STG_EMPLOYEES in the Navigation window.

3. Scroll down the Properties section and edit the Source filter to reflect day two loading: 01/03/2003.

4. Save and run the workflow.

5. Review the Task Details and Source/Target statistics. They should be the same as displayed in Figure 11-12 and Figure 11-13.

Figure 11-10. Data Results for DIM_EMPLOYEES

Figure 11-11. Data Results for the Error Flat File (Located on the Machine Hosting the Integration Service Process)

Figure 11-12. Task Details tab results for second run

Figure 11-13. Source/Target Statistics for second run


Preview the DIM_EMPLOYEES target data from the Designer. Scroll to the far right of the data screen and notice that there are now entries for UPDATE_DK and new entries at the bottom of the list for INSERT_DK.

Figure 11-14. Data preview showing updates to the target table


Unit 12: Dynamic Lookup and Error Logging

After completing this module, you should be able to:

♦ Describe the following features:

♦ Dynamic Lookup cache

♦ Error logging

♦ Use these features in mappings and workflows

Lesson 12-1. Dynamic Lookup Cache

Type

Passive.

Description

A basic Lookup transformation allows the inclusion of additional information in the transformation process from an external database or flat file source. However, when the lookup table is also the target, row data may go out of sync with the target table image loaded in memory. A Lookup transformation with a dynamic cache keeps the in-memory image of the target lookup table synchronized with its physical table in the database.

Business Purpose

In a data warehouse, dimension tables are frequently updated, and changes to new row data must be captured within a load cycle.


Example

A business updates its customer master table on a daily basis. Within a day, a customer may change their status or correct an error in their information. A new customer record may be added in the morning and a change to that record may be added later in the day. The change (an insert followed by an update) needs to be detected dynamically.

The following data is an example of two new records followed by two changed records within the day. The record for David Mulberry shows a change in the zip code from 02061 to 02065. The record for Silvia Williamson shows a change in marital status from “S” to “M”.

The following mapping uses the Dynamic Lookup Cache option of the Lookup transformation to capture the changes:


Dynamic Cache Properties

For more detailed explanations consult the online help.

Option Lookup Type Description

Dynamic Lookup Cache Relational Indicates to use a dynamic lookup cache. Inserts or updates rows in the lookup cache as it passes rows to the target table. Use only with the lookup cache enabled.

Output Old Value On Update Relational When you enable this property, the Integration Service outputs old values out of the lookup/output ports. When the Integration Service updates a row in the cache, it outputs the value that existed in the lookup cache before it updated the row based on the input data. When the Integration Service inserts a new row in the cache, it outputs null values.

Insert Else Update Relational Applies to rows entering the Lookup transformation with the row type of insert. When you select this property and the row type entering the Lookup transformation is insert, the Integration Service inserts the row into the cache if it is new, and updates the row if it exists. If you do not select this property, the Integration Service only inserts new rows into the cache when the row type entering the Lookup transformation is insert.

Update Else Insert Relational Applies to rows entering the Lookup transformation with the row type of update. When you select this property and the row type entering the Lookup transformation is update, the Integration Service updates the row in the cache if it exists, and inserts the row if it is new. If you do not select this property, the Integration Service only updates existing rows in the cache when the row type entering the Lookup transformation is update.


Key Port Points

♦ The Lookup transformation “Associated Port” matches a Lookup input port with the corresponding port in the Lookup cache.

♦ The “Ignore Null Inputs for Updates” option should be checked for ports where null data in the input stream would otherwise overwrite the corresponding field in the Lookup cache.

♦ The “Ignore in Comparison” should be checked for any port that is not to be compared.

♦ The flag “New Lookup Row” indicates the type of row manipulation of the cache. If an input row creates an insert in the Lookup cache the flag is set to “1”. If an input row creates an update of the lookup cache the flag is set to “2”. If no change is detected the flag is set to “0”. A Filter or Router transformation can be used with an Update Strategy transformation to set the proper row tag to update a target table.

Performance Considerations

A large lookup table may require more memory resources than available. A SQL override in the Lookup transformation can be used to reduce the amount of memory used by the Lookup cache.

New Lookup Row Description

0 The Integration Service does not update or insert the row in the cache.

1 The Integration Service inserts the row into the cache.

2 The Integration Service updates the row in the cache.
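
The NewLookupRow values above can be illustrated with a small Python sketch of a dynamic cache. It is a simplified model, not the Integration Service's implementation: it handles only rows with the row type of insert, treats every port as if “Ignore Null Inputs for Updates” were checked, and uses made-up CUST_ID values and column names for the two customers from the example.

```python
NO_CHANGE, INSERTED, UPDATED = 0, 1, 2  # NewLookupRow values


def process_row(cache, row, key="CUST_ID", insert_else_update=True):
    """Apply one insert-type row to the cache and return the NewLookupRow flag."""
    cached = cache.get(row[key])
    if cached is None:
        cache[row[key]] = dict(row)  # new key: insert into the cache
        return INSERTED
    if not insert_else_update:
        return NO_CHANGE  # without Insert Else Update, existing rows are left alone
    # Null inputs are skipped so they do not overwrite cached values.
    changes = {c: v for c, v in row.items() if v is not None and cached.get(c) != v}
    if not changes:
        return NO_CHANGE
    cached.update(changes)
    return UPDATED


rows = [
    {"CUST_ID": 1, "NAME": "David Mulberry", "ZIP": "02061"},
    {"CUST_ID": 2, "NAME": "Silvia Williamson", "MARITAL_STATUS": "S"},
    {"CUST_ID": 1, "NAME": "David Mulberry", "ZIP": "02065"},
    {"CUST_ID": 2, "NAME": "Silvia Williamson", "MARITAL_STATUS": "M"},
]
cache = {}
flags = [process_row(cache, r) for r in rows]
print(flags)
```

A static cache would report both customers as new on every row, because it is never refreshed during the run; the dynamic cache is what allows the second pair of rows to be recognized as updates.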


Lesson 12-2. Error Logging

PowerCenter recognizes the following types of errors:

♦ Transformation. An error occurs within a transformation. The data row has only passed partway through the mapping transformation logic.

♦ Data reject. The data row is fully transformed according to the mapping logic but due to a data issue, it cannot be written to the target. For example:

♦ Target database constraint violations, out-of-space errors, log space errors, null values not accepted

♦ Target table properties 'reject truncated/overflowed rows'

A data reject can also be forced by an Update Strategy.

These error types are recorded as follows:

Transformation errors

Logging OFF (Default): All errors written to session log, then row discarded.

Logging ON: Fatal errors written to session log. All errors appended to flat file or relational tables.

Data rejects

Logging OFF (Default): Appended to reject (.bad) file configured for session target.

Logging ON: Written to row error tables or file.


Error logging is set in the Session task:

Error Log Types


Affects location attributes. Values are:

♦ None (no external error logging)

♦ Relational Database - produces 4 tables:

♦ PMERR_SESS: Session metadata e.g. workflow name, session name, repository name

♦ PMERR_MSG: Error messages for a row of data

♦ PMERR_TRANS: Transformation metadata e.g. transformation group name, source name, port names with datatypes

♦ PMERR_DATA: Error row and source row data in string format e.g. [indicator1: data1 | indicator2: data2]

♦ Flat File - produces one file containing session metadata followed by de-normalized error information in the following format:

Transformation || Transformation Mapplet Name || Transformation Group || Partition Index || Transformation Row ID || Error Sequence || Error Timestamp || Error UTC Time || Error Code || Error Message || Error Type || Transformation Data || Source Mapplet Name || Source Name || Source Row ID || Source Row Type || Source Data
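
If you want to inspect a flat file error log outside PowerCenter, a short script can split each error row into named columns. The Python sketch below assumes the columns are separated by `||` exactly as shown above and that no data value contains that delimiter; the sample line is made up, and a real log also begins with session metadata lines that this sketch does not handle.

```python
FIELDS = [
    "Transformation", "Transformation Mapplet Name", "Transformation Group",
    "Partition Index", "Transformation Row ID", "Error Sequence",
    "Error Timestamp", "Error UTC Time", "Error Code", "Error Message",
    "Error Type", "Transformation Data", "Source Mapplet Name", "Source Name",
    "Source Row ID", "Source Row Type", "Source Data",
]


def parse_error_line(line, delimiter="||"):
    """Split one error-log line into a dict keyed by the column names above."""
    values = [value.strip() for value in line.split(delimiter)]
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} columns, got {len(values)}")
    return dict(zip(FIELDS, values))


# A made-up line purely for illustration; real values will differ.
sample = " || ".join(f"value{i}" for i in range(len(FIELDS)))
parsed = parse_error_line(sample)
print(parsed["Error Code"], parsed["Source Name"])
```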


Log Row Data

Log Source Row Data


Source row logging does not work downstream of active transformations (where output rows are not uniquely correlated with input rows).


Unit 12 Lab: Load Customer Dimension Table

Business Purpose

Mersche Motors data warehouse has a customer table that is loaded on a daily basis. Many customers visit dealership locations more than once on a daily basis so the warehouse logic has to be able to track multiple visits on the same day. This logic must be able to test if a customer record has already been loaded in the current run and if so, what, if anything, has changed about the customer.

Technical Description

PowerCenter will source from the staging table STG_CUSTOMERS and load the dimension table DIM_CUSTOMERS. Customer data may have more than one occurrence in the source. Data will have to be tested for new rows, existing rows and invalid rows. A Dynamic Lookup will need to be used since a customer row could occur more than once in the source. Some rows will have null data so flat file error logging will be used to capture these.

Objectives

♦ Introduce Dynamic Lookups.

♦ Reinforce the Update Strategy.

♦ Introduce error logging.

Duration

50 minutes


Velocity Deliverable: Mapping Specifications

Mapping Name m_DIM_CUSTOMERS_DYN_DAILY_LOAD_xx

Source System Oracle Table

Target System Oracle Table

Initial Rows 6177

Rows/Load 6147

Short Description Customer data will be loaded into the customer dimension table.

Load Frequency Daily

Preprocessing

Post Processing

Error Strategy Flat file error logging

Reload Strategy

Unique Source Fields CUST_ID

SOURCES

Tables

Table Name Schema/Owner Selection/Filter

STG_CUSTOMERS TDBUxx

TARGETS

Tables Schema Owner TDBUxx

Table Name Update Delete Insert Unique Key

DIM_CUSTOMERS X X X CUST_ID

HIGH LEVEL PROCESS OVERVIEW

[Figure: process flow diagram. Relational Source, Lookup, Filter, Update Strategy, Relational Target.]

PROCESSING DESCRIPTION (DETAIL)

The source staging table contains customer data that needs to be tested dynamically against the target dimension table in order to process possible duplicate customers. Based on the test results, records will be marked for insertion if new, for update if the row requires an update, or for rejection if the row is determined to be invalid. Invalid rows will be rejected during the update strategy and sent to an error logging file.


SOURCE TO TARGET FIELD MATRIX

Target Table Target Column Source Table Source Column Expression Default Value if Null

DIM_CUSTOMERS CUST_ID STG_CUSTOMERS CUST_ID

DIM_CUSTOMERS CUST_NAME STG_CUSTOMERS CUST_NAME

DIM_CUSTOMERS CUST_ADDRESS STG_CUSTOMERS CUST_ADDRESS

DIM_CUSTOMERS CUST_CITY STG_CUSTOMERS CUST_CITY

DIM_CUSTOMERS CUST_STATE STG_CUSTOMERS CUST_STATE

DIM_CUSTOMERS CUST_ZIP_CODE STG_CUSTOMERS CUST_ZIP_CODE

DIM_CUSTOMERS CUST_COUNTRY STG_CUSTOMERS CUST_COUNTRY

DIM_CUSTOMERS CUST_PHONE_NMBR STG_CUSTOMERS CUST_PHONE_NMBR

DIM_CUSTOMERS CUST_GENDER STG_CUSTOMERS CUST_GENDER

DIM_CUSTOMERS CUST_AGE_GROUP STG_CUSTOMERS CUST_AGE_GROUP

DIM_CUSTOMERS CUST_INCOME STG_CUSTOMERS CUST_INCOME

DIM_CUSTOMERS CUST_E_MAIL STG_CUSTOMERS CUST_E_MAIL

DIM_CUSTOMERS CUST_AGE STG_CUSTOMERS CUST_AGE


Instructions

Step 1: Create a Relational Source Definition

1. Launch the Designer and sign into your assigned folder.

2. Verify you are in the Source Analyzer tool and create a shortcut to the STG_CUSTOMERS source table found in the DEV_SHARED folder.

3. Rename it to SC_STG_CUSTOMERS.

Step 2: Create a Relational Target Definition

1. Open the Target Designer tool.

2. Create a shortcut to the DIM_CUSTOMERS target table found in the DEV_SHARED folder.

3. Rename it to SC_DIM_CUSTOMERS.

Step 3: Create a Mapping

1. Open the Mapping Designer tool.

2. Create a new mapping named m_DIM_CUSTOMERS_DYN_DAILY_LOAD_xx.

3. Add the SC_STG_CUSTOMERS relational source to the new mapping.

4. Add the SC_DIM_CUSTOMERS relational target to the new mapping.

5. Save your work.

Step 4: Create a Lookup Transformation

1. Create a new Lookup transformation using the SC_DIM_CUSTOMERS table.

2. Drag the lookup window and make it taller.

3. Select all the ports from SQ_SC_STG_CUSTOMERS and drop them on to an empty port at the bottom of the Lookup.

4. Edit the Lookup transformation.

a. Rename it lkp_DIM_CUSTOMERS.

b. Select the Properties tab.

i. Click on the Dynamic Lookup Cache value.

ii. Click on the Insert Else Update value.

c. Select the Ports tab and for all ports coming from SQ_SC_STG_CUSTOMERS prefix them with IN_ and remove the “1” from the end of the name.

d. Select the Condition tab and create the condition CUST_ID = IN_CUST_ID.


e. Select the Ports tab again. It should look the same as Figure 12-1. Notice the new port entry called NewLookupRow.

Figure 12-1. Port tab view of a dynamic Lookup

Note: Dynamic lookups allow for inserts and updates to take place in cache as the same operations take place against the target table.

Note: The Associated port column is there to allow the association of input ports with lookup ports of different names. This enables PowerCenter to update the Lookup Cache with correct values.

Note: NewLookupRow stores one of the values 0, 1, or 2:

0 = No change

1 = Insert

2 = Update


f. Under the Associated Port column, click the box where it says “N/A” and select the port names from the list that you want to associate. See Figure 12-2.

g. Associate the remaining ports.

h. Clear the Output checkmarks for all of the ports prefixed with “IN_”.

5. Click OK and save your work.

Step 5: Create a Filter Transformation

1. Create a Filter transformation named fil_ROWS_UNCHANGED.

2. Drag all output ports from the Lookup transformation to the Filter transformation.

3. Create a condition that allows all rows that are marked for update or insert, or all rows where the CUST_ID is NULL to pass through. Any rows where NewLookupRow != 0 are deemed to be inserts or updates. If you need assistance refer to the reference section at the end of the lab.
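
The logic of the filter condition described in Step 5 can be sketched in Python as follows. This only illustrates the intended truth table and is not the transformation-language expression itself; the function name is hypothetical, and `None` stands in for NULL.

```python
def passes_filter(new_lookup_row, cust_id):
    """True when the row is an insert/update candidate or has a NULL CUST_ID."""
    return new_lookup_row != 0 or cust_id is None


print(passes_filter(0, 1001))  # unchanged row with a valid id: filtered out
print(passes_filter(2, 1001))  # changed row: passes
print(passes_filter(0, None))  # NULL id: passes so it can be rejected and logged
```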

Step 6: Create an Update Strategy

1. Create an Update Strategy transformation named upd_DIM_CUSTOMERS.

2. Drag all ports from the Filter transformation to the Update Strategy transformation.

3. Edit the upd_DIM_CUSTOMERS Update Strategy transformation.

a. Add an Update Strategy Expression that marks the row as an insert, update or reject. Use the following pseudo code to construct your expression. If you need assistance refer to the reference section at the end of the lab.

If CUST_ID is NULL then reject the row

Else if NewLookupRow equals 1 then mark the row for insert

Else if NewLookupRow equals 2 then mark the row for update.

b. Ensure the Forward Rejected Rows option is checked. This will send any rejected rows to error logs which will be created later.

4. Autolink ports by name to the SC_DIM_CUSTOMERS target.

Figure 12-2. Port to Port Association

Tip: Refer to the Unit 11 lab for details on the Update Strategy Transformation.

5. Save your work.

Step 7: Create and Run the Workflow

1. Launch the Workflow Manager and sign into your assigned folder.

2. Create a new workflow named wkf_DIM_CUSTOMERS_DYN_DAILY_LOAD_xx.

3. Add a new Session task that uses the m_DIM_CUSTOMERS_DYN_DAILY_LOAD_xx mapping.

4. Edit the s_m_DIM_CUSTOMERS_DYN_DAILY_LOAD_xx session task.

a. Set the connection value for the SQ_STG_CUSTOMERS source to your assigned NATIVE_STGxx connection object.

b. Set the connection value for the SC_DIM_CUSTOMERS target to your assigned NATIVE_EDWxx connection object.

c. In the Config Object tab:

i. In the Error Handling section, change Error Log Type from None to Flat File, as shown in Figure 12-4.

ii. Change the Error Log File Name to PMErrorxx.log where xx refers to your student number.

5. Save your work and start the workflow.

Figure 12-3. Iconic View of the Completed Mapping

Figure 12-4. Error Log Choice Screen

Note: In a Production environment, error logging tables or files would be created in a different schema or location than the production schema or file location.

6. Review the Task Details. Your information should appear similar to Figure 12-5.

7. Select the Source/Target Statistics tab. Your statistics should be the same as displayed in Figure 12-6.

Figure 12-5. Task Details of the Completed Session Run

Figure 12-6. Source/Target Statistics for the Session Run

Data Results

Preview the target data from the Designer. Your data should appear the same as displayed in Figure 12-7.

Figure 12-7. Data preview of the DIM_CUSTOMERS table

Error Log Results

The error log is written to the BadFiles directory configured for the Integration Service process, under the file name you specified in the session (PMErrorxx.log). Look in this location for the error log and review the rows that were written there. The log should appear similar to Figure 12-8.

Reference

1. fil_ROWS_UNCHANGED Condition

NewLookupRow != 0 OR ISNULL(CUST_ID)

2. upd_DIM_CUSTOMERS Expression

IIF(ISNULL(CUST_ID), DD_REJECT, IIF(NewLookupRow = 1, DD_INSERT, IIF(NewLookupRow = 2, DD_UPDATE)))
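The two reference expressions above can be read as a filter followed by a row-marking decision. The Python sketch below mirrors that logic so the routing is easy to trace. The function names and sample rows are hypothetical, and the DD_ constants are represented as plain strings for readability.

```python
def passes_filter(cust_id, new_lookup_row):
    # Mirrors: NewLookupRow != 0 OR ISNULL(CUST_ID)
    return new_lookup_row != 0 or cust_id is None


def update_strategy(cust_id, new_lookup_row):
    # Mirrors the nested IIF in the upd_DIM_CUSTOMERS expression.
    if cust_id is None:
        return "DD_REJECT"
    if new_lookup_row == 1:
        return "DD_INSERT"
    if new_lookup_row == 2:
        return "DD_UPDATE"
    # The reference expression has no final else branch; the filter upstream
    # is expected to have removed these rows already.
    return None


# (CUST_ID, NewLookupRow) pairs; sample values are made up.
rows = [(101, 0), (102, 1), (103, 2), (None, 0)]
routed = [(c, update_strategy(c, n)) for c, n in rows if passes_filter(c, n)]
print(routed)
```

The unchanged row (101) is dropped by the filter, while the row with a NULL CUST_ID passes the filter only so that the Update Strategy can reject it and forward it to the error log.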

Figure 12-8. Flat file error log

Unit 13: Unconnected Lookup, Parameters and Variables

After completing this unit, you should be able to:

♦ Describe these features:

♦ Unconnected Lookup transformation

♦ System variables

♦ Mapping parameters and variables

♦ Use these features in mappings and workflows

Lesson 13-1. Unconnected Lookup Transformations

Type

Passive.

Description

The unconnected Lookup transformation allows the inclusion of additional information in the transformation process from an external database or flat file source when it is referenced within any transformation that supports expressions.

Business Purpose

A source table or file may have a percentage of records with incomplete data. The holes in the data can be filled by performing a look up to another table or tables. As only a percentage of the rows are affected it is better to perform the look up on only those rows that need it and not the entire data set.

Example

In the following example, an insurance business receives records of policy renewals. A small percentage of the records are missing the CUSTOMER_ID field. The mapping uses an unconnected Lookup transformation to fill in the missing data.

Key Points

♦ Use the lookup function within a conditional statement.

♦ The condition is evaluated for each row but the lookup function is only called if the condition evaluates to TRUE.

♦ The unconnected Lookup transformation is called using the key expression :lkp.lookupname.

♦ Data from several input ports may be passed to the Lookup transformation but only one port may be returned.

♦ An unconnected Lookup transformation returns only one value, designated by the Lookup transformation R (return) port.

♦ If the R port is not checked, the mapping will be valid but the session created from the mapping will fail at run time.
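The key points above can be sketched in Python: the lookup is wrapped in a condition, so it is called only for rows that need it, and it returns a single value. The port name POLICY_NO, the function names, and the sample data are assumptions made for illustration. In PowerCenter the equivalent expression would take the general form IIF(ISNULL(CUSTOMER_ID), :LKP.lookupname(POLICY_NO), CUSTOMER_ID).

```python
# Counts how often the lookup is actually invoked.
stats = {"lookup_calls": 0}


def lkp_customer_id(policy_no, lookup_table):
    """Stand-in for the unconnected lookup: several inputs are possible, one return value."""
    stats["lookup_calls"] += 1
    return lookup_table.get(policy_no)


def fill_customer_id(row, lookup_table):
    # The condition is evaluated for every row, but the lookup runs only when it is true.
    if row["CUSTOMER_ID"] is None:
        return lkp_customer_id(row["POLICY_NO"], lookup_table)
    return row["CUSTOMER_ID"]


lookup_table = {"P-2": 502}  # hypothetical POLICY_NO -> CUSTOMER_ID data
renewals = [
    {"POLICY_NO": "P-1", "CUSTOMER_ID": 501},
    {"POLICY_NO": "P-2", "CUSTOMER_ID": None},
    {"POLICY_NO": "P-3", "CUSTOMER_ID": 503},
]
filled = [fill_customer_id(r, lookup_table) for r in renewals]
print(filled, stats["lookup_calls"])
```

Only one of the three rows triggers the lookup, which is the reason an unconnected lookup suits data sets where a small percentage of rows have gaps.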

Performance Considerations

Using a cached Lookup attribute can improve performance if the Lookup table is static.

Connected versus Unconnected Lookup Transformations

Joins versus Lookups

Lesson 13-2. System Variables

Description

System variables hold information that is derived from the system. The user cannot control the content of the variable but can use the information contained within the variable. Three variables that we will discuss are described in the table shown below.

Business Purpose

The main reason that system variables are utilized to build mappings in PowerCenter is that they can provide consistency to program execution. Business and systems professionals will find this very useful when building systems.

Example

Setting a port to the system date.

To set a port to the system date, use an expression within a transformation. This example sets the DATE_UPDATED port to the system date.

Port: DATE_UPDATED

Datatype: Date

Expression: SYSDATE

Variable | Description
SESSSTARTTIME | The time that the session starts execution. This is based on the time of the Integration Service.
SYSDATE | The current date/time on the system that PowerCenter is running on.
$$$SESSSTARTTIME | The session start time returned as a string.
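The practical difference between SESSSTARTTIME and SYSDATE can be sketched in Python: one value is captured once when the run begins, the other is read again for each row. This is an analogy only. The function stamp_row and the SESSION_START column are hypothetical, and the sketch assumes SYSDATE is evaluated as each row is processed.

```python
import time
from datetime import datetime

# Captured once, like SESSSTARTTIME: the same value for every row in the run.
sess_start_time = datetime.now()


def stamp_row(row):
    """Attach both timestamps to a row."""
    return {
        **row,
        "DATE_UPDATED": datetime.now(),    # like SYSDATE: read when the row is processed
        "SESSION_START": sess_start_time,  # constant for the whole run
    }


stamped = []
for row in [{"id": 1}, {"id": 2}, {"id": 3}]:
    stamped.append(stamp_row(row))
    time.sleep(0.01)  # simulate processing time between rows

session_values = {r["SESSION_START"] for r in stamped}
print(len(session_values))
```

Using the session start time gives every row in a load the same timestamp, which is one way system variables provide consistency to program execution.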

Lesson 13-3. Mapping Parameters and Variables

Description

A mapping can utilize parameters and variables to store information during the execution. Each parameter and variable is defined with a specific data type. Parameters are different from variables in that the value of a parameter is fixed during the run of the mapping while the value for a variable can change. Both parameters and variables can be accessed from anywhere in the mapping.

To create a parameter or variable, select Mappings > Parameters and Variables from within the Mapping Designer in the Designer client.

Scope

Parameters and variables can only be utilized inside of the object that they are created in. For instance a mapping variable created for mapping_1 can only be seen and used in mapping_1 and is not available in any other mapping or mapplet. A parameter or variable's scope is the mapping in which it was created. As a general rule for Informatica, when a variable is created its scope is relative to the object in which it was created.

User-defined variable and parameter names must always begin with $$.

$$PARAMETER_NAME or $$VARIABLE_NAME

To change the value of a variable, you must use one of the following functions within an expression:

Function Name | Usage Notes | Example
SetVariable | Sets the variable to a value that you specify (executes only if a row is marked as insert or update). At the end of a successful session, the Integration Service saves either the MAX or MIN of (start value, final value) to the repository, depending on the aggregation type of the variable. Unless overridden, it uses the saved value as the start value of the variable for the next session run. | SetVariable($$VAR_NAME, 1)
SetCountVariable | Increments a counter variable: +1 if the row type is Insert, -1 if the row type is Delete, and 0 for Update and Reject. | SetCountVariable($$COUNT_VAR)
SetMaxVariable | Compares the current value to the value passed into the function. Returns the higher value and sets the current value to the higher value. | SetMaxVariable($$MAX_VAR,10)
SetMinVariable | Compares the current value to the value passed into the function. Returns the lower value and sets the current value to the lower value. | SetMinVariable($$MIN_VAR,10)

At the end of a successful session, the values of variables are saved to the repository. The SetVariable function writes the final value of a variable to the repository based on the Aggregation Type selected when the variable was defined. The final value written to the repository is not necessarily the last value processed by the SetVariable function. For a variable with a MAX Aggregation Type, the final value written to the repository is whichever value is greater, the current value or the initial value. For a variable with a MIN Aggregation Type, it is whichever value is smaller, the current value or the initial value.
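The rules for the variable functions can be modeled in a few lines of Python. This is a toy model based only on the descriptions in this lesson, not PowerCenter's implementation. The class MappingVariable and its method names are hypothetical.

```python
class MappingVariable:
    """Toy model of a mapping variable, following the rules described in this lesson."""

    def __init__(self, start_value, aggregation):
        self.start_value = start_value
        self.current = start_value
        self.aggregation = aggregation  # "MAX", "MIN" or "COUNT"

    def set_variable(self, value):
        self.current = value
        return self.current

    def set_max_variable(self, value):
        self.current = max(self.current, value)
        return self.current

    def set_min_variable(self, value):
        self.current = min(self.current, value)
        return self.current

    def set_count_variable(self, row_type):
        # +1 for insert, -1 for delete, 0 for update and reject.
        self.current += {"insert": 1, "delete": -1}.get(row_type, 0)
        return self.current

    def value_saved_at_session_end(self):
        # MAX and MIN compare the start value with the final current value.
        if self.aggregation == "MAX":
            return max(self.start_value, self.current)
        if self.aggregation == "MIN":
            return min(self.start_value, self.current)
        return self.current


max_var = MappingVariable(start_value=10, aggregation="MAX")
max_var.set_variable(4)  # the last value processed is 4
saved_max = max_var.value_saved_at_session_end()

counter = MappingVariable(start_value=0, aggregation="COUNT")
for row_type in ["insert", "insert", "delete", "update", "reject"]:
    counter.set_count_variable(row_type)
saved_count = counter.value_saved_at_session_end()
print(saved_max, saved_count)
```

The MAX example shows why the saved value is not necessarily the last value processed: the variable was last set to 4, but the start value of 10 is greater, so 10 is what would be saved.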

Variable Definition

The Integration Service determines the value of a variable by checking for it in a specific order. The following table describes the order of precedence.

Number | Item | Description
1 | Parameter File | This file can hold information about definitions of variables and parameters.
2 | Repository Saved Value | Values for variables that were saved in the repository upon the successful completion of a session.
3 | Initial Value | The initial value as defined by the user.
4 | Default Value | The default value set by the system.

Parameter Definition

The Integration Service determines the value of a parameter by checking for it in a specific order. The following table describes the order of precedence.

Number | Item | Description
1 | Parameter File | This file can hold information about definitions of variables and parameters.
2 | Initial Value | The initial value as defined by the user.
3 | Default Value | The default value set by the system.

Purpose

Mapping variables and parameters are used:

♦ To simplify mappings by carrying information within or between transformations.

♦ To improve maintainability by allowing quick changes to values within a mapping.
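The order of precedence described under Variable Definition and Parameter Definition can be sketched as a chain of lookups in Python. The function name, the dictionaries standing in for the parameter file and repository, and the sample values are all hypothetical.

```python
def resolve_start_value(name, parameter_file, repository, initial_values,
                        system_default, is_variable):
    """Return the first value found, checking sources in order of precedence."""
    sources = [parameter_file]
    if is_variable:
        sources.append(repository)  # saved values apply to variables only
    sources.append(initial_values)
    for source in sources:
        if name in source:
            return source[name]
    return system_default


# Made-up contents for each source.
parameter_file = {"$$MAX_DISCOUNT": 17.25}
repository = {"$$LAST_RUN_DT": "11/12/2006"}
initial_values = {"$$LAST_RUN_DT": "01/01/1900", "$$MAX_DISCOUNT": 20.0}

as_variable = resolve_start_value(
    "$$LAST_RUN_DT", parameter_file, repository, initial_values, "", True)
as_parameter = resolve_start_value(
    "$$MAX_DISCOUNT", parameter_file, repository, initial_values, 0, False)
print(as_variable, as_parameter)
```

In this sketch the variable picks up the value saved by the previous run because the parameter file does not mention it, while the parameter takes its value from the parameter file ahead of its initial value.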

We will discuss two examples, one using a variable and one using a parameter. The first example uses a variable to implement incremental extracts from relational sources. The second example uses parameters to replace “naked” numbers and strings within expressions.

Example 1

Tracking Last Execution Date

To set up a mapping to perform an incremental extract, we will utilize a variable to track when the mapping was last executed. We will then use this variable as part of the SQL that extracts the data to ensure that we only pick up new and modified records. The variable will be updated to today's date when the mapping is complete so that we can use it the next time we run.

The SQL WHERE clause will be modified in the Source Qualifier Transformation. The following is an example of a statement that could become part of a Source Qualifier Filter.

F1_LAST_UPDATE_DATE >= '$$LAST_RUN_DT'

Where:

♦ F1_LAST_UPDATE_DATE is a database field that contains the date when the record was last touched

♦ $$LAST_RUN_DT is a user-created mapping variable that holds the date of the last run. Note that this variable is surrounded by single quotes. The quotes are required so that the SQL syntax is valid.

To set the value of $$LAST_RUN_DT, we will use the SetVariable function.

SetVariable($$LAST_RUN_DT, SESSSTARTTIME)
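The incremental extract pattern in Example 1 can be traced with a short Python sketch. The in-memory rows, the function names, and the dates are hypothetical. The list filter stands in for the Source Qualifier filter, and the returned session start time stands in for the value that SetVariable stores for the next run.

```python
from datetime import datetime


def extract_changed_rows(rows, last_run_dt):
    # Stands in for the filter: F1_LAST_UPDATE_DATE >= '$$LAST_RUN_DT'
    return [r for r in rows if r["F1_LAST_UPDATE_DATE"] >= last_run_dt]


def run_session(rows, last_run_dt, sess_start_time):
    """Extract new and modified rows, then return the value to keep for the next run."""
    extracted = extract_changed_rows(rows, last_run_dt)
    return extracted, sess_start_time


source = [
    {"id": 1, "F1_LAST_UPDATE_DATE": datetime(2006, 11, 1)},
    {"id": 2, "F1_LAST_UPDATE_DATE": datetime(2006, 11, 10)},
]
last_run_dt = datetime(2006, 11, 5)

first_run, last_run_dt = run_session(source, last_run_dt, datetime(2006, 11, 12))

# A new record arrives between runs.
source.append({"id": 3, "F1_LAST_UPDATE_DATE": datetime(2006, 11, 13)})
second_run, last_run_dt = run_session(source, last_run_dt, datetime(2006, 11, 14))

print([r["id"] for r in first_run], [r["id"] for r in second_run])
```

Each run picks up only the rows touched since the previous run started, because the stored date moves forward every time the session completes.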

Example 2

Replacement of Nameless Numbers

It is not always wise to embed a number or character string into expressions because the support team may not understand the meaning of the number or character string. To help eliminate misunderstandings, use parameters to leave a better record of how the value is derived.

Without a parameter, the expression might be:

IIF(ISNULL(SOLD_DT),TO_DATE('1/1/3000','MM/DD/YYYY'))

Someone could misinterpret this statement, possibly thinking it is a mistake. If we used a parameter, there would be less chance of a misunderstanding. For instance, the following statement is much clearer.

IIF(ISNULL(SOLD_DT),$$OFFICIAL_DEFAULT_DT)

$$OFFICIAL_DEFAULT_DT is equal to '1/1/3000'. If all mappings used the same parameter and a common parameter file, then it would be easy to ensure that all processes used the same value. This would ensure consistency. Additional examples of ways mappings could use variables or parameters to replace nameless numbers and strings are shown in the following table.

Variable/Parameter Usage Examples

Reason/Goal | Potential Value | Param or Var | Name
Replace naked numbers, e.g. in an expression that determines whether tech support cases have been open more than 100 days. | 100 | Param | $$MAX_NUM_DAYS_OPEN
Replace naked characters. Set the value of the Processing Center where the session is executed using a variable that is defined in the mapping and has its value set in a parameter file. | 'US' | Var | $$REG_PROC_LOCATION
Consistency. Use parameters to make sure that everyone uses the same value in expressions. Create two parameters that represent yes and no. Have all mappings use the same values via a parameter file. | 'Y', 'N' | Param, Param | $$YES_1_CHAR, $$NO_1_CHAR

Unit 13 Lab: Load Sales Fact Table

Business Purpose

Mersche Motors dealerships sometimes give aggressive discounts that are outside the authorized range. These discounts affect only a small percentage of the rows being processed, but they still need to be handled correctly. Also, even though the source staging table contains 7 days of data, this production run will load the entire staging table into the SALES_FACT table in a single workflow.

Technical Description

The information needed resides in two separate staging tables. To compound this, the relationship between the two tables does not exist on the database, so referential integrity will have to be created within PowerCenter. Special formulas are needed to process the out-of-range discounts. To make this more efficient, mapping parameters and variables will be used.

Objectives

♦ Unconnected Lookup transformation

♦ Aggregator transformation

♦ Mapping parameters

Duration

35 Minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_FACT_SALES_LOAD_xx
Source System: Oracle Tables
Target System: Oracle Table
Initial Rows: 5475, 5
Rows/Load: 5441
Short Description: Will have to join two tables to get the payment id. This relationship does not exist in the RDBMS so it will need to be created in PowerCenter.
Load Frequency: Daily
Preprocessing: Target Append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

SOURCES

Tables

Table Name | Schema/Owner | Selection/Filter
STG_TRANSACTIONS | TDBUxx | Where STG_TRANSACTIONS.PAYMENT_DESC = STG_PAYMENT.PAYMENT_TYPE_DESC
STG_PAYMENT | TDBUxx |

TARGETS

Tables (Schema Owner: TDBUxx)

Table Name | Update | Delete | Insert | Unique Key
FACT_SALES | X | X | X | CUST_ID, PRODUCT_KEY, DEALERSHIP_ID, PROMO_ID, DATE_KEY

LOOKUPS

Lookup Name: lkp_DIM_DATES
Table: DIM_DATES
Location: TDBUxx
Match Condition(s): STG_TRANSACTIONS.TRANSACTION_DATE = DIM_DATES.DATE_VALUE
Filter/SQL Override: Reuse persistent cache

Lookup Name: lkp_DIM_PROMOTIONS
Table: DIM_PROMOTIONS
Location: TDBUxx
Match Condition(s): STG_TRANSACTIONS.PROMO_ID = DIM_PROMOTIONS.PROMO_ID
Lookup Type: Unconnected

Lookup Name: lkp_DIM_PRODUCT
Table: DIM_PRODUCT
Location: TDBUxx
Match Condition(s): STG_TRANSACTIONS.PRODUCT_ID = DIM_PRODUCT.PRODUCT_ID
Filter/SQL Override:

HIGH LEVEL PROCESS OVERVIEW

[Diagram: Source1, Source2, Expression, Aggregator, Target, and three Lookups]

PROCESSING DESCRIPTION (DETAIL)

The mapping will join two relational tables that do not have a PK-FK relationship. This relationship will have to be created within PowerCenter. There are lookups required to get a date key and promotion indicator. An unconnected Lookup will be used in situations where a valid discount value needs to be obtained. Also, there are a number of values that need to be derived before being loaded into the FACT_SALES table.

SOURCE TO TARGET FIELD MATRIX

Target Table | Target Column | Source Table | Source Column | Expression | Default Value if Null
FACT_SALES | CUST_ID | STG_TRANSACTIONS | CUST_ID | |
FACT_SALES | PRODUCT_KEY | DIM_PRODUCT | PRODUCT_KEY | Lookup from STG_TRANSACTIONS to DIM_PRODUCT using PRODUCT_ID as the lookup value. |
FACT_SALES | DEALERSHIP_ID | STG_TRANSACTIONS | DEALERSHIP_ID | |
FACT_SALES | PAYMENT_ID | STG_PAYMENT | PAYMENT_ID | Source Qualifier join on payment description (stg_transactions/stg_payment). |
FACT_SALES | PROMO_ID | STG_TRANSACTIONS | PROMO_ID | |
FACT_SALES | DATE_KEY | DIM_DATES | DATE_KEY | Lookup from STG_TRANSACTIONS to DIM_DATES using TRANSACTION_DATE as the lookup value. |
FACT_SALES | UNITS_SOLD | STG_TRANSACTIONS | Derived | Sum of SALES_QTY |
FACT_SALES | REVENUE | STG_TRANSACTIONS | Derived | Sum of ((SELLING_PRICE * SALES_QTY) - DISCOUNT - HOLDBACK - REBATE) |
FACT_SALES | COST | STG_TRANSACTIONS | Derived | Sum of (UNIT_COST * SALES_QTY) |
FACT_SALES | DISCOUNT | STG_TRANSACTIONS, STG_PROMOTIONS | DISCOUNT/Derived | If the discount is > 17.75 then look up to the STG_PROMOTIONS table to select a discount rate. The discount is the discount rate divided by 100 times the selling price. |
FACT_SALES | HOLDBACK | STG_TRANSACTIONS | HOLDBACK | |
FACT_SALES | REBATE | STG_TRANSACTIONS | REBATE | |

Instructions

Step 1: Create an Internal Relationship Between two Source Tables

1. Launch the Designer and sign into your assigned folder.

2. Drag the STG_TRANSACTIONS and STG_PAYMENT relational source tables into the Source Analyzer workspace.

3. The PAYMENT_DESC column from STG_TRANSACTIONS and the PAYMENT_TYPE_DESC column of the STG_PAYMENT table are logically related so we can build a join on them. They both contain the payment type description.

Link the PAYMENT_DESC column from STG_TRANSACTIONS to the PAYMENT_TYPE_DESC column of the STG_PAYMENT table. This will create a PK-FK relationship between the two tables.

Your source definitions should look the same as displayed in Figure 13-1.
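The effect of the logical relationship created in this step can be pictured as a join on the payment description. The Python sketch below is illustrative only. The sample rows and the function name are made up, and it assumes the join keeps only transactions with a matching payment type.

```python
def join_on_payment_desc(transactions, payments):
    """Attach PAYMENT_ID to each transaction by matching the payment description."""
    payment_id_by_desc = {p["PAYMENT_TYPE_DESC"]: p["PAYMENT_ID"] for p in payments}
    return [
        {**t, "PAYMENT_ID": payment_id_by_desc[t["PAYMENT_DESC"]]}
        for t in transactions
        if t["PAYMENT_DESC"] in payment_id_by_desc
    ]


# Made-up sample rows for STG_PAYMENT and STG_TRANSACTIONS.
stg_payment = [
    {"PAYMENT_ID": 1, "PAYMENT_TYPE_DESC": "CASH"},
    {"PAYMENT_ID": 2, "PAYMENT_TYPE_DESC": "FINANCE"},
]
stg_transactions = [
    {"CUST_ID": 10, "PAYMENT_DESC": "FINANCE"},
    {"CUST_ID": 11, "PAYMENT_DESC": "CASH"},
]
joined = join_on_payment_desc(stg_transactions, stg_payment)
print([(r["CUST_ID"], r["PAYMENT_ID"]) for r in joined])
```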

Step 2: Create a Mapping Parameter

1. Open the mapping named m_FACT_SALES_LOAD_xx.

2. Add a mapping parameter by clicking Mappings > Parameters and Variables.

Note: Creating the PK-FK relationship within the Source Analyzer does not create this relationship on the actual database tables. The relationship is created on the source definitions within PowerCenter only.

Figure 13-1. Source Analyzer view of the STG_TRANSACTIONS and STG_PAYMENT tables

3. On the next screen, click the Add a new variable to this table icon. See Figure 13-2:

4. Create a new parameter.

♦ Parameter Name = $$MAX_DISCOUNT

♦ Type = Parameter

♦ Datatype = decimal

♦ Precision = 15,2

♦ For the initial value, enter 17.25.

5. Click OK.

6. Save your work.

Step 3: Create an Unconnected Lookup

1. Create a Lookup transformation using the SC_DIM_PROMOTIONS relational target table and name it lkp_DIM_PROMOTIONS.

Figure 13-2. Declare Parameters and Variables screen

Figure 13-3. Parameter entry

2. Under the Ports tab:

a. Click on PROMO_ID, then click the Copy icon, and then the Paste icon.

b. Name the new port IN_PROMO_ID and make it an input only port.

c. Make DISCOUNT the Return port.

d. Uncheck the Output ports for all other ports except PROMO_ID and DISCOUNT.

The Lookup should look the same as Figure 13-4:

3. Create the lookup condition comparing PROMO_ID to IN_PROMO_ID.

4. Click OK and save the repository.

Step 4: Add Unconnected Lookup Test to Expression

1. Edit the exp_DISCOUNT_TEST Expression transformation.

2. If the IN_DISCOUNT port has a value greater than the value passed in via a mapping parameter, then we need to get an acceptable value from the DIM_PROMOTIONS table. The variable port v_DISCOUNT will be used to hold the return value. Edit the v_DISCOUNT variable port and add the expression:

IIF(IN_DISCOUNT > $$MAX_DISCOUNT, :LKP.LKP_DIM_PROMOTIONS(PROMO_ID),IN_DISCOUNT)

3. The discount is held as a whole number. We need to change this to a percentage and apply it against the selling price to derive the dollar value of the discount. Edit the output port OUT_DISCOUNT and add the expression:

v_DISCOUNT / 100 * SELLING_PRICE
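The two expressions in this step can be followed with a short Python sketch. The promotions dictionary stands in for the DIM_PROMOTIONS lookup and its contents are made up. The function names are hypothetical, and MAX_DISCOUNT uses the initial value of the $$MAX_DISCOUNT parameter from Step 2.

```python
MAX_DISCOUNT = 17.25  # initial value of $$MAX_DISCOUNT

# Hypothetical PROMO_ID -> DISCOUNT data standing in for DIM_PROMOTIONS.
promotions = {1: 10.0, 2: 15.0}


def v_discount(in_discount, promo_id):
    # Mirrors: IIF(IN_DISCOUNT > $$MAX_DISCOUNT, :LKP.LKP_DIM_PROMOTIONS(PROMO_ID), IN_DISCOUNT)
    if in_discount > MAX_DISCOUNT:
        return promotions[promo_id]  # the lookup runs only for out-of-range discounts
    return in_discount


def out_discount(in_discount, promo_id, selling_price):
    # Mirrors: v_DISCOUNT / 100 * SELLING_PRICE
    return v_discount(in_discount, promo_id) / 100 * selling_price


in_range = out_discount(5.0, 1, 10000.0)
out_of_range = out_discount(20.0, 1, 20000.0)
print(round(in_range, 2), round(out_of_range, 2))
```

A discount of 20 exceeds the maximum, so the rate from the promotions data (10) is applied instead, while the in-range discount of 5 is used as supplied.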

Step 5: Create Aggregator Transformation

1. Create an Aggregator transformation named agg_FACT_SALES.

2. Drag the PRODUCT_KEY port from lkp_DIM_PRODUCT to agg_FACT_SALES.

3. Drag the DATE_KEY port from lkp_DIM_DATES to agg_FACT_SALES.

Figure 13-4. Lookup Ports tab showing input, output and return ports checked/unchecked

4. Drag the following ports from the Expression transformation to the Aggregator:

♦ PAYMENT_ID

♦ CUST_ID

♦ DEALERSHIP_ID

♦ PROMO_ID

♦ SELLING_PRICE

♦ UNIT_COST

♦ SALES_QTY

♦ HOLDBACK

♦ REBATE

♦ OUT_DISCOUNT

5. Open the Aggregator and re-order the key ports in the following order:

CUST_ID, PRODUCT_KEY, DEALERSHIP_ID, PAYMENT_ID, PROMO_ID, DATE_KEY.

6. Group by the ports in Figure 13-5:

7. Uncheck the output ports for SELLING_PRICE, UNIT_COST and SALES_QTY.

8. Rename the following ports:

♦ SELLING_PRICE to IN_SELLING_PRICE.

♦ UNIT_COST to IN_UNIT_COST.

♦ SALES_QTY to IN_SALES_QTY.

♦ OUT_DISCOUNT to DISCOUNT.

Figure 13-5. Aggregator ports with Group By ports checked

9. Add the following new ports:

♦ Create a new output port after the DISCOUNT port.

Port Name: OUT_UNITS_SOLD
Datatype: decimal
Precision: 3
Expression: SUM(IN_SALES_QTY)

♦ Create a new output port after the OUT_UNITS_SOLD port.

Port Name: OUT_REVENUE
Datatype: decimal
Precision: 15,2
Expression: SUM( ( IN_SELLING_PRICE * IN_SALES_QTY) - DISCOUNT - HOLDBACK - REBATE)

♦ Create a new output port after the OUT_REVENUE port.

Port Name: OUT_COST
Datatype: decimal
Precision: 15,2
Expression: SUM( IN_UNIT_COST)

The Aggregator ports should be the same as displayed in Figure 13-6.

Figure 13-6. Finished Aggregator

10. Use Autolink by name to link the ports from the agg_FACT_SALES transformation to the SC_FACT_SALES target table. You will need to use the prefix of OUT_ to link all of the ports.

The results should appear the same as Figure 13-7.

11. Save your work.

12. Iconize the mapping.
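The work done by agg_FACT_SALES can be sketched in Python as a group-by with three running sums. This is a sketch under assumptions: the group-by keys are taken to be the six key ports listed in Step 5, the sample rows are made up, and the function name is hypothetical.

```python
from collections import defaultdict

# Assumed group-by ports (the key ports re-ordered in Step 5).
GROUP_BY = ("CUST_ID", "PRODUCT_KEY", "DEALERSHIP_ID", "PAYMENT_ID", "PROMO_ID", "DATE_KEY")


def aggregate_sales(rows):
    """Sum units, revenue and cost for each combination of group-by keys."""
    totals = defaultdict(lambda: {"OUT_UNITS_SOLD": 0, "OUT_REVENUE": 0, "OUT_COST": 0})
    for r in rows:
        key = tuple(r[name] for name in GROUP_BY)
        t = totals[key]
        t["OUT_UNITS_SOLD"] += r["IN_SALES_QTY"]               # SUM(IN_SALES_QTY)
        t["OUT_REVENUE"] += (r["IN_SELLING_PRICE"] * r["IN_SALES_QTY"]
                             - r["DISCOUNT"] - r["HOLDBACK"] - r["REBATE"])
        t["OUT_COST"] += r["IN_UNIT_COST"]                     # SUM(IN_UNIT_COST)
    return dict(totals)


keys = dict(CUST_ID=1, PRODUCT_KEY=7, DEALERSHIP_ID=3, PAYMENT_ID=2, PROMO_ID=1, DATE_KEY=20061113)
rows = [
    {**keys, "IN_SELLING_PRICE": 20000, "IN_SALES_QTY": 1, "IN_UNIT_COST": 15000,
     "DISCOUNT": 1000, "HOLDBACK": 200, "REBATE": 300},
    {**keys, "IN_SELLING_PRICE": 10000, "IN_SALES_QTY": 2, "IN_UNIT_COST": 8000,
     "DISCOUNT": 500, "HOLDBACK": 100, "REBATE": 0},
]
result = aggregate_sales(rows)
print(result[(1, 7, 3, 2, 1, 20061113)])
```

Both sample rows share the same key values, so they collapse into a single output row, which is why the target receives fewer rows than the source supplies.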

Step 6: Create and Run the Workflow

1. Launch the Workflow Manager client and sign into your assigned folder.

2. Create a new workflow named wkf_FACT_SALES_LOAD_xx.

3. Add a new Session task named s_m_FACT_SALES_LOAD_xx that uses the m_FACT_SALES_LOAD_xx mapping.

4. Edit the s_m_FACT_SALES_LOAD_xx session.

a. Set the connection value for the sq_STG_TRANSACTIONS_PAYMENT source table to NATIVE_STGxx where xx is your student number.

b. Set the connection value for the SC_FACT_SALES target table to NATIVE_EDWxx where xx is your student number.

c. Change the Target load type to Normal.

Figure 13-7. Aggregator to Target Links

Figure 13-8. Iconic view of the completed mapping

d. Under the Mapping tab, select the lkp_DIM_DATES transformation and ensure that the Cache File Name Prefix is set to your pre-defined persistent cache (LKPSTUxx).

e. Under the Session Properties tab, set $Target connection value to NATIVE_EDWxx.

5. Save your work.

6. Start the workflow.

7. Review the Task Details.

8. Review the Source/Target Statistics. Your statistics should be the same as displayed in Figure 13-10.

Figure 13-9. Task Details of the completed session run

Figure 13-10. Source/Target Statistics of the completed session run

Data Results

Preview the target data from the Designer. Your data should appear the same as displayed in Figure 13-11.

Note: Not all rows and columns are shown.

Figure 13-11. Data Preview of the FACT_SALES target table

Unit 14: Mapplets

After completing this unit, you should be able to:

♦ Describe mapplets

♦ Use a mapplet in a mapping

Lesson 14-1. Mapplets

Mapplets

Description

Mapplets can combine multiple mapping objects for reusability; they can also simplify complex mapping maintenance. A mapplet can receive the input data from either an internal source or from the mapping pipeline that calls the mapplet. A mapplet must pass data out of the mapplet via a Mapplet Output transformation.

Mapplet Input Transformation

Type

Passive or Active.

Description

The Mapplet Input transformation acts as an input to a Mapplet.

Example

In the following example, a business needs to apply discounts to its daily sales data, perform a number of lookups, and aggregate the sales values. This functionality is used in several types of feeds, so a mapplet was created to provide it to many mappings.

The Mapplet Input transformation receives the sales transactions by customer. Discounts are applied, and then two lookups find the product key and date key. An Aggregator sums the cost and revenue. A Mapplet Output transformation passes the output of the mapplet back into the mapping that called it.

Mapplet Output Transformation

Type

Passive.

Description

The Mapplet Output transformation acts as an output from a Mapplet.

Example

The following example illustrates the Mapplet Output transformation.

The following example illustrates a Mapplet with multiple output groups.

Warning: When the mapplet is expanded at runtime, an unconnected output group can result in a transformation having no output connections. If that is illegal, the mapping will be invalid.

Examples:

♦ If the mapplet outputs are fed by an Expression transformation, the mapping is invalid because an Expression requires a connected output.

♦ If the mapplet outputs are fed by a Router, the mapping is valid because a Router can have unconnected output groups.

Unit 14 Lab: Create a Mapplet

Business Purpose

The team lead has noticed that there are other situations where we can reuse some of the transformations developed in the FACT_SALES load mapping.

Technical Description

To take advantage of previously created objects, we will create a mapplet from existing objects used in a previous mapping. This mapplet can then be used in other mappings.

Objectives

♦ Create a Mapplet

Duration

15 Minutes

Instructions

Step 1: Create the Mapplet

1. In the Mapping Designer, re-open the m_FACT_SALES_LOAD_xx mapping.

2. Highlight the following five transformations by holding down the Ctrl key and pressing the left mouse button:

♦ lkp_DIM_PROMOTIONS

♦ lkp_DIM_PRODUCT

♦ lkp_DIM_DATES

♦ exp_DISCOUNT_TEST

♦ agg_FACT_SALES

3. Select Edit > Copy or type Ctrl+C.

4. Open Mapplet Designer. Create a mapplet named mplt_AGG_SALES.

a. Select Edit > Paste or type Ctrl+V.

b. Right-click in the workspace and select Arrange All.

c. Select the Scale to Fit icon.

Your mapplet definition should look the same as Figure 14-1.

d. Add a mapplet Input transformation.

e. Name the Mapplet Input Transformation in_TRANSACTIONS.

f. Add a mapplet Output transformation.

g. Name the Mapplet Output Transformation out_TRANSACTIONS.

Figure 14-1. Mapplet Designer view of mplt_AGG_SALES

h. From the exp_DISCOUNT_TEST transformation, drag all Input ports to the Input transformation.

i. From the Aggregator agg_FACT_SALES, drag all Output ports to the Output transformation.

j. Select the Scale to Fit icon.

k. The mapplet should look similar to Figure 14-2.

l. Save your work.

Notice the mapplet is invalid. Scroll through the messages in the output window. They point to the expression exp_DISCOUNT_TEST as having an invalid symbol reference. The reference to the parameter $$MAX_DISCOUNT is invalid as it does not exist within the mapplet parameter definition.

m. Create a new parameter:

♦ Parameter Name = $$MAX_DISCOUNT

♦ Type = Parameter

♦ Datatype = decimal

♦ Precision = 15,2

♦ Initial Value = 17.25

5. Save your work.

Step 2: Add Mapplet to Mapping

1. Make a copy of the m_FACT_SALES_LOAD_xx mapping and open it in the Mapping Designer.

2. Rename the mapping to m_FACT_SALES_LOAD_MAPPLET_xx.

3. Delete the 5 transformations that you previously copied to the mapplet.

4. Drag the mapplet mplt_AGG_SALES into the mapping.

5. Use Autolink by name to link the ports from the sq_STG_TRANSACTIONS_PAYMENT to the mplt_AGG_SALES input.

Figure 14-2. Mapplet Designer view of mplt_AGG_SALES with Input and Output transformations

Note: Mapping parameters and variables that are created in a mapping are not available for use in a mapplet that is called from the mapping.

6. Manually link the DISCOUNT port to the IN_DISCOUNT port.

7. Use Autolink by name to link the Output portion of the mapplet to the target. You will need to specify OUT_ for the prefix and 1 for the suffix.

8. Arrange All Iconic.

9. Save your work.

Your mapping should look the same as Figure 14-3.

Step 3: Create and Run the Workflow

1. Launch the Workflow Manager and sign into your assigned folder.

2. Create a new workflow named wkf_FACT_SALES_LOAD_MAPPLET

3. Add the Session you just created, edit, and link. (You know how to do this by now!)

4. Save your work, start the workflow, and review the details.

Figure 14-3. Iconic view of the m_FACT_SALES_LOAD_MAPPLET_xx mapping

Unit 15: Mapping Design

After completing this unit, you should be able to:

♦ Name key considerations for mapping design

♦ Describe the practice of designing a mapping

The workshop will give you practice in designing your own mappings.

Lesson 15-1. Designing Mappings

Description

This lesson provides a checklist of topics to consider during the mapping development process. It covers a variety of situations that developers will have to address and helps them ask the right questions before and during the design process.

What to Consider

The mapping process requires much more up-front research than it appears. Before designing a mapping, it is important to have a clear picture of the end-to-end processes that the data will flow through.

♦ Design a high-level view of the mapping and document a picture of the process with the mapping, using a textual description to explain exactly what the mapping is supposed to accomplish and the methods or steps it will follow to accomplish its goal.

♦ After the high level flow has been established, document the details at the field level, listing each of the target fields and the source field(s) that are used to create the target field. Document any expression that may take place in order to generate the target field (e.g., a sum of a field, a multiplication of two fields, a comparison of two fields, etc.). Whatever the rules, be sure to document them at this point and remember to keep it at a physical level. The designer may have to do some investigation at this point for some business rules. For example, the business rules may say 'For active customers, calculate a late fee rate'. The designer of the mapping must determine that, on a physical level, that translates to 'for customers with an ACTIVE_FLAG of “1”, multiply the DAYS_LATE field by the LATE_DAY_RATE field'.

♦ Create an inventory of Mappings and Reusable objects. This list is a 'work in progress' list and will have to be continually updated as the project moves forward. The lists are valuable to all but particularly for the lead developer. These objects can be assigned to individual developers and progress tracked over the course of the project.

♦ The administrator or lead developer should gather all of the potential Sources, Targets and Reusable objects and place these in a shared folder accessible to all who may need access to them.

♦ As for Reusable objects, they need to be properly documented to make it easier for other developers to determine if they can/should use them in their own development.

♦ For the developer, the specifications for a mapping should include the required Sources and Targets, additional information about derived ports, and a description of how the ports relate from the source to the target.

♦ The Informatica Velocity methodology provides a matrix that assists in detailing the relationship between source fields and target fields. It also depicts fields that are derived from values in the Source and eventually linked to ports in the target.

♦ If a shared folder for Sources and Targets is not available, the developer will need to obtain the source and target database schema owners, passwords and connect strings. With this information ODBC connections can be created in the Designer tool to allow access to the Source and Target definitions.

♦ Document any other information about the mapping that is likely to be helpful in developing the mapping. Helpful information may, for example, include source and target database connection information, lookups and how to match data in the lookup tables, data cleansing needed at a field level, potential data issues at a field level, any known issues with particular fields, pre or post mapping processing requirements, and any information about specific error handling for the mapping.

♦ The completed mapping design should then be reviewed with one or more team members for completeness and adherence to the business requirements. In addition, the design document should be updated if the business rules change or if more information is gathered during the build process.
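The step from a business rule to a physical rule, as in the late fee example in the checklist above, can be sketched in a few lines of Python. This is only an illustration of the documented rule, not PowerCenter expression syntax. The field names come from the example; the function name and the choice to return zero for inactive customers are assumptions.

```python
from decimal import Decimal


def late_fee(customer: dict) -> Decimal:
    """Return the late fee for one customer row.

    Physical rule from the text: for customers with an ACTIVE_FLAG of "1",
    multiply DAYS_LATE by LATE_DAY_RATE.
    """
    if customer["ACTIVE_FLAG"] != "1":
        # Assumption: the business rule only covers active customers,
        # so inactive customers are given a fee of zero here.
        return Decimal("0")
    return Decimal(customer["DAYS_LATE"]) * Decimal(str(customer["LATE_DAY_RATE"]))


# Illustrative rows (made-up values).
active_row = {"ACTIVE_FLAG": "1", "DAYS_LATE": 4, "LATE_DAY_RATE": "1.50"}
inactive_row = {"ACTIVE_FLAG": "0", "DAYS_LATE": 9, "LATE_DAY_RATE": "1.50"}

print(late_fee(active_row))
print(late_fee(inactive_row))
```

Writing the rule down at this level of detail makes the open questions visible (for example, what should happen for inactive customers), which is exactly what the field-level documentation is meant to settle before the mapping is built.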

High Level Process Overview

Mapping Specifics

The following tips, in no particular order, will make the mapping development process more efficient.

♦ One of the first things to do is to bring in all required source and target objects into the mapping.

♦ Only connect fields that are needed or will be used.

♦ Only connect from the Source Qualifier those fields needed subsequently.

♦ Filter early and often. Only manipulate data that needs to be moved and transformed. Reduce the number of non-essential records that are passed through the mapping.

♦ Decide if a Source Qualifier join will net the result needed versus creating a Lookup to retrieve desired results.

♦ Reduce the number of transformations. An excessive number of transformations will increase overhead.

♦ Consider increasing the shared memory from 12MB to 25MB or 40MB when using a large number of transformations.

♦ Make use of variables, local or global, to reduce the number of times functions will have to be used.

♦ Watch the data types. The Informatica engine converts compatible data types automatically, but an excessive number of conversions is inefficient.

♦ Make use of variables, reusable transformations and mapplets for reusable code. These will leverage the work done by others.

[Figure: High Level Process Overview diagram showing a Relational Source, an Expression, a Router, a Lookup, and three Relational Targets]

♦ Use active transformations early in the process to reduce the number of records as early in the mapping as possible.

♦ When joining sources, select appropriate driving/master table.

♦ Utilize single pass reads. Design mappings to utilize one Source Qualifier to populate multiple targets.

♦ Remove or reduce field-level stored procedures. These will be executed for each record and slow performance.

♦ Lookup Transformation tips:

♦ When the source is large, cache lookup table columns for those lookup tables of 500,000 rows or less.

♦ Standard rule of thumb is not to cache tables over 500,000 rows.

♦ Use equality (=) conditions if possible in the Condition tab.

♦ Use IIF or DECODE functions when lookup returns small row sets.

♦ Avoid date comparisons in lookup; convert to string.

♦ Operations and Expression Transformation tips:

♦ Numeric operations are faster than string.

♦ Trim Char and Varchar fields before performing comparisons.

♦ Operators are faster than functions (e.g., || vs. CONCAT).

♦ Use flat files. File read/writes are faster than database reads/writes on same server. Fixed width files are faster than delimited file processing.
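Several of the tips above (filter early, pass along only the fields that are needed, cache small lookup tables, match on equality) can be pictured with a small Python sketch. It is an analogy for how rows move through a mapping, not PowerCenter code. The function names and the sample column names are hypothetical.

```python
def build_lookup_cache(dim_rows, key_col, value_col):
    """Read the lookup table once and keep it in memory, like a cached Lookup."""
    return {row[key_col]: row[value_col] for row in dim_rows}


def transform(source_rows, product_cache, excluded_promo_ids):
    """Filter first, then look up, and emit only the fields the target needs."""
    for row in source_rows:
        # Filter early: discard unwanted rows before doing any other work.
        if row["PROMO_ID"] in excluded_promo_ids:
            continue
        yield {
            # Equality match against the in-memory cache.
            "PRODUCT_KEY": product_cache.get(row["PRODUCT_ID"]),
            "SALES_QTY": row["SALES_QTY"],
        }


# Illustrative data (made-up values).
dim_product = [{"PRODUCT_ID": 1, "PRODUCT_KEY": 10}]
source = [
    {"PROMO_ID": 105, "PRODUCT_ID": 1, "SALES_QTY": 2, "COMMENT": "unused"},
    {"PROMO_ID": 101, "PRODUCT_ID": 1, "SALES_QTY": 3, "COMMENT": "unused"},
]

cache = build_lookup_cache(dim_product, "PRODUCT_ID", "PRODUCT_KEY")
result = list(transform(source, cache, excluded_promo_ids={105}))
print(result)
```

The row that is filtered out never reaches the lookup, and the unused COMMENT field is never carried downstream, which is the same saving the tips aim for inside a mapping.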

Unit 15 Workshop: Load Promotions Daily Aggregate Table

Business Purpose

The management wants to be able to analyze how certain promotions are performing. They want to be able to gather the promotions by day for each dealership for each product being sold.

Technical Description

The instructions will provide enough detail for you to design and build the mapping necessary to load the promotions aggregate table. It is suggested that you use the Velocity best practices that have been discussed during the course. The workshop will provide tables that can be filled in before you start building the mapping. If you are unclear on any of the instructions please ask the instructor.

OBJECTIVE

Design and create a mapping to load the aggregate table

Duration

120 minutes

Workshop Details

Sources and Targets

SOURCE: STG_TRANSACTIONS

This relational table contains sales transactions for 7 days. It will be located in the TDBUxx schema and contains 5,475 rows. For the purpose of this mapping we will read all 7 days of data. See Figure 15-1 for the source table layout.

TARGET: FACT_PROMOTIONS_AGG_DAILY

This relational table is located in the TDBUxx schema. After running the mapping it should contain 1,073 rows. See Figure 15-2 for the target table layout.

Mapping Details

In order to successfully create the mapping you will need to know some additional details.

♦ The management has decided that they don't want to keep track of the Manager Discount or the Employee Discount (PROMO_ID 105 and 200) so these will need to be excluded from the load.

♦ The PRODUCT_KEY can be obtained from the DIM_PRODUCT table by matching on the PRODUCT_ID.

♦ The DATE_KEY can be obtained from the DIM_DATES table by matching the TRANSACTION_DATE to the DATE_VALUE.

♦ UNITS_SOLD is derived by summing the SALES_QTY.

Figure 15-1. Source table definition

Figure 15-2. Target table definition

♦ REVENUE is derived by taking the SALES_QTY times the SELLING_PRICE and then subtracting the DISCOUNT, HOLDBACK and REBATE.

♦ Most of the discounts are valid but occasionally they may be higher than the acceptable value of 17.25. When this occurs you will need to obtain an acceptable value based on the PROMO_ID. The acceptable value can be obtained from the DIM_PROMOTIONS table by matching the PROMO_ID.

♦ The DISCOUNT is a percentage stored as a number. To calculate the actual discount in dollars you will need to divide the DISCOUNT by 100 and multiply it by the SELLING_PRICE.

♦ REVENUE_PER_UNIT is derived by dividing the REVENUE by the SALES_QTY.

♦ COST is derived by summing the UNIT_COST.

♦ COST_PER_UNIT is derived by summing the UNIT_COST and dividing it by the sum of the SALES_QTY.

♦ Save your work often.
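Before building the mapping, it can help to check your reading of the rules above with a few lines of ordinary code. The Python sketch below is one possible reading, not the workshop solution and not PowerCenter syntax. The function names and sample values are hypothetical, the dictionary stands in for the DIM_PROMOTIONS lookup, and the discount in dollars is computed exactly as the text states (DISCOUNT divided by 100, multiplied by SELLING_PRICE).

```python
from decimal import Decimal

MAX_DISCOUNT = Decimal("17.25")   # acceptable ceiling quoted in the text
EXCLUDED_PROMO_IDS = {105, 200}   # Manager Discount and Employee Discount


def effective_discount_pct(row, promo_discounts):
    """Return the row's DISCOUNT, or the lookup value when it exceeds the ceiling."""
    if row["DISCOUNT"] > MAX_DISCOUNT:
        # promo_discounts stands in for the DIM_PROMOTIONS lookup by PROMO_ID.
        return promo_discounts[row["PROMO_ID"]]
    return row["DISCOUNT"]


def row_revenue(row, promo_discounts):
    """SALES_QTY * SELLING_PRICE minus discount dollars, HOLDBACK and REBATE."""
    pct = effective_discount_pct(row, promo_discounts)
    discount_dollars = pct / 100 * row["SELLING_PRICE"]
    return (row["SALES_QTY"] * row["SELLING_PRICE"]
            - discount_dollars - row["HOLDBACK"] - row["REBATE"])


def aggregate(rows, promo_discounts):
    """Drop excluded promotions, then total units and revenue for one group."""
    kept = [r for r in rows if r["PROMO_ID"] not in EXCLUDED_PROMO_IDS]
    units_sold = sum(r["SALES_QTY"] for r in kept)
    revenue = sum(row_revenue(r, promo_discounts) for r in kept)
    return {
        "UNITS_SOLD": units_sold,
        "REVENUE": revenue,
        "REVENUE_PER_UNIT": revenue / units_sold if units_sold else None,
    }


# Illustrative rows for a single group (made-up values).
rows = [
    {"PROMO_ID": 101, "SALES_QTY": 2, "SELLING_PRICE": Decimal("20000"),
     "DISCOUNT": Decimal("25"), "HOLDBACK": Decimal("500"), "REBATE": Decimal("1000")},
    {"PROMO_ID": 105, "SALES_QTY": 1, "SELLING_PRICE": Decimal("15000"),
     "DISCOUNT": Decimal("5"), "HOLDBACK": Decimal("0"), "REBATE": Decimal("0")},
]
promo_discounts = {101: Decimal("10")}

totals = aggregate(rows, promo_discounts)
print(totals)
```

In the sample data the first row's discount of 25 exceeds 17.25, so the lookup value of 10 is used instead, and the second row is dropped because PROMO_ID 105 is excluded from the load.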

Velocity Deliverable: Mapping Specifications

Mapping Name: m_FACT_PROMOTIONS_AGG_DAILY_LOAD
Source System: Oracle Table
Target System: Oracle Table
Initial Rows:
Rows/Load:
Short Description: Load the daily promotions aggregate table from the STG_TRANSACTIONS table.
Load Frequency: Daily
Preprocessing: Target Append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields:

SOURCES

Tables
Columns: Table Name, Schema/Owner, Selection/Filter

TARGETS

Tables (Schema Owner: )
Columns: Table Name, Update, Delete, Insert, Unique Key

LOOKUPS

Lookup Name:
Table Location:
Match Condition(s):
Filter/SQL Override:

Lookup Name:
Table Location:
Match Condition(s):
Filter/SQL Override:

Lookup Name:
Table Location:
Match Condition(s):
Filter/SQL Override:

HIGH LEVEL PROCESS OVERVIEW

PROCESSING DESCRIPTION (DETAIL)

SOURCE TO TARGET FIELD MATRIX

Columns: Target Table, Target Column, Source File, Source Column, Expression, Default Value if Null

Workflow Details

This is a simple workflow containing a Start task and a Session task.

Run Details

Your Task Details should be similar to Figure 15-3.

Your Source/Target Statistics should be similar to Figure 15-4.

Your Preview Data results should be similar to Figure 15-5.

Figure 15-3. Task Details of the completed session run

Figure 15-4. Source/Target Statistics of the completed session run

Figure 15-5. Data Preview of the FACT_PROMOTIONS_AGG_DAILY table

Unit 16: Workflow Variables and Tasks

After completing this unit, you should be able to:

♦ Describe these features:

♦ Link Conditions

♦ Workflow variables

♦ Assignment tasks

♦ Decision tasks

♦ Email tasks

♦ Use these features in workflows and mappings

Lesson 16-1. Link Conditions

You can set conditions on workflow links:

♦ If the link condition is True, the next task is executed.

♦ If the link condition is False, the next task is not executed.

To set a condition, right-click a link and enter an expression that evaluates to True or False. You can use workflow variables in the condition (see Lesson 16-2).

Lesson 16-2. Workflow Variables

Workflow variables can either be Pre-defined or User-defined.

User-defined workflow variables are created by selecting Workflows > Edit and then selecting the Variables tab.

Description

Workflow variables can be user-defined or pre-defined:

User-defined workflow variables can be used to pass information from one point in a workflow to another.

1. Declare workflow variables in the workflow Variables tab.

2. Selecting persistent will write the last value out to the repository and make it available the next time the workflow is executed.

3. Use an Assignment task to set the value of the variable.

4. Use the variable value later in the workflow.

Pre-defined workflow variables come in two types:

♦ System variables (SYSDATE and WORKFLOWSTARTTIME) can be used, for example, when calculating variable dates and times in Assignment tasks and link conditions.

♦ Task-specific workflow variables are available in Decision, Assignment and Timer tasks, and in link conditions. They include EndTime, ErrorCode, ErrorMsg, FirstErrorCode, FirstErrorMsg, PrevTaskStatus, SrcFailedRows, SrcSuccessRows, StartTime, Status, TgtFailedRows, TgtSuccessRows and TotalTransErrors.

Workflow variables are discussed in more detail in the Workflow Administration Guide.

Business Purpose

A workflow can contain multiple tasks and multiple pipelines. One or more tasks or pipelines may be dependent on the status of previous tasks.

Example

S2 may be dependent on the successful running of S1. Success may be defined as session status = Successful and the number of source and target failed rows = zero. The link that precedes S2 can be coded so that S2 runs only if all three criteria are true. Use the task-specific workflow variables 'Status', 'SrcFailedRows' and 'TgtFailedRows' in the link condition expression. In this case, there is no allowance for only one or two of the three conditions being true.

S4 should not run if S3 finishes more than 1 hour after the workflow start time. Testing the current time against WORKFLOWSTARTTIME (truncated as needed) in the link condition preceding S4 is appropriate.
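The logic of these two link conditions can be sketched in Python. This models the decision only and is not PowerCenter link-condition syntax. The class, the function names and the "SUCCEEDED" status string are assumptions made for the illustration; the three fields stand in for the Status, SrcFailedRows and TgtFailedRows task-specific variables.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class SessionResult:
    status: str            # stands in for the task-specific Status variable
    src_failed_rows: int   # stands in for SrcFailedRows
    tgt_failed_rows: int   # stands in for TgtFailedRows


def s2_may_run(s1: SessionResult) -> bool:
    """All three criteria must hold; meeting only one or two is not enough."""
    return (s1.status == "SUCCEEDED"
            and s1.src_failed_rows == 0
            and s1.tgt_failed_rows == 0)


def s4_may_run(workflow_start: datetime, now: datetime) -> bool:
    """S4 is skipped once more than one hour has passed since the workflow started."""
    return now - workflow_start <= timedelta(hours=1)


# Illustrative values.
clean_run = SessionResult("SUCCEEDED", 0, 0)
run_with_rejects = SessionResult("SUCCEEDED", 0, 3)
start = datetime(2006, 11, 13, 8, 0)

print(s2_may_run(clean_run), s2_may_run(run_with_rejects))
print(s4_may_run(start, datetime(2006, 11, 13, 9, 30)))
```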

Lesson 16-3. Assignment Task

Description

The Assignment task can establish the value of a Workflow Variable (see Lesson 16-2, Workflow Variables) whose value can be used at a later point in the workflow as testing criteria to determine if (or when) other workflow tasks or pipelines should be run.

It is a 3-step process: create a Workflow Variable in the workflow properties; establish the value of that variable with an Assignment task; test that variable value at some subsequent point in the workflow.

Business Purpose

Running a workflow task may depend on the results of other tasks or calculations in the workflow. An Assignment task can perform calculations and establish the value of a Workflow Variable. That value may then determine whether other tasks or pipelines are run.

Example

S5 should run at least 1 hour after S2 completes. ASGN1 can be coded to set a time that TIMER1 will wait for before proceeding to S5. To prevent ASGN1 from running until S2 completes, use a Link Condition (see Lesson 16-1).

Code the Assignment task ASGN1 (in part) using the PowerCenter TRUNC date function and, in pseudocode, a variable date value >= Session2's EndTime + 1 hour. The Timer task TIMER1 will wait for that variable time to arrive before running S5.
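The arithmetic behind this example is simple date addition, sketched below in Python rather than in PowerCenter expression syntax. The function names and the sample timestamp are hypothetical; the sketch only shows the value an Assignment task would store and the test a Timer task would effectively apply.

```python
from datetime import datetime, timedelta


def timer_release_time(s2_end_time: datetime) -> datetime:
    """Value the Assignment task could store: S2's end time plus one hour."""
    return s2_end_time + timedelta(hours=1)


def timer_elapsed(now: datetime, release_time: datetime) -> bool:
    """True once the stored time has been reached, so S5 may start."""
    return now >= release_time


# Illustrative values.
s2_end = datetime(2006, 11, 13, 22, 15)
release = timer_release_time(s2_end)

print(release)
print(timer_elapsed(datetime(2006, 11, 13, 23, 0), release))
print(timer_elapsed(datetime(2006, 11, 13, 23, 15), release))
```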

Lesson 16-4. Decision Task

Description

Decision tasks enable the workflow designer to set criteria by which the workflow will or will not proceed to the next set of tasks, depending on whether the set criteria are true or false.

Business Purpose

Commonly, workflows have multiple paths. Some are simply concurrent tasks. Others are pipelines of tasks that should only be run if results of preceding tasks are successful. Still others are pipelines of tasks that should only be run if those results are not successful. What determines the success or failure of a task or group of tasks is User Defined, depending on the business-defined rules and operational rules of processing. That criterion is set as the Decision Condition in a Decision Task and subsequently tested for a True or False condition.

Example

If a session, group of sessions or any combination of workflow tasks is successful, a subsequent set of sessions should run. If any one of the tasks fails or does not produce desired results, those sessions should not be run. Instead, an email should be sent to the processing operator, who might run a back-out session, or to the Development Team Lead or Business Unit Lead to notify them that an error condition exists.

Lesson 16-5. Email Task

Description

Email tasks enable PowerCenter to send email messages at various points in a workflow. Users can define email addresses, a subject line and the email message text. When called from within a Session task, the message text can contain variable session-related metadata.

Business Purpose

Various business and operational staff may need to be notified of the progress of a workflow, the status of tasks (or combination of tasks) within it, or various metadata results of a session.

Example

The Business Unit Team Lead may request to receive an email detailing the time a load finished, the total number of rows loaded and the number of rows rejected. This could be accomplished with a reusable Email task (which allows variable session metadata) called from within a session. If session-specific variable metadata is not required, a standard text message could be sent by using a non-reusable Email task that follows the session in the workflow.

Operational staff may request receipt of an email if a session-required source file does not arrive by the time the session is scheduled to run. Receipt of the email message would be the operator's signal that some type of manual intervention or restore routine is required to correct the problem.
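As a rough picture of the first example (an email reporting the finish time, rows loaded and rows rejected), the Python sketch below assembles such a message with the standard library. It is not how PowerCenter builds its emails; the function name, subject line, address and values are all hypothetical, and the message is only constructed, not sent.

```python
from datetime import datetime
from email.message import EmailMessage


def build_load_email(to_addr: str, finished_at: datetime,
                     rows_loaded: int, rows_rejected: int) -> EmailMessage:
    """Compose a plain-text load report; the values play the role of session metadata."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = "Load complete"
    msg.set_content(
        f"Load finished: {finished_at:%Y-%m-%d %H:%M}\n"
        f"Rows loaded: {rows_loaded}\n"
        f"Rows rejected: {rows_rejected}\n"
    )
    return msg


# Illustrative values.
report = build_load_email("team.lead@example.com",
                          datetime(2006, 11, 13, 23, 5), 1390, 0)
print(report)
```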

Performance Considerations

A running, configured email server is required; however, the impact of the Integration Service sending emails is minimal.

Unit 16 Lab: Load Product Weekly Aggregate Table

Business Purpose

The Mersche Motors data warehouse contains a number of aggregate tables. The management wants to be able to report on total sales for a product on a weekly basis. A weekly product sales aggregate table needs to be loaded for this purpose.

Technical Description

The source for the weekly product aggregate table will be the daily product aggregate table. The mapping to load this table is located in the DEV_SHARED folder. A workflow needs to be created that will run the weekly aggregate load session after the daily aggregate load session has run 7 times. This can be accomplished using an assignment task, a decision task, link conditions and session tasks. A load date equal to the beginning day of the week will be used to provide the date key for the weekly aggregate table. The mapping to accomplish this has already been created and will need to be copied from the DEV_SHARED folder. It contains a mapping variable that will be incremented by 1 at the end of the session/mapping run.

Objectives

♦ Assigning Workflow Variables

♦ Incrementing Workflow Variables using the Assignment Task

♦ Branching in a workflow using a Decision Task

♦ Using Link Conditions

Duration

35 Minutes

Velocity Deliverable: Mapping Specifications

Mapping Name: m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx
Source System: Oracle Tables
Target System: Oracle Table
Initial Rows: 1390
Rows/Load: 1390
Short Description: Load the weekly product aggregate table from the daily product aggregate table.
Load Frequency: Weekly
Preprocessing: Target Append
Post Processing:
Error Strategy: Default
Reload Strategy:
Unique Source Fields: PRODUCT_KEY, DEALERSHIP_ID, DATE_KEY

SOURCES

Tables
Columns: Table Name, Schema/Owner, Selection/Filter
FACT_PRODUCT_AGG_DAILY, TDBUxx

TARGETS

Tables (Schema Owner: TDBUxx)
Columns: Table Name, Update, Delete, Insert, Unique Key
FACT_PRODUCT_AGG_WEEKLY, X X, (PRODUCT_KEY, DEALERSHIP_ID, DATE_KEY)

HIGH LEVEL PROCESS OVERVIEW (WORKFLOW)

[Figure: workflow diagram showing a Start task, a Session task, an Assignment task, a Decision task, a second Session task, and an Email task]

PROCESSING DESCRIPTION (DETAIL)

A workflow variable will be defined and set to 0 for the start of the workflow. The first Session task runs the load to the daily product aggregate table. The Assignment task increments the workflow variable by 1. The Decision task uses the MOD function to divide the workflow variable by 7 and see if it returns a 0. If it returns a 0, the second Session task runs and loads the weekly aggregate table. If it returns a non zero value, then an Email task runs (the Email task only runs if the Integration Service is associated with a mail server). This workflow must be run 7 times, emulating a week, to verify the process works properly.
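The processing logic of this lab (a persistent counter incremented on every run, followed by a MOD 7 test that chooses between the weekly load and the email) can be simulated in a few lines of Python. This is a sketch of the control flow only, not PowerCenter code; the function name and the task labels are hypothetical, and the local variable stands in for the persistent workflow variable.

```python
def run_workflow_once(workflow_runs: int):
    """Simulate one workflow run and return the new counter and the tasks executed."""
    tasks = ["daily_load"]          # first Session task
    workflow_runs += 1              # Assignment task: increment the counter
    if workflow_runs % 7 == 0:      # Decision task: MOD(counter, 7) = 0
        tasks.append("weekly_load")
    else:
        tasks.append("email")
    return workflow_runs, tasks


# Emulate a week of runs; `persisted` plays the role of the persistent
# workflow variable that is saved between runs.
persisted = 0
history = []
for _ in range(7):
    persisted, tasks = run_workflow_once(persisted)
    history.append(tasks[-1])

print(persisted)
print(history)
```

The first six runs take the email branch and only the seventh takes the weekly load branch, which is why the workflow has to be run seven times to see the weekly session execute.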

Instructions

Step 1: Copy the Mappings

1. In the Designer, copy the m_FACT_PRODUCT_AGG_DAILY_LOAD mapping and the m_FACT_PRODUCT_AGG_WEEKLY_LOAD mapping from the DEV_SHARED folder.

2. Select Yes for any Target Dependencies.

3. Select Skip or Reuse to resolve any conflicts.

4. Rename the mappings to include your student number.

5. Save your work.

Step 2: Copy the Existing Workflow

1. In the Workflow Manager, copy the wkf_FACT_PRODUCT_AGG_WEEKLY_LOAD workflow from the DEV_SHARED folder.

2. Resolve the conflict by selecting the m_FACT_PRODUCT_AGG_DAILY_LOAD_xx mapping.

3. Drag the new workflow into the Workflow Designer.

4. Edit the session and make the following changes:

a. Rename it to include your assigned student number.

b. In the Properties tab.

i. Change the $Target Connection Value to reflect your assigned student connection NATIVE_EDWxx.

ii. Change the Session Log File Name to include your student number.

c. Change the source and target connections to reflect your assigned student connection, NATIVE_STGxx and NATIVE_EDWxx respectively.

5. Select the menu option Workflows > Edit.

a. Rename it to include your student number.

b. In the Properties tab change the Workflow Log File Name to include your student number.

c. In the Variables tab create a new workflow variable:

♦ Variable Name = $$WORKFLOW_RUNS

♦ Datatype = integer

♦ Persistent = checked

♦ Default Value = 0

Figure 16-1 shows the defined workflow variable:

Step 3: Create the Assignment Task

1. Add an Assignment Task to the workflow.

2. Link the s_m_FACT_PRODUCT_AGG_DAILY_LOAD_xx Session task to the Assignment task.

3. Double click the link.

Figure 16-1. Workflow variable declaration

4. Add a link condition to ensure that the assignment task only executes if the s_m_FACT_PRODUCT_AGG_DAILY_LOAD_xx Session task was successful. See Figure 16-2 for details.

5. Edit the Assignment task.

6. Rename it to asgn_WORKFLOW_RUNS.

7. In the Expressions tab, create an expression that increments the User Defined Variable named $$WORKFLOW_RUNS by 1. See Figure 16-3 for details.

8. Save your work.

Figure 16-2. Link condition testing if a session run was successful

Figure 16-3. Assignment Task expression declaration

Step 4: Create the Decision Task

1. Add a Decision task to the workflow.

2. Link the asgn_WORKFLOW_RUNS Assignment task to the Decision task.

3. Double click the link.

4. Add a link condition to ensure that the decision task only executes if the asgn_WORKFLOW_RUNS Assignment task was successful (refer to previous step).

5. Edit the Decision task.

a. Rename it to dcn_RUN_WEEKLY.

b. In the Properties tab.

Create a Decision Name expression using the modulus function that checks to see if this is the seventh run of the workflow. This can be done by dividing the workflow variable by seven and checking to see if the remainder is 0. See Figure 16-4 for details.

Step 5: Create the Session Task

1. Create a session task named s_m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx that uses the m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx mapping.

2. Link the dcn_RUN_WEEKLY Decision task to the s_m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx session task.

3. Double click the link.

Figure 16-4. Decision Task Expression

Tip: The decision task will evaluate the expression and return a value of either TRUE or FALSE. This can be checked in a link condition to determine the direction taken.

4. Add a link condition that checks to see if the dcn_RUN_WEEKLY Decision task has returned a value of TRUE, meaning that it is time to load the weekly aggregate table. See Figure 16-5 for details.

5. Edit the s_m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx session.

a. Set the relational connection value for the SQ_SC_FACT_PRODUCT_AGG_DAILY source to NATIVE_EDWxx, where xx is your student number.

b. Set the relational connection value for the SC_FACT_PRODUCT_AGG_WEEKLY target to NATIVE_EDWxx, where xx is your student number.

c. Verify the Target load type is set to Normal.

Step 6: Create the Email Task

1. Add an Email task to the workflow.

2. Link the dcn_RUN_WEEKLY Decision task to the Email task.

3. Double click the link.

Add a link condition that checks to see if the dcn_RUN_WEEKLY Decision task has returned a value of FALSE, meaning that the daily load has completed and that it is NOT time to load the weekly aggregate table.

Figure 16-5. Link condition testing for a Decision Task condition of TRUE

4. Edit the Email task.

a. Rename it to eml_DAILY_LOAD_COMPLETE.

b. In the Properties tab.

i. Add [email protected] as the Email User Name.

ii. Add appropriate text for the Email Subject and Email Text. See Figure 16-6 for details.

5. Right-click in the workspace and select Arrange > Horizontal.

6. Save your work.

Your workflow should appear the same as displayed in Figure 16-7.

Figure 16-6. Email Task Properties

Figure 16-7. Completed Workflow

Step 7: Start the Workflow and Monitor the Results

The workflow will need to be run seven times in order to see the weekly aggregate session running.

1. Start the workflow.

2. Review the workflow in the Gantt view of the Workflow Monitor. It should appear similar to Figure 16-8.

3. Return to the Workflow Manager.

4. Right-click the wkf_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx workflow in the Navigator window and select View Persistent Values. The value should be set to 1. See Figure 16-9 and Figure 16-10.

Figure 16-8. Gantt chart view of the completed workflow run

Figure 16-9. View Workflow Variables

Note: Each time you run the workflow this value will increase by one.

5. Click Cancel to exit.

6. Run the workflow six more times to emulate a week's normal runs. After the last run, the Gantt Chart view should be similar to Figure 16-11.

7. Review the Task Details of the s_m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx session.
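The run-counting behavior in this step can be sketched in Python. This is only an illustration, not PowerCenter code: it assumes the Decision task expression (shown only in Figure 16-4) tests whether the run counter is a multiple of 7, and the function names are hypothetical.

```python
def decision_run_weekly(workflow_runs: int) -> bool:
    """Assumed Decision task expression: TRUE on every seventh run."""
    return workflow_runs % 7 == 0


def run_workflow(workflow_runs: int) -> tuple[int, str]:
    """Simulate one run; return the new persistent counter and the branch taken."""
    workflow_runs += 1  # the persistent value increases by one per run
    if decision_run_weekly(workflow_runs):
        # Link condition TRUE: the weekly aggregate session runs.
        return workflow_runs, "s_m_FACT_PRODUCT_AGG_WEEKLY_LOAD_xx"
    # Link condition FALSE: only the notification email is sent.
    return workflow_runs, "eml_DAILY_LOAD_COMPLETE"


runs = 0
branches = []
for _ in range(7):
    runs, branch = run_workflow(runs)
    branches.append(branch)
```

Under that assumption, the first six runs take the Email branch and the seventh run takes the weekly session branch.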

Figure 16-10. Value of the $$WORKFLOW_RUNS variable after first run

Figure 16-11. Gantt chart view of the completed workflow run after the weekly load runs

Figure 16-12. Task Details of the completed session run

Unit 17: More Tasks and Reusability

After completing this unit, you should be able to:

♦ Describe the following features:

♦ Event Wait task

♦ Event Raise task

♦ Command task

♦ Reusable tasks

♦ Reusable Session task

♦ Reusable Session configuration

♦ pmcmd Utility

♦ Use these features in mappings and workflows

Lesson 17-1. Event Wait Task

Description

Event Wait tasks wait for either the presence of a named flat file (pre-defined event) or some other user-defined event to occur in the workflow processing.

For a predefined event, the task waits for the physical presence of a file in a directory local to the Integration Service process machine. This file is known as an indicator file. If the file does not exist, the Event Wait task will not complete. When the file is found, the Event Wait task completes and the workflow proceeds to subsequent tasks. The Event Wait task can optionally delete the indicator file once detected or the file can be deleted by a subsequent process.

For a user-defined event, the developer:

1. Defines an event in the workflow properties (prior to workflow processing)

2. Includes an Event Wait task at a suitable point in the workflow, where further processing must await some specific event.

3. Includes an Event Raise task at a suitable point in the workflow, e.g. after a parallel pipeline has completed. The Event Raise task sets the event to active. (Event Raise task is described later).

This lesson examines the two types separately.

Pre-Defined Event

Business Purpose

An Event Wait task watching for a flat file by name is placed in a workflow because some subsequent processing is dependent on the presence of the file.

Example

A Session task may be expecting to process a flat file as source data. Inserting a Pre-Defined Event Wait task containing the specific name and location of the flat file causes the workflow to proceed if the file is found. If not found, the workflow goes into a Wait status.

Performance Considerations

The only known consideration is the length of time the Integration Service may have to wait if the file does not arrive. This potential load window slowdown can be averted by a workflow design that provides alternatives in case a file does not arrive in a reasonable length of time. Refer to the Email task earlier in this section.
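The waiting behavior and the timeout concern can be illustrated with a small Python sketch. It is not how the Integration Service is implemented: the polling loop, the `timeout_seconds` parameter, and the file name are assumptions made for illustration. The Event Wait task itself simply waits, and a time limit would come from the surrounding workflow design.

```python
import os
import tempfile
import time


def wait_for_indicator_file(path, timeout_seconds, poll_seconds=0.05,
                            delete_when_found=False):
    """Poll until `path` exists; return True if found, False on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while time.monotonic() < deadline:
        if os.path.exists(path):
            if delete_when_found:
                os.remove(path)  # mirrors the optional indicator-file delete
            return True
        time.sleep(poll_seconds)
    return False


# Usage: the file is present for the first call and absent for the second.
with tempfile.TemporaryDirectory() as workdir:
    indicator = os.path.join(workdir, "orders.done")  # hypothetical file name
    open(indicator, "w").close()
    found = wait_for_indicator_file(indicator, timeout_seconds=1.0,
                                    delete_when_found=True)
    found_again = wait_for_indicator_file(indicator, timeout_seconds=0.2)
```

The second call returns False because the first call deleted the indicator file, which is the point at which an alternative path (for example, a notification) would take over.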

User-Defined Event

Business Purpose

An Event Wait task that waits for a user-defined event is placed so that the workflow does not proceed further until a different, specific series of pre-determined tasks and conditions has occurred. It always works in concert with an Event Raise task. Per the three steps mentioned above: the user creates the workflow Event, the Event Raise task triggers the Event (sets it 'active'), and the Event Wait task does not proceed to subsequent tasks until it detects that the specific Event was triggered.

Example

A workflow may have 2 concurrent pipelines containing various tasks, in this order. Pipeline 1 contains S1 and S2; Pipeline 2 contains S3, S4 and S5. S5 cannot run until S4 runs.

One way to ensure that S5 does not run unless S1 and S2 have run is to create a workflow Event in the workflow properties, insert an Event Raise task after S2 that triggers (activates) the Event, and place a User-Defined Event Wait task after S4 to detect whether the Event has been triggered. If not, the workflow waits until it is triggered.
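The same coordination pattern can be sketched with Python threads, using `threading.Event` as a stand-in for the user-defined workflow event. This is an analogy only; the session names come from the example above and the helper functions are hypothetical.

```python
import threading

event_s2_done = threading.Event()   # stands in for the user-defined workflow event
completed = []                      # order in which sessions finish
lock = threading.Lock()


def run_session(name):
    """Stand-in for a Session task; it only records that it ran."""
    with lock:
        completed.append(name)


def pipeline_1():
    run_session("S1")
    run_session("S2")
    event_s2_done.set()             # Event Raise task placed after S2


def pipeline_2():
    run_session("S3")
    run_session("S4")
    event_s2_done.wait()            # Event Wait task placed after S4
    run_session("S5")


threads = [threading.Thread(target=pipeline_2), threading.Thread(target=pipeline_1)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Whichever pipeline starts first, S5 is recorded only after both S2 and S4 have finished.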

Performance Considerations

The only known performance consideration is the length of time the Integration Service may have to wait if the Event is not raised. This potential load window slowdown can be averted by a workflow design that provides alternatives in case the Event does not occur within a reasonable length of time. (Refer to the Email and Timer tasks earlier in this section.)

Lesson 17-2. Event Raise Task

Description

Event Raise tasks are always used in conjunction with User-Defined Event Wait tasks. They send a signal to an Event Wait task that a particular set of pre-determined events has occurred. A user-defined event is defined as the completion of the tasks from the Start task to the Event Raise task.

It is the same three-step process previously mentioned: the developer defines an 'Event' in the workflow properties; the Event Raise task 'raises' the event at some point in the running workflow; an Event Wait task is placed at a different point in the workflow to determine if the Event has been raised.

Business Purpose

This task allows one point in the workflow to signal another that a particular series of pre-determined events has occurred.

Example

This example is the same as the one in the Event Wait task section of this document.

A workflow may have 2 concurrent pipelines containing various tasks, in this order. Pipeline 1 contains S1 and S2; Pipeline 2 contains S3, S4 and S5. S5 cannot run until S4 runs.

One way to ensure that S5 is not run unless S1 and S2 have run is to create a workflow Event in the workflow properties, insert an Event Raise task after S2 that triggers (activates) the Event, and place a User-Defined Event Wait task after S4 to detect whether the Event has been triggered. If not, the workflow waits until it is triggered.

Performance Considerations

As before, the only known performance consideration is the length of time the Integration Service may have to wait if the Event is not raised. This potential load window slowdown can be averted by a workflow design that provides alternatives in case the Event does not occur within a reasonable length of time. (Refer again to the Email and Timer tasks earlier in this section.)

Lesson 17-3. Command Task

Description

Command tasks are inserted in workflows and worklets to enable the Integration Service to run one or more OS commands of any nature. All commands or batch files referenced must be executable by the OS login that owns the Integration Service process.

Business Purpose

OS commands can be used for any operational or Business Unit related procedure and can be run at any point in a workflow. Command tasks can be set to run one or more OS commands or scripts/batch files, before proceeding to the next task in the workflow. If more than one command is coded into a Command task, the entire task can be set to fail if any one of the individual commands fails. Additionally and optionally, each individual command can be set not to run if a preceding command has failed.
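The failure-handling options described above can be illustrated with a Python sketch. It is an analogy, not the Command task itself: the function name and the `stop_on_failure` flag are hypothetical, the "fail the task if any command fails" option is treated as always enabled, and the commands are portable stand-ins that only exit with a status code.

```python
import subprocess
import sys


def run_command_task(commands, stop_on_failure=True):
    """Run commands in order; return (task_succeeded, list of return codes).

    Commands skipped after a failure are recorded as None.
    """
    return_codes = []
    failed = False
    for argv in commands:
        if failed and stop_on_failure:
            return_codes.append(None)   # not run: a preceding command failed
            continue
        code = subprocess.run(argv).returncode
        return_codes.append(code)
        if code != 0:
            failed = True
    return (not failed), return_codes


# Stand-ins for OS commands: each one just exits with a status code.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
bad = [sys.executable, "-c", "raise SystemExit(3)"]

succeeded, codes = run_command_task([ok, bad, ok])
succeeded_all, codes_all = run_command_task([ok, bad, ok], stop_on_failure=False)
```

With `stop_on_failure=True` the third command is skipped; with `stop_on_failure=False` it still runs, but the task as a whole is reported as failed in both cases.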

Example

A Session task that produces an output file could be followed by a Command task that copies the file to another directory or transfers the file by FTP to another machine. The command syntax would be the same as the command syntax that would accomplish this at the OS command prompt on the Integration Service process machine.

A Session task that relies on flat file data as its source could be preceded by a Command task containing a script that, step by step, verifies the presence of the file, opens it and verifies/compares control totals or record counts against some external source of information (again, any sequence of steps that could be accomplished at the OS level).

A series of multiple concurrent or sequential Sessions in a workflow could all be followed by one Command task coded to copy (or move) all session logs created by the workflow to a special daily backup directory.

Performance Considerations

The only known consideration is the length of time the OS commands collectively take to run on the Integration Service process machine. This is not within the control of the Integration Service.

Lesson 17-4. Reusable Tasks

♦ Session, Email and Command tasks can be reusable.

♦ Use the Task Developer to create reusable tasks.

♦ Reusable tasks appear in the Navigator Tasks node and can be dragged and dropped into any workflow.

♦ In a workflow, a reusable task is indicated by a special symbol.

Lesson 17-5. Reusable Session Tasks

A Session created directly in a workflow is a Non-reusable session; it is specific only to that workflow. A session created in the Task Developer workspace is reusable. An instance of a Reusable Session can be run in any workflow or worklet. Some of the properties of the session instance are customizable, workflow-by-workflow.

Business Purpose

Occasionally, a certain mapping logic may be required to be run in multiple workflows. Since a mapping is a reusable object, the developer could code multiple sessions, all based on the same mapping. However, there is a simpler way to create 'like-sessions' that are all based on the same mapping - a Reusable Session.

Once created in the Task Developer, an instance of the Reusable Session can be placed in any workflow or worklet.

Examples

If the same mapping needs to be used a number of times and a number of Session properties need to be changed in each use (e.g., time-stamped logs, increased Line Sequential Buffer Length, special error handling), the changes can be made once in the parent session. Every time an instance of the session is placed in a workflow, it automatically takes on all those customized properties. This requires less developer effort than creating separate new sessions, each with multiple customized session properties.

A business receives 25 data file sources for its 25 customers. The data structure of each customer is different enough that a different mapping is required for each, to get the data into one common format. Once data is structured the same, each needs to be subsequently run using common mapping logic to further transform the data in a like manner. If 25 output files were created, 25 instances of one Reusable Session could be used to process all data files. Each workflow would contain one customer-specific session/mapping and one instance of the Reusable Session, pre-coded with common session properties.
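The parent-and-instance relationship in these examples can be sketched in Python. This is a conceptual model only: the class, its field names, the property values, and the mapping and workflow names are all hypothetical and are not PowerCenter metadata.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SessionProperties:
    """A few illustrative session properties; the field names are hypothetical."""
    mapping: str
    timestamp_logs: bool = False
    line_sequential_buffer_length: int = 1024
    source_file: str = ""


# Parent (reusable) session: customized once.
parent = SessionProperties(
    mapping="m_COMMON_TRANSFORM",          # hypothetical mapping name
    timestamp_logs=True,
    line_sequential_buffer_length=4096,
)

# Each workflow gets an instance that inherits the parent's properties
# and overrides only what differs for that workflow.
instances = {
    f"wkf_CUSTOMER_{n:02d}": replace(parent, source_file=f"customer_{n:02d}.out")
    for n in range(1, 26)
}
```

Each of the 25 instances carries the parent's customized properties and differs only in the one property overridden for its workflow.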

Performance Considerations

It is recommended to use reusable session tasks sparingly because retrieving the metadata for a reusable session task and its child instances from the repository takes longer than retrieving the metadata for a non-reusable session task.

Lesson 17-6. Reusable Session Configurations

♦ Define session properties that can be reused by multiple sessions within a folder.

♦ Use Tasks - Session Configuration menu option or Tasks toolbar icon.

♦ Opens Session Config Browser where you set session properties.

♦ Invoke in Session tasks, in Config Object tab, Config Name box.

♦ Can override these session properties further down in the Config Object tab.

Lesson 17-7. pmcmd Utility

Description

The pmcmd command line utility allows the developer to perform most Workflow Manager operations outside of the PowerCenter Client tool.

Syntax Example

pmcmd startworkflow -sv integrationservicename -u yourusername -p yourpassword workflowname

This command will start the workflow located on the named Integration Service. You must supply the user name and password to sign in to the Integration Service, as well as the workflow name.
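If the command is issued from a script, the argument list can be assembled programmatically. The Python sketch below uses only the options shown in the syntax example above; the helper function name is hypothetical, and the command is built but not executed because pmcmd may not be installed where the sketch runs.

```python
import shlex


def build_startworkflow_command(service, user, password, workflow):
    """Return the argument list for the startworkflow call shown above."""
    return [
        "pmcmd", "startworkflow",
        "-sv", service,
        "-u", user,
        "-p", password,
        workflow,
    ]


argv = build_startworkflow_command(
    "integrationservicename", "yourusername", "yourpassword", "workflowname"
)
command_line = shlex.join(argv)   # printable form, quoted where needed
# To run it where pmcmd is installed: subprocess.run(argv, check=True)
```

Passing the arguments as a list rather than one string avoids quoting problems when a name or password contains spaces.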

Unit 18: Worklets and More Tasks

After completing this unit, you should be able to:

♦ Describe these features:

♦ Worklets

♦ Timer task

♦ Control task

♦ Use these features in a workflow

Lesson 18-1. Worklets

Description

Worklets are optional processing objects inside workflows. They contain PowerCenter tasks that represent a particular grouping of, or functionally-related set of, tasks. They can be created directly in a workflow (non-reusable) or from within the Worklet Designer (reusable).

Business Purpose

A workflow may contain dozens of tasks, whether they are concurrent or sequential. During workflow design they will be developed naturally into 'groupings' of meaningfully-related tasks, run in the appropriate operational order. The workflow can run as-is, from start to finish, executing task-by-task or the developer can place natural groupings of tasks into worklets.

A worklet's relationship to a workflow is like a subroutine to a program or an applet to an application. Worklets can be used in a very large workflow to encapsulate the natural groupings of tasks.

Example

This example is similar to the one in the Event Wait task section of this document.

Workflow with individual tasks: A workflow may have 2 concurrent pipelines containing various tasks, in this order. Pipeline 1 contains S1 and S2; Pipeline 2 contains S3 and S4. S5 cannot run until all 4 sessions run.

Workflow converted to internal worklets: Worklet1 contains S1 and S2; Worklet2 contains S3 and S4; S5 will run after both worklets complete.
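The subroutine analogy can be made concrete with a short Python sketch in which each worklet is a function that groups two sessions. It is an illustration of the structure only; the function names are hypothetical and the sessions are stand-ins that just record their names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_session(name, log):
    """Stand-in for a Session task; list.append is atomic in CPython."""
    log.append(name)


def worklet_1(log):
    run_session("S1", log)
    run_session("S2", log)


def worklet_2(log):
    run_session("S3", log)
    run_session("S4", log)


def workflow():
    log = []
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(worklet_1, log), pool.submit(worklet_2, log)]
        for future in futures:
            future.result()        # wait for both worklets to complete
    run_session("S5", log)         # runs only after both worklets finish
    return log


run_order = workflow()
```

The workflow body now reads as two worklet calls followed by S5, which is the encapsulation benefit the example describes.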

Lesson 18-2. Timer Task

Description

Timer tasks pause the workflow until a specified time is reached, measured from a point you choose. They can be based on:

♦ Absolute Time. The user can specify the date and time to start the timer from.

♦ Datetime Variable. The user can provide a variable that lets the Timer task know when to start the timer from.

♦ Relative Time. The Timer task can start the timer from the start time of the Timer task, the start time of the workflow or worklet, or from the start time of the parent workflow.

Business Purpose

Business or operational processing specifications may require that a workflow run to a certain point then sit idle for a length of time or until a fixed physical point in time.

Example

A workflow may contain sessions that should only run for a maximum amount of time. A Timer task can be set to wait for the maximum amount of time and then send an email or abort the workflow if the time limit is exceeded. The Timer task could be set to execute one hour after the start of the workflow.
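The difference between the absolute and relative settings comes down to simple date arithmetic, sketched below in Python. The function names and the sample start time are hypothetical; the sketch only computes when a timer would complete.

```python
from datetime import datetime, timedelta


def timer_end_absolute(target: datetime) -> datetime:
    """Absolute time: the timer completes at the given date and time."""
    return target


def timer_end_relative(start: datetime, wait: timedelta) -> datetime:
    """Relative time: the timer completes `wait` after the chosen start time
    (Timer task start, workflow or worklet start, or parent workflow start)."""
    return start + wait


workflow_start = datetime(2006, 11, 13, 2, 0, 0)      # hypothetical start time
timer_completes = timer_end_relative(workflow_start, timedelta(hours=1))
```

For the one-hour example above, a workflow that starts at 02:00 would have its timer complete at 03:00 on the same day.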

Performance Considerations

There are no real performance considerations for the timer task.

Lesson 18-3. Control Task

Description

Control tasks are used to alter the normal processing of a workflow. They can stop, abort or fail any workflow or worklet.

Control Options

Business Purpose

When an error condition exists, operational staff may prefer that a workflow or worklet simply stop, abort or fail rather than be emailed that an error exists.

Example

As with the example in the Pre-Defined Event (Event Wait task section), a workflow may have a session which is expecting a flat file for source data. If the file does not arrive within one hour after the workflow start time, the desired action may be to fail the workflow.

A workflow may have a session that runs through to a successful processing conclusion but could contain data row errors. A Control task could be placed subsequent to the session, with conditions set to stop, abort or fail the workflow (and use an Email task to notify someone of the issue).

Unit 18 Lab: Load Inventory Fact Table

Business Purpose

The Inventory fact table load runs directly after the Inventory staging table load. Sometimes the Inventory staging table load runs longer than is acceptable and delays the rest of the process. An email needs to be sent to the administrator if the staging load takes longer than one hour.

Technical Description

The support team has suggested that a worklet be created that limits the time for the workflow. This worklet will contain a Timer task to keep track of the run time, an Email task to inform the administrator should the load take longer than 1 hour, and a Control task to stop the workflow if it runs more than one hour. A workflow will be created that contains the worklet and then runs the staging and fact loads.

Objectives

♦ Create a Worklet

♦ Create a Timer task

♦ Create a Control task

♦ Create an Email task

♦ Use a Worklet within a Workflow

Duration

25 minutes

PROCESSING DESCRIPTION (DETAIL)

Worklet: The worklet runs a Timer task that sets the timer to 1 hour beginning from the start of the task. This is followed by an email task to send an email to the administrator if the timer task completes, and a control task to fail the workflow if the time limit is exceeded.

Workflow: The workflow runs the time limit worklet in parallel with the Staging and Fact session tasks. If the two tasks exceed the time limit, the workflow will be failed and an email sent to the administrator. If the two tasks complete within the time limit a control task ends the workflow, killing the timer task in the worklet so no email is sent.
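The time-limit pattern described here, a timer branch running in parallel with the load branch, can be sketched in Python. This is an analogy under stated assumptions: the function name and return strings are hypothetical, the limits are fractions of a second rather than one hour, and a Python thread cannot be killed the way the Control task ends the Timer task, so the slow worker is simply left as a daemon thread.

```python
import threading
import time


def run_with_time_limit(load_steps, limit_seconds):
    """Run `load_steps` on a worker thread and report what the workflow would do.

    Returns "succeeded" if the steps finish within the limit, otherwise
    "failed: time limit exceeded" (the point where the email would be sent).
    """
    finished = threading.Event()

    def worker():
        for step in load_steps:
            step()
        finished.set()

    threading.Thread(target=worker, daemon=True).start()
    # The timer branch: wait up to the limit for the load branch to finish.
    if finished.wait(timeout=limit_seconds):
        return "succeeded"
    return "failed: time limit exceeded"


fast_load = [lambda: time.sleep(0.01), lambda: time.sleep(0.01)]
slow_load = [lambda: time.sleep(0.5)]

result_fast = run_with_time_limit(fast_load, limit_seconds=1.0)
result_slow = run_with_time_limit(slow_load, limit_seconds=0.05)
```

The fast load finishes before the limit, so no failure is reported; the slow load exceeds it, which corresponds to the email and Control task path.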

Instructions

Step 1: Copy the Mappings

1. Copy the m_STG_INVENTORY_LOAD and the m_FACT_INVENTORY_LOAD mappings from the DEV_SHARED folder.

2. Rename them m_STG_INVENTORY_LOAD_xx and m_FACT_INVENTORY_LOAD_xx.

3. Save your work.

Step 2: Create a Worklet

1. Launch the Workflow Manager client and sign into your assigned folder.

2. Open the Worklet Designer workspace.

3. Select the menu option Worklets > Create.

4. Delete the default Worklet name and enter wklt_RUNTIME_LIMIT_xx.

Step 3: Create a Timer Task

1. Add a Timer task to the worklet.

2. Edit the Timer task.

a. Rename the task tim_MAX_RUN_TIME.

Velocity Best Practice: The wklt_ as a prefix for a Worklet name is specified in the Informatica Velocity Methodology.

Velocity Best Practice: The tim_ as a prefix for a Timer task name is specified in the Informatica Velocity Methodology.

b. In the Timer tab, set the Relative time to start after 1 hour from the start time of this task. See Figure 18-1 for details.

3. Link the Start task to the tim_MAX_RUN_TIME Timer task.

4. Save your work.

Step 4: Create an Email Task

1. Add an Email task to the worklet.

2. Edit the Email task.

a. Rename the task eml_MAX_RUN_TIME_EXCEEDED.

b. In the Properties tab:

i. Enter [email protected] as the Email User Name.

ii. Enter Workflow wkf_FACT_INVENTORY_LOAD_xx exceeded max time allotted as the Email Subject.

Figure 18-1. Timer Task Relative time setting

Velocity Best Practice: The eml_ as a prefix for an Email task name is specified in the Informatica Velocity Methodology.

iii. Enter something appropriate for the Email Text.

See Figure 18-2 for details.

3. Link the tim_MAX_RUN_TIME Timer task to the eml_MAX_RUN_TIME_EXCEEDED Email task.

4. Save your work.

Step 5: Create a Control Task

1. Add a Control task to the worklet.

2. Edit the Control task.

a. Rename the task ctl_WORKFLOW_TIMEOUT_FAIL.

Figure 18-2. Email Task Properties Tab

Velocity Best Practice: The ctl_ as a prefix for a Control task name is specified in the Informatica Velocity Methodology.

b. In the Properties tab, select Stop parent for the Control Option Value. See Figure 18-3 for details.

3. Link the eml_MAX_RUN_TIME_EXCEEDED Email task to the ctl_WORKFLOW_TIMEOUT_FAIL Control task.

4. Save your work.

5. Right-click anywhere in the Worklet workspace and select Arrange > Horizontal.

Your Worklet should appear the same as displayed in Figure 18-4.

Step 6: Create the Workflow

1. Create a workflow named wkf_FACT_INVENTORY_LOAD_xx.

2. Drag the wklt_RUNTIME_LIMIT_xx Worklet from the Worklets folder in the Navigator Window into the workflow.

3. Link the Start task to the wklt_RUNTIME_LIMIT_xx worklet.

4. Create a session task named s_m_STG_INVENTORY_LOAD_xx that uses the m_STG_INVENTORY_LOAD_xx mapping.

Figure 18-3. Control Task Properties Tab

Figure 18-4. Completed Worklet

5. Edit the s_m_STG_INVENTORY_LOAD_xx session.

a. Ensure that the filename for the SQ_inventory flat file source is set to inventory.txt.

b. Set the relational connection value for the STG_INVENTORY target to NATIVE_STGxx.

c. Set the Target truncate table option to on.

d. In the General tab, select the “Fail parent if this task fails” checkbox.

6. Link the Start task to the s_m_STG_INVENTORY_LOAD_xx Session task.

7. Create a session task named s_m_FACT_INVENTORY_LOAD_xx that uses the m_FACT_INVENTORY_LOAD_xx mapping.

8. Edit the s_m_FACT_INVENTORY_LOAD_xx session:

a. Set the relational connection value for the SQ_STG_INVENTORY source to NATIVE_STGxx where xx is your student number.

b. Set the relational connection value for the FACT_INVENTORY target to NATIVE_EDWxx where xx is your student number.

c. Set the Target load type to Normal.

9. Link the s_m_STG_INVENTORY_LOAD_xx Session task to the s_m_FACT_INVENTORY_LOAD_xx Session task.

Your Workflow should appear the same as displayed in Figure 18-5.

Figure 18-5. Completed Workflow

Step 7: Start the Workflow and Monitor the Results

1. Start the workflow.

2. Review the workflow in the Gantt Chart view of the Workflow Monitor. When completed it should appear similar to Figure 18-6.

3. Review the run statistics for the s_m_STG_INVENTORY_LOAD_xx Session task. They should appear similar to Figure 18-7.

Figure 18-6. Gantt chart view of the completed workflow run

Figure 18-7. Gantt chart view of the completed workflow run

Unit 19: Workflow Design

After completing this unit, you should be able to:

♦ Name key considerations for designing workflows

♦ Describe the process of designing a workflow

The workshop will give you practice in designing your own workflows.

Lesson 19-1. Designing Workflows

Description

This lesson provides a checklist of topics to consider during the workflow development process. It covers a variety of situations that users will have to address and helps them ask the right questions before and during the design process.

Considerations

The workflow process requires some up front research. Before designing a workflow, it is important to have a clear picture of the task-to-task processes.

♦ Design a high-level view of the workflow and document the process within the workflow, using a textual description to explain exactly what the workflow is supposed to accomplish and the methods or steps it will follow to accomplish its goal.

♦ The load development process involves the following steps:

♦ Clearly define and document all dependencies

♦ Analyze the processing resources available

♦ Develop operational requirements

♦ Develop tasks, worklets and workflows based on the results

♦ Create an inventory of Worklets and Reusable tasks. This list is a 'work in progress' list and will have to be continually updated as the project moves forward. The lists are valuable to all but particularly for the lead developer. Making an up front decision to make all Session, Email and Command tasks reusable will make this easier.

♦ The administrator or lead developer should put together a list of database connections to be used for Source and Target connection values.

♦ Reusable tasks need to be properly documented to make it easier for other developers to determine if they can/should use them in their own development.

♦ If the volume of data is sufficiently low for the available hardware to handle, you may consider volume analysis optional, developing the load process solely on the dependency analysis. Also, if the hardware is not adequate to run the sessions concurrently, you will need to prioritize them. The highest priority within a group is usually assigned to sessions with the most child dependencies.

♦ Another possible component to add into the load process is sending e-mail. Three e-mail options are available for notification during the load process:

♦ Post-session e-mails can be sent after a session completes successfully or when it fails

♦ E-mail tasks can be placed in workflows before or after an event or series of events

♦ E-mails can be sent when workflows are suspended

♦ Document any other information about the workflow that is likely to be helpful in developing the workflow. Helpful information may, for example, include source and target database connection information, pre or post workflow processing requirements, and any information about specific error handling for the workflow.

♦ Create a Load Dependency Analysis. This should list all sessions by dependency, along with all other events (Informatica or other) that they depend on. Also be sure to specify the dependency relationship between each session or event, the algorithm or logic needed to test the dependency condition during execution, and the impact of any possible dependency test results (e.g., don't run a session, fail a session, fail a parent or worklet, etc.)

♦ Create a Load Volume Analysis. This should list all the sources and row counts and row widths expected for each session. This should include all Lookup transformations in addition to the extract sources. The amount of data that is read to initialize a lookup cache can materially affect the initialization and total execution time of a session.

♦ The completed workflow design should then be reviewed with one or more team members for completeness and adherence to the business requirements. The design document should also be updated if the business rules change or if more information is gathered during the build process.
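A Load Dependency Analysis of the kind described above can be prototyped with a few lines of Python before any workflow is built. The sketch below is only an illustration: the session names are hypothetical, and counting direct dependents is a simplification of "most child dependencies".

```python
from graphlib import TopologicalSorter

# Hypothetical load dependency analysis: session -> sessions it depends on.
depends_on = {
    "s_STG_A": set(),
    "s_STG_B": set(),
    "s_DIM_1": {"s_STG_A"},
    "s_DIM_2": {"s_STG_A"},
    "s_FACT": {"s_DIM_1", "s_DIM_2", "s_STG_B"},
}

# Count direct child dependencies to suggest priority within a group.
child_count = {name: 0 for name in depends_on}
for parents in depends_on.values():
    for parent in parents:
        child_count[parent] += 1

# One valid run order that respects every dependency.
run_order = list(TopologicalSorter(depends_on).static_order())

# Sessions with no prerequisites, highest child count first.
first_group = sorted(
    (name for name, parents in depends_on.items() if not parents),
    key=lambda name: -child_count[name],
)
```

In this hypothetical data, s_STG_A would be prioritized over s_STG_B because two sessions depend on it directly, and s_FACT always runs last.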

Workflow Overview

Workflow Specifics

The following are tips that will make the workflow development process more efficient (not in any particular order):

♦ If developing a sequential workflow, use the Workflow Wizard to create Sessions in sequence. There is also the option to create dependencies between the sessions.

♦ Use a parameter file to define the values for parameters and variables used in a workflow, worklet, mapping, or session. A parameter file can be created by using a text editor such as WordPad or Notepad. List the parameters or variables and their values in the parameter file. Parameter files can contain the following types of parameters and variables:

♦ Workflow variables

♦ Worklet variables

♦ Session parameters

♦ Mapping parameters and variables

♦ When using parameters or variables in a workflow, worklet, mapping, or session, the Integration Service checks the parameter file to determine the start value of the parameter or variable. Use a parameter file to initialize workflow variables, worklet variables, mapping parameters, and mapping variables. If you do not define start values for these parameters and variables, the Integration Service checks for the start value in other places.

Figure. Workflow Overview (Start task, Session Task 1, Session Task 2, Command task, Email task, Control task, Decision task)

♦ Session parameters must be defined in a parameter file. Since session parameters do not have default values, when the Integration Service cannot locate the value of a session parameter in the parameter file, it fails to initialize the session. To include parameter or variable information for more than one workflow, worklet, or session in a single parameter file, create separate sections for each object within the parameter file.

♦ You can also create multiple parameter files for a single workflow, worklet, or session and change the file that these tasks use, as necessary. To specify the parameter file that the Integration Service uses with a workflow, worklet, or session, do either of the following:

    ♦ Enter the parameter file name and directory in the workflow, worklet, or session properties.

    ♦ Start the workflow, worklet, or session using pmcmd and enter the parameter file name and directory in the command line.
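The parameter file tips above can be made concrete with a small sketch. The Python below only assembles the text of a parameter file with one section per object; it is not part of PowerCenter. The section heading layout (`folder.WF:workflow.ST:session`) follows the convention as recalled from the PowerCenter documentation and should be checked against the version in use. The folder name `Student01`, the session name `s_m_Stage_Product`, and all parameter names and values are hypothetical.

```python
from typing import Dict, List


def render_parameter_file(sections: Dict[str, Dict[str, str]]) -> str:
    """Render {section heading: {name: value}} as parameter file text."""
    lines: List[str] = []
    for heading, values in sections.items():
        lines.append(f"[{heading}]")
        for name, value in values.items():
            lines.append(f"{name}={value}")
        lines.append("")  # blank line between sections for readability
    return "\n".join(lines)


# Hypothetical folder, session, and parameter names, for illustration only.
SECTIONS = {
    # Workflow-level section: holds a workflow variable.
    "Student01.WF:wkf_LOAD_ALL_STAGING_TABLES": {
        "$$LoadDate": "2006-11-13",
    },
    # Session-level section: a session parameter and a mapping parameter.
    "Student01.WF:wkf_LOAD_ALL_STAGING_TABLES.ST:s_m_Stage_Product": {
        "$InputFile_Product": "product.txt",
        "$$RegionCode": "WEST",
    },
}

PARAMETER_FILE_TEXT = render_parameter_file(SECTIONS)

if __name__ == "__main__":
    print(PARAMETER_FILE_TEXT)
```

Keeping each object in its own section lets a single file serve several workflows, worklets, or sessions, as described above.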

♦ On hardware systems that are under-utilized, you may be able to improve performance by processing partitioned data sets in parallel in multiple threads of the same session instance running on the Integration Service node. However, parallel execution may impair performance on over-utilized systems or systems with limited I/O capacity.

♦ Incremental aggregation is useful for applying captured changes in the source to aggregate calculations in a session. If the source changes only incrementally, and you can capture changes, you can configure the session to process only those changes. This allows the Integration Service to update your target incrementally, rather than forcing it to process the entire source and repeat the same calculations each time you run the session.

♦ Target Load Based Strategies:

    ♦ Loading directly into the target is possible when the data is going to be bulk loaded.

    ♦ Load into flat files and bulk load using an external loader.

    ♦ Load into a mirror database.

♦ From the Workflow Manager Tools menu, select Options and select the option to 'Show full names of task'. This will show the entire name of all tasks in the workflow.
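The incremental aggregation tip above can be illustrated with a minimal sketch. The Python below is a conceptual model only and does not represent how the Integration Service stores or updates aggregate data: it keeps the totals from a previous run and applies only newly captured rows, then shows that the result matches a full recalculation. The row layout (group key, amount) and the function names are assumptions made for the example, and only inserted rows are handled.

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, float]  # (group key, amount); hypothetical layout


def full_aggregate(rows: Iterable[Row]) -> Dict[str, float]:
    """Recalculate every group total from the entire source."""
    totals: Dict[str, float] = {}
    for key, amount in rows:
        totals[key] = totals.get(key, 0.0) + amount
    return totals


def apply_changes(saved_totals: Dict[str, float],
                  new_rows: Iterable[Row]) -> Dict[str, float]:
    """Update saved totals using only the rows captured since the last run."""
    updated = dict(saved_totals)  # leave the saved totals untouched
    for key, amount in new_rows:
        updated[key] = updated.get(key, 0.0) + amount
    return updated


if __name__ == "__main__":
    first_run = [("A", 10.0), ("B", 5.0)]
    captured_changes = [("A", 2.5), ("C", 1.0)]

    saved = full_aggregate(first_run)                    # initial full run
    incremental = apply_changes(saved, captured_changes)  # later run
    print(incremental == full_aggregate(first_run + captured_changes))
```

The incremental path touches two rows instead of four here; the saving grows with the size of the unchanged source.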

Unit 19 Workshop: Load All Staging Tables in Single Workflow

Business Purpose

All of the staging tables need to be loaded in a single workflow.

Technical Description

The instructions will provide enough detail for you to design and build the workflow that will load all of the staging tables in a single run. If you are unclear on any of the instructions, please ask the instructor.

Objectives

♦ Design and create a workflow to load all of the staging tables

Duration

60 minutes

Workshop Details

Mappings Required

This section contains a list of the mappings that will be used in the workflow.

♦ m_Stage_Payment_Type

♦ m_Stage_Product

♦ m_Dealership_Promotions

♦ m_Stage_Customer_Contacts

♦ m_STG_TRANSACTIONS

♦ m_STG_EMPLOYEES

Workflow/Worklet Details

This section contains the workflow processing details.

1. Name the workflow wkf_LOAD_ALL_STAGING_TABLES.

The workflow needs to start at a certain time each day. For this workshop you can set the start time to be a couple of minutes from the time you complete the workflow. Remember that the start time is relative to the time on the Integration Service process machine.

2. No sessions can begin until an indicator file shows up. The indicator file will be named fileindxx.txt and will be created by you using any text editor. You will need to place this file in the directory indicated by the instructor after you start the workflow. If you are in a UNIX environment you may skip this requirement.

3. To use the CPUs more efficiently, run some of the sessions concurrently and some of them sequentially:

a. The sessions containing mappings m_Stage_Payment_Type, m_Stage_Product, and m_Dealership_Promotions can be run sequentially.

b. The session containing mapping m_Stage_Customer_Contacts can be run concurrently with the sessions in the previous bullet point.

c. If any of the previous sessions fails, an email should be sent to the administrator and the workflow should be aborted. Use [email protected] as the Email User Name.

d. The session containing mapping m_STG_EMPLOYEES can only be run after the four previously mentioned sessions complete successfully.

e. The session containing mapping m_STG_TRANSACTIONS needs to be run concurrently with the m_STG_EMPLOYEES session.

f. If either of the previous two sessions fails, an email should be sent to the administrator.

4. All sessions need to truncate the target tables. You may want to create reusable sessions from previously created workflows.

5. Management wants the workflow to run for a maximum of 50 minutes. If the workflow takes longer than 50 minutes, an email must be sent to the administrator. If the workflow finishes in the allotted time, the timer task must be stopped.

There is more than one solution to the workshop. You will know that your solution has worked when all of the sessions have completed successfully.
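Before building the workflow, it can help to check the session ordering on paper. The Python sketch below is a planning aid under stated assumptions, not a PowerCenter feature and not the only valid design: it records which sessions must succeed before each session starts and groups them into waves that can run concurrently. The order chosen for the three sequential sessions is an assumption, since the requirements only say they can be run sequentially, and sessions are labeled by their mapping names for brevity.

```python
from typing import Dict, List, Set

# The four sessions that must all succeed before the final two start.
FIRST_STAGE = [
    "m_Stage_Payment_Type",
    "m_Stage_Product",
    "m_Dealership_Promotions",
    "m_Stage_Customer_Contacts",
]

# Each session maps to the sessions that must succeed before it starts.
# The chain order of the first three sessions is an assumption.
DEPENDENCIES: Dict[str, List[str]] = {
    "m_Stage_Payment_Type": [],
    "m_Stage_Product": ["m_Stage_Payment_Type"],
    "m_Dealership_Promotions": ["m_Stage_Product"],
    "m_Stage_Customer_Contacts": [],
    "m_STG_EMPLOYEES": list(FIRST_STAGE),
    "m_STG_TRANSACTIONS": list(FIRST_STAGE),
}


def execution_waves(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group sessions into waves; sessions in one wave can run concurrently."""
    completed: Set[str] = set()
    waves: List[List[str]] = []
    while len(completed) < len(dependencies):
        ready = sorted(
            name
            for name, needs in dependencies.items()
            if name not in completed and all(n in completed for n in needs)
        )
        if not ready:
            raise ValueError("circular dependency between sessions")
        waves.append(ready)
        completed.update(ready)
    return waves


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(DEPENDENCIES), start=1):
        print(number, ", ".join(wave))
```

The failure emails, the indicator file wait, and the 50-minute timer are not modeled here; they would be added as separate tasks and links in the Workflow Designer.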

Unit 20: Beyond This Course

Note: For more information on PowerCenter training, see http://www.informatica.com/services/education_services

Note: For more information and to register to take an exam, see http://www.informatica.com/services/education_services/certification/default.htm
