IBM Information Server WebSphere DataStage 8 - IBM - United States

IBM Information Server WebSphere DataStage 8.0 Richard Hedges Program Director, Product Management IBM Information Server

Transcript of IBM Information Server WebSphere DataStage 8 - IBM - United States

Page 1: IBM Information Server WebSphere DataStage 8 - IBM - United States

IBM Information Server
WebSphere DataStage 8.0
Richard Hedges
Program Director, Product Management
IBM Information Server

Page 2: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 3: IBM Information Server WebSphere DataStage 8 - IBM - United States

IBM Information Server
Delivering information you can trust

Understand: Discover, model, and govern information structure and content
Cleanse: Standardize, merge, and correct information
Transform: Combine and restructure information for new uses
Deliver: Synchronize, virtualize, and move information for in-line delivery

Parallel Processing
Rich Connectivity to Applications, Data, and Content
Unified Deployment
Unified Metadata Management

Page 4: IBM Information Server WebSphere DataStage 8 - IBM - United States

IBM Information Server Architecture

UNIFIED USER INTERFACE: Analysis Interface, Web Admin Interface, Development Interface

COMMON SERVICES: Metadata Services, Security Services, Logging & Reporting Services, Unified Service Deployment

UNIFIED METADATA: Design, Operational

UNIFIED PARALLEL PROCESSING: Understand, Cleanse, Transform, Deliver

COMMON CONNECTIVITY: Structured, Unstructured, Applications, Mainframe

Supporting IBM WebSphere Application Server

Supporting IBM DB2, Oracle, and MS SQL Server

Page 5: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 6: IBM Information Server WebSphere DataStage 8 - IBM - United States

DataStage and QualityStage Designer

Page 7: IBM Information Server WebSphere DataStage 8 - IBM - United States

Quick Find - Basic

Find item in Repository tree
– In-place find
– Find by Name (Full or Partial)
– Wild card support
– Find next…
– Filter on type

Page 8: IBM Information Server WebSphere DataStage 8 - IBM - United States

Find – Advanced Search Criteria

Search on following criteria:
– Object type
• Job, Table Definition, Stage etc.
– Creation
• Date/Time
• By User
– Last Modification
• Date/Time
• By User
– Where Used
• What other objects use this object?
– Dependencies of
• What does this object use?

Options
– Case
– Match on “name & description” or “name or description”
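The search criteria and the name/description match option can be pictured as a filter over repository items. The following is a minimal Python sketch, not the Designer's implementation: `RepoItem`, `find`, and the sample items are hypothetical names made up for illustration, and wild cards are assumed to behave like shell-style patterns.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class RepoItem:
    # Hypothetical stand-in for a repository object; not a DataStage type.
    name: str
    description: str
    object_type: str
    created_by: str


def find(items, pattern, object_type=None, created_by=None,
         match_both=False, case_sensitive=False):
    """Filter items by wildcard pattern plus optional criteria.

    match_both=True models "name & description";
    match_both=False models "name or description".
    """
    def hit(text):
        if case_sensitive:
            return fnmatchcase(text, pattern)
        return fnmatchcase(text.lower(), pattern.lower())

    results = []
    for item in items:
        if object_type and item.object_type != object_type:
            continue
        if created_by and item.created_by != created_by:
            continue
        name_hit, desc_hit = hit(item.name), hit(item.description)
        matched = (name_hit and desc_hit) if match_both else (name_hit or desc_hit)
        if matched:
            results.append(item)
    return results


# Illustrative data only.
items = [
    RepoItem("LoadCustomers", "Loads customer dimension", "Job", "alice"),
    RepoItem("CustomerTable", "Staging table", "Table Definition", "bob"),
    RepoItem("LoadOrders", "Loads order facts", "Job", "alice"),
]
```

With this sketch, `find(items, "*customer*")` returns the first two items, while adding `match_both=True` narrows the result to `LoadCustomers` alone.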

Page 9: IBM Information Server WebSphere DataStage 8 - IBM - United States

Impact Analysis – Graphical View

Results shown using the Advanced Find window

Impact Analysis:
– Find dependencies… What does this item depend on?
– Find where used… Where is this item used?
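The two impact analysis questions are the two directions of one dependency graph. A minimal Python sketch follows, assuming the analysis follows links transitively (the slides do not say how deep it goes); `UsageGraph` and the object names are hypothetical and not part of any DataStage API.

```python
from collections import defaultdict


class UsageGraph:
    """Toy dependency graph; illustrative only."""

    def __init__(self):
        self._uses = defaultdict(set)      # object -> objects it uses
        self._used_by = defaultdict(set)   # object -> objects that use it

    def add_use(self, user, used):
        self._uses[user].add(used)
        self._used_by[used].add(user)

    def _walk(self, start, edges):
        # Depth-first traversal collecting everything reachable from start.
        seen, stack = set(), [start]
        while stack:
            for nxt in edges.get(stack.pop(), ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return sorted(seen)

    def dependencies_of(self, obj):
        """What does this item depend on?"""
        return self._walk(obj, self._uses)

    def where_used(self, obj):
        """Where is this item used?"""
        return self._walk(obj, self._used_by)


g = UsageGraph()
g.add_use("SeqNightly", "JobLoadCustomers")
g.add_use("JobLoadCustomers", "TableDefCustomer")
g.add_use("JobLoadCustomers", "RoutineCleanName")
g.add_use("JobLoadOrders", "TableDefCustomer")
```

Here `g.where_used("TableDefCustomer")` lists both jobs and the sequence that runs one of them, which is the kind of answer needed before changing a table definition.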

Page 10: IBM Information Server WebSphere DataStage 8 - IBM - United States

Impact Analysis – Tabular View

Results can be saved to an HTML or XML file for additional processing or remote viewing. Within the application, the results list can feed export, reporting, or compilation functions.

Page 11: IBM Information Server WebSphere DataStage 8 - IBM - United States

Job, Table or Routine Difference

Tables

Available for Jobs, Tables & Routines

Textual report with hot links to the relevant editor in Designer.

Page 12: IBM Information Server WebSphere DataStage 8 - IBM - United States

Job Parameter Sets

New object in repository that contains the names and values of job parameters

A Parameter Set can be referenced by one or more jobs

Page 13: IBM Information Server WebSphere DataStage 8 - IBM - United States

Job Parameter Sets

Can use Impact Analysis to determine which Jobs are using a Parameter Set

Works for DataStage Server and DataStage Enterprise Edition

Easier to share job parameters across jobs

Easier to deploy jobs across machines

Easier to propagate a changed job parameter value
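The benefit of referencing a shared Parameter Set, rather than copying values into each job, can be sketched in a few lines of Python. This is an illustration only: the `ParameterSet` and `Job` classes, the `SetName.param` lookup form, and all names and values are assumptions, not DataStage objects.

```python
class ParameterSet:
    """Hypothetical named collection of parameter names and values."""

    def __init__(self, name, values):
        self.name = name
        self.values = dict(values)


class Job:
    """Hypothetical job that references parameter sets instead of copying them."""

    def __init__(self, name, parameter_sets):
        self.name = name
        self.parameter_sets = list(parameter_sets)  # references, not copies

    def resolve(self, qualified):
        # "SetName.param" lookup form is an assumption made for this sketch.
        set_name, param = qualified.split(".", 1)
        for pset in self.parameter_sets:
            if pset.name == set_name:
                return pset.values[param]
        raise KeyError(qualified)


db = ParameterSet("TargetDB", {"server": "dev-db", "user": "etl"})
load_customers = Job("LoadCustomers", [db])
load_orders = Job("LoadOrders", [db])

# One change to the shared set is seen by every job that references it.
db.values["server"] = "prod-db"
```

Because both jobs hold a reference to the same set, a changed value propagates without editing either job.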

Page 14: IBM Information Server WebSphere DataStage 8 - IBM - United States

Collaboration: Multi-User Environment

Locking to prevent concurrent update clashes
Optional “read-only” view when items already locked in Repository
Visible lock “owner” to aid identification
– By Name & Session ID
Identified user for “last modified” or “created by” actions
– Searchable using Advanced Find
– E.g. “Find all items created by user x today”
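The locking behavior described on this slide can be modeled roughly as a registry that records the lock owner by name and session ID. The Python sketch below is an assumption about the general pattern, not the repository's actual mechanism; `LockRegistry` and its methods are hypothetical.

```python
class LockRegistry:
    """Toy model of repository item locks; illustrative only."""

    def __init__(self):
        self._locks = {}  # item name -> (user, session_id)

    def open_item(self, item, user, session_id):
        """Return "edit" if the caller holds or obtains the lock, else "read-only"."""
        owner = self._locks.get(item)
        if owner is None:
            self._locks[item] = (user, session_id)
            return "edit"
        if owner == (user, session_id):
            return "edit"
        return "read-only"

    def owner_of(self, item):
        """Visible lock owner: (user, session_id), or None if unlocked."""
        return self._locks.get(item)

    def release(self, item, user, session_id):
        # Only the owning session may release the lock.
        if self._locks.get(item) == (user, session_id):
            del self._locks[item]


registry = LockRegistry()
first = registry.open_item("LoadCustomers", "alice", 101)
second = registry.open_item("LoadCustomers", "bob", 202)
```

In this model the second user gets a read-only view and can call `owner_of` to see who holds the item.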

Page 15: IBM Information Server WebSphere DataStage 8 - IBM - United States

Export Improvements

Export based on a result of a search

Available from

The new GUI allows modification of the originally populated export list. Items can be added, removed, or filtered out.

Page 16: IBM Information Server WebSphere DataStage 8 - IBM - United States

Meta Data Sharing
DataStage, QualityStage & Information Analyzer

Sharing meta data with WebSphere Information Analyzer
– Both tools store Table meta data in the common repository
– DataStage users can see the table meta data from Information Analyzer
• Allows sharing of meta data definitions
• Provides single meta data import from data source for use in both tools
• Enables DS user to see IA analysis data for shared tables
– Where is the IA analysis information available in DS/QS Designer?
• “Analytical Information” tab on the EditRow dialog when looking at the details of an individual column from…
– …a Table Definition
– …a stage editor
• “Analytical Information” tab on the Table Definition dialog

Page 17: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 18: IBM Information Server WebSphere DataStage 8 - IBM - United States

Lookup Stage – New Range Capabilities

Range check box allows you to specify a range key for a 1 to 2 type range lookup

Key Type drop down allows you to specify a range key for a 2 to 1 type range lookup

Double clicking on the Key Expression field of a range key will bring up the Range Expression dialog

Page 19: IBM Information Server WebSphere DataStage 8 - IBM - United States

New Range Expression Dialog

Column selection for the range key from the reference table

Column selection for the bounding columns from the primary input

Range expression operator drop down. Specifies whether the range bounds are inclusive or exclusive
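The dialog's three choices (a range key column on the reference table, two bounding columns on the primary input, and inclusive or exclusive operators) map onto a simple comparison. The Python sketch below shows that logic under stated assumptions: `range_lookup`, the column names, and the sample rows are hypothetical, and the real stage works on links rather than lists of dicts.

```python
import operator


def range_lookup(input_row, reference_rows, key_col, low_col, high_col,
                 low_inclusive=True, high_inclusive=True):
    """Return reference rows whose range key lies between the input row's bounds."""
    lower_ok = operator.ge if low_inclusive else operator.gt
    upper_ok = operator.le if high_inclusive else operator.lt
    low, high = input_row[low_col], input_row[high_col]
    return [ref for ref in reference_rows
            if lower_ok(ref[key_col], low) and upper_ok(ref[key_col], high)]


# Illustrative data: the input row carries the bounds, the reference rows carry the key.
order = {"order_id": 1, "valid_from": 10, "valid_to": 20}
rates = [
    {"rate_day": 5, "rate": 1.0},
    {"rate_day": 10, "rate": 1.1},
    {"rate_day": 20, "rate": 1.2},
]

inclusive = range_lookup(order, rates, "rate_day", "valid_from", "valid_to")
half_open = range_lookup(order, rates, "rate_day", "valid_from", "valid_to",
                         low_inclusive=False)
```

Switching a bound from inclusive to exclusive changes which boundary rows match, which is what the operator drop down controls.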

Page 20: IBM Information Server WebSphere DataStage 8 - IBM - United States

Surrogate Key Management

New engine functionality
Exposed in 2 new stages and 1 old one
– Surrogate Key Generator
– Slowly Changing Dimension
– Transformer – Initialize(), GetNextKey()

How it works
– Uses built-in state files or DBMS sequences (DB2 & Oracle)
– Supports large integer (uint64) surrogate key values
– Can be used to discover surrogate key values which are already being used so that use of duplicate key values will be avoided
– Customizable block size to manage key gaps vs. performance
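The block size trade-off can be illustrated with a small Python sketch. It assumes a shared counter standing in for the state file or DBMS sequence, with each generator reserving a block of keys at a time; `KeyState`, `KeyGenerator`, and `get_next_key` are hypothetical names (the last only loosely mirrors the Transformer's GetNextKey()), and the engine's actual algorithm is not described in the slides.

```python
class KeyState:
    """Shared counter standing in for a state file or DBMS sequence (assumption)."""

    def __init__(self, start=1):
        self.next_unreserved = start

    def reserve(self, block_size):
        # Hand out a half-open range [first, first + block_size).
        first = self.next_unreserved
        self.next_unreserved += block_size
        return first, first + block_size


class KeyGenerator:
    """Hands out keys from a locally reserved block, refilling when exhausted."""

    def __init__(self, state, block_size):
        self.state = state
        self.block_size = block_size
        self._next = self._end = 0  # empty block forces a reserve on first use

    def get_next_key(self):
        if self._next >= self._end:
            self._next, self._end = self.state.reserve(self.block_size)
        key = self._next
        self._next += 1
        return key


state = KeyState()
node_a = KeyGenerator(state, block_size=100)
node_b = KeyGenerator(state, block_size=100)
keys = [node_a.get_next_key(), node_b.get_next_key(), node_a.get_next_key()]
```

A larger block means fewer trips to the shared state but larger gaps when a generator stops before using its block; here `node_b` starts at 101 even though `node_a` has only used keys 1 and 2.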

Page 21: IBM Information Server WebSphere DataStage 8 - IBM - United States

New Functionality to Support SCD

New engine capabilities
– Surrogate Key management
– Updatable in-memory lookups

New & enhanced stages
– Surrogate Key Generator
– Slowly Changing Dimension

Page 22: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 23: IBM Information Server WebSphere DataStage 8 - IBM - United States

Connectivity Updates

New functionality and more DB supported in SQL builders
– SQL Server, Teradata, ODBC

New Stored Procedures functionality and for more DBs
– SQL Server, Teradata

Latest/Greatest version support (not all listed)
– DB2 9.1
– Oracle 10gR2
– SQL Server 2005
– Teradata v2r6.1 (DB server) / 8.1 (TTU)
– Sybase ASE 15, Sybase IQ 12.7
– Informix 10 (IDS)
– SAS 9.1
– IBM WS MQ 6.1, WS MB 5.1
– Netezza v3.1

Page 24: IBM Information Server WebSphere DataStage 8 - IBM - United States

New Connectivity
– Stages for WebSphere Federation and Classic Federation
• Server and Enterprise stages
• DRS Support
• Native integration with Federation and Classic Federation
– Netezza Enterprise Stage
• Parallel Loader leveraging NZ_Load and External Tables
– SFTP Enterprise Stage
• Secure data transmission
– iWay Enterprise Stage
• Integration with over 250 disparate/legacy sources

Page 25: IBM Information Server WebSphere DataStage 8 - IBM - United States

Connection Objects

New top-level repository object
Allows saving of a re-usable connection path to a specific source or target
– Username, password, db name etc.
Supported on specific stage types
– New Rich Connectors
– Enterprise Stages: DB2, Informix, Oracle, Teradata
– For Plug-ins…
– For Server built-ins
• ODBC, UniVerse, UniData
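A Connection Object can be thought of as a saved bundle of connection properties that is applied to any compatible stage. The Python sketch below only illustrates that reuse pattern; `ConnectionObject`, `apply_connection`, the property names, and the placeholder credentials are all hypothetical and do not reflect the repository's storage format.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class ConnectionObject:
    """Hypothetical reusable connection definition."""
    name: str
    database: str
    username: str
    password: str


def apply_connection(stage_properties, connection):
    """Return a copy of the stage properties with the connection values filled in."""
    merged = dict(stage_properties)
    merged.update({k: v for k, v in asdict(connection).items() if k != "name"})
    return merged


# Placeholder values for illustration only.
warehouse = ConnectionObject("WarehouseConn", "DWH", "etl_user", "<secret>")
source_stage = apply_connection({"table": "CUSTOMERS", "mode": "read"}, warehouse)
target_stage = apply_connection({"table": "CUSTOMERS_DIM", "mode": "write"}, warehouse)
```

Both stages end up with the same database and credentials while keeping their own table and mode settings.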

Page 26: IBM Information Server WebSphere DataStage 8 - IBM - United States

Next Generation “Rich” Connectors
Combining the best of the plug-ins, operators, plus more.....

ODBC
– Embedded DataDirect v5.2 Connect for ODBC drivers
DB2 – Q107
– For DPF and non-DPF
Teradata – Q107
– New support for Teradata Parallel Transporter (TPT)
Oracle – Q107
– New support for 10gR2
WebSphere MQ – Q107
– Adding support for “client only” configuration

Page 27: IBM Information Server WebSphere DataStage 8 - IBM - United States

Next Generation “Rich” Connectors

Connection objects allow properties to be dropped onto stage

Diagram lets you select the link to edit as though you’re on the canvas

Warning sign tells you which fields are mandatory

Test the connection instantly

Parameter button on every field

Graphical SQL builder

Page 28: IBM Information Server WebSphere DataStage 8 - IBM - United States

Enterprise Packs Updates

– New Validations for enterprise apps versions
• SAP ECC 6.0
• SAP BI 7.0
• Siebel 7.8
• JD Edwards EnterpriseOne 8.12
– New SAP Unicode Certifications
• BW-STA 3.5 : Staging BAPI certification for BW Load
• BW-OHS 3.5 : Open-Hub service certification for BW Extract
• CA-ALE 4.0 : IDoc Load and Extract supports Web AS 6.40
• IA-BAPI : BAPI Load and Extract supports Web AS 6.40
– New Functionality
• Enhanced support for Siebel EIM and Business Components
• New Metadata browser and importer for Oracle Applications
• Greater support for large enterprise class deployments

Page 29: IBM Information Server WebSphere DataStage 8 - IBM - United States

CFF Stage – Multi-Format Record Support

Complex Flat File stage now processes Multi Format Flat (MFF) files

Constraints can be specified on the output links to filter data and/or define when a record should be sent down the link

New Fast Path feature provides guided creation
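Output link constraints on a multi-format file amount to routing each record to the links whose condition it satisfies. The Python sketch below shows that idea only: `route_records`, the `rec_type` field, and the link names are hypothetical, and the real stage evaluates constraint expressions defined in its editor rather than Python functions.

```python
def route_records(records, link_constraints):
    """Send each record down every output link whose constraint it satisfies."""
    outputs = {link: [] for link in link_constraints}
    for record in records:
        for link, constraint in link_constraints.items():
            if constraint(record):
                outputs[link].append(record)
    return outputs


# Illustrative file with two record formats distinguished by a type code.
records = [
    {"rec_type": "H", "batch": 7},
    {"rec_type": "D", "batch": 7, "amount": 120},
    {"rec_type": "D", "batch": 7, "amount": 0},
]

constraints = {
    "headers": lambda r: r["rec_type"] == "H",
    # This constraint both selects the record type and filters the data.
    "details": lambda r: r["rec_type"] == "D" and r["amount"] > 0,
}

links = route_records(records, constraints)
```

The `details` constraint shows both uses named on the slide: it decides when a record goes down the link and filters out zero-amount rows.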

Page 30: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 31: IBM Information Server WebSphere DataStage 8 - IBM - United States

Performance Improvements

Improved Job Startup Time
– Allow efficient use of DS EE against smaller data sets
Buffer Optimization
– Improved buffer placement algorithm
– E.g., Removed unnecessary buffer before parallel sort in some instances
Combinability Optimizations
– More combinable stages
– Intelligent combining
Adaptive Job Monitoring
– The Adaptive Job Monitoring feature detects when CPU utilization by the conductor reaches 80% and throttles the volume of job monitoring data
– Note: only monitor messages will be throttled, metadata and summary messages are not affected
– Time-based monitoring is now supported
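The throttling rule for Adaptive Job Monitoring can be sketched as a filter that only thins out monitor messages once the conductor's CPU utilization reaches the threshold. In the Python sketch below the 80% threshold comes from the slide; everything else (`select_messages`, the `kind` field, and keeping one monitor message in every ten) is an assumption, since the slides do not say how the volume is reduced.

```python
def select_messages(messages, conductor_cpu_pct, threshold_pct=80, keep_every=10):
    """Pass everything below the threshold; above it, thin out monitor messages only."""
    if conductor_cpu_pct < threshold_pct:
        return list(messages)
    kept, monitor_seen = [], 0
    for msg in messages:
        if msg["kind"] != "monitor":
            kept.append(msg)  # metadata and summary messages are never throttled
            continue
        if monitor_seen % keep_every == 0:  # keep_every is an assumed sampling rate
            kept.append(msg)
        monitor_seen += 1
    return kept


# Illustrative stream: 20 monitor messages plus one metadata and one summary message.
messages = ([{"kind": "monitor", "seq": i} for i in range(20)]
            + [{"kind": "metadata"}, {"kind": "summary"}])

relaxed = select_messages(messages, conductor_cpu_pct=50)
throttled = select_messages(messages, conductor_cpu_pct=80)
```

Under load the metadata and summary messages still arrive, while only a sample of the monitor messages is kept.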

Page 32: IBM Information Server WebSphere DataStage 8 - IBM - United States

Job Performance Analysis

A new visualization tool which:
Provides deeper insight into runtime job behavior.
Offers several categories of visualizations, including:
– Record Throughput
– CPU Utilization
– Job Timing
– Job Memory Utilization
– Physical Machine Utilization
Hides runtime complexity by emphasizing the stages on the designer canvas.

Page 33: IBM Information Server WebSphere DataStage 8 - IBM - United States

Resource Estimation

Difficult to estimate resources required for job execution
– Scratch space, CPU, etc.
What happens if data volume increases?
How do I prevent job aborting due to lack of system resources?

Page 34: IBM Information Server WebSphere DataStage 8 - IBM - United States

Resource Estimation Tool Layout Overview

Page 35: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 36: IBM Information Server WebSphere DataStage 8 - IBM - United States

New IBM Information Server Installation

Page 37: IBM Information Server WebSphere DataStage 8 - IBM - United States

Create Users, Assign Roles, and Map Credentials

1. On the Administration tab, click on users, then select create new users

2. Enter values for the different user attributes. Id, Password, First Name and Last Name are required

3. Assign Suite and Product Roles as appropriate

4. Click on Save

5. Map Credentials

Page 38: IBM Information Server WebSphere DataStage 8 - IBM - United States

Security Services

Internal Directory
– Defines users, groups, roles
– Supports browsing/creation/deletion/update operations
External Directories
– LDAP, Active Directory, Unix
– External directory passwords are not stored
– Supports browsing/partial update operations
Roles
– Suite roles: Suite User, Suite Administrator
– Product roles: e.g. DataStage user
– Project roles: e.g. Information Analyzer User
Standard Based Authentication
– JAAS
– Works against the supported directories

Page 39: IBM Information Server WebSphere DataStage 8 - IBM - United States

Logging

A new common logging facility
– Used by all the products of the Suite
– Logs go into the operational repository
DataStage Client log viewer does not change
Logging administration done from the administration console
Logging Views are “saved queries”
– Opening a view displays the log events corresponding to the “saved query”
– Example
• Severity level: Error
• Category: DataStage
• Timestamp: past 12 hours
– A user can now view logs in a Production environment via a browser and perform nothing else in that environment

Page 40: IBM Information Server WebSphere DataStage 8 - IBM - United States

Reporting Console

Can publish reports from DataStage to the IBM Information Server Reporting Console

Job Reports, Advanced Find, Impact Analysis, etc.

Page 41: IBM Information Server WebSphere DataStage 8 - IBM - United States

Source-to-Target and Target-to-Source

Page 42: IBM Information Server WebSphere DataStage 8 - IBM - United States

Agenda

IBM Information Server Overview & Architecture
WebSphere DataStage Usability Improvements
Best in class Data Transformation
Focus on Connectivity
Performance, Performance, and Performance
Installation, Configuration, Administration, Reporting
Upgrade to WebSphere DataStage v8.0

Page 43: IBM Information Server WebSphere DataStage 8 - IBM - United States

Upgrade

All objects from DataStage v7 projects upgrade into DataStage v8.0
– Export projects and Import into DataStage v8.0
– All jobs (Server, Parallel, Mainframe, and Sequencer) along with all other objects will migrate

Unix users can install IBM Information Server and previous versions on the same server

Note: DataStage Version Control not in v8.0.

Page 44: IBM Information Server WebSphere DataStage 8 - IBM - United States

Platforms

At GA
– DS & QS Client: Windows XP
– Windows Server 2003
– AIX 5.2, 5.3
– Red Hat Enterprise Linux AS 3.0
– Red Hat Enterprise Linux AS 4.0
– SuSE Enterprise Linux 9, 10
– HP-UX 11i1 (11.11), 11i2 (11.23) – PA-RISC
– Solaris 2.9, 2.10

NLS Support, but not localized

Page 45: IBM Information Server WebSphere DataStage 8 - IBM - United States

The IBM Information Server Advantage
A Complete Information Infrastructure

A comprehensive, unified foundation for enterprise information architectures, scalable to any volume and processing requirement

Auditable data quality as a foundation for trusted information across the enterprise

Metadata-driven integration, providing breakthrough productivity and flexibility for integrating and enriching information

Consistent, reusable information services—along with application services and process services, an enterprise essential

Accelerated time to value with proven, industry-aligned solutions and expertise

Broadest and deepest connectivity to information across diverse sources: structured, unstructured, mainframe, and applications

Page 46: IBM Information Server WebSphere DataStage 8 - IBM - United States

Thank You!