Parameters and Variables in Informatica
December 12, 2014 · Aaron Gendle
An Overview of Parameters and Variables in Informatica
Mapping parameters and variables in Informatica are very useful. They function much like parameters and variables in programming languages such as Java and C++: we can leverage them to reuse a value within a mapping or mapplet. The value can be constant (a parameter) or dynamic (a variable). This flexibility gives us, as developers, the control to create more versatile mappings and workflows. Let's take a closer look at the differences between mapping parameters and variables in Informatica PowerCenter.
Parameters vs. Variables in Informatica
Parameters in Informatica hold constant values (strings, numbers, and other datatypes). Variables, on the other hand, can hold a constant value or change value within a single session run. We declare a parameter or variable within a mapping by selecting the Mappings > Parameters and Variables menu item within the Mapping Designer.
After selecting the Mappings > Parameters and Variables menu item, a Declare Parameters and Variables dialog box opens.
We define the value of a parameter in a parameter file prior to running a session.
We can do this for a variable as well, but it is not necessary. If we define a variable's value within a parameter file, the mapping/session uses the parameter file's value as the variable's initial/start value. If we do not declare the variable's value in a parameter file, the last value saved by the session/mapping becomes the variable's initial/start value.
One final place the Informatica Integration Service looks for an initial value is the "Initial Value:" setting of the Declare Parameters and Variables dialog box. If a parameter has not been defined within a parameter file, this initial value is used as the start value. The same applies to a variable that has not been defined within a parameter file when the Integration Service cannot find a saved variable value in the repository.
If none of these parameter/variable initial values are set explicitly, Informatica defaults string datatypes to an empty string, numbers to 0, and dates to 1/1/1753 A.D. (or 1/1/1).
Parameter and Variable Start Value Setting Order
Parameter:
1. Value in parameter file
2. Value in pre-session variable assignment
3. Initial value
4. Datatype default value
Variable:
1. Value in parameter file
2. Value in pre-session variable assignment
3. Value saved in the repository
4. Initial value
5. Datatype default value
Where to Use Parameters and Variables
We can use a parameter or variable in the Expression editor of any transformation in a mapping or mapplet. Source Qualifier transformations and reusable transformations can also leverage parameters and variables. I have personally used parameters in many SQL override statements in Source Qualifier transformations.
One use case is to create a parameter for your schema in case the schema for the tables in your SQL statement changes. For example, let's say you are migrating from DB2 to an Oracle database, and your schema qualifier for a set of tables is DB2:
SELECT * FROM DB2.Contract WHERE CONTRACT_NUM LIKE 'ABC%'
When migrating to Oracle, the DBAs may insist on changing the schema from DB2 to ORACLE. For our custom SQL statements in Source Qualifier transformations to keep working, we would need to update every SQL statement, changing each table reference from DB2 to ORACLE.
Now, if we only have a handful of mappings referencing DB2 tables, this update is probably not a big deal. However, if our entire data warehouse resides on DB2 and we have hundreds of mappings to analyze and update, we have a lot of work to do. If we think ahead about this scenario, we can easily use a parameter to get around the problem.
For example, let's say we create a parameter called $$DW_SCHEMA and set its value in a parameter file to DB2. Our SQL statement from before now looks like this (a schema qualifier is an identifier, so it does not get the single quotes we would use around a string value):
SELECT * FROM $$DW_SCHEMA.Contract WHERE CONTRACT_NUM LIKE 'ABC%'
Now, when the DBAs say they are going to migrate to Oracle and switch the schema name from DB2 to ORACLE, the issue becomes a simple value change in our parameter file. All we have to do is change DB2 to ORACLE…
$$DW_SCHEMA = DB2
gets updated to…
$$DW_SCHEMA = ORACLE
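For context, a value like this lives in a plain-text parameter file under a section header naming the folder, workflow, and session. The folder, workflow, and session names below are hypothetical:

```
[MyFolder.WF:wf_LOAD_CONTRACTS.ST:s_M_LOAD_CONTRACTS]
$$DW_SCHEMA=ORACLE
```

The session then points at this file, and every run picks up whatever value the file currently holds.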
While this does take some forward thinking, it is a real world scenario that can happen and should be
considered as part of your ETL architecture.
More About Variables
While I want to reserve the nitty-gritty details of Informatica variables for a later post, I do want to share a few more things.
As already stated, variables can change value throughout a session. At the start of a session, the variable's initial value is set according to the order specified in the table earlier in this post. Once the Integration Service reads in the initial value, the value can change within the session and be saved for the next session run.
We can set or change our variable using variable functions such as SetVariable, SetMaxVariable, SetMinVariable, and SetCountVariable. These functions can be used within the following transformations…
1. Expression
2. Filter
3. Router
4. Update Strategy
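As an illustration of how a function like SetMaxVariable evolves a value across rows, here is a rough Python emulation. This is not Informatica code; the names and shape are mine:

```python
def set_max_variable(rows, start_value):
    """Emulate SETMAXVARIABLE($$MaxSales, TOTAL_SALES) evaluated row by row."""
    current = start_value
    for total_sales in rows:
        # Each row keeps the larger of the running value and this row's value.
        current = max(current, total_sales)
    # If the session completes successfully, this final value is what the
    # Integration Service would save to the repository for the next run.
    return current

print(set_max_variable([5200.0, 11594.54, 7800.0], start_value=0.0))  # 11594.54
```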
The Integration Service will not save our variable value only when one of the following conditions occurs…
1. The session fails to complete.
2. The session is configured for a test load.
3. The session is a debug session.
4. The session runs in debug mode and is configured to discard session output.
Remember, variables stored in a parameter file will override our saved session variables.
Quick Tip
Most databases require single quotes around string values, so make sure to add them in any custom SQL override statements. If $$COUNTRY is defined in a parameter file as USA, for example, add single quotes around $$COUNTRY…
SELECT * FROM STATE WHERE COUNTRY = '$$COUNTRY'
which the Integration Service expands to…
SELECT * FROM STATE WHERE COUNTRY = 'USA'
Summary
After reading through this post, I hope you can see the value of parameters and variables in Informatica. With parameters and variables we can simplify our architecture and reduce rework. There are many more use cases for parameters and variables within PowerCenter.
I invite you to share a creative way you have been able to leverage Informatica parameters and variables!
SUBSTR in Informatica with Examples
October 1, 2014 · Aaron Gendle
An Overview of the SUBSTR Function in Informatica
SUBSTR in Informatica is a function that returns a subset of characters from a larger string. We can use this data as part of additional mapping logic or map it to a target table to be consumed by the business. SUBSTR is used primarily within the Expression Transformation in Informatica. It works perfectly with pattern-based string values like zip codes or phone numbers.
Let's take a look at a quick SUBSTR in Informatica example.
Phone Number Example
Let's say we have the phone numbers below passing through our mapping into an expression transformation:
209-555-1234
714-555-5678
515-555-9123
Assume we want to populate a PHONE table along with AREA_CODE and MAIN_LINE fields.
SUBSTR in Informatica works perfectly for extracting these pieces of data out of the full phone
number.
Let's take a quick look at the syntax we must use:
SUBSTR( string, start [,length] )
Our first two parameters are required, the third is optional.
1. "string" is the character string we want to search. Generally we pass in an expression string variable or input port.
2. "start", an integer, is the starting position to begin counting. We can pass a positive or negative value here. If we pass a positive value, we count left to right for our starting position; if we pass a negative value, we count right to left. The Integration Service treats 0 the same as 1, the first character in the string.
3. "length" is an optional parameter. If entered, it must be an integer greater than 0. It tells the Integration Service how many characters of the string to return, starting from the start position. If left blank, the rest of the string is returned from the start position.
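To make those rules concrete, here is a small Python emulation of the SUBSTR behavior described above. This is an illustrative sketch, not Informatica code:

```python
def substr(string, start, length=None):
    """Rough Python emulation of Informatica's SUBSTR semantics."""
    if string is None:
        return None                        # NULL in, NULL out
    if start > 0:
        pos = start - 1                    # count left to right; 1 = first char
    elif start == 0:
        pos = 0                            # 0 is treated the same as 1
    else:
        pos = max(len(string) + start, 0)  # negative: count right to left
    if length is None:
        return string[pos:]                # no length: rest of the string
    return string[pos:pos + length]

print(substr("209-555-1234", 1, 3))   # 209
print(substr("209-555-1234", -4))     # 1234
```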
OK, now that we understand the SUBSTR in Informatica syntax, let's continue our phone number example.
Area Code
Using the SUBSTR in Informatica parameter values below, we can return the first three characters of our PHONE_NUMBER data:
SUBSTR(PHONE_NUMBER, 1, 3)
PHONE_NUMBER AREA_CODE
209-555-1234 209
714-555-5678 714
515-555-9123 515
I named this expression output port OUT_AREA_CODE.
Let's add another expression output port, OUT_MAIN_LINE, defined with the SUBSTR statement below. We start at the 5th character of PHONE_NUMBER and return the next 8 characters.
SUBSTR(PHONE_NUMBER, 5, 8)
PHONE_NUMBER MAIN_LINE
209-555-1234 555-1234
714-555-5678 555-5678
515-555-9123 555-9123
Putting it all together, our expression transformation will produce the following:
PHONE_NUMBER AREA_CODE MAIN_LINE
209-555-1234 209 555-1234
714-555-5678 714 555-5678
515-555-9123 515 555-9123
Below is a snapshot of our expression transformation ports tab. I defined our new fields using
SUBSTR as OUT_AREA_CODE and OUT_MAIN_LINE.
Common Questions
Question 1 – What will the SUBSTR in Informatica function return when my "string" value is NULL?
A. When the string value is NULL, SUBSTR will return NULL.
Question 2 – What if my "string" does not follow a fixed character-length pattern? How would I return the domain name in an email address, for example?
A. Many times our data is not simple. It may follow a pattern of some kind, but perhaps not as straightforward as our PHONE_NUMBER example.
In these situations, we can use the INSTR function to determine our start position, the length of characters to return, or both.
In the case of an email domain, we would do something like this…
SUBSTR(EMAIL_ADDRESS, INSTR(EMAIL_ADDRESS, '@') + 1)
We passed the EMAIL_ADDRESS port into SUBSTR as the string value. Since we cannot predict the starting position for every email address ahead of time, I used the INSTR function to get the start position. I passed the same EMAIL_ADDRESS port into INSTR as the string to search, and the @ symbol as the character to search for.
INSTR in Informatica returns the position of the first occurrence of the @ symbol, so adding 1 starts the substring one character past it. Since I do not know how long any domain will be, I left the optional SUBSTR length parameter empty so the rest of the string, the domain, is returned.
Using some real data, our results might look something like this:
EMAIL_ADDRESS DOMAIN
12345Go@gmail.com gmail.com
hello@hotmail.com hotmail.com
dataintegration@yahoo.com yahoo.com
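The SUBSTR/INSTR combination can be emulated in Python like so. This is illustrative only; `str.find` plays the role of INSTR once adjusted to a 1-based position:

```python
def email_domain(email):
    """Emulate SUBSTR(EMAIL_ADDRESS, INSTR(EMAIL_ADDRESS, '@') + 1)."""
    if email is None:
        return None                 # NULL in, NULL out
    # INSTR returns the 1-based position of '@' (0 if not found);
    # str.find is 0-based, so adding 1 gives the same number.
    at_pos = email.find('@') + 1
    # Start one character past the '@'; with no length argument,
    # the rest of the string (the domain) comes back.
    return email[at_pos:]

print(email_domain("12345Go@gmail.com"))  # gmail.com
```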
Summary
SUBSTR in Informatica is a very useful function. It helps us extract specific characters from a string that might be useful on their own. The phone number use case is a perfect example of how SUBSTR can be used on strings with simple, consistent patterns. For more complex patterns, we can use the INSTR function in Informatica to complement SUBSTR.
Let me know how you have used the SUBSTR function in Informatica!
Expression Transformation in Informatica
July 7, 2014 · Aaron Gendle
An Overview of the Expression Transformation in Informatica
The Expression Transformation in Informatica is a passive, connected transformation. It allows data to be transformed one row at a time. Dozens of built-in functions and variables can be programmed into the expression transformation. This flexibility provides almost limitless field transformation possibilities for any given record. Keep in mind, however, that aggregation across multiple rows is not possible within the expression transformation and should be performed by Informatica's Aggregator Transformation.
Let's take a quick look at the different tabs within Informatica's expression transformation.
Informatica Expression Transformation Tab
Double clicking the expression transformation within a mapping opens an edit transformation window that defaults to the Transformation tab.
Within the Transformation tab, we can rename the transformation by clicking the Rename button and updating the name in the transformation name text box. Notice how I have renamed our expression transformation to EXP_SALES_HIGH.
Additionally, we can type a useful transformation description, and/or make the transformation reusable
by checking the “make reusable” checkbox.
Informatica Expression Properties Tab
Let's skip to the Properties tab, where we control one transformation attribute: tracing level. The tracing level attribute lets us control the amount of detail written to the session log for this specific transformation. By default, the tracing level is set to Normal.
Informatica Expression Metadata Extensions Tab
The Metadata Extensions tab lets us associate additional metadata about our transformation object with the existing repository metadata. In the example below, I added a specific agent name, "Joe Smith", as an extension name for our EXP_SALES_HIGH expression transformation. This agent name will now be associated with this object in the Repository Manager.
Informatica Expression Ports Tab
Finally, let's review the expression transformation Ports tab. This tab contains the heart and soul of the expression transformation in Informatica. Within the Ports tab we get a view of which ports are feeding into and out of the transformation. Additionally, we can add variable ports that allow us to manipulate our input ports, transforming data to fit our business needs. Going from left to right, let's review each field attribute on the Ports tab.
Port Field Attributes
Port Name – This textbox names each input, output, and variable port. Names must be unique within the transformation.
Datatype – This drop-down box allows us to select from several datatype options (bigint, decimal, double, string, etc.). We should align this value with our actual data's datatype.
Length/Precision – When configuring any numeric datatype, we also have the option of selecting a length or precision for our data. If, for example, we knew the length of our integer field would never exceed 10 digits, we might select 10 for our precision.
Scale – Scale allows us to fine-tune the number of digits we want after the decimal point. For example, v_NET_SALES above is being rounded to two decimal places, so its scale has been set to two.
Input Port – The input port checkbox, if checked, signifies that the port is an input from another transformation in the mapping. Input ports generally have the prefix IN_.
Output Port – The output port checkbox, if checked, signifies that the port is an output of the expression transformation and can be passed to downstream transformations. Output ports generally have the prefix OUT_.
Variable/Local Port – The local variable port allows us to create our own variables within the transformation. As a general rule, place a variable port above any other ports that reference it. Local variable ports generally have the prefix V_. When checked, the input and output checkboxes are unchecked automatically.
Expression – The expression attribute allows us to transform data coming into the transformation one row at a time. This is where we use functions, variables, calculations, and conditional statements to derive data that conforms to our business rules.
Let's take a deeper look into the expression attribute. Notice we can single-click each field's expression box if the field is an output or variable port. After clicking, we get the expression editor box below. To give a high-level view of what we can do here: all the action occurs within the formula box. This is where we place all our data-manipulating functions and calculations.
On the left side of our expression editor, we have Functions, Ports, and Variables tabs. The Functions tab gives a list and functional description of the over 90 built-in Informatica expression transformation functions. The Ports tab lists all available ports that can be manipulated within the expression. Finally, the Variables tab provides a list of all built-in and user-specified variables.
On the bottom right of the expression editor, a validate button is available to validate the formula we
have placed in the formula box. If our formula is successfully validated we will get a dialog box that
looks like this…
If our formula is not parsed successfully, we will get a dialog box that looks like this…
If we just click OK, our formula will automatically be validated and we will only receive an
unsuccessful message if our formula is invalid. Our expression editor will close without a message if
our formula is valid.
Finally, we have a COMMENT button that generates a comment box. This box allows us to generate
more detailed comments around each individual field within our expression.
Velocity Naming Standard
Velocity recommends the format below when naming an Expression Transformation in Informatica:
EXP_{DESCRIPTOR}
Ex: EXP_SALES
Summary
The Expression Transformation in Informatica is one of the most common transformations in a PowerCenter mapping. The middle portion of most mappings is all about transforming data, and that is exactly what Informatica's expression transformation was designed for. With it, we can transform any port passed through it, one record at a time.
Filter Transformation in Informatica
March 2, 2014 · Ravneet Ghuman
An overview of the Filter Transformation in Informatica.
The Filter Transformation in Informatica is an active, connected transformation. It evaluates a filter condition as True or False, and the Integration Service lets through all records that satisfy the condition.
Filter Transformation in Informatica Example
Let's work through a filter transformation in Informatica example.
To give a marketing offer to customers spending more than $50,000/year, we could add a filter condition like AnnualSpend >= 50000 to let only such records through.
All input ports for a filter transformation in Informatica should be mapped to a single preceding
transformation.
Filter conditions can be added using the Informatica Expression editor and should evaluate to either True or False. A filter condition can also be built from expressions that return numeric values: any non-zero value represents True and a zero value represents False. Any condition that returns NULL is treated as False.
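That True/non-zero/NULL rule can be sketched in Python as a small emulation. This is for illustration only; the function names are mine, and `None` stands in for NULL:

```python
def filter_rows(rows, condition):
    """Keep rows whose condition result is True/non-zero; NULL and 0 drop."""
    kept = []
    for row in rows:
        result = condition(row)
        if result is not None and result != 0:
            kept.append(row)
    return kept

customers = [{"AnnualSpend": 60000}, {"AnnualSpend": 40000},
             {"AnnualSpend": None}]
# AnnualSpend >= 50000; a NULL spend yields NULL, which is treated as False.
offer = filter_rows(customers,
                    lambda r: None if r["AnnualSpend"] is None
                    else r["AnnualSpend"] >= 50000)
print(offer)  # [{'AnnualSpend': 60000}]
```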
Properties Tab
The Properties tab below allows us to set the filter condition. The default value is TRUE, which lets all records through.
Common Questions
1. Where should I place a filter transformation in an Informatica mapping?
To improve performance of a workflow, filter out unwanted records as early as possible in the
mapping – this would ensure fewer records flow through a major part of the map, thereby improving
performance.
2. Is it better to use filter conditions in Source Qualifier transformations rather than a Filter Transformation in Informatica to drop records?
Yes, set filter conditions in Source Qualifier transformations where possible. The advantage of this approach is that it limits the number of records read from the source itself, as opposed to reading all records and then using a filter transformation to drop some of them. However, filter conditions can be set in a Source Qualifier transformation only for relational sources, whereas Filter transformations work with any source type; all they need is a condition that returns either True or False.
Velocity Naming Standard
Velocity recommends the below format when naming a Filter Transformation in Informatica mapping:
FIL_{DESCRIPTOR}
Ex: FIL_EmployeeSalary50k
Filter Transformation in Informatica Summary
The Filter Transformation in Informatica is a very common Informatica PowerCenter mapping object. Its basic purpose is to filter out data not matching a developer-specified conditional statement. Remember, to improve performance, place this transformation early in a mapping to ensure as few records as possible are processed by later transformations. Enjoy using the filter transformation in Informatica.
Router Transformation in Informatica
June 28, 2014 · Aaron Gendle
An overview of the Router Transformation in Informatica
The Router Transformation in Informatica is an active, connected transformation. A router transformation allows us to test a single source of data against multiple conditions. We can then filter and route data that fits specific conditions to different pipelines and targets. The router behaves similarly to Informatica's filter transformation.
Much of what you can accomplish with the Router Transformation in Informatica can also be accomplished with a filter transformation. However, the router allows us to route and filter within a single transformation; we would need to create and configure multiple filter transformations to accomplish the same task.
Let's take a deeper look at what I am talking about. Below is an example of a single source of sales data being passed through a router transformation.
Within the router transformation's Groups tab (see below), I created three different output groups: SALES_HIGH, SALES_MEDIUM, and SALES_LOW. Within each group filter condition field, I coded some conditional logic. This logic checks the sales amount of each sales agent record passed through the transformation. If a given record meets the conditional logic for a group, it is passed down that group's data path, through its output ports, to one of the three distinct expression transformations connected to the router.
For example, let's assume sales agent John Smith has total sales of $11,594.54. This record would meet the condition for SALES_HIGH and continue to EXP_SALES_HIGH, since 11,594.54 is greater than 10,000.00.
As previously mentioned, this same functionality can be accomplished with multiple filter transformations, as seen below. We could do this by creating three different filter transformations and adding each output group's filter logic to the three distinct filters.
FIL_SALES_HIGH Filter Condition Logic
TOTAL_SALES > 10000
FIL_SALES_MEDIUM Filter Condition Logic
TOTAL_SALES <= 10000 AND TOTAL_SALES > 5000
FIL_SALES_LOW Filter Condition Logic
TOTAL_SALES <= 5000
Informatica Router Transformation Example Summary
While there is almost always more than one way to do something, there is generally a right and a wrong way. In this example, our mapping will perform better, and we will save repository space, if we use the single router transformation instead of three distinct filter transformations.
Informatica Router Transformation Default Port
The router transformation output groups we previously discussed (SALES_HIGH, SALES_MEDIUM, and SALES_LOW) were all user-defined. However, there is one final output group that is auto-generated when the router transformation is created: the DEFAULT output group. The DEFAULT output group passes data that does not meet any of the other output groups' conditional statements. For example, let's say we changed our SALES_LOW output group condition to TOTAL_SALES <= 4000 instead of 5000. This change would redirect any record with total sales from 4001 to 5000 to the DEFAULT group. We could then decide whether to route these DEFAULT group records to an additional data path/target or just let them drop out of our mapping completely.
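Here is a rough Python sketch of how the router evaluates its groups, including the DEFAULT group. This is illustration only; a record goes to every group whose condition it meets, and to DEFAULT only when none match:

```python
GROUPS = [
    ("SALES_HIGH",   lambda sales: sales > 10000),
    ("SALES_MEDIUM", lambda sales: 5000 < sales <= 10000),
    ("SALES_LOW",    lambda sales: sales <= 4000),  # lowered from 5000
]

def route(total_sales):
    """Return the output groups that receive this record."""
    matched = [name for name, condition in GROUPS if condition(total_sales)]
    return matched or ["DEFAULT"]

print(route(11594.54))  # ['SALES_HIGH']
print(route(4500))      # ['DEFAULT']  (falls in the 4001-5000 gap)
```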
Velocity Naming Standard
Velocity recommends the format below when naming a Router Transformation in Informatica:
RTR_{DESCRIPTOR}
Ex: RTR_SALES
Router Transformation in Informatica Summary
The Router Transformation in Informatica is very helpful for splitting a single data pipeline into many. We might split and route data for many reasons, but most often I have used it to split data for insert, update, or delete into the same target table. Make sure you understand the router transformation in Informatica, as you will surely use it when developing your data warehouse environment.
Update Strategy Transformation In Informatica
July 3, 2014 · Aaron Gendle
An Overview of the Update Strategy Transformation in Informatica
The Update Strategy Transformation in Informatica is an active, connected transformation. Its purpose is to control how data is inserted, updated, deleted, and rejected for a given target table. This control is vital in the data integration/warehousing world, as it allows us to store data in a manner that fits our business needs.
Configuring the Update Strategy Transformation
The update strategy can be configured from within the session or within the mapping itself. Let's take a look at these options one at a time.
Configuring the Mapping Session
Let's take a look at a quick example. To configure the update strategy from within the session, start in the Workflow Manager and double click the mapping session named s_UPD_DEMO below.
After double clicking the session, click on the properties tab.
This tab will show us a “treat source rows as” drop down with four different options. This attribute
allows us to control, at a session level, if rows are inserted, updated, or deleted from our target
table(s). If we select insert, our mapping will attempt to insert each record directed to our target table.
We need to ensure our data includes a primary key mapped to our target table’s primary key. If for
some reason we attempt to insert a record with a primary key that already exists in our target table,
this record will be rejected. These same rules apply to update and delete options. The difference
being, our mapping records will attempt to be updated or deleted in our target table instead of
inserted.
In addition to selecting the correct “treat source rows as” attribute option, we must set target table
level attributes on the mapping tab of our session. If for example we have selected the insert option
for our “treat source rows as” option, we need to click on our target table, then check the insert
attribute checkbox. Make sure to uncheck all other database operation checkboxes.
If we want to update records and have selected update as our “treat source rows as” attribute option,
then we have three options at the target table level to choose from: Update as Update, Update as
Insert, and Update else Insert.
Update as Update – Update each row in the target table if it exists.
Update as Insert – Insert each row into the target table.
Update else Insert – First try to update the row in the target table if it exists; otherwise insert it.
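The three target-level options can be summarized in a small Python sketch. This is an emulation for illustration; the option strings and return values here are mine, not Informatica API names:

```python
def resolve_update(option, row_exists_in_target):
    """What happens to a row flagged for update, per target-level option."""
    if option == "Update as Update":
        # Only an UPDATE is attempted; a missing row simply updates nothing.
        return "UPDATE" if row_exists_in_target else "NO-OP"
    if option == "Update as Insert":
        return "INSERT"               # every row is inserted
    if option == "Update else Insert":
        return "UPDATE" if row_exists_in_target else "INSERT"
    raise ValueError(f"unknown option: {option}")

print(resolve_update("Update else Insert", row_exists_in_target=False))  # INSERT
```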
Similarly to our “treat source rows as” insert example, if we want to delete rows, we should select
delete as our “treat source rows as” attribute option and check the delete attribute checkbox for each
target table on the mapping tab.
Our last table attribute option is the Truncate target table option. This truncates all data in the target table prior to running any records through our mapping.
Our final "treat source rows as" attribute option is data driven. This is the default option when we add an update strategy transformation to our mapping. This option tells our mapping to use the logic within our update strategy transformation when determining whether to insert, update, delete, or reject records. This finer control is very nice to have when building a data warehouse, and it is a best practice when flagging records for the same target table with different database operations.
Configuring the Mapping
Let's take a look at how to configure our update strategy transformation in an Informatica mapping. Below we have an example mapping, M_Sales.
Notice we have a single source of sales agent data coming from a flat file. Our data is being routed
through a router transformation, then to 3 different update strategy transformations
(UPD_INSERT_HIGH, UPD_UPDATE_MEDIUM, and UPD_DELETE_LOW), all flagging our records
for different database operations. Finally we are sending our sales agent records to the same target
table, SALES.
Let's take a quick look at our router group criteria…
Notice how the filter conditions separate agents with high, medium, and low total sales amounts. We route agents with high sales to the UPD_INSERT_HIGH update strategy transformation, medium sales to UPD_UPDATE_MEDIUM, and low sales to UPD_DELETE_LOW. Let's look at each of these transformations in more detail.
In the mapping, double clicking the UPD_INSERT_HIGH update strategy transformation and clicking the Properties tab gives us the view below.
Notice how I have programmed DD_INSERT into the Update Strategy Expression transformation attribute. This tells the transformation to flag all records passed through it for insert into the target table. We can also use numeric values here, but I recommend using the constants as a best practice, since the operation is much more intuitive. Below are all of our options for this attribute along with their corresponding operations.
Operation Constant Numeric Value
Insert DD_INSERT 0
Update DD_UPDATE 1
Delete DD_DELETE 2
Reject DD_REJECT 3
Let's quickly review the two additional update strategy transformations in this mapping.
UPD_UPDATE_MEDIUM is set to update rows it matches by primary key in our target SALES table.
UPD_DELETE_LOW is set to delete rows it matches by primary key in our target SALES table.
Forward Rejected Rows
Notice how the forward rejected rows transformation attribute is checked. This is the default setting for
a new update strategy transformation. This really didn’t come into play in our example, but if we were
to set some conditional logic within our update strategy expression, we might reject some rows and
decide we do not want them to pass to our next transformation. For example, we could put a
statement like the below in our UPD_UPDATE_MEDIUM update strategy transformation:
IIF ( TOTAL_SALES <= 10000 AND TOTAL_SALES > 6000, DD_UPDATE, DD_REJECT)
This statement instructs the transformation to flag rows for update if TOTAL_SALES is less than or equal to 10000 and greater than 6000. If TOTAL_SALES is less than or equal to 6000, the row is rejected instead. This exact logic may not be something we would do in real life, but I think you get the point.
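In Python terms, that IIF expression flags rows roughly like this. This is an emulation using the numeric values of the constants from the table above:

```python
DD_UPDATE, DD_REJECT = 1, 3  # numeric values of the Informatica constants

def flag_row(total_sales):
    """Emulate IIF(TOTAL_SALES <= 10000 AND TOTAL_SALES > 6000,
    DD_UPDATE, DD_REJECT)."""
    return DD_UPDATE if 6000 < total_sales <= 10000 else DD_REJECT

print(flag_row(8000))  # 1 (DD_UPDATE)
print(flag_row(5500))  # 3 (DD_REJECT)
```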
Back to our forward rejected rows attribute: if we leave the checkbox unchecked, rejected records do not pass to our target table and are dropped by the Integration Service. Additionally, they are written to the session log file.
If we keep the forward rejected rows attribute checked, not much changes. The records are passed toward the target table, but are still rejected and dropped; however, they are written to the session reject file instead of the session log file.
Velocity Naming Standard
Velocity recommends the format below when naming an Update Strategy Transformation in Informatica:
UPD_{DESCRIPTOR}
Ex: UPD_SALES
Summary
The Update Strategy Transformation in Informatica is a great tool to control how data passed through a mapping is flagged for insert, update, delete, and reject in the target database table. We can control this at either the session level or the mapping level through the transformation itself. If you're in the business intelligence and data warehouse world, you will definitely want a deep understanding of how the Update Strategy Transformation in Informatica works. Happy integrating…
Source Qualifier Transformation in Informatica
September 24, 2014 · Aaron Gendle
An Overview of the Source Qualifier Transformation in Informatica
Overview
The Source Qualifier Transformation in Informatica is an active, connected transformation. It selects records from flat files and relational sources. Attributes or columns connected from the Source Qualifier's output are then passed to additional mapping transformations. Additionally, it converts data from the source's native datatype to a compatible PowerCenter transformation datatype. For relational sources, if we do not code a custom SQL statement, SQL is generated dynamically to extract the data.
Business Purpose
The Source Qualifier Transformation in Informatica provides an efficient way to filter input fields/columns. It also lets us perform homogeneous joins, joins made within the same data source (e.g., Oracle + Oracle, DB2 + DB2, Teradata + Teradata).
Ports Tab Example
Properties Tab
Below we describe the different properties within the properties tab:
SQL Query – Allows you to override the default SQL query that PowerCenter creates at runtime.
User Defined Join – Allows you to specify a join that replaces the default join created by PowerCenter.
Source Filter – Allows you to create a WHERE clause that is inserted into the SQL query generated at runtime. The "WHERE" keyword itself is not required. For example: Employee.ID = Person.ID
Number of Sorted Ports – PowerCenter inserts an ORDER BY clause into the generated SQL query. The ORDER BY covers the number of ports specified, counted from the top down. For example, in the SQ_SALES Source Qualifier, if the number of sorted ports = 2, the ORDER BY will be: ORDER BY SALES.SALES_ID, SALES.AGENT_ID
Tracing Level – Specifies the amount of detail written to the session log.
Select Distinct – Allows you to select distinct values only.
Pre SQL – Allows you to specify SQL that runs before the pipeline runs, using the connection specified in the session task.
Post SQL – Allows you to specify SQL that runs after the pipeline has run, using the connection specified in the session task.
Custom SQL Query
We touched on the SQL Query property in the last section, but there are some extra tips and tricks I thought deserved more attention. As previously stated, this property lets us code a custom SQL statement into the Source Qualifier. The Integration Service then uses our custom SQL, instead of the default SQL statement, to extract data from the source.
SQL Example
SELECT FIRST_NAME, LAST_NAME, FULL_NAME
FROM PERSON
One tip when overriding SQL: make sure the columns in your SELECT statement align with the ports you map from the Source Qualifier to your next transformation. Using the example SQL statement above, we want FIRST_NAME, LAST_NAME, and FULL_NAME to be the only ports in our Source Qualifier transformation. If we have more ports, we have to be careful to connect these ports in the same order as the SELECT statement. The point is, the number of ports we map and the order we map them in need to align with the number and order of columns in our SELECT statement.
Velocity Best Practice
Velocity recommends the following naming standard for the Source Qualifier Transformation in Informatica:
SQ_{FUNCTION}
Example: SQ_PERSON
Summary
The Source Qualifier Transformation in Informatica helps us select records from flat files and
relational database sources. In my experience, the SQL Query property is heavily used to code
custom SQL statements. These simple to complex custom SQL statements help us extract specific
data according to our individual mapping needs.