Monday, 31 March 2014

Update Strategy Transformation in Informatica

Active and Connected Transformation
So far, we have only inserted rows into our target tables. What if we want to update, delete, or reject rows coming from the source based on some condition?
Example: If the address of a CUSTOMER changes, we can either update the old address or keep both the old and the new address as two rows, one for the old and one for the new. This way we maintain historical data.
Update Strategy is typically used with a Lookup transformation. In a DWH, we create a Lookup on the target table to determine whether a row already exists. Then we insert, update, delete, or reject the source record as per the business need.
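The lookup-then-flag pattern above can be sketched in Python. This is a hypothetical illustration only: the dict stands in for the Lookup on the target table, and names like flag_row are ours, not Informatica's.

```python
# Hypothetical sketch of the lookup-then-flag pattern: look up the
# incoming key in the target, then decide how to flag the source row.
target = {101: "12 Oak St"}  # existing CUSTOMER rows: CUSTOMER_ID -> ADDRESS

def flag_row(customer_id, address):
    """Return the operation this source row should be flagged with."""
    if customer_id not in target:
        return "INSERT"              # new customer
    if target[customer_id] != address:
        return "UPDATE"              # address changed (or INSERT to keep history)
    return "REJECT"                  # unchanged row: nothing to do

print(flag_row(102, "9 Elm Rd"))    # INSERT
print(flag_row(101, "5 Pine Ave"))  # UPDATE
print(flag_row(101, "12 Oak St"))   # REJECT
```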
In Power Center, we set the update strategy at two different levels:
1. Within a session
2. Within a mapping
1. Update Strategy within a session:
When we configure a session, we can instruct the IS to either treat all rows in the same way or use instructions coded into the session mapping to flag rows for different database operations.
Session Configuration:
Edit Session -> Properties -> Treat Source Rows As: (Insert, Update, Delete, Data Driven). Insert is the default.
Specifying Operations for Individual Target Tables:
You can set the following update strategy options:
Insert: Select this option to insert a row into a target table.
Delete: Select this option to delete a row from a table.
Update: We have the following options in this situation:
  •  Update as Update. Update each row flagged for update if it exists in the target table.
  •  Update as Insert. Insert each row flagged for update.
  •  Update else Insert. Update the row if it exists. Otherwise, insert it.
Truncate table: Select this option to truncate the target table before loading data.
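The three Update options can be illustrated with a small Python sketch. This is purely illustrative: the list of dicts stands in for the target table, and apply_update is our own name.

```python
# Illustrative semantics of the three session-level Update options.
def apply_update(target, row, mode):
    """target: list of row dicts keyed by 'deptno'; row is flagged for update."""
    match = next((t for t in target if t["deptno"] == row["deptno"]), None)
    if mode == "update as update":
        if match:
            match.update(row)          # update only if the row exists
    elif mode == "update as insert":
        target.append(row)             # insert every row flagged for update
    elif mode == "update else insert":
        if match:
            match.update(row)          # update if it exists...
        else:
            target.append(row)         # ...otherwise insert it

target = [{"deptno": 10, "dname": "ACCOUNTING"}]
apply_update(target, {"deptno": 20, "dname": "RESEARCH"}, "update else insert")
print(len(target))  # 2: deptno 20 did not exist, so it was inserted
```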
2. Flagging Rows within a Mapping
Within a mapping, we use the Update Strategy transformation to flag rows for insert, delete, update, or reject.
Operation   Constant     Numeric Value
INSERT      DD_INSERT    0
UPDATE      DD_UPDATE    1
DELETE      DD_DELETE    2
REJECT      DD_REJECT    3
Update Strategy Expressions:
Frequently, the update strategy expression uses the IIF or DECODE function from the transformation language to test each row to see if it meets a particular condition.
IIF( ( ENTRY_DATE > APPLY_DATE), DD_REJECT, DD_UPDATE )
Or
IIF( ( ENTRY_DATE > APPLY_DATE), 3, 2 )
  • The above expression is written in the Properties tab of the Update Strategy transformation.
  • DD means DATA DRIVEN.
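Assuming the numeric values from the table above, the IIF expression behaves like this Python sketch (illustrative only; the sample dates are made up):

```python
# Numeric values of the data-driven constants (from the table above).
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag(entry_date, apply_date):
    # Equivalent of IIF(ENTRY_DATE > APPLY_DATE, DD_REJECT, DD_UPDATE);
    # ISO-format date strings compare correctly as strings.
    return DD_REJECT if entry_date > apply_date else DD_UPDATE

print(flag("2014-04-01", "2014-03-31"))  # 3 -> row is rejected
print(flag("2014-03-01", "2014-03-31"))  # 1 -> row is updated
```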
Forwarding Rejected Rows:
We can configure the Update Strategy transformation to either pass rejected rows to the next transformation or drop them.
Steps:
1. Create the Update Strategy transformation.
2. Pass all the needed ports to it.
3. Set the expression in the Properties tab.
4. Connect to other transformations or the target.
Performance tuning:
1. Use the Update Strategy transformation as little as possible in the mapping.
2. Do not use an Update Strategy transformation if we just want to insert into the target table; instead use a direct mapping, direct filtering, etc.
3. For updating or deleting rows from the target table, we can use the Update Strategy transformation itself.





Union Transformation in Informatica

Active and Connected transformation.
The Union transformation is a multiple input group transformation that we can use to merge data from multiple pipelines or pipeline branches into one pipeline branch. It merges data from multiple sources, similar to the UNION ALL SQL statement that combines the results of two or more SELECT statements.
Union Transformation Rules and Guidelines
  • We can create multiple input groups, but only one output group.
  • We can connect heterogeneous sources to a Union transformation.
  • All input groups and the output group must have matching ports. The precision, data type, and scale must be identical across all groups.
  • The Union transformation does not remove duplicate rows. To remove duplicate rows, we must add another transformation such as a Router or Filter transformation.
  • We cannot use a Sequence Generator or Update Strategy transformation upstream from a Union transformation.
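A quick sketch of the UNION ALL behavior noted above, with made-up sample rows: duplicates survive the merge, and removing them takes a separate step, just as the guideline says.

```python
# Union transformation ~ SQL UNION ALL: pipelines are concatenated,
# and duplicate rows are NOT removed.
emp_10 = [("SMITH", 10), ("JONES", 10)]
emp_20 = [("SMITH", 10), ("ALLEN", 20)]   # ("SMITH", 10) duplicates a row above

merged = emp_10 + emp_20                  # 4 rows: the duplicate is kept
deduped = list(dict.fromkeys(merged))     # what a downstream transformation would do

print(len(merged))   # 4
print(len(deduped))  # 3
```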
Union Transformation Components
When we configure a Union transformation, define the following components:
Transformation tab: We can rename the transformation and add a description.
Properties tab: We can specify the tracing level.
Groups tab: We can create and delete input groups. The Designer displays groups we create on the Ports tab.
Group Ports tab: We can create and delete ports for the input groups. The Designer displays ports we create on the Ports tab.
We cannot modify the Ports, Initialization Properties, Metadata Extensions, or Port Attribute Definitions tabs in a Union transformation.
Create input groups on the Groups tab, and create ports on the Group Ports tab. We can create one or more input groups on the Groups tab. The Designer creates one output group by default. We cannot edit or delete the default output group.
Example: to combine data of tables EMP_10, EMP_20 and EMP_REST
  • Import tables EMP_10, EMP_20 and EMP_REST in shared folder in Sources.
  • Create a target table EMP_UNION_EXAMPLE in the Target Designer. The structure should be the same as the EMP table.
  • Create the shortcuts in your folder.
Creating Mapping:
1. Open the folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give the mapping name. Ex: m_union_example
4. Drag EMP_10, EMP_20 and EMP_REST from the sources into the mapping.
5. Click Transformation -> Create -> Select Union from the list. Give a name and click Create. Now click Done.
6. Pass ports from SQ_EMP_10 to the Union transformation.
7. Edit the Union transformation. Go to the Groups tab.
8. One group will already be there, as we dragged ports from SQ_EMP_10 to the Union transformation.
9. As we have 3 source tables, we need 3 input groups. Click the Add button to add 2 more groups. See the sample mapping.
10. We can also modify ports in the Ports tab.
11. Click Apply -> OK.
12. Drag the target table now.
13. Connect the output ports from the Union to the target table.
14. Click Mapping -> Validate
15. Repository -> Save
  • Create Session and Workflow as described earlier. Run the Workflow and see the data in target table.
  • Make sure to give connection information for all 3 source Tables.
Sample mapping picture



Transaction Control Transformation in Informatica

  
Power Center lets you control commit and roll back transactions based on a set of rows that pass through a Transaction Control transformation. A transaction is the set of rows bound by commit or roll back rows. You can define a transaction based on a varying number of input rows. You might want to define transactions based on a group of rows ordered on a common key, such as employee ID or order entry date.
In Power Center, you define transaction control at the following levels:
  • Within a mapping. Within a mapping, you use the Transaction Control transformation to define a transaction. You define transactions using an expression in a Transaction Control transformation. Based on the return value of the expression, you can choose to commit, roll back, or continue without any transaction changes.
  • Within a session. When you configure a session, you configure it for user-defined commit. You can choose to commit or roll back a transaction if the Integration Service fails to transform or write any row to the target.
When you run the session, the Integration Service evaluates the expression for each row that enters the transformation. When it evaluates a commit row, it commits all rows in the transaction to the target or targets. When the Integration Service evaluates a roll back row, it rolls back all rows in the transaction from the target or targets. If the mapping has a flat file target, you can generate an output file each time the Integration Service starts a new transaction. You can dynamically name each target flat file.
Properties Tab
On the Properties tab, you can configure the following properties:
  • Transaction control expression
  • Tracing level
Enter the transaction control expression in the Transaction Control Condition field. The transaction control expression uses the IIF function to test each row against the condition. Use the following syntax for the expression:
IIF (condition, value1, value2)
The expression contains values that represent actions the Integration Service performs based on the return value of the condition. The Integration Service evaluates the condition on a row-by-row basis. The return value determines whether the Integration Service commits, rolls back, or makes no transaction changes to the row.
When the Integration Service issues a commit or roll back based on the return value of the expression, it begins a new transaction. Use the following built-in variables in the Expression Editor when you create a transaction control expression:
  • TC_CONTINUE_TRANSACTION. The Integration Service does not perform any transaction change for this row. This is the default value of the expression.
  • TC_COMMIT_BEFORE. The Integration Service commits the transaction, begins a new transaction, and writes the current row to the target. The current row is in the new transaction.
  • TC_COMMIT_AFTER. The Integration Service writes the current row to the target, commits the transaction, and begins a new transaction. The current row is in the committed transaction.
  • TC_ROLLBACK_BEFORE. The Integration Service rolls back the current transaction, begins a new transaction, and writes the current row to the target. The current row is in the new transaction.
  • TC_ROLLBACK_AFTER. The Integration Service writes the current row to the target, rolls back the transaction, and begins a new transaction. The current row is in the rolled back transaction.
If the transaction control expression evaluates to a value other than commit, roll back, or continue, the Integration Service fails the session.
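The commit-before/commit-after placement of the current row can be sketched as follows. This is an illustrative model, not PowerCenter internals; `expr` plays the role of the transaction control expression, and the rollback variables are omitted for brevity.

```python
TC_CONTINUE, TC_COMMIT_BEFORE, TC_COMMIT_AFTER = 0, 1, 2

def run(rows, expr):
    """Group rows into committed transactions based on expr's return value."""
    committed, pending = [], []
    for row in rows:
        action = expr(row)
        if action == TC_COMMIT_BEFORE:
            committed.append(pending)   # commit the open transaction first...
            pending = [row]             # ...current row starts the NEW transaction
        elif action == TC_COMMIT_AFTER:
            pending.append(row)         # current row joins the open transaction...
            committed.append(pending)   # ...which is then committed
            pending = []
        else:                           # TC_CONTINUE_TRANSACTION
            pending.append(row)
    return committed, pending           # pending = rows not yet committed

txns, open_txn = run([1, 2, 3, 4], lambda r: TC_COMMIT_AFTER if r == 2 else TC_CONTINUE)
print(txns)      # [[1, 2]] -> row 2 is inside the committed transaction
print(open_txn)  # [3, 4]
```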
Mapping Guidelines and Validation
Use the following rules and guidelines when you create a mapping with a Transaction Control transformation:
  • If the mapping includes an XML target, and you choose to append or create a new document on commit, the input groups must receive data from the same transaction control point.
  • Transaction Control transformations connected to any target other than relational, XML, or dynamic MQSeries targets are ineffective for those targets.
  • You must connect each target instance to a Transaction Control transformation.
  • You can connect multiple targets to a single Transaction Control transformation.
  • You can connect only one effective Transaction Control transformation to a target.
  • You cannot place a Transaction Control transformation in a pipeline branch that starts with a Sequence Generator transformation.
  • If you use a dynamic Lookup transformation and a Transaction Control transformation in the same mapping, a rolled-back transaction might result in unsynchronized target data.
  • A Transaction Control transformation may be effective for one target and ineffective for another target. If each target is connected to an effective Transaction Control transformation, the mapping is valid.
  • Either all targets or none of the targets in the mapping should be connected to an effective Transaction Control transformation.
Example to Transaction Control:
Step 1: Design the mapping.
Step 2: Creating a Transaction Control Transformation.
  • In the Mapping Designer, click Transformation > Create. Select the Transaction Control transformation.
  • Enter a name for the transformation. (The naming convention for Transaction Control transformations is TC_TransformationName.)
  • Enter a description for the transformation.
  • Click Create.
  • Click Done.
  • Drag the ports into the transformation.
  • Open the Edit Transformations dialog box, and select the Ports tab.
Select the Properties tab. Enter the transaction control expression that defines the commit and roll back behavior.
Go to the Properties tab and click the down arrow to open the Expression Editor. Type IIF(EMPNO=7654, and then go to the Variables tab and select the built-in transaction control variables to complete the expression:
IIF (EMPNO=7654,TC_COMMIT_BEFORE,TC_CONTINUE_TRANSACTION)
  • Connect all the columns from the transformation to the target table and save the mapping.
  • Select the Metadata Extensions tab. Create or edit metadata extensions for the Transaction Control transformation.
  • Click OK.
Step 3: Create the task and the work flow.
Step 4: Preview the output in the target table.




Stored Procedure Transformation in Informatica

  • Passive Transformation
  • Connected and Unconnected Transformation
  • Stored procedures are stored and run within the database.
A Stored Procedure transformation is an important tool for populating and maintaining databases. Database administrators create stored procedures to automate tasks that are too complicated for standard SQL statements.
Use of Stored Procedure in mapping:
  • Check the status of a target database before loading data into it.
  • Determine if enough space exists in a database.
  • Perform a specialized calculation.
  • Drop and recreate indexes. This is the most common use in projects.
Data Passes Between IS and Stored Procedure:
One of the most useful features of stored procedures is the ability to send data to the stored procedure and receive data from it. There are three types of data that pass between the Integration Service and the stored procedure:
Input/output parameters: Parameters we give as input and the parameters returned from Stored Procedure.
Return values: Value returned by Stored Procedure if any.
Status codes: Status codes provide error handling for the IS during a workflow. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. We cannot see this value. The IS uses it to determine whether to continue running the session or stop.
Specifying when the Stored Procedure Runs:
Normal: The stored procedure runs where the transformation exists in the mapping on a row-by-row basis. We pass some input to procedure and it returns some calculated values. Connected stored procedures run only in normal mode.
Pre-load of the Source: Before the session retrieves data from the source, the stored procedure runs. This is useful for verifying the existence of tables or performing joins of data in a temporary table.
Post-load of the Source: After the session retrieves data from the source, the stored procedure runs. This is useful for removing temporary tables.
Pre-load of the Target: Before the session sends data to the target, the stored procedure runs. This is useful for dropping indexes or disabling constraints.
Post-load of the Target: After the session sends data to the target, the stored procedure runs. This is useful for re-creating indexes on the database.
Using a Stored Procedure in a Mapping :
1.      Create the stored procedure in the database.
2.    Import or create the Stored Procedure transformation.
3.    Determine whether to use the transformation as connected or unconnected.
4.    If connected, map the appropriate input and output ports.
5.     If unconnected, either configure the stored procedure to run pre- or post-session, or configure it to run from an expression in another transformation.
6.    Configure the session.
Stored Procedures:
Connect to Source database and create the stored procedures given below:
CREATE OR REPLACE PROCEDURE sp_agg (in_deptno IN NUMBER, max_sal OUT NUMBER,
    min_sal OUT NUMBER, avg_sal OUT NUMBER, sum_sal OUT NUMBER)
AS
BEGIN
  SELECT MAX(sal), MIN(sal), AVG(sal), SUM(sal)
    INTO max_sal, min_sal, avg_sal, sum_sal
    FROM emp
   WHERE deptno = in_deptno
   GROUP BY deptno;
END;
/

CREATE OR REPLACE PROCEDURE sp_unconn_1_value (in_deptno IN NUMBER, max_sal OUT NUMBER)
AS
BEGIN
  SELECT MAX(sal) INTO max_sal FROM emp WHERE deptno = in_deptno;
END;
/
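For reference, here is what sp_agg computes, sketched in Python over a few made-up EMP rows (the sample salaries are illustrative, not real data):

```python
# (empno, deptno, sal) sample rows - illustrative values only.
emp = [(7369, 10, 800), (7499, 20, 1600), (7782, 10, 2450)]

def sp_agg(in_deptno):
    """Return the procedure's four OUT parameters: max_sal, min_sal, avg_sal, sum_sal."""
    sals = [sal for (_, deptno, sal) in emp if deptno == in_deptno]
    return max(sals), min(sals), sum(sals) / len(sals), sum(sals)

print(sp_agg(10))  # (2450, 800, 1625.0, 3250)
```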
1. Connected Stored Procedure T/F
Example: To give input as DEPTNO from DEPT table and find the MAX, MIN, AVG and SUM of SAL from EMP table.
  • DEPT will be source table. Create a target table SP_CONN_EXAMPLE with fields DEPTNO, MAX_SAL, MIN_SAL, AVG_SAL & SUM_SAL.
  • Write the stored procedure in the database first and create shortcuts as needed.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping-> Create-> Give name. Ex: m_SP_CONN_EXAMPLE
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_agg from the list.
6. Drag DEPTNO from SQ_DEPT to the stored procedure input port and also to DEPTNO port of target.
7. Connect the ports from the procedure to the target.
8. Mapping -> Validate
9. Repository -> Save
  • Create Session and then workflow.
  • Give connection information for all tables.
  • Give connection information for Stored Procedure also.
  • Run workflow and see the result in table.
2. Unconnected Stored Procedure T/F :
An unconnected Stored Procedure transformation is not directly connected to the flow of data through the mapping. Instead, the stored procedure runs either:
  • From an expression: Called from an expression transformation.
  • Pre- or post-session: Runs before or after a session.
Method of returning the value of output parameters to a port:
  • Assign the output value to a local variable.
  • Assign the output value to the system variable PROC_RESULT. (See Later)
Example 1: DEPTNO as input and get MAX of Sal as output.
  • DEPT will be source table.
  • Create a target table with fields DEPTNO and MAX_SAL of decimal data type.
  • Write the stored procedure in the database first and create shortcuts as needed.
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping-> Create-> Give name. Ex: m_sp_unconn_1_value
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_unconn_1_value from the list. Click OK.
6. Stored Procedure has been imported.
7. T/F -> Create Expression T/F. Pass DEPTNO from SQ_DEPT to Expression T/F.
8. Edit expression and create an output port OUT_MAX_SAL of decimal data type.
9. Open the Expression Editor and call the stored procedure as :SP.SP_UNCONN_1_VALUE(DEPTNO, PROC_RESULT). Click OK and connect the port from the expression to the target.
10. Mapping -> Validate
11. Repository Save.
  • Create Session and then workflow.
  • Give connection information for all tables.
  • Give connection information for Stored Procedure also.
  • Run workflow and see the result in table.
PROC_RESULT use:
  • If the stored procedure returns a single output parameter or a return value, we use the reserved variable PROC_RESULT as the output variable.
Example: DEPTNO as input and MAX of Sal as output:
:SP.SP_UNCONN_1_VALUE(DEPTNO,PROC_RESULT)
  • If the stored procedure returns multiple output parameters, you must create variables for each output parameter.
Example: DEPTNO as Input and MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL
as output then:
1. Create four variable ports in the expression: VAR_MAX_SAL, VAR_MIN_SAL, VAR_AVG_SAL and VAR_SUM_SAL.
2. Create four output ports in expression OUT_MAX_SAL, OUT_MIN_SAL, OUT_AVG_SAL and OUT_SUM_SAL.
3. Call the procedure in the last variable port, VAR_SUM_SAL:
:SP.SP_AGG (DEPTNO, VAR_MAX_SAL,VAR_MIN_SAL, VAR_AVG_SAL, PROC_RESULT)
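Conceptually, the variable ports receive the first three OUT parameters and PROC_RESULT captures the last one, much like tuple unpacking in Python. This is a loose analogy with made-up stub values, not how PowerCenter executes the call.

```python
def sp_agg(deptno):
    # Stand-in stub for the database procedure; fixed illustrative values.
    return 2450, 800, 1625.0, 3250   # max_sal, min_sal, avg_sal, sum_sal

# VAR_MAX_SAL, VAR_MIN_SAL and VAR_AVG_SAL take the first three outputs;
# the port holding the call itself receives the last one via PROC_RESULT.
var_max_sal, var_min_sal, var_avg_sal, proc_result = sp_agg(10)
print(proc_result)  # 3250 (the value PROC_RESULT would carry: SUM_SAL)
```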
Example 2:
DEPTNO as input and MAX_SAL, MIN_SAL, AVG_SAL and SUM_SAL as output, with a stored procedure to drop the index in Pre-load of the Target and a stored procedure to create the index in Post-load of the Target.
  • DEPT will be the source table. Create a target table SP_UNCONN_EXAMPLE with fields DEPTNO, MAX_SAL, MIN_SAL, AVG_SAL & SUM_SAL.
  • Write the stored procedures in the database first and create shortcuts as needed. Make sure to create the target table first. The procedures to drop and create the index on the target, to be created in the target database, are given below:
CREATE OR REPLACE PROCEDURE CREATE_INDEX
AS
BEGIN
  EXECUTE IMMEDIATE 'create index unconn_dept on SP_UNCONN_EXAMPLE(DEPTNO)';
END;
/

CREATE OR REPLACE PROCEDURE DROP_INDEX
AS
BEGIN
  EXECUTE IMMEDIATE 'drop index unconn_dept';
END;
/
Creating Mapping:
1. Open folder where we want to create the mapping.
2. Click Tools -> Mapping Designer.
3. Click Mapping -> Create -> Give name. Ex: m_sp_unconn_example
4. Drag DEPT and Target table.
5. Transformation -> Import Stored Procedure -> Give Database Connection -> Connect -> Select the procedure sp_agg from the list. Click OK.
6. Stored Procedure has been imported.
7. T/F -> Create Expression T/F. Pass DEPTNO from SQ_DEPT to Expression T/F.
8. Edit Expression and create 4 variable ports and 4 output ports as shown below:
9. Call the procedure in last variable port VAR_SUM_SAL.
10. :SP.SP_AGG (DEPTNO, VAR_MAX_SAL, VAR_MIN_SAL, VAR_AVG_SAL, PROC_RESULT)
11. Click Apply and Ok.
12. Connect to target table as needed.
13. Transformation -> Import Stored Procedure -> Give Database Connection for target -> Connect -> Select the procedure CREATE_INDEX and DROP_INDEX from the list. Click OK.
14. Edit DROP_INDEX -> Properties Tab -> Select Target Pre Load as Stored Procedure Type and in call text write drop_index. Click Apply -> Ok.
15. Edit CREATE_INDEX -> Properties Tab -> Select Target Post Load as Stored Procedure Type and in call text write create_index. Click Apply -> Ok.
16. Mapping -> Validate
17. Repository -> Save
  • Create Session and then workflow.
  • Give connection information for all tables.
  • Give connection information for Stored Procedures also.
  • Also make sure to execute the procedure CREATE_INDEX on the database before using these procedures in the mapping. If there is no index on the target table, DROP_INDEX will fail, and the session will fail too.
  • Run workflow and see the result in table.