Active and connected
transformation
The Rank
transformation allows us to select only the top or bottom rank of data. It
Allows us to select a group of top or bottom values, not just one value.
During the session,
the Power Center Server caches input data until it can perform The rank
calculations.
Rank Transformation
Properties :
- Cache Directory where cache will
be made.
- Top/Bottom Rank as per need
- Number of Ranks Ex: 1, 2 or any
number
- Case Sensitive Comparison can be
checked if needed
- Rank Data Cache Size can be set
- Rank Index Cache Size can be set
Ports in a Rank
Transformation :
Ports
|
Number Required
|
Description
|
I
|
1 Minimum
|
Port to receive
data from another transformation.
|
O
|
1 Minimum
|
Port we want to
pass to other transformation.
|
V
|
not needed
|
can use to store
values or calculations to use in an expression.
|
R
|
Only 1
|
Rank port. Rank is
calculated according to it. The Rank port is an input/output port. We must
link the Rank port to another transformation. Example: Total Salary
|
Rank Index
The Designer
automatically creates a RANKINDEX port for each Rank transformation. The Power
Center Server uses the Rank Index port to store the ranking position for Each
row in a group.
For example, if we create a Rank
transformation that ranks the top five salaried employees, the rank index
numbers the employees from 1 to 5.
- The RANKINDEX is an output port
only.
- We can pass the rank index to
another transformation in the mapping or directly to a target.
- We cannot delete or edit it.
Defining Groups
Rank transformation
allows us to group information. For example: If we want to select the top 3
salaried employees of each Department, we can define a group for Department.
- By defining groups, we create one
set of ranked rows for each group.
- We define a group in Ports tab.
Click the Group By for needed port.
- We cannot Group By on port which
is also Rank Port.
1) Example:
Finding Top 5 Salaried Employees
- EMP will be source table.
- Create a target table
EMP_RANK_EXAMPLE in target designer. Structure should be same as EMP
table. Just add one more port Rank_Index to store RANK INDEX.
- Create the shortcuts in your
folder.
Creating
Mapping:
1. Open folder where we
want to create the mapping.
2. Click Tools ->
Mapping Designer.
3. Click Mapping->
Create-> Give mapping name. Ex: m_rank_example
4. Drag EMP from source
in mapping.
5. Create an EXPRESSION
transformation to calculate TOTAL_SAL.
6. Click Transformation
-> Create -> Select RANK from list. Give name and click Create. Now click
done.
7. Pass ports from
Expression to Rank Transformation.
8. Edit Rank
Transformation. Go to Ports Tab
9. Select TOTAL_SAL as
rank port. Check R type in front of TOTAL_SAL.
10.
Click
Properties Tab and Select Properties as needed.
11. Top in Top/Bottom and
Number of Ranks as 5.
12.Click Apply -> Ok.
13.Drag target table
now.
14.Connect the output
ports from Rank to target table.
15. Click Mapping ->
Validate
16.Repository -> Save
- Create Session and Workflow as
described earlier. Run the Workflow and see the data in target table.
- Make sure to give connection
information for all tables.
2) Example: Finding
Top 2 Salaried Employees for every DEPARTMENT
- Open the mapping made above. Edit
Rank Transformation.
- Go to Ports Tab. Select Group By
for DEPTNO.
- Go to Properties tab. Set Number
of Ranks as 2.
- Click Apply -> Ok.
- Mapping -> Validate and
Repository Save.
Refresh the session
by double clicking. Save the changed and run workflow to see the new result.
RANK CACHE
Sample Rank Mapping
When the Power Center
Server runs a session with a Rank transformation, it compares an input row with
rows in the data cache. If the input row out-ranks a Stored row, the Power
Center Server replaces the stored row with the input row.
Example: Power Center caches
the first 5 rows if we are finding top 5 salaried Employees. When 6th
row is read, it compares it with 5 rows in cache and places it in Cache is
needed.
1) RANK INDEX
CACHE:
The index cache holds
group information from the group by ports. If we are Using Group By on DEPTNO,
then this cache stores values 10, 20, 30 etc.
- All Group By Columns are in RANK
INDEX CACHE. Ex. DEPTNO
2) RANK DATA
CACHE:
It holds row data
until the Power Center Server completes the ranking and is Generally larger
than the index cache. To reduce the data cache size, connect Only the necessary
input/output ports to subsequent transformations.
- All Variable ports if there, Rank
Port, All ports going out from RANK Transformations are stored in RANK
DATA CACHE.
- Example: All ports
except DEPTNO In our mapping example.
No comments:
Post a Comment