Slowly Changing Dimensions (SCDs) are dimensions that
have data that changes slowly, rather than changing on a time-based, regular
schedule
For example, you may have a dimension in your
database that tracks the sales records of your company's salespeople. Creating
sales reports seems simple enough, until a salesperson is transferred from one
regional office to another. How do you record such a change in your sales
dimension?
You could sum or average the sales by
salesperson, but if you use that to compare the performance of salesmen, that
might give misleading information. If the salesperson that was transferred used
to work in a hot market where sales were easy, and now works in a market where
sales are infrequent, her totals will look much stronger than the other
salespeople in her new region, even if they are just as good. Or you could
create a second salesperson record and treat the transferred person as a new
sales person, but that creates problems also.
Dealing with these issues involves SCD
management methodologies:
Type 1:
The Type 1 methodology overwrites old
data with new data, and therefore does not track historical data at all. This
is most appropriate when correcting certain types of data errors, such as the
spelling of a name. (Assuming you won't ever need to know how it used to be
misspelled in the past.)
Here is an example of a database table that
keeps supplier information:
Supplier_Key
|
Supplier_Code
|
Supplier_Name
|
Supplier_State
|
123
|
ABC
|
Acme Supply Co
|
CA
|
In this example, Supplier_Code is the natural
key and Supplier_Key is a surrogate
key. Technically, the surrogate key is not necessary, since the
table will be unique by the natural key (Supplier_Code). However, the joins
will perform better on an integer than on a character string.
Now imagine that this supplier moves their
headquarters to Illinois. The updated table would simply overwrite this record:
Supplier_Key
|
Supplier_Code
|
Supplier_Name
|
Supplier_State
|
123
|
ABC
|
Acme Supply Co
|
IL
|
The obvious disadvantage to this method of
managing SCDs is that there is no historical record kept in the data warehouse.
You can't tell if your suppliers are tending to move to the Midwest, for
example. But an advantage to Type 1 SCDs is that they are very easy to
maintain.
Explanation with an Example:
Source Table: (01-01-11) Target Table:
(01-01-11)
Emp no
|
Ename
|
Sal
|
101
|
A
|
1000
|
102
|
B
|
2000
|
103
|
C
|
3000
|
Emp no
|
Ename
|
Sal
|
101
|
A
|
1000
|
102
|
B
|
2000
|
103
|
C
|
3000
|
The necessity of the lookup transformation is
illustrated using the above source and target table.
Source Table: (01-02-11) Target Table:
(01-02-11)
Emp no
|
Ename
|
Sal
|
Empno
|
Ename
|
Sal
|
|
101
|
A
|
1000
|
101
|
A
|
1000
|
|
102
|
B
|
2500
|
102
|
B
|
2500
|
|
103
|
C
|
3000
|
103
|
C
|
3000
|
|
104
|
D
|
4000
|
104
|
D
|
4000
|
- In
the second Month we have one more employee added up to the table with the
Ename D and salary of the Employee is changed to the 2500 instead of 2000.
Step 1: Is to import Source Table and Target table.
- Create
a table by name emp_source with three columns as shown above in oracle.
- Import
the source from the source analyzer.
- In
the same way as above create two target tables with the names emp_target1,
emp_target2.
- Go
to the targets Menu and click on generate and execute to confirm the
creation of the target tables.
- The
snap shot of the connections using different kinds of transformations are
shown below.
Step 2: Design the mapping and apply the necessary
transformation.
- Here
in this transformation we are about to use four kinds of transformations
namely Lookup transformation, Expression Transformation, Filter
Transformation, Update Transformation. Necessity and the usage of all the
transformations will be discussed in detail below.
Look up Transformation: The purpose of this
transformation is to determine whether to insert, Delete, Update or reject the
rows in to target table.
- The
first thing that we are goanna do is to create a look up transformation
and connect the Empno from the source qualifier to the transformation.
- The
snapshot of choosing the Target table is shown below.
- What
Lookup transformation does in our mapping is it looks in to the target
table (emp_table) and compares it with the Source Qualifier and determines
whether to insert, update, delete or reject rows.
- In
the Ports tab we should add a new column and name it as empno1 and this is
column for which we are gonna connect from the Source Qualifier.
- The
Input Port for the first column should be unchked where as the other ports
like Output and lookup box should be checked. For the newly created column
only input and output boxes should be checked.
- In
the Properties tab (i) Lookup table name ->Emp_Target.
(ii)Look up Policy on Multiple Mismatch ->
use First Value.
(iii) Connection Information ->Oracle.
- In
the Conditions tab (i) Click on Add a new condition
(ii)Lookup Table Column should be Empno,
Transformation port should be Empno1 and Operator should ‘=’.
Expression Transformation: After we are done with
the Lookup Transformation we are using an expression transformation to check
whether we need to insert the records the same records or we need to update the
records. The steps to create an Expression Transformation are shown below.
- Drag
all the columns from both the source and the look up transformation and
drop them all on to the Expression transformation.
- Now
double click on the Transformation and go to the Ports tab and create two
new columns and name it as insert and update. Both these columns are gonna
be our output data so we need to have check mark only in front of the
Output check box.
- The
Snap shot for the Edit transformation window is shown below.
- The
condition that we want to parse through our output data are listed below.
Input à IsNull(EMPNO1)
Output à iif(Not isnull (EMPNO1) and
Decode(SAL,SAL1,1,0)=0,1,0) .
- We
are all done here .Click on apply and then OK.
Filter Transformation: we are gonna have
two filter transformations one to insert and other to update.
- Connect
the Insert column from the expression transformation to the insert column
in the first filter transformation and in the same way we are gonna
connect the update column in the expression transformation to the update
column in the second filter.
- Later
now connect the Empno, Ename, Sal from the expression transformation to
both filter transformation.
- If
there is no change in input data then filter transformation 1 forwards the
complete input to update strategy transformation 1 and same output is
gonna appear in the target table.
- If
there is any change in input data then filter transformation 2 forwards
the complete input to the update strategy transformation 2 then it is
gonna forward the updated input to the target table.
- Go
to the Properties tab on the Edit transformation
(i) The value for the filter condition 1 is
Insert.
(ii) The value for the filter condition 1 is
Update.
- The
Closer view of the filter Connection is shown below.
Update Strategy Transformation: Determines whether to
insert, delete, update or reject the rows.
- Drag
the respective Empno, Ename and Sal from the filter transformations and
drop them on the respective Update Strategy Transformation.
- Now
go to the Properties tab and the value for the update strategy expression
is 0 (on the 1st update transformation).
- Now
go to the Properties tab and the value for the update strategy expression
is 1 (on the 2nd update transformation).
- We
are all set here finally connect the outputs of the update transformations
to the target table.
Step 3: Create the task and Run the work flow.
- Don’t
check the truncate table option.
- Change
Bulk to the Normal.
- Run
the work flow from task.
Step 4: Preview the Output in the target table.
No comments:
Post a Comment