Registration

Dear SAP Community Member,
In order to fully benefit from what the SAP Community has to offer, please register at:
http://scn.sap.com
Thank you,
The SAP Community team.
Skip to end of metadata
Go to start of metadata
Summary

The dataflow as created by Index Designer supports a full load only. But if your Data Warehouse has some kind of indicator what row was changed recently, we could load only these.

Index Designer cannot create a dataflow that loads the changed rows only as the application has no idea how changed records are to be identified. But DataServices is an ETL tool, it is built for these kinds of operations. For simplicity, let us assume each source table has a date column called last_loaded. In that case all we have to do is replicating the dataflow doing the initial load and in the query, add a where clause.


With the replicate we get a new dataflow called "Copy_1_...." which is now independent from the original one. Inside the dataflow we modify the where clause...


...a voila, we have a delta load dataflow. The way BWA is built is that each table has a primary key. If a record arrives in BWA it is checked if that primary key exists already or not. If it does, the record overwrites the old one, otherwise it is a new record. Hence no need for any additional logic.

All that is left now is to make sure the job does the right thing in initial or delta. We open the job, in the top toolbar click on the variable and parameter icon and create two more global variables (you need to be in the job to do that!), one is called $SDATE and that is the date where the delta should be started from. And $G_LOAD_TYPE so we can decide if the job is an FIRST or DELTA load job. Note the datatypes of the variables.


Our job should not run the load_wf always, it should call either this or its delta version depending on the $G_LOAD_TYPE variable. So we remove it from the job and hence it is not called anymore and add a Conditional object instead.


Once this conditional is connected we can define the condition $G_LOAD_TYPE = 'FIRST' and from the object library drag the load_wf into the "Then" workspace. If the job runs in initial load mode, then the original workflow should be called.


For the "Else" area we create a new Workflow by clicking on the WF icon on the right hand tool palette and inside the the workflow the new delta dataflow is dragged into. Similar procedure for all other dataflows, we create delta load versions for them and drag them into this delta WorkFlow, and if one source table happens to have no delta indicator, then we simply drg the initial load version of the dataflow into the delta WorkFlow as well.


Some beautifications later, like changing the names of the created objects, entering descriptions, setting the job defaults for the global variables,.... we are done, well almost. The truncate script at the beginning should be called only for a FIRST load, shouldn't it?


Actually some more things need to be done for a perfect job, the $SDATE should be set automatically to the last time the delta load was successful so it picks up just there. And the entire job structure will probably be built according to the ETL Project Guidelines. But for now that shall be enough, we execute the job and set the global variables in the execution dialog to 'DELTA' and "sysdate()-1" and run it.


 


  • No labels