Updated March 15, 2023
Definition of SSIS Incremental Load
SSIS uses incremental loads to keep the data in two systems synchronized. They apply when source data is fed to the destination on a recurring schedule, such as a set time every morning or night, or throughout the day. A typical incremental load uses a lookup to detect changes against a staging table, subject to a few restrictions on the OLE DB and ADO.NET connection types. The trade-off is that the package takes some time to configure and edit. This article gives a brief overview of the incremental load in SSIS.
What is SSIS incremental load?
SSIS provides a few reliable components that let developers build a well-performing incremental load in a few steps. An incremental load is a standard process in which only recently added or changed data from the source is fed to the destination; rows that already match are ignored. In SSIS, the incremental load keeps two systems synchronized and up to date, and it is used when source data is fed to the destination on a periodic schedule. It is commonly implemented as an upsert: existing rows are updated and new rows are inserted.
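The upsert idea described above can be sketched outside SSIS in a few lines of Python. This is only an illustration of the logic, not SSIS code: the function name `incremental_load`, the `id` key, and the sample rows are all hypothetical.

```python
def incremental_load(source_rows, target, key="id"):
    """Upsert source_rows into target (a dict keyed by `key`) and return counts."""
    inserted = updated = skipped = 0
    for row in source_rows:
        existing = target.get(row[key])
        if existing is None:
            # New record: not present in the destination yet.
            target[row[key]] = dict(row)
            inserted += 1
        elif existing != row:
            # Existing record whose values changed in the source.
            target[row[key]] = dict(row)
            updated += 1
        else:
            # Matching data is ignored.
            skipped += 1
    return {"inserted": inserted, "updated": updated, "skipped": skipped}


# Hypothetical destination and source data.
target = {1: {"id": 1, "amount": 100}, 2: {"id": 2, "amount": 250}}
source = [
    {"id": 1, "amount": 100},  # unchanged
    {"id": 2, "amount": 300},  # changed
    {"id": 3, "amount": 75},   # new
]

counts = incremental_load(source, target)
print(counts)  # {'inserted': 1, 'updated': 1, 'skipped': 1}
```

Only one row is inserted and one updated; the unchanged row is never rewritten, which is the point of loading incrementally.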
SSIS incremental load configuration
This section shows how to perform an incremental load in SSIS by comparing the source data with the target table. If the source contains a new record, it must be inserted into the target table. Likewise, if the source holds a newer value for an existing record, that record must be updated in the target table. Typical daily data includes sales amounts, labor attendance, gain percentages, and so on.
In the toolbox, choose the Data Flow Task, drag and drop it onto the control flow region, and give it a meaningful name. Double-click it to open the Data Flow tab.
Next, drag and drop an OLE DB source into the data flow area. Double-click it to open its editor on the connection manager page.
Open the Columns tab to review the columns and uncheck any that are not required. Then close the editor and add a Lookup transformation to the data flow area.
Double-click the Lookup transformation to configure the reference table. On the General tab, choose how rows with no matching entries are handled: change the setting from the default, Fail component, to Ignore failure.
On the Connection tab, pick an existing OLE DB connection manager from the list, or click New to configure one. Because only one column is looked up here, use a SQL query rather than the whole table as the lookup reference.
On the Columns tab, link the input column to the lookup column to join the two datasets. Then drop a Conditional Split transformation in the data flow region and connect the Lookup match output to it as its input.
Double-click the Conditional Split to define suitable conditions. Then add an OLE DB Command transformation and an OLE DB destination, and connect the Conditional Split's output arrows to them. Each time an arrow is connected, a pop-up asks which output to use: choose the update output for the OLE DB Command and the insert output for the OLE DB destination. The OLE DB Command updates the table data by running an SQL UPDATE statement.
In the OLE DB destination, choose the OLE DB connection manager and select the target table (here, the CDC target) to receive the new records.
Open the Mappings page to confirm that the source and destination columns are mapped correctly. Then open the OLE DB Command transformation, which is configured through the Advanced Editor, and enter the SQL UPDATE statement in the component. Click OK to complete the configuration of the SSIS package for the incremental load, then execute it.
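The data flow configured above (lookup, conditional split, insert and update) can be approximated in plain Python and SQLite to show what each component contributes. This is a rough sketch under stated assumptions, not how SSIS runs internally: the tables `source_sales` and `target_sales`, their columns, and the sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_sales (sale_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE target_sales (sale_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO source_sales VALUES (1, 100.0), (2, 300.0), (3, 75.0);
    INSERT INTO target_sales VALUES (1, 100.0), (2, 250.0);
""")

# Lookup with "Ignore failure": every source row passes through, and rows
# without a match carry NULL in the single looked-up column.
rows = conn.execute("""
    SELECT s.sale_id, s.amount, t.sale_id AS lkp_sale_id
    FROM source_sales AS s
    LEFT JOIN target_sales AS t ON t.sale_id = s.sale_id
""").fetchall()

# Conditional split: NULL lookup column -> insert path, otherwise update path.
inserts = [(sale_id, amount) for sale_id, amount, lkp in rows if lkp is None]
updates = [(amount, sale_id) for sale_id, amount, lkp in rows if lkp is not None]

# OLE DB destination equivalent: plain inserts of the new rows.
conn.executemany(
    "INSERT INTO target_sales (sale_id, amount) VALUES (?, ?)", inserts
)
# OLE DB Command equivalent: one parameterized UPDATE per matched row.
conn.executemany(
    "UPDATE target_sales SET amount = ? WHERE sale_id = ?", updates
)
conn.commit()

final_rows = conn.execute(
    "SELECT sale_id, amount FROM target_sales ORDER BY sale_id"
).fetchall()
print(final_rows)  # [(1, 100.0), (2, 300.0), (3, 75.0)]
```

Because only the key column is looked up, every matched row is updated, even when its values have not changed. Looking up the value columns as well and comparing them in the split condition would avoid those redundant updates.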
SSIS incremental load DateTime
Change data capture (CDC) is one method for performing an incremental load, but it requires a source database that supports CDC. If the data source does not support CDC, look for a timestamp or DateTime column in the source data set: the source system should record a modified DateTime on each table record. The idea behind this DateTime approach is to store the time of the most recent ETL run in a log table or config table, so that the next run picks up only the records modified since then. This produces a changeset containing new records and updated versions of existing records. These records can be loaded into the staging tables and then applied to the fact table if required.
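The DateTime approach can be sketched as follows. It is a minimal illustration using SQLite, assuming a hypothetical `source_orders` table with a `modified_at` column and a hypothetical `etl_log` table that holds the last run time per job; none of these names come from SSIS itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE source_orders (
        order_id INTEGER PRIMARY KEY, amount REAL, modified_at TEXT
    );
    CREATE TABLE etl_log (job_name TEXT PRIMARY KEY, last_run_at TEXT);
    INSERT INTO source_orders VALUES
        (1, 100.0, '2023-03-13 08:00:00'),
        (2, 250.0, '2023-03-14 21:30:00'),
        (3,  75.0, '2023-03-15 06:45:00');
    INSERT INTO etl_log VALUES ('orders_load', '2023-03-14 00:00:00');
""")


def extract_changes(conn, job_name):
    """Return rows modified since the stored run time, then advance it."""
    (last_run_at,) = conn.execute(
        "SELECT last_run_at FROM etl_log WHERE job_name = ?", (job_name,)
    ).fetchone()
    # ISO-formatted timestamps compare correctly as text.
    changes = conn.execute(
        "SELECT order_id, amount, modified_at FROM source_orders "
        "WHERE modified_at > ? ORDER BY modified_at",
        (last_run_at,),
    ).fetchall()
    if changes:
        # Advance the stored value to the newest timestamp actually loaded.
        conn.execute(
            "UPDATE etl_log SET last_run_at = ? WHERE job_name = ?",
            (changes[-1][2], job_name),
        )
        conn.commit()
    return changes


first_run = extract_changes(conn, "orders_load")
second_run = extract_changes(conn, "orders_load")
print([row[0] for row in first_run])  # [2, 3]
print(second_run)                     # []
```

The sketch stores the newest `modified_at` value it loaded rather than the wall-clock time of the run. That is a design choice made here, intended to avoid skipping rows that are modified while the load is in progress.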
Step-by-step example transaction
First, define the design of the ETL solution.
Use a Flat File source to retrieve the data from the source flat file.
Then add a Data Conversion transformation to change the source columns' data types to match the target table.
Then use a Lookup transformation, which provides a match output and a no-match output. Rows with no match in the destination go to the no-match output; rows that match between source and destination go to the match output.
To insert new records, use an OLE DB destination; to update rows, use an OLE DB Command with an UPDATE statement; and to delete rows, use an OLE DB Command with a DELETE statement. If the reference data is small, select full cache mode, in which the entire reference set is loaded into memory before the data flow runs, so the database is not queried for each row. Choose the option to redirect rows to the no-match output so that rows with no match at the destination are routed there. On the Connection tab, a SQL query over the reference table supplies the data that the lookup matches against.
Column mapping is used to map each source column to its destination column.
The output name and default output name should match the destination column.
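The steps above can be mirrored in a short Python sketch: read a flat file, convert data types, look each row up against the target, and route it to insert, update, or delete. The file contents, the column names `emp_id` and `hours`, and the in-memory `target` dictionary are hypothetical. The delete step additionally assumes that the flat file is a full snapshot of the source, which the steps above do not state.

```python
import csv
import io

# Flat file source (an in-memory stand-in for a real file).
flat_file = io.StringIO("emp_id,hours\n1,8\n2,7.5\n4,6\n")

# Data conversion: the CSV reader yields strings, the target holds numbers.
source = {
    int(row["emp_id"]): float(row["hours"])
    for row in csv.DictReader(flat_file)
}

# Target table, held fully in memory, similar in spirit to full cache mode.
target = {1: 8.0, 2: 6.0, 3: 8.0}

# Lookup: the no-match output feeds inserts, the match output feeds updates.
to_insert = {k: v for k, v in source.items() if k not in target}
to_update = {k: v for k, v in source.items() if k in target and target[k] != v}
# Assumption: rows missing from a full source snapshot were deleted upstream.
to_delete = [k for k in target if k not in source]

target.update(to_insert)   # OLE DB destination equivalent
target.update(to_update)   # UPDATE command equivalent
for k in to_delete:        # DELETE command equivalent
    del target[k]

print(sorted(target.items()))  # [(1, 8.0), (2, 7.5), (4, 6.0)]
```

If the flat file carries only the day's changes rather than a full snapshot, the delete step should be dropped, since a missing key would no longer mean the row was removed from the source.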
Conclusion
An incremental load in SSIS can therefore be configured and implemented to keep two systems or data tables synchronized. Because only new and changed rows are processed, no data is lost and the load typically performs better than reloading everything.