SSIS Merge with Duplicate Rows

The Merge component in SSIS will take two sorted sources and union them while maintaining the original sort order. The question arises, what about duplicate records. These duplicates do not get eliminated.

Here are two tables with the ID 5 and the name Brian repeated on each.

imageimage
 
Here is the layout of the data flow. Both sources are using the same query with the different table names.

Select ID,Name
From Merge1
Order by ID

Select ID,Name
From Merge2
Order by ID

image
 
 
Just having the sources sorted does not tell the Merge they are sorted. You have to adjust the Sort property of the sources.

To do this right click on the sources and open the advanced editor. Click on the  Input/Output Properties tab and click on the OLEDB Source output. Then change the IsSorted property to True on the output. Click the plus next to the OLEDB Source output and select the ID column and change the SortKeyPosition to 1.
 
 image

image
 
 image

Then connect them both to the Merge.

I used a terminator destination from Task Factory, this is an add on from Pragmaticworks.com.
 
 image
 
I added a data viewer between the merge and the terminator destination by righting clicking on the green data flow line and selecting data viewer.
 

image
 
Execute the package and look at the data viewer. Notice the column with the ID of 5 is repeated. The Merge does not remove rows.

image

Advertisement
This entry was posted in SSIS, Syndication and tagged , . Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s