SSIS: Lookup vs. Merge Join
In many interviews you will be seeing interview asking what is difference between merge join and lookup component.
Most of the people think Merge and lookup are same, well if that was the case than Microsoft would have not created two separate component.
Lookup and Merge join are two different activities they perform in different way and are meant of different purposes.
Area | Merge Join | Lookup |
Work function | Joins two data set based on one or more key columns. This joining is just like T SQL join (actually a hash join) | It reference data for a one and only one possible match of the lookup table (or query) |
Input |
|
|
Output | Only one output with merging (joining) of two input data sets | Two outputs Matched and non matched |
On Match | Matched rows will flow into output pipeline. All matching rows of both the source will be flown
|
Match rows will into Matched output pipeline. Here if multiple matching rows are there in lookup table than only first matched row from lookup table will be selected. |
On Non Match | It gets define by merge join settings-
|
Non match setting is available, when enabled-
|
Null Values | You have setting to specify if NULL to treated as equal. This will treat NULL value equal to other NULL value. Else it will discard the row as non matched one. | Null values in source will match to null values in lookup if both tables has null, else of only source has null it will return an error. Null in lookup table will not have any impact. |
Usage | When two data set needs to be joined (left, right, full) | When data needs to referenced like get PK from master tables based on name. |
Specific requisite | Input needed to be sorted | NA |
I hope above comparison will help you understand the difference between SSIS Lookup and Merge join.