Share via


SSIS: Lookup vs. Merge Join

In many interviews you will be seeing interview asking what is difference between merge join and lookup component.

Most of the people think Merge and lookup are same, well if that was the case than Microsoft would have not created two separate component.

Lookup and Merge join are two different activities they perform in different way and are meant of different purposes.  

Area Merge Join Lookup
Work function Joins two data set based on one or more key columns. This joining is just like T SQL join (actually a hash join) It reference data for a one and only one possible match of the lookup table (or query)
Input
  • Input needs to be sorted
  • Need 2 input source
  • Input need not to be sorted
  • Need one input
Output Only one output with merging (joining) of two input data sets Two outputs Matched and non matched
On Match Matched rows will flow into output pipeline. All matching rows of both the source will be flown
Match rows will into Matched output pipeline. Here if multiple matching rows are there in lookup table than only first matched row from lookup table will be selected.
On Non Match It gets define by merge join settings-
  • Inner  - In case of inner join, non matched rows will be discarded
  • Left – In cases on left join, non matched rows of left of sources (first input) will be considered and other source non matched rows will be discarded.
  • Right - In cases on right join, non matched rows of right of sources (second input) will be considered and other source non matched rows will be discarded.
  • Full – both source non matched rows will be considered in output
Non match setting is available, when enabled-
  • Only non match rows of source will be considered and will be flown in output.
  • Lookup tables’s non matched out put is will not be considered at all.


Null Values You have setting to specify if NULL to treated as equal. This will treat NULL value equal to other NULL value.  Else it will discard the row as non matched one. Null values in source will match to null values in lookup if both tables has null, else of only source has null it will return an error. Null in lookup table will not have any impact.
Usage When two data set needs to be joined (left, right, full) When data needs to referenced like get PK from master tables based on name.  
Specific requisite Input needed to be sorted NA

I hope above comparison will help you understand the difference between SSIS Lookup and Merge join.