Power BI: Merge Query with "M"
Introduction
Normally, creating a merge query in Power BI is effectively a two or three-step process. One must create the desired source queries and then merge them together with a merge step. However, it is possible to create a single query that performs the desired merge.
This article walks through the "normal" process of creating a merge query and then shows how the same thing can be achieved using a single query.
Start out with two CSV files
one.csv
Column1,Column2
one,10
two,20
three,30
four,40
five,50
six,60
seven,70
eight,80
nine,90
two.csv
Column1,Column2
one,100
two,200
three,300
four,400
five,500
six,600
seven,700
eight,800
nine,900
Default process
The "normal" process is to create a query for both one.csv and two.csv, then edit one.csv query and add a merge step to merge two.csv query. With this method, you wind up with two queries. The two.csv query is in effect an intermediate query. In the Advanced editor, the code looks like:
two.csv
let
Source1 = Csv.Document(File.Contents("C:\Temp\two.csv"),[Delimiter=",",Encoding=1252]),
#"Promoted Headers1" = Table.PromoteHeaders(Source1),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers1",{{"Column1", type text}, {"Column2", Int64.Type}})
in
#"Changed Type1"
one.csv
let
Source = Csv.Document(File.Contents("C:\Temp\one.csv"),[Delimiter=",",Encoding=1252]),
#"Promoted Headers" = Table.PromoteHeaders(Source),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Column1", type text}, {"Column2", Int64.Type}}),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Column1"},two,{"Column1"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Column2"}, {"NewColumn.Column2"})
in
#"Expanded NewColumn"
Merging queries
However, you can use the code from these queries to create a blank query that looks like this:
merge query
let
Source = Csv.Document(File.Contents("C:\Temp\one.csv"),[Delimiter=",",Encoding=1252]),
Source1 = Csv.Document(File.Contents("C:\Temp\two.csv"),[Delimiter=",",Encoding=1252]),
#"Promoted Headers" = Table.PromoteHeaders(Source),
#"Promoted Headers1" = Table.PromoteHeaders(Source1),
#"Changed Type" = Table.TransformColumnTypes(#"Promoted Headers",{{"Column1", type text}, {"Column2", Int64.Type}}),
#"Changed Type1" = Table.TransformColumnTypes(#"Promoted Headers1",{{"Column1", type text}, {"Column2", Int64.Type}})
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Column1"},#"Changed Type1",{"Column1"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(#"Merged Queries", "NewColumn", {"Column2"}, {"NewColumn.Column2"})
in
#"Expanded NewColumn"
Now you can get rid of one.csv and two.csv queries and have a single query that does it all in one query!
The important aspects to point out here is that while the Power Query interface generally only allows you to have a single source, by editing the code directly, a single query can actually have multiple sources as long as you make sure to assign a unique name to each source and refer to that unique source name in subsequent steps. To edit the code directly, use View | Advanced Editor from the query editing menu.