Join hints (Transact-SQL)
Applies to:
SQL Server
Azure SQL Database
Azure SQL Managed Instance
SQL analytics endpoint in Microsoft Fabric
Warehouse in Microsoft Fabric
SQL database in Microsoft Fabric
Join hints specify that the query optimizer enforce a join strategy between two tables in SQL Server. For general information about joins and join syntax, see FROM clause plus JOIN, APPLY, PIVOT.
Caution
Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that hints be used only as a last resort by experienced developers and database administrators.
Applies to
Transact-SQL syntax conventions
Syntax
<join_hint> ::=
{ LOOP | HASH | MERGE | REMOTE | REDUCE | REPLICATE | REDISTRIBUTE [(columns count)]}
Arguments
{ LOOP | HASH | MERGE }
Applies to: Azure SQL Database, Azure SQL Managed Instance, SQL analytics endpoint, SQL database in Microsoft Fabric, Microsoft Fabric Warehouse
Specifies that the join in the query should use looping, hashing, or merging. Using LOOP
, HASH
, or MERGE JOIN
enforces a particular join between two tables. LOOP
can't be specified together with RIGHT
or FULL
as a join type. For more information, see Joins.
REMOTE
Applies to: Azure SQL Database, Azure SQL Managed Instance, SQL analytics endpoint, SQL database in Microsoft Fabric
Specifies that the join operation is performed on the site of the right table. This is useful when the left table is a local table and the right table is a remote table. REMOTE
should be used only when the left table has fewer rows than the right table.
If the right table is local, the join is performed locally. If both tables are remote but from different data sources, REMOTE
causes the join to be performed on the site of the right table. If both tables are remote tables from the same data source, REMOTE
isn't required.
REMOTE
can't be used when one of the values being compared in the join predicate is cast to a different collation using the COLLATE
clause.
REMOTE
can be used only for INNER JOIN
operations.
REDUCE
Applies to: Azure Synapse Analytics and Analytics Platform System (PDW)
Reduces the number of rows to be moved for the table on the right side of the join in order to make two distribution incompatible tables compatible. The REDUCE hint is also called a semi-join hint.
REPLICATE
Applies to: Azure Synapse Analytics, Analytics Platform System (PDW), Microsoft Fabric Warehouse
Causes a broadcast move operation, where a specific table to be replicated across all distribution nodes.
- Using
REPLICATE
with aINNER
orLEFT
join, the broadcast move operation will replicate the right side of the join to all nodes. - Similarly, while using
REPLICATE
with aRIGHT
join, the broadcast move operation will replicate the left side of the join to all nodes. - When using
REPLICATE
with aFULL
join, an estimated plan cannot be created.
REDISTRIBUTE [(columns_count)]
Applies to: Azure Synapse Analytics and Analytics Platform System (PDW)
Forces two data sources to be distributed on columns specified in the JOIN clause. For a distributed table, Analytics Platform System (PDW) performs a shuffle move on the first column of both tables For a replicated table, Analytics Platform System (PDW) performs a trim move. To understand these move types, see the "DMS Query Plan Operations" section in the "Understanding Query Plans" article in the Analytics Platform System (PDW) product documentation. This hint can improve performance when the query plan is using a broadcast move to resolve a distribution incompatible join.
Applies to: Microsoft Fabric Warehouse
The REDISTRIBUTE
hint ensures two data sources are distributed based on JOIN
clause columns. It handles multiple join conditions, specified by the first n columns in both tables, where n is the column_count
argument. Redistributing data optimizes query performance by evenly spreading data across nodes during intermediate steps of execution.
The (columns_count)
argument is only supported in Microsoft Fabric Warehouse.
Remarks
Join hints are specified in the FROM
clause of a query. Join hints enforce a join strategy between two tables. If a join hint is specified for any two tables, the query optimizer automatically enforces the join order for all joined tables in the query, based on the position of the ON
keywords. When a CROSS JOIN
is used without the ON
clause, parentheses can be used to indicate the join order.
Examples
The code samples in this article use the AdventureWorks2022
or AdventureWorksDW2022
sample database, which you can download from the Microsoft SQL Server Samples and Community Projects home page.
A. Use HASH
The following example specifies that the JOIN
operation in the query is performed by a HASH
join.
SELECT p.Name,
pr.ProductReviewID
FROM Production.Product AS p
LEFT OUTER HASH JOIN Production.ProductReview AS pr
ON p.ProductID = pr.ProductID
ORDER BY ProductReviewID DESC;
B. Use LOOP
The following example specifies that the JOIN
operation in the query is performed by a LOOP
join.
DELETE
FROM Sales.SalesPersonQuotaHistory
FROM Sales.SalesPersonQuotaHistory AS spqh
INNER LOOP JOIN Sales.SalesPerson AS sp
ON spqh.SalesPersonID = sp.SalesPersonID
WHERE sp.SalesYTD > 2500000.00;
GO
C. Use MERGE
The following example specifies that the JOIN
operation in the query is performed by a MERGE
join.
SELECT poh.PurchaseOrderID,
poh.OrderDate,
pod.ProductID,
pod.DueDate,
poh.VendorID
FROM Purchasing.PurchaseOrderHeader AS poh
INNER MERGE JOIN Purchasing.PurchaseOrderDetail AS pod
ON poh.PurchaseOrderID = pod.PurchaseOrderID;
GO
D. REDUCE join hint example
The following example uses the REDUCE
join hint to alter the processing of the derived table within the query. When using the REDUCE
join hint in this query, the fis.ProductKey
is projected, replicated and made distinct, and then joined to DimProduct
during the shuffle of DimProduct
on ProductKey
. The resulting derived table is distributed on fis.ProductKey
.
-- Uses AdventureWorks
SELECT SalesOrderNumber
FROM (
SELECT fis.SalesOrderNumber,
dp.ProductKey,
dp.EnglishProductName
FROM DimProduct AS dp
INNER REDUCE JOIN FactInternetSales AS fis
ON dp.ProductKey = fis.ProductKey
) AS dTable
ORDER BY SalesOrderNumber;
E. REPLICATE join hint example
This next example shows the same query as the previous example, except that a REPLICATE
join hint is used instead of the REDUCE
join hint. Use of the REPLICATE
hint causes the values in the ProductKey
(joining) column from the FactInternetSales
table to be replicated to all nodes. The DimProduct
table is joined to the replicated version of those values.
-- Uses AdventureWorks
SELECT SalesOrderNumber
FROM (
SELECT fis.SalesOrderNumber,
dp.ProductKey,
dp.EnglishProductName
FROM DimProduct AS dp
INNER REPLICATE JOIN FactInternetSales AS fis
ON dp.ProductKey = fis.ProductKey
) AS dTable
ORDER BY SalesOrderNumber;
F. Use the REDISTRIBUTE hint to guarantee a Shuffle move for a distribution incompatible join
The following query uses the REDISTRIBUTE
query hint on a distribution incompatible join. This guarantees the query optimizer uses a Shuffle move in the query plan. This also guarantees the query plan won't use a Broadcast move, which moves a distributed table to a replicated table.
In the following example, the REDISTRIBUTE
hint forces a Shuffle move on the FactInternetSales
table because ProductKey
is the distribution column for DimProduct
, and isn't the distribution column for FactInternetSales
.
-- Uses AdventureWorks
SELECT dp.ProductKey,
fis.SalesOrderNumber,
fis.TotalProductCost
FROM DimProduct AS dp
INNER REDISTRIBUTE JOIN FactInternetSales AS fis
ON dp.ProductKey = fis.ProductKey;
G. Use the columns count argument with the REDISTRIBUTE hint
The following query uses the REDISTRIBUTE
query hint with the columns count argument, and the shuffle takes place across the first four columns of each table in the join.
SELECT * FROM DA
INNER REDISTRIBUTE (4) JOIN DB
ON DA.a1 = DB.b1