Επεξεργασία

Κοινή χρήση μέσω


Join hints (Transact-SQL)

Applies to: SQL Server Azure SQL Database Azure SQL Managed Instance SQL analytics endpoint in Microsoft Fabric Warehouse in Microsoft Fabric SQL database in Microsoft Fabric

Join hints specify that the query optimizer enforce a join strategy between two tables in SQL Server. For general information about joins and join syntax, see FROM clause plus JOIN, APPLY, PIVOT.

Caution

Because the SQL Server query optimizer typically selects the best execution plan for a query, we recommend that hints be used only as a last resort by experienced developers and database administrators.

Applies to

Transact-SQL syntax conventions

Syntax

<join_hint> ::=
     { LOOP | HASH | MERGE | REMOTE | REDUCE | REPLICATE | REDISTRIBUTE [(columns count)]}

Arguments

{ LOOP | HASH | MERGE }

Applies to: Azure SQL Database, Azure SQL Managed Instance, SQL analytics endpoint, SQL database in Microsoft Fabric, Microsoft Fabric Warehouse

Specifies that the join in the query should use looping, hashing, or merging. Using LOOP, HASH, or MERGE JOIN enforces a particular join between two tables. LOOP can't be specified together with RIGHT or FULL as a join type. For more information, see Joins.

REMOTE

Applies to: Azure SQL Database, Azure SQL Managed Instance, SQL analytics endpoint, SQL database in Microsoft Fabric

Specifies that the join operation is performed on the site of the right table. This is useful when the left table is a local table and the right table is a remote table. REMOTE should be used only when the left table has fewer rows than the right table.

If the right table is local, the join is performed locally. If both tables are remote but from different data sources, REMOTE causes the join to be performed on the site of the right table. If both tables are remote tables from the same data source, REMOTE isn't required.

REMOTE can't be used when one of the values being compared in the join predicate is cast to a different collation using the COLLATE clause.

REMOTE can be used only for INNER JOIN operations.

REDUCE

Applies to: Azure Synapse Analytics and Analytics Platform System (PDW)

Reduces the number of rows to be moved for the table on the right side of the join in order to make two distribution incompatible tables compatible. The REDUCE hint is also called a semi-join hint.

REPLICATE

Applies to: Azure Synapse Analytics, Analytics Platform System (PDW), Microsoft Fabric Warehouse

Causes a broadcast move operation, where a specific table to be replicated across all distribution nodes.

  • Using REPLICATE with a INNER or LEFT join, the broadcast move operation will replicate the right side of the join to all nodes.
  • Similarly, while using REPLICATE with a RIGHT join, the broadcast move operation will replicate the left side of the join to all nodes.
  • When using REPLICATE with a FULL join, an estimated plan cannot be created.

REDISTRIBUTE [(columns_count)]

Applies to: Azure Synapse Analytics and Analytics Platform System (PDW)

Forces two data sources to be distributed on columns specified in the JOIN clause. For a distributed table, Analytics Platform System (PDW) performs a shuffle move on the first column of both tables For a replicated table, Analytics Platform System (PDW) performs a trim move. To understand these move types, see the "DMS Query Plan Operations" section in the "Understanding Query Plans" article in the Analytics Platform System (PDW) product documentation. This hint can improve performance when the query plan is using a broadcast move to resolve a distribution incompatible join.

Applies to: Microsoft Fabric Warehouse

The REDISTRIBUTE hint ensures two data sources are distributed based on JOIN clause columns. It handles multiple join conditions, specified by the first n columns in both tables, where n is the column_count argument. Redistributing data optimizes query performance by evenly spreading data across nodes during intermediate steps of execution.

The (columns_count) argument is only supported in Microsoft Fabric Warehouse.

Remarks

Join hints are specified in the FROM clause of a query. Join hints enforce a join strategy between two tables. If a join hint is specified for any two tables, the query optimizer automatically enforces the join order for all joined tables in the query, based on the position of the ON keywords. When a CROSS JOIN is used without the ON clause, parentheses can be used to indicate the join order.

Examples

The code samples in this article use the AdventureWorks2022 or AdventureWorksDW2022 sample database, which you can download from the Microsoft SQL Server Samples and Community Projects home page.

A. Use HASH

The following example specifies that the JOIN operation in the query is performed by a HASH join.

SELECT p.Name,
    pr.ProductReviewID
FROM Production.Product AS p
LEFT OUTER HASH JOIN Production.ProductReview AS pr
    ON p.ProductID = pr.ProductID
ORDER BY ProductReviewID DESC;

B. Use LOOP

The following example specifies that the JOIN operation in the query is performed by a LOOP join.

DELETE
FROM Sales.SalesPersonQuotaHistory
FROM Sales.SalesPersonQuotaHistory AS spqh
INNER LOOP JOIN Sales.SalesPerson AS sp
    ON spqh.SalesPersonID = sp.SalesPersonID
WHERE sp.SalesYTD > 2500000.00;
GO

C. Use MERGE

The following example specifies that the JOIN operation in the query is performed by a MERGE join.

SELECT poh.PurchaseOrderID,
    poh.OrderDate,
    pod.ProductID,
    pod.DueDate,
    poh.VendorID
FROM Purchasing.PurchaseOrderHeader AS poh
INNER MERGE JOIN Purchasing.PurchaseOrderDetail AS pod
    ON poh.PurchaseOrderID = pod.PurchaseOrderID;
GO

D. REDUCE join hint example

The following example uses the REDUCE join hint to alter the processing of the derived table within the query. When using the REDUCE join hint in this query, the fis.ProductKey is projected, replicated and made distinct, and then joined to DimProduct during the shuffle of DimProduct on ProductKey. The resulting derived table is distributed on fis.ProductKey.

-- Uses AdventureWorks
  
SELECT SalesOrderNumber
FROM (
    SELECT fis.SalesOrderNumber,
        dp.ProductKey,
        dp.EnglishProductName
    FROM DimProduct AS dp
    INNER REDUCE JOIN FactInternetSales AS fis
        ON dp.ProductKey = fis.ProductKey
    ) AS dTable
ORDER BY SalesOrderNumber;

E. REPLICATE join hint example

This next example shows the same query as the previous example, except that a REPLICATE join hint is used instead of the REDUCE join hint. Use of the REPLICATE hint causes the values in the ProductKey (joining) column from the FactInternetSales table to be replicated to all nodes. The DimProduct table is joined to the replicated version of those values.

-- Uses AdventureWorks

SELECT SalesOrderNumber
FROM (
    SELECT fis.SalesOrderNumber,
        dp.ProductKey,
        dp.EnglishProductName
    FROM DimProduct AS dp
    INNER REPLICATE JOIN FactInternetSales AS fis
        ON dp.ProductKey = fis.ProductKey
    ) AS dTable
ORDER BY SalesOrderNumber;

F. Use the REDISTRIBUTE hint to guarantee a Shuffle move for a distribution incompatible join

The following query uses the REDISTRIBUTE query hint on a distribution incompatible join. This guarantees the query optimizer uses a Shuffle move in the query plan. This also guarantees the query plan won't use a Broadcast move, which moves a distributed table to a replicated table.

In the following example, the REDISTRIBUTE hint forces a Shuffle move on the FactInternetSales table because ProductKey is the distribution column for DimProduct, and isn't the distribution column for FactInternetSales.

-- Uses AdventureWorks
  
SELECT dp.ProductKey,
    fis.SalesOrderNumber,
    fis.TotalProductCost
FROM DimProduct AS dp
INNER REDISTRIBUTE JOIN FactInternetSales AS fis
    ON dp.ProductKey = fis.ProductKey;

G. Use the columns count argument with the REDISTRIBUTE hint

The following query uses the REDISTRIBUTE query hint with the columns count argument, and the shuffle takes place across the first four columns of each table in the join.

SELECT * FROM DA
INNER REDISTRIBUTE (4) JOIN DB
ON DA.a1 = DB.b1