SQL 管道语法

适用于:勾选标记为“是” Databricks Runtime 16.2 及更高版本

Azure Databricks 支持 SQL 管道模式语法,该语法允许通过组合运算符链来构建查询。

  • 任何查询都可以将零个或多个管道运算符作为后缀,由管道字符 |>进行描述。
  • 每个 管道操作 以一个或多个 SQL 关键字开头,后面跟着它自己的语法。
  • 运算符可以按任意顺序应用任意次数。
  • 通常 FROM relation_name 用于启动管道,但任何查询都可以启动管道。

语法

{ FROM | TABLE } relation_name { |> piped_operation } [ ...]

参数

这是以 ANSI SQL 编写的 TPC-H 基准测试中的查询 13:

> SELECT c_count, COUNT(*) AS custdist
    FROM
    (SELECT c_custkey, COUNT(o_orderkey) c_count
      FROM customer
      LEFT OUTER JOIN orders ON c_custkey = o_custkey
                             AND o_comment NOT LIKE '%unusual%packages%'
     GROUP BY c_custkey
  ) AS c_orders
  GROUP BY c_count
  ORDER BY custdist DESC, c_count DESC;

若要使用 SQL 管道运算符编写相同的逻辑,可以按如下所示表达逻辑:

> FROM customer
  |> LEFT OUTER JOIN orders ON c_custkey = o_custkey
                            AND o_comment NOT LIKE '%unusual%packages%'
  |> AGGREGATE COUNT(o_orderkey) c_count
     GROUP BY c_custkey
  |> AGGREGATE COUNT(*) AS custdist
     GROUP BY c_count
  |> ORDER BY custdist DESC, c_count DESC;