Поделиться через


предложение PIVOT

Область применения:флажок Databricks SQL флажок Databricks Runtime

Преобразует строки table_reference путем поворота уникальных значений указанного списка столбцов в отдельные столбцы.

Синтаксис

table_reference PIVOT ( { aggregate_expression [ [ AS ] agg_column_alias ] } [, ...]
    FOR column_list IN ( expression_list ) )

column_list
 { column_name |
   ( column_name [, ...] ) }

expression_list
 { expression [ AS ] [ column_alias ] |
   { ( expression [, ...] ) [ AS ] [ column_alias] } [, ...] ) }

Параметры

  • table_reference

    Определяет тему PIVOT операции.

  • aggregate_expression

    Выражение любого типа, где все ссылки на столбцы table_reference являются аргументами для агрегатных функций.

  • agg_column_alias

    Необязательный псевдоним для результата статистической обработки. Если псевдоним не указан, PIVOT создает псевдоним на основе aggregate_expression.

  • column_list

    Набор столбцов, которые необходимо повернуть.

  • expression_list

    Сопоставляет значения от column_list с псевдонимами столбцов.

    • выражение

      Литеральное выражение с типом, который делит наименьший общий тип с соответствующим column_name.

      Количество выражений в каждом кортеже должно совпадать с количеством column_names в column_list.

    • column_alias

      Необязательный псевдоним, указывающий имя созданного столбца. Если псевдоним не указан, PIVOT создает псевдоним на основе expression.

Результат

Временная таблица следующей формы:

  • Все столбцы из промежуточного результирующего набора table_reference, которые не были указаны в каких-либо aggregate_expression или column_list.

    Эти столбцы являются группированием столбцов.

  • Для каждого кортежа expression и сочетания aggregate_expression создается один столбец PIVOT. Тип является типом aggregate_expression.

    Если есть только один aggregate_expression столбец называется с помощью column_alias. В противном случае он называется column_alias_agg_column_alias.

    Значение в каждой ячейке является результатом aggregation_expression с использованием FILTER ( WHERE column_list IN (expression, ...).

Примеры

-- A very basic PIVOT
-- Given a table with sales by quarter, return a table that returns sales across quarters per year.
> CREATE TEMP VIEW sales(year, quarter, region, sales) AS
   VALUES (2018, 1, 'east', 100),
          (2018, 2, 'east',  20),
          (2018, 3, 'east',  40),
          (2018, 4, 'east',  40),
          (2019, 1, 'east', 120),
          (2019, 2, 'east', 110),
          (2019, 3, 'east',  80),
          (2019, 4, 'east',  60),
          (2018, 1, 'west', 105),
          (2018, 2, 'west',  25),
          (2018, 3, 'west',  45),
          (2018, 4, 'west',  45),
          (2019, 1, 'west', 125),
          (2019, 2, 'west', 115),
          (2019, 3, 'west',  85),
          (2019, 4, 'west',  65);

> SELECT year, region, q1, q2, q3, q4
  FROM sales
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  east  100   20  40  40
 2019  east  120  110  80  60
 2018  west  105   25  45  45
 2019  west  125  115  85  65

-- The same query written without PIVOT
> SELECT year, region,
         sum(sales) FILTER(WHERE quarter = 1) AS q1,
         sum(sales) FILTER(WHERE quarter = 2) AS q2,
         sum(sales) FILTER(WHERE quarter = 3) AS q2,
         sum(sales) FILTER(WHERE quarter = 4) AS q4
  FROM sales
  GROUP BY year, region;
 2018  east  100   20  40  40
 2019  east  120  110  80  60
 2018  west  105   25  45  45
 2019  west  125  115  85  65

-- Also PIVOT on region
> SELECT year, q1_east, q1_west, q2_east, q2_west, q3_east, q3_west, q4_east, q4_west
    FROM sales
    PIVOT (sum(sales) AS sales
      FOR (quarter, region)
      IN ((1, 'east') AS q1_east, (1, 'west') AS q1_west, (2, 'east') AS q2_east, (2, 'west') AS q2_west,
          (3, 'east') AS q3_east, (3, 'west') AS q3_west, (4, 'east') AS q4_east, (4, 'west') AS q4_west));
 2018  100  105   20   25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- The same query written without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE (quarter, region) IN ((1, 'east'))) AS q1_east,
    sum(sales) FILTER(WHERE (quarter, region) IN ((1, 'west'))) AS q1_west,
    sum(sales) FILTER(WHERE (quarter, region) IN ((2, 'east'))) AS q2_east,
    sum(sales) FILTER(WHERE (quarter, region) IN ((2, 'west'))) AS q2_west,
    sum(sales) FILTER(WHERE (quarter, region) IN ((3, 'east'))) AS q3_east,
    sum(sales) FILTER(WHERE (quarter, region) IN ((3, 'west'))) AS q3_west,
    sum(sales) FILTER(WHERE (quarter, region) IN ((4, 'east'))) AS q4_east,
    sum(sales) FILTER(WHERE (quarter, region) IN ((4, 'west'))) AS q4_west
    FROM sales
    GROUP BY year;
 2018  100  105   20   25  40  45  40  45
 2019  120  125  110  115  80  85  60  65

-- To aggregate across regions the column must be removed from the input.
> SELECT year, q1, q2, q3, q4
  FROM (SELECT year, quarter, sales FROM sales) AS s
  PIVOT (sum(sales) AS sales
    FOR quarter
    IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
  2018  205   45   85   85
  2019  245  225  165  125

-- The same query without PIVOT
> SELECT year,
    sum(sales) FILTER(WHERE quarter = 1) AS q1,
    sum(sales) FILTER(WHERE quarter = 2) AS q2,
    sum(sales) FILTER(WHERE quarter = 3) AS q3,
    sum(sales) FILTER(WHERE quarter = 4) AS q4
    FROM sales
    GROUP BY year;

-- A PIVOT with multiple aggregations
> SELECT year, q1_total, q1_avg, q2_total, q2_avg, q3_total, q3_avg, q4_total, q4_avg
    FROM (SELECT year, quarter, sales FROM sales) AS s
    PIVOT (sum(sales) AS total, avg(sales) AS avg
      FOR quarter
      IN (1 AS q1, 2 AS q2, 3 AS q3, 4 AS q4));
 2018  205  102.5   45   22.5   85  42.5   85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5

-- The same query without PIVOT
> SELECT year,
         sum(sales) FILTER(WHERE quarter = 1) AS q1_total,
         avg(sales) FILTER(WHERE quarter = 1) AS q1_avg,
         sum(sales) FILTER(WHERE quarter = 2) AS q2_total,
         avg(sales) FILTER(WHERE quarter = 2) AS q2_avg,
         sum(sales) FILTER(WHERE quarter = 3) AS q3_total,
         avg(sales) FILTER(WHERE quarter = 3) AS q3_avg,
         sum(sales) FILTER(WHERE quarter = 4) AS q4_total,
         avg(sales) FILTER(WHERE quarter = 4) AS q4_avg
    FROM sales
    GROUP BY year;
 2018  205  102.5   45   22.5   85  42.5   85  42.5
 2019  245  122.5  225  112.5  165  82.5  125  62.5