MISSING_AGGREGATION錯誤類別
非匯總表達式 <expression>
是以未參與 GROUP BY 子句的列為基礎。
將數據行或表示式新增至 GROUP BY、匯總表達式,或如果您不在意群組內傳回哪些值,請使用 <expressionAnyValue>
。
參數
-
表示式:
SELECT
清單中的非匯總、非群組表達式。 -
expressionAnyValue:
expression
包裝在 any_value() 聚合函數中。
解釋
在具有 GROUP BY 子句的查詢中,SELECT 名單中的本機欄位參考必須符合以下要求:
- 作為 聚合函數的參數使用,或
- 符合
GROUP BY
子句上表達式的表達式的一部分。
本機數據行參考是已解析為查詢 之 FROM 子句中 數據表參考 的數據行。
換句話說:欄位參考必須是群組索引鍵的一部分,或者它們必須是聚合的一部分。
Azure Databricks 會比對 上最費力的表達式:例如,它會辨識:SELECT c1 + 5 FROM T GROUP BY 5 + c1
為數學表達式。
但 SELECT c1 FROM T GROUP BY c1 + 5
不相符。
減緩
錯誤的緩解取決於原因:
您是否錯過了分組欄?
將
expression
或expression
的相關子運算式新增至GROUP BY
子句。與
GROUP BY
不同之epression
表示式的數據行參考部分嗎?比對
SELECT
清單中的運算式,或簡化GROUP BY
運算式。您是否遺漏匯總?
使用聚合函數包裝數據行參考。 如果您只想要群組中的代表性值,您可以使用 any_value(epression)。
例子
-- Sample data
> CREATE OR REPLACE TEMPORARY VIEW tasks(name, firstname, task, cost) AS
VALUES ('Smith' , 'Sam' , 'UNPIVOT', 10),
('Smith' , 'Sam' , 'LATERAL', 5),
('Shuster', 'Sally' , 'DELETE' , 7),
('Shuster', 'Sally' , 'GRANT' , 8);
-- `name` and `firstname` are part of the group by coumns, but incomplete
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
[MISSING_AGGREGATION] The expression "name" is neither present in the group by, nor is it an aggregate function.
-- Match the GROUP BY expression
> SELECT firstname || ' ' || name, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
Sam Smith 15
Sally Shuster 15
-- Break up the GROUP BY expression
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing grouping column
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY name;
[MISSING_AGGREGATION] The expression "firstname" is neither present in the group by, nor is it an aggregate function.
-- Add the grouping column
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing aggregate
> SELECT firstname, name, sum(cost), task FROM tasks GROUP BY firstname, name;
[MISSING_AGGREGATION] The expression "task" is neither present in the group by, nor is it an aggregate function.
-- Add an aggregate
> SELECT firstname, name, sum(cost), array_agg(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 ["UNPIVOT","LATERAL"]
Sally Shuster 15 ["DELETE","GRANT"]
-- Return any task
> SELECT firstname, name, sum(cost), any_value(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 LATERAL
Sally Shuster 15 DELETE