MISSING_AGGREGATION錯誤類別
非匯總運算式 <expression>
是以未參與 GROUP BY 子句的資料行為基礎。
將資料行或運算式新增至 GROUP BY、匯總運算式,或如果您不小心傳回群組中的哪些值,請使用 <expressionAnyValue>
。
參數
-
運算式:清單中的非匯總、非群組運算式
SELECT
。 -
expressionAnyValue:
expression
包裝在 any_value () 彙總函式中。
解釋
在具有 GROUP BY 子句的查詢內容中, SELECT 清單中的本機資料行參考必須是:
- 當做 彙總函式的引數使用,或
- 符合 子句上運算式之運算式的
GROUP BY
一部分。
本機資料行參考是已解析為查詢FROM子句中資料表參考的資料行。
換句話說:資料行參考必須是群組索引鍵的一部分,或者它們必須是匯總的一部分。
Azure Databricks 會比對 最佳努力的運算式:例如,它會辨識: SELECT c1 + 5 FROM T GROUP BY 5 + c1
做為數學運算式。
但 SELECT c1 FROM T GROUP BY c1 + 5
不符合。
緩解
錯誤的風險降低取決於原因:
您是否錯過群組資料行?
將
expression
的相關子運算式expression
加入GROUP BY
至 子句。運算式的資料
GROUP BY
行參考部分與 不同epression
嗎?比對清單中的運算式
SELECT
,或簡化GROUP BY
運算式。您是否遺漏匯總?
使用彙總函式包裝資料行參考。 如果您只想要來自群組的代表性值,您可以使用 any_value (pression) 。
例子
-- Sample data
> CREATE OR REPLACE TEMPORARY VIEW tasks(name, firstname, task, cost) AS
VALUES ('Smith' , 'Sam' , 'UNPIVOT', 10),
('Smith' , 'Sam' , 'LATERAL', 5),
('Shuster', 'Sally' , 'DELETE' , 7),
('Shuster', 'Sally' , 'GRANT' , 8);
-- `name` and `firstname` are part of the group by coumns, but incomplete
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
[MISSING_AGGREGATION] The expression "name" is neither present in the group by, nor is it an aggregate function.
-- Match the GROUP BY expression
> SELECT firstname || ' ' || name, sum(cost) FROM tasks GROUP BY firstname || ' ' || name;
Sam Smith 15
Sally Shuster 15
-- Break up the GROUP BY expression
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing grouping column
> SELECT name, firstname, sum(cost) FROM tasks GROUP BY name;
[MISSING_AGGREGATION] The expression "firstname" is neither present in the group by, nor is it an aggregate function.
-- Add the grouping column
> SELECT firstname, name, sum(cost) FROM tasks GROUP BY firstname, name;
Sam Smith 15
Sally Shuster 15
-- Missing aggregate
> SELECT firstname, name, sum(cost), task FROM tasks GROUP BY firstname, name;
[MISSING_AGGREGATION] The expression "task" is neither present in the group by, nor is it an aggregate function.
-- Add an aggregate
> SELECT firstname, name, sum(cost), array_agg(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 ["UNPIVOT","LATERAL"]
Sally Shuster 15 ["DELETE","GRANT"]
-- Return any task
> SELECT firstname, name, sum(cost), any_value(task) FROM tasks GROUP BY firstname, name;
Sam Smith 15 LATERAL
Sally Shuster 15 DELETE