'Column X contains an aggregation function, which is not allowed in GROUP BY at
Im facing an issue with my SQL query... When i try to run it, the following message appears: "com.google.cloud.bigquery.BigQueryException: Column 16 contains an aggregation function, which is not allowed in GROUP BY at [65:31]"
The query in question is this one: https://pastebin.com/Nm6UxDEr (too long to paste here).
Can someone help me solve this aggregation problem?
Solution 1:[1]
Line 65 has a group by columns list:
Select ....
WHERE A.SHP_LG_FACILITY_ID IN ('SMS1')
GROUP BY 1,2,3,4,5,6,7,8,9,16,17)
Since some columns were added the group by has to be adjusted to:
GROUP BY 1,2,3,4,5,6,7,8,9,10,17,18
)
Your script from line 2 to 65 should be runnable on its own. It cannot be tested here, because there are no table data given. However, in a first attempt some sample data can be made up and the script should be runnable. All calculated fields are replaced by a test variable:
SELECT
DISTINCT SHP_SHIPMENT_ID AS ID_PACOTE,
TMS_SELLER_INFO_NAME AS SELLER,
SHP_PICKING_TYPE_ID AS TIPO_PICKING,
SHP_LOGISTIC_CENTER_ID AS ORIGEM,
SHP_LG_FACILITY_ID AS DESTINO,
SHP_LG_ORIGIN_CODE AS XPT,
A1 AS STATUS_LT,
A2 AS PROMESSA_INICIAL,
A3 AS PROMESSA_FINAL,
A4 AS PROMESSA,
MIN(A) AS A_ETIQUETAR,
MIN(A) AS PRONTO_PARA_COLETA,
MIN(A) AS COLETADO,
MIN(A) AS EM_DEPOSITO,
MIN(A) AS CAMINHO_PARA_ESTACAO,
MIN(A) AS ENTROU_NA_ESTACAO,
(A5) AS PRIMEIRA_VISITA,
(A6) AS DATA_ENTREGA,
FROM (
SELECT
1001 AS A,
1 SHP_SHIPMENT_ID,
2 TMS_SELLER_INFO_NAME,
3 SHP_PICKING_TYPE_ID,
4 SHP_LOGISTIC_CENTER_ID,
5 SHP_LG_FACILITY_ID,
6 SHP_LG_ORIGIN_CODE,
11 A1,
12 A2,
13 A3,
14 A4,
15 A5,
16 A6
Union ALL Select 1000,101,102,103,104,105,106,107,108,109,110,111,112)
# WHERE A.SHP_LG_FACILITY_ID IN ('SMS1')
# GROUP BY 1,2,3,4,5,6,7,8,9,16,17 # Error: COLUMN 16 contains an aggregation FUNCTION, which IS NOT allowed IN GROUP BY
GROUP BY 1,2,3,4,5,6,7,8,9,17,18,10
With the comment out line for the group by the error for column 16 occures. Indeed the group by statement should be adjusted by 17,18,10. The query should work then. Using the formula this can be written and tested as
SELECT DISTINCT
SHP_SHIPMENT_ID AS ID_PACOTE,
TMS_SELLER_INFO_NAME AS SELLER,
SHP_PICKING_TYPE_ID AS TIPO_PICKING,
SHP_LOGISTIC_CENTER_ID AS ORIGEM,
SHP_LG_FACILITY_ID AS DESTINO,
SHP_LG_ORIGIN_CODE AS XPT,
CASE WHEN CAST(SHP_DATE_FIRST_VISIT_ID AS DATE) < CAST(EST_DELIVERY_DATE AS DATE) THEN 'EARLY'
WHEN CAST(SHP_DATE_FIRST_VISIT_ID AS DATE) > CAST((CASE WHEN EST_DELIVERY_OFFSET_DATE IS NULL THEN DATETIME_ADD(EST_DELIVERY_DATE, INTERVAL '21' HOUR) ELSE DATETIME_ADD(EST_DELIVERY_OFFSET_DATE, INTERVAL '21' HOUR) END) AS DATE) THEN 'DELAY'
ELSE 'ON_TIME'
END AS STATUS_LT,
DATETIME_ADD(EST_DELIVERY_DATE, INTERVAL '1' HOUR) AS PROMESSA_INICIAL,
DATETIME_ADD(EST_DELIVERY_OFFSET_DATE, INTERVAL '1' HOUR) AS PROMESSA_FINAL,
CASE WHEN EST_DELIVERY_OFFSET_DATE IS NULL THEN DATETIME_ADD(EST_DELIVERY_DATE, INTERVAL '21' HOUR) ELSE DATETIME_ADD(EST_DELIVERY_OFFSET_DATE, INTERVAL '21' HOUR) END AS PROMESSA,
MIN(CASE WHEN SHP_LG_STATUS = 'ready_to_print' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS A_ETIQUETAR,
MIN(CASE WHEN SHP_LG_STATUS = 'for_pickup' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS PRONTO_PARA_COLETA,
MIN(CASE WHEN SHP_LG_STATUS = 'picked_up' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS COLETADO,
MIN(CASE WHEN SHP_LG_STATUS = 'on_hub' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS EM_DEPOSITO,
MIN(CASE WHEN SHP_LG_STATUS = 'on_way' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS CAMINHO_PARA_ESTACAO,
MIN(CASE WHEN SHP_LG_STATUS = 'at_station' AND SHP_LG_SUB_STATUS ='sorting' THEN DATETIME_ADD(CAST(SHP_LG_SHIPMENT_CHK_DT AS DATETIME), INTERVAL '1' HOUR) END) AS ENTROU_NA_ESTACAO,
DATETIME_ADD(SHP_DATE_FIRST_VISIT_ID, INTERVAL '1' HOUR) AS PRIMEIRA_VISITA,
DATETIME_ADD(SHP_DATETIME_DELIVERED_ID, INTERVAL '1' HOUR) AS DATA_ENTREGA,
FROM
(
SELECT
1001 AS A,
1 SHP_SHIPMENT_ID,
2 TMS_SELLER_INFO_NAME,
3 SHP_PICKING_TYPE_ID,
4 SHP_LOGISTIC_CENTER_ID,
5 SHP_LG_FACILITY_ID,
6 SHP_LG_ORIGIN_CODE,
'ready_to_print' SHP_LG_STATUS ,
'' as SHP_LG_SUB_STATUS,
current_datetime() SHP_DATE_FIRST_VISIT_ID,
current_datetime() EST_DELIVERY_DATE,
current_datetime() EST_DELIVERY_OFFSET_DATE,
current_datetime() SHP_LG_SHIPMENT_CHK_DT,
current_datetime() SHP_DATETIME_DELIVERED_ID
Union ALL Select 1000,101,102,103,104,105,106,'on_hub','',current_datetime(),current_datetime(),current_datetime(),current_datetime(),current_datetime()
)
# WHERE SHP_LG_FACILITY_ID IN ('SMS1')
# GROUP BY 1,2,3,4,5,6,7,8,9,16,17 # Error: COLUMN 16 contains an aggregation FUNCTION, which IS NOT allowed IN GROUP BY
GROUP BY 1,2,3,4,5,6,7,8,9,17,18,10
Some further guides to shorten your query:
- Use
ifnull(columnA, 0)when possible instead of acase when columnA is null.
CAST((CASE WHEN B.EST_DELIVERY_OFFSET_DATE IS NULL THEN DATETIME_ADD(B.EST_DELIVERY_DATE, INTERVAL '21' HOUR) ELSE DATETIME_ADD(B.EST_DELIVERY_OFFSET_DATE, INTERVAL '21' HOUR) END) AS DATE)
shortens to
CAST(DATETIME_ADD(IFNULL(EST_DELIVERY_OFFSET_DATE,B.EST_DELIVERY_DATE), INTERVAL '21' HOUR) AS DATE)
- use list in case statement:
CASE
WHEN DESTINO in ('SMG1','SMG6','SMG12') THEN 'MG'
WHEN DESTINO in ('SMG4','SMG7','SMG8','SMG13') THEN 'MGI'
WHEN DESTINO = 'SMG2' THEN 'MGII'
WHEN DESTINO = 'SMG3' THEN 'MGII'
WHEN DESTINO = 'SMG5' THEN 'MGII'
WHEN DESTINO = 'SMG9' THEN 'MGII'
END AS REGIONAL
- use
date_diffinstead three times adate_addto figure out a time difference in a case statement. Nested select statement are also allowed to calculate the difference and then use that value in the next select statement. - An user SQL-UDF function may look nicer for the six
A.SHP_LG_STATUS = 'ready_to_print'statements.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 |
