'Presto query Array of Rows
So I have a hive external table with schema looks like this :
{
.
.
`x` string,
`y` ARRAY<struct<age:string,cId:string,dmt:string>>,
`z` string
}
So basically I need to query a column(column "y") which is array of nested json, I can see data of column "y" from hive, but data in that column seems invisible to presto, even though presto knows schema of this field, like this:
array(row(age varchar,cid varchar,dmt varchar))
As you can see presto already knows this field is array of row.
Notes:
1.The table is a hive external table.
2.I get schema of field "y" by using ODBC driver, but data is just all empty, however I can see something like this in hive : [{"age":"12","cId":"bx21hdg","dmt":"120"}]
3.Presto queries hivemetastore for schema.
4.Table was stored as parquet format.
So how can I see my data in field "y" please?
Solution 1:[1]
Please try the below. This should work in Presto.
"If the array element is a row data type, the result is a table with one column for each row field in the element data type. The result table column data types match the corresponding array element row field data types"
select
y,age,cid,dmt
from
table
cross join UNNEST(y) AS nested_data(age,cid,dmt)
Reference: https://www.ibm.com/support/knowledgecenter/en/SSEPGG_10.5.0/com.ibm.db2.luw.sql.ref.doc/doc/r0055064.html
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Anil Kulkarni |
