'Define nested items while creating table in Hive

I am trying to create an external hive table using CSV as input file.

How my data looks like:

xxx|2021-08-14 07:10:41.080|[{"sub1","90"},{"sub2","95"}]

I am creating the table using below sql:

CREATE EXTERNAL TABLE mydb.mytable (
Name string,
Last_upd_timestamp timestamp,
subjects array<struct<sub_code:string,sub_marks:string>>
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES ('collection.delim'=',','field.delim'='|','serialization.format'='|')
STORED AS INPUTFORMAT 'org.apache.hadoop.mapred.TextInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
Location 'hdfs://nameservice1/myinputfile'

)

When i try the above, table is created with subjects column like:

[{"sub_code":"[{\"sub1\",\"90\"},{\"sub2\",\"95\"}]","sub_marks":null}]

Not sure what I am doing wrong in the above. Would highly appreciate if someone can help me with how I can create the table in expected output.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source