'Impala insert vs hive insert

When I tried to insert integer values into a column in a parquet table with Hive command, values are not getting insert and shows as null. But when used impala command it is working. But the partition size reduces with impala insert. Also number of rows in the partitions (show partitions) show as -1. What is the reason for this?1

CREATE TABLE TEST.LOGS( 2 recordtype string, 3 recordstatus string, 4 recordnumber string, 5 starttime string, 6 endtime string, 7 acctsessionid string, 8 subscriberid string, 9 framedip string, 10 servicename string, 11 totalbytes int, 12 rxbytes int, 13 txbytes int, 14 time int, 15 plan string, 16 tcpudp string, 17 intport string) 18 PARTITIONED BY (ymd string) 20 ROW FORMAT SERDE 21 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe' 22 WITH SERDEPROPERTIES ( 23 'field.delim'=',', 24 'serialization.format'=',') 25 STORED AS INPUTFORMAT 26 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat' 27 OUTPUTFORMAT 28 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat' 29 LOCATION 30 'hdfs://dev-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX' 31 TBLPROPERTIES ( 32 'transient_lastDdlTime'='1634390569')

Insert Statement

Hive

sudo -u hdfs hive -e 'insert into table TEST.LOGS partition (ymd="20220221") select * from TEMP.LOGS;'

Impala

impala-shell --ssl -i xxxxxxxxxxx:21000 -q 'insert into table TEST.LOGS partition (ymd="20220221") select * from TEMP.LOGS;'



Solution 1:[1]

When I tried to insert integer values into a column in a parquet table with Hive command, values are not getting insert and shows as null.

Could you pls share your exact insert statement and table definition for precise answer? If i have to guess, this may be because of difference in implicit data type conversion by hive and impala.

  • HIVE - If you set hive.metastore.disallow.incompatible.col.type.changes to false, the types of columns in Metastore can be changed from any type to any other type. After such a type change, if the data can be shown correctly with the new type, the data will be displayed. Otherwise, the data will be displayed as NULL. As per documentation forward conversion works(int> bigint) whereas backward (big int > small int) doesnt and produces null.
  • Impala - it supports a limited set of implicit casts to avoid undesired results from unexpected casting behavior. Impala does perform implicit casts among the numeric types, when going from a smaller or less precise type to a larger or more precise one. For example, Impala will implicitly convert a SMALLINT to a BIGINT.

Also number of rows in the partitions (show partitions) show as -1 -
Please run compute stats table_name to fix this issue.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Koushik Roy