Athena create table from CSV file (S3 bucket) with semicolon
I'm trying to create a table from an S3 bucket that contains a CSV file. Because of the regional settings, the CSV uses a semicolon as the separator, and one column (CRM) even contains commas.
Input CSV file:
Name;Phone;CRM;Desk;Rol
First Name;f.name;Name, First;IT;Inbel
First2 Name2;f2.name2;Name2, First2;IT;Inbel
First3 Name3;f3.name3;Name3, First3;IT;Inbel
First4 Name4;f4.name4;Name4, First4;IT;Inbel
Athena query:
CREATE EXTERNAL TABLE IF NOT EXISTS `a`.`test` (
`Name` string,
`Phone` string,
`CRM` string,
`Desk` string,
`Rol` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
'serialization.format' = ',',
'field.delim' = ','
) LOCATION 's3://***/test/'
TBLPROPERTIES ('has_encrypted_data'='false');
The output comes out as:
Name;Phone;CRM;Desk;Rol
First Name;f.name;Name First;IT;Inbel
First2 Name2;f2.name2;Name2 First2;IT;Inbel
First3 Name3;f3.name3;Name3 First3;IT;Inbel
First4 Name4;f4.name4;Name4 First4;IT;Inbel
I tried searching the web for solutions (especially for the separator), but nothing seems to work. I don't want to change the regional settings and would love to keep the input file as is. Also, if someone knows a solution for the CRM column, that would be a bonus!
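For reference, one direction that may be worth trying is a minimal sketch, not a confirmed fix. The query above tells LazySimpleSerDe to split on `,`, while the file is delimited by `;`, so each row is cut at the comma inside the CRM value instead of at the semicolons. The variant below keeps the same table, columns, and (elided) location, and only changes the delimiter to `;`. The `skip.header.line.count` property is an addition that assumes the first line of the file is a header that should not appear as data.

```sql
-- Sketch only: same table definition as in the question, but splitting on ';'.
-- With ';' as the field delimiter, commas are ordinary characters, so a value
-- such as "Name, First" should stay in the CRM column.
CREATE EXTERNAL TABLE IF NOT EXISTS `a`.`test` (
  `Name` string,
  `Phone` string,
  `CRM` string,
  `Desk` string,
  `Rol` string
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
WITH SERDEPROPERTIES (
  'serialization.format' = ';',
  'field.delim' = ';'
)
LOCATION 's3://***/test/'
TBLPROPERTIES (
  'has_encrypted_data' = 'false',
  -- Assumption: the first line (Name;Phone;CRM;Desk;Rol) is a header to skip.
  'skip.header.line.count' = '1'
);
```

Because the statement uses `IF NOT EXISTS`, it does nothing if the table from the earlier attempt is still defined. In that case the old definition would have to be dropped first (for example with `DROP TABLE a.test`). Dropping an external table removes only the metadata, not the file in S3.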
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
