AWS Athena query specifying linebreak and quotation character
I have the following query, which creates an Athena table from existing files in S3. It defines the linebreak character and how null values are represented:
CREATE EXTERNAL TABLE IF NOT EXISTS table_name(
`field1` STRING,
`field2` STRING
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
NULL DEFINED AS ' '
LOCATION 's3://bucket/prefix/'
TBLPROPERTIES ('skip.header.line.count'='1')
Now I also want to specify the quotation character, but I don't see any property for that in this syntax.
I tried WITH SERDEPROPERTIES as shown below, which lets me set quoteChar, but then I cannot find any SerDe property to define the linebreak or the null handling.
CREATE EXTERNAL TABLE IF NOT EXISTS table_name(
`field1` STRING,
`field2` STRING
)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES ('separatorChar' = ',', 'quoteChar' = '"')
LOCATION 's3://bucket/prefix/'
TBLPROPERTIES ('skip.header.line.count'='1')
Is there any way to specify the quotation character, field delimiter, linebreak, and null handling together?
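For illustration, one direction that might cover all four requirements is to keep the OpenCSVSerde table from the second query and handle nulls in a view on top of it. This is only a sketch under assumptions: the view name `table_name_clean` is hypothetical, and it assumes that a single space is the null marker, as in `NULL DEFINED AS ' '` above. As far as I can tell, Hive only accepts `'\n'` for `LINES TERMINATED BY`, and OpenCSVSerde appears to assume newline-terminated rows as well, so the linebreak may not need a separate property.

```sql
-- Sketch only: table_name_clean is a hypothetical view name.
-- It reads from the OpenCSVSerde table defined above, which already
-- handles the field delimiter (separatorChar) and quotes (quoteChar).
CREATE OR REPLACE VIEW table_name_clean AS
SELECT
  -- NULLIF returns NULL when the value equals the assumed null marker ' '
  NULLIF(field1, ' ') AS field1,
  NULLIF(field2, ' ') AS field2
FROM table_name;
```

Queries would then select from `table_name_clean` instead of `table_name`. The trade-off is that the null rule lives in the view rather than in the table definition.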
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow