Syntax error writing Spark dataframe to Impala table
I want to write a Spark dataframe to an Impala table, but the write fails with a syntax error in the column definitions.
This is an example of the code in PySpark:
df = spark.createDataFrame(
    [
        (1, 'row1'),
        (2, 'row2'),
    ],
    ['col1', 'col2']
)
(
    df.write.format('jdbc')
    .option('url', 'jdbc:impala://172.25.0.1:21050/default')
    .option('driver', 'com.cloudera.impala.jdbc41.Driver')
    .option('dbtable', 'test')
    .save()
)
And this is the error that appears:
Caused by: com.cloudera.impala.support.exceptions.GeneralException: [Cloudera][ImpalaJDBCDriver](500051) ERROR processing query/statement. Error Code: 0, SQL state: TStatus(statusCode:ERROR_STATUS, sqlState:HY000, errorMessage:ParseException: Syntax error in line 1:
CREATE TABLE test ("col1" BIGINT , "col2" TEXT )
^
Encountered: STRING LITERAL
Expected: DEFAULT, IDENTIFIER
CAUSED BY: Exception: Syntax error
), Query: CREATE TABLE test ("col1" BIGINT , "col2" TEXT ).
... 50 more
I have tried numerous alternatives for creating the Spark dataframe and several JDBC write modes, but I always get this error. How can I solve this?
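For reference, the statement in the error quotes the column names with double quotes and uses the TEXT type. Impala reads a double-quoted token as a string literal (which matches "Encountered: STRING LITERAL"), quotes identifiers with backticks, and has STRING rather than TEXT. The sketch below only rebuilds the two DDL strings in plain Python to make that difference visible; `build_create_table` is a hypothetical helper, not part of Spark or the Impala driver, and the TEXT to STRING mapping is an assumption about what the target table should use.

```python
# Hypothetical helper for illustration only: it does not talk to Spark or
# Impala, it just formats a CREATE TABLE string from (name, type) pairs.
def build_create_table(table, columns, quote='"', type_map=None):
    type_map = type_map or {}
    cols = " , ".join(
        f"{quote}{name}{quote} {type_map.get(sql_type, sql_type)}"
        for name, sql_type in columns
    )
    return f"CREATE TABLE {table} ({cols} )"


# Column names and types as they appear in the error message.
columns = [("col1", "BIGINT"), ("col2", "TEXT")]

# The statement from the error: double-quoted names, TEXT type.
generated = build_create_table("test", columns)

# A form Impala should be able to parse: backtick-quoted names and STRING
# instead of TEXT (assumed mapping).
impala_style = build_create_table(
    "test", columns, quote="`", type_map={"TEXT": "STRING"}
)

print(generated)
print(impala_style)
```

The quoting and type mapping are decided inside Spark's JDBC writer, so changing how the dataframe is built or which write mode is used would not be expected to alter the generated statement.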
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
