Spark SQL single quote error
I have a DataFrame (Apache Spark 1.5). I want to add a new column through the Spark SQL context so that every row in that column contains a single quote character.
My code:
df.registerTempTable("tempdf");
df = df.sqlContext().sql("SELECT *, \" \\\" \" as quoteCol FROM tempdf");
After execution, Spark throws the following exception:
Exception in thread "main" java.lang.RuntimeException: [1.44] failure: ``union'' expected but ErrorToken(end of input) found
SELECT *, " \" " as quoteCol FROM tempdf
^
at scala.sys.package$.error(package.scala:27)
at org.apache.spark.sql.catalyst.AbstractSparkSQLParser.parse(AbstractSparkSQLParser.scala:36)
at org.apache.spark.sql.catalyst.DefaultParserDialect.parse(ParserDialect.scala:67)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
at org.apache.spark.sql.SQLContext$$anonfun$2.apply(SQLContext.scala:211)
at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:114)
at org.apache.spark.sql.execution.SparkSQLParser$$anonfun$org$apache$spark$sql$execution$SparkSQLParser$$others$1.apply(SparkSQLParser.scala:113)
at scala.util.parsing.combinator.Parsers$Success.map(Parsers.scala:137)
...
The following code works correctly and adds a new column containing a single character:
df.registerTempTable("tempdf");
df = df.sqlContext().sql("SELECT *, \" q \" as quoteCol FROM tempdf");
What am I doing wrong?
Solution 1:[1]
SQL strings should use single quotes:
sqlContext().sql("SELECT *, '\"' AS quoteCol FROM tempdf");
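To see why the original query fails, it can help to print the SQL text that actually reaches the parser once Java has processed its own escapes. The sketch below is plain Java with no Spark dependency; the class and method names are hypothetical and exist only for illustration. It assumes, as the "end of input" error suggests, that the Spark 1.5 parser does not treat a backslash as an escape inside a string literal, so the third double quote in the original query opens a literal that is never closed.

```java
public class QuoteQueryDemo {
    // SQL text from the question, after Java has resolved its own escapes.
    static String questionQuery() {
        return "SELECT *, \" \\\" \" as quoteCol FROM tempdf";
    }

    // SQL text from Solution 1: a single-quoted SQL literal holding one double quote.
    static String solutionQuery() {
        return "SELECT *, '\"' AS quoteCol FROM tempdf";
    }

    // Counts how often character c occurs in the SQL text.
    static int count(String sql, char c) {
        int n = 0;
        for (int i = 0; i < sql.length(); i++) {
            if (sql.charAt(i) == c) {
                n++;
            }
        }
        return n;
    }

    public static void main(String[] args) {
        // The question's query contains an odd number of double quotes.
        System.out.println(questionQuery());
        System.out.println("double quotes: " + count(questionQuery(), '"'));

        // Solution 1 contains a balanced pair of single quotes around one double quote.
        System.out.println(solutionQuery());
        System.out.println("single quotes: " + count(solutionQuery(), '\''));
    }
}
```

With single quotes delimiting the SQL literal, the only escaping left is the Java-level `\"`, and the parser receives a balanced literal.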
Solution 2:[2]
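If you are using Scala, a triple-quoted string is a better option: quotes inside it do not need to be escaped, which improves readability. Because no escape processing happens inside a triple-quoted string, the double quote is written without a backslash:
dfs = sqlContext().sql("""SELECT *, '"' AS quoteCol FROM tempdf""");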
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | F. Müller |
