Write only a few columns of a PySpark DataFrame to a Snowflake table
I have a PySpark DataFrame as below:
>>> l = [('XYZ', '324 NW', 'VA'), ('ABC', '323 NW', 'VA'), ('CVB', '314 NW', 'VA')]
>>> df = spark.createDataFrame(l, ('Name', 'Address', 'State'))
Name Address State
XYZ 324 NW VA
ABC 323 NW VA
CVB 314 NW VA
CREATE OR REPLACE TABLE TEST
(
NAME VARCHAR(50),
ADDRESS VARCHAR(50),
STATE VARCHAR(50));
When I try to write this entire PySpark DataFrame to the TEST table using the below code, it works perfectly fine:
df.write.format(SNOWFLAKE_SOURCE_NAME).options(**snowflake_options).option("dbtable", 'TEST').mode('append').save()
How do I write only the 'Address' and 'State' columns to the Snowflake table, ignoring the 'Name' column?
I also tried the below code, but it keeps giving me an invalid syntax error:
df.write.format(SNOWFLAKE_SOURCE_NAME).options(**snowflake_options)
    .option("dbtable", 'TEST')
    .option("columnmap", Map("Address" -> "ADDRESS", "State" -> "STATE").toString())
    .mode('append').save()
Solution 1:[1]
Just select the two columns you want to save:
(df
    .select('Address', 'State')  # add this
    .write
    .format(SNOWFLAKE_SOURCE_NAME)
    .options(**snowflake_options)
    .option('dbtable', 'TEST')
    .mode('append')
    .save()
)
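As an aside on the failed attempt in the question: the Snowflake Spark connector does support a columnmap option, but in Python it is passed as a plain string in the form "Map(col_1 -> col_a, ...)", not as a Scala Map object, which is the likely source of the syntax error. A minimal sketch, assuming SNOWFLAKE_SOURCE_NAME and snowflake_options hold a valid connector name and connection configuration as in the question:

```python
# Sketch only; requires a live Snowflake connection and the Spark-Snowflake
# connector on the classpath. SNOWFLAKE_SOURCE_NAME and snowflake_options
# are assumed to be defined as in the question.
(df
    .select('Address', 'State')        # drop 'Name' before writing
    .write
    .format(SNOWFLAKE_SOURCE_NAME)
    .options(**snowflake_options)
    .option('dbtable', 'TEST')
    # In Python, columnmap is a plain string, not a Scala Map:
    .option('columnmap', 'Map(Address -> ADDRESS, State -> STATE)')
    .mode('append')
    .save()
)
```

The chained calls are wrapped in parentheses so the expression can span multiple lines without backslashes, which also avoids the "invalid syntax" error from the unparenthesized multi-line chain in the question.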
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | pltc |
