Overwriting a file in Azure Data Lake Gen2 from a Synapse Notebook throws an exception
As part of migrating from Azure Databricks to Azure Synapse Analytics Notebooks, I'm facing the issue described below.
I read a CSV file from Azure Data Lake Storage Gen2 into a PySpark dataframe using the following command:
df = spark.read.format('csv').option("delimiter", ",").option("multiline", "true").option("quote", '"').option("header", "true").option("escape", "\\").load(csvFilePath)
After processing this file, we need to overwrite it in place, which we do with the following command:
df.coalesce(1).write.option("delimiter", ",").csv(csvFilePath, mode = 'overwrite', header = 'true')
What this does is delete the existing file at the path "csvFilePath", and then fail with the error "Py4JJavaError: An error occurred while calling o617.csv."
Things I've noticed:
- Once the CSV file at path "csvFilePath" is deleted by the overwrite command, the data in dataframe "df" is also lost.
- It looks like Spark re-reads the source file at write time, whereas in Databricks we did not have this issue and the overwrite ran successfully.
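The behaviour in the bullets above is consistent with Spark's lazy evaluation: the dataframe is a query plan that re-reads the source path only when the write action runs, so deleting the file first destroys the input. Here is a minimal pure-Python sketch of the same pitfall, with a lazy generator standing in for the lazy DataFrame and `list()` standing in for materializing it first (roughly analogous to `df.cache()` followed by an action such as `df.count()` before the overwrite):

```python
import os
import tempfile

# Create a small CSV file to play the role of the file at csvFilePath.
workdir = tempfile.mkdtemp()
csv_path = os.path.join(workdir, "data.csv")
with open(csv_path, "w") as f:
    f.write("id,name\n1,alice\n2,bob\n")

def read_rows(path):
    """Lazy reader: nothing is read until the generator is consumed,
    just as a Spark DataFrame re-reads its source when an action runs."""
    with open(path) as f:
        for line in f:
            yield line.rstrip("\n").split(",")

# Pitfall: delete the source first, then try to consume the lazy reader.
lazy = read_rows(csv_path)
os.remove(csv_path)
try:
    rows = list(lazy)  # fails: the source file is already gone
except FileNotFoundError:
    rows = None

# Fix: materialize BEFORE touching the source file.
with open(csv_path, "w") as f:
    f.write("id,name\n1,alice\n2,bob\n")
materialized = list(read_rows(csv_path))  # fully read into memory
os.remove(csv_path)                       # now the delete is harmless
```

This is only an analogy for the Spark behaviour, not Synapse-specific code; the point is that the data must be materialized before the source path is deleted.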
[Error returned by Synapse Notebook at write command.][1] [1]: https://i.stack.imgur.com/Obj9q.png
Solution 1:
One suggested approach is to mount the data storage rather than accessing it directly by path. Please refer to the documentation below:
https://docs.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-use-databricks-spark
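Independently of mounting, a common way to make such an overwrite safe is to avoid reading and writing the same path in one plan: write the result to a temporary path first, then replace the original. In PySpark this could mean writing with `df.write.csv(tmpPath, ...)` and then moving the output into place (the exact file-utility API depends on your runtime). A sketch of the write-to-temp-then-swap pattern in plain Python, with the hypothetical helper name `safe_overwrite`:

```python
import os
import tempfile

def safe_overwrite(path, new_text):
    """Write to a temp file in the same directory, then atomically
    replace the original, so the source is never deleted before the
    new data has been fully written."""
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(new_text)
        os.replace(tmp, path)  # atomic rename over the original
    finally:
        if os.path.exists(tmp):
            os.remove(tmp)

# Demo: create a file, then overwrite it with "processed" contents.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.csv")
with open(target, "w") as f:
    f.write("id,name\n1,alice\n")
safe_overwrite(target, "id,name\n1,ALICE\n")
with open(target) as f:
    result = f.read()
```

The key design point carries over to Spark: the original file is only removed after the replacement data exists in full, so a failure mid-write never loses the source.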
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | SairamTadepalli-MT |
