Error while ingesting data from SFTP to GCS or BigQuery using Cloud Data Fusion
I am trying to move CSV files from an SFTP folder to GCS using Data Fusion, but the pipeline fails and throws the error below.
Here are the properties of both the FTP and GCS plugins. Surprisingly, I can see the data in PREVIEW mode in all the stages, but when I try to deploy the pipeline it fails. I also tried adding a CSVParser transform between the source (FTP) and the sink (GCS), but it shows the same error. I am using the FTP plugin from the Hub, version 3.0.0. Please help me solve this.
The error below appears when I try to deploy the pipeline, even though Preview Data showed the data correctly.
Solution 1:
Well, I have dug into this a lot, and I found that these FTP plugins have known issues when deployed, so at the moment you can't do much about the error itself. Fortunately, there are workarounds. To name a few:
- You can use an older Dataproc image (1.5/1.3), as indicated in the public issue that also references this problem: SFTP Source fails when deployed (SftpException) but not in preview. For more details, check the link to the issue, and don't forget to upvote and leave a comment too.
- Another way is to use the SFTPCopy plugin (once you install it from the Hub, it appears under Conditions and Actions). It copies the file from your SFTP server to a local path, after which you can use the File source to continue processing. There is a small guide on Reading from SFTP and writing to BigQuery.
- This one is a bit extreme, but you can also use a different workflow management platform, such as Airflow, for the file processing.
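If you go the Airflow or custom-script route, the transfer itself is easy to script outside Data Fusion. Below is a minimal sketch, assuming a hypothetical bucket name and folder layout; in a real job, paramiko's `SFTPClient.listdir()` would list the remote files and `google-cloud-storage`'s `Blob.upload_from_filename()` would do the uploads. Only the stdlib path-mapping logic is shown here:

```python
from pathlib import PurePosixPath

def plan_transfer(remote_files, bucket, prefix="landing"):
    """Map CSV files found in an SFTP folder to target GCS object URIs.

    Only *.csv files are transferred; everything else is skipped.
    In a real pipeline, you would fetch `remote_files` with
    paramiko.SFTPClient.listdir() and upload each staged file with
    google.cloud.storage's Blob.upload_from_filename().
    """
    plan = []
    for name in remote_files:
        p = PurePosixPath(name)
        if p.suffix.lower() != ".csv":
            continue  # skip non-CSV files such as readme.txt
        # (remote path, destination GCS URI) pair for the transfer step
        plan.append((name, f"gs://{bucket}/{prefix}/{p.name}"))
    return plan

# Hypothetical SFTP folder contents: two CSVs and one stray file
files = ["sales_2023.csv", "readme.txt", "orders.CSV"]
print(plan_transfer(files, "my-bucket"))
```

The same mapping function can be dropped into an Airflow task, keeping the listing/filtering logic testable separately from the network I/O.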
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Betjens |


