Can 2 Spark jobs use a single HDFS/S3 storage simultaneously?

I'm a beginner with Spark. Can I have two Spark jobs use a single HDFS/S3 store at the same time? One job would write the latest data to S3/HDFS, and the other would read that data, along with input from another source, for analysis.



Solution 1:

In order to use both file systems, you need to include the protocol (URI scheme) in each path, e.g. `spark.read.load("s3a://bucket/file")` and/or `df.write.save("hdfs:///tmp/data")`.

However, you can also use S3 directly in place of HDFS by setting `fs.defaultFS` in the Hadoop configuration, so that paths without a scheme resolve against S3.
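As a configuration fragment, the `fs.defaultFS` override can be passed at submit time via Spark's `spark.hadoop.*` prefix (the bucket name below is a placeholder):

```shell
# Make s3a:// the default filesystem, so unqualified paths resolve to S3
# instead of HDFS. Requires the hadoop-aws/S3A connector on the classpath.
spark-submit \
  --conf spark.hadoop.fs.defaultFS=s3a://my-bucket \
  my_job.py
```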

Sources

Source: Stack Overflow, licensed under CC BY-SA 3.0 per its attribution requirements.
