Automate pulling JSON files from S3 and pushing them to PySpark for ETL

Log files will be dropped into S3 at some interval. I want to automate picking up the new files from S3 and feeding them into my PySpark ETL code. Can Spark Streaming watch an S3 location for new files, and how would I do that in Python?
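One possible approach, sketched here under assumptions, is Spark Structured Streaming's file source, which treats files newly added under a directory as new stream input. The bucket name, prefixes, schema fields, and the `run_etl` body below are all hypothetical placeholders, not taken from the question. The sketch also assumes the cluster already has the `hadoop-aws` connector and AWS credentials configured so that `s3a://` paths resolve.

```python
# Hypothetical names throughout: bucket, prefixes, and schema are assumptions.
BUCKET = "my-log-bucket"


def s3a_uri(bucket: str, prefix: str) -> str:
    """Build an s3a:// directory URI from a bucket name and a key prefix."""
    return f"s3a://{bucket}/{prefix.strip('/')}/"


def run_etl(batch_df, batch_id):
    """Stand-in for the existing ETL code; called once per micro-batch."""
    # batch_df is an ordinary (non-streaming) DataFrame holding the new files' rows.
    batch_df.write.mode("append").parquet(s3a_uri(BUCKET, "etl/output"))


def start_stream(spark, input_path: str, checkpoint_path: str):
    """Start a streaming query that picks up new JSON files under input_path."""
    from pyspark.sql.types import StructType, StructField, StringType

    # Streaming file sources need an explicit schema by default.
    # These fields are assumed for illustration only.
    schema = StructType([
        StructField("timestamp", StringType(), True),
        StructField("level", StringType(), True),
        StructField("message", StringType(), True),
    ])

    stream_df = (
        spark.readStream
        .schema(schema)
        .option("maxFilesPerTrigger", 100)  # cap files read per micro-batch
        .json(input_path)
    )

    return (
        stream_df.writeStream
        .foreachBatch(run_etl)
        # The checkpoint records which files were already processed.
        .option("checkpointLocation", checkpoint_path)
        .trigger(processingTime="1 minute")
        .start()
    )


def main():
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("s3-json-etl").getOrCreate()
    query = start_stream(
        spark,
        s3a_uri(BUCKET, "incoming/logs"),
        s3a_uri(BUCKET, "etl/checkpoints"),
    )
    query.awaitTermination()
```

Calling `main()` (for example from a script submitted with `spark-submit`) would keep the query running and hand each batch of newly arrived files to `run_etl`. Because the file source lists the directory on each trigger, this may get slow on prefixes holding very many objects; whether that matters depends on the log volume.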

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
