Reading parquet files from different folders inside an Azure storage container in PySpark
I need to read parquet files from multiple directories inside an Azure storage container.
for example,
Container1
    folder1
        parquet1
        parquet2
        ..
    folder2
        parquet5
        parquet6
        ...
I know these folders are virtual in blob storage, but how do I read all these parquet files and load them into one DataFrame?
paths = [f"wasbs://{container}@{storageAccountName}.blob.core.windows.net/pathToFile1",
         f"wasbs://{container}@{storageAccountName}.blob.core.windows.net/pathToFile2"]
df = spark.read.parquet(*paths)
Will the above code work?
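Yes: `spark.read.parquet` accepts multiple path arguments and reads them into a single DataFrame, provided the files share a compatible schema. A minimal sketch of the two common approaches, using hypothetical container/account names (`container1`, `mystorageaccount`) and the folder layout from the question:

```python
# Hypothetical names for illustration; substitute your own.
container = "container1"
account = "mystorageaccount"
base = f"wasbs://{container}@{account}.blob.core.windows.net"

# Option 1: pass each folder explicitly; Spark unions them into one DataFrame.
paths = [f"{base}/folder1", f"{base}/folder2"]
# df = spark.read.parquet(*paths)

# Option 2: a wildcard matches every folder under the container root,
# so you don't have to list folders by hand.
# df = spark.read.parquet(f"{base}/folder*")
```

The Spark reads are commented out here because they need a live `SparkSession` with the Azure storage credentials configured (e.g. the `fs.azure.account.key.<account>.blob.core.windows.net` setting). Note that both options assume the folders contain parquet files with compatible schemas; if schemas drift between folders, consider `spark.read.option("mergeSchema", "true").parquet(*paths)`.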
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow