Azure File Share - Recursive Directory Search like os.walk

I am writing a Python script to download files from Azure File Share. The structure of the File Share is as below:

/analytics/Part1/file1.txt
/analytics/Part1/file2.txt
/analytics/mainfile.txt
/analytics/Part1/Part1_1/file11.txt

I tried the following lines in my script, but they only list the files and directories directly under the given directory and do not descend into subdirectories.

from azure.storage.fileshare import ShareClient

fileshareclient = ShareClient(
    account_url=args.get('AccountURL'),
    credential=args.get('SASKey'),
    share_name=args.get('FileShare')
)

fileLst = list(
    fileshareclient.list_directories_and_files('analytics')
)

The output is:

/analytics/mainfile.txt  --> File
/analytics/Part1 --> Dir

However, I am looking for something like Python's os.walk() function to walk the directories recursively. Is such a function available in the Azure File Service Python API?



Solution 1:[1]

Honestly, I find it a bit confusing to deal with two different methods that do "the same thing". I prefer to instantiate the directory client with the from_connection_string method, as in the following approach.

Python Requirements

pip install azure-storage-file-share==12.6.0

Python Script

FileShare file listing:

from azure.storage.fileshare import ShareDirectoryClient, ShareFileClient
from typing import Iterator, Union
import os


def list_files(
    dir_path: str,
    share_name: str,
    connection_string: str,
    include_properties: bool = False,
    recursive: bool = True
) -> Iterator[Union[str, dict]]:
    """
    List files from FileShare on Azure Storage Account.

    Parameters
    ----------
    dir_path: str
        Directory path to list files from.
    share_name: str
        FileShare name.
    connection_string: str
        Connection string.
    include_properties: bool
        Specifies whether to return file properties instead of file paths.
    recursive: bool
        Specifies whether to list files recursively.

    Returns
    -------
    files_list: Iterator[Union[str, dict]]
        Iterator over the files found in the FileShare.
        Yields file paths (Iterator[str]) if 'include_properties' is False.
        Otherwise, yields the properties of each file (Iterator[dict]).

    Notes
    -----
    This function yields files only; directories themselves (including empty ones) are not yielded.

    References
    ----------
    .. [1] Method 'list_files': https://stackoverflow.com/a/71088269/16109419
    .. [2] Recursive files listing: https://stackoverflow.com/a/66543222/16109419
    """
    dir_client = ShareDirectoryClient.from_connection_string(
        conn_str=connection_string,
        share_name=share_name,
        directory_path=dir_path
    )

    # Listing files from current directory path:
    for file in dir_client.list_directories_and_files():
        name, is_directory = file['name'], file['is_directory']
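        # Azure file paths always use '/', so avoid os.path.join (it uses '\' on Windows):
        path = f"{dir_path.rstrip('/')}/{name}" if dir_path else name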

        if is_directory:
            if recursive:
                # Listing files recursively:
                yield from list_files(
                    dir_path=path,
                    share_name=share_name,
                    connection_string=connection_string,
                    include_properties=include_properties,
                    recursive=recursive
                )
        else:
            if include_properties:
                file_client = ShareFileClient.from_connection_string(
                    conn_str=connection_string,
                    share_name=share_name,
                    file_path=path
                )

                yield file_client.get_file_properties()
            else:
                yield path

FileShare file listing example:

def main() -> None:
    connection_string = "<your-conn-str>"
    share_name = "<your-share-name>"
    dir_path = ""  # Leave it empty to list files from root directory.

    files_list = list_files(
        dir_path=dir_path,
        share_name=share_name,
        connection_string=connection_string,
        include_properties=False,
        recursive=True
    )

    for i, f in enumerate(files_list, start=1):
        print(i, f)


if __name__ == '__main__':
    main()
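
For a result shaped more like os.walk(), the same listing call can be wrapped in a generator that yields (dirpath, dirnames, filenames) tuples. The following is only a sketch and not part of the SDK: walk_share and azure_list_dir are hypothetical names, and the listing is passed in as a callable so the walker can be tried without a storage account. It assumes each entry supports ['name'] and ['is_directory'] lookups, as the entries in the script above do.

```python
import posixpath
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

# A listing entry only needs 'name' and 'is_directory' keys.
Entry = Dict[str, object]
ListDir = Callable[[str], Iterable[Entry]]


def walk_share(
    list_dir: ListDir,
    top: str = ""
) -> Iterator[Tuple[str, List[str], List[str]]]:
    """Yield (dirpath, dirnames, filenames) top-down, similar to os.walk()."""
    stack = [top]
    while stack:
        dir_path = stack.pop()
        dirnames: List[str] = []
        filenames: List[str] = []

        # Split the entries of this directory into subdirectories and files.
        for entry in list_dir(dir_path):
            if entry["is_directory"]:
                dirnames.append(str(entry["name"]))
            else:
                filenames.append(str(entry["name"]))

        yield dir_path, dirnames, filenames

        # Push in reverse so subdirectories are visited in listing order.
        for name in reversed(dirnames):
            stack.append(posixpath.join(dir_path, name))


def azure_list_dir(share_client) -> ListDir:
    """Hypothetical adapter: wrap a ShareClient as a list_dir callable."""
    def list_dir(dir_path: str):
        # An empty path is passed as None to list the share root.
        return share_client.list_directories_and_files(dir_path or None)
    return list_dir
```

With a real share, calling walk_share(azure_list_dir(fileshareclient), "analytics") should visit analytics and every directory below it, one tuple per directory. An explicit stack is used instead of recursion so that deeply nested shares do not hit Python's recursion limit.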

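Since the original goal was to download the files, the paths produced by a recursive listing such as list_files can be mirrored into a local folder. This is a rough sketch under stated assumptions: local_path_for, download_all and azure_fetch are hypothetical helper names, the fetch step is passed in as a callable so the copying logic can be exercised without Azure, and each file is read fully into memory, which may not suit very large files.

```python
import os
from typing import Callable, Iterable, List


def local_path_for(remote_path: str, dest_root: str) -> str:
    """Map a '/'-separated share path to a path under dest_root."""
    parts = [p for p in remote_path.split("/") if p]
    return os.path.join(dest_root, *parts)


def download_all(
    remote_paths: Iterable[str],
    fetch: Callable[[str], bytes],
    dest_root: str
) -> List[str]:
    """Write every remote file under dest_root, keeping the directory layout."""
    written = []
    for remote_path in remote_paths:
        target = local_path_for(remote_path, dest_root)
        parent = os.path.dirname(target)
        if parent:
            # Create the local directories that mirror the share structure.
            os.makedirs(parent, exist_ok=True)
        with open(target, "wb") as fh:
            fh.write(fetch(remote_path))
        written.append(target)
    return written


def azure_fetch(connection_string: str, share_name: str) -> Callable[[str], bytes]:
    """Hypothetical adapter: build a fetch callable backed by ShareFileClient."""
    def fetch(remote_path: str) -> bytes:
        # Imported lazily so the helpers above work without the SDK installed.
        from azure.storage.fileshare import ShareFileClient

        file_client = ShareFileClient.from_connection_string(
            conn_str=connection_string,
            share_name=share_name,
            file_path=remote_path
        )
        return file_client.download_file().readall()
    return fetch
```

A call such as download_all(list_files("analytics", share_name, connection_string), azure_fetch(connection_string, share_name), "downloads") would then be expected to recreate the analytics tree under a local downloads folder.
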
Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
