Airflow GCSFileTransformOperator source object filename wildcard

I am working on a DAG that should read an XML file, apply some transformations to it, and land the result as a CSV. For this I am using GCSFileTransformOperator.

Example:

    from airflow.providers.google.cloud.operators.gcs import GCSFileTransformOperator

    xml_to_csv = GCSFileTransformOperator(
        task_id='xml_to_csv',
        source_bucket='source_bucket',
        # Hard-coded for one day; the 4-digit suffix changes daily.
        source_object='raw/dt=2022-01-19/File_20220119_4302.xml',
        destination_bucket='destination_bucket',
        destination_object='csv_format/dt=2022-01-19/File_20220119_4302.csv',
        transform_script=['/path_to_script/transform_script.py'],
    )

My problem is that the filename ends with a 4-digit number that is different each day (File_20220119_4302); the next day the number will be different. I can use templates for the execution date, such as {{ ds }} and {{ ds_nodash }}, but I am not sure what to do about the number. I have tried wildcards like File_20220119_*.xml, with no success.
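
To make the matching I am after concrete, here is a minimal sketch in plain Python, not something the operator does for me. It assumes the object names under the day's prefix have already been listed (for example with `GCSHook.list(bucket_name, prefix=...)` in an upstream task) and picks the one whose name fits the daily pattern. The helper `resolve_daily_object` and the sample listing are hypothetical, and the pattern is only inferred from the single filename shown above.

```python
from fnmatch import fnmatchcase


def resolve_daily_object(object_names, ds, ds_nodash):
    """Return the one object name that fits File_<ds_nodash>_<4 digits>.xml."""
    # Pattern inferred from the example filename; adjust if the real naming differs.
    pattern = f"raw/dt={ds}/File_{ds_nodash}_[0-9][0-9][0-9][0-9].xml"
    matches = [name for name in object_names if fnmatchcase(name, pattern)]
    if len(matches) != 1:
        # Fail loudly rather than transform the wrong file (or none at all).
        raise ValueError(f"expected exactly one match for {pattern}, got {matches}")
    return matches[0]


# Hypothetical listing of the objects under raw/dt=2022-01-19/.
listing = [
    "raw/dt=2022-01-19/File_20220119_4302.xml",
    "raw/dt=2022-01-19/_SUCCESS",
]

source_object = resolve_daily_object(listing, "2022-01-19", "20220119")

# Derive the CSV destination from the resolved source name.
destination_object = (
    source_object.replace("raw/", "csv_format/", 1).rsplit(".", 1)[0] + ".csv"
)
```

The resolved `source_object` and `destination_object` would then have to reach the operator somehow, for instance through XCom and templating. I have not verified that part.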



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
