Is it possible to combine multiple input files with different schemas using Schema Drift / Dynamic Columns?

I have around 20 tab-separated input files. Each has roughly 500 columns, but the set of columns differs slightly from file to file.

The sink output schema is known and will contain all the possible input columns.

As a simplified example:

File 1

Name	Age	DOB	Nationality
Bob	21	01/01/1972	British

File 2

Name	Nationality	NINO
Joe	British	AA995654A

File 3

Name	DOB	Nationality
Sam	01/01/1990	British

Is it possible to have a single Data Flow with multiple inputs, whose schemas are not known until runtime, that copes with changes in the input files and, in this case, outputs:

Name	Age	DOB	NINO	Nationality
Bob	21	01/01/1972	NULL	British
Joe	NULL	NULL	AA995654A	British
Sam	NULL	01/01/1990	NULL	British

I have looked at column pattern matching and schema drift, but don't see how/if it is possible to achieve this.
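
To make the desired result concrete, here is a minimal sketch in plain Python of the "union by column name" behaviour I am after. It is not Data Flow code and uses no Data Factory API; the names `SINK_COLUMNS`, `align_to_sink` and `combine` are made up for illustration, and the three inputs are the sample files above held as strings.

```python
import csv
import io

# Known sink schema from the example: every possible input column, in output order.
SINK_COLUMNS = ["Name", "Age", "DOB", "NINO", "Nationality"]


def align_to_sink(tsv_text, sink_columns=SINK_COLUMNS, null="NULL"):
    """Read one tab-separated input and yield its rows padded to the sink schema."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        aligned = []
        for col in sink_columns:
            value = row.get(col)
            # A column that this input does not have becomes the NULL marker.
            aligned.append(null if value is None else value)
        yield aligned


def combine(inputs):
    """Stack all inputs under one header row, matching columns by name."""
    rows = [list(SINK_COLUMNS)]
    for text in inputs:
        rows.extend(align_to_sink(text))
    return rows


# The three sample files from the question (hypothetical in-memory stand-ins).
FILE_1 = "Name\tAge\tDOB\tNationality\nBob\t21\t01/01/1972\tBritish\n"
FILE_2 = "Name\tNationality\tNINO\nJoe\tBritish\tAA995654A\n"
FILE_3 = "Name\tDOB\tNationality\nSam\t01/01/1990\tBritish\n"

if __name__ == "__main__":
    for row in combine([FILE_1, FILE_2, FILE_3]):
        print("\t".join(row))
```

The point of the sketch is that each input is matched to the sink by column name rather than by position, so a file may add, drop or reorder columns without any change to the logic. That is the behaviour I would like the Data Flow to reproduce.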

azure-data-factory azure-data-flow

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
