How to read a schema from a config table and attach it to a PySpark DataFrame?
I have 50 tables on my on-premises server and I want to migrate them to Delta tables in Databricks. Each table has its own schema, but I need to design a single ADF pipeline that moves all fifty tables from on-premises to Delta tables.
How do I attach the schema to the DataFrame at run time based on the table name?
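For illustration, here is a minimal sketch of the kind of runtime schema lookup the question describes. The config table name (`schema_config`), its columns (`table_name`, `column_name`, `data_type`, `ordinal_position`) and the landing path are assumptions, not part of the original question; it relies on the fact that `spark.read.schema()` also accepts a DDL-formatted string.

```python
# Sketch: build a schema at run time from a hypothetical config table `schema_config`
# whose columns are table_name, column_name, data_type (Spark SQL type strings such as
# 'int', 'string', 'decimal(10,2)') and ordinal_position. All names are assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()  # already available in a Databricks notebook

def ddl_schema_for(table_name: str) -> str:
    """Return a DDL-formatted schema string, e.g. 'id INT, name STRING'."""
    rows = (
        spark.table("schema_config")
        .where(F.col("table_name") == table_name)
        .orderBy("ordinal_position")
        .collect()
    )
    return ", ".join(f"{r['column_name']} {r['data_type']}" for r in rows)

# spark.read.schema() accepts a DDL string, so the schema can be attached at run time
# based on the table name. The landing path below is a placeholder.
df = spark.read.schema(ddl_schema_for("customers")).csv("/mnt/landing/customers/")
```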
Solution 1:[1]
I would use mapping data flows for this scenario:
- Create a static table/file with the list of your tables.
- Add a ForEach loop to the ADF pipeline.
- Within the ForEach, call a mapping data flow.
- As the source, provide your on-premises database (the schema is detected automatically).
- Use a Delta table as the destination (sink).
Mapping data flows are Spark-based, so in the "Projection" tab you can see that each column has already been translated to its Spark type.
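For comparison, a rough sketch of the same ForEach-style loop driven from a Databricks notebook instead of a mapping data flow is shown below. The table-list table, JDBC URL, secret scope and key names are placeholders, and it assumes Databricks has network connectivity to the on-premises SQL Server; `dbutils` is available in Databricks notebooks.

```python
# Sketch of a notebook-driven loop over the table list, mirroring the ForEach +
# source + Delta sink structure of the mapping data flow approach. All connection
# details and names below are placeholders/assumptions.
tables = [r["table_name"] for r in spark.table("table_list").collect()]

for table_name in tables:
    source_df = (
        spark.read.format("jdbc")
        .option("url", "jdbc:sqlserver://<on-prem-host>:1433;databaseName=<db>")
        .option("dbtable", table_name)
        .option("user", dbutils.secrets.get(scope="my-scope", key="sql-user"))
        .option("password", dbutils.secrets.get(scope="my-scope", key="sql-password"))
        .load()  # the JDBC reader picks up each table's schema from the source
    )
    # Write each source table out as a Delta table of the same name.
    source_df.write.format("delta").mode("overwrite").saveAsTable(table_name)
```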
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Lukas U-ski |
