Mapping mixed case Parquet to Snowflake with ADF
I'm trying to solve an issue with mapping mixed-case fields in a Parquet file to Snowflake, which stores unquoted column names in uppercase.
I'm using Azure Data Factory's copy activity to load a Parquet file into Snowflake. For example, my Parquet schema is (ID, CustomerName, CustomerType). When loading to Snowflake, the ID column is populated, but CustomerName and CustomerType are empty.
Keeping in mind that my source casing can change unexpectedly (e.g. CustomerName -> customername), has anyone seen and solved this issue before?
Solution 1:[1]
If your Parquet files are already in blob storage, you should be able to copy them directly into a Snowflake table with the
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
copy option. https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
Example:
copy into example_table from @example_stage/table_folder/
FILE_FORMAT = (TYPE = PARQUET) -- required unless a Parquet file format is already set on the stage
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE -- match Parquet column names to Snowflake table column names
PATTERN = '.*parquet' -- optional, only copy files whose names end in "parquet"
PURGE = TRUE -- optional, deletes successfully copied files from blob storage (if privileged to do so)
;
ADF now has the Script activity, so you can orchestrate this COPY statement there.
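As a rough sketch, the Script activity in the pipeline JSON could wrap the COPY statement like this; the activity name, linked service name, and stage/table names are placeholders, so adjust them to your environment:

{
    "name": "CopyParquetIntoSnowflake",
    "type": "Script",
    "linkedServiceName": {
        "referenceName": "SnowflakeLinkedService",
        "type": "LinkedServiceReference"
    },
    "typeProperties": {
        "scripts": [
            {
                "type": "Query",
                "text": "copy into example_table from @example_stage/table_folder/ FILE_FORMAT = (TYPE = PARQUET) MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE"
            }
        ]
    }
}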
Side note: Snowflake has schema detection for Parquet, so you can even generate tables directly from your Parquet files.
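A minimal sketch of that schema detection using Snowflake's INFER_SCHEMA table function, which can template a table from staged Parquet files; the stage and file format names here (example_stage, my_parquet_format) are placeholders:

create file format my_parquet_format type = parquet; -- INFER_SCHEMA requires a named file format
create table example_table
    using template (
        select array_agg(object_construct(*)) -- aggregate the detected column definitions
        from table(
            infer_schema(
                location => '@example_stage/table_folder/',
                file_format => 'my_parquet_format'
            )
        )
    ); -- creates the table with column names and types detected from the Parquet files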
Solution 2:[2]
The ADF Snowflake connector utilizes Snowflake's COPY INTO [table] command to achieve the best performance. It supports writing data to Snowflake on Azure.
| Property | Description | Required |
|---|---|---|
| importSettings | Advanced settings used to write data into Snowflake. You can configure the ones supported by the COPY INTO command, which the service will pass through when you invoke the statement. | No |
Option: MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
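A minimal sketch of how the copy activity sink could pass that option through importSettings, assuming the connector forwards it via the additionalCopyOptions pass-through block (verify the option is accepted in your ADF version):

"sink": {
    "type": "SnowflakeSink",
    "importSettings": {
        "type": "SnowflakeImportCopyCommand",
        "additionalCopyOptions": {
            "MATCH_BY_COLUMN_NAME": "CASE_INSENSITIVE"
        }
    }
}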
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | cleveralias |
| Solution 2 | Lukasz Szozda |
