'How to load Excel file data into Snowflake table
There is a use case to load Excel file (.xls, .xlsx) into Snowflake.
Using SnowSQL PUT command I'm able to load the file to Stage (User Stage) and it works fine till this point. Next step is to load from User Stage to Snowflake Table using COPY INTO command. Issue is that there is no FILE_FORMAT available in Snowflake to specify XLS type data.
I don't want to convert XLS in CSV format which is available option for FILE_FORMAT type. Any clue how this can be done without modifying source Excel file?
Solution 1:[1]
Snowflake doesn't support loading data directly from Excel files as of this date. You have to convert the source file to a supported format. https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html
Solution 2:[2]
I agree that converting data to CSV is error prone. While not "native" to the Snowflake options, Snowflake does have a python connector and there are a number of python excel libraries. Here's a Snowflake reference on this: python-connector-pandas And one library example: pandas.read_excel
Given your subsequent comments about multiple sheets within the excel workbooks and the number of files, I think a programmatic solution is the best option.
Solution 3:[3]
Option 1:
- convert your file to CSV 2.put your file into data-lake or s3bucket
- create file format
- load your csv file using copy command https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html
Option 2:
Solution 4:[4]
First you have to process your excel file using Python, then load the data into Snowflake. Please use the code below:
from sqlalchemy import create_engine
import pandas as pd
snowflake_username = 'username'
snowflake_password = 'password'
snowflake_account = 'accoutname'
snowflake_warehouse = 'warehouse'
snowflake_database = 'database'
snowflake_schema = 'public'
engine = create_engine(
'snowflake://{user}:{password}@{account}/{db}/{schema}?warehouse=
{warehouse}'.format(
user=snowflake_username,
password=snowflake_password,
account=snowflake_account,
db=snowflake_database,
schema=snowflake_schema,
warehouse=snowflake_warehouse,
),echo_pool=True, pool_size=10, max_overflow=20
)
try:
connection = engine.connect()
df_sensor.columns = map(str.upper, df_sensor.columns)
df_sensor.to_sql('tb_equipments'.lower(), con=connection,
schema='public', index=False, if_exists='append', chunksize=16000)
results = connection.execute('select count(1) from
tb_equipments').fetchone()
print('\nTotal de linhas inseridas: ',results[0], '\n')
finally: connection.close() engine.dispose()
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Gani Simsek |
| Solution 2 | Beege |
| Solution 3 | Anil Dhakal |
| Solution 4 | Jayron Soares |





