'How to load Excel file data into Snowflake table

There is a use case to load Excel file (.xls, .xlsx) into Snowflake.

Using SnowSQL PUT command I'm able to load the file to Stage (User Stage) and it works fine till this point. Next step is to load from User Stage to Snowflake Table using COPY INTO command. Issue is that there is no FILE_FORMAT available in Snowflake to specify XLS type data.

I don't want to convert XLS in CSV format which is available option for FILE_FORMAT type. Any clue how this can be done without modifying source Excel file?



Solution 1:[1]

Snowflake doesn't support loading data directly from Excel files as of this date. You have to convert the source file to a supported format. https://docs.snowflake.com/en/sql-reference/sql/create-file-format.html

Solution 2:[2]

I agree that converting data to CSV is error prone. While not "native" to the Snowflake options, Snowflake does have a python connector and there are a number of python excel libraries. Here's a Snowflake reference on this: python-connector-pandas And one library example: pandas.read_excel

Given your subsequent comments about multiple sheets within the excel workbooks and the number of files, I think a programmatic solution is the best option.

Solution 3:[3]

Option 1:

  1. convert your file to CSV 2.put your file into data-lake or s3bucket
  2. create file format
  3. load your csv file using copy command https://docs.snowflake.com/en/sql-reference/sql/copy-into-table.html

Option 2:

  1. convert excel to csv.

  2. go to UI create file format fileformat

  3. click on the the table-->load table load table

  4. select file Select picture from computer

  5. choose file format Choose newly created file format

  6. Load option Choose your best options

Solution 4:[4]

First you have to process your excel file using Python, then load the data into Snowflake. Please use the code below:

from sqlalchemy import create_engine
import pandas as pd
snowflake_username = 'username'
snowflake_password = 'password'
snowflake_account = 'accoutname'
snowflake_warehouse = 'warehouse'
snowflake_database = 'database'
snowflake_schema = 'public'

engine = create_engine(
'snowflake://{user}:{password}@{account}/{db}/{schema}?warehouse= 
    {warehouse}'.format(
     user=snowflake_username,
     password=snowflake_password,
     account=snowflake_account,
     db=snowflake_database,
     schema=snowflake_schema,
     warehouse=snowflake_warehouse,
    ),echo_pool=True, pool_size=10, max_overflow=20
)

try:
 connection = engine.connect()

 df_sensor.columns = map(str.upper, df_sensor.columns)
 df_sensor.to_sql('tb_equipments'.lower(), con=connection, 
 schema='public', index=False, if_exists='append', chunksize=16000)
 results = connection.execute('select count(1) from 
 tb_equipments').fetchone()

print('\nTotal de linhas inseridas: ',results[0], '\n')

finally: connection.close() engine.dispose()

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Gani Simsek
Solution 2 Beege
Solution 3 Anil Dhakal
Solution 4 Jayron Soares