Excel is everywhere! As a data professional, when I hear that the source data is coming from Excel – it causes a visceral reaction knowing there are a lot of potential pitfalls in getting the data from Excel into any database system. Loading an Excel file to Snowflake is no picnic, or is it? Sure, we could use an ETL tool to load the data from Excel to a Snowflake table; however, if we leverage several Python libraries that are pre-loaded into Snowflake, we can load the Excel file in just a few simple steps – all within Snowflake.
Prep Work
The code samples shown assume the desired Excel file is already loaded into an internal Snowflake stage and the user executing the code has proper use/table creation privileges on the stage and target database schema in Snowflake. For the purposes of the demo, I used a sample Excel file from thespreadsheetguru.com
If you’re working with a new Snowflake demo account and need to setup an environment to match the next few code samples, the initial setup script for my environment is on my GitHub Repo.
Confirm Excel File Location
First, let’s confirm our desired Excel file is loaded to our demo stage location. Running the query below will list the files in my Snowflake stage.
USE ROLE ST_DEMO_ROLE;
USE SCHEMA ST_DEMO_DB.ST_INTERNAL_STAGES;
USE WAREHOUSE ST_DEMO_XS_WH;
LIST @ST_DEMO_STAGE;
Python Worksheet
Next, create a new Python worksheet in Snowsight. We’ll need to include a few packages first, so on the Packages drop down, under “Anaconda Packages” add pandas and openpyxl
Next, we’ll add pandas and os to our list of imports at the top of our script. The os library will be used to assist with joining file paths – not necessary, but helpful to simplify code.
import snowflake.snowpark as snowpark
from snowflake.snowpark.functions import col
import pandas as pd
import os
Capture Excel Sheet Names
Now, we’re ready to read use pandas to read our Excel file from the stage location. To confirm we have captured the Excel file properly, we’ll temporarily return a Snowpark dataframe containing the sheet names in the file. In this case, the file only has one sheet named “Data”.
def main(session: snowpark.Session):
#SET THE FULL STAGE PATH
str_stg_name = "@ST_DEMO_DB.ST_INTERNAL_STAGES.ST_DEMO_STAGE"
#SET THE DESIRED EXCEL FILE NAME
str_excel_file = "Employee Sample Data.xlsx"
#STREAM THE EXCEL FILE TO AN PANDAS EXCEL OBJECT WITH FILE STREAM
xl_file = pd.ExcelFile(session.file.get_stream(os.path.join(str_stg_name,str_excel_file)))
#CAPTURE THE SHEET NAMES
sp_df = session.createDataFrame(xl_file.sheet_names,schema=["Sheet Name"])
return(sp_df)
Read Excel Sheet to Screen
Now that we’ve confirmed that the file is accessible and we can get the sheet names, let’s capture that data into a data frame and return it to the screen. With just a few simple modifications to our script above, we can do just that!
def main(session: snowpark.Session):
#SET THE FULL STAGE PATH
str_stg_name = "@ST_DEMO_DB.ST_INTERNAL_STAGES.ST_DEMO_STAGE"
#SET THE DESIRED EXCEL FILE NAME
str_excel_file = "Employee Sample Data.xlsx"
#STREAM THE EXCEL FILE TO AN PANDAS EXCEL OBJECT WITH FILE STREAM
xl_file = pd.ExcelFile(session.file.get_stream(os.path.join(str_stg_name,str_excel_file)))
#READ THE "DATA" SHEET TO PANDAS DATA FRAME
data_df = pd.read_excel(xl_file,sheet_name="Data")
#CONVERT PANDAS TO SNOWPARK DF
sp_df = session.createDataFrame(data_df)
#RETURN THE DATA
return(sp_df)
Write Excel Sheet to Snowflake
Great! Now we’ve confirmed we can read the data from the Excel file. Next, let’s write it to a transient table in Snowflake in the RAW_DATA schema. I’ll also set the auto_create_table and overwrite options to TRUE so Snowflake will create the table if it doesn’t exist, or overwrite it if it does. These settings are for the purposes of re-running this demo code; however, adjust them as needed for your specific situation.
def main(session: snowpark.Session):
#SET THE FULL STAGE PATH
str_stg_name = "@ST_DEMO_DB.ST_INTERNAL_STAGES.ST_DEMO_STAGE"
#SET THE DESIRED EXCEL FILE NAME
str_excel_file = "Employee Sample Data.xlsx"
#SET THE DESIRED TARGET TABLE
str_snow_table = "EMPLOYEE_DATA"
#STREAM THE EXCEL FILE TO AN PANDAS EXCEL OBJECT WITH FILE STREAM
xl_file = pd.ExcelFile(session.file.get_stream(os.path.join(str_stg_name,str_excel_file)))
#READ THE "DATA" SHEET TO PANDAS DATA FRAME
data_df = pd.read_excel(xl_file,sheet_name="Data")
#WRITE THE DATA TO SNOWFLAKE
session.write_pandas(df=data_df,table_name=str_snow_table,table_type='transient',auto_create_table=True,overwrite=True)
#RETURN THE LIST OF TABLES IN RAW_DATA FOR CONFIRMATION
sp_df = session.sql("SHOW TABLES LIKE 'EMPLOYEE%'")
#RETURN THE DATA
return(sp_df)
Conclusion
By leveraging Python libraries pre-loaded into Snowflake, we can efficiently load Excel files into Snowflake in a few simple steps, avoiding the complexity of traditional ETL tools. In this demo, we walked through how to prepare, read, and write Excel data to Snowflake from a Python worksheet. This could easily be extended to run as a procedure and include much more dynamic methods of determining file names, sheet names and/or the table target.
Follow me on LinkedIn and Medium for more content on Data Management and demos including Snowflake, Streamlit, and SQL Server.






Hello,
First of all, great tutorial.
I try to follow steps presented on this site and i went successfully through ‘Confirm Excel File location’ (so i am getting confirmation that .xlsx file is where it should be) but next steps are failing in such a way that they do not return anything although script is executed and is successful. I just don’t get to see sheet name neither data in .xlsx file.
Do you have any idea why is that?
Regards,
Alfred
LikeLike
Do you have the stage, file and sheet name set exactly the same as in the script? If not, make sure you update the variables in the script to match your stage and files.
LikeLike