'Python and SQL Inserts from Excel file Using Pandas and List/Array
I have an Excel book with multiple sheets:
Company:
| ID | Name |
|---|---|
| 100 | CompanyA |
| 101 | CompanyB |
State:
| State | Name |
|---|---|
| NY | New York |
| AZ | Arizona |
I'm trying to figure out how to store Company ID column into an array (list?) and then do SQL inserts with the data in Sheet2 to a table. For example:
insert into location values (100, 'NY', 'New York')
insert into location values (100, 'AZ', 'Arizona')
insert into location values (101, 'NY', 'New York')
insert into location values (101, 'AZ', 'Arizona')
I can do this no problem if only one company is listed in the Excel file however when there are multiple companies and they share the same sheet data, I'm not entirely sure how to do this.
Code snippets:
import pandas as pd
import numpy as np
import pandas.io.sql
import pyodbc
import xlrd
# store excel read location
excel_file = 'C:\\temp\\template.xlsx'
# create connection and cursor objects
conn = pyodbc.connect('DRIVER={SQL Server};SERVER=' + server + ';DATABASE=' + db + ';UID=' + dbuser + ';PWD=' + dbpassword + ';Trusted_Connection=no')
cursor = conn.cursor()
# read data
df_company = pd.read_excel(excel_file, sheet_name='Company',engine='openpyxl')
df_state = pd.read_excel(excel_file, sheet_name='State',engine='openpyxl')
# build query
# build admit source query
query = """EXEC sp_executesql N 'insert [dbo].[location]([CompanyID], [State], [StateName])
values (@0, @1, @2)
select [CompanyID]
from [dbo].[location]
where @@ROWCOUNT > 0 and [ID] = scope_identity()',
N '@0 int,@1 varchar(2),@2 varchar(30)',
@0 = ?,
@1 = '?',
@2 = '?'"""
# iterate through sheet to execute sql query
for index, row in df_state.iterrows():
# Assign values from each row
values = (row.ID, row.State, row.Name)
# Execute sql query
cursor.execute(query, values)
# Commit the transaction
conn.commit()
# Close the database connection
conn.close()
The last bit of code is what I'm not quite grasping. I want to store the company ID in an array and then pass in each company ID for however many rows are in the 'State' sheet. I'm fairly new to Python so I'm still learning the different methods and functions and coming from Javascript, this seems doable but could use some guidance.
Note: I know row.ID will return an error as it doesn't exist in df_state.iterrows, I just don't know how to reference the Company sheet with this approach.
Solution 1:[1]
I was able to figure it out.
I added the IDs to a list as follows:
ids = df_facilities['ID'].to_list()
Then I iterate through the list:
for x in ids:
And then assign x as the ID in the values query.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | jc84 |
