'IndexError: array index out of range from excel file in which the column is a calculated function
I am trying to upload excel data to an SQL Server in Microsoft SSMS 18 using the following code:
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import pandas.io.sql
import pyodbc
import xlrd
# Open the workbook and define the worksheet
book = xlrd.open_workbook('excel.xls')
sheet = book.sheet_by_name("data")
conn = pyodbc.connect("""server path;""")
cursor = conn.cursor()
query = """INSERT INTO SERVER(Origin_Station_Code, Origin_Facility_Name, Destination_Station_Code, Destination_Facility_Name, StartClock_Day, Calc_Piece_Count, Calc_Physical_Weight, Calc_Volumetric_Weight, Handled_Load, Weight_Type, Movement_Origin_IATA, Movement_Origin_Station_Name, Movement_Origin_Facility_Name, Movement_Destination_IATA, Movement_Destination_Station_Name, Movement_Destination_Facility_Name, Movement_Number, Skdl_Depart_Date, ULD_Registration_Number, ULD_HUID, CRN_MOVE) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"""
# Create a For loop to iterate through each row in the XLS file, starting at row 2 to skip the headers
for r in range(1, sheet.nrows):
Origin_Station_Code = sheet.cell(r,6).value
Origin_Facility_Name = sheet.cell(r,7).value
Destination_Station_Code = sheet.cell(r,9).value
Destination_Facility_Name = sheet.cell(r,10).value
StartClock_Day = sheet.cell(r,15).value
Calc_Piece_Count = sheet.cell(r,20).value
Calc_Physical_Weight = sheet.cell(r,21).value
Calc_Volumetric_Weight = sheet.cell(r,22).value
Handled_Load = sheet.cell(r,23).value
Weight_Type = sheet.cell(r,24).value
Movement_Origin_IATA = sheet.cell(r,26).value
Movement_Origin_Station_Name = sheet.cell(r,27).value
Movement_Origin_Facility_Name = sheet.cell(r,28).value
Movement_Destination_IATA = sheet.cell(r,30).value
Movement_Destination_Station_Name = sheet.cell(r,31).value
Movement_Destination_Facility_Name = sheet.cell(r,32).value
Movement_Number = sheet.cell(r,35).value
Skdl_Depart_Date = sheet.cell(r,36).value
ULD_Registration_Number = sheet.cell(r,39).value
ULD_HUID = sheet.cell(r,40).value
CRN_MOVE = sheet.cell(r,45).value
# Assign values from each row
values = (Origin_Station_Code, Origin_Facility_Name, Destination_Station_Code, Destination_Facility_Name, StartClock_Day, Calc_Piece_Count, Calc_Physical_Weight, Calc_Volumetric_Weight, Handled_Load, Weight_Type, Movement_Origin_IATA, Movement_Origin_Station_Name, Movement_Origin_Facility_Name, Movement_Destination_IATA, Movement_Destination_Station_Name, Movement_Destination_Facility_Name, Movement_Number, Skdl_Depart_Date, ULD_Registration_Number, ULD_HUID, CRN_MOVE)
# Execute sql Query
cursor.execute(query, values)
# Close the cursor
cursor.close()
# Commit the transaction
database.commit()
# Close the database connection
database.close()
# Print results
print("")
print("All Done! Bye, for now.")
print("")
The excel document has the following 45 columns:
Waybill number
Shp Rec Key
Piece Id
Pcs Rec Key
Origin Country Name
Origin_Station_Code
Origin Facility Name
Destination Country Name
Destination Station Code
Destination Facility Name
Product Group
Product Code
Shipper Name
Consignee Name
StartClock Day
Elapsed Timeband SC
Elapsed Timeband NC
End-to-End CAL Count
Origin Mvmt Uplift Day
Calc Piece Count
Calc Physical Weight
Calc Volumetric Weight
Handled Load
Weight Type
Movement Origin Country Name
Movement Origin IATA
Movement Origin Station Name
Movement Origin Facility Name
Movement Destination Country Name
Movement Destination IATA
Movement Destination Station Name
Movement Destination Facility Name
Movement Type Movement Sector Type
Movement Number Skdl Depart Date
Movement Stop Over Count
Movement MAWB
ULD Registration Number
ULD HUID
ULD Build By
Bag Routing Text
Bag HUID
Bag Built
CRN MOVE
However I just need the columns of Origin_Station_Code, Origin_Facility_Name, Destination_Station_Code, Destination_Facility_Name, StartClock_Day, Calc_Piece_Count, Calc_Physical_Weight, Calc_Volumetric_Weight, Handled_Load, Weight_Type, Movement_Origin_IATA, Movement_Origin_Station_Name, Movement_Origin_Facility_Name, Movement_Destination_IATA, Movement_Destination_Station_Name, Movement_Destination_Facility_Name, Movement_Number, Skdl_Depart_Date, ULD_Registration_Number, ULD_HUID, CRN_MOVE.
However whenever I run my code I get this error:
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-23-ea3ed821b9b7> in <module>
40 ULD_Registration_Number = sheet.cell(r,39).value
41 ULD_HUID = sheet.cell(r,40).value
---> 42 CRN_MOVE = sheet.cell(r,45).value
43
44 # Assign values from each row
~\Anaconda3\lib\site-packages\xlrd\sheet.py in cell(self, rowx, colx)
411 xfx = None
412 return Cell(
--> 413 self._cell_types[rowx][colx],
414 self._cell_values[rowx][colx],
415 xfx,
IndexError: array index out of range
Do you know why this is happening? I should mention that the column CRN MOVE uses the mid function. EX =MID(AI2,4,2) with AI being the column for Movement Number. Is that why it is not working? What would be a better alternative?
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|
