'0 is not in range in pandas
So I am trying to load my data from jupyter into mysql workbench using the one "multiple row" insert statement. I will achieve that with a for loop and i am receiving some error messages.
First, a little background:
So I had my csv file which contains data set for preprocessing and I split into 2 here:
Before_handwashing=copy_monthly_df.iloc[:76]
After_handwashig=copy_monthly_df.iloc[76:]
I have successfully structured and loaded the first data set Before_handwashing into mysql work bench using this for loop below.
for x in range(Before_handwashing.shape[0]):
insert_query+='('
for y in range(Before_handwashing.shape[1]):
insert_query+= str(Before_handwashing[Before_handwashing.columns.values[y]][x])+', '
insert_query=insert_query[:-2]+'), '
Now I want to structure and load my second part of the dataset which is After_handwashig into mysql workbench using a similar code structure here.
for x in range(After_handwashig.shape[0]):
insert_query+='('
for y in range(After_handwashig.shape[1]):
insert_query+=str(After_handwashig[After_handwashig.columns.values[y]][x])+', '
insert_query=insert_query[:-2]+'), '
And I am recieving the following error messages
error message: ValueError
Traceback (most recent call last)
~\anaconda3\lib\site-packages\pandas\core\indexes\range.py in get_loc(self, key, method, tolerance)
384 try:
--> 385 return self._range.index(new_key)
386 except ValueError as err:
ValueError: 0 is not in range
The above exception was the direct cause of the following exception:
KeyError
Traceback (most recent call last) ~\AppData\Local\Temp/ipykernel_4316/2677185951.py in <module>
2 insert_query+='('
3 for y in range(After_handwashig.shape[1]):
----> 4 insert_query+=str(After_handwashig[After_handwashig.columns.values[y]][x])+', '
5 insert_query=insert_query[:-2]+'), '
~\anaconda3\lib\site-packages\pandas\core\series.py in __getitem__(self, key)
940
941 elif key_is_scalar:
--> 942 return self._get_value(key)
943
944 if is_hashable(key):
~\anaconda3\lib\site-packages\pandas\core\series.py in
_get_value(self, label, takeable)
1049
1050 # Similar to Index.get_value, but we do not fall back to positional
-> 1051 loc = self.index.get_loc(label)
1052 return self.index._get_values_for_loc(self, loc, label)
1053
~\anaconda3\lib\site-packages\pandas\core\indexes\range.py in
get_loc(self, key, method, tolerance)
385 return self._range.index(new_key)
386 except ValueError as err:
--> 387 raise KeyError(key) from err
388 raise KeyError(key)
389 return super().get_loc(key, method=method, tolerance=tolerance)
KeyError: 0
Can someone help me out in answering this problem?
Solution 1:[1]
OK, it took me a moment to find this. Consider these statements:
Before_handwashing=copy_monthly_df.iloc[:76]
After_handwashig=copy_monthly_df.iloc[76:]
When these are done, Before contains lines with indexes 0 to 75. After contains lines starting with index 76. There is no line with index 0, so your attempt to access it causes a key error.
There are two solutions. One is to use iloc to reference lines by ordinal instead of by index:
insert_query+=str(After_handwashig[After_handwashig.columns.values[y]]./iloc(x))+', '
The other is to reset the indexes to start from 0:
After_handwashing.reset_index(drop=True, inplace=True)
There's really no point in splitting the dataframe like that. Just have your first loop do range(76): and the second do range(76,copy_monthly_dy.shape[1]):.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Tim Roberts |
