Weird "not supported" operation error in simple loop over dataframe
I am working on a dataframe and want to iterate over one column, as I have done hundreds of times with many dataframes. Today I get an error and I can't wrap my head around what's wrong with it.
It may be worth mentioning that the dataframe is the result of a concatenation:
log = pd.concat([log_entry, log_exit]).sort_values(by=['date'])
dataframe:
position order price PnL
date
2022-03-27 20:45:00 short entry 29.242291 0
2022-03-28 13:45:00 short entry 31.052375 0
2022-03-28 15:00:00 short entry 31.072893 0
2022-03-28 19:15:00 short entry 31.070073 0
2022-03-28 20:45:00 short entry 31.220069 0
2022-03-28 23:00:00 - TP 30.016500 0
2022-03-28 23:15:00 - TP 29.788000 0
2022-03-28 23:15:00 - TP 29.820500 0
2022-03-28 23:30:00 - TP 29.640500 0
2022-03-29 05:30:00 short entry 30.902677 0
2022-03-29 06:15:00 short entry 30.893078 0
iteration:
for i in range(len(log.index)):
    if log.position[i] == 'short':
        print('ok')
error:
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4729 try:
-> 4730 return self._engine.get_value(s, k, tz=getattr(series.dtype, "tz", None))
4731 except KeyError as e1:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine._get_loc_duplicates()
TypeError: '<' not supported between instances of 'str' and 'int'
During handling of the above exception, another exception occurred:
IndexError Traceback (most recent call last)
<ipython-input-8-c8832d66a85a> in <module>
156
157 for i in range(len(log.index)):
--> 158 if log.position[i] == 'short':
159 print('ok')
160 # dd_buffer.append(log.price[i])
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/series.py in __getitem__(self, key)
1066 key = com.apply_if_callable(key, self)
1067 try:
-> 1068 result = self.index.get_value(self, key)
1069
1070 if not is_scalar(result):
/opt/anaconda3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
4748 # python 3
4749 if is_scalar(key): # pragma: no cover
-> 4750 raise IndexError(key)
4751 raise InvalidIndexError(key)
4752
IndexError: 0
How is this possible??
Solution 1:[1]
With log.position[i] you are looking i up among the index labels of the dataframe, i.e. the datetimes. There is no label 0 (only datetimes), which is why you get the IndexError. Use log.position.iloc[i] instead, which does integer-location based selection by position.
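To make the difference concrete, here is a minimal sketch. The small `log` frame below is a made-up stand-in for the one in the question (same column names, a datetime index with one repeated timestamp), not the asker's actual data:

```python
import pandas as pd

# Hypothetical miniature of the question's `log`, including a duplicated timestamp.
log = pd.DataFrame(
    {
        "position": ["short", "short", "-", "-"],
        "order": ["entry", "entry", "TP", "TP"],
        "price": [29.242291, 31.052375, 29.788000, 29.820500],
    },
    index=pd.DatetimeIndex(
        [
            "2022-03-27 20:45:00",
            "2022-03-28 13:45:00",
            "2022-03-28 23:15:00",
            "2022-03-28 23:15:00",
        ],
        name="date",
    ),
)

# Position-based access: .iloc takes 0..len-1 no matter what the labels are.
shorts_by_position = [
    i for i in range(len(log.index)) if log.position.iloc[i] == "short"
]

# Label-based access: .loc expects an index label, here a timestamp.
value_by_label = log.position.loc[log.index[0]]

# A repeated label selects every matching row, so the result is a Series.
both_rows = log.position.loc[log.index[2]]
```

Note that label lookups on the repeated timestamp return more than one row, which is another reason to prefer `.iloc` (or plain iteration over the column) when you only care about row order.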
Solution 2:[2]
Edit: I think I understand your question now. You iterate over range(len(index)), which always starts at 0 and runs up to len(index)-1.
Your index doesn't have to be a sequence of integers starting at 0, though. You could, for example, remove the row with index 0 and then your index would start at 1 (you can view the index as another column which can simply be edited). In your dataframe the index holds datetimes.
Try:
for i in log.index:
    print(i)
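This should show that your index doesn't start at 0, if I'm correct.
To check each value, iterate over the column itself rather than over the index labels (.iloc only accepts integer positions, so it cannot be combined with labels taken from log.index):
for position in log.position:
    if position == "short":
        print("ok")
That should do what you wanted.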
Earlier answer: I'm not sure what you are trying to achieve, but it looks like you want to know how many "short" values are in position.
Try: log.position.value_counts()
to see how many of each value there are.
Normally there are better ways than looping over a dataframe in pandas.
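As a rough illustration of the loop-free approach, the sketch below counts the values and collects the prices of the "short" rows with a boolean mask. The sample frame is invented for the example, and collecting prices is only a guess at the goal, based on the commented-out dd_buffer line in the traceback:

```python
import pandas as pd

# Hypothetical sample data using the question's column names.
log = pd.DataFrame(
    {
        "position": ["short", "short", "-", "short"],
        "price": [29.242291, 31.052375, 30.016500, 30.902677],
    }
)

# How many rows there are for each distinct value in the column.
counts = log["position"].value_counts()

# Boolean mask that is True for the rows where position is "short".
is_short = log["position"] == "short"

# Prices of the "short" rows, selected in one step without an explicit loop.
short_prices = log.loc[is_short, "price"].tolist()
```

The mask works the same way with a datetime index or with repeated index labels, because it selects rows by the True/False values rather than by label or position.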
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mcsoini |
| Solution 2 |
