Python Pandas dataframe shift does not work in apply functions

I am getting the following error

AttributeError: ("'float' object has no attribute 'shift'", 'occurred at index 718170')

when running the pandas script below.

def volumediff(x):
    if x['positive_mvt'] == True:
        volume_d = x['volume'].shift(1)
    else:
        volume_d = ""
    return volume_d

df['new_volume'] = df.apply(volumediff,axis=1)

Based on a nearly identical error discussed in the question "AttributeError: 'float' object has no attribute 'split'", I first thought the issue was caused by a null value, since shift may reach for a value that lies outside my dataset. However, the following works without any issue:

df['new_volume'] = df['volume'].shift(1)

Unfortunately, it doesn't work inside an apply function, which I need because the logic requires an if/else.
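One way to get the if/else behaviour without apply is to shift the whole column first and then keep the shifted value only where the condition holds. The sketch below uses the sample data from this question and makes one assumption that differs from the original script: rows where positive_mvt is False get NaN rather than an empty string, so the column stays numeric.

```python
import pandas as pd

# Sample data from the question.
rows = [
    [False, 240.20353],
    [False, 621.28854],
    [True, 64.85972],
    [True, 151.86484],
    [False, 190.91042],
    [True, 128.78566],
    [False, 415.53138],
    [True, 43.14669],
    [True, 512.03531],
    [True, 502.41939],
]
df = pd.DataFrame(rows, columns=["positive_mvt", "volume"])

# Shift the entire column by one row, then keep the shifted value only
# where positive_mvt is True. Series.where fills the other rows with NaN
# (an assumption: the original script used "" for those rows).
df["new_volume"] = df["volume"].shift(1).where(df["positive_mvt"])

print(df)
```

Because shift runs on the full column, a True row can still pick up the volume of a False row directly above it.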

I tried to work around this with the script below, using try/except to skip any cells that raise an error. But every value in my column comes back as "NA" or "", which shouldn't be the case.

def volumediff(x):
    if x['positive_mvt'] == True:
        try:
            volume_d = x['volume'].shift(1)
        except:
            volume_d = "NA"
    else:
        volume_d = ""
    return volume_d

df['new_volume'] = df.apply(volumediff,axis=1)

Original sample df:

x = [
    [False, 240.20353],
    [False, 621.28854],
    [True, 64.85972],
    [True, 151.86484],
    [False, 190.91042],
    [True, 128.78566],
    [False, 415.53138],
    [True, 43.14669],
    [True, 512.03531],
    [True, 502.41939],
]

df = pd.DataFrame(x, columns=['positive_mvt', 'volume'])

df
Out[1]: 
   positive_mvt     volume
0         False  240.20353
1         False  621.28854
2          True   64.85972
3          True  151.86484
4         False  190.91042
5          True  128.78566
6         False  415.53138
7          True   43.14669
8          True  512.03531
9          True  502.41939

Error example:

[image: command-line printout of the dataframe]

I checked my dataframe and suspected a conflict between my if condition, which only selects rows that are True, and x['volume'].shift(1), which needs the row above and that row may be False. But that wasn't the cause: the script below has no condition at all and still triggers the same AttributeError. It looks like .shift just doesn't work inside an apply function.

def volumediff(x):
    volume_d = x['volume'].shift(1)
    return volume_d

df['new_volume'] = df.apply(volumediff,axis=1)
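A likely explanation, sketched here on a three-row excerpt of the sample data: with axis=1, apply passes each row to the function as a Series, so x['volume'] is a single number rather than a column, and a plain number has no shift method. The variable names below are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    [[False, 240.20353], [False, 621.28854], [True, 64.85972]],
    columns=["positive_mvt", "volume"],
)

# Inside apply(axis=1), row["volume"] is one scalar per row.
scalar_type_names = df.apply(lambda row: type(row["volume"]).__name__, axis=1)
scalar_has_shift = df.apply(lambda row: hasattr(row["volume"], "shift"), axis=1)

# The full column, by contrast, is a Series and does have .shift.
column_has_shift = hasattr(df["volume"], "shift")

print(scalar_type_names.tolist())
print(scalar_has_shift.tolist())
print(column_has_shift)
```

This would also account for the try/except version returning "NA" on every True row: the bare except catches the AttributeError each time.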

Does anyone have any insight into how to solve this without creating two separate columns and handling the if/else and the minus-shift formula in separate steps?



Solution 1:[1]

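I found a way to avoid this problem: call groupby before apply. With groupby, the function receives each group as a DataFrame, so x['column'] is a Series and .shift is available. (groupby is a regular DataFrame method, so nothing beyond pandas itself needs to be imported.)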

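# The '???' column names are shown as they appear in the source; the original names seem to have been lost in encoding.
def shoupanjia1(x, sp):
    return (x['???'].shift(sp) - x['???']) / x['???']

w_df['??????'] = w_df.groupby('??').apply(shoupanjia1, sp=-1).values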
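Since the column names in the snippet above are unreadable, here is a small runnable sketch of the same idea under stated assumptions: "code" (the grouping key), "close" (the value column) and "next_return" (the result) are hypothetical names, and the data is made up for illustration. It also swaps in a slightly different technique: instead of groupby(...).apply(...).values, it shifts within each group using the groupby shift method, which keeps the original row index and so does not depend on the rows being ordered by group.

```python
import pandas as pd

# Hypothetical data; none of these names or values come from the answer above.
w_df = pd.DataFrame({
    "code": ["a", "a", "a", "b", "b"],
    "close": [10.0, 11.0, 12.1, 20.0, 19.0],
})

# Next row's value within each group; the last row of each group gets NaN.
next_close = w_df.groupby("code")["close"].shift(-1)

# Relative change from the current row to the next row in the same group.
w_df["next_return"] = (next_close - w_df["close"]) / w_df["close"]

print(w_df)
```

Grouping only matters when shifted values must not cross from one group into the next; for a single series, a plain column-level shift is enough.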

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 ???Len?