'how to perform this operation in less time?

hi i am working on this dataset

enter image description here

                                  date     open     high      low    close  \
timestamp                                                            
1597133400 2020-08-11 08:10:00  11759.0  11761.5  11750.0  11760.0   
1597133700 2020-08-11 08:15:00  11760.0  11776.0  11759.5  11773.5   
1597134000 2020-08-11 08:20:00  11773.5  11774.0  11751.5  11752.0   
1597134300 2020-08-11 08:25:00  11752.0  11752.0  11735.0  11744.5   

                                  buy                       sell  \
timestamp                                                          
1597133400         [[[11566, 11578]]]         [[[11959, 11971]]]   
1597133700         [[[11566, 11578]]]         [[[11959, 11971]]]   
1597134000         [[[11566, 11578]]]    [[[12339.6, 12457.12]]]   
1597134300  [[[11157.275, 11039.83]]]  [[[12331.725, 12449.17]]]   

                             buy2                  sell2  price_pool_long  \
timestamp                                                                   
1597133400         [11566, 11578]         [11566, 11578]         11578.00   
1597133700         [11566, 11578]         [11566, 11578]         11578.00   
1597134000         [11566, 11578]         [11566, 11578]         11578.00   
1597134300  [11157.275, 11039.83]  [11157.275, 11039.83]         11039.83   

            price_pool_short                                      buy_perc  \
timestamp                                                                    
1597133400         12348.000   [0.01649659863945574, 0.015476190476190421]   
1597133700         12362.175  [0.017624325816452235, 0.016605087696946574]   
1597134000         12339.600   [0.015827093260721625, 0.01480599046970732]   
1597134300         12331.725   [0.050000000000000044, 0.06000000000000005]   

                                                sell_perc  liquidation_short  \
timestamp                                                                      
1597133400  [-0.016773301054815892, -0.01571946795646917]                  0   
1597133700  [-0.017940515303475735, -0.01688547244774563]                  0   
1597134000  [-0.016081618537091558, -0.01502850233200892]                  0   
1597134300   [-0.05263157894736836, -0.06382978723404253]                  0   

            liquidation_long  
timestamp                     
1597133400                 0  
1597133700                 0  
1597134000                 0  
1597134300                 0  

and i need to change the value on the colum liquidation_long to -1 if the value of the colum low for the a row is lower than the value of the column price_pool_long from the previous row, else make it 0

i do this whit the next code,

 index = data4.index
 for i in range(1,len(index)):    
    low =data4.low[data4.index==index[i]]
    ppl = data4.price_pool_long[data4.index==index[i-1]].values[0]
    if low.values[0]<=ppl:
        data4.liquidation_long[data4.index==index[i]]=-1
    else:
        data4.liquidation_long[data4.index==index[i]]=0
           
    

but using my dataframe whit 200k rows is not a really fast aproach,

using this row, i should get -1 on the liquidation_short(first the "i-1" and then the "i" row

enter image description here

                          date     open     high      low    close  \
timestamp                                                            
1597176900 2020-08-11 20:15:00  11335.0  11335.0  11209.5  11215.5   

                                                          buy  \
timestamp                                                       
1597176900  [[[11192, 11204], [11204, 11215], [11215, 1122...   

                                                         sell  \
timestamp                                                       
1597176900  [[[11566, 11578], [11578, 11590], [11590, 1160...   

                                    buy2                         sell2  \
timestamp                                                                
1597176900  [11192, 11204, 11215, 11227]  [11192, 11204, 11215, 11227]   

            price_pool_long  price_pool_short  \
timestamp                                       
1597176900          11227.0         11776.275   

                                                     buy_perc  \
timestamp                                                       
1597176900  [0.002095314520083824, 0.001025366680041051, 4...   

                                                    sell_perc  \
timestamp                                                       
1597176900  [-0.002099714081486681, -0.0010264191360227493...   

            liquidation_short  liquidation_long  
timestamp                                        
1597176900                  0                 0  

and the next:

    date     open     high      low    close  \
timestamp                                                            
1597176600 2020-08-11 20:10:00  11360.0  11367.0  11334.5  11335.0   

                                                          buy  \
timestamp                                                       
1597176600  [[[11192, 11204], [11204, 11215], [11215, 1122...   

                                                         sell  \
timestamp                                                       
1597176600  [[[11566, 11578], [11578, 11590], [11590, 1160...   

                                           buy2  \
timestamp                                         
1597176600  [11192, 11204, 11215, 11227, 11238]   

                                          sell2  price_pool_long  \
timestamp                                                          
1597176600  [11192, 11204, 11215, 11227, 11238]          11238.0   

            price_pool_short  \
timestamp                      
1597176600          11901.75   

                                                     buy_perc  \
timestamp                                                       
1597176600  [0.012615791795324238, 0.011557123952359971, 0...   

                                                    sell_perc  \
timestamp                                                       
1597176600  [-0.012776983559685595, -0.011692252766869027,...   

            liquidation_short  liquidation_long  
timestamp                                        
1597176600                  0                 0  

i hope you can helpme to make an apply function or some nice aproach to make the same in less time, thanks in advance!



Solution 1:[1]

I faced a similar issue and solved it with a numpy work around. This solution should do the calculation within a few seconds max. I have built my own minimal example. Maybe you should check (or compare) the results before, but it should lead to your desired result, if I understood you correctly. The first value is of course nonsense, because it uses the newest ppl in my solution.

df = pd.DataFrame({'low': [2, 3, 4, 5, 6, 7],
                   'price_pool_long': [3, 2, 5, 8, 6, 1],
                   'liquidation_long': [0, 0, 0, 0, 0, 0]})


low = df['low'].to_numpy()
price_pool_long = df['price_pool_long'].to_numpy()
df['liquidation_long'] = np.where(low <= np.roll(price_pool_long, 1),-1,0)

When you do such calculations you must always avoid iterating the entire dataframe in python at all costs. This is always very slow. Small tip: 'concat' should also only be used with caution.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 PushTheButton