Which objects/if statements are slowing down this code the most?

I'm having an issue scaling the code below. Given two datasets of equal length, one with 20 features and the other with 70, this code takes drastically different amounts of time to process each. I understand that with potentially millions of additional data points between the two, the larger set will naturally take longer to process, but from an object/structural point of view, is there a way to do the same processing much faster?

    import numpy as np

    # columns_group1 and columns_group2 are lists of column names defined globally.
    def generic_calc(df):

        ar1 = np.zeros(len(df))
        ar2 = np.zeros(len(df))

        # Columns whose zeros / large changes decide which branch a row takes.
        checks = columns_group1 + columns_group2
        changes = ['chg1', 'chg2']
        for c in range(len(columns_group2)):
            changes.append('c{}_chg'.format(c + 1))

        for row in range(len(df)):
            if not df[checks].iloc[row].any():             # every check value is zero
                ar1[row] = 0
                ar2[row] = df['Metric1'][row] - df['Metric2'][row]
            elif (df[changes].iloc[row].abs() > 1).any():  # some |change| exceeds 1
                ar1[row] = 0
                ar2[row] = df['Metric1'][row] - df['Metric2'][row]
            else:
                ar1[row] = (df['Metric3'][row] - df['Metric4'][row]) * df['Metric5'][row]
                ar2[row] = (df['Metric6'][row] - df['Metric4'][row]) * df['Metric7'][row]

        for c in range(len(columns_group2)):
            component = np.zeros(len(df))
            for row in range(len(df)):
                if not df[checks].iloc[row].any():
                    component[row] = 0
                elif (df[changes].iloc[row].abs() > 1).any():
                    component[row] = 0
                else:
                    component[row] = (df['Metric8-{}'.format(c + 1)][row] - df['Metric9{}'.format(c + 1)][row]) * df['QTY_shift'][row]
            # Assign once per column, after the row loop finishes.
            df[columns_group2[c]] = component
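For context on where the time goes: each `df[checks].iloc[row]` and `df[changes].iloc[row]` builds a fresh Series for every single row, so the cost grows with rows × columns, which is why the 70-feature set is so much slower. The same branching can usually be computed once per column with boolean masks and `numpy.where`. A minimal sketch of that idea, using invented stand-in data and only two check/change columns (the real frame would have the question's full column lists):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in data; column names mirror the question's scheme.
df = pd.DataFrame({
    'chk1':    [1,    0,    2],
    'chk2':    [3,    0,    5],
    'chg1':    [0.1,  0.2,  2.0],
    'chg2':    [0.0,  0.3,  0.1],
    'Metric1': [10.0, 20.0, 30.0],
    'Metric2': [1.0,  2.0,  3.0],
    'Metric3': [5.0,  6.0,  7.0],
    'Metric4': [1.0,  1.0,  1.0],
    'Metric5': [2.0,  2.0,  2.0],
    'Metric6': [4.0,  4.0,  4.0],
    'Metric7': [3.0,  3.0,  3.0],
})
checks = ['chk1', 'chk2']
changes = ['chg1', 'chg2']

# One boolean mask per condition, evaluated for all rows at once.
all_zero = ~df[checks].any(axis=1)                 # every check value is zero
big_change = (df[changes].abs() > 1).any(axis=1)   # some |change| exceeds 1
short_circuit = all_zero | big_change

# The entire row loop collapses into two vectorised selections.
ar1 = np.where(short_circuit, 0.0,
               (df['Metric3'] - df['Metric4']) * df['Metric5'])
ar2 = np.where(short_circuit,
               df['Metric1'] - df['Metric2'],
               (df['Metric6'] - df['Metric4']) * df['Metric7'])

print(ar1)  # [8. 0. 0.]
print(ar2)  # [ 9. 18. 27.]
```

The second loop collapses the same way: compute `short_circuit` once outside the loop, then for each `c` do a single `df[columns_group2[c]] = np.where(short_circuit, 0.0, ...)`, so each output column is touched once instead of once per row.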



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
