During debugging, pandas gave an error: "TypeError: 'NoneType' object is not callable"

I have a record table.

   data  stage  epoch
0     0  train      0
1     1  valid      1
2     2  train      0
3     3  valid      1
4     4  train      2
5     5  valid      3

I want to split this table into "train" and "valid" rows, starting from the last 0 in the "epoch" column. My code is as follows:

import numpy as np
import pandas as pd

class SL(object):

    def select(self, df):
        df_train = df[df["stage"] == "train"]
        df_valid = df[df["stage"] == "valid"]

        index_zero = np.where(df["epoch"].values == 0)[0][-1]
        df_train = df_train.loc[index_zero:, :]
        df_valid = df_valid.loc[index_zero:, :]
        print(df_train,"\n", df_valid)

df = pd.DataFrame({"data":range(6), "stage":["train","valid","train", "valid","train","valid"], "epoch":[0,1,0,1,2,3]})
SL().select(df)

When I run it directly, it works fine:

   data  stage  epoch
2     2  train      0
4     4  train      2 

    data  stage  epoch
3     3  valid      1
5     5  valid      3

But when I debug with PyCharm, the line df_valid = df_valid.loc[index_zero:, :] always raises TypeError: 'NoneType' object is not callable. Does anyone know why?



Solution 1:[1]

If I understand correctly, you can first drop the rows before the last 0 in epoch and then split the rest by stage using groupby:

s = df['epoch'].eq(0).cumsum()
d = {k: g for k,g in df[s.eq(s.iloc[-1])].groupby(df['stage'])}

Output:

{'train':    data  stage  epoch
 2     2  train      0
 4     4  train      2,
 'valid':    data  stage  epoch
 3     3  valid      1
 5     5  valid      3}
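
To make the cumsum trick above easier to follow, here is a minimal step-by-step sketch of the same idea. The helper name split_after_last_zero and its col/by parameters are hypothetical, introduced only for illustration; they are not part of the original answer or of pandas. It assumes a non-empty DataFrame.

```python
import pandas as pd


def split_after_last_zero(df, col="epoch", by="stage"):
    """Hypothetical helper: keep rows from the last 0 in `col` onward, split by `by`."""
    # Running count of zeros seen so far; it goes up by 1 at every 0 in `col`.
    zero_count = df[col].eq(0).cumsum()
    # Rows whose count equals the final count sit at or after the last 0.
    tail = df[zero_count.eq(zero_count.iloc[-1])]
    # One sub-DataFrame per distinct value of `by`.
    return {key: group for key, group in tail.groupby(by)}


df = pd.DataFrame({
    "data": range(6),
    "stage": ["train", "valid", "train", "valid", "train", "valid"],
    "epoch": [0, 1, 0, 1, 2, 3],
})

parts = split_after_last_zero(df)
df_train, df_valid = parts["train"], parts["valid"]
print(df_train)
print(df_valid)
```

Because the selection uses a boolean mask, it does not rely on index labels matching row positions, whereas .loc[index_zero:, :] with a position taken from np.where only lines up when the labels happen to equal the positions (as with the default RangeIndex here). If col contains no 0 at all, the running count stays at 0 everywhere and every row is kept.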

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 mozway