How can I create 2 new columns, one that looks up a prior match and one that shows the next match?

I have a dataset similar to the one below and would like to create two new columns. The first column would return the prior year, or NaN if Count is 0. The second column would return the next year, or the current year plus 4 if there is no next year.

Data table:

NAME	YEAR	COUNT
Foo	2012	0
Foo	2017	1
Foo	2022	2
Bar	2015	0
Bar	2024	1

Goal results:

Name	Year	Count	Prior	Next
Foo	2012	0	NAN	2017
Foo	2017	1	2012	2022
Foo	2022	2	2017	2026
Bar	2015	0	NAN	2024
Bar	2024	1	2015	2028

In Excel I would just use a VLOOKUP or something similar to return the values. I have no idea how to do this in Python.

I have tried making a duplicate DataFrame and then using pd.merge, but I keep getting an error.

Solution 1:^[1]

Setup

import pandas as pd

df = pd.DataFrame(
    {
        "Name":["Foo","Foo","Foo","Bar","Bar"],
        "Year":[2012, 2017, 2022, 2015, 2024],
        "Count":[0,1,2,0,1]
    }
)

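Solution

def make_data(df_sub):
    # Work on a copy so the group handed over by groupby is not modified in place.
    df_sub = df_sub.copy()
    # Sorted unique years for this Name.
    years = pd.Series(df_sub["Year"].sort_values().unique())
    # Map each year to the previous one, then blank it where Count is 0.
    df_sub["Prior"] = df_sub["Year"].map(dict(zip(years, years.shift()))).mask(df_sub["Count"] == 0)
    # Map each year to the next one; the last year falls back to Year + 4.
    df_sub["Next"] = df_sub["Year"].map(dict(zip(years, years.shift(-1)))).fillna(df_sub["Year"] + 4)
    return df_sub

# group_keys=False keeps the original flat index in the result.
df.groupby("Name", group_keys=False).apply(make_data)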

This gives you

  Name  Year  Count   Prior    Next
0  Foo  2012      0     NaN  2017.0
1  Foo  2017      1  2012.0  2022.0
2  Foo  2022      2  2017.0  2026.0
3  Bar  2015      0     NaN  2024.0
4  Bar  2024      1  2015.0  2028.0

The solution splits the DataFrame by Name. For each sub-DataFrame it sorts the unique years and builds one map from each year to the previous year and another from each year to the next year. For the Prior column it masks every row where Count is 0. For the Next column it fills missing values with the year + 4.
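The same idea can also be written without apply, by shifting Year inside each group. This is only a sketch, not part of the original answer, and it assumes each Name/Year pair appears once; the names `out` and `years_by_name` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Name": ["Foo", "Foo", "Foo", "Bar", "Bar"],
        "Year": [2012, 2017, 2022, 2015, 2024],
        "Count": [0, 1, 2, 0, 1],
    }
)

# Sort so that shifting within each group follows chronological order.
# Assumption: each (Name, Year) pair occurs only once.
out = df.sort_values(["Name", "Year"]).copy()
years_by_name = out.groupby("Name")["Year"]

# Previous year within the same Name, blanked where Count is 0.
out["Prior"] = years_by_name.shift(1).mask(out["Count"] == 0)
# Next year within the same Name; the last row falls back to Year + 4.
out["Next"] = years_by_name.shift(-1).fillna(out["Year"] + 4)

# Restore the original row order.
out = out.sort_index()
print(out)
```

Because the shift happens per group, the rows do not need to be pre-sorted in the input; the explicit sort_values call takes care of that.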

Solution 2:^[2]

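Thanks to @Riley for the setup code.

We can use numpy.where. This assumes the rows are already grouped by Name and sorted by Year within each group, because df["Year"].shift() looks at the neighbouring row of the whole DataFrame.

import numpy as np

df["Prior"] = np.where(df.groupby("Name")["Count"].shift(1).isnull(), np.nan, df["Year"].shift(1))
df["Next"] = np.where(df.groupby("Name")["Count"].shift(-1).isnull(), df["Year"] + 4, df["Year"].shift(-1))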

> df
    Name    Year    Count   Prior   Next
0   Foo     2012    0       NaN     2017.0
1   Foo     2017    1       2012.0  2022.0
2   Foo     2022    2       2017.0  2026.0
3   Bar     2015    0       NaN     2024.0
4   Bar     2024    1       2015.0  2028.0

Because np.nan is a float, a column that contains it is stored as float, and the shifted year values are floats as well, so both new columns end up with a float dtype.
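If integer years are preferred, one option is pandas' nullable integer dtype. The snippet below is an illustration rather than part of either answer; the Series `prior` is a hypothetical stand-in for the Prior column.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the float-valued Prior column.
prior = pd.Series([np.nan, 2012.0, 2017.0], name="Prior")

# "Int64" (capital I) is the nullable integer dtype; NaN becomes <NA>.
prior_int = prior.astype("Int64")
print(prior_int.dtype)  # Int64
```

The same conversion could be applied to the Next column, which has no missing values after the + 4 fallback.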

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1
Solution 2	Denny Chen
