Condense rows in a DataFrame
I'm trying to change the structure of a DataFrame. Currently it looks like this (an approximation of my data):
Date  Var1        Var2       Var3  Var4  Client  Code
Jan   You win!    NaN        1     NaN   Yep     100
Jan   NaN         You lose!  NaN   0     Yep     100
Feb   Go for it!  NaN        1     NaN   Bar     200
Feb   NaN         Dang       NaN   0     Bar     200
Mar   Go for it!  NaN        0     NaN   Foo     300
Mar   NaN         Darn       NaN   1     Foo     300
Unfortunately, this pattern is not consistent across the entire DataFrame. Assume all the values are strings. I'm trying to condense it so that rows sharing the same Date, Client, and Code are merged into one.
Expected Output:
Date  Var1        Var2       Var3  Var4  Client  Code
Jan   You win!    You lose!  1     0     Yep     100
Feb   Go for it!  Dang       1     0     Bar     200
Mar   Go for it!  Darn       0     1     Foo     300
I'm really not sure how to do this. I think I want to group by Date, Client, and Code, but I don't want to aggregate anything; I just want to fill in the NaNs and then delete the duplicate rows.
df constructor:
import numpy as np
import pandas as pd

df = pd.DataFrame({'Date': ['Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar'],
                   'Var1': ['You win!', np.nan, 'Go for it!', np.nan, 'Go for it!', np.nan],
                   'Var2': [np.nan, 'You lose!', np.nan, 'Dang', np.nan, 'Darn'],
                   'Var3': [1.0, np.nan, 1.0, np.nan, 0.0, np.nan],
                   'Var4': [np.nan, 0.0, np.nan, 0.0, np.nan, 1.0],
                   'Client': ['Yep', 'Yep', 'Bar', 'Bar', 'Foo', 'Foo'],
                   'Code': [100, 100, 200, 200, 300, 300]})
Solution 1:[1]
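Assuming the NaN pattern is consistent (every pair of rows alternates as in the sample), you can do it with the following:
# Keep every other row; .copy() avoids a SettingWithCopyWarning on the assignments below
df2 = df.iloc[::2, :].copy()
# Fill Var2 and Var4 from the rows where they are not NaN (row order is preserved)
df2["Var2"] = df.loc[df["Var2"].notna(), "Var2"].values
df2["Var4"] = df.loc[df["Var4"].notna(), "Var4"].values
print(df2)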
  Date        Var1       Var2  Var3  Var4 Client  Code
0  Jan    You win!  You lose!     1     0    Yep   100
2  Feb  Go for it!       Dang     1     0    Bar   200
4  Mar  Go for it!       Darn     0     1    Foo   300
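Solution 1 depends on the rows alternating in a fixed pattern, which the question says is not guaranteed. As a possible alternative that does not rely on row position, the sketch below groups on Date, Client, and Code and takes the first non-null value of each remaining column with `GroupBy.first()`. It is not part of the original answer, and the names `keys` and `condensed` are illustrative. It assumes each group holds at most one non-null value per column; if a group holds several, only the first is kept.

```python
import numpy as np
import pandas as pd

# Same sample data as the question's df constructor.
df = pd.DataFrame({'Date': ['Jan', 'Jan', 'Feb', 'Feb', 'Mar', 'Mar'],
                   'Var1': ['You win!', np.nan, 'Go for it!', np.nan, 'Go for it!', np.nan],
                   'Var2': [np.nan, 'You lose!', np.nan, 'Dang', np.nan, 'Darn'],
                   'Var3': [1.0, np.nan, 1.0, np.nan, 0.0, np.nan],
                   'Var4': [np.nan, 0.0, np.nan, 0.0, np.nan, 1.0],
                   'Client': ['Yep', 'Yep', 'Bar', 'Bar', 'Foo', 'Foo'],
                   'Code': [100, 100, 200, 200, 300, 300]})

# Columns that identify rows belonging together (illustrative name).
keys = ['Date', 'Client', 'Code']

# GroupBy.first() returns the first non-null value of each column per group,
# so NaN placeholders are skipped no matter which row carries the real value.
condensed = (
    df.groupby(keys, sort=False, as_index=False)  # sort=False keeps first-seen group order
      .first()
      .reindex(columns=df.columns)                # restore the original column order
)
print(condensed)
```

A column that is NaN in every row of a group stays NaN in the result. Rows whose key columns contain NaN are dropped by `groupby` by default, so the keys should be checked for missing values first.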
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BoomBoxBoy |
