Replace all except first row in a multi index
I am using pandas and have loaded some data into a dataframe. What I would like to do is replace the scenario frequency column in my data for all but the first value in each group.
My data looks like this:
ExplosionID  FireWater  FireID  Scenario Frequency
111          0          213     4.209055e-15
                        214     4.209055e-15
                        215     4.209055e-15
                        217     4.209055e-15
                        219     4.209055e-15
                        220     4.209055e-15
112          0          232     8.388742e-16
                        233     8.388742e-16
                        234     8.388742e-16
                        235     8.388742e-16
                        237     8.388742e-16
                        239     8.388742e-16
                        240     8.388742e-16
I would like to replace all but the first values in scenario frequency column with 0, so that I end up with this:
ExplosionID  FireWater  FireID  Scenario Frequency
111          0          213     4.209055e-15
                        214     0
                        215     0
                        217     0
                        219     0
                        220     0
112          0          232     8.388742e-16
                        233     0
                        234     0
                        235     0
                        237     0
                        239     0
                        240     0
The first three columns (ExplosionID, FireWater, FireID) are the levels of a multi-index.
I've defined a function:
# function to replace all but first value in group with 0
def replace_all_except_first(group):
    group.iloc[1:] = 0
    return group
and have tried the following:
data_to_sum = HL_df_subset.groupby(level=0).apply(replace_all_except_first)
where HL_df_subset is my dataframe. However, this replaces all values with 0.
I'm new to Python and I know I'm completely misunderstanding how groupby works, but I've been trying all sorts of things and can't get it to work.
Thanks for your help.
Solution 1:[1]
cumcount: finds the ordering of rows within each group. Comparing it with ne(0) creates a boolean series that is True wherever the count is not equal to 0, which means every row that is not the first row of its group.
mask: takes the True values and masks the relevant parts of the dataframe. In this case, it makes everywhere the cumcount isn't zero an np.nan.
fillna: takes those np.nan values and fills them with zero.
HL_df_subset.mask(HL_df_subset.groupby(level=0).cumcount().ne(0)).fillna(0)
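To see the three steps separately, here is a minimal sketch on a small frame shaped like the one in the question. The frame itself is an assumption: it is rebuilt by hand from a few of the rows shown in the question, and the names position and not_first are only illustrative.

import numpy as np
import pandas as pd

```python
import pandas as pd

# Hypothetical stand-in for the question's frame: a three-level index
# and one value column, using a few of the rows shown in the question.
index = pd.MultiIndex.from_tuples(
    [(111, 0, 213), (111, 0, 214), (111, 0, 215),
     (112, 0, 232), (112, 0, 233)],
    names=["ExplosionID", "FireWater", "FireID"],
)
HL_df_subset = pd.DataFrame(
    {"Scenario Frequency": [4.209055e-15] * 3 + [8.388742e-16] * 2},
    index=index,
)

# Step 1: position of each row inside its ExplosionID group (0, 1, 2, ...).
position = HL_df_subset.groupby(level=0).cumcount()

# Step 2: True for every row that is not the first of its group.
not_first = position.ne(0)

# Step 3: set the flagged rows to NaN, then fill the NaN with 0.
data_to_sum = HL_df_subset.mask(not_first).fillna(0)
print(data_to_sum)
```

Only the first row of each ExplosionID group keeps its frequency; every other row ends up as 0.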
Consider the following df:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    dict(A=np.arange(100, 116)),
    pd.MultiIndex.from_product(
        [list('ab'), list('xy'), [1, 2, 3, 4]]))
df
         A
a x 1  100
    2  101
    3  102
    4  103
  y 1  104
    2  105
    3  106
    4  107
b x 1  108
    2  109
    3  110
    4  111
  y 1  112
    2  113
    3  114
    4  115
df.mask(df.groupby(level=[0, 1]).cumcount().ne(0)).fillna(0)
           A
a x 1  100.0
    2    0.0
    3    0.0
    4    0.0
  y 1  104.0
    2    0.0
    3    0.0
    4    0.0
b x 1  108.0
    2    0.0
    3    0.0
    4    0.0
  y 1  112.0
    2    0.0
    3    0.0
    4    0.0
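Note that the integers in A come back as floats, because the intermediate np.nan forces the column to a float dtype. If that matters, one possible variation (not part of the original answer) is to pass the replacement value straight to where, so no NaN is ever created. A sketch under that assumption, with is_first and result as illustrative names:

```python
import numpy as np
import pandas as pd

# Same example frame as above.
df = pd.DataFrame(
    dict(A=np.arange(100, 116)),
    pd.MultiIndex.from_product([list("ab"), list("xy"), [1, 2, 3, 4]]),
)

# True only for the first row of each (level 0, level 1) group.
is_first = df.groupby(level=[0, 1]).cumcount().eq(0)

# Keep values where is_first is True, write 0 everywhere else.
# No NaN is introduced, so the integer column is not upcast to float.
result = df.where(is_first, 0)
print(result)
```

This is the same idea as mask followed by fillna, just with the condition inverted and the fill value supplied in one call.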
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
