How to group an ordered dataframe by number of bins?

I have a dataframe like this:

year count_yes count_no
1900 5 7
1903 5 3
1915 14 6
1919 6 14

I want to have two bins, regardless of the values themselves.

How can I group the rows into those bins and sum the values of each column?

Expected result:

year count_yes count_no
1900 10 10
1910 20 20

Logic: the first two rows (1900 and 1903) and the last two rows (1915 and 1919) are grouped, and the values of each category are summed.

I want to create a 100% stacked column chart, so 1900 would be 50%/50% and 1910 would also be 50%/50%.

I've already written the function that builds this chart; I just need to reduce the dataframe to bins to get a better distribution and visualization.



Solution 1:[1]

This is one way to do what you need, if you are OK with using the decades as the index:

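# Floor each year to the start of its decade (1903 -> 1900, 1915 -> 1910)
df['year'] = (df.year // 10) * 10
# Sum the counts within each decade
df_group = df.groupby('year').sum()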

Output:

df_group

       count_yes    count_no
year        
1900          10          10
1910          20          20
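
The decade approach above bins by the year value. If the goal is a fixed number of bins regardless of the values, as the question asks, a position-based grouping may fit better. The following is only a sketch under assumptions: `n_bins`, `bin_id`, `binned` and `pct` are illustrative names that do not come from the original answer, and each bin is labelled with its first year, which is my choice and not something the question specifies.

```python
import numpy as np
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "year": [1900, 1903, 1915, 1919],
    "count_yes": [5, 5, 14, 6],
    "count_no": [7, 3, 6, 14],
})

n_bins = 2  # assumed: the number of bins wanted

# Make sure the rows are ordered by year before binning by position
df = df.sort_values("year").reset_index(drop=True)

# Assign each row to a bin by its position, not by its year value:
# with 4 rows and 2 bins this gives [0, 0, 1, 1]
bin_id = np.arange(len(df)) * n_bins // len(df)

# Sum the counts per bin and label each bin with its first year (assumption)
binned = (
    df.groupby(bin_id)
      .agg(year=("year", "min"),
           count_yes=("count_yes", "sum"),
           count_no=("count_no", "sum"))
      .set_index("year")
)

# Row-wise percentages for a 100% stacked chart
pct = binned.div(binned.sum(axis=1), axis=0) * 100
```

With the sample data, the bins are labelled 1900 and 1915 (the first year in each bin) and not 1900 and 1910, because the label comes from the rows and not from the decade. If the row count does not divide evenly by `n_bins`, the bins will differ in size by at most one row.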

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 P. Pinho