How can I group and reformat a dataframe based on a column?

This is my dataframe:

        cardio     variable  value
0            0  cholesterol      0
1            1  cholesterol      1
2            1  cholesterol      1
3            1  cholesterol      0
4            0  cholesterol      0
...        ...          ...    ...
419995       0   overweight      1
419996       1   overweight      1
419997       1   overweight      1
419998       1   overweight      1
419999       0   overweight      0

How can I split and group it based on the value of the "cardio" column and also get the counts? Like so:

   cardio variable  value  total
0       0   active      0   6378
1       0   active      1  28643
2       0     alco      0  33080
...
    cardio    variable  value  total
21       1  overweight      1  24440
22       1       smoke      0  32050
23       1       smoke      1   2929


Solution 1:[1]

Let's try a dict comprehension:

The idea is to first split the dataframe into one group per cardio value with df.groupby('cardio').

Then apply your operation to each group, in this instance size(), and reset the index so that each result becomes its own dataframe.

We use a dictionary to hold the various dataframes in a single container as opposed to disparate variables.

data_dict = {
    f"cardio_{cardio}": data.groupby(["variable", "value"]).size().reset_index(name='counts')
    for cardio, data in df.groupby("cardio")
}

data_dict['cardio_0']

      variable  value  counts
0  cholesterol      0       2
1   overweight      0       1
2   overweight      1       1

data_dict['cardio_1']

      variable  value  counts
0  cholesterol      0       1
1  cholesterol      1       2
2   overweight      1       3
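
To try this without the full file, here is a minimal sketch that rebuilds only the ten rows visible in the question's printout and runs the same comprehension. The sample frame is an assumption based on that printout; the elided rows are not included, so the counts apply to this sample only.

```python
import pandas as pd

# The ten rows visible in the question, used as a small stand-in
# for the full 420,000-row frame (assumption: elided rows are ignored).
df = pd.DataFrame({
    "cardio":   [0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
    "variable": ["cholesterol"] * 5 + ["overweight"] * 5,
    "value":    [0, 1, 1, 0, 0, 1, 1, 1, 1, 0],
})

# One dataframe of (variable, value) counts per cardio value.
data_dict = {
    f"cardio_{cardio}": data.groupby(["variable", "value"]).size().reset_index(name="counts")
    for cardio, data in df.groupby("cardio")
}

for key, frame in data_dict.items():
    print(key)
    print(frame)
```

On this sample the dictionary has two keys, cardio_0 and cardio_1, one per distinct value in the cardio column.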

Solution 2:[2]

Sort by the values of the cardio column:

import pandas as pd

df = pd.read_csv('../')
df = df.sort_values(by='cardio')  # the column name must be a string

Splitting the dataframe by the values of cardio:

df_cardio_0 = df[df['cardio']==0]
df_cardio_1 = df[df['cardio']==1]
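
The split on its own does not produce the counts the question asks for. One way to add them, sketched here on the ten rows visible in the question, is to count rows per (variable, value) pair in each half. The helper name count_pairs and the sample frame are illustrative assumptions, not part of the original answer.

```python
import pandas as pd

# Sample frame built from the ten rows shown in the question (assumption).
df = pd.DataFrame({
    "cardio":   [0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
    "variable": ["cholesterol"] * 5 + ["overweight"] * 5,
    "value":    [0, 1, 1, 0, 0, 1, 1, 1, 1, 0],
})


def count_pairs(frame: pd.DataFrame) -> pd.DataFrame:
    """Count rows per (variable, value) pair; hypothetical helper."""
    return frame.groupby(["variable", "value"]).size().reset_index(name="total")


# Boolean-mask split, as in the answer above.
df_cardio_0 = df[df["cardio"] == 0]
df_cardio_1 = df[df["cardio"] == 1]

totals_0 = count_pairs(df_cardio_0)
totals_1 = count_pairs(df_cardio_1)
print(totals_0)
print(totals_1)
```

The sort step is not needed for the split itself, since the boolean mask selects rows regardless of their order.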

Solution 3:[3]

You can write a function that filters the frame on each value of the split column and concatenates the pieces.

def Group(column, Frame):
    # Rows where column == 0 come first, followed by rows where column == 1.
    return pd.concat([Frame[Frame[column] == 0], Frame[Frame[column] == 1]])

Then pass your dataframe to this function.

Group("cardio", df)

Solution 4:[4]

You can try this:

# name='total' labels the count column; without it the column is named 0
df0 = df.query('cardio==0').groupby(['cardio','variable','value']).size().reset_index(name='total')
print(df0)

df1 = df.query('cardio==1').groupby(['cardio','variable','value']).size().reset_index(name='total')
print(df1)

Solution 5:[5]

If your dataframe is df_1, you can use groupby to reshape it:

df_2 = df_1.groupby(['cardio','variable','value']).size().reset_index(name='counts')

The result has one row per (cardio, variable, value) combination, with the number of matching rows in the counts column.
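
As a rough illustration, the sketch below applies the same call to the ten rows visible in the question (stored as df_1 here by assumption) and then filters the result to one cardio value, which gives the per-group view the question asks for.

```python
import pandas as pd

# Sample frame built from the ten rows shown in the question (assumption).
df_1 = pd.DataFrame({
    "cardio":   [0, 1, 1, 1, 0, 0, 1, 1, 1, 0],
    "variable": ["cholesterol"] * 5 + ["overweight"] * 5,
    "value":    [0, 1, 1, 0, 0, 1, 1, 1, 1, 0],
})

# A single groupby over all three columns counts every combination at once.
df_2 = df_1.groupby(["cardio", "variable", "value"]).size().reset_index(name="counts")
print(df_2)

# Filtering the grouped result gives the table for one cardio value.
cardio_1_counts = df_2[df_2["cardio"] == 1]
print(cardio_1_counts)
```

Because every input row falls into exactly one combination, the counts column sums to the number of rows in df_1, which is a quick sanity check on the full data as well.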

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 halfer
Solution 2 Arnab
Solution 3 David Buck
Solution 4
Solution 5 Ksam