Group rows in text file and select conditionally which to keep with Python

I have a file which I have to manipulate using Python.
Each entry in the file has the following fields:
student_id Name Surname DOB Sex Class

Example of raw data

1 John Taylor 2010-05-07 M ClsA
2 Mary Oliver 2010-01-29 F ClsA
3 Peter Edwards 2010-10-23 M ClsA
4 Robert Lewis 2010-12-02 M ClsB
5 Emily Clark 2009-12-04 F ClsB
6 Jeremy Wood 2009-08-15 M ClsB
7 Will Bennett 2008-11-30 M ClsC
8 Tanya Lee 2009-05-11 F ClsC

I have to create a new file that contains all the rows of only those classes whose oldest student is male.
Taking the above example, my new file should look like this:

Example of final data

How can I group the rows by class and then conditionally write them to the new file?



Solution 1:[1]

import pandas as pd

a = [['John Taylor', '2010-05-07', 'M', 'ClsA'],
['Mary Oliver', '2010-01-29', 'F', 'ClsA'],
['Peter Edwards', '2010-10-23', 'M', 'ClsA'],
['Robert Lewis', '2010-12-02', 'M', 'ClsB'],
['Emily Clark', '2009-12-04', 'F' ,'ClsB'],
['Jeremy Wood', '2009-08-15', 'M', 'ClsB'],
['Will Bennett', '2008-11-30', 'M', 'ClsC'],
['Tanya Lee', '2009-05-11', 'F', 'ClsC']]
df = pd.DataFrame(a, columns=['name', 'date', 'gender', 'cl'])
df['date'] = pd.to_datetime(df['date'])

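# Collect the classes whose oldest student (earliest date of birth) is male.
classes_to_keep = []
for cls in df['cl'].unique():
    # Sort the class by date of birth; the first row is the oldest student.
    oldest = df.loc[df['cl'] == cls].sort_values(by='date', ascending=True).iloc[0]
    if oldest['gender'] == 'M':
        classes_to_keep.append(cls)

# Print every student who belongs to one of the selected classes.
print(df.loc[df['cl'].isin(classes_to_keep)].reset_index())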

Output

   index          name        date gender    cl
0      3  Robert Lewis  2010-12-02      M  ClsB
1      4   Emily Clark  2009-12-04      F  ClsB
2      5   Jeremy Wood  2009-08-15      M  ClsB
3      6  Will Bennett  2008-11-30      M  ClsC
4      7     Tanya Lee  2009-05-11      F  ClsC

In the loop, I check for each class whether its oldest student (the one with the earliest date of birth) is male, and if so I add the class to a list. I then use that list to output all the students of the selected classes.
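
The same selection can probably be written without an explicit loop by using groupby, and extended to read and write the file as the question asks. The following is only a sketch under some assumptions: the file is whitespace-separated with no header row, each name is exactly one first name and one surname, and the column names in COLUMNS and the function name keep_classes_with_oldest_male are made up for illustration. If two students in a class share the earliest date of birth, idxmin picks the first one it finds.

```python
import io

import pandas as pd

# Hypothetical column names, following the field order given in the question.
COLUMNS = ["student_id", "name", "surname", "dob", "sex", "cls"]


def keep_classes_with_oldest_male(source, destination):
    """Write to destination the rows of every class whose oldest student is male."""
    # Assumes whitespace-separated fields and no header row.
    df = pd.read_csv(source, sep=r"\s+", names=COLUMNS, parse_dates=["dob"])
    # Row label of the earliest date of birth (the oldest student) per class.
    oldest_idx = df.groupby("cls")["dob"].idxmin()
    oldest = df.loc[oldest_idx]
    # Names of the classes whose oldest student is male.
    keep = oldest.loc[oldest["sex"] == "M", "cls"]
    result = df[df["cls"].isin(keep)]
    result.to_csv(destination, sep=" ", header=False, index=False,
                  date_format="%Y-%m-%d")
    return result


# Demo with in-memory buffers; file paths can be passed instead.
sample = io.StringIO(
    "1 John Taylor 2010-05-07 M ClsA\n"
    "2 Mary Oliver 2010-01-29 F ClsA\n"
    "3 Peter Edwards 2010-10-23 M ClsA\n"
    "4 Robert Lewis 2010-12-02 M ClsB\n"
    "5 Emily Clark 2009-12-04 F ClsB\n"
    "6 Jeremy Wood 2009-08-15 M ClsB\n"
    "7 Will Bennett 2008-11-30 M ClsC\n"
    "8 Tanya Lee 2009-05-11 F ClsC\n"
)
buffer = io.StringIO()
kept = keep_classes_with_oldest_male(sample, buffer)
print(buffer.getvalue())
```

With the sample data this keeps ClsB and ClsC, the same classes the loop selects. ClsA is dropped because its oldest student, Mary Oliver, is female.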

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1