How to keep the maximum value and remove the other values of a list, taking other lists into consideration

There are 3 lists. The first 2 lists hold the ids and the third list holds the values. How can I keep the maximum value in the third column for each id pair and remove the other values? For example:

list1 list2 list3
1 4 17
2 32 44
1 5 7
2 32 5

The result should be like:

list1 list2 list3
1 4 17
2 32 44
1 5 7

These lists have more than 10 thousand values, so it would be great to avoid loops.



Solution 1:[1]

import pandas as pd

# same data as in the question
df = pd.DataFrame({
    'list1' : [1,2,1,2],
    'list2' : [4,32,5,32],
    'list3' : [17,44,7,5],
})

You can do it like this:

1. Sort by list3 in descending order and keep the first row of each (list1, list2) pair:

df.sort_values('list3', ascending=False).drop_duplicates(subset=['list1', 'list2'], keep='first').sort_index()

or 2. Group by (list1, list2) and take the maximum of list3 in each group:

df.groupby(['list1', 'list2'])['list3'].max().reset_index()

Update for 2.: passing as_index=False gives the same result without the separate reset_index() call:

out = df.groupby(['list1', 'list2'], as_index=False)['list3'].max()
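
To make the difference between the two approaches visible, here is a small runnable sketch using the data from the question. The variable names by_sort and by_group are only illustrative. Both approaches keep the same rows, but the row order should differ: the first keeps the original order, while groupby orders the result by the group keys.

```python
import pandas as pd

df = pd.DataFrame({
    'list1': [1, 2, 1, 2],
    'list2': [4, 32, 5, 32],
    'list3': [17, 44, 7, 5],
})

# Approach 1: sort by value, keep the first (largest) row per id pair,
# then restore the original row order with sort_index()
by_sort = (
    df.sort_values('list3', ascending=False)
      .drop_duplicates(subset=['list1', 'list2'], keep='first')
      .sort_index()
)

# Approach 2: one row per (list1, list2) group, ordered by the group keys
by_group = df.groupby(['list1', 'list2'], as_index=False)['list3'].max()

print(by_sort)
print(by_group)
```

If the original row order matters, as in the expected result of the question, the first approach is probably the more convenient one.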

Solution 2:[2]

With loops, you can do something like this:

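def get_max(t):
    # t is a list of rows, each row being (list1, list2, list3)
    res = []

    for row in t:
        # keep the row when its last value is larger than both of its other values
        if row[2] > row[1] and row[2] > row[0]:
            res.append(row)

    return res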

You loop over each row; if the value in the last column is greater than the other ones, you keep the row. You can also write it in one line:

def get_max(t):
    return [row for row in t if row[2] > row[1] and row[2] > row[0]]

PS: Since you have a lot of data, note that the complexity of this algorithm is O(n), i.e. linear in the number of rows.
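
The function above compares the values within each row. If the goal is instead to keep the largest list3 value for each (list1, list2) pair, as the question describes, a loop-based variant could track the best row per pair in a dictionary. This is only a sketch: the name keep_max_per_id and the tuple layout of the rows are assumptions made for illustration. It is still a single pass, so it stays O(n).

```python
def keep_max_per_id(rows):
    """Keep, for each (id1, id2) pair, the row with the largest value."""
    best = {}  # (id1, id2) -> row with the largest value seen so far
    for row in rows:
        key = (row[0], row[1])
        if key not in best or row[2] > best[key][2]:
            best[key] = row
    # dicts preserve insertion order, so pairs come out in first-seen order
    return list(best.values())


rows = [(1, 4, 17), (2, 32, 44), (1, 5, 7), (2, 32, 5)]
print(keep_max_per_id(rows))
```

With three separate lists, the rows can be built with zip(list1, list2, list3) before calling the function.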

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2 Lukas Laudrain