Find the max value in a list of tuples, filtered by the element at index 1

I have a list like this:

dummy_list = [(8, 'N'),
 (4, 'Y'),
 (1, 'N'),
 (1, 'Y'),
 (3, 'N'),
 (4, 'Y'),
 (3, 'N'),
 (2, 'Y'),
 (1, 'N'),
 (2, 'Y'),
 (1, 'N')]

and would like to get the biggest value in the 1st column of the tuples where the value in the 2nd column is 'Y'.

How do I do this as efficiently as possible?



Solution 1:[1]

You can use the max function with a generator expression.

>>> dummy_list = [(8, 'N'),
...  (4, 'Y'),
...  (1, 'N'),
...  (1, 'Y'),
...  (3, 'N'),
...  (4, 'Y'),
...  (3, 'N'),
...  (2, 'Y'),
...  (1, 'N'),
...  (2, 'Y'),
...  (1, 'N')]
>>>
>>> max(first for first, second in dummy_list if second == 'Y')
4
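
One edge case: if no tuple carries the flag 'Y', the generator is empty and max raises ValueError. A minimal sketch, assuming you would rather get None back in that case (the helper name max_for_flag is hypothetical, not part of the answer):

```python
def max_for_flag(pairs, flag):
    """Return the largest first element among pairs whose second element equals flag."""
    # default=None is returned instead of raising ValueError when nothing matches
    return max((first for first, second in pairs if second == flag), default=None)


dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

print(max_for_flag(dummy_list, 'Y'))  # 4
print(max_for_flag(dummy_list, 'X'))  # None
```

The default keyword of max is available from Python 3.4 onward.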

Solution 2:[2]

You can use pandas for this as the data you have resembles a table.

import pandas as pd

df = pd.DataFrame(dummy_list, columns = ["Col 1", "Col 2"]) 
val_y = df[df["Col 2"] == "Y"]
max_index = val_y["Col 1"].idxmax()

print(df.loc[max_index, :])

First you convert it into a pandas dataframe using pd.DataFrame and set the column name to Col 1 and Col 2.

Then you get all the rows inside the dataframe with Col 2 values equal to Y.

Once you have this data, just select Col 1 and apply the idxmax function on it to get the index of the maximum value for that series.

You can then pass this index inside the loc function as the row and : (every) as the column to get the whole row.

This can be compressed to two lines:

max_index = df[df["Col 2"] == "Y"]["Col 1"].idxmax()
df.loc[max_index, :]

Output -

Col 1    4
Col 2    Y
Name: 1, dtype: object

Solution 3:[3]

max([i[0] for i in dummy_list if i[1] == 'Y'])

Solution 4:[4]


max([i for i in dummy_list if i[1] == 'Y'])

output: (4, 'Y')

or


max(filter(lambda x: x[1] == 'Y', dummy_list))

output: (4, 'Y')
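
Both variants return the whole tuple, and they work because Python compares tuples element by element, so among the 'Y' tuples the number in the first position decides the result. A small sketch that makes this explicit by comparing on the first element only, using operator.itemgetter (the names y_pairs and best_pair are illustrative):

```python
from operator import itemgetter

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# Keep only the tuples flagged 'Y'
y_pairs = [pair for pair in dummy_list if pair[1] == 'Y']

# Compare on the number only, ignoring the flag
best_pair = max(y_pairs, key=itemgetter(0))

print(best_pair)     # (4, 'Y')
print(best_pair[0])  # 4
```

Index the result with [0] if only the number is needed.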

Solution 5:[5]

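By passing a key function to max, the filtering and the search happen in a single pass over the list. The key ranks 'Y' tuples above all others and then orders them by the number, so this assumes at least one 'Y' tuple exists.

y_max = max(dummy_list, key=lambda p: (p[1] == 'Y', p[0]))[0]
print(y_max)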

By decoupling the pairs and grouping the numbers by their 'Y'/'N' flag:

d = {}
for k, v in dummy_list:
    d.setdefault(v, []).append(k)

y_max = max(d['Y'])
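
A side effect of the grouping approach is that the maximum for every flag is available, not only for 'Y'. A minimal sketch of that idea, using collections.defaultdict in place of setdefault (the names groups and max_per_flag are illustrative):

```python
from collections import defaultdict

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# Collect the numbers under their flag
groups = defaultdict(list)
for value, flag in dummy_list:
    groups[flag].append(value)

# One maximum per flag
max_per_flag = {flag: max(values) for flag, values in groups.items()}

print(max_per_flag)       # {'N': 8, 'Y': 4}
print(max_per_flag['Y'])  # 4
```

This builds intermediate lists, so it is mainly worthwhile when more than one flag is of interest.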

By unzipping the pairs into two sequences, one can use a mask-like approach with itertools.compress:

import itertools as it

values, flags = zip(*dummy_list)
y_max = max(it.compress(values, map('Y'.__eq__, flags)))
print(y_max)
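
To see what the mask does, the intermediate steps can be written out. This is only an illustrative sketch; the names mask and selected are not from the answer:

```python
from itertools import compress

dummy_list = [(8, 'N'), (4, 'Y'), (1, 'N'), (1, 'Y'), (3, 'N'), (4, 'Y'),
              (3, 'N'), (2, 'Y'), (1, 'N'), (2, 'Y'), (1, 'N')]

# Split the pairs into a tuple of numbers and a tuple of flags
values, flags = zip(*dummy_list)

# True where the flag is 'Y', False elsewhere
mask = [flag == 'Y' for flag in flags]

# compress keeps only the values whose mask entry is truthy
selected = list(compress(values, mask))

print(selected)       # [4, 1, 4, 2, 2]
print(max(selected))  # 4
```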

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Abdul Niyas P M
Solution 2 Zero
Solution 3 TDT
Solution 4 Will
Solution 5