Pandas - Select rows from a dataframe based on list in a column

The thread "Select rows from a DataFrame based on values in a column in pandas" shows how to select rows when the column contains scalars. How can I do the same when my column contains lists of varying length?

To keep it simple, assume the values within each list are identical.

           label
0          [1]
1          [1]
2          [1]
3          [1]
4          [1]
5          [1]
6          [1]
7          [1]
8          [1]
9          [1]
10         [1]
11         [1]
12         [1]
13         [1]
14         [1]
15         [1]
16         [0, 0, 0]
17         [1]
18         [1]
19         [1]
20         [1]
21         [1]
22         [1]
23         [1]
24         [1]
25         [1]
26         [1]
27      [1, 1]
28      [1, 1]
29      [0, 0]

I tried the following, which does not work. The idea was to check whether the last element of each list equals a scalar.

df_pos = df[df["label"][-1] == 1]

Using tolist()[-1] gives me only the last row's list, not the last element of every row.

df_pos = df[df["label"].tolist()[-1] == 1]

Solution 1:^[1]

Use the .str accessor, which indexes into each list element-wise:

df_pos = df[df["label"].str[-1] == 1]
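The .str accessor is not limited to strings: on an object column it indexes into each cell, so it also works when the cells are lists. A minimal sketch, assuming a small made-up frame (the values are illustrative, not the asker's full data):

```python
import pandas as pd

# Hypothetical sample data: each cell of "label" is a list of varying length.
df = pd.DataFrame({"label": [[1], [0, 0, 0], [1, 1], [0, 0]]})

# .str[-1] picks the last element of each list, row by row.
last = df["label"].str[-1]

# Boolean mask: keep only rows whose list ends in 1.
df_pos = df[last == 1]
print(df_pos)
```

Rows 0 and 2 are kept here, since those are the lists ending in 1.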

Solution 2:^[2]

Make a new column:

df['label2'] = df['label'].apply(lambda x: x[0])

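and then filter on the new column label2. Note that x[0] takes the first element of each list; use x[-1] instead to test the last element, as in the question.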
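A minimal sketch of the helper-column approach, using x[-1] rather than x[0] because the question tests the last element. The sample frame is hypothetical:

```python
import pandas as pd

# Hypothetical sample data: each cell of "label" is a list of varying length.
df = pd.DataFrame({"label": [[1], [0, 0, 0], [1, 1], [0, 0]]})

# Extract the last element of each list into a plain scalar column.
df["label2"] = df["label"].apply(lambda x: x[-1])

# Ordinary scalar filtering now works on the helper column.
df_pos = df[df["label2"] == 1]
print(df_pos)
```

The extra column costs some memory, but it keeps later filters as simple as any scalar comparison.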

Solution 3:^[3]

I worked on a similar problem: fetching the rows that contain a given value when the column's cells are lists.

This is the solution I came up with. The membership test already returns a boolean, so no "True if ... else False" or "== True" is needed.

fetched_rows = df.loc[df['column_name'].map(lambda x: check_element in x)]

Here, column_name is the name of the column to search,

and check_element is the value to look for in each list.
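A minimal sketch of this membership filter, assuming a made-up frame. The names column_name and check_element follow the answer above, and the data is hypothetical:

```python
import pandas as pd

# Hypothetical sample data: each cell of "column_name" is a list.
df = pd.DataFrame({"column_name": [[1], [0, 0, 0], [1, 1], [0, 1]]})

check_element = 0  # the value to look for inside each list

# map() applies the membership test to every list and yields a boolean Series.
mask = df["column_name"].map(lambda x: check_element in x)

# .loc with a boolean mask keeps only the rows where the test is True.
fetched_rows = df.loc[mask]
print(fetched_rows)
```

Unlike the last-element checks, this matches a value anywhere in the list, so row 3 ([0, 1]) is selected along with row 1.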

Solution 4:^[4]

Seems like a good use of itertools.groupby:

from itertools import groupby

l = ['A','A','A','B','B','C','A','A','D','D','D','B']

[f'{len(list(g))}{k}' for k, g in groupby(l)]
# ['3A', '2B', '1C', '2A', '3D', '1B']

Solution 5:^[5]

This is one way to solve this problem.

letters = ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A', 'D', 'D', 'D']

previous_letter = letters[0]
counter = 1
res = list()
for i in range(1, len(letters)):
    current_letter = letters[i]
    if current_letter == previous_letter:
        counter += 1
    else:
        # The run of previous_letter has ended, so record its count.
        res.append(f"{counter}{previous_letter}")
        previous_letter = current_letter
        counter = 1

res.append(f"{counter}{previous_letter}")
print(res)

The trick is to detect each change of letter and keep track of the count for the current run.

Solution 6:^[6]

You could count how many times the first item repeats at the front of the list, popping each occurrence with list.pop(0) as you go, and repeat this in a while loop until len(lst) returns 0.

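# lst is the input list, e.g. the letters list from Solution 5.
out = []
while len(lst) > 0:
    s = lst[0]
    c = 0
    # Consume the run of s at the front; stop at the first different item
    # (without this check the inner loop would never terminate).
    while len(lst) > 0 and lst[0] == s:
        c += 1
        lst.pop(0)
    out.append(f"{c}{s}")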

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source
Solution 1	BENY
Solution 2	hacker315
Solution 3	Sindhukumari P
Solution 4	Mark
Solution 5	Josewails
Solution 6
