Pandas - Select rows from a dataframe based on list in a column
The thread "Select rows from a DataFrame based on values in a column in pandas" shows how to select rows when the column contains scalars. How can I do the same when each cell of the column contains a list of varying length?
To keep it simple, assume the values within each list are all the same.
label
0 [1]
1 [1]
2 [1]
3 [1]
4 [1]
5 [1]
6 [1]
7 [1]
8 [1]
9 [1]
10 [1]
11 [1]
12 [1]
13 [1]
14 [1]
15 [1]
16 [0,0,0]
17 [1]
18 [1]
19 [1]
20 [1]
21 [1]
22 [1]
23 [1]
24 [1]
25 [1]
26 [1]
27 [1, 1]
28 [1, 1]
29 [0, 0]
I tried the following, which does not work. The idea was to check whether the last element of each list equals a scalar.
df_pos = df[df["label"][-1] == 1]
Using tolist()[-1] gives me only the last row.
df_pos = df[df["label"].tolist()[-1] == 1]
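A small sketch of why both attempts miss, assuming a hypothetical four-row frame with the default integer index (the real frame in the question is longer). Neither expression produces a row-by-row boolean mask, which is what df[...] needs for filtering.

```python
import pandas as pd

# Hypothetical miniature of the frame in the question.
df = pd.DataFrame({"label": [[1], [0, 0, 0], [1, 1], [0, 0]]})

# Attempt 1: with the default integer index, [-1] is a label lookup,
# not "last element of each list", and there is no row labelled -1.
try:
    df["label"][-1]
    first_attempt = "no error"
except KeyError:
    first_attempt = "KeyError"

# Attempt 2: tolist() returns a plain Python list of lists,
# so [-1] is simply the last row's list.
last_row_value = df["label"].tolist()[-1]

# Comparing that one list to 1 yields a single bool, not a mask.
comparison = last_row_value == 1
```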
Solution 1:[1]
Use the .str accessor, which also indexes into lists element-wise:
df_pos = df[df["label"].str[-1] == 1]
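As a minimal sketch of this answer on a hypothetical four-row frame (the data is made up for illustration), .str[-1] picks the last element of each list, and the comparison then yields one boolean per row:

```python
import pandas as pd

# Hypothetical sample data; lists of varying length in one column.
df = pd.DataFrame({"label": [[1], [0, 0, 0], [1, 1], [0, 0]]})

# Last element of every list, as a Series aligned with df.
last = df["label"].str[-1]

# Boolean mask -> keep rows whose last element is 1.
df_pos = df[last == 1]
```

The .str accessor is meant for strings, but its indexing also works on other sequences stored in an object column, which is what this relies on.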
Solution 2:[2]
Make a new column:
df['label2'] = df['label'].apply(lambda x: x[0])
This takes the first element of each list (use x[-1] for the last). Then filter on the label2 column as you would on any scalar column.
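A short sketch of the helper-column approach end to end, assuming hypothetical sample data and using x[-1] because the question asks about the last element:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"label": [[1], [0, 0, 0], [1, 1], [0, 0]]})

# Scalar helper column holding the last element of each list.
df["label2"] = df["label"].apply(lambda x: x[-1])

# Ordinary scalar filtering now works.
df_pos = df[df["label2"] == 1]
```

Note that x[-1] raises IndexError on an empty list, so this assumes every row has at least one element.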
Solution 3:[3]
I ran into a similar problem: I needed to fetch the rows that contain a given value, where the column's values are lists.
This is the solution I came up with:
fetched_rows = df.loc[df['column_name'].map(lambda x: check_element in x)]
Here column_name is the column to search, and check_element is the value to look for in each list.
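A minimal runnable sketch of the membership test, with made-up data; column_name and check_element are the placeholder names from the answer:

```python
import pandas as pd

# Hypothetical sample data; one row mixes 1 and 0.
df = pd.DataFrame({"column_name": [[1], [0, 0, 0], [1, 0], [0, 0]]})
check_element = 1

# `in` already returns a bool, so the mask needs no extra comparison.
mask = df["column_name"].map(lambda x: check_element in x)
fetched_rows = df.loc[mask]
```

Unlike the last-element check, this keeps a row if the value appears anywhere in its list.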
Solution 4:[4]
Seems like a good use of itertools.groupby:
from itertools import groupby
l = ['A','A','A','B','B','C','A','A','D','D','D','B']
[f'{len(list(g))}{k}' for k, g in groupby(l)]
# ['3A', '2B', '1C', '2A', '3D', '1B']
Solution 5:[5]
This is one way to solve this problem.
letters = ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A', 'D', 'D', 'D']
previous_letter = letters[0]
counter = 1
res = list()
for i in range(1, len(letters)):
    current_letter = letters[i]
    if current_letter == previous_letter:
        counter += 1
    else:
        # the run that just ended belongs to previous_letter
        res.append(f"{counter}{previous_letter}")
        previous_letter = letters[i]
        counter = 1
# flush the final run
res.append(f"{counter}{previous_letter}")
print(res)
The trick is in checking a change of letters and keeping track of the count.
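The change-detection idea can also be wrapped in a function. This is only a sketch: run_length_encode is a hypothetical name, and the empty-list guard is an addition that the answer does not include.

```python
def run_length_encode(items):
    """Return strings like '3A' for each run of equal neighbours."""
    res = []
    if not items:
        # Nothing to encode; avoids indexing into an empty list.
        return res
    previous = items[0]
    counter = 1
    for current in items[1:]:
        if current == previous:
            counter += 1
        else:
            # A run just ended: record it, then start a new one.
            res.append(f"{counter}{previous}")
            previous = current
            counter = 1
    # Record the final run.
    res.append(f"{counter}{previous}")
    return res


encoded = run_length_encode(
    ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A', 'D', 'D', 'D']
)
# encoded == ['3A', '2B', '1C', '2A', '3D']
```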
Solution 6:[6]
You could count how many times the first item repeats at the front of the list, popping each repeat as you go, and do this in a while loop until len(lst) returns 0. Note that this empties lst.
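lst = ['A', 'A', 'A', 'B', 'B', 'C', 'A', 'A', 'D', 'D', 'D', 'B']  # example input
out = []
while len(lst) > 0:
    s = lst[0]  # item that starts the current run
    c = 0
    # pop items off the front only while they match s
    while len(lst) > 0 and lst[0] == s:
        c += 1
        lst.pop(0)
    out.append(f"{c}{s}")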
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | BENY |
| Solution 2 | hacker315 |
| Solution 3 | Sindhukumari P |
| Solution 4 | Mark |
| Solution 5 | Josewails |
| Solution 6 | |
