Searching for multiple strings over multiple columns in Python
I am trying to find multiple patterns (the first 3 characters of a string) across multiple columns. So far, I have been able to find one pattern across multiple columns with the following code:
df['C07_location'] = df[colnames_locations].applymap(lambda x: 'C07' in x).any(1).astype(int)
In this case, it looks for the string C07 in all the location columns. However, I have 30 of these locations to look for, in a list that looks something like this:
unique_locations = ['C07', 'C08', 'C11', 'C14']
This is an example of what the original dataset looks like:
   location_1 location_2 location_3  ...
0        C110       C072        NaN
1         NaN        NaN        NaN
2        C147       C144       C112
3        C082       C079        NaN
4        C071       C110       C145
...       ...        ...        ...
I would like to create a new column for each unique location, with the end result looking like this:
   location_1 location_2 location_3  C07_location  C08_location  C11_location  ...
0        C110       C072        NaN             1             0             1
1         NaN        NaN        NaN             0             0             0
2        C147       C144       C112             0             0             1
3        C082       C079        NaN             1             1             0
4        C071       C110       C145             1             0             1
...       ...        ...        ...           ...           ...           ...
Any guidance in the right direction is much appreciated!
Solution 1:[1]
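If I understood your question correctly, you want to create a separate indicator column for each entry in unique_locations.
If so, you can loop over the list. Because the entries in unique_locations are three-character prefixes (C07 should match C072), test with str.startswith rather than exact equality, which would never match a four-character code:
for loc in unique_locations:
    # astype(str) lets .str work on every column; na=False treats missing values as non-matches
    matches = df[colnames_locations].astype(str).apply(lambda col: col.str.startswith(loc, na=False))
    df[loc + '_location'] = matches.any(axis=1).astype(int)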
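For anyone who wants to check the result against the expected output in the question, here is a minimal, self-contained sketch. It rebuilds the five sample rows and derives each indicator column by comparing the first three characters of every cell with the location code. It assumes that prefix matching is what is wanted and that every code is at least three characters long; the intermediate name prefixes is only illustrative and does not come from the question.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question.
df = pd.DataFrame({
    "location_1": ["C110", np.nan, "C147", "C082", "C071"],
    "location_2": ["C072", np.nan, "C144", "C079", "C110"],
    "location_3": [np.nan, np.nan, "C112", np.nan, "C145"],
})
colnames_locations = ["location_1", "location_2", "location_3"]
unique_locations = ["C07", "C08", "C11", "C14"]

# Slice every cell down to its first three characters once, up front.
# astype(str) guards against an all-NaN column, which would otherwise have
# no .str accessor; a missing cell never equals a real location code.
prefixes = df[colnames_locations].astype(str).apply(lambda col: col.str[:3])

for loc in unique_locations:
    # A row gets 1 if any of its location columns starts with this code.
    df[loc + "_location"] = prefixes.eq(loc).any(axis=1).astype(int)

print(df)
```

Slicing the prefixes once outside the loop avoids repeating the string work for each of the 30 locations; only the cheap equality test runs per location.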
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | ibadia |
