How to search through a pandas dataframe for a match?
I have the following pandas dataframe:
paths tags
0 /home/onur/PycharmProjects/file-tagging/data/w... NaN
1 /home/onur/PycharmProjects/file-tagging/data/d... NaN
2 /home/onur/PycharmProjects/file-tagging/data/d... NaN
3 /home/onur/PycharmProjects/file-tagging/data/d... NaN
4 /home/onur/PycharmProjects/file-tagging/data/w... NaN
... ... ...
5404 /home/onur/PycharmProjects/file-tagging/.idea/... NaN
5405 /home/onur/PycharmProjects/file-tagging/.idea/... NaN
5406 /home/onur/PycharmProjects/file-tagging/.idea/... NaN
5407 /home/onur/PycharmProjects/file-tagging/.idea/... NaN
5408 /home/onur/PycharmProjects/file-tagging/conten... NaN
I want to iterate through every row in paths and, if the entry meets a specific requirement, add a tag to the corresponding row in tags. A single tags row could hold more than one entry. What is a good way to approach this?
I also want to save the result to a data file so that I can edit it later on.
This is my code so far:
import os
import pandas as pd

data_list = []

def myprint(path):
    # Make sure the log file exists
    if not os.path.exists('content-log.txt'):
        open('content-log.txt', 'a').close()
    elements = os.listdir(path)
    for element in elements:
        if os.path.isdir(os.path.join(path, element)):
            # Recurse into subdirectories
            myprint(os.path.join(path, element))
        else:
            data_list.append(os.path.join(path, element))

# Build the DataFrame once myprint(...) has filled data_list
df = pd.DataFrame(data_list, columns=['paths'])
df['tags'] = pd.Series(dtype='str')
print(df)
Solution 1:[1]
Advice
- Use the os.walk function instead of a loop with an isdir check.
Example
data_list = []
for root, dirs, files in os.walk(path):
    data_list.extend([os.path.join(root, file) for file in files])
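As a slightly fuller sketch of the same idea, the walk can be wrapped in a function that returns the DataFrame directly. The function name collect_paths and the sorting step are my own additions for illustration, not something from the question:

```python
import os
import pandas as pd


def collect_paths(root):
    """Return a DataFrame with one row per file found under root."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        # os.walk already descends into subdirectories, so no recursion is needed.
        paths.extend(os.path.join(dirpath, name) for name in filenames)
    # Sort for a stable row order; the order os.walk yields is not guaranteed.
    return pd.DataFrame({"paths": sorted(paths)})
```

Calling collect_paths on the project directory should give the same paths column as the recursive version in the question.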
- Use a judge function to return your specific requirement.
Example
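def judge(elem):
    # Return the list of tags for one path; adapt the condition to your requirement
    return [elem, "Ok"] if elem.endswith("json") else ["No-way"]
# The question's DataFrame stores the paths in the "paths" column
df["tags"] = df["paths"].apply(judge)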
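Since a single row may need several tags and the result should be saved for later editing, here is a minimal sketch of how that could look. The RULES table, the tag names, the helper names and the example paths are all made up for illustration, and JSON is just one possible file format (chosen here because it keeps each tag list as a real list, which CSV would not):

```python
import pandas as pd

# Hypothetical rules: tag name -> test applied to a path string.
RULES = {
    "json": lambda path: path.endswith(".json"),
    "ide": lambda path: "/.idea/" in path,
    "data": lambda path: "/data/" in path,
}


def tags_for(path):
    """Return every tag whose rule matches the path (possibly none)."""
    return [tag for tag, matches in RULES.items() if matches(path)]


def add_tags(df):
    """Return a copy of df with a list of tags per row in the 'tags' column."""
    out = df.copy()
    out["tags"] = out["paths"].apply(tags_for)
    return out


def save_tags(df, filename):
    # One JSON record per row, indented so the file can be edited by hand.
    df.to_json(filename, orient="records", indent=2)


def load_tags(filename):
    # Read the records back into a DataFrame; tags come back as lists.
    return pd.read_json(filename, orient="records")


# Made-up paths, for illustration only.
example = add_tags(pd.DataFrame({"paths": [
    "/project/data/a.json",
    "/project/.idea/workspace.xml",
    "/project/readme.md",
]}))
print(example)
```

A row that matches no rule ends up with an empty list rather than NaN, which makes it easier to append tags later.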
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | FavorMylikes |
