Search list items (paths collected with os.walk) in a CSV file; if an item is not found, perform some action
I want to check whether each item of a list (filepath_list) appears in a DataFrame (loaded from a CSV) that holds a list of paths.
If the item is there, I skip it and check the next item in the list. If it is not there, I do some processing.
The code I am using is:
l = os.listdir(dir_path)
filepath_list = []
for root, dirs, files in os.walk(dir_path):
    for f in files:
        if 'Apr.log' in f:
            filepath_list.append(os.path.join(root,f))
df_path_list_csv = pd.read_csv(r"C:\Users\ABC6\OneDrive - ABB\Files\Folder List.csv")
for g in filepath_list:
    if df_path_list_csv['Detail'].str.contains(g).any():
        print(g)
The error I am getting is:
incomplete escape \U at position 2
Can anyone please help me correct the error in this code? Is it because of the backslash? If so, how can I fix it? I tried replacing "\" with "/", but it still does not work.
The variable filepath_list shows "\\" while the df_path_list_csv DataFrame shows "\".
filepath_list =
['C:\\Users\\ABC6\\OneDrive - ABB\\Job Files\\SA_R 1_Ret\\13Mar22\\Apr.log',
'C:\\Users\\ABC6\\OneDrive - ABB\\Job Files\\SA_Run 2_Dri\\28Mar22\\Apr.log',
'C:\\Users\\ABC6\\OneDrive - ABB\\Job Files\\SA_Well_Run 2_Dri\\29Mar22\\Apr.log']
df_path_list_csv:
0 C:\Users\ ABC6\OneDrive - ABB\Job Files...
1 C:\Users\ ABC6\OneDrive - ABB\ Job Files...
Solution 1:[1]
By default, the pandas str.contains method assumes the pattern is a regular expression, so it treats a backslash as an escape character. To avoid this behavior and use your pattern as a literal string, set the regex parameter to False.
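The failure can be reproduced without pandas, which may make the cause easier to see. The following is a minimal sketch, assuming a hypothetical sample path in the style of the question; it compiles the path as a regular expression with Python's re module, then shows literal matching through pandas with regex=False.

```python
import re
import pandas as pd

# Hypothetical sample path, modelled on the ones shown in the question.
needle = r"C:\Users\ABC6\OneDrive - ABB\Job Files\SA_R 1_Ret\13Mar22\Apr.log"

# Compiled as a regular expression, "\U" is read as the start of a
# Unicode escape that is never completed, so compilation fails.
try:
    re.compile(needle)
    regex_error = None
except re.error as exc:
    regex_error = str(exc)

# re.escape() backslash-escapes the special characters, so the same text
# becomes a valid pattern that matches the path literally.
escaped_ok = re.search(re.escape(needle), needle) is not None

# With regex=False, pandas does a plain substring test and never
# interprets the backslashes.
details = pd.Series([needle], name="Detail")
literal_hit = bool(details.str.contains(needle, regex=False).any())

print(regex_error)
print(escaped_ok, literal_hit)
```

The doubled backslashes that appear when the list is printed are only how Python displays a single backslash inside a string; the strings themselves contain one backslash per separator, the same as the CSV values.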
import os
import pandas as pd

# dir_path is the root folder to scan, as in the question
filepath_list = []
for root, dirs, files in os.walk(dir_path):
    for f in files:
        if 'Apr.log' in f:
            filepath_list.append(os.path.join(root, f))
# Doubled backslashes are equivalent to the raw string used in the question
df_path_list_csv = pd.read_csv("C:\\Users\\ABC6\\OneDrive - ABB\\Files\\Folder List.csv")
for g in filepath_list:
    # Here you have to set to False the `regex` parameter of the `contains` method
    if df_path_list_csv['Detail'].str.contains(g, regex=False).any():
        print(g)
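The question also asks to skip paths that are already listed and process the rest. One way to do that with the same literal match is sketched below. The helper name split_known_and_new and the sample data are hypothetical, and na=False is added on the assumption that the Detail column may contain empty cells.

```python
import pandas as pd

def split_known_and_new(paths, known_series):
    """Split paths into (already_listed, needs_processing).

    Uses literal substring matching (regex=False), so backslashes in
    Windows paths are never interpreted as regex escapes.
    """
    already, pending = [], []
    for p in paths:
        # na=False treats missing cells as "no match"
        if known_series.str.contains(p, regex=False, na=False).any():
            already.append(p)
        else:
            pending.append(p)
    return already, pending

# Hypothetical stand-in for df_path_list_csv['Detail']
known = pd.Series(
    [r"C:\Users\ABC6\OneDrive - ABB\Job Files\SA_R 1_Ret\13Mar22\Apr.log", None],
    name="Detail",
)
paths = [
    r"C:\Users\ABC6\OneDrive - ABB\Job Files\SA_R 1_Ret\13Mar22\Apr.log",
    r"C:\Users\ABC6\OneDrive - ABB\Job Files\SA_Run 2_Dri\28Mar22\Apr.log",
]

already, pending = split_known_and_new(paths, known)
for p in pending:
    print("needs processing:", p)
```

In the original script, the call would take filepath_list and df_path_list_csv['Detail'] as arguments, and the processing step would replace the print in the final loop.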
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Yannick Guéhenneux |
