Python regex - updating file by adding character for all the matched pattern
I am trying to create a script to:
- read all the text files in my folder
- find the words that match the pattern [r'\d\d\d\d'+"H"] (e.g. 1234H)
- replace them with the corresponding time (e.g. 12:34:00)
- save the file
Currently my code is this, and I'm not sure what went wrong. Please advise, thank you!
import os
import re
path = r'C:\Users\CL\Desktop\regex'
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith('.txt'): #find all .txt files
            path = os.path.join(root, file)
            f = open(path,'a')
            pattern = r'\d\d\d\d'+"H" #pattern
            replacewords = re.findall(pattern, f) #find all words with this pattern
            ...... #replace matched words with eg. 12:23:00
            f.write() #save file
            f.close()
Sample text content:
1111H, 1234H, 1115H
Solution 1:[1]
You can use
import os, re
path = r'C:\Users\CL\Desktop\regex'
for root, dirs, files in os.walk(path):
    for file in files:
        if file.lower().endswith('.txt'):  # find all .txt / .TXT files
            path = os.path.join(root, file)
            pattern = r'(\d{2})(\d{2})H'  # two groups of two digits, then a literal H
            with open(path, 'r+') as f:  # read and update
                contents = re.sub(pattern, r'\1:\2:00', f.read())
                f.seek(0)      # go back to the start of the file
                f.truncate()   # clear the old contents
                f.write(contents)
NOTE:
- `if file.lower().endswith('.txt')` makes the text file search case-insensitive
- The `(\d{2})(\d{2})H` pattern matches and captures the first two digits in Group 1 and the next two digits before `H` into Group 2
- When replacing, `\1` refers to the Group 1 value and `\2` refers to the Group 2 value
- The file mode is set to `r+` so that the file can be both read and updated
- `f.seek(0)` and `f.truncate()` allow re-writing the file with the updated contents
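To check the behaviour without touching real files, the substitution and the in-place update can be tried on a temporary file holding the sample text from the question. This is only a minimal sketch: the helper names `convert_times` and `update_file` and the file name `sample.txt` are made up for illustration and are not part of the original answer.

```python
import os
import re
import tempfile

# Same pattern as in the answer: two capture groups of two digits, then "H".
PATTERN = re.compile(r'(\d{2})(\d{2})H')


def convert_times(text):
    """Rewrite every 'HHMMH' token as 'HH:MM:00' (hypothetical helper)."""
    return PATTERN.sub(r'\1:\2:00', text)


def update_file(file_path):
    """Read a file, convert its contents, and overwrite it in place."""
    with open(file_path, 'r+') as f:
        contents = convert_times(f.read())
        f.seek(0)       # move back to the start of the file
        f.truncate()    # drop the old contents
        f.write(contents)


# Try it on a throwaway file containing the question's sample text.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.txt')
    with open(sample_path, 'w') as f:
        f.write('1111H, 1234H, 1115H')
    update_file(sample_path)
    with open(sample_path) as f:
        result = f.read()

print(result)  # 11:11:00, 12:34:00, 11:15:00
```

Note that the pattern also matches the last four digits of a longer run (for example inside `12345H`). If that matters for your files, wrapping the pattern in `\b` word boundaries is one possible tightening, though that goes beyond what the answer proposes.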
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wiktor Stribiżew |
