Python equivalent to sed
Is there a way, without a double loop, to accomplish what the following sed command does?
Input:
Time
Banana
spinach
turkey
sed -i "/Banana/ s/$/Toothpaste/" file
Output:
Time
BananaToothpaste
spinach
turkey
What I have so far is a double loop over two lists, which takes a long time to run.
List A has a bunch of numbers; list B has the same numbers but in a different order.
For each entry in A, I want to find the line in B with that same number and add value C to the end of it.
Hope this makes sense, even if my example doesn't.
I was doing the following in Bash and it worked, but it was super slow:
for line in $(cat DATSRCLN.txt.utf8); do
    srch=$(echo $line | awk -F'^' '{print $1}');
    rep=$(echo $line | awk -F'^' '{print $2}');
    sed -i "/$(echo $srch)/ s/$/^$(echo $rep)/" tmp.1;
done
Thanks!
Solution 1:[1]
Using re.sub():
newstring = re.sub('(Banana)', r'\1Toothpaste', oldstring)
This captures one group (the text inside the parentheses) and replaces it with itself (the \1 backreference) followed by the desired suffix. The replacement must be a raw string (r'...') so that the backslash escape reaches re.sub() intact.
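As a minimal sketch of how this behaves per line, the snippet below applies the same re.sub() call to a list of lines. The sample line "Banana split" is made up for illustration; it shows that the suffix lands directly after the match, whereas the sed command in the question appends at the end of the line. The second list comprehension is one possible way to mimic the sed behaviour, not the answer author's method.

```python
import re

# Sample data; "Banana split" is an invented line used only for illustration.
lines = ["Time", "Banana split", "spinach", "turkey"]

# Solution 1's pattern: the suffix is inserted directly after the match.
after_match = [re.sub(r"(Banana)", r"\1Toothpaste", line) for line in lines]

# Closer to the sed command: append at the end of every line containing "Banana".
at_line_end = [
    line + "Toothpaste" if re.search(r"Banana", line) else line
    for line in lines
]

print(after_match)
print(at_line_end)
```

For lines that consist only of "Banana", as in the question's input, both variants give the same result.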
Solution 2:[2]
A latecomer to the race, here is my implementation of sed in Python:
import os
import re
import shutil
from tempfile import mkstemp
def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.
    In each line, replaces pattern with replace.
    Args:
        pattern (str): pattern to match (can be a compiled re pattern)
        replace (str): replacement str
        source (str): input filename
        dest (str): destination filename; if not given, source is overwritten
        count (int): maximum number of lines to change (0 means no limit)
    """
    # Start counting at zero so that count values above 1 are honoured.
    num_replaced = 0
    if dest:
        fout = open(dest, 'w')
    else:
        fd, name = mkstemp()
        # Reuse the descriptor mkstemp() already opened instead of leaking it.
        fout = os.fdopen(fd, 'w')
    with open(source, 'r') as fin, fout:
        for line in fin:
            out = re.sub(pattern, replace, line)
            fout.write(out)
            if out != line:
                num_replaced += 1
            if count and num_replaced >= count:
                break
        # Copy the remaining lines unchanged once the limit is reached.
        for line in fin:
            fout.write(line)
    if not dest:
        shutil.move(name, source)
Examples:
sed('foo', 'bar', "foo.txt")
will replace all 'foo' with 'bar' in foo.txt.
sed('foo', 'bar', "foo.txt", "foo.updated.txt")
will replace all 'foo' with 'bar' in 'foo.txt' and save the result in "foo.updated.txt".
sed('foo', 'bar', "foo.txt", count=1)
will change only the first line that contains 'foo' (every 'foo' on that line becomes 'bar') and save the result in the original file 'foo.txt'.
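For comparison, the standard library's fileinput module can do the temporary-file bookkeeping itself. The following is only a sketch of that alternative, not part of the answer above; the function name sed_inplace is hypothetical.

```python
import fileinput
import re


def sed_inplace(pattern, replace, filename):
    """Rewrite filename in place, applying re.sub() to every line.

    sed_inplace is a hypothetical helper name used for this sketch.
    """
    with fileinput.input(files=[filename], inplace=True) as stream:
        for line in stream:
            # With inplace=True, standard output is redirected into the file,
            # so print() writes the (possibly modified) line back.
            print(re.sub(pattern, replace, line), end="")
```

Calling sed_inplace('Banana', 'BananaToothpaste', 'file') would rewrite the question's sample file. Unlike the sed() function above, this sketch has no dest or count parameter.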
Solution 3:[3]
If you are using Python 3, the following module will help you: https://github.com/mahmoudadel2/pysed
wget https://raw.githubusercontent.com/mahmoudadel2/pysed/master/pysed.py
Place the module file into your Python3 modules path, then:
import pysed
pysed.replace(<Old string>, <Replacement String>, <Text File>)
pysed.rmlinematch(<Unwanted string>, <Text File>)
pysed.rmlinenumber(<Unwanted Line Number>, <Text File>)
Solution 4:[4]
You can actually call sed from Python. There are many ways to do this, but I like to use the sh module (yum -y install python-sh).
The output of my example program is as follows.
[me@localhost sh]$ cat input
Time
Banana
spinich
turkey
[me@localhost sh]$ python test_sh.py
[me@localhost sh]$ cat input
Time
Toothpaste
spinich
turkey
[me@localhost sh]$
Here is test_sh.py
import sh
sh.sed('-i', 's/Banana/Toothpaste/', 'input')
This will probably only work under Linux.
Solution 5:[5]
This can be done with a temporary file, with low system requirements and in a single pass, without reading the whole file into memory:
#!/usr/bin/python
import tempfile
import shutil
import os
# mkstemp() creates a temporary file; mkdtemp() would create a directory,
# which cannot be opened for writing.
fd, newfile = tempfile.mkstemp()
oldfile = 'stack.txt'
f = open(oldfile)
n = os.fdopen(fd, 'w')
for i in f:
    # Lines without a match are copied unchanged.
    if i.find('Banana') == -1:
        n.write(i)
        continue
    # Last row (no trailing newline)
    if i.find('\n') == -1:
        i += 'ToothPaste'
    else:
        i = i.rstrip('\n')
        i += 'ToothPaste\n'
    n.write(i)
f.close()
n.close()
os.remove(oldfile)
shutil.move(newfile, oldfile)
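The same single-pass idea extends to the question's real case of many search/value pairs: load the pairs into a dict once, then look each target line up instead of rescanning the file per pair. This is only a sketch under a stated assumption: the key is taken to be the first `^`-separated field of each target line, whereas the original sed command matches the key anywhere in the line. The function name append_by_key and the sample values are hypothetical.

```python
def append_by_key(pair_lines, target_lines, sep="^"):
    """Append each pair's value to the target line whose first field matches.

    Assumption for this sketch: the key is the first sep-separated field
    of every line, in both inputs.
    """
    # Build the lookup once: key -> value to append.
    lookup = {}
    for line in pair_lines:
        key, _, value = line.rstrip("\n").partition(sep)
        lookup[key] = value

    result = []
    for line in target_lines:
        text = line.rstrip("\n")
        key = text.partition(sep)[0]
        if key in lookup:
            # Same effect as sed's s/$/^value/ on a matching line.
            text += sep + lookup[key]
        result.append(text)
    return result


# Invented sample data, for illustration only.
pairs = ["101^apple", "202^pear"]
targets = ["202^x", "303^y", "101^z"]
print(append_by_key(pairs, targets))
```

Each input is traversed once, so the work grows with the combined size of the two files rather than with their product.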
Solution 6:[6]
I found the answer supplied by Oz123 to be great, but it didn't seem to work 100%. I'm new to Python, but I modified it and wrapped it up so it can be run from a bash script. This works on OS X, using Python 2.7.
# Replace 1 occurrence in file /tmp/1
$ search_replace "Banana" "BananaToothpaste" /tmp/1
# Replace 5 occurrences and save in /tmp/2
$ search_replace "Banana" "BananaToothpaste" /tmp/1 /tmp/2 5
The search_replace script:
#!/usr/bin/env python
import sys
import re
import shutil
from tempfile import mkstemp
total = len(sys.argv) - 1
if total < 3:
    print("Usage: SEARCH_FOR REPLACE_WITH IN_FILE {OUT_FILE} {COUNT}")
    print("by default, the input file is replaced")
    print("and the number of times to replace is 1")
    sys.exit(1)
# Parsing args one by one
search_for = sys.argv[1]
replace_with = sys.argv[2]
file_name = sys.argv[3]
if total < 4:
    file_name_dest = file_name
else:
    file_name_dest = sys.argv[4]
if total < 5:
    count = 1
else:
    count = int(sys.argv[5])
def sed(pattern, replace, source, dest=None, count=0):
    """Reads a source file and writes the destination file.
    In each line, replaces pattern with replace.
    Args:
        pattern (str): pattern to match (can be a compiled re pattern)
        replace (str): replacement str
        source (str): input filename
        dest (str): destination filename; if not given, source is overwritten
        count (int): maximum number of lines to change (0 changes nothing)
    """
    fin = open(source, 'r')
    num_replaced = 0
    fd, name = mkstemp()
    fout = open(name, 'w')
    for line in fin:
        if count and num_replaced < count:
            out = re.sub(pattern, replace, line)
            fout.write(out)
            if out != line:
                num_replaced += 1
        else:
            # Limit reached: copy the line unchanged.
            fout.write(line)
    fin.close()
    fout.close()
    # Use the function's own arguments rather than the module-level names.
    shutil.move(name, dest or source)
sed(search_for, replace_with, file_name, file_name_dest, count)
Solution 7:[7]
You could use massedit as a command-line tool:
# Will change all test*.py in subdirectories of tests.
massedit.py -e "re.sub('failIf', 'assertFalse', line)" -s tests test*.py
You could also use massedit as a library:
import massedit
filenames = ['massedit.py']
massedit.edit_files(filenames, ["re.sub('Jerome', 'J.', line)"])
Solution 8:[8]
With thanks to Oz123 above, here is a sed that does not work line by line, so your replacement can span newlines. Because the whole file is read into memory, larger files could be a problem.
import os
import re
import shutil
from tempfile import mkstemp
def sed(pattern, replace, source, dest=None):
    """Reads a source file and writes the destination file.
    Replaces pattern with replace globally through the file.
    This is not line-by-line so the pattern can span newlines.
    Args:
        pattern (str): pattern to match (can be a compiled re pattern)
        replace (str): replacement str
        source (str): input filename
        dest (str): destination filename; if not given, source is overwritten
    """
    # Read the whole file before opening the output for writing.
    with open(source, 'r') as file:
        data = file.read()
    p = re.compile(pattern)
    new_data = p.sub(replace, data)
    if dest:
        fout = open(dest, 'w')
    else:
        fd, name = mkstemp()
        fout = os.fdopen(fd, 'w')
    fout.write(new_data)
    fout.close()
    if not dest:
        shutil.move(name, source)
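To illustrate what whole-text matching allows, here is a small sketch that works on a string rather than a file. The first substitution uses a pattern that spans a newline; the second uses re.MULTILINE to reproduce the question's sed command on the whole text at once. The sample text is the question's input; the replacement "Banana and spinach" is invented for illustration.

```python
import re

text = "Time\nBanana\nspinach\nturkey\n"

# A pattern that spans a newline, which a line-by-line loop cannot match.
merged = re.sub(r"Banana\nspinach", "Banana and spinach", text)

# With re.MULTILINE, ^ and $ match at every line boundary, so this appends
# "Toothpaste" to each line containing "Banana", like the sed command.
appended = re.sub(r"^(.*Banana.*)$", r"\1Toothpaste", text, flags=re.MULTILINE)

print(merged, end="")
print(appended, end="")
```

Either pattern and flag combination could be compiled with re.compile() and passed as the pattern argument, since re.compile() accepts an already compiled pattern unchanged.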
Solution 9:[9]
You can use sed, awk, or grep from Python (with some restrictions). Here is a very simple example that changes Banana to BananaToothpaste in the file. You can edit and use it. (I tested it and it worked. Note: if you are testing under Windows, you should install the sed command and add it to the PATH first.)
import os
file="a.txt"
oldtext="Banana"
newtext="BananaToothpaste"
os.system('sed -i "s/{}/{}/g" {}'.format(oldtext,newtext,file))
#print(f'sed -i "s/{oldtext}/{newtext}/g" {file}')
print('This command was applied: sed -i "s/{}/{}/g" {}'.format(oldtext,newtext,file))
If you want to see the results in the file directly, use "type" on Windows or "cat" on Linux:
# For Windows:
os.popen("type " + file).read()
# For Linux:
os.popen("cat " + file).read()
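Building the command with string formatting and os.system() sends it through a shell, so quotes or slashes in the search text can break it. One possible alternative, sketched here, is subprocess.run() with an argument list, which bypasses the shell. To stay portable between GNU and BSD sed, this sketch feeds the text on standard input instead of using -i; the helper name sed_text is hypothetical.

```python
import shutil
import subprocess


def sed_text(script, text):
    """Run `sed script` on text via stdin and return sed's output.

    sed_text is a hypothetical helper name used for this sketch.
    """
    completed = subprocess.run(
        ["sed", script],      # argument list: no shell quoting involved
        input=text,
        capture_output=True,  # requires Python 3.7 or newer
        text=True,
        check=True,           # raise CalledProcessError if sed fails
    )
    return completed.stdout


# Only run the demonstration when a sed binary is on the PATH.
if shutil.which("sed"):
    print(sed_text("/Banana/ s/$/Toothpaste/", "Time\nBanana\nspinach\nturkey\n"), end="")
```

The sed script itself is still interpreted by sed, so characters that are special in sed expressions need escaping there.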
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | heltonbiker |
| Solution 2 | oz123 |
| Solution 3 | M. Adel |
| Solution 4 | shrewmouse |
| Solution 5 | |
| Solution 6 | |
| Solution 7 | leafonsword |
| Solution 8 | rleir |
| Solution 9 | |
