Removing all spaces in a text file with Python 3.x

So I have this crazy long text file made by my crawler, and for some reason it added a bunch of spaces after the links, like this:

https://example.com/asdf.html                                (note the spaces)
https://example.com/johndoe.php                              (again)

I want to get rid of the spaces but keep the newlines. Keep in mind that the text file is 4,000+ lines long. I tried to do it myself but realized that I have no idea how to loop through the lines of a file.



Solution 1:[1]

It seems you can't edit a file in place directly in Python, so here is my suggestion: read all the lines, remove the spaces, and write the lines back.

# first get all lines from file
with open('file.txt', 'r') as f:
    lines = f.readlines()

# remove spaces
lines = [line.replace(' ', '') for line in lines]

# finally, write lines in the file
with open('file.txt', 'w') as f:
    f.writelines(lines)
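
If you would rather not hold every line in memory, the same idea can be sketched as a function that streams from one file to another. The name `remove_spaces` and the separate output path are my own assumptions, not part of the answer above:

```python
def remove_spaces(src_path, dst_path):
    """Copy src_path to dst_path, dropping every space character.

    Only ' ' is replaced, so the newline at the end of each line survives.
    Returns the number of lines written.
    """
    count = 0
    with open(src_path, 'r') as src, open(dst_path, 'w') as dst:
        for line in src:  # iterating a file object yields one line at a time
            dst.write(line.replace(' ', ''))
            count += 1
    return count
```

Writing to a second file also means the original is still intact if something goes wrong halfway through.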

Solution 2:[2]

You can open the file, read it line by line, and strip the whitespace from each line:

Python 3.x:

with open('filename') as f:
    for line in f:
        print(line.strip())

Python 2.x:

with open('filename') as f:
    for line in f:
        print line.strip()

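This removes the leading and trailing whitespace (including the newline) from each line and prints the result. It does not remove spaces inside a line, and it does not change the file itself.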

Hope it helps!
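
To make the behaviour of `strip()` concrete, here is a small sketch that works on a string instead of a file. The helper name `strip_each_line` and the sample text are made up for illustration:

```python
def strip_each_line(text):
    """Strip leading/trailing whitespace from every line.

    A '\n' is added back after each line, so the result always ends
    with a newline even if the input did not.
    """
    return ''.join(line.strip() + '\n' for line in text.splitlines())


sample = "https://example.com/asdf.html      \n  https://example.com/a b.php  \n"
cleaned = strip_each_line(sample)
print(repr(cleaned))
# → 'https://example.com/asdf.html\nhttps://example.com/a b.php\n'
```

Note that the space inside `a b.php` is kept: `strip()` only touches the two ends of the line.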

Solution 3:[3]

Read text from file, remove spaces, write text to file:

with open('file.txt', 'r') as f:
    txt = f.read().replace(' ', '')

with open('file.txt', 'w') as f:
    f.write(txt)

In @Leonardo Chirivì's solution it's unnecessary to create a list to store file contents when a string is sufficient and more memory efficient. The .replace(' ', '') operation is only called once on the string, which is more efficient than iterating through a list performing replace for each line individually.

To avoid opening the file twice:

with open('file.txt', 'r+') as f:
    txt = f.read().replace(' ', '')
    f.seek(0)
    f.write(txt)
    f.truncate()

This opens the file only once. Doing so requires moving the file pointer back to the start of the file after reading, and truncating whatever old content is left over after writing the (shorter) new text. A drawback of this solution, however, is that it is not as easy to read.
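
To see why the `truncate()` call matters, here is a sketch of the single-handle version wrapped in a function. The function name and the `truncate` switch are hypothetical and exist only for this demonstration:

```python
def rewrite_without_spaces(path, truncate=True):
    """Remove spaces from the file at `path` using a single 'r+' handle.

    `truncate=False` is only there to show what goes wrong without it.
    """
    with open(path, 'r+') as f:
        txt = f.read().replace(' ', '')
        f.seek(0)         # move back to the start before overwriting
        f.write(txt)      # the new text is shorter than the old text
        if truncate:
            f.truncate()  # cut off the stale content past the new end
```

For a file containing `a b c`, the call with truncation leaves `abc`, while the call with `truncate=False` leaves `abc c`, because the last two characters of the old content are never overwritten.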

Solution 4:[4]

I had something similar that I'd been dealing with.

This is what worked for me (Note: this converts runs of 2+ whitespace characters into a comma, but below the code block I explain how you can get rid of ALL whitespace):

import re

# read the file
with open('C:\\path\\to\\test_file.txt') as f:
    read_file = f.read()
    print(type(read_file)) # to confirm that it's a string

read_file = re.sub(r'\s{2,}', ',', read_file) # find/convert 2+ whitespace into ','

# write the file
with open('C:\\path\\to\\test_file.txt', 'w') as f:
    f.write(read_file)  # write the variable's contents, not the literal string 'read_file'

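This let me send the updated data to a CSV, which suited my need, but it can help you as well: instead of converting the match to a comma (','), convert it to an empty string (''). Be aware that \s also matches newlines, so trailing spaces followed by a line break are replaced together with the line break; use a pattern such as [ \t]{2,} if the line breaks must stay. If you don't want any spaces at all, read_file.replace(' ', '') does the job.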
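
Here is a small sketch of the difference, using the two URLs from the question as sample data. The helper name `remove_horizontal_whitespace` is my own:

```python
import re


def remove_horizontal_whitespace(text):
    """Delete spaces and tabs but leave newline characters alone."""
    return re.sub(r'[ \t]+', '', text)


sample = ("https://example.com/asdf.html      \n"
          "https://example.com/johndoe.php   \n")

# Spaces and tabs only: each URL stays on its own line.
print(repr(remove_horizontal_whitespace(sample)))
# → 'https://example.com/asdf.html\nhttps://example.com/johndoe.php\n'

# \s{2,} swallows the newline that follows the spaces: the lines get merged.
print(repr(re.sub(r'\s{2,}', '', sample)))
# → 'https://example.com/asdf.htmlhttps://example.com/johndoe.php'
```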

Solution 5:[5]

Let's not forget to add the \n back, so that each entry stays on its own row.

The complete code would be (str_path, bl_right and bl_left are assumed to be defined beforehand):

with open(str_path, 'r') as file:
    str_lines = file.readlines()

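# remove spaces
if bl_right:
    # trailing whitespace only; rstrip() also drops the newline, so add it back
    str_lines = [line.rstrip() + '\n' for line in str_lines]
elif bl_left:
    # leading whitespace only; drop the newline first so it is not doubled
    str_lines = [line.rstrip('\n').lstrip() + '\n' for line in str_lines]
else:
    # both sides
    str_lines = [line.strip() + '\n' for line in str_lines]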

# Write the file out again
with open(str_path, 'w') as file:
    file.writelines(str_lines)
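
As an actual function, this could look roughly like the sketch below. The name `strip_lines_in_file` and the `side` parameter (replacing the two boolean flags) are my own choices, not part of the code above:

```python
def strip_lines_in_file(path, side='both'):
    """Strip whitespace from each line of the file at `path`, in place.

    side: 'left', 'right' or 'both'. Every line is written back with
    exactly one trailing newline. Returns the list of cleaned lines.
    """
    strippers = {'left': str.lstrip, 'right': str.rstrip, 'both': str.strip}
    strip = strippers[side]  # raises KeyError for an unknown side

    with open(path, 'r') as f:
        lines = f.readlines()

    # Remove the newline first so that 'left' does not end up doubling it.
    cleaned = [strip(line.rstrip('\n')) + '\n' for line in lines]

    with open(path, 'w') as f:
        f.writelines(cleaned)
    return cleaned
```

Calling `strip_lines_in_file('file.txt', side='right')` would cover the case from the question, where the spaces come after each link.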

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3
Solution 4 Mova
Solution 5 Laurent T