How to join all the lines together in a text file in Python?

I have a file and when I open it, it prints out some paragraphs. I need to join these paragraphs together with a space to form one big body of text.

For example:

for data in open('file.txt'):
    print data

has an output like this:

Hello my name is blah. What is your name?
Hello your name is blah. What is my name?

How can the output be like this?:

Hello my name is blah. What is your name? Hello your name is blah. What is my name?

I've tried replacing the newlines with a space like so:

for data in open('file.txt'):
    updatedData = data.replace('\n',' ')

but that only gets rid of the empty lines; it doesn't join the paragraphs.

I also tried joining like so:

for data in open('file.txt'):
    joinedData = " ".join(data)

but that separates each character with a space, while not getting rid of the paragraph format either.



Solution 1:[1]

You could use str.join:

with open('file.txt') as f:
    print " ".join(line.strip() for line in f)  

line.strip() removes all whitespace (spaces, tabs, newlines) from both ends of the line. Use line.rstrip("\n") instead to remove only the trailing "\n".
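
The difference between the two calls only shows up when a line carries other whitespace at its ends. A small sketch in Python 3 syntax, using a made-up sample line:

```python
# Hypothetical sample line with leading/trailing spaces and a newline.
line = "  Hello my name is blah.  \n"

# strip() removes whitespace (spaces, tabs, newlines) from both ends.
stripped = line.strip()

# rstrip("\n") removes only trailing newline characters.
newline_removed = line.rstrip("\n")

print(repr(stripped))
print(repr(newline_removed))
```

If indentation inside the file matters, rstrip("\n") is probably the safer choice.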

If file.txt contains:

Hello my name is blah. What is your name?
Hello your name is blah. What is my name?

Then the output would be:

Hello my name is blah. What is your name? Hello your name is blah. What is my name?

Solution 2:[2]

You are looping over individual lines and it is the print statement that is adding newlines. The following would work:

for data in open('file.txt'):
    print data.rstrip('\n'),

With the trailing comma, print doesn't add a newline, and the .rstrip() call removes just the trailing newline from the line.
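
The trailing comma is Python 2 syntax. In Python 3, where print is a function, a roughly equivalent sketch passes end=' ' instead. The helper name print_joined and the io.StringIO stand-in for file.txt are assumptions made for illustration:

```python
import io
import sys

def print_joined(lines, out=sys.stdout):
    """Print every line on one row, separated by spaces (hypothetical helper)."""
    for data in lines:
        # end=' ' replaces the newline that print() would otherwise append.
        print(data.rstrip('\n'), end=' ', file=out)
    # Finish the row with a single newline.
    print(file=out)

# io.StringIO stands in for open('file.txt') so the sketch runs on its own.
sample = io.StringIO(
    "Hello my name is blah. What is your name?\n"
    "Hello your name is blah. What is my name?\n"
)
print_joined(sample)
```

Note that this leaves a space after the last line as well, which the ' '.join() approach avoids.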

Alternatively, you need to pass all of the read and stripped lines to ' '.join(), not each line by itself. Strings in Python are sequences too, so the string for a single line is interpreted as separate characters when passed on its own to ' '.join().
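
To see why joining a single line spreads out its characters, compare the two calls side by side. This is a short sketch in Python 3 syntax with made-up sample strings:

```python
# One line as it might come from the file (sample text for illustration).
line = "Hello\n"

# Joining a single string treats every character as a separate item.
per_character = " ".join(line.rstrip("\n"))

# Joining a list of stripped lines puts one space between whole lines.
lines = ["Hello my name is blah.\n", "What is your name?\n"]
per_line = " ".join([text.rstrip("\n") for text in lines])

print(per_character)  # H e l l o
print(per_line)       # Hello my name is blah. What is your name?
```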

The following code uses two new tricks, a context manager and a list comprehension:

with open('file.txt') as inputfile:
    print ' '.join([line.rstrip('\n') for line in inputfile])

The with statement uses the file object as a context manager, meaning the file will be automatically closed when we are done with the block indented below the with statement. The [.. for .. in ..] syntax generates a list from the inputfile object where we turn each line into a version without a newline at the end.
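
Both behaviours can be checked directly. The sketch below uses Python 3 syntax and writes a temporary file in place of file.txt (an assumption so that it runs anywhere), then inspects the list and the file's closed attribute:

```python
import os
import tempfile

# Write a small sample file so the sketch does not depend on file.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello my name is blah.\nWhat is your name?\n")
    path = tmp.name

with open(path) as inputfile:
    # The list comprehension builds a list of lines without trailing newlines.
    stripped_lines = [line.rstrip("\n") for line in inputfile]
    joined = " ".join(stripped_lines)

# Leaving the with block closed the file automatically.
print(inputfile.closed)
print(joined)

# Clean up the temporary sample file.
os.remove(path)
```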

Solution 3:[3]

data = open('file.txt').read().replace('\n', ' ')
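
A variation on this read-everything approach, offered as a sketch rather than as the answer's own method: splitting the text with str.splitlines() and joining the pieces puts exactly one space between lines and leaves nothing extra at the end, even when the file ends with a newline. The helper name join_file_lines and the temporary file are assumptions for illustration (Python 3 syntax):

```python
import os
import tempfile

def join_file_lines(path):
    """Return the file's lines joined by single spaces (hypothetical helper)."""
    with open(path) as f:
        # splitlines() drops the line endings, so no stray newline survives.
        return " ".join(f.read().splitlines())

# Demonstration on a temporary file standing in for file.txt.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("Hello my name is blah. What is your name?\n"
              "Hello your name is blah. What is my name?\n")
    sample_path = tmp.name

data = join_file_lines(sample_path)
os.remove(sample_path)
print(data)
```

This reads the whole file into memory at once, which should be fine for small files but may not suit very large ones.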

Solution 4:[4]

If anyone's doing this in pandas, where you have all lines in a particular column, you can use the following:

import pandas as pd

# line is the name of the column in df (an existing DataFrame) that holds all the lines
df.line.str.cat(sep=' ')

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1
Solution 2
Solution 3 John La Rooy
Solution 4 Brndn