Why does [line in open("text.txt")] yield newlines?
Looking at solutions for reading a file in Python, the newline character has to be stripped off every time:
In [5]: [line for line in open("text.txt", "r")]
Out[5]: ['line1\n', 'line2']
The intuitive behavior (judging by the popularity of some questions (1, 2) about this) would be to just yield the stripped lines.
What is the rationale behind this?
Solution 1:[1]
Well, this is a line. A line is defined by ending with the character \n. If a sequence of characters did not end with a \n (or EOF), how could we know it was a line?
"hello world"
"hello world\n"
The first is not a line; if we print it twice we might get
hello worldhello world
While the second version will give us
hello world
hello world
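To make this argument concrete, here is a minimal sketch of what would be lost if iteration stripped the terminator. It assumes io.StringIO as a stand-in for open("text.txt") so the example runs without a file on disk; read_lines and strip_newlines are hypothetical helper names, not part of any library.

```python
import io

def read_lines(text):
    # io.StringIO stands in for open("text.txt"); both yield one line per iteration.
    return list(io.StringIO(text))

def strip_newlines(lines):
    # What the question calls the "intuitive" result.
    return [line.rstrip("\n") for line in lines]

without_final_newline = read_lines("line1\nline2")
with_final_newline = read_lines("line1\nline2\n")

# Kept terminators make the two inputs distinguishable...
assert without_final_newline != with_final_newline
# ...and joining the pieces restores the original text exactly.
assert "".join(without_final_newline) == "line1\nline2"

# After stripping, both inputs look identical, so information is lost.
assert strip_newlines(without_final_newline) == strip_newlines(with_final_newline)
```

Keeping the terminator means the lines can be written back out unchanged, and the caller can still strip them when the bare text is all that is needed.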
Solution 2:[2]
Migrating the asker's response/solution from the question to an answer:
Granted: 'intuitive' is subjective. 'Consistent', however, is less so. Apparently the 'line' concept in "line1\nline2".splitlines() is a different one than the one handled by iter(open("text.txt")):
>>> assert(open("text.txt").readlines() == \
...        open("text.txt").read().splitlines())
AssertionError
Pretty sure people do get caught by this.
So I was mistaken: maybe my intuition is just in line with the splitlines interpretation: the split stuff should not include the separators. Maybe the answer to my question is not technical, but more like "since PEP-xyz was approved by different people than PEP-qrs". Maybe I should post it to some python language forum.
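As a side note on the failing assertion, the two views can be reconciled for this input by asking splitlines to keep the separators. The sketch below assumes the file contains exactly "line1\nline2" and again uses io.StringIO in place of a real file; it is an illustration, not a claim about why the two APIs were designed differently.

```python
import io

text = "line1\nline2"  # assumed content of text.txt

file_lines = io.StringIO(text).readlines()
assert file_lines == ["line1\n", "line2"]

# By default, splitlines drops the separators, hence the AssertionError.
assert text.splitlines() == ["line1", "line2"]

# With keepends=True the result matches readlines for this input.
assert text.splitlines(keepends=True) == file_lines

# Caveat: str.splitlines recognizes more line boundaries than file iteration,
# for example the form feed character.
assert "a\x0cb".splitlines() == ["a", "b"]
assert io.StringIO("a\x0cb").readlines() == ["a\x0cb"]
```

So even with keepends=True the two are only interchangeable when the text uses ordinary newlines as its sole line boundary.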
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | beoliver |
| Solution 2 |
