Code Treats .txt File Differently When Saved
I have an input .txt file that looks something like this.
command1 param1
command2 param2
command3 param3
command4 param4
I am trying to collapse the extra whitespace in each line, so I wrote the code below.
string[] output = File.ReadAllText(InputFilePath).Split('\n').Select(s => Regex.Replace(s, @"\s+", " ")).ToArray();
File.WriteAllLines(OutputFilePath, output);
If I run the code on the file as is, it does not work.
However, if I manually open the input file, save it without changing anything, and then run the code again, it works fine.
I believe this is some sort of UTF-16/8 issue but I am not sure how to account for it. What can I do?
Solution 1:[1]
In this specific case, the input file contained "invisible control characters and unused code points", which is what the \p{C} Unicode category matches. Removing those characters with a regular expression resolved the issue.
string[] output = File.ReadAllLines(InputFilePath).Select(s => Regex.Replace(s, @"\p{C}+", "")).ToArray();
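The line above strips the invisible characters but no longer collapses whitespace, and because a tab is itself a control character (category Cc), \p{C} would delete it and fuse the tokens on either side. A minimal sketch that combines both steps might look like the following. The LineCleaner class, the Clean method, and the final Trim() are assumptions added for illustration and are not part of the original answer.

```csharp
using System.Text.RegularExpressions;

// Hypothetical helper, not from the original answer.
public static class LineCleaner
{
    public static string Clean(string line)
    {
        // 1. Turn every whitespace run (including tabs) into one space,
        //    so that step 2 cannot delete a tab and merge two tokens.
        string collapsed = Regex.Replace(line, @"\s+", " ");

        // 2. Remove the remaining control, format, and unassigned characters.
        string stripped = Regex.Replace(collapsed, @"\p{C}+", "");

        // 3. Step 2 can leave two spaces side by side where an invisible
        //    character sat between them, so collapse once more.
        //    Trim() is an added assumption: drop leading/trailing spaces.
        return Regex.Replace(stripped, @" {2,}", " ").Trim();
    }
}
```

Under those assumptions, the original pipeline would become File.WriteAllLines(OutputFilePath, File.ReadAllLines(InputFilePath).Select(LineCleaner.Clean)); which still needs using System.IO and using System.Linq at the call site.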
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Niuq Navig |