Java: How to remove all line breaks between double quotes

I have a large CSV file that I am parsing in Java. The problem is that some of the text fields, which are enclosed in double quotes, contain line breaks. I am trying to remove all line breaks inside the quoted sections but have not been successful so far.

For example, I have the following CSV:

"Test Line wo line break"; "Test Line 
with line break"
"Test Line2 wo line break"; "Test Line2 
with line break"

The result should be:

"Test Line wo line break"; "Test Line with line break"
"Test Line2 wo line break"; "Test Line2 with line break"

I have tried the following so far:

s.replaceAll("(\\w)*\r\n", "$1");

But this, unfortunately, replaces all line breaks, including the ones at the end of each line.

Then I added the double quotes to the regex:

s.replaceAll("\"(\\w)*\r\n\"", "$1");

But with this, unfortunately, nothing gets replaced at all.

Can you please help me find out what I am doing wrong here?

Thanks in advance



Solution 1:[1]

I wouldn't recommend parsing CSV yourself if you can avoid it. In general, parsing raw text often becomes a hassle because you need to deal with all sorts of exceptions. For instance, you quite easily reach the point where regular expressions are not enough and you need to be able to parse context-free grammars.

There are some options on libraries for parsing CSV here: CSV parsing in Java - working example..?
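
If a library is not an option, the context problem can be handled with a small character scan instead of a regex. The following is only a minimal sketch, not a full CSV parser. It assumes every double quote either opens or closes a quoted section (a doubled "" escape toggles the state twice, so it is harmless) and that the whole input fits in memory. The class name QuotedLineBreakRemover and the method name removeLineBreaksInQuotes are made up for illustration.

```java
public class QuotedLineBreakRemover {

    // Removes \r and \n characters that occur between a pair of double quotes.
    // Line breaks outside quoted sections (the record separators) are kept.
    static String removeLineBreaksInQuotes(String input) {
        StringBuilder out = new StringBuilder(input.length());
        boolean inQuotes = false;
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (c == '"') {
                // Entering or leaving a quoted section; a doubled "" toggles
                // twice and therefore leaves the state unchanged.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (inQuotes && (c == '\r' || c == '\n')) {
                // Line break inside a quoted section: drop it.
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The sample data from the question, with \r\n line endings.
        String csv = "\"Test Line wo line break\"; \"Test Line \r\nwith line break\"\r\n"
                   + "\"Test Line2 wo line break\"; \"Test Line2 \r\nwith line break\"\r\n";
        System.out.print(removeLineBreaksInQuotes(csv));
    }
}
```

The scan keeps track of whether the current position is inside a quoted section, which is the piece of context a single replaceAll pattern cannot easily express. For very large files, the same loop could be applied to a Reader instead of a String, carrying the inQuotes flag across chunks.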

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Rohde Fischer