How do I skip splitting when white space occurs?

I want to split a string using ";" as the delimiter and put the result into a list of strings. For example:

Input:

sentence;sentence;sentence

should produce:

[sentence, sentence, sentence]

The problem is that some strings look like this: "sentence; continuation;new sentence", and for those I'd like the result to be: [sentence; continuation, new sentence].

I'd like to skip splitting when there is whitespace after (or before) the semicolon.

Example string I'd like to split:

String sentence = "Ogłoszenie o zamówieniu;2022/BZP 00065216/01;\"Dostawa pojemników na odpady segregowane (900 sztuk o pojemności 240 l – kolor żółty; 30 sztuk o pojemności 1100 l – kolor żółty).\";Zakład Wodociągów i Usług Komunalnych EKOWOD Spółka z ograniczoną odpowiedzialnością";

I tried:

String[] splitted = sentence.split(";\\S");

But this cuts off the first character of each sentence, because the non-whitespace character matched by \S becomes part of the delimiter.



Solution 1:[1]

You can use a regex negative lookahead/lookbehind for this.

String testString = "hello;world; test1 ;test2";

String[] splitString = testString.split("(?<! );(?! )"); // Negative lookahead and lookbehind

for (String s : splitString) System.out.println(s);

Output:

hello
world; test1 ;test2

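Here, (?<! ) is a negative lookbehind and (?! ) is a negative lookahead. Together they say "only split on the semicolon if there is no space directly before it and no space directly after it". Lookarounds match without consuming characters, so nothing is cut off from the resulting strings.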
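A minimal sketch of the same idea that returns a list of strings, as the question asks for. It assumes that tabs and other whitespace should count as well, so it swaps the literal space for \s; the class name SemicolonSplit and the method splitOutsideWhitespace are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SemicolonSplit {
    // Assumption: any whitespace (not only a plain space) next to the
    // semicolon should prevent the split, hence \s instead of ' '.
    private static final Pattern DELIMITER = Pattern.compile("(?<!\\s);(?!\\s)");

    // Hypothetical helper: splits on ';' only when it is not adjacent to whitespace.
    static List<String> splitOutsideWhitespace(String input) {
        return Arrays.asList(DELIMITER.split(input));
    }

    public static void main(String[] args) {
        // Prints two lines: "sentence; continuation" and "new sentence".
        for (String part : splitOutsideWhitespace("sentence; continuation;new sentence")) {
            System.out.println(part);
        }
    }
}
```

Compiling the pattern once avoids recompiling the regex on every call, which String.split would otherwise do.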

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
