Extracting fields from back-to-back records in a string using Java

I haven't used regex or patterns in a while and I'm having trouble breaking down individual records within a multi-record stream. A single record has the following pattern:

[field1][field2][field3]field4

Each field is variable length text. The record pattern repeats consistently within a large string or text stream. If I can split the larger string/stream into an array of Strings, each element containing a record, I can use simple Java text processing methods to extract the fields. There isn't any specific delimiter between records within the string/stream except the record pattern. Essentially, I want to search each record for a specific substring, and if it exists, extract the 4 fields for follow-on processing.



Solution 1:[1]

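Try this. It collects the bracketed fields with a regex and then takes whatever follows the last closing bracket as field4: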

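import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public void findFields() {
    String str = "[field1][field2][field3]field4";
    // Reluctant match of the text between each pair of square brackets
    Pattern pattern = Pattern.compile("\\[(.*?)\\]");
    Matcher matcher = pattern.matcher(str);
    List<String> fields = new ArrayList<>();
    int tailStart = 0;
    while (matcher.find()) {
        fields.add(matcher.group(1));
        // Remember where the last bracketed field ends
        tailStart = matcher.end();
    }
    // Everything after the last closing bracket is field4.
    // (The original called replaceAll() and discarded the result, then relied
    // on appendTail() picking up the leftover matcher state.)
    fields.add(str.substring(tailStart));
    System.out.println(fields); // prints [field1, field2, field3, field4]
}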
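
The method above handles a single record. For the multi-record case described in the question, one possible sketch is to let a single pattern match a whole record and loop over the matches, keeping only the records that contain the substring of interest. The class name RecordExtractor, the method extract, and the sample data are made up for illustration, and the pattern assumes that no field contains a square bracket, so field4 simply runs up to the next '[' or the end of the input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordExtractor {
    // Three bracketed fields, then free text up to the next '[' or end of input.
    // Assumption: no field contains '[' or ']'.
    private static final Pattern RECORD = Pattern.compile(
            "\\[([^\\]]*)\\]\\[([^\\]]*)\\]\\[([^\\]]*)\\]([^\\[]*)");

    /** Returns the four fields of every record whose text contains needle. */
    public static List<String[]> extract(String stream, String needle) {
        List<String[]> result = new ArrayList<>();
        Matcher m = RECORD.matcher(stream);
        while (m.find()) {
            // m.group() is the full text of the current record
            if (m.group().contains(needle)) {
                result.add(new String[] {
                        m.group(1), m.group(2), m.group(3), m.group(4)});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical sample stream with three back-to-back records
        String stream = "[a1][b1][c1]first[a2][b2][c2]second[a3][b3][c3]third";
        for (String[] fields : extract(stream, "b2")) {
            System.out.println(String.join(",", fields)); // prints a2,b2,c2,second
        }
    }
}
```

If records are separated by line breaks, field4 will include the trailing newline, so calling trim() on it may be needed. For a true stream rather than an in-memory string, java.util.Scanner.findWithinHorizon accepts the same Pattern, though that variant is not shown here.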

Solution 2:[2]

If you can afford to use Scala (2.13 or later, where the s interpolator can be used as a pattern), you can extract the content of those 4 fields in a simple way:

val s"[$field1][$field2][$field3]$field4" = "[Adam][Bela][Cecil]doctor"
println(s"$field1,$field2,$field3,$field4")

The result printed by println will be: Adam,Bela,Cecil,doctor

Run it online here: https://scastie.scala-lang.org/kjA7J3iqRkmzSvrl1WM9xA

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution      Source
Solution 1    Syam S
Solution 2