Extracting fields from back-to-back records in a string using Java
I haven't used regex or patterns in a while and I'm having trouble breaking down individual records within a multi-record stream. A single record has the following pattern:
[field1][field2][field3]field4
Each field is variable length text. The record pattern repeats consistently within a large string or text stream. If I can split the larger string/stream into an array of Strings, each element containing a record, I can use simple Java text processing methods to extract the fields. There isn't any specific delimiter between records within the string/stream except the record pattern. Essentially, I want to search each record for a specific substring, and if it exists, extract the 4 fields for follow-on processing.
Solution 1:[1]
Try
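import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public void findFields() {
    String str = "[field1][field2][field3]field4";
    // Reluctant match of the text between each pair of square brackets
    Pattern pattern = Pattern.compile("\\[(.*?)\\]");
    Matcher matcher = pattern.matcher(str);
    List<String> fields = new ArrayList<String>();
    int tailStart = 0;
    while (matcher.find()) {
        fields.add(matcher.group(1));
        tailStart = matcher.end(); // remember where the last bracketed field ends
    }
    // Whatever follows the last closing bracket is the unbracketed fourth field
    fields.add(str.substring(tailStart));
    System.out.println(fields); // [field1, field2, field3, field4]
}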
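The snippet above handles a single record. For the back-to-back case described in the question, one possible approach is a single pattern that matches a whole record, applied repeatedly with find(). The following is only a sketch under two assumptions that the question does not confirm: no field contains '[' or ']', and field4 runs up to the next '['. The class name RecordExtractor, the method extract, and the sample data are hypothetical names made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RecordExtractor {
    // Three bracketed fields followed by free text that runs until the next '['.
    // Assumption: no field contains '[' or ']'.
    private static final Pattern RECORD = Pattern.compile(
            "\\[([^\\]]*)\\]\\[([^\\]]*)\\]\\[([^\\]]*)\\]([^\\[]*)");

    /** Returns the four fields of every record whose text contains needle. */
    public static List<String[]> extract(String stream, String needle) {
        List<String[]> records = new ArrayList<>();
        Matcher m = RECORD.matcher(stream);
        while (m.find()) {
            // m.group() is the full text of the current record
            if (m.group().contains(needle)) {
                records.add(new String[] {m.group(1), m.group(2), m.group(3), m.group(4)});
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical sample stream with three records and no delimiter between them
        String stream = "[a1][b1][c1]first[a2][b2][c2]second[a3][b3][c3]third";
        for (String[] fields : extract(stream, "b2")) {
            System.out.println(String.join(",", fields)); // a2,b2,c2,second
        }
    }
}
```

The negated character classes ([^\]]* and [^\[]*) are used here instead of .*? so that a match cannot run across a record boundary. If field4 may itself contain '[', the pattern would need a different terminator.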
Solution 2:[2]
If you can afford to use Scala (2.13 or later, where the s interpolator can be used as a pattern), you can extract the content of those 4 fields in a simple way:
val s"[$field1][$field2][$field3]$field4" = "[Adam][Bela][Cecil]doctor"
println(s"$field1,$field2,$field3,$field4")
The result printed by println will be: Adam,Bela,Cecil,doctor
Run it online here: https://scastie.scala-lang.org/kjA7J3iqRkmzSvrl1WM9xA
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Syam S |
| Solution 2 |
