Regular Expression for parsing ASCII data
Right now I have a couple separate regular expressions to filter data from a string but I'm curious if there's a way to do it all in one go.
Sample Data: (DATA$0$34.0002,5.3114$34.0002,5.2925$34.0004,5.3214$34.0007,2.2527$34.0002,44.3604$34.0002,43.689$34.0004,38.3179$34.0007,8.1299)
- Need to verify there's an open and close parentheses ( )
- Need to verify there's a "DATA$0" after the open parenthesis
- Need to split the results by $
- Need to split that subset by comma
- Need to capture only the last item of that subset (i.e. 5.3114, 5.2925, 5.3214, etc.)
My first check is for the parentheses, using \(([^)]+)\) as my regex with the RightToLeft and ExplicitCapture options (some lines can have multiple data sets).
Next I filter for DATA$0 using (?:\(DATA\$0)
Finally I do my splits and take the last value in each array to get what I need, but I'm trying to figure out if there's a better way.
string DataPattern = @"(?:\(DATA\$0)";
string ParenthesisPattern = @"\(([^)]+)\)";
RegexOptions options = RegexOptions.RightToLeft | RegexOptions.ExplicitCapture;
// The using block disposes the reader (and closes the file) when done.
using (StreamReader sr = new StreamReader(FilePath))
{
    while (!sr.EndOfStream)
    {
        string line = sr.ReadLine();
        Console.WriteLine(line);
        // Step 1: find a parenthesized group, searching from the right.
        Match parentMatch = Regex.Match(line, ParenthesisPattern, options);
        if (parentMatch.Success)
        {
            string value = parentMatch.Value;
            // Step 2: the group must contain "(DATA$0".
            Match dataMatch = Regex.Match(value, DataPattern);
            if (dataMatch.Success)
            {
                // Step 3: strip the wrapper, then split on $ and on comma.
                string output = value.Replace("(DATA$0", "").Replace(")", "");
                string[] splitOutput = output.Split('$');
                foreach (string x in splitOutput)
                {
                    if (!string.IsNullOrEmpty(x))
                    {
                        string[] splitComma = x.Split(',');
                        // Keep only the last item of each subset.
                        Console.WriteLine("Value: " + splitComma[splitComma.Length - 1]);
                    }
                }
            }
            else
                Console.WriteLine("NO DATA");
        }
        else
            Console.WriteLine("NO PARENTHESIS");
        Console.ReadLine(); // pause after each line
    }
}
TIA
Solution 1:[1]
You can use
var results = Regex.Matches(text, @"(?<=\(DATA\$0[^()]*,)[^(),$]+(?=(?:\$[^()]*)?\))")
.Cast<Match>()
.Select(x => x.Value)
.ToList();
See the regex demo. Details:
- (?<=\(DATA\$0[^()]*,) - a positive lookbehind that matches a location that is immediately preceded with (DATA$0, zero or more chars other than ( and ) (as many as possible) and a comma
- [^(),$]+ - one or more chars other than (, ), $ and a comma
- (?=(?:\$[^()]*)?\)) - a positive lookahead: the current location must be immediately followed with an optional occurrence of a $ char and then zero or more chars other than ( and ), and then a ) char.
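To make the lookbehind version easier to try out, here is a minimal, self-contained sketch that runs the pattern above against the sample data from the question. The class name LookbehindDemo and the helper ExtractValues are made up for illustration, and converting the matches to double is an assumption about how the values will be used.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Pattern copied from the answer above.
    private static readonly Regex LastItems = new Regex(
        @"(?<=\(DATA\$0[^()]*,)[^(),$]+(?=(?:\$[^()]*)?\))");

    // Hypothetical helper: returns the last comma-separated item of every
    // $-delimited subset found inside a (DATA$0...) group.
    public static List<string> ExtractValues(string text)
    {
        return LastItems.Matches(text)
            .Cast<Match>()
            .Select(m => m.Value)
            .ToList();
    }

    public static void Main()
    {
        // Sample data from the question.
        string sample = "(DATA$0$34.0002,5.3114$34.0002,5.2925$34.0004,5.3214"
            + "$34.0007,2.2527$34.0002,44.3604$34.0002,43.689$34.0004,38.3179"
            + "$34.0007,8.1299)";

        foreach (string value in ExtractValues(sample))
        {
            // InvariantCulture so that "." is always read as the decimal separator.
            double number = double.Parse(value, CultureInfo.InvariantCulture);
            Console.WriteLine("Value: " + number.ToString(CultureInfo.InvariantCulture));
        }
    }
}
```

Because the lookbehind has to find (DATA$0 without crossing a parenthesis, values inside other parenthesized groups on the same line are skipped.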
An alternative:
var results = Regex.Matches(text, @"(?:\G(?!^)|\(DATA\$0)[^()]*?,([^(),$]+)(?=(?:\$[^()]*)?\))")
.Cast<Match>()
.Select(x => x.Groups[1].Value)
.ToList();
See the regex demo. Details:
- (?:\G(?!^)|\(DATA\$0) - either the end of the previous successful match, or a (DATA$0 string
- [^()]*? - zero or more chars other than ( and ), as few as possible
- , - a comma
- ([^(),$]+) - Group 1: one or more chars other than (, ), $ and a comma
- (?=(?:\$[^()]*)?\)) - a positive lookahead matching the location that is immediately followed with an optional occurrence of a $ char followed with zero or more chars other than ( and ), and then a ) char.
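Since the question mentions that some lines can hold multiple data sets, the following sketch applies the \G-based pattern line by line. The class name ContinuationDemo, the helper ExtractValues and the input lines are all made up for illustration; an in-memory array stands in for the file so the example runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ContinuationDemo
{
    // Pattern copied from the alternative above; the value is in Group 1.
    private static readonly Regex LastItems = new Regex(
        @"(?:\G(?!^)|\(DATA\$0)[^()]*?,([^(),$]+)(?=(?:\$[^()]*)?\))");

    // Hypothetical helper: collects Group 1 of every match in the line.
    public static List<string> ExtractValues(string line)
    {
        return LastItems.Matches(line)
            .Cast<Match>()
            .Select(m => m.Groups[1].Value)
            .ToList();
    }

    public static void Main()
    {
        // Made-up lines standing in for the file contents.
        string[] lines =
        {
            "(INFO$1$1.0,2.0)(DATA$0$34.0002,5.3114$34.0004,5.3214)",
            "(INFO$1$1.0,2.0)",
            "no parentheses here"
        };

        foreach (string line in lines)
        {
            List<string> values = ExtractValues(line);
            if (values.Count == 0)
            {
                Console.WriteLine("NO DATA");
                continue;
            }
            foreach (string value in values)
                Console.WriteLine("Value: " + value);
        }
    }
}
```

One trade-off of a single pattern: a line without parentheses and a line without a (DATA$0 group both simply produce no matches, so the separate "NO PARENTHESIS" message from the original loop would need its own check.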
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 |