Parsing table-like string into JavaScript object
This string is structured in a human-readable table-like way. It contains three columns. However, the only information I need is a list of all of the values from the first column.
app115                                115.115                 winget
app225                                115.115Chrome           winget
Knotes                                1MHz.Knotes             winget
BPMN-RPA Studio                       1ic.BPMN-RPAstudio      winget
Fishing Funds                         1zilc.FishingFunds      winget
3601                                  360.360Chrome           winget
3602                                  360.360Chrome.X         winget
3603                                  360.360CleanMaster      winget
3604                                  360.360se               winget
3CX Call Flow Designer (.exe edition) 3CX.3CXCallFlowDesigner winget
Using JavaScript, how would I parse this string to get a result like this:
['app115', 'app225', 'Knotes', 'BPMN-RPA Studio', 'Fishing Funds', '3601', '3602', '3603', '3604', '3CX Call Flow Designer (.exe edition)']
Here are some of my ideas that I couldn't get to work:
Step 1: since the last two columns are not necessary, we can start by replacing 'winget' with an empty string: string1.replaceAll("winget", ""). This removes the whole right-hand column, because all values in that column are 'winget'.
Step 2: remove every run of characters that is surrounded by two or more spaces on each side. This should get rid of the whole second column, because each value has at least two spaces on each side. However, this will not work: if the value in the first column is too long, the value in the second column may have only one space before it. Check the last row of the original string.
Last step: once the string looks something like "app115 app225 Knotes BPMN-RPA Studio Fishing Funds...", turn it into an array using string.split(" ").
Hope my question makes sense.
Thanks for any help
Solution 1:[1]
This regex extracts the first column, which has a fixed width (38 characters here).
It is a template, though, and can be adapted to extract any column.
It also trims leading and trailing whitespace.
(?<=^\s*(?!\s)).{1,38}(?<!\s)(?<=^.{1,38})|^(?=\s{38})
This is a single operation. The template is valid only in regex engines that support variable-length lookbehind, such as JS and C#.
The regex is no more complex than putting together a password regex.
(?<= # Alignment using a look behind assertion
^ \s* # Beginning of line, optional ws
(?! \s ) # Not a ws forward
)
.{1,38} # 1-38 characters width column
(?<! \s ) # Look behind assertion for trailing ws trim
(?<= ^ .{1,38} ) # Look behind assertion to fix overall length to 38
|
^ # Or the entire column is WS
(?= \s{38} ) # Check with look ahead assertion
const column1 = table.match( /(?<=^\s*(?!\s)).{1,38}(?<!\s)(?<=^.{1,38})|^(?=\s{38})/gm )
console.log(column1)
<script>
const table = `app115 115.115 winget
app225 115.115Chrome winget
Knotes 1MHz.Knotes winget
BPMN-RPA Studio 1ic.BPMN-RPAstudio winget
Fishing Funds 1zilc.FishingFunds winget
3601 360.360Chrome winget
1zilc.FishingFunds winget
3602 360.360Chrome.X winget
3603 360.360CleanMaster winget
3604 360.360se winget
3CX Call Flow Designer (.exe edition) 3CX.3CXCallFlowDesigner winget
Call Flow Designer (.exe edition) 3CX.3CXCallFlowDesigner winget
Flow Designer (.exe edition) 3CX.3CXCallFlowDesigner winget`
</script>
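The regex depends on exact column alignment, which is easy to lose when the table is copied around. As a minimal, self-contained sketch, the snippet below rebuilds a small aligned table with padEnd and runs the same regex over it. The sample rows and the widths (38 for the first column, 30 for the second) are assumptions made for this illustration only.

```javascript
// Hypothetical sample rows: [name, id, source]. The empty name exercises the
// "entire column is whitespace" branch of the regex.
const rows = [
  ["app115", "115.115", "winget"],
  ["Knotes", "1MHz.Knotes", "winget"],
  ["", "1zilc.FishingFunds", "winget"],
  ["3CX Call Flow Designer (.exe edition)", "3CX.3CXCallFlowDesigner", "winget"],
];

// Assumed layout: first column padded to 38 characters, second to 30.
const table = rows
  .map(([name, id, source]) => name.padEnd(38) + id.padEnd(30) + source)
  .join("\n");

// Same regex as above: one match per line, trimmed to the first column.
const column1 = table.match(
  /(?<=^\s*(?!\s)).{1,38}(?<!\s)(?<=^.{1,38})|^(?=\s{38})/gm
);

console.log(column1);
```

With this layout, the row whose first column is blank should come back as an empty string, and the 37-character name in the last row is still picked up even though only one space separates it from the second column.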
To generalize the above to get any column, just the column offset (in characters)
and the column width are needed. These can be plugged into this regex template:
(?:(?<=^.{N}\s*(?!\s)).{1,W}(?<!\s)(?<=^.{N}.{1,W})|(?<=^.{N})(?=\s{W}))
Where N is the Offset up to the column. W is the Width of the column.
In the linked example, N = 10, W = 38.
(?:
(?<= # Alignment using a look behind assertion
^ .{N} \s* # Offset to column and leading ws trim
(?! \s ) # Not a ws forward
)
.{1,W} # 1 - width column characters
(?<! \s ) # Not a ws behind for trailing ws trim
(?<= # Behind col offset and 1 - width, to fix overall length
^ .{N} .{1,W}
)
| # Or the entire column is WS
(?<= ^ .{N} ) # Alignment behind offset to column
(?= \s{W} ) # Ahead ensure entire column is ws
)
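One way to use the template is to fill in N and W programmatically. The sketch below does that with a small helper; the name columnRegex, the sample rows and the column widths (38, 30 and 6) are assumptions for illustration, not part of the original example.

```javascript
// Hypothetical helper: fills offset n and width w into the template above.
function columnRegex(n, w) {
  const source =
    `(?:(?<=^.{${n}}\\s*(?!\\s)).{1,${w}}(?<!\\s)(?<=^.{${n}}.{1,${w}})` +
    `|(?<=^.{${n}})(?=\\s{${w}}))`;
  return new RegExp(source, "gm");
}

// Assumed layout for this sample only: columns of width 38, 30 and 6.
const sample = [
  ["app115", "115.115", "winget"],
  ["3CX Call Flow Designer (.exe edition)", "3CX.3CXCallFlowDesigner", "winget"],
]
  .map(([name, id, source]) => name.padEnd(38) + id.padEnd(30) + source)
  .join("\n");

const names = sample.match(columnRegex(0, 38)); // first column: offset 0
const ids = sample.match(columnRegex(38, 30));  // second column: offset 38
console.log(names, ids);
```

The offsets and widths have to match the table exactly, so for real data they would need to be measured from the header or from the longest value in each column.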
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
