Regex in C++ for matching some patterns
I want a regex for this:
- add x2, x1, x0 is a valid instruction;
I want to implement this, but I am a bit confused about how to do it, as I am a newbie with regex. Can anyone share such a regex?
Solution 1:[1]
If this is a longer project and will have more requirements later, then definitely a different approach would be better.
The standard approach to solving such a problem is to define a grammar and then create a lexer and a parser. The tools lex/yacc or flex/bison can be used for that. Alternatively, a simple shift/reduce parser can be handcrafted.
The language that you sketched with the given grammar can indeed be specified with a Chomsky type-3 grammar, that is, a regular grammar. It can therefore be parsed with regular expressions.
The specification is a little unclear as to what a register is and whether there are more keywords. Especially ecall is unclear.
But how to build such a regex?
You will define small tokens and concatenate them. And different paths can be implemented with the or operator |.
Let's give some examples.
- A register may be matched with `a\d+`, that is, an "a" followed by some digits. If it is not only "a" but other letters as well, you could use `[a-z]\d+`.
- Op codes with the same number of parameters can be listed with a simple or `|`, as in `add|sub`.
- For spaces there are many solutions. You may use `\s+` or `[ ]+` or whatever spaces you need.
- To build one rule, you can concatenate what you learned so far.
- Having different parts needs an or `|` for the complete path.
- If you want to get back the matched groups, you must enclose the needed parts in round brackets `( )`.
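As a rough sketch of the rules above, the small tokens can be kept as separate strings in C++ and concatenated into one pattern. The function name `build_instruction_pattern` and the variable names are hypothetical, chosen only for illustration, and the token set (registers written as "a" plus digits, the op codes add, sub and ecall) is the assumption made in this answer, not a confirmed specification.

```cpp
#include <string>

// Hypothetical helper: assembles one pattern from the small tokens.
std::string build_instruction_pattern() {
    const std::string reg   = R"(a\d+)";        // register: "a" followed by digits
    const std::string op    = "add|sub";        // op codes that take three registers
    const std::string ws    = "[ ]+";           // at least one space
    const std::string ows   = "[ ]*";           // optional spaces
    const std::string comma = ows + "," + ows;  // comma with optional spaces around it

    // One rule: concatenate the tokens, each captured part in round brackets.
    const std::string three_regs = "(" + op + ")" + ws +
                                   "(" + reg + ")" + comma +
                                   "(" + reg + ")" + comma +
                                   "(" + reg + ")";
    const std::string no_operand = "(ecall)";

    // Different rules are joined with "|" and anchored to the whole line.
    return "^" + ows + "(" + three_regs + "|" + no_operand + ")" + ows + "$";
}
```

Keeping the tokens separate makes it easier to change one of them later, for example swapping the register token for `[a-z]\d+`.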
And with that, one of many many possible solutions can be:
^[ ]*((add|sub)[ ]+(a\d+)[ ]*,[ ]*(a\d+)[ ]*,[ ]*(a\d+)|(ecall))[ ]*$
See example in: regex101
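For use from C++, a minimal sketch with `std::regex` could look like the following. The `Instruction` struct and the `parse_instruction` function are hypothetical names introduced for this example; only `std::regex`, `std::smatch` and `std::regex_match` come from the standard `<regex>` header. Note that the pattern as written accepts registers such as a0, so the question's `add x2, x1, x0` would be rejected unless the register token is changed, for example to `[a-z]\d+`.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical result type, for illustration only.
struct Instruction {
    std::string opcode;                  // "add", "sub" or "ecall"
    std::vector<std::string> registers;  // stays empty for "ecall"
};

// Returns true and fills `out` if `line` is a valid instruction.
bool parse_instruction(const std::string& line, Instruction& out) {
    // Capture groups: 1 = whole instruction, 2 = add/sub,
    // 3..5 = the three registers, 6 = ecall.
    static const std::regex pattern(
        R"(^[ ]*((add|sub)[ ]+(a\d+)[ ]*,[ ]*(a\d+)[ ]*,[ ]*(a\d+)|(ecall))[ ]*$)");

    std::smatch m;
    if (!std::regex_match(line, m, pattern)) {
        return false;  // the line does not match any rule
    }

    out.registers.clear();
    if (m[6].matched) {
        // The "ecall" alternative matched; it has no operands.
        out.opcode = m[6].str();
    } else {
        // The "add|sub" alternative matched with three registers.
        out.opcode = m[2].str();
        out.registers = {m[3].str(), m[4].str(), m[5].str()};
    }
    return true;
}
```

Calling `parse_instruction("add a2, a1, a0", ins)` would fill `ins.opcode` with "add" and `ins.registers` with the three register names, while a line with a missing operand or an unknown op code makes the function return false.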
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Armin Montigny |
