Can I do this regex in just one step?
I have the following text:
fd types:["a"]
s types:
["b","c"]
types: [
"one"
]
types: "two"
types: ["three", "four", "five","six"]
no: ["Don't","read","this"]
I would like to extract only the values of types: a to c and one to six. I don't want to extract other properties like units. So far I can do this in two steps. First I apply
(?:types:\s*\[\s*)(.+?)(?:\s*])|(?:types:\s*)(\".+?\")
This way I get the groups:
"a"
"b","c"
"one"
"two"
"three", "four", "five","six"
Then I apply
\s*"((?:[^",\s]*)*)"\s*
to each of the groups and get
a
b
c
one
two
three
four
five
six
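For reference, the two-step approach described above could be sketched in Python with the standard re module roughly like this. The patterns are the ones from the question with the redundant non-capturing groups dropped, and the second pattern is simplified to an equivalent form; the variable names are only illustrative.

```python
import re

text = (
    'fd types:["a"] \ns types: \n ["b","c"]\ntypes: [\n"one"\n]\n'
    'types: "two"\ntypes: ["three", "four", "five","six"]\n'
    'no: ["Don\'t","read","this"]'
)

# Step 1: grab the content of each types: entry
# (group 1 for the bracketed form, group 2 for a bare quoted string).
step1 = re.compile(r'types:\s*\[\s*(.+?)\s*]|types:\s*(".+?")')
# Step 2: pull every quoted value out of one captured group.
step2 = re.compile(r'"([^",\s]*)"')

values = []
for m in step1.finditer(text):
    group = m.group(1) or m.group(2)
    values.extend(step2.findall(group))

print(values)
# ['a', 'b', 'c', 'one', 'two', 'three', 'four', 'five', 'six']
```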
I wonder if this can be done in just one step.
Solution 1:[1]
You can install the PyPI regex module (pip install regex) and use
(?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*)"([^"]*)"
See the regex demo. Details:
- (?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*) - either of the two patterns:
  - \G(?!^)\s*,\s* - end of the previous match and then a comma enclosed with zero or more whitespaces
  - | - or
  - types:(?:\s*\[)?\s* - types:, an optional sequence of zero or more whitespaces and a [ char, and then zero or more whitespaces
- "([^"]*)" - a " char, then zero or more chars other than " captured into Group 1, and then a " char.
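To see why the \G(?!^) part matters, here is a small sketch with the standard re module, which has no \G. It uses the same pattern with \G(?!^) simply removed, on a shortened sample of my own, so the comma branch is free to match anywhere. The (?!^) lookahead is there because \G also matches at the very start of the string.

```python
import re

# Shortened, hypothetical sample: one wanted list and one unwanted list.
sample = 'types: ["a","b"]\nno: ["x","y"]'

# The answer's pattern with \G(?!^) dropped: the comma branch is no longer
# tied to the end of the previous match.
unanchored = re.compile(r'(?:\s*,\s*|types:(?:\s*\[)?\s*)"([^"]*)"')

leaky = unanchored.findall(sample)
print(leaky)
# ['a', 'b', 'y']
```

The value y leaks in from the no: list because a comma followed by a quoted string matches on its own. With \G, the comma branch can only continue a match chain that started at types:.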
See the Python demo:
import regex
text = 'fd types:["a"] \ns types: \n ["b","c"]\ntypes: [\n"one"\n]\ntypes: "two"\ntypes: ["three", "four", "five","six"]\nno: ["Don\'t","read","this"]'
print(regex.findall(r'(?:\G(?!^)\s*,\s*|types:(?:\s*\[)?\s*)"([^"]*)"', text))
Output:
['a', 'b', 'c', 'one', 'two', 'three', 'four', 'five', 'six']
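If installing the regex module is not an option, the effect of \G can be approximated with the standard re module and a manual loop. This is only a sketch of that idea and not a single-pattern solution: it relies on Pattern.match(string, pos), which only matches starting exactly at pos. The names head, tail and extract_types are made up for this example.

```python
import re

# Start of a list: "types:", an optional "[", then the first quoted value.
head = re.compile(r'types:(?:\s*\[)?\s*"([^"]*)"')
# Continuation: a comma and the next quoted value.
tail = re.compile(r'\s*,\s*"([^"]*)"')


def extract_types(text):
    values = []
    pos = 0
    while True:
        m = head.search(text, pos)
        if m is None:
            break
        values.append(m.group(1))
        pos = m.end()
        # match() only succeeds right at pos, i.e. where the previous
        # match ended, which is the role \G plays in the regex-module pattern.
        m = tail.match(text, pos)
        while m is not None:
            values.append(m.group(1))
            pos = m.end()
            m = tail.match(text, pos)
    return values


text = (
    'fd types:["a"] \ns types: \n ["b","c"]\ntypes: [\n"one"\n]\n'
    'types: "two"\ntypes: ["three", "four", "five","six"]\n'
    'no: ["Don\'t","read","this"]'
)
result = extract_types(text)
print(result)
# ['a', 'b', 'c', 'one', 'two', 'three', 'four', 'five', 'six']
```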
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | Wiktor Stribiżew |