Apply negative lookbehind to the entire group before it
I want to capture the model of a phone but not the storage in the title. So I don't want the regex to match xxxGB.
I expect to match:
"iphone 13" from "iphone 13 256gb - midnight"
"iphone 13 pro max" from "iphone 13 pro max 256gb - sierra blue"
"iphone 13 pro" from "iphone 13 pro 128gb - graphite"
"galaxy tab a8" from "galaxy tab a8 wifi 128gb - grey"
The regular expression I have is
r'[A-Za-z]+\s?[A-Za-z\+\.\d]*((\spro|\smax|\slight|\smini|\splus|\sultra|\s[A-Za-z]?\d+(?!gb)))*|$'
but the negative lookahead (?!gb) only applies to the last digit before "gb", not to the entire number after the space, so part of the storage size is still matched:
apple iphone 13 256gb - midnight
<re.Match object; span=(6, 18), match='iphone 13 25'>
<re.Match object; span=(32, 32), match=''>
apple iphone 13 pro 128gb - graphite
<re.Match object; span=(6, 22), match='iphone 13 pro 12'>
<re.Match object; span=(36, 36), match=''>
apple iphone 13 pro max 256gb - sierra blue
<re.Match object; span=(6, 26), match='iphone 13 pro max 25'>
<re.Match object; span=(43, 43), match=''>
samsung galaxy tab a8 wifi 128gb - grey
<re.Match object; span=(8, 21), match='galaxy tab a8'>
<re.Match object; span=(39, 39), match=''>
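The partial matches above come from backtracking, which can be reproduced in isolation. The following is a minimal sketch: the sample strings and the stricter `(?!\d|gb)` variant are assumptions for illustration, not part of the original pattern.

```python
import re

# \d+ first grabs "256", the lookahead (?!gb) fails, so the engine
# gives back one digit and retries: "25" is followed by "6", not "gb".
naive = re.compile(r'\d+(?!gb)')
print(naive.search('256gb').group())      # 25

# Assumed variant: also forbid a following digit, so the number is
# matched in full or not at all.
strict = re.compile(r'\d+(?!\d|gb)')
print(strict.search('256gb'))             # None
print(strict.search('13 256gb').group())  # 13
```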
The testing template can be found here: https://regex101.com/r/dn0Hyr/1
Many thanks!!
Solution 1:[1]
You may use this regex to match phone models:
^[A-Za-z]+(?: (?!wifi|\d*gb)[\dA-Za-z]+)*
RegEx Details:
- `^`: Start
- `[A-Za-z]+`: Match 1+ letters
- `(?: (?!wifi|\d*gb)[\dA-Za-z]+)*`: Delimited by a space, match 1+ letters or digits as long as the word is not `wifi` or digits followed by `gb`. Repeat this group 0 or more times
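As a quick check, the pattern can be run in Python against the four titles from the question. This is a minimal sketch; the names `MODEL_RE` and `extract_model` are hypothetical helpers introduced here, not part of the answer.

```python
import re

# Pattern from Solution 1, anchored at the start of the title.
MODEL_RE = re.compile(r'^[A-Za-z]+(?: (?!wifi|\d*gb)[\dA-Za-z]+)*')

def extract_model(title):
    """Return the leading model name, or None if the title does not start with letters."""
    m = MODEL_RE.match(title)
    return m.group(0) if m else None

titles = [
    "iphone 13 256gb - midnight",
    "iphone 13 pro max 256gb - sierra blue",
    "iphone 13 pro 128gb - graphite",
    "galaxy tab a8 wifi 128gb - grey",
]

for title in titles:
    print(extract_model(title))
```

Because the pattern is anchored with `^`, a title that starts with a brand name (such as "apple iphone 13 256gb") would include the brand in the match.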
Solution 2:[2]
An alternation between two positive lookaheads:
Figure I - RegEx A
/^.*(?=\swifi\s\d{3})|^.*(?=\s\d{3})/gm
Figure II - RegEx A
| Segment | Meaning |
|---|---|
| `^.*` | Starting with anything BUT a newline occurring zero or more times... |
| `(?=\swifi\s\d{3})` | ...is a match if it is before a space, literal "wifi", a space, and 3 digits... |
| `\|` | OR |
| `^.*` | ...starting with anything BUT a newline occurring zero or more times... |
| `(?=\s\d{3})` | ...is a match if it is before a space and 3 digits. |
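In Python, the `/gm` flags roughly correspond to `re.finditer` with `re.MULTILINE`. The sketch below assumes the four titles from the question are joined into one multi-line string; the variable names are illustrative only.

```python
import re

# RegEx A; re.MULTILINE lets ^ match at the start of every line.
REGEX_A = re.compile(r'^.*(?=\swifi\s\d{3})|^.*(?=\s\d{3})', re.MULTILINE)

text = "\n".join([
    "iphone 13 256gb - midnight",
    "iphone 13 pro max 256gb - sierra blue",
    "iphone 13 pro 128gb - graphite",
    "galaxy tab a8 wifi 128gb - grey",
])

# finditer plays the role of the "g" flag: one match per line.
matches = [m.group(0) for m in REGEX_A.finditer(text)]
print(matches)
```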
Or a shortened version without the alternation that matches 2 or 3 digits, as suggested in The fourth bird's comment. Note that, rather than an alternation, a non-capturing group (?:wifi\s)? is nested inside the lookahead; the ? quantifier makes "wifi" plus a space optional rather than required:
Figure III - RegEx B
/^.*?(?=\s(?:wifi\s)?\d{2,3}gb)/gm
Figure IV - RegEx B
| Segment | Meaning |
|---|---|
| `^.*?` | Starting with anything BUT a newline occurring zero or more times until... |
| `(?=\s(?:wifi\s)?...` | ...there's a space, then literal "wifi" and a space occurring once or not at all... |
| `...\d{2,3}gb)` | ...followed by 2 or 3 digits, and literal "gb" |
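A minimal Python sketch of RegEx B follows. The "iphone se 64gb - red" title is a made-up sample, added only to show the 2-digit case that RegEx A does not cover; `REGEX_A`, `REGEX_B`, and `model_b` are hypothetical names.

```python
import re

REGEX_A = re.compile(r'^.*(?=\swifi\s\d{3})|^.*(?=\s\d{3})', re.MULTILINE)
REGEX_B = re.compile(r'^.*?(?=\s(?:wifi\s)?\d{2,3}gb)', re.MULTILINE)

def model_b(title):
    """Return the text before the (optional 'wifi' +) storage size, or None."""
    m = REGEX_B.search(title)
    return m.group(0) if m else None

print(model_b("galaxy tab a8 wifi 128gb - grey"))
print(model_b("iphone 13 pro max 256gb - sierra blue"))

# Hypothetical 2-digit storage title: RegEx B matches, RegEx A does not.
sample = "iphone se 64gb - red"
print(model_b(sample))
print(REGEX_A.search(sample))
```

The lazy `.*?` stops at the first position where the lookahead succeeds, so "13" in "iphone 13 256gb" is kept because it is not followed by "gb".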
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | anubhava |
| Solution 2 |
