Regex to match two identical consecutive characters
Let's say I have two lines:
- oaupiimsplsvcsie
- apjhkutpoiegnxfx
I want to match any character [a-z] that is immediately followed by the same character.
The first line should match because of "ii", but the second line should not match.
Here is my example code that does not work:
fn helper_regex(text: &str) -> (bool, Vec<usize>) {
    lazy_static! {
        static ref RE: RegexSet = RegexSet::new(&[
            r"[a-z]{1,}",
        ]).unwrap();
    }
    let matches: Vec<_> = RE.matches(text).into_iter().collect();
}
Solution 1:[1]
As the comments point out, this cannot be done directly with the regex crate, because it does not support backreferences, so it takes a bit more manual work. Here is a recap of the options:
- Manually match pairs: aa|bb|cc|dd|ee|ff|gg|hh|ii|..|zz
- Use fancy-regex, which supports backreferences (although it may be slow in some cases)
- Not regex based, but a plain iteration may do:
fn match_consecutive(s: &str) -> bool {
    let i1 = s.chars();
    let mut i2 = s.chars();
    // Advance the second iterator by one so that zip pairs each
    // character with its successor.
    if i2.next().is_none() {
        return false;
    }
    for (c1, c2) in i1.zip(i2) {
        if c1 == c2 {
            return true;
        }
    }
    false
}
fn main() {
    let should_match = "oaupiimsplsvcsie";
    let should_not_match = "apjhkutpoiegnxfx";
    assert_eq!(match_consecutive(should_match), true);
    assert_eq!(match_consecutive(should_not_match), false);
}
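The function above accepts any repeated character, while the question asks specifically for [a-z]. As a rough sketch (the name consecutive_positions and the choice to report byte offsets are assumptions for illustration, not part of the original answer), the plain iteration could be narrowed to lowercase ASCII letters and made to return where each pair starts:

```rust
/// Returns the byte offset of every lowercase ASCII letter that is
/// immediately followed by the same letter.
/// (Hypothetical helper; name and return type are illustrative.)
fn consecutive_positions(s: &str) -> Vec<usize> {
    let mut positions = Vec::new();
    // Remember the previous character together with its byte offset.
    let mut prev: Option<(usize, char)> = None;
    for (idx, c) in s.char_indices() {
        if let Some((prev_idx, prev_c)) = prev {
            // Only count pairs made of the same lowercase ASCII letter.
            if prev_c == c && c.is_ascii_lowercase() {
                positions.push(prev_idx);
            }
        }
        prev = Some((idx, c));
    }
    positions
}
```

For the two sample lines, this should give a single offset, 4, for "oaupiimsplsvcsie" (where the "ii" starts) and an empty vector for "apjhkutpoiegnxfx".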
From @SvenMarnach: the same check can be done by iterating over the byte representation (ASCII only):
fn match_consecutive(s: &str) -> bool {
    s.as_bytes().windows(2).any(|x| x[0] == x[1])
}
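To illustrate why the byte version is marked "ASCII only": a single multi-byte character can contain two identical adjacent bytes. For example, U+3000 is encoded in UTF-8 as E3 80 80, so the byte-window check reports a match on one character. A possible sketch that sidesteps this by only accepting lowercase ASCII bytes (the function names here are made up for the comparison) might look like:

```rust
/// The unrestricted byte-window check, repeated here for comparison.
fn match_consecutive_bytes(s: &str) -> bool {
    s.as_bytes().windows(2).any(|w| w[0] == w[1])
}

/// Byte-window check that only accepts pairs of lowercase ASCII letters.
/// ASCII bytes never occur inside a multi-byte UTF-8 sequence, so this
/// cannot match in the middle of a non-ASCII character.
fn match_consecutive_ascii_lower(s: &str) -> bool {
    s.as_bytes()
        .windows(2)
        .any(|w| w[0] == w[1] && w[0].is_ascii_lowercase())
}
```

With these definitions, match_consecutive_bytes("\u{3000}") is true even though the string holds a single character, while match_consecutive_ascii_lower("\u{3000}") is false.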
The character-based check can also be written with Itertools::tuple_windows from the itertools crate:
use itertools::Itertools; // 0.10.3
fn match_consecutive(s: &str) -> bool {
    s.chars().tuple_windows().any(|(a, b)| a == b)
}
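For the first option in the recap (manually matching pairs), the alternation does not have to be typed out by hand. A minimal sketch, assuming only the standard library and using a hypothetical helper name, could build the pattern string like this:

```rust
/// Builds the alternation "aa|bb|cc|...|zz" for the letters a to z.
/// (Hypothetical helper, not part of the original answer.)
fn doubled_letter_pattern() -> String {
    ('a'..='z')
        // Turn each letter into its doubled form, e.g. 'i' -> "ii".
        .map(|c| format!("{0}{0}", c))
        .collect::<Vec<_>>()
        // Join the 26 alternatives with the regex alternation operator.
        .join("|")
}
```

The resulting string contains no backreferences, so it can be passed to Regex::new from the regex crate.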
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
