Regular expression not allowing a and c to be next to each other

I'm trying to write a regular expression that doesn't allow 'a' and 'c' to be next to each other in any combination of "abc". The combinations might be "a", "b", "c", "acb", "abac"; here "abac" must be ignored because it contains "a" and "c" next to each other. The regular expression I've written does only half the job correctly: it also ignores a, bcb, bcc and other strings that are not supposed to be ignored.

Here's the regular expression :

^(a?b)*c?$

Here's the output I'm getting :

[a, b, c, ba, ca, ab, cb, ac, bc, baa, caa, aba, cba, aca, bca, 
bab, cab, abb, cbb, acb, bcb, bac, cac, abc, cbc, acc, bcc]
b 
c 
ab 
bc 
bab 
abb 
abc 

Could someone please tell me what I'm doing wrong ?



Solution 1:[1]

Your expression ignores several cases:

  • anything with more than one c is ignored
  • anything with an a or b coming after a c is ignored (if there is a c, it has to be the last character)
  • anything containing an a is ignored unless that a is immediately followed by a b (each a must be followed by a b)
  • also, your grouping is probably not the form you need. You should use (?:X) for a non-capturing group.
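The points above can be checked directly. The following is a minimal sketch, not part of the original answer; the class name OriginalPatternCheck and the helper accepted are hypothetical names chosen for illustration. It runs the pattern from the question against a few of the strings mentioned there.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class OriginalPatternCheck {
    // The pattern from the question, unchanged.
    static final Pattern ORIGINAL = Pattern.compile("^(a?b)*c?$");

    // Hypothetical helper: returns the inputs that the pattern matches in full.
    static List<String> accepted(String... inputs) {
        List<String> result = new ArrayList<>();
        for (String s : inputs) {
            if (ORIGINAL.matcher(s).matches()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "a" fails because every a needs a b after it;
        // "bcb" and "bcc" fail because a c may only appear once, at the end.
        System.out.println(accepted("a", "b", "c", "ab", "bc", "bcb", "bcc", "abc"));
    }
}
```

With these inputs only b, c, ab, bc and abc are accepted, which is consistent with the output shown in the question.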

I would suggest a regex like

^(?:(?:a(?!c))?b?c?)+$

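This matches every a that is not followed by a c, as well as all bs and cs. Note that only "ac" is excluded this way; "ca" still matches. Also, because every part of the group is optional, the group itself can match nothing, so the + does not keep the empty string from matching.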

You can play with it and get detailed explanations at https://regex101.com/r/DoyUPG/1
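As a complement to the online tester, here is a small sketch for trying the suggested pattern locally. The class name SuggestedPatternCheck and the method accepts are hypothetical, and the sample inputs are my own choice, taken partly from the question.

```java
import java.util.regex.Pattern;

public class SuggestedPatternCheck {
    // The pattern suggested in this answer, unchanged.
    static final Pattern SUGGESTED = Pattern.compile("^(?:(?:a(?!c))?b?c?)+$");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean accepts(String s) {
        return SUGGESTED.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] inputs = { "a", "bcb", "bcc", "acb", "abac", "ca", "" };
        for (String s : inputs) {
            System.out.println("\"" + s + "\" -> " + accepts(s));
        }
    }
}
```

In this sketch "a", "bcb" and "bcc" are accepted and "acb" and "abac" are rejected. It is worth noting that "ca" and the empty string are accepted as well, so an extra check is needed if those should be excluded.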

Solution 2:[2]

Despite the data you provided, you said the combinations might be "a", "b", "c", "acb", "abac", which indicates the strings could be longer than just three letters. Rather than use a regex, I recommend String.contains.

String[] str = { "a", "b", "c", "babbacb", "ca", "ab", "cb",
        "aeseac", "bc", "baa", "caa", "aba", "cba", "aca",
        "bca", "bab", "cab", "abb", "cbbabcda", "acb",
        "bcbacbae", "bacadbac", "adecdcac", "abc", "cbc",
        "acc", "adbbcc", "abac" };
        
for (String s : str) {
    if (!(s.contains("ac") || s.contains("ca"))) {
        System.out.println(s);
    }
}

prints

a
b
c
ab
cb
bc
baa
aba
cba
bab
abb
cbbabcda
abc
cbc
adbbcc

But if you want to use a regex, simply test each string against a pattern that matches any string containing at least one ac or ca, and print the strings that do not match.

String regex = ".*((ac)|(ca)).*";
for (String s : str) {
    if (!s.matches(regex)) {
        System.out.println(s);
    }
}

prints the same as above.
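If the negation should live inside the pattern rather than in the if condition, a negative lookahead is one option. This is a sketch under my own assumptions, not something from the original answers: the class name NoAdjacentAC and the helper keep are hypothetical, the inputs are assumed to be single-line strings, and the pattern uses .+ so that empty strings are rejected.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class NoAdjacentAC {
    // Assumed pattern: the lookahead fails as soon as "ac" or "ca"
    // occurs anywhere in the string; .+ then requires at least one character.
    static final Pattern NO_AC_OR_CA = Pattern.compile("^(?!.*(?:ac|ca)).+$");

    // Hypothetical helper: returns the inputs without an adjacent a/c pair.
    static List<String> keep(String... inputs) {
        List<String> result = new ArrayList<>();
        for (String s : inputs) {
            if (NO_AC_OR_CA.matcher(s).matches()) {
                result.add(s);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(keep("a", "ca", "ab", "acb", "cba", "abac", "adbbcc", ""));
    }
}
```

For these inputs the sketch keeps a, ab, cba and adbbcc. Like the String.contains version, it does not restrict the alphabet to a, b and c.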

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 cyberbrain
Solution 2