How to convert (aa)^* U (bb)^* to regex in Python?

I am working on a textbook problem and trying to understand how regex works in Python. I tried:

"(aa)*|(bb)*"

This seems to accept any string, and I also want to accept only strings over the alphabet {a,b}. I am new to regex, so the documentation is a bit confusing at the moment, too. My current understanding is that a string should be accepted as long as its length is even. I appreciate any advice.



Solution 1:[1]

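Your regex uses *, which means 0 or more repetitions, so it can match the empty string. Without anchors, that empty match succeeds at the start of any input, which is why every string appears to be accepted.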
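
To make the point above concrete, here is a minimal sketch using Python's standard re module. The helper names prefix_matched and accepts_whole are made up for illustration and are not part of the question.

```python
import re

# The pattern from the question, with no anchors.
ORIGINAL = re.compile(r"(aa)*|(bb)*")

def prefix_matched(s):
    """Return the text re.match consumes at the start of s (may be empty)."""
    m = ORIGINAL.match(s)
    return m.group(0) if m else None

def accepts_whole(s):
    """True only if the entire string matches the pattern."""
    return ORIGINAL.fullmatch(s) is not None

for s in ["", "aa", "bbbb", "aabb", "abc", "xyz"]:
    # match() always succeeds here, often with a zero-length match;
    # fullmatch() succeeds only when the whole string fits.
    print(repr(s), repr(prefix_matched(s)), accepts_whole(s))
```

With re.match, every input yields a match object because zero repetitions of aa is always possible. re.fullmatch rejects inputs such as "aabb" and "abc", since the pattern then has to cover the whole string.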

Try this:

^(?:(?:aa)+|(?:bb)+)$

Explanation:

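  • ^ - Anchor the match at the start of the string
  • (?:(?:aa)+|(?:bb)+) - Match 1 or more occurrences of aa, or 1 or more occurrences of bb (the two are not mixed)
  • $ - Anchor the match at the end of the string, so the whole string must fit the pattern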
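
The pattern explained above could be tried out as follows. This is only a sketch: accepts and accepts_strict are hypothetical helper names, and the second one swaps in re.fullmatch as an alternative to the ^ and $ anchors.

```python
import re

# Pattern from this answer: anchored, one or more "aa" or one or more "bb".
SOLUTION_1 = re.compile(r"^(?:(?:aa)+|(?:bb)+)$")

def accepts(s):
    """Hypothetical helper: True if s matches the anchored pattern."""
    return SOLUTION_1.search(s) is not None

def accepts_strict(s):
    """Variant using fullmatch, which also rejects a trailing newline."""
    return re.fullmatch(r"(?:aa)+|(?:bb)+", s) is not None

for s in ["aa", "bbbb", "", "a", "aabb", "abab", "aac", "aa\n"]:
    print(repr(s), accepts(s), accepts_strict(s))
```

Because + needs at least one repetition, the empty string is rejected. If the empty string should be accepted, as in the textbook language (aa)^* U (bb)^*, one option is to keep * and rely on re.fullmatch, for example re.fullmatch(r"(?:aa)*|(?:bb)*", s). Note also that in Python $ can match just before a trailing newline, so "aa\n" passes accepts but not accepts_strict.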

Solution 2:[2]

You might use a capture group with an optionally repeated backreference to:

  • Match strings like aa, bb, aaaa, bbbb

  • Not match strings like a, b, aabb, or an empty string

Pattern

^(([ab])\2)\1*$

Explanation

  • ^ Start of string
  • ( Capture group 1
    • ([ab])\2 Capture group 2, match either a or b, followed by a backreference to group 2 (so the same character must appear twice)
  • ) Close group 1
  • \1* Repeat, zero or more times, a backreference to the text captured in group 1
  • $ End of string
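
A short sketch of how the backreference pattern above behaves in Python. The helper name accepts_backref is an assumption made for illustration.

```python
import re

# Pattern from this answer: a doubled character, then repeats of that same pair.
BACKREF = re.compile(r"^(([ab])\2)\1*$")

def accepts_backref(s):
    """Hypothetical helper: True if s matches the backreference pattern."""
    return BACKREF.search(s) is not None

for s in ["aa", "bb", "aaaa", "bbbb", "a", "b", "aabb", ""]:
    m = BACKREF.search(s)
    # group(1) is the captured pair ("aa" or "bb") when there is a match.
    print(repr(s), bool(m), m.group(1) if m else None)
```

Since \1 repeats the exact text captured by group 1, a string that starts with aa can only continue with more aa pairs, which is why aabb is rejected.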


Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Bharel
Solution 2   The fourth bird