Capturing whitespace separating a group and an optional group

I'm trying to implement a regex for the following task. A string contains a State name. At the end of the State name, optional parentheses may contain additional information. Examples of valid strings:

  • New York, US
  • California, United States of America (USA)
  • Massachusetts, United States of America(USA)

Between the State name and the first parenthesis, a space may be present. The regex should extract the State name, dropping the optional content, as well as the space separating the State name and the optional content. At the moment I am using the following regex:

(?P<country>[A-Za-z ,]+)(?: {0,1})(?=[(])?(?:[(]\w*[)])?

Unfortunately, due to the greediness of (?: {0,1})(?=[(])? the whitespace separating the State name and the optional content is never consumed by that part of the pattern and stays at the end of the country group, as shown in this regex101. The desired results would be "New York, US", "California, United States of America", and "Massachusetts, United States of America".

Any suggestion?
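
One possible direction, offered as a sketch rather than a definitive answer: the character class [A-Za-z ,] itself contains a space and its + quantifier is greedy, so it swallows the trailing space before the optional space group gets a chance to match. Making that quantifier lazy and forcing the whole string to match lets the separator fall outside the named group. The names STATE_RE and extract_state below are hypothetical, and the sketch assumes each input string holds exactly one State entry.

```python
import re

# Hypothetical pattern (an assumption, not the asker's original):
#   (?P<country>[A-Za-z ,]+?)  lazy, so it stops as early as the rest allows
#   \s*                        the optional separator, kept out of the group
#   (?:\(\w*\))?               the optional parenthesized suffix
STATE_RE = re.compile(r"(?P<country>[A-Za-z ,]+?)\s*(?:\(\w*\))?")


def extract_state(text):
    """Return the State name without the trailing space or suffix, or None."""
    # fullmatch anchors both ends, which forces the lazy group to grow
    # until only optional whitespace and the optional suffix remain.
    match = STATE_RE.fullmatch(text)
    return match.group("country") if match else None


samples = (
    "New York, US",
    "California, United States of America (USA)",
    "Massachusetts, United States of America(USA)",
)
for sample in samples:
    print(repr(extract_state(sample)))
```

Without the full-string anchoring, the lazy group would stop after a single character, so fullmatch (or explicit ^ and $ anchors) is essential to this approach.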



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
