Regular expression to split a string on commas outside parentheses, with more than one level of nesting, in Python

I have a string like this in Python:

import re
filter = "eq(Firstname,test),eq(Lastname,ltest),OR(eq(ContactID,12345),eq(ContactID,123456))"
rx_comma = re.compile(r"(?:[^,(]|\([^)]*\))+")
result = rx_comma.findall(filter)

Actual result is:

['eq(Firstname,test)', 'eq(Lastname,ltest)', 'OR(eq(ContactID,12345)', 'eq(ContactID,123456))']

Expected result is:

['eq(Firstname,test)', 'eq(Lastname,ltest)', 'OR(eq(ContactID,12345),eq(ContactID,123456))']

Any help is appreciated.



Solution 1:[1]

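Although the OP's issue was already solved with the regex module, I'd like to introduce pyparsing as an alternative solution. The snake_case names used below (delimited_list, nested_expr, and so on) require pyparsing 3.0 or later. It can be installed with the following command: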

pip install pyparsing

Code:

import pyparsing as pp
s = "eq(Firstname,test),eq(Lastname,ltest),OR(eq(ContactID,12345),eq(ContactID,123456))"
expr = pp.delimited_list(pp.original_text_for(pp.Regex(r'.*?(?=\()') + pp.nested_expr('(', ')')))
output = expr.parse_string(s).as_list()
assert output == ['eq(Firstname,test)', 'eq(Lastname,ltest)', 'OR(eq(ContactID,12345),eq(ContactID,123456))']

Explanation:

The key part is the definition of expr in the code above. Here it is again with explanatory comments:

pp.delimited_list(  # Split the input at the default delimiter, a comma
    pp.original_text_for(  # Return the matched source text instead of the parsed tokens
        pp.Regex(r'.*?(?=\()')  # Lazily match everything up to (not including) the next '('
        + pp.nested_expr('(', ')')  # Match a balanced, possibly nested, parenthesized group
    )
)
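
If an extra dependency is not an option, the same idea (a comma only counts as a delimiter when it sits outside every parenthesis) can be sketched with a plain depth counter from the standard library. This is not part of the pyparsing approach above; the function name split_top_level and its error handling are my own illustrative choices, and it assumes parentheses are the only nesting construct in the input.

```python
def split_top_level(text, delimiter=","):
    """Split text on delimiter, skipping delimiters nested inside parentheses."""
    parts = []
    depth = 0   # current parenthesis nesting level
    start = 0   # index where the current piece begins
    for index, char in enumerate(text):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unbalanced ')' at position {index}")
        elif char == delimiter and depth == 0:
            # Only a top-level delimiter ends the current piece.
            parts.append(text[start:index])
            start = index + 1
    if depth != 0:
        raise ValueError("unbalanced '(' in input")
    parts.append(text[start:])  # the final piece has no trailing delimiter
    return parts


s = "eq(Firstname,test),eq(Lastname,ltest),OR(eq(ContactID,12345),eq(ContactID,123456))"
print(split_top_level(s))
```

Unlike the pattern in the question, which only tolerates one level of parentheses, the counter works for any nesting depth and raises an error on unbalanced input instead of silently producing a wrong split.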

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 quasi-human