'Match groups in Python

Is there a way in Python to access match groups without explicitly creating a match object (or another way to beautify the example below)?

Here is an example to clarify my motivation for the question:

Following Perl code

if    ($statement =~ /I love (\w+)/) {
  print "He loves $1\n";
}
elsif ($statement =~ /Ich liebe (\w+)/) {
  print "Er liebt $1\n";
}
elsif ($statement =~ /Je t\'aime (\w+)/) {
  print "Il aime $1\n";
}

translated into Python

m = re.search("I love (\w+)", statement)
if m:
  print "He loves",m.group(1)
else:
  m = re.search("Ich liebe (\w+)", statement)
  if m:
    print "Er liebt",m.group(1)
  else:
    m = re.search("Je t'aime (\w+)", statement)
    if m:
      print "Il aime",m.group(1)

looks very awkward (if-else-cascade, match object creation).



Solution 1:[1]

Less efficient, but simpler-looking:

m0 = re.match("I love (\w+)", statement)
m1 = re.match("Ich liebe (\w+)", statement)
m2 = re.match("Je t'aime (\w+)", statement)
if m0:
  print("He loves", m0.group(1))
elif m1:
  print("Er liebt", m1.group(1))
elif m2:
  print("Il aime", m2.group(1))

The problem with the Perl stuff is the implicit updating of some hidden variable. That's simply hard to achieve in Python because you need to have an assignment statement to actually update any variables.

The version with less repetition (and better efficiency) is this:

pats = [
    ("I love (\w+)", "He Loves {0}" ),
    ("Ich liebe (\w+)", "Er Liebe {0}" ),
    ("Je t'aime (\w+)", "Il aime {0}")
 ]
for p1, p3 in pats:
    m = re.match(p1, statement)
    if m:
        print(p3.format(m.group(1)))
        break

A minor variation that some Perl folk prefer:

pats = {
    "I love (\w+)" : "He Loves {0}",
    "Ich liebe (\w+)" : "Er Liebe {0}",
    "Je t'aime (\w+)" : "Il aime {0}",
}
for p1 in pats:
    m = re.match(p1, statement)
    if m:
        print(pats[p1].format(m.group(1)))
        break

This is hardly worth mentioning except it does come up sometimes from Perl programmers.

Solution 2:[2]

Starting Python 3.8, and the introduction of assignment expressions (PEP 572) (:= operator), we can now capture the condition value re.search(pattern, statement) in a variable (let's all it match) in order to both check if it's not None and then re-use it within the body of the condition:

if match := re.search('I love (\w+)', statement):
  print(f'He loves {match.group(1)}')
elif match := re.search("Ich liebe (\w+)", statement):
  print(f'Er liebt {match.group(1)}')
elif match := re.search("Je t'aime (\w+)", statement):
  print(f'Il aime {match.group(1)}')

Solution 3:[3]

this is not a regex solution.

alist={"I love ":""He loves"","Je t'aime ":"Il aime","Ich liebe ":"Er liebt"}
for k in alist.keys():
    if k in statement:
       print alist[k],statement.split(k)[1:]

Solution 4:[4]

You could create a helper function:

def re_match_group(pattern, str, out_groups):
    del out_groups[:]
    result = re.match(pattern, str)
    if result:
        out_groups[:len(result.groups())] = result.groups()
    return result

And then use it like this:

groups = []
if re_match_group("I love (\w+)", statement, groups):
    print "He loves", groups[0]
elif re_match_group("Ich liebe (\w+)", statement, groups):
    print "Er liebt", groups[0]
elif re_match_group("Je t'aime (\w+)", statement, groups):
    print "Il aime", groups[0]

It's a little clunky, but it gets the job done.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Neuron
Solution 2
Solution 3 ghostdog74
Solution 4 Adam Rosenfield