How to find a pattern of upper-case words in a sentence

I have a text file like the following:

FAKE ET FAKE
1, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-ET-FAKE.fr

FAKE AGAIN
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-AGAIN.fr 

STILL FAKE AGAIN ANOTHER
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
STILL-FAKE-AGAIN-ANOTHER.fr 

With a regex, I want to extract the header of each paragraph. I know that each header consists of upper-case words separated by spaces, but the number of words (and therefore of spaces) differs from one header to the next.

I cannot manage to make it work whatever the number of words in the pattern "UPPER UPPER UPPER ...".

Here is what I have tried:

regex = r'[A-Z]+\s[A-Z]+'
re.findall(regex, text)

With this, only a two-word header such as "FAKE AGAIN" is found in full in my example.

I have also tried

regex = r'([A-Z]+\s[A-Z]+)+'

to express that the "UPPER\s" pattern can repeat, but it did not work.



Solution 1:[1]

With x.txt containing your text, the following worked fine for me:

egrep -e '^[A-Z][A-Z ]*$' x.txt
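
For readers working in Python, as in the question, a rough equivalent of this line-anchored approach might look like the sketch below. It assumes the file content has already been read into a string; the names text and headers are illustrative and not part of the original answer. The re.MULTILINE flag is what makes ^ and $ match at every line boundary instead of only at the ends of the whole string.

```python
import re

# Sample text copied from the question (assumed to be the content of x.txt).
text = """FAKE ET FAKE
1, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-ET-FAKE.fr

FAKE AGAIN
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-AGAIN.fr

STILL FAKE AGAIN ANOTHER
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
STILL-FAKE-AGAIN-ANOTHER.fr
"""

# re.MULTILINE makes ^ and $ match at the start and end of every line,
# mirroring how egrep tests each line on its own.
headers = re.findall(r'^[A-Z][A-Z ]*$', text, flags=re.MULTILINE)
print(headers)
# ['FAKE ET FAKE', 'FAKE AGAIN', 'STILL FAKE AGAIN ANOTHER']
```

Because the whole line must consist of upper-case letters and spaces, this form would also accept a one-word header, and it ignores upper-case words that appear in the middle of other lines.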

Solution 2:[2]

Use

regex = r'[A-Z]+(?:[^\S\n][A-Z]+)+'

See regex proof.

EXPLANATION

 - [A-Z]+ - matches one or more characters in the range between A (index 65) and Z (index 90) (case sensitive), as many times as possible, giving back as needed (greedy)
 - (?:[^\S\n][A-Z]+)+ - non-capturing group, repeated between one and unlimited times, as many times as possible, giving back as needed (greedy)
   - [^\S\n] - matches a single character that is not in the list below, i.e. any whitespace character except a newline
      - \S - matches any non-whitespace character (equivalent to [^\r\n\t\f\v ])
      - \n - matches a line-feed (newline) character (ASCII 10)
   - [A-Z]+ - matches one or more characters in the range between A (index 65) and Z (index 90) (case sensitive), as many times as possible, giving back as needed (greedy)
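
As a quick check of the explanation above, the sketch below runs this pattern with re.findall on the sample text from the question. The variable names are illustrative, and the second pattern (with a plain \s) is a hypothetical variant included only to show why the newline is excluded; it is not part of the original answer.

```python
import re

# Sample text copied from the question.
text = """FAKE ET FAKE
1, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-ET-FAKE.fr

FAKE AGAIN
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
FAKE-AGAIN.fr

STILL FAKE AGAIN ANOTHER
2, rue Fake - 99567 FAKE
Tél, :00 99 89 22 34 © [email protected]
STILL-FAKE-AGAIN-ANOTHER.fr
"""

# Pattern from Solution 2: upper-case words separated by a single
# whitespace character that is not a newline. The group is non-capturing,
# so findall returns the full matches.
strict = re.findall(r'[A-Z]+(?:[^\S\n][A-Z]+)+', text)
print(strict)
# ['FAKE ET FAKE', 'FAKE AGAIN', 'STILL FAKE AGAIN ANOTHER']

# Hypothetical variant with plain \s, shown only for contrast: \s also
# matches a newline, so a match can run from the end of one line ("FAKE")
# into the first upper-case letter of the next line ("T" of "Tél").
loose = re.findall(r'[A-Z]+(?:\s[A-Z]+)+', text)
print('FAKE\nT' in loose)  # True
```

Note that the pattern requires at least two words, so a header made of a single upper-case word would not be matched.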

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ronald
Solution 2 Ryszard Czech