Delete the last word, but keep it if it is the only one

I have a series of texts, each containing either one word or a combination of words. I need to delete the last word if the text has more than one word; if it has only one word, I need to keep it.

I have tried the following regex:

 df["first_middle_name"] = df["full_name"].replace("\s+\S+$", "")

from this solution: Removing last words in each row in pandas dataframe

It deletes certain words but keeps others.
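One possible cause, assuming the column was processed exactly as in the line above: `Series.replace` without `regex=True` compares whole cell values against the literal pattern text, whereas `Series.str.replace` with `regex=True` applies the pattern inside each string. The sketch below uses a small made-up Series (the values are taken from the examples in this question) to show the difference. It also shows that `\s+\S+$` leaves a trailing comma behind.

```python
import pandas as pd

# Illustrative sample; the real data lives in the asker's DataFrame.
s = pd.Series(["Zacapa", "San Luis, Jalapa", "Petén Petén"])

# Series.replace without regex=True looks for cells whose entire value equals
# the literal pattern text, so nothing matches and the data is unchanged.
unchanged = s.replace(r"\s+\S+$", "")

# Series.str.replace with regex=True applies the pattern within each string.
trimmed = s.str.replace(r"\s+\S+$", "", regex=True)

print(unchanged.tolist())  # ['Zacapa', 'San Luis, Jalapa', 'Petén Petén']
print(trimmed.tolist())    # ['Zacapa', 'San Luis,', 'Petén']
```

Whether this explains the behaviour described here depends on how the real data looks, for example on whether the strings carry trailing whitespace.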

Some examples of strings in my df['Municipio']:

Zacapa    
San Luis, **Jalapa**    
Antigua Guatemala **Sacatepéquez**    
Guatemala    
Mixco    
Sacapulas, **Jutiapa**    
Puerto Barrios, **Izabal**    
Petén **Petén**    
San Martin Jil, **Chimaltenango**

What I need is this: if a string contains only one word, keep that word; if it is a combination of two or more words separated by a comma or a space, delete the last word (shown in bold above).

Thank you!



Solution 1:[1]

You can apply a function that first checks whether the string contains a comma and, if it does not, checks whether it contains a space:

df['Municipio'] = df['Municipio'].apply(lambda x: ', '.join(x.split(',')[:-1]) if ',' in x
                                        else (' '.join(x.split(' ')[:-1]) if ' ' in x else x))
print(df)

           Municipio
0             Zacapa
1           San Luis
2  Antigua Guatemala
3          Guatemala
4              Mixco
5          Sacapulas
6     Puerto Barrios
7              Petén
8     San Martin Jil
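As an alternative sketch (not part of the original answer), a single vectorized regular expression can give the same result on these examples. It assumes the column holds plain strings and that "last word" means the final run of non-whitespace characters. The sample DataFrame below is rebuilt from the strings listed in the question.

```python
import pandas as pd

# Sample data rebuilt from the question's examples (bold markers removed).
df = pd.DataFrame({"Municipio": [
    "Zacapa",
    "San Luis, Jalapa",
    "Antigua Guatemala Sacatepéquez",
    "Guatemala",
    "Mixco",
    "Sacapulas, Jutiapa",
    "Puerto Barrios, Izabal",
    "Petén Petén",
    "San Martin Jil, Chimaltenango",
]})

# Strip surrounding whitespace first so that $ lines up with the last word.
cleaned = df["Municipio"].str.strip()

# Remove a run of commas/whitespace followed by the final word.
# One-word values contain no such run, so they are left untouched.
df["Municipio"] = cleaned.str.replace(r"[,\s]+\S+$", "", regex=True)
print(df)
```

The two approaches differ when the part after the comma has several words: for a hypothetical value such as `"A, B C"`, the lambda drops everything after the comma, while this regex drops only `C`.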

If you want to keep the trailing comma and space:

df['Municipio'] = df['Municipio'].apply(lambda x: ', '.join(x.split(',')[:-1]+['']) if ',' in x
                                        else (' '.join(x.split(' ')[:-1]+['']) if ' ' in x else x))
print(df)

            Municipio
0              Zacapa
1          San Luis,
2  Antigua Guatemala
3           Guatemala
4               Mixco
5         Sacapulas,
6    Puerto Barrios,
7              Petén
8    San Martin Jil,
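The two lambdas can also be folded into one named helper, which may be easier to read and test. This is a minimal sketch: the name `drop_last_word` and the `keep_separator` flag are hypothetical, and it uses `rsplit` instead of `split`/`join`, so it behaves slightly differently from the lambdas when a string contains more than one comma.

```python
import pandas as pd

def drop_last_word(text: str, keep_separator: bool = False) -> str:
    """Drop the part after the last comma, or else after the last space."""
    for sep, tail in ((",", ", "), (" ", " ")):
        if sep in text:
            # Split once from the right and keep everything before the separator.
            head = text.rsplit(sep, 1)[0]
            return head + tail if keep_separator else head
    # No comma and no space: a single word, returned unchanged.
    return text

# Illustrative values taken from the question.
names = pd.Series(["Zacapa", "San Luis, Jalapa", "Petén Petén"])
print(names.apply(drop_last_word).tolist())
print(names.apply(drop_last_word, keep_separator=True).tolist())
```

Keyword arguments given to `Series.apply` are passed through to the function, which is how `keep_separator=True` reaches the helper in the last line.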

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Ynjxsjmh