AttributeError: 'str' object has no attribute 'values' (compare dataframes)

I am a beginner at Python, and I have some issues with code I wrote.

I have 2 dataframes: one with general information about books (dfMediaGe), and the other with the titles of books that were shown on TV (dfTV).

My goal is to compare them and to fill the column 'At least 1 TV emission' in dfMediaGe with a 1 if the book appears in the dfTV dataframe.

My difficulty is that the dataframes do not have the same number of rows or columns.

Here is a sample of dfMediaGe:

    Titre   original_title  AUTEUR  DATE    EDITEUR THEMESIMPLE THEME   GENRE   rating  rating_average  ... current_count   done_count  list_count  recommend_count review_count    TRADUITDE   LANGUEECRITURE  NOTE    At least 1 TV emission  id
0   La souris des dents NaN Roger, Marie-Sabine|Desbons, Marie  01/01/2021  Lito    TIPJJ001 Eveil  J000100 Jeunesse - Eveil et Fiction / Histoire...   GJEU003 Jeunesse / Mini albums|GJEU013 Jeuness...   NaN NaN ... 0.0 0.0 0.0 0.0 0.0 NaN fre NaN 0   46220676.0

1   La petite mare du grand crocodile   NaN Buteau, Gaëlle|Hudrisier, Cécile    01/01/2021  Lito    TIPJJ001 Eveil  J000100 Jeunesse - Eveil et Fiction / Histoire...   GJEU003 Jeunesse / Mini albums|GJEU013 Jeuness...   NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 NaN fre NaN 46220678.0

And here is a sample of dfTV:

    Titre   AUTEUR  DATE    EDITEUR GENRE   THEMESIMPLE TRADUITDE   NOTE    THEME   LANGUEECRITURE  FORMATNUMERIQUE PUBLIC  MATIERE LEXIQUE DESCRIPTION
0   Les strates Bagieu, Pénélope    11/12/2021  Gallimard   NaN TIPBD001 Albums NaN NaN T090200 Bandes dessinées / Bandes dessinées fre NaN NaN NaN NaN 1 vol. ; illustrations en noir et blanc ; 24 x...

And here is the code I wrote, which does not work at all:

  for Titre, r in dfMediaGe.iterrows():
    for Titre, r in dfTV.iterrows():
        p = 0
        if r['Titre'].values == (dfTV['Titre']).values.any():
            p = 1
            r['Au moins 1 passage TV'].append(p)

I get this error:

AttributeError: 'str' object has no attribute 'values'

Thank you very much for your help!



Solution 1:[1]

I don't think it is a problem that your two dataframes have different numbers of columns.

You can achieve what you are looking for like this:

import pandas as pd

data_dfMediaGe = [
    ['Les strates Bagieu'],
    ['La petite mare du grand crocodile'],
    ['La souris des dents NaN Roger'],
    ['Movie XYZ']
]
dfMediaGe = pd.DataFrame(data=data_dfMediaGe, columns=['Titre'])
dfMediaGe['Au moins 1 passage TV'] = 0

data_dfTV = [
    ['La petite mare du grand crocodile'],
    ['Movie XYZ']
]
dfTV = pd.DataFrame(data=data_dfTV, columns=['Titre'])

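# Build the lookup of TV titles once, instead of on every iteration.
tv_titles = set(dfTV['Titre'])

for i, row in dfMediaGe.iterrows():
    # Flag the book if its title also appears in dfTV.
    if row['Titre'] in tv_titles:
        dfMediaGe.at[i, 'Au moins 1 passage TV'] = 1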

print(dfMediaGe)


                               Titre  Au moins 1 passage TV
0                 Les strates Bagieu                      0
1  La petite mare du grand crocodile                      1
2      La souris des dents NaN Roger                      0
3                          Movie XYZ                      1

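All you have to do is iterate over the rows of dfMediaGe and check whether the value in the Titre column also appears in the Titre column of dfTV. The error in the original code occurs because r['Titre'] is already a plain string inside the loop, and a string has no .values attribute.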
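If the explicit loop is not needed, the same membership check can probably be written without iterrows(). The following is only a sketch using Series.isin, not the method from the answer above; it reuses the small made-up sample titles and the column name 'Au moins 1 passage TV' from the example, and it assumes titles must match exactly.

```python
import pandas as pd

# Hypothetical sample data mirroring the example above.
dfMediaGe = pd.DataFrame({'Titre': [
    'Les strates Bagieu',
    'La petite mare du grand crocodile',
    'La souris des dents NaN Roger',
    'Movie XYZ',
]})
dfTV = pd.DataFrame({'Titre': [
    'La petite mare du grand crocodile',
    'Movie XYZ',
]})

# isin() returns a boolean Series (True where the title is found in dfTV);
# astype(int) converts True/False to 1/0.
dfMediaGe['Au moins 1 passage TV'] = (
    dfMediaGe['Titre'].isin(dfTV['Titre']).astype(int)
)

print(dfMediaGe)
```

Because the comparison is an exact string match, titles that differ only in case or surrounding whitespace would not be flagged. If that matters for your data, normalizing both columns first (for example with .str.strip().str.lower()) may help.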

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Adrian Mole