AttributeError: 'str' object has no attribute 'values' (compare dataframes)
I am a beginner in Python, and I am having trouble with some code I wrote.
I have two dataframes: one with general information about books (dfMediaGe), and the other with the titles of books that were featured on TV (dfTV).
My goal is to compare them and fill the column 'At least 1 TV emission' in dfMediaGe with a 1 if the book appears in dfTV.
My difficulty is that the dataframes do not have the same number of rows or columns.
Here is a sample of dfMediaGe:
Titre original_title AUTEUR DATE EDITEUR THEMESIMPLE THEME GENRE rating rating_average ... current_count done_count list_count recommend_count review_count TRADUITDE LANGUEECRITURE NOTE At least 1 TV emission id
0 La souris des dents NaN Roger, Marie-Sabine|Desbons, Marie 01/01/2021 Lito TIPJJ001 Eveil J000100 Jeunesse - Eveil et Fiction / Histoire... GJEU003 Jeunesse / Mini albums|GJEU013 Jeuness... NaN NaN ... 0.0 0.0 0.0 0.0 0.0 NaN fre NaN 0 46220676.0
1 La petite mare du grand crocodile NaN Buteau, Gaëlle|Hudrisier, Cécile 01/01/2021 Lito TIPJJ001 Eveil J000100 Jeunesse - Eveil et Fiction / Histoire... GJEU003 Jeunesse / Mini albums|GJEU013 Jeuness... NaN NaN 0.0 0.0 0.0 0.0 0.0 0.0 NaN fre NaN 46220678.0
And here is a sample of dfTV:
Titre AUTEUR DATE EDITEUR GENRE THEMESIMPLE TRADUITDE NOTE THEME LANGUEECRITURE FORMATNUMERIQUE PUBLIC MATIERE LEXIQUE DESCRIPTION
0 Les strates Bagieu, Pénélope 11/12/2021 Gallimard NaN TIPBD001 Albums NaN NaN T090200 Bandes dessinées / Bandes dessinées fre NaN NaN NaN NaN 1 vol. ; illustrations en noir et blanc ; 24 x...
And here is the code I wrote, which is not working at all.
for Titre, r in dfMediaGe.iterrows():
    for Titre, r in dfTV.iterrows():
        p = 0
        if r['Titre'].values == (dfTV['Titre']).values.any():
            p = 1
        r['Au moins 1 passage TV'].append(p)
I get this error:
AttributeError: 'str' object has no attribute 'values'
Thank you very much for your help !!
Solution 1:[1]
I don't think the two dataframes having a different number of columns is a problem.
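As for the error itself, here is a minimal sketch of where it most likely comes from, assuming the Titre column holds plain strings. iterrows() yields (index, row) pairs in which each row is a Series, so row['Titre'] is already a single str and has no .values attribute. The two-row frame below is made up purely for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only to reproduce the error
df = pd.DataFrame({'Titre': ['Les strates', 'Movie XYZ']})

for i, row in df.iterrows():
    # row is a pandas Series; row['Titre'] is one cell, here a plain str
    print(type(row).__name__, type(row['Titre']).__name__)

# Accessing .values on that str raises the AttributeError from the question
first_title = df.iloc[0]['Titre']
try:
    first_title.values
except AttributeError as exc:
    message = str(exc)
    print(message)
```

In other words, .values belongs on a whole column such as dfTV['Titre'], not on a single cell taken from a row.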
You can achieve what you are looking for using this:
import pandas as pd

# Sample data standing in for the real dataframes
data_dfMediaGe = [
    ['Les strates Bagieu'],
    ['La petite mare du grand crocodile'],
    ['La souris des dents NaN Roger'],
    ['Movie XYZ']
]
dfMediaGe = pd.DataFrame(data=data_dfMediaGe, columns=['Titre'])
dfMediaGe['Au moins 1 passage TV'] = 0  # default: not shown on TV
data_dfTV = [
    ['La petite mare du grand crocodile'],
    ['Movie XYZ']
]
dfTV = pd.DataFrame(data=data_dfTV, columns=['Titre'])
# Build the lookup once instead of on every iteration
tv_titles = set(dfTV['Titre'])
for i, row in dfMediaGe.iterrows():
    # row['Titre'] is a plain string, so test membership directly
    if row['Titre'] in tv_titles:
        dfMediaGe.at[i, 'Au moins 1 passage TV'] = 1
print(dfMediaGe)
                               Titre  Au moins 1 passage TV
0                 Les strates Bagieu                      0
1  La petite mare du grand crocodile                      1
2      La souris des dents NaN Roger                      0
3                          Movie XYZ                      1
All you have to do is iterate over the rows of dfMediaGe and check whether the value in the Titre column is also present in the Titre column of dfTV.
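The same check can also be written without an explicit loop. This is a minimal sketch, assuming titles must match exactly (same spelling, case, and whitespace) and reusing the sample titles from above: Series.isin returns a boolean mask, and astype(int) turns it into 0/1.

```python
import pandas as pd

dfMediaGe = pd.DataFrame({'Titre': [
    'Les strates Bagieu',
    'La petite mare du grand crocodile',
    'La souris des dents NaN Roger',
    'Movie XYZ',
]})
dfTV = pd.DataFrame({'Titre': [
    'La petite mare du grand crocodile',
    'Movie XYZ',
]})

# True where the title also appears in dfTV, then converted to 1/0
dfMediaGe['Au moins 1 passage TV'] = dfMediaGe['Titre'].isin(dfTV['Titre']).astype(int)
print(dfMediaGe)
```

On large dataframes this is usually much faster than iterrows(), since the comparison runs on the whole column at once. If the real titles differ in case or surrounding spaces, they would need to be normalised first.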
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Adrian Mole |
