pandas - joining multiple lines from one dataframe to just one line in another dataframe
I have two DataFrames, df1 and df2, with different dimensions:
df1=
timestamp price ... condition (Preis Gesamt)
ISBN ...
9783411718832 1644767760 1.08 ... 2 4.03
9783411718832 1644767760 4.04 ... 4 4.04
9783411718832 1639948080 4.38 ... 3 4.38
9783411718832 1633536720 5.88 ... 3 5.88
9783411718832 1616377080 2.98 ... 2 5.98
9783411718832 1642252560 4.37 ... 3 7.37
9783411718832 1643644200 4.95 ... 3 7.95
9783411718832 1616377080 5.90 ... 3 8.89
9783411718832 1643644200 4.38 ... 4 4.38
9783411718832 1645194480 4.38 ... 2 4.38
9783411745258 1635163440 4.00 ... 3 4.00
9783411745258 1644321360 1.14 ... 4 4.14
9783411745258 1619435640 1.89 ... 3 4.89
9783411745258 1644321360 5.00 ... 2 9.19
9783411745258 1644321360 5.00 ... 2 9.80
15x6
and df2 =
Menge Lagerplatz ... Preis (VLB) Mwst.
ISBN ...
9783411718832 1 Lago237-2 ... 9.99 7.0
9783411745258 1 Lago237-2 ... 5.00 7.0
2x9
What I'm trying to do is to add a column to df2 called 'offers' which contains the values of df1 where:
df1['ISBN'] == df2['ISBN']
(In both Dataframes 'ISBN' works as the index)
My problem is that I have to add multiple rows from df1 to just one row in df2, so I would actually have to store df1 as a 'cell' in df2.
1. I have no idea how to do that, and 2. I feel like there is a better way to solve this problem.
I'd be grateful for any help!
Solution 1:[1]
Since ISBN is the index of both DataFrames, df1['ISBN'] == df2['ISBN'] won't work: the expression refers to a column that doesn't exist in either DataFrame. Instead, join or merge the DataFrames. df2.join(df1) aligns on the indexes by default; merge does the same if you pass left_index=True and right_index=True.
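A minimal sketch of an index-based join. The two small frames below are hypothetical stand-ins for df1 and df2, using only a few of the columns and rows shown in the question. Each row of df2 is repeated once per matching offer in df1.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2, both indexed by ISBN.
df1 = pd.DataFrame(
    {
        "timestamp": [1644767760, 1639948080, 1635163440],
        "price": [1.08, 4.38, 4.00],
        "condition": [2, 3, 3],
    },
    index=pd.Index([9783411718832, 9783411718832, 9783411745258], name="ISBN"),
)
df2 = pd.DataFrame(
    {"Menge": [1, 1], "Lagerplatz": ["Lago237-2", "Lago237-2"]},
    index=pd.Index([9783411718832, 9783411745258], name="ISBN"),
)

# DataFrame.join matches on the index by default. With how="left",
# every ISBN in df2 is kept and duplicated for each of its offers in df1.
joined = df2.join(df1, how="left")
print(joined)
```

The result is a flat table with one row per offer, which is usually easier to filter and aggregate than a nested structure.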
Check this section of pandas documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html.
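If a single 'offers' cell per ISBN is really what is wanted, one possible approach (an assumption on my part, not something the answer above prescribes) is to group df1 by its index and store each group as a list of row dictionaries. The frames below are again hypothetical miniature versions of the ones in the question.

```python
import pandas as pd

# Hypothetical miniature versions of df1 and df2, both indexed by ISBN.
df1 = pd.DataFrame(
    {
        "timestamp": [1644767760, 1639948080, 1635163440],
        "price": [1.08, 4.38, 4.00],
        "condition": [2, 3, 3],
    },
    index=pd.Index([9783411718832, 9783411718832, 9783411745258], name="ISBN"),
)
df2 = pd.DataFrame(
    {"Menge": [1, 1], "Lagerplatz": ["Lago237-2", "Lago237-2"]},
    index=pd.Index([9783411718832, 9783411745258], name="ISBN"),
)

# Build one list of offer records per ISBN by grouping df1 on its index.
offers = pd.Series(
    {isbn: group.to_dict("records") for isbn, group in df1.groupby(level="ISBN")}
)

# Column assignment aligns the Series with df2 on the ISBN index.
df2["offers"] = offers
print(df2)
```

Cells holding Python lists work, but most vectorised pandas operations cannot look inside them. Keeping df1 separate and looking up offers with df1.loc[isbn], or using a flat join, is often more practical.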
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | luka1156 |
