'iterate over a df and multiply the values by the values of another df
My df1 looks like this:It contains 3 unique project id.The date starts on 01-01-22 and ends on 01-12-28
| id | date | p50 | p90 |
|---|---|---|---|
| apv1 | 01-01-22 | 1000 | 1000 |
| apv2 | 01-01-22 | 1000 | 1000 |
| tsso | 01-01-22 | 1202 | 2005 |
| apv1 | 01-02-22 | 1000 | 2000 |
| apv2 | 01-02-22 | 1400 | 5000 |
| tsso | 01-02-22 | 200 | 1000 |
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
| apv1 | 01-11-28 | 305 | 400 |
| apv2 | 01-11-28 | 300 | 200 |
| tsso | 01-11-28 | 250 | 499 |
| apv1 | 01-12-28 | 100 | 290 |
| apv2 | 01-12-28 | 145 | 650 |
| tsso | 01-12-28 | 299 | 179 |
df2 looks like this:
| mth | month | pct |
|---|---|---|
| 01 | january | 0.042 |
| 02 | febuary | 0.055 |
| 03 | march | 0.089 |
| 04 | april | 0.097 |
| 05 | may | 0.10 |
| 06 | june | 0.11 |
| 07 | july | 0.127 |
| 08 | august | 0.114 |
| 09 | sept | 0.091 |
| 10 | oct | 0.068 |
| 11 | nov | 0.043 |
| 12 | dec | 0.038 |
I want to multiply each value p50 of df1 by the value pct of df2 with mth or month value equals to the month of the date in df1. Similarly for p90.
The Final df should look like this:
| id | date | p50 | p90 |
|---|---|---|---|
| apv1 | 01-01-22 | 1000*0.042 | 1000*0.042 |
| apv2 | 01-01-22 | 1000*0.042 | 1000*0.042 |
| tsso | 01-01-22 | 1202*0.042 | 2005*0.042 |
| apv1 | 01-02-22 | 1000*0.055 | 2000*0.055 |
| apv2 | 01-02-22 | 1400*0.055 | 5000*0.055 |
| tsso | 01-02-22 | 200*0.055 | 1000*0.055 |
| . | . | . | . |
| . | . | . | . |
| . | . | . | . |
| apv1 | 01-11-28 | 305*0.043 | 400*0.043 |
| apv2 | 01-11-28 | 300*0.043 | 200*0.043 |
| tsso | 01-11-28 | 250*0.043 | 499*0.043 |
| apv1 | 01-12-28 | 100*0.038 | 290*0.038 |
| apv2 | 01-12-28 | 145*0.038 | 650*0.038 |
| tsso | 01-12-28 | 299*0.038 | 179*0.038 |
Solution 1:[1]
You should be able to use map to modify you data in place:
s = df2.set_index('mth')['pct']
pct = pd.to_datetime(df1['date']).dt.strftime('%m').map(s)
df1[['p50', 'p90']] *= pct
Solution 2:[2]
Simply assign month column using the date column. Then merge both data frames by month. Finally, run multiplication with needed columns:
final_df = (
df1.assign(mth = lambda x: pd.to_datetime(x["date"]).dt.strftime("%m"))
.merge(df2, on="mth")
.assign(
p50 = lambda x: x["p50"].mul(x["pct"]),
p90 = lambda x: x["p90"].mul(x["pct"])
)
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | mozway |
| Solution 2 | Parfait |
