'iterate over a df and multiply the values by the values of another df
My df1 looks like this:It contains 3 unique project id
.The date
starts on 01-01-22
and ends on 01-12-28
id | date | p50 | p90 |
---|---|---|---|
apv1 | 01-01-22 | 1000 | 1000 |
apv2 | 01-01-22 | 1000 | 1000 |
tsso | 01-01-22 | 1202 | 2005 |
apv1 | 01-02-22 | 1000 | 2000 |
apv2 | 01-02-22 | 1400 | 5000 |
tsso | 01-02-22 | 200 | 1000 |
. | . | . | . |
. | . | . | . |
. | . | . | . |
apv1 | 01-11-28 | 305 | 400 |
apv2 | 01-11-28 | 300 | 200 |
tsso | 01-11-28 | 250 | 499 |
apv1 | 01-12-28 | 100 | 290 |
apv2 | 01-12-28 | 145 | 650 |
tsso | 01-12-28 | 299 | 179 |
df2 looks like this:
mth | month | pct |
---|---|---|
01 | january | 0.042 |
02 | febuary | 0.055 |
03 | march | 0.089 |
04 | april | 0.097 |
05 | may | 0.10 |
06 | june | 0.11 |
07 | july | 0.127 |
08 | august | 0.114 |
09 | sept | 0.091 |
10 | oct | 0.068 |
11 | nov | 0.043 |
12 | dec | 0.038 |
I want to multiply each value p50
of df1 by the value pct
of df2 with mth
or month
value equals to the month of the date
in df1. Similarly for p90
.
The Final df should look like this:
id | date | p50 | p90 |
---|---|---|---|
apv1 | 01-01-22 | 1000*0.042 | 1000*0.042 |
apv2 | 01-01-22 | 1000*0.042 | 1000*0.042 |
tsso | 01-01-22 | 1202*0.042 | 2005*0.042 |
apv1 | 01-02-22 | 1000*0.055 | 2000*0.055 |
apv2 | 01-02-22 | 1400*0.055 | 5000*0.055 |
tsso | 01-02-22 | 200*0.055 | 1000*0.055 |
. | . | . | . |
. | . | . | . |
. | . | . | . |
apv1 | 01-11-28 | 305*0.043 | 400*0.043 |
apv2 | 01-11-28 | 300*0.043 | 200*0.043 |
tsso | 01-11-28 | 250*0.043 | 499*0.043 |
apv1 | 01-12-28 | 100*0.038 | 290*0.038 |
apv2 | 01-12-28 | 145*0.038 | 650*0.038 |
tsso | 01-12-28 | 299*0.038 | 179*0.038 |
Solution 1:[1]
You should be able to use map
to modify you data in place:
s = df2.set_index('mth')['pct']
pct = pd.to_datetime(df1['date']).dt.strftime('%m').map(s)
df1[['p50', 'p90']] *= pct
Solution 2:[2]
Simply assign
month column using the date column. Then merge
both data frames by month. Finally, run multiplication with needed columns:
final_df = (
df1.assign(mth = lambda x: pd.to_datetime(x["date"]).dt.strftime("%m"))
.merge(df2, on="mth")
.assign(
p50 = lambda x: x["p50"].mul(x["pct"]),
p90 = lambda x: x["p90"].mul(x["pct"])
)
)
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
Solution | Source |
---|---|
Solution 1 | mozway |
Solution 2 | Parfait |