'Data manipulation in pandas on monthly, quarterly and annual level on multiple columns
I need to create a function which take an input as dictionary and update column values in the dataframe. My data looks following
| Date | Col_1 | Col_2 | Col_3 | Col_4 | Col_5 |
|---|---|---|---|---|---|
| 01/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 02/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 03/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 04/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 05/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 06/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 07/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 08/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 09/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 10/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 11/01/2021 | 10 | 20 | 10 | 20 | 10 |
| 12/01/2021 | 10 | 20 | 10 | 20 | 10 |
Now, if pass monthly level update in percentage for 'Col_1' and 'Col_2', say
{Date: ['01/01/2021','02/01/2021','03/01/2021','04/01/2021','05/01/2021','06/01/2021',
'07/01/2021','08/01/2021','09/01/2021','10/01/2021','11/01/2021','12/01/2021',],
'Col_1': [20,20,20,20,30,30,40,40,20,20,20,20],
'Col_2': [0,0,0,0,0,0,0,0,0,0,10,10]}
After performing this my desired out looks like this for monthly change
| Date | Col_1 | Col_2 | Col_3 | Col_4 | Col_5 |
|---|---|---|---|---|---|
| 01/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 02/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 03/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 04/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 05/01/2021 | 13 | 20 | 10 | 20 | 10 |
| 06/01/2021 | 13 | 20 | 10 | 20 | 10 |
| 07/01/2021 | 14 | 20 | 10 | 20 | 10 |
| 08/01/2021 | 14 | 20 | 10 | 20 | 10 |
| 09/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 10/01/2021 | 12 | 20 | 10 | 20 | 10 |
| 11/01/2021 | 12 | 24 | 10 | 20 | 10 |
| 12/01/2021 | 12 | 24 | 10 | 20 | 10 |
Similarly, I want to update my data on Quarterly and Yearly Level too. I am able to do the Yearly update, here is my code. Please help me with MONTHY AND QUARTERLY updation based on the inputs.
Thank you!!
dic = {'col_1':10,'col_2':-5)
year = 2021
def update_df(dic,df,year):
df = df[df['date'].dt.year == year]
df = (df+df.select_dtypes(include = 'number').mul(pd.Series(dic)/100)).combine_first(df)[df.columns]
return df
I am trying like this
def update_df(dic,df,year,choice):
if choice == annual:
df = df[df['date'].dt.year == year]
df = (df+df.select_dtypes(include =
'number').mul(pd.Series(dic)/100)).combine_first(df)[df.columns]
elif choice == quarterly :
df["quarter"] = df.date.dt.quarter
df = (df+df.select_dtypes(include =
'number').mul(pd.Series(dic)/100)).combine_first(df)[df.columns]
else choice == monthly :
df["month"] = df.date.dt.month
df = (df+df.select_dtypes(include =
'number').mul(pd.Series(dic)/100)).combine_first(df)[df.columns]
return df
Solution 1:[1]
**pandas captures 4 general time related concepts:
Date times: A specific date and time with timezone support. Similar to datetime.datetime from the standard library.
Time deltas: An absolute time duration. Similar to datetime.timedelta from the standard library.
Time spans: A span of time defined by a point in time and its associated frequency.
Date offsets: A relative time duration that respects calendar arithmetic. Similar to dateutil.relativedelta.relativedelta from the dateutil package.**
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | SANA SIPOU |
