How to create a new column in a pandas DataFrame with the week of the year from datetime64[ns] without SettingWithCopyWarning?
I went through multiple articles explaining this, but I could not see how their examples apply to my case. I also searched this site. Here is my issue:
I want a column, derived from a datetime column, that holds some period: week, month, or quarter (depending on the request, but let's say week). Later I want to select last week's data and work with it without getting SettingWithCopyWarning.
import pandas as pd

df = pd.read_excel("file path", engine="openpyxl")
df_clear = df.dropna()
Now I have a clean DataFrame with multiple columns, one of which contains dates with dtype datetime64[ns]. Let's call this column "some dates". Later in my code I would like to reference a brand new column that contains the week of the year derived from "some dates". I tried multiple things; some work and some don't, but all of them give me SettingWithCopyWarning.
| column1 | column2 | some dates | column3 |
|---|---|---|---|
| data | data | 2020-10-10 07:07:07.777 | data |
I would like to have
| column1 | column2 | some dates | column3 | new_column |
|---|---|---|---|---|
| data | data | 2020-10-10 07:07:07.777 | data | 41 |
or even this if it would work:
| column1 | column2 | some dates | column3 | new_column |
|---|---|---|---|---|
| data | data | 2020-10-10 07:07:07.777 | data | 2020-wk 41 |
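For illustration, both layouts above can be derived from `Series.dt.isocalendar()`, which returns the ISO year, week and day for each date. This is only a sketch: the sample frame stands in for the Excel file, and the column names `week` and `year_week` are made up for the example. Note that 2020-10-10 falls in ISO week 41, and that the ISO year can differ from the calendar year around New Year.

```python
import pandas as pd

# Hypothetical stand-in for the Excel data; "some dates" follows the question.
df = pd.DataFrame({
    "column1": ["data", "data", None],
    "some dates": pd.to_datetime([
        "2020-10-10 07:07:07.777",
        "2021-01-03 12:00:00.000",
        "2020-10-12 00:00:00.000",
    ]),
})

# .copy() makes df_clear an independent frame, so adding columns to it
# is not treated as writing to a slice of df.
df_clear = df.dropna().copy()

# isocalendar() returns a DataFrame with the columns year, week and day.
iso = df_clear["some dates"].dt.isocalendar()

# Plain week number, usable for numeric comparisons.
df_clear["week"] = iso["week"].astype(int)

# Text label built from the ISO year and ISO week.
df_clear["year_week"] = iso["year"].astype(str) + "-wk " + iso["week"].astype(str)

print(df_clear)
```

The second sample date (Sunday 2021-01-03) is labelled `2020-wk 53`, because ISO weeks belong to the year that contains their Thursday. Using the ISO year in the label avoids pairing week 53 with the wrong year.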
I know I can use df_clear["some dates"] instead of df_clear.loc[:, 'some dates'], but I was trying to get rid of that warning at all costs.
Example 1: This works, but gives me the warning (prints df plus a new column containing only the week number)
df_clear['new_column'] = df_clear.loc[:, 'some dates'].dt.isocalendar().week
print(df_clear)
Example 2: This doesn't work and gives me the warning (prints df plus a new column with the unchanged data from the "some dates" column; the format has no effect)
df_clear['new_column'] = pd.to_datetime(df_clear.loc[:, "some dates"], format="%W")
print(df_clear)
I am not sure whether I can use pandas.Series.dt.strftime, because I want to compare the week in which the code runs (say, week 5) with the previous week (week 4), so that I can keep only the data for week 4.
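One possible way to make that comparison without string formatting is to compare periods instead of bare week numbers, which also keeps working across a year boundary (week 1 versus week 52 or 53). The sketch below is an illustration only: the helper name `rows_from_previous_period`, its parameters, and the sample data are assumptions, not part of the original code.

```python
import pandas as pd

def rows_from_previous_period(frame, date_col, freq="W", today=None):
    """Return a copy of the rows dated in the period just before `today`.

    freq is a pandas period alias: "W" (Monday to Sunday), "M" or "Q".
    """
    today = pd.Timestamp.now() if today is None else pd.Timestamp(today)
    previous = today.to_period(freq) - 1
    mask = frame[date_col].dt.to_period(freq) == previous
    # .copy() so that later column assignments on the result do not warn.
    return frame.loc[mask].copy()

# Made-up sample data.
sample = pd.DataFrame({
    "some dates": pd.to_datetime([
        "2020-09-30 08:00:00",
        "2020-10-05 09:30:00",
        "2020-10-10 07:07:07",
        "2020-10-12 10:00:00",
    ]),
    "column1": ["a", "b", "c", "d"],
})

# Pretend the code runs on Wednesday 2020-10-14; the previous week is 5-11 Oct.
last_week = rows_from_previous_period(sample, "some dates", today="2020-10-14")
print(last_week)
```

Passing `freq="M"` or `freq="Q"` would select the previous month or quarter in the same way. Leaving `today` unset uses the time at which the code runs.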
EDIT: changing this
df = pd.read_excel("file path", engine="openpyxl")
df_clear = df.dropna()
into this
df = pd.read_excel("file path", engine="openpyxl")
df = df.dropna()
solved the issue.
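A plausible explanation, hedged because it depends on the pandas version: the warning is tied to a frame that pandas believes may be a view of another frame that is still in use. Rebinding `df` drops the original, while calling `.copy()` keeps both frames but makes the cleaned one independent. The sketch below records warnings for both patterns; the helper `add_week` and the sample data are hypothetical.

```python
import warnings
import pandas as pd

# Made-up stand-in for the Excel data.
df = pd.DataFrame({
    "column1": ["data", None],
    "some dates": pd.to_datetime([
        "2020-10-10 07:07:07.777",
        "2020-10-12 00:00:00.000",
    ]),
})

def add_week(frame):
    """Add an ISO week column; return any SettingWithCopyWarning records."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        frame["new_column"] = frame["some dates"].dt.isocalendar().week
    return [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]

# Option 1: keep both frames, but make the cleaned one an explicit copy.
df_clear = df.dropna().copy()
copy_warnings = add_week(df_clear)

# Option 2 (the fix from the edit): rebind the same name, dropping the original.
df = df.dropna()
rebind_warnings = add_week(df)

print(len(copy_warnings), len(rebind_warnings))
```

Option 1 is the more explicit of the two when the unfiltered data is still needed later. Option 2 is enough when it is not.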
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
