Converting dates into day of year in a pandas data frame

In the data frame df1, how can I convert the day-of-month values in columns q1, ..., q9, q0, d1, ..., d5 into day-of-year values?

YYYY,MM,q1,q2,q3,q4,q5,q6,q7,q8,q9,q0,d1,d2,d3,d4,d5
1975,01,2,11,12,26,25,10,29,21,30,22,8,7,14,4,13
1975,02,27,22,8,20,6,26,21,4,19,9,10,1,11,12,23
1975,03,8,7,21,22,25,9,4,30,2,19,10,11,28,12,27
1975,04,29,28,27,17,19,2,30,16,18,3,9,10,11,8,13

What I have tried is

from datetime import datetime
day_of_year = datetime.now().timetuple().tm_yday

But it does not work: it only gives the day of year for today's date, not for the dates in the data frame. Please help.



Solution 1:[1]

IIUC:

import pandas as pd

# Flatten your dataframe to vectorize the datetime operation
out = df.melt(['YYYY', 'MM'], ignore_index=False).astype(str)

# Compute day of year
out['value'] = pd.to_datetime(out['YYYY']+'-'+out['MM']+'-'+out['value']).dt.dayofyear

# Reshape your dataframe to its original layout
# (pivot arguments are passed by keyword, as required since pandas 2.0)
df = out.reset_index().pivot(index=['index', 'YYYY', 'MM'], columns='variable', values='value') \
        .droplevel(0).reset_index().rename_axis(columns=None)[df.columns]

Output:

>>> df
   YYYY MM   q1   q2   q3   q4   q5  q6   q7   q8   q9  q0  d1   d2   d3  d4   d5
0  1975  1    2   11   12   26   25  10   29   21   30  22   8    7   14   4   13
1  1975  2   58   53   39   51   37  57   52   35   50  40  41   32   42  43   54
2  1975  3   67   66   80   81   84  68   63   89   61  78  69   70   87  71   86
3  1975  4  119  118  117  107  109  92  120  106  108  93  99  100  101  98  103
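
For anyone who wants to reproduce this end to end, here is a minimal sketch that builds the sample frame from the question and runs the same melt/pivot round trip. It is a variation rather than the answer's exact code: it assumes the data is read with pd.read_csv (so YYYY and MM are integers), and it assembles the dates from numeric year/month/day columns instead of concatenating strings. The names CSV, long, parts and result are made up for illustration.

```python
import io

import pandas as pd

# Sample data copied from the question.
CSV = """YYYY,MM,q1,q2,q3,q4,q5,q6,q7,q8,q9,q0,d1,d2,d3,d4,d5
1975,01,2,11,12,26,25,10,29,21,30,22,8,7,14,4,13
1975,02,27,22,8,20,6,26,21,4,19,9,10,1,11,12,23
1975,03,8,7,21,22,25,9,4,30,2,19,10,11,28,12,27
1975,04,29,28,27,17,19,2,30,16,18,3,9,10,11,8,13
"""

df = pd.read_csv(io.StringIO(CSV))

# Wide to long, keeping the original row number in an "index" column.
long = df.melt(id_vars=["YYYY", "MM"], ignore_index=False).reset_index()

# Assemble dates from numeric year/month/day columns (no string parsing).
parts = long[["YYYY", "MM", "value"]].rename(
    columns={"YYYY": "year", "MM": "month", "value": "day"}
)
long["value"] = pd.to_datetime(parts).dt.dayofyear

# Long back to wide, restoring the original column order.
result = (
    long.pivot(index=["index", "YYYY", "MM"], columns="variable", values="value")
    .droplevel(0)
    .reset_index()
    .rename_axis(columns=None)[df.columns]
)
print(result)
```

With this input, an impossible date (say, day 30 in February) would make pd.to_datetime raise an error, so bad values do not pass through silently.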

Solution 2:[2]

You can subtract January 1st of the same year from a date and add 1 to get the day of the year:

from datetime import datetime

def day_of_year(y, m, d):
    return (datetime(y, m, d) - datetime(y, 1, 1)).days + 1

print(day_of_year(1975,2,27))
# 58
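
To apply this function to the data frame from the question, one option is a plain loop over the day columns. This is only a sketch: the small df below is a two-row, two-column excerpt of the question's data, and day_cols and result are hypothetical names. It works row by row in Python, so it is likely to be slower than a vectorized approach on large frames.

```python
from datetime import datetime

import pandas as pd


def day_of_year(y, m, d):
    return (datetime(y, m, d) - datetime(y, 1, 1)).days + 1


# Excerpt of the question's data (first two rows, first two day columns).
df = pd.DataFrame({"YYYY": [1975, 1975], "MM": [1, 2],
                   "q1": [2, 27], "q2": [11, 22]})

# Every column except the year and month holds day-of-month values.
day_cols = [c for c in df.columns if c not in ("YYYY", "MM")]

result = df.copy()
for col in day_cols:
    # Convert to plain int so datetime() receives built-in integers.
    result[col] = [
        day_of_year(int(y), int(m), int(d))
        for y, m, d in zip(df["YYYY"], df["MM"], df[col])
    ]

print(result)
```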

Solution 3:[3]

You can avoid the wide-to-long-and-back-to-wide round trip by using apply, which may give some speedup:

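# Select the day columns (names ending in a digit: q1..q0, d1..d5)
outcome = (df
           .filter(regex = r".+\d$")
           .astype(str)
           # Build a YYYY-MM-DD string per cell and take its day of year
           .apply(lambda m: pd.to_datetime(df.YYYY.astype(str)
                                           + "-"
                                           + df.MM.astype(str)
                                           + "-"
                                           + m)
                                           .dt.dayofyear)
           )

# Replace the day columns with their day-of-year values
df.assign(**outcome)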


   YYYY  MM   q1   q2   q3   q4   q5  q6   q7   q8   q9  q0  d1   d2   d3  d4   d5
0  1975   1    2   11   12   26   25  10   29   21   30  22   8    7   14   4   13
1  1975   2   58   53   39   51   37  57   52   35   50  40  41   32   42  43   54
2  1975   3   67   66   80   81   84  68   63   89   61  78  69   70   87  71   86
3  1975   4  119  118  117  107  109  92  120  106  108  93  99  100  101  98  103
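
A further variation on the same idea, offered as a sketch and not taken from any of the answers: since every day column in a row shares the same year and month, you can compute the day of year of the first of the month once per row and add the day values to it. The frame below is an excerpt of the question's data, and day_cols, first_of_month, offset and result are hypothetical names. Unlike the versions above, this does no validation, so an impossible day such as February 30 would not raise an error.

```python
import pandas as pd

# Excerpt of the question's data (all four rows, two of the day columns).
df = pd.DataFrame({"YYYY": [1975, 1975, 1975, 1975],
                   "MM": [1, 2, 3, 4],
                   "q1": [2, 27, 8, 29],
                   "d5": [13, 23, 27, 13]})

# Day columns are those whose names end in a digit, as in the answer above.
day_cols = df.filter(regex=r".+\d$").columns

# Number of days that precede the first day of each row's month.
first_of_month = pd.to_datetime(
    pd.DataFrame({"year": df["YYYY"], "month": df["MM"], "day": 1})
)
offset = first_of_month.dt.dayofyear - 1

# Add the per-row offset to every day column at once.
result = df.copy()
result[day_cols] = df[day_cols].add(offset, axis=0)

print(result)
```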

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution     Source
Solution 1   Corralien
Solution 2   Stef
Solution 3   sammywemmy