Rearranging dataset to make plot with ggplot for specific time-period

I have a long data.frame with daily precipitation values. What I want to do is:

  1. Select a specific time-frame
  2. Group and plot the average values by day and month

To select a specific time-period I've used the openair package:

library(openair)
filtered <- selectByDate(mydata, year = 2017:2020, month = c(9, 10, 11, 12, 1, 2, 3))

Which gives this:

   
           date value
  1: 2017-01-01  0.04
  2: 2017-01-02  0.08
  3: 2017-01-03  8.83
  4: 2017-01-04  1.51
  5: 2017-01-05  7.71
 ---                 
845: 2020-12-27  0.82
846: 2020-12-28  2.29
847: 2020-12-29  2.29
848: 2020-12-30  0.03
849: 2020-12-31  2.36
         

However, I need dates in Day-Month format, so I did:

dm.date <- format(filtered$date, "%d-%m")
filtered$dm <- factor(dm.date) 

And got:

         date value    dm
1: 2017-01-01  0.04 01-01
2: 2017-01-02  0.08 02-01
3: 2017-01-03  8.83 03-01
4: 2017-01-04  1.51 04-01
5: 2017-01-05  7.71 05-01
6: 2017-01-06  1.62 06-01 

Next, I want to average the values for each day-month combination (DD-MM), i.e. for each level of the dm column in my new dataset. For that I've used dplyr's group_by:


library(dplyr)
data.new <- filtered %>% group_by(dm) %>% summarise(avg = mean(value))
data.new

       dm     avg
1   01-01  2.6400
2   01-02  1.9575
3   01-03 11.4300
4   01-09  0.8150
5   01-10  3.0200
6   01-11  4.8975
7   01-12 34.0525
8   02-01  6.2275
9   02-02  8.6775
10  02-03  9.8800
11  02-09  0.1150
12  02-10  2.1925
13  02-11  1.2000
14  02-12 12.2425
15  03-01  6.7125
16  03-02 14.9300
17  03-03  4.8825
18  03-09  0.1250
19  03-10  1.5175
20  03-11  1.5275
21  03-12  5.3400
22  04-01  9.7525
23  04-02 13.5575
24  04-03  3.9450
25  04-09  0.4975
26  04-10  1.4725
27  04-11 11.8100
28  04-12  6.7850
29  05-01  9.2700
30  05-02 16.5575
31  05-03  2.6850
32  05-09  0.3225
33  05-10  0.6150
34  05-11  3.0500
35  05-12  9.5600
36  06-01  2.3375
37  06-02 16.0350
38  06-03  3.7325
39  06-09  0.0825
40  06-10  0.4375
41  06-11  3.4400
42  06-12 11.6975
43  07-01  7.2575
44  07-02  7.1350
45  07-03  3.2650
46  07-09  0.0125
47  07-10  1.1125
48  07-11  6.7450
49  07-12 16.1550
50  08-01  4.9225
51  08-02  2.0500
52  08-03  8.9175
53  08-09  0.0075
54  08-10  3.2425
55  08-11 18.9150
56  08-12  8.2575
57  09-01  0.5700
58  09-02  1.9050
59  09-03  2.3925
60  09-09  0.0000
61  09-10  5.3925
62  09-11 11.2500
63  09-12  4.1200
64  10-01  2.9100
65  10-02  5.6975
66  10-03 11.8350
67  10-09  0.0375
68  10-10  2.5375
69  10-11  7.2300
70  10-12  6.0800
71  11-01  2.0850
72  11-02 13.2200
73  11-03  6.9900
74  11-09  0.0700
75  11-10  0.7775
76  11-11 10.1525
77  11-12  7.2675
78  12-01  4.9575
79  12-02  5.9300
80  12-03  1.7375
81  12-09  0.2600
82  12-10  0.9650
83  12-11  8.5350
84  12-12  5.3500
85  13-01  8.0350
86  13-02 12.4100
87  13-03  2.5500
88  13-09  0.0250
89  13-10  0.9425
90  13-11  5.0075
91  13-12  4.6200
92  14-01  9.2025
93  14-02  1.8525
94  14-03  2.9325
95  14-09  8.0950
96  14-10  5.1650
97  14-11  4.9600
98  14-12  0.8575
99  15-01  5.1700
100 15-02  2.5900
101 15-03  1.9800
102 15-09  2.6225
103 15-10  5.5625
104 15-11  7.4550
105 15-12  2.0725
106 16-01  7.7775
107 16-02 10.3700
108 16-03  5.9400
109 16-09  0.2925
110 16-10  1.1475
111 16-11  9.3025
112 16-12  2.8100
113 17-01  6.5500
114 17-02  9.6000
115 17-03  7.7925
116 17-09  3.4125
117 17-10  2.7100
118 17-11  1.7425
119 17-12  6.2200
120 18-01  4.0150
121 18-02  3.6025
122 18-03  4.8150
123 18-09  1.0450
124 18-10  3.2725
125 18-11 11.6025
126 18-12  2.9650
127 19-01  1.5075
128 19-02  3.2500
129 19-03 12.1250
130 19-09  0.2450
131 19-10  5.6000
132 19-11 16.7350
133 19-12  3.4350
134 20-01  3.6975
135 20-02  2.1825
136 20-03  6.1700
137 20-09  5.0525
138 20-10  0.5350
139 20-11 14.2575
140 20-12  2.5000
141 21-01  1.6850
142 21-02  6.4650
143 21-03  9.8800
144 21-09  3.9850
145 21-10  4.8875
146 21-11  5.6325
147 21-12  6.4300
148 22-01  4.6950
149 22-02  8.7150
150 22-03  5.7250
151 22-09  2.0300
152 22-10  6.4875
153 22-11 11.7175
154 22-12 12.5875
155 23-01  7.8350
156 23-02  5.0875
157 23-03  1.4150
158 23-09  0.4300
159 23-10  3.0675
160 23-11 12.5700
161 23-12 20.4850
162 24-01 23.4325
163 24-02  3.7100
164 24-03  0.3150
165 24-09  0.4125
166 24-10  4.0375
167 24-11  4.2075
168 24-12 16.4575
169 25-01  3.0150
170 25-02  6.2950
171 25-03  0.3500
172 25-09  3.8625
173 25-10  9.7725
174 25-11  4.8350
175 25-12 10.5150
176 26-01  1.1850
177 26-02  7.8750
178 26-03  0.2625
179 26-09  7.8450
180 26-10  6.4100
181 26-11  1.1125
182 26-12  9.3075
183 27-01  0.7025
184 27-02 14.1325
185 27-03  0.3975
186 27-09  1.6325
187 27-10  5.0350
188 27-11  6.5425
189 27-12  7.8400
190 28-01  2.4850
191 28-02 13.2125
192 28-03  1.9425
193 28-09  1.7775
194 28-10  5.6000
195 28-11  4.9150
196 28-12  6.2075
197 29-01  7.4350
198 29-02 15.9200
199 29-03  7.2550
200 29-09  1.4950
201 29-10  6.2625
202 29-11  1.8150
203 29-12  3.1350
204 30-01  6.4025
205 30-03  2.0925
206 30-09  6.9225
207 30-10 11.1275
208 30-11  7.5550
209 30-12  2.5000
210 31-01  1.6925
211 31-03  4.7750
212 31-10  5.4025
213 31-12  7.5950

Here I'm running into two problems:

1) My approach returns the data.new dataset sorted by day first: all the first days of each month, then all the second days, and so on. What I want is for the dm column to run in calendar order: 01-01, 02-01, 03-01, ... 30-12, 31-12.

2) I want to use ggplot2 to plot the averaged values across a specific season, starting on September 1st (01-09) and ending on March 31st (31-03), representing, for example, spring/summer days. In other words, the plot should show the averages over this time period, and should not start on January 1st (01-01) and end on December 31st (31-12), as it would once this dataset is ordered correctly.
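On the first point, the day-first ordering most likely comes from factor() sorting the "%d-%m" strings alphabetically, with group_by then following those levels. One possible approach is to group on numeric month and day, sort, and only then build dm as a factor whose levels follow the sorted rows. This is a minimal sketch, not a definitive solution: the `filtered` data frame below is a randomly generated stand-in for the real data, and the helper columns `month` and `day` are names assumed for illustration.

```r
library(dplyr)

# Hypothetical stand-in for `filtered`: random daily values for 2017-2020,
# restricted to September-March like the selectByDate() call in the question
set.seed(1)
filtered <- data.frame(
  date = seq(as.Date("2017-01-01"), as.Date("2020-12-31"), by = "day")
)
filtered$value <- round(runif(nrow(filtered), 0, 10), 2)
filtered <- filtered[as.integer(format(filtered$date, "%m")) %in% c(9:12, 1:3), ]

data.new <- filtered %>%
  mutate(month = as.integer(format(date, "%m")),   # numeric month, 1-12
         day   = as.integer(format(date, "%d"))) %>%  # numeric day, 1-31
  group_by(month, day) %>%
  summarise(avg = mean(value), .groups = "drop") %>%
  arrange(month, day) %>%                          # calendar order
  mutate(dm = sprintf("%02d-%02d", day, month),    # DD-MM label
         dm = factor(dm, levels = dm))             # levels follow row order

head(data.new)
```

Because dm now carries explicit levels, anything that respects factor order (printing by level, a discrete ggplot2 axis) should follow 01-01, 02-01, ... 31-12 instead of the alphabetical order.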

Any thoughts on how to solve that or to make it easier?
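For the September-to-March axis, one option is to map every observation onto a dummy date within a single artificial season, so that September comes before January, and to average per dummy date. The sketch below makes several assumptions: `filtered` is again a randomly generated stand-in, the columns `dummy_year` and `season_date` are invented for illustration, and the years 1999/2000 are arbitrary (2000 is chosen because it is a leap year, so 29 February stays a valid date).

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for `filtered` (random values, September-March only)
set.seed(1)
filtered <- data.frame(
  date = seq(as.Date("2017-01-01"), as.Date("2020-12-31"), by = "day")
)
filtered$value <- round(runif(nrow(filtered), 0, 10), 2)
filtered <- filtered[as.integer(format(filtered$date, "%m")) %in% c(9:12, 1:3), ]

season <- filtered %>%
  mutate(month = as.integer(format(date, "%m")),
         # Sep-Dec go to dummy year 1999, Jan-Mar to dummy year 2000
         dummy_year  = ifelse(month >= 9, 1999, 2000),
         season_date = as.Date(paste(dummy_year, format(date, "%m-%d"),
                                     sep = "-"))) %>%
  group_by(season_date) %>%
  summarise(avg = mean(value), .groups = "drop")

# A real Date axis keeps the spacing between days; labels hide the dummy year
p <- ggplot(season, aes(x = season_date, y = avg)) +
  geom_line() +
  scale_x_date(date_labels = "%d-%m", date_breaks = "1 month") +
  labs(x = "Day-Month", y = "Average value")
p
```

With this layout the x axis runs from 01-09 to 31-03. Note that 29 February is averaged over fewer years than the other days, which may or may not matter for the plot.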



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
