Rearranging a dataset to plot a specific time period with ggplot
I have a long data.frame with daily precipitation values. What I want to do is:
- Select a specific time-frame
- Group and plot the average values by day and month
To select a specific time-period I've used the openair package:
library(openair)
filtered <- selectByDate(mydata, year = 2017:2020, month = c(9, 10, 11, 12, 1, 2, 3))
Which gives this:
date value
1: 2017-01-01 0.04
2: 2017-01-02 0.08
3: 2017-01-03 8.83
4: 2017-01-04 1.51
5: 2017-01-05 7.71
---
845: 2020-12-27 0.82
846: 2020-12-28 2.29
847: 2020-12-29 2.29
848: 2020-12-30 0.03
849: 2020-12-31 2.36
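For readers without openair, roughly the same selection can be sketched in base R. This is only a sketch, not how selectByDate works internally. The `mydata` below is a synthetic stand-in with random values, and it assumes a `date` column of class Date with one row per day from 2017 to 2020.

```r
# Synthetic stand-in for `mydata` (random values, not the real precipitation data)
set.seed(1)
dates  <- seq(as.Date("2017-01-01"), as.Date("2020-12-31"), by = "day")
mydata <- data.frame(date = dates, value = round(runif(length(dates), 0, 10), 2))

# Extract year and month as integers
yr <- as.integer(format(mydata$date, "%Y"))
mo <- as.integer(format(mydata$date, "%m"))

# Keep September to March for the years 2017 to 2020
filtered <- mydata[yr %in% 2017:2020 & mo %in% c(9:12, 1:3), ]
nrow(filtered)
```

With one row per day, this keeps 849 rows, the same count as in the printout above.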
However, I need dates in Day-Month format, so I did:
dm.date <- format(filtered$date, "%d-%m")
filtered$dm <- factor(dm.date)
And got:
date value dm
1: 2017-01-01 0.04 01-01
2: 2017-01-02 0.08 02-01
3: 2017-01-03 8.83 03-01
4: 2017-01-04 1.51 04-01
5: 2017-01-05 7.71 05-01
6: 2017-01-06 1.62 06-01
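It may help to note that factor() sorts its levels as character strings by default, so DD-MM labels end up ordered by day before month. A small illustration with four made-up dates (not taken from the dataset):

```r
# Four made-up dates, given in chronological order
dates <- as.Date(c("2017-01-01", "2017-01-02", "2017-02-01", "2017-09-01"))

# Same conversion as above: DD-MM strings turned into a factor
dm <- factor(format(dates, "%d-%m"))

# Levels are sorted as text, so the day part decides the order first
levels(dm)
# "01-01" "01-02" "01-09" "02-01"
```

Any grouping or plotting that follows the factor levels will inherit this order.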
What I want to do is summarize the information for each day and month (DD-MM), the dm column in my new dataset. So I've used dplyr's group_by:
library(dplyr)
data.new <- filtered %>% group_by(dm) %>% summarise(avg = mean(value))
data.new
dm avg
1 01-01 2.6400
2 01-02 1.9575
3 01-03 11.4300
4 01-09 0.8150
5 01-10 3.0200
6 01-11 4.8975
7 01-12 34.0525
8 02-01 6.2275
9 02-02 8.6775
10 02-03 9.8800
11 02-09 0.1150
12 02-10 2.1925
13 02-11 1.2000
14 02-12 12.2425
15 03-01 6.7125
16 03-02 14.9300
17 03-03 4.8825
18 03-09 0.1250
19 03-10 1.5175
20 03-11 1.5275
21 03-12 5.3400
22 04-01 9.7525
23 04-02 13.5575
24 04-03 3.9450
25 04-09 0.4975
26 04-10 1.4725
27 04-11 11.8100
28 04-12 6.7850
29 05-01 9.2700
30 05-02 16.5575
31 05-03 2.6850
32 05-09 0.3225
33 05-10 0.6150
34 05-11 3.0500
35 05-12 9.5600
36 06-01 2.3375
37 06-02 16.0350
38 06-03 3.7325
39 06-09 0.0825
40 06-10 0.4375
41 06-11 3.4400
42 06-12 11.6975
43 07-01 7.2575
44 07-02 7.1350
45 07-03 3.2650
46 07-09 0.0125
47 07-10 1.1125
48 07-11 6.7450
49 07-12 16.1550
50 08-01 4.9225
51 08-02 2.0500
52 08-03 8.9175
53 08-09 0.0075
54 08-10 3.2425
55 08-11 18.9150
56 08-12 8.2575
57 09-01 0.5700
58 09-02 1.9050
59 09-03 2.3925
60 09-09 0.0000
61 09-10 5.3925
62 09-11 11.2500
63 09-12 4.1200
64 10-01 2.9100
65 10-02 5.6975
66 10-03 11.8350
67 10-09 0.0375
68 10-10 2.5375
69 10-11 7.2300
70 10-12 6.0800
71 11-01 2.0850
72 11-02 13.2200
73 11-03 6.9900
74 11-09 0.0700
75 11-10 0.7775
76 11-11 10.1525
77 11-12 7.2675
78 12-01 4.9575
79 12-02 5.9300
80 12-03 1.7375
81 12-09 0.2600
82 12-10 0.9650
83 12-11 8.5350
84 12-12 5.3500
85 13-01 8.0350
86 13-02 12.4100
87 13-03 2.5500
88 13-09 0.0250
89 13-10 0.9425
90 13-11 5.0075
91 13-12 4.6200
92 14-01 9.2025
93 14-02 1.8525
94 14-03 2.9325
95 14-09 8.0950
96 14-10 5.1650
97 14-11 4.9600
98 14-12 0.8575
99 15-01 5.1700
100 15-02 2.5900
101 15-03 1.9800
102 15-09 2.6225
103 15-10 5.5625
104 15-11 7.4550
105 15-12 2.0725
106 16-01 7.7775
107 16-02 10.3700
108 16-03 5.9400
109 16-09 0.2925
110 16-10 1.1475
111 16-11 9.3025
112 16-12 2.8100
113 17-01 6.5500
114 17-02 9.6000
115 17-03 7.7925
116 17-09 3.4125
117 17-10 2.7100
118 17-11 1.7425
119 17-12 6.2200
120 18-01 4.0150
121 18-02 3.6025
122 18-03 4.8150
123 18-09 1.0450
124 18-10 3.2725
125 18-11 11.6025
126 18-12 2.9650
127 19-01 1.5075
128 19-02 3.2500
129 19-03 12.1250
130 19-09 0.2450
131 19-10 5.6000
132 19-11 16.7350
133 19-12 3.4350
134 20-01 3.6975
135 20-02 2.1825
136 20-03 6.1700
137 20-09 5.0525
138 20-10 0.5350
139 20-11 14.2575
140 20-12 2.5000
141 21-01 1.6850
142 21-02 6.4650
143 21-03 9.8800
144 21-09 3.9850
145 21-10 4.8875
146 21-11 5.6325
147 21-12 6.4300
148 22-01 4.6950
149 22-02 8.7150
150 22-03 5.7250
151 22-09 2.0300
152 22-10 6.4875
153 22-11 11.7175
154 22-12 12.5875
155 23-01 7.8350
156 23-02 5.0875
157 23-03 1.4150
158 23-09 0.4300
159 23-10 3.0675
160 23-11 12.5700
161 23-12 20.4850
162 24-01 23.4325
163 24-02 3.7100
164 24-03 0.3150
165 24-09 0.4125
166 24-10 4.0375
167 24-11 4.2075
168 24-12 16.4575
169 25-01 3.0150
170 25-02 6.2950
171 25-03 0.3500
172 25-09 3.8625
173 25-10 9.7725
174 25-11 4.8350
175 25-12 10.5150
176 26-01 1.1850
177 26-02 7.8750
178 26-03 0.2625
179 26-09 7.8450
180 26-10 6.4100
181 26-11 1.1125
182 26-12 9.3075
183 27-01 0.7025
184 27-02 14.1325
185 27-03 0.3975
186 27-09 1.6325
187 27-10 5.0350
188 27-11 6.5425
189 27-12 7.8400
190 28-01 2.4850
191 28-02 13.2125
192 28-03 1.9425
193 28-09 1.7775
194 28-10 5.6000
195 28-11 4.9150
196 28-12 6.2075
197 29-01 7.4350
198 29-02 15.9200
199 29-03 7.2550
200 29-09 1.4950
201 29-10 6.2625
202 29-11 1.8150
203 29-12 3.1350
204 30-01 6.4025
205 30-03 2.0925
206 30-09 6.9225
207 30-10 11.1275
208 30-11 7.5550
209 30-12 2.5000
210 31-01 1.6925
211 31-03 4.7750
212 31-10 5.4025
213 31-12 7.5950
Here I'm having two problems:
1) My approach returns data.new ordered by day first rather than by month: all the first days of each month, then all the second days, and so on. What I want is the dm column in chronological order: 01-01, 02-01, 03-01, ... 30-12, 31-12.
2) I want to use ggplot2 to plot the averaged values across a specific season, starting on September 1st (01-09) and ending on March 31st (31-03), representing for example the spring/summer days. In other words, the x-axis should run from 01-09 to 31-03, and not from January 1st (01-01) to December 31st (31-12), as it would once the dataset is ordered chronologically.
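One possible way to get the ordering is to set the factor levels of dm explicitly, sorted by position within the season instead of alphabetically. The sketch below uses a synthetic stand-in for `filtered` (random values). The helper names `season_months` and `season_rank` are made up for illustration.

```r
# Synthetic stand-in for `filtered` (random values, September to March, 2017-2020)
set.seed(42)
dates <- seq(as.Date("2017-01-01"), as.Date("2020-12-31"), by = "day")
dates <- dates[as.integer(format(dates, "%m")) %in% c(9:12, 1:3)]
filtered <- data.frame(date = dates, value = round(runif(length(dates), 0, 10), 2))

# Position of each month within the season: Sep = 1, ..., Dec = 4, Jan = 5, ..., Mar = 7
season_months <- c(9:12, 1:3)
mo  <- as.integer(format(filtered$date, "%m"))
day <- as.integer(format(filtered$date, "%d"))
season_rank <- match(mo, season_months) * 100 + day

# DD-MM labels as a factor whose levels follow the season, not the alphabet
dm <- format(filtered$date, "%d-%m")
filtered$dm <- factor(dm, levels = unique(dm[order(season_rank)]))

# Average per DD-MM; the rows come out in level order
data.new <- aggregate(value ~ dm, data = filtered, FUN = mean)
names(data.new)[2] <- "avg"
head(data.new)
```

Once the levels are set this way, the original `group_by(dm) %>% summarise(avg = mean(value))` line should also return rows from 01-09 to 31-03, since dplyr orders groups by factor level, and ggplot2 draws a discrete axis in level order. For a plain calendar order (01-01 to 31-12), `season_months <- 1:12` would do.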
Any thoughts on how to solve that or to make it easier?
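One way to get a continuous September-to-March axis in ggplot2 is to map every date onto a single dummy season and keep the x variable as a real Date. The sketch below assumes a `date` column of class Date and uses synthetic random values. The dummy years 1999 and 2000 and the column name `season_date` are arbitrary choices for illustration.

```r
library(ggplot2)

# Synthetic stand-in for `filtered` (random values, September to March, 2017-2020)
set.seed(42)
dates <- seq(as.Date("2017-01-01"), as.Date("2020-12-31"), by = "day")
dates <- dates[as.integer(format(dates, "%m")) %in% c(9:12, 1:3)]
filtered <- data.frame(date = dates, value = round(runif(length(dates), 0, 10), 2))

# Map every date onto one dummy season: Sep-Dec -> 1999, Jan-Mar -> 2000.
# 2000 is a leap year, so 29 February remains a valid date.
mo <- as.integer(format(filtered$date, "%m"))
dummy_year <- ifelse(mo >= 9, 1999, 2000)
filtered$season_date <- as.Date(paste0(dummy_year, format(filtered$date, "-%m-%d")))

# Average over the years for each day of the season
data.new <- aggregate(value ~ season_date, data = filtered, FUN = mean)
names(data.new)[2] <- "avg"

# Date axis labelled as DD-MM, running from 01-09 to 31-03
p <- ggplot(data.new, aes(x = season_date, y = avg)) +
  geom_line() +
  scale_x_date(date_breaks = "1 month", date_labels = "%d-%m") +
  labs(x = "Day-Month", y = "Mean daily value")
p
```

Because the x-axis is a Date scale rather than a factor, the ordering problem disappears and the labels can be thinned or reformatted through `scale_x_date()`. The dummy year never appears in the plot.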
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
