How to store datasets in a list in a loop - R
Suppose I have dataset A including information on the month ("Date"):
| station ID | precipitation (mm) | LONG | LAT | Date |
|---|---|---|---|---|
| 1 | 70 | 5 | 50 | 2010-01 |
| 1 | 60 | 5 | 50 | 2010-02 |
| 1 | 61 | 5 | 50 | 2010-03 |
| 2 | 75 | 10 | 47 | 2010-01 |
| 2 | 65 | 10 | 47 | 2010-02 |
| 2 | 70 | 10 | 47 | 2010-03 |
I have a while loop that creates separate datasets from dataset A based on the month. My aim is to create a list list_months that stores all information of each dataset (i.e. month).
#used as criterion for selecting data per loop
months = c("2010-01", "2010-02","2010-03")
#used in different variable names, based on month
abr = c("jan", "feb", "mar")
#set item from list to 1 (=January)
i = 1
j = 1
#create empty list to store individual datasets that are generated for each loop
list_months = list()
#while loop. Aim: create 3 separate datasets, based on month, and store each one per loop
while(i <= length(months) && j <= length(abr))
{
#store each dataset to a different variable name, corresponding to the month
assign(paste("dataset", abr[j], sep="_"),subset(A, Date == months[i]))
#assign variable name to variable that is appended to list
ap <- paste("dataset", abr[j], sep="_")
#append variable name to list for further data processing
list_months <- append(list_months, ap)
#next loop (i.e. following month)
i = i+1
j = j+1
}
However, when I try to view the first item via view(list_months[1]) the output is as follows:
| X.dataset_jan. | |
|---|---|
| 1 | dataset_jan |
How can I store the variable name in a list in such a way that it creates the output:
| station ID | precipitation (mm) | LONG | LAT | Date |
|---|---|---|---|---|
| 1 | 70 | 5 | 50 | 2010-01 |
| 2 | 75 | 10 | 47 | 2010-01 |
Creating a new list after the loop list_months = list(dataset_jan, dataset_feb, dataset_mar) does the trick. However, I would like to store the datasets during each loop.
Thanks
Solution 1:[1]
This is quite straightforward using data.table's split method:
library(data.table)
# make sure your data is in data.table format (mydata is dataset A from the question)
setDT(mydata)
# use split.data.table to split on a certain column
split(mydata, by = "Date")
# $`2010-01`
# station_ID precipitation LONG LAT Date
# 1: 1 70 5 50 2010-01
# 2: 2 75 10 47 2010-01
#
# $`2010-02`
# station_ID precipitation LONG LAT Date
# 1: 1 60 5 50 2010-02
# 2: 2 65 10 47 2010-02
#
# $`2010-03`
# station_ID precipitation LONG LAT Date
# 1: 1 61 5 50 2010-03
# 2: 2 70 10 47 2010-03
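If you would rather keep a loop and stay in base R, the following is a minimal sketch of one way to do it. The loop in the question appends `ap`, which is only the character string holding the variable name, so the list ends up containing names rather than data frames. Assigning the subset directly to a named list element avoids `assign()` altogether. The data frame below is rebuilt from the question's table with simplified column names (`station_ID`, `precipitation`), which is an assumption made for illustration.

```r
# Example data rebuilt from the question's table (column names simplified;
# this reconstruction is an assumption for illustration).
A <- data.frame(
  station_ID    = c(1, 1, 1, 2, 2, 2),
  precipitation = c(70, 60, 61, 75, 65, 70),
  LONG          = c(5, 5, 5, 10, 10, 10),
  LAT           = c(50, 50, 50, 47, 47, 47),
  Date          = c("2010-01", "2010-02", "2010-03",
                    "2010-01", "2010-02", "2010-03"),
  stringsAsFactors = FALSE
)

months <- c("2010-01", "2010-02", "2010-03")
abr    <- c("jan", "feb", "mar")

# Empty list that is filled during the loop
list_months <- list()

for (i in seq_along(months)) {
  # Store the data frame itself (not its name) under a descriptive element name
  element_name <- paste("dataset", abr[i], sep = "_")
  list_months[[element_name]] <- subset(A, Date == months[i])
}

# Base R alternative without a loop: one data frame per distinct Date value
list_by_date <- split(A, A$Date)
```

After the loop, `list_months[["dataset_jan"]]` (or `list_months[[1]]`) returns the January data frame. Note the double brackets: `list_months[1]` returns a one-element list rather than the data frame itself, which is part of why viewing it in the question looked odd.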
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Wimpel |
