How to create an empty data.table with column names and then append data.tables to it?
First I want to create an empty datatable with column names but it fails:
data <- data.table(va, vb, vc)
> Error in data.table(va, vb, vc) : object 'va' not found
Second, I want to append a data.table to it, but that fails too:
data2 <- data.table(va=c(-1,0,1), vb=c(-1,0,1), vc=c(-1,0,1))
data2
va vb vc
1: -1 -1 -1
2: 0 0 0
3: 1 1 1
merge(data2,data2)
> Error in merge.data.table(data2, data2) :
Can not match keys in x and y to automatically determine appropriate `by` parameter. Please set `by` value explicitly.
Apparently the function can't determine the by parameter when given two identical data.tables. Any ideas?
Solution 1:[1]
To create an empty data.table, you can start from an empty matrix:
library(data.table)
data <- setNames(data.table(matrix(nrow = 0, ncol = 3)), c("va", "vb", "vc"))
data
Empty data.table (0 rows) of 3 cols: va,vb,vc
Then you can use rbindlist to append a new data.table to it:
data2 <- data.table(va=c(-1,0,1), vb=c(-1,0,1), vc=c(-1,0,1))
data2
va vb vc
1: -1 -1 -1
2: 0 0 0
3: 1 1 1
rbindlist(list(data, data2))
va vb vc
1: -1 -1 -1
2: 0 0 0
3: 1 1 1
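The empty table built from a matrix starts out with logical columns, because an empty matrix created without data is logical. The following is a minimal sketch, assuming the data.table package is installed, that checks the column classes before and after appending; the names classes_before, classes_after and combined are only illustrative.

```r
library(data.table)

# Empty table built from a 0-row matrix, as in the answer above
data <- setNames(data.table(matrix(nrow = 0, ncol = 3)), c("va", "vb", "vc"))
classes_before <- sapply(data, class)   # every column starts as logical

# Numeric rows to append
data2 <- data.table(va = c(-1, 0, 1), vb = c(-1, 0, 1), vc = c(-1, 0, 1))
combined <- rbindlist(list(data, data2))

# Inspect how rbindlist resolved the column types
classes_after <- sapply(combined, class)
print(classes_before)
print(classes_after)
```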
Or even simpler, the following also works:
data <- data.table()
data <- rbindlist(list(data, data2))
data
va vb vc
1: -1 -1 -1
2: 0 0 0
3: 1 1 1
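The same pattern extends to appending several tables one after another. This is a sketch with made-up values; the loop variable i and the table chunk are hypothetical and stand in for whatever data arrives at each step.

```r
library(data.table)

# Start from a completely empty data.table (no columns, no rows)
data <- data.table()

for (i in 1:3) {
  # Hypothetical chunk of new rows; in practice this might come from a file
  chunk <- data.table(va = i, vb = i * 10, vc = i * 100)
  data <- rbindlist(list(data, chunk))
}

print(data)
```

If there are many chunks, it is usually cheaper to collect them in a list and call rbindlist once at the end, since each call inside the loop copies the rows accumulated so far.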
Solution 2:[2]
Another way to create an empty data.table with defined column names but without having to define data types:
data <- data.table(1)[,`:=`(c("va", "vb", "vc"),NA)][,V1:=NULL][.0]
This does the following:
- data.table(1): Create a non-NULL data.table to which you can add columns
  - Has one column, V1, with one row (value 1)
  - You can use any value (other than NULL) in place of 1
- [,`:=`(c("va", "vb", "vc"),NA)]: Add columns va, vb, vc
  - Now has four columns (starting with V1) and one row, with values 1, NA, NA, NA
  - Any non-NULL value can be substituted for NA
- [,V1:=NULL]: Remove the V1 column
- [.0]: Return zero rows (.0 is just the numeric literal 0, so this asks for row 0, which does not exist)
  - You can actually use [.n] where n is any integer.
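To see what each link in the chain does, the steps can be run one at a time. This is only a sketch of the same chain split into separate statements; the names dt and empty are illustrative, and [0] is written in place of [.0], which is the same number.

```r
library(data.table)

# Step 1: a one-column, one-row table (column V1, value 1)
dt <- data.table(1)

# Step 2: add va, vb, vc by reference, each filled with NA
dt[, `:=`(c("va", "vb", "vc"), NA)]
cols_after_add <- names(dt)      # V1 plus the three new columns

# Step 3: drop the placeholder column by reference
dt[, V1 := NULL]

# Step 4: select row 0, which keeps the columns but no rows
empty <- dt[0]

print(empty)
print(sapply(empty, class))      # the NA-seeded columns are logical
```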
If you don't like the black magic of [.0] you can also use
data <- data.table(1)[,`:=`(c("va", "vb", "vc"),NA)][,V1:=NULL][!is.na(va)]
Edit several years later:
Note that these columns are initially of class logical (as in the NA example above). Column classes are normally coerced to those of the columns of any appended data, but this appears to fail with Date data.
> alldata[,lapply(.SD,class)] # 0-row data seeded with NA in each column as above
va vb vc vd
1: logical logical logical logical
> filedata[,lapply(.SD,class)] # lines of real data that you are trying to merge
va vb vc vd
1: character character integer Date
> rbindlist(list(alldata,filedata))
Error in rbindlist(list(alldata, filedata), use.names = FALSE) :
Class attribute on column 4 of item 2 does not match with column 4 of item 1.
To work around this error, one solution is to use @R Yoda's answer, with that column declared as e.g. vd=as.Date(character(0), origin = "1970-01-01")
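A sketch of that workaround, assuming the column layout shown in the class listings above (character, character, integer, Date): every column of the empty table is declared with a zero-length vector of the intended type, so the Date column already has its class attribute before anything is appended. The row in filedata is made up for illustration.

```r
library(data.table)

# Empty table with explicitly typed zero-length columns
alldata <- data.table(
  va = character(0),
  vb = character(0),
  vc = integer(0),
  vd = as.Date(character(0), origin = "1970-01-01")
)

# Hypothetical row of real data with matching column types
filedata <- data.table(
  va = "a",
  vb = "b",
  vc = 1L,
  vd = as.Date("2020-01-01")
)

combined <- rbindlist(list(alldata, filedata))
print(combined[, lapply(.SD, class)])
```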
Note that this error was reported to the data.table github repo here for this specific use-case. It had generally been reported here previously.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | |
