data.table::setorder on data.frame input orders only the sort column
I'm passing a data.frame to a function that sorts it using data.table::setorder. Before sorting, I call data.table::setDT on the input (in practice, I later need to call some data.table-specific functions on it). This is the function I'm using:
test_order <- function(x) {
  data.table::setDT(x)
  data.table::setorder(x, score)
}
and this is how I'm calling it:
x0 <- data.frame(id = 1:3, score = c(20, 10, 30))
x1 <- test_order(x0)
The results seem strange:
- The type of x0 is changed to a data.table - that seems okay;
- The column x0$id is not modified - its address (using data.table::address) remains the same, and it also remains in the original order;
- The column x0$score also remains at the same address, but it is now sorted, so it's out of sync with the id column;
- The resulting x1 has the rows sorted correctly; also, the score column is at the same address as the score column of x0, but the id column is, of course, at a different address.

Here is what I get:
> x0
  id score
1  1    20
2  2    10
3  3    30
> x1 <- test_order(x0)
> x0
   id score
1:  1    10
2:  2    20
3:  3    30
> x1
   id score
1:  2    10
2:  1    20
3:  3    30
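The address observations in the list above can be checked with a short script like the one below. It is only a sketch: the helper vectors before, after_x0 and after_x1 are names made up for this illustration, and the addresses (and whether the columns end up shared at all) may differ on other versions of data.table or R.

```r
library(data.table)

# Same function as in the question
test_order <- function(x) {
  data.table::setDT(x)
  data.table::setorder(x, score)
}

x0 <- data.frame(id = 1:3, score = c(20, 10, 30))

# Column addresses before the call
before <- c(id = address(x0$id), score = address(x0$score))

x1 <- test_order(x0)

# Column addresses of both objects after the call
after_x0 <- c(id = address(x0$id), score = address(x0$score))
after_x1 <- c(id = address(x1$id), score = address(x1$score))

# Side-by-side view: equal strings mean the same underlying vector
data.frame(column = names(before), before, after_x0, after_x1,
           row.names = NULL)
```

If a column shows the same address for x0 and x1, both objects share that vector, so reordering it by reference is visible through both.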
The most "surprising" thing is the loss of data integrity of the x0 object - the id column keeps the original order but the score column got sorted, so the rows are different from the rows of the original object.
This does not happen if I'm not calling setDT or if I'm passing a data.table object instead of a data.frame.
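For comparison, here is a minimal sketch of a variant that deep-copies the input before converting it, so that no column is shared with the caller's object. The name test_order_copy is hypothetical, and this is one possible way to sidestep the problem, not a confirmed fix for the behaviour above.

```r
library(data.table)

# Hypothetical variant of test_order: work on a deep copy so that
# nothing done by reference can reach the caller's data.frame.
test_order_copy <- function(x) {
  x <- data.table::copy(x)        # deep copy of the input
  data.table::setDT(x)            # convert the local copy by reference
  data.table::setorder(x, score)  # sort the local copy by reference
  x[]                             # return the sorted data.table visibly
}

x0 <- data.frame(id = 1:3, score = c(20, 10, 30))
x1 <- test_order_copy(x0)

class(x0)  # x0 stays a plain data.frame
x0         # rows in their original order
x1         # rows sorted by score
```

The cost is one extra copy of the data, which may matter for large inputs.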
The version of the data.table package is 1.14.2
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
