data.table::setorder on data.frame input orders only the sort column

I'm passing a data.frame to a function that sorts it using data.table::setorder. Before sorting, I call data.table::setDT on the input (in the real code I later need to call some data.table-specific functions on it). This is the function I'm using:

test_order <- function(x) {
  data.table::setDT(x)
  data.table::setorder(x, score)
}

and this is how I'm calling it:

x0 <- data.frame(id = 1:3, score = c(20, 10, 30))
x1 <- test_order(x0)

The results seem strange:

  • The type of x0 is changed to a data.table, which seems okay;
  • The column x0$id is not modified: its address (as reported by data.table::address) remains the same, and it also remains in the original order;
  • The column x0$score also remains at the same address, but it is now sorted, so it is out of sync with the id column;
  • The resulting x1 has the rows sorted correctly; its score column is at the same address as the score column of x0, but its id column is, of course, at a different address.

Here is what I get:

> x0
  id score
1  1    20
2  2    10
3  3    30
> x1 <- test_order(x0)
> x0
   id score
1:  1    10
2:  2    20
3:  3    30
> x1
   id score
1:  2    10
2:  1    20
3:  3    30
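
The address observations above can be checked with a short script along the following lines. This is only a sketch: the variable names addr_before, addr_after_x0 and addr_x1 are introduced here for illustration, and the comparison results are printed rather than stated, since they may depend on the installed data.table version.

```r
library(data.table)

# Same function as in the question.
test_order <- function(x) {
  setDT(x)
  setorder(x, score)
}

x0 <- data.frame(id = 1:3, score = c(20, 10, 30))

# Record where each column lives before the call.
addr_before <- c(id = address(x0$id), score = address(x0$score))

x1 <- test_order(x0)

# Record the column addresses of both objects after the call.
addr_after_x0 <- c(id = address(x0$id), score = address(x0$score))
addr_x1 <- c(id = address(x1$id), score = address(x1$score))

# TRUE means the column did not move (x0) or is shared (x0 vs. x1).
print(addr_before == addr_after_x0)
print(addr_after_x0 == addr_x1)
```

Comparing the printed x0 and x1 next to these flags shows which columns are shared between the two objects and which were reordered in place.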

The most surprising part is the loss of data integrity in x0: the id column keeps its original order while the score column is sorted, so the rows no longer match the rows of the original object. This does not happen if I don't call setDT, or if I pass a data.table instead of a data.frame.
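
For comparison, here is a minimal sketch of the second case mentioned above, where the input is already a data.table. The names x0_dt and x1_dt are made up for this example. In this case setorder reorders the caller's object by reference, so both columns stay in sync.

```r
library(data.table)

# Same function as in the question.
test_order <- function(x) {
  setDT(x)
  setorder(x, score)
}

# The input is a data.table from the start, not a data.frame.
x0_dt <- data.table(id = 1:3, score = c(20, 10, 30))
x1_dt <- test_order(x0_dt)

# The caller's object is sorted as a whole: id and score move together.
print(x0_dt)
print(identical(x0_dt$id, x1_dt$id))
```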

I'm using data.table version 1.14.2.



Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow
