How to sort and filter a data.frame in R?
I understand how to sort a data frame:
df[order(df$Height),]
and I understand how to filter (or subset) a data frame matching some predicate:
df[df$Weight > 120,]
but how do I sort and filter (as an example, order by Height and filter by Weight)?
Solution 1:[1]
Either in two steps
df1 <- df[df$weight > 120, ]
df2 <- df1[order(df1$height), ]
or, if you must, in one step, though it really is not any cleaner.
Data first:
R> set.seed(42)
R> df <- data.frame(weight=rnorm(10, 120, 10), height=rnorm(10, 160, 20))
R> df
weight height
1 133.7 186.1
2 114.4 205.7
3 123.6 132.2
4 126.3 154.4
5 124.0 157.3
6 118.9 172.7
7 135.1 154.3
8 119.1 106.9
9 140.2 111.2
10 119.4 186.4
And one way of doing it is double-subsetting:
R> subset(df, weight > 120)[order(subset(df, weight > 120)$height),]
weight height
9 140.2 111.2
3 123.6 132.2
7 135.1 154.3
4 126.3 154.4
5 124.0 157.3
1 133.7 186.1
R>
I'd go with the two-step.
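If the two-step version gets used often, it can be wrapped in a small function so the filtered frame is computed only once. This is a minimal sketch: `filter_then_sort` is a hypothetical helper name, not part of base R or of the original answer, and it assumes the filter contains no NA values.

```r
# Hypothetical helper: filter rows with a logical vector, then sort what is left.
filter_then_sort <- function(data, keep, by) {
  kept <- data[keep, , drop = FALSE]        # step 1: apply the filter
  kept[order(kept[[by]]), , drop = FALSE]   # step 2: order the surviving rows
}

# Same example data as above
set.seed(42)
df <- data.frame(weight = rnorm(10, 120, 10), height = rnorm(10, 160, 20))

res <- filter_then_sort(df, df$weight > 120, "height")
print(res)
```

The order is computed on the already filtered frame, so the sort index always matches the rows it is applied to.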
Solution 2:[2]
The package data.table allows you to do this in one short line of code.
Borrowing Dirk Eddelbuettel's example, set up some data:
set.seed(42)
df <- data.frame(weight=rnorm(10, 120, 10), height=rnorm(10, 160, 20))
Convert the data.frame to a data.table and subset on weight, ordering by height:
library(data.table)
dt <- data.table(df)
dt[weight>120][order(height)]
weight height
[1,] 140.1842 111.1907
[2,] 123.6313 132.2228
[3,] 135.1152 154.3149
[4,] 126.3286 154.4242
[5,] 124.0427 157.3336
[6,] 133.7096 186.0974
Solution 3:[3]
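df1 <- subset(df[order(df$height), ], weight > 120)
Sorting first works as long as the filter is evaluated on the sorted frame, which subset() does here. Indexing the sorted frame with df$weight > 120 would not work, because that logical vector is still in the original row order.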
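One caveat when sorting first: the logical filter has to be computed on the sorted frame. A vector such as `df$weight > 120` is in the original row order, so applying it to a reordered frame selects rows by position and can return the wrong ones. A minimal sketch using the example data from Solution 1 (the names `sorted`, `ok`, and `bad` are illustrative):

```r
set.seed(42)
df <- data.frame(weight = rnorm(10, 120, 10), height = rnorm(10, 160, 20))

sorted <- df[order(df$height), ]

# Filter computed on the sorted frame: index and rows line up
ok <- sorted[sorted$weight > 120, ]

# Filter computed on the unsorted frame: positions no longer match the rows
bad <- sorted[df$weight > 120, ]

print(ok)
print(bad)  # contains rows with weight <= 120
```

Comparing the two printed frames shows that `bad` is sorted by height but includes rows that fail the weight condition.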
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | Andrie |
| Solution 3 | Jacob Levine |
