Selecting the top X values of each individual column in R
Say you have the following data set.
df1 <- matrix(data = 1:10,  # 1:10 is recycled to fill all 25 cells
              nrow = 5,
              ncol = 5)
colnames(df1) <- c("a", "b", "c", "d", "e")
How would you extract the top X values from each individual column as a new data frame?
The expected output would be something like this (for the top 3 values in each column):
| a | b | c | d | e |
|---|---|---|---|---|
| 5 | 10 | 5 | 10 | 5 |
| 4 | 9 | 4 | 9 | 4 |
| 3 | 8 | 3 | 8 | 3 |
Solution 1:[1]
You can use apply to apply a function to each column (MARGIN = 2). Here, the function is \(x) head(sort(x, decreasing = TRUE), 3), which sorts the column in decreasing order and then selects the top three values (head(x, 3)). Note that apply returns a matrix here, not a data frame.
apply(df1, 2, \(x) head(sort(x, decreasing = TRUE), 3))
# a b c d e
# [1,] 5 10 5 10 5
# [2,] 4 9 4 9 4
# [3,] 3 8 3 8 3
Note: \(x) is shorthand for function(x) when writing anonymous (lambda-like) functions, available since R 4.1.0.
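Since the question asks for the top X values as a new data frame, the same idea can be wrapped in a small helper. This is only a sketch: top_n_per_column is a hypothetical name, not something from the answer, and it assumes the columns contain no missing values. It uses function(x) rather than \(x), so it should also run on R versions older than 4.1.0.

```r
# Example data from the question: 1:10 is recycled across the 5 x 5 matrix.
df1 <- matrix(data = 1:10, nrow = 5, ncol = 5)
colnames(df1) <- c("a", "b", "c", "d", "e")

# Hypothetical helper: top `n` values of every column, as a data frame.
# Assumes no NAs, so every column yields the same number of values.
top_n_per_column <- function(m, n = 3) {
  top <- apply(m, 2, function(x) head(sort(x, decreasing = TRUE), n))
  # apply() simplifies to a plain vector when n == 1,
  # so rebuild the matrix shape before converting.
  top <- matrix(top, ncol = ncol(m), dimnames = list(NULL, colnames(m)))
  as.data.frame(top)
}

res <- top_n_per_column(df1, 3)
res
```

Calling top_n_per_column(df1, 1) returns a one-row data frame rather than a bare vector, which is the reason for the extra matrix() step.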
Solution 2:[2]
We can sort every column first, then take the first three rows with head:
head(apply(df1, 2, sort, decreasing = TRUE), 3)
# a b c d e
# [1,] 5 10 5 10 5
# [2,] 4 9 4 9 4
# [3,] 3 8 3 8 3
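One caveat with this approach: sort() drops missing values by default, so columns with different numbers of NAs end up with different lengths, and apply() may then return a list instead of a matrix. A possible workaround, sketched below on an illustrative matrix m that is not part of the question, is to pass na.last = TRUE so that NAs are kept and placed after the sorted values.

```r
# Illustrative matrix (not from the question) with one missing value.
m <- matrix(c(3, NA, 1,
              5, 2, 4),
            nrow = 3,
            dimnames = list(NULL, c("a", "b")))

# na.last = TRUE keeps NAs and puts them last, so every column keeps
# its length and apply() still returns a matrix.
top2 <- head(apply(m, 2, sort, decreasing = TRUE, na.last = TRUE), 2)
top2
```

For m, this gives 3 and 1 for column a, and 5 and 4 for column b. If a column has fewer non-missing values than the number of rows requested, the trailing rows for that column will be NA.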
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | |
| Solution 2 | zx8754 |
