Subtract one row from the other for several columns in R
I am doing some data manipulation with dplyr on the brca data set, and I have to find a solution to the question below.
"We are interested what variable might be the best indicator for the outcome malignant ("M") or benign ("B"). There are 30 features (variables) and we want to select one variable that has the largest difference between means for groups M and B."
Now I want to find the difference between the two resulting rows, then find the maximum difference and the name of the corresponding column.
Can anyone help me with this?
Thanks... :)
Solution 1:[1]
To get the name and value of the column with the highest absolute difference between the two rows, you can do:
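library(dplyr)
library(tidyr)

sumOutcome %>%
  # difference between the two rows for every column except `outcome`
  summarise(across(-outcome, diff)) %>%
  # reshape to one row per column, with columns `name` and `value`
  pivot_longer(cols = everything()) %>%
  # keep the row with the largest absolute difference
  slice(which.max(abs(value)))

#   name              value
#   <chr>             <dbl>
# 1 concave_pts_worst  436.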
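
The question does not show how `sumOutcome` was built, so here is a minimal sketch on made-up data. It assumes `sumOutcome` holds one row of feature means per outcome group. The `toy` data frame and its `feature_a`, `feature_b` and `feature_c` columns are hypothetical stand-ins for the brca features, not the real data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: an outcome column plus three numeric features.
# The values are invented purely for illustration.
toy <- tibble(
  outcome   = c("B", "B", "M", "M"),
  feature_a = c(1, 3, 10, 12),
  feature_b = c(5, 7, 6, 8),
  feature_c = c(100, 102, 90, 96)
)

# Assumed shape of `sumOutcome`: one row of feature means per group
# (rows are ordered B, then M).
sumOutcome <- toy %>%
  group_by(outcome) %>%
  summarise(across(everything(), mean))

# diff() on a two-element column returns second minus first,
# which here is mean(M) - mean(B).
best <- sumOutcome %>%
  summarise(across(-outcome, diff)) %>%
  pivot_longer(cols = everything()) %>%
  slice(which.max(abs(value)))

print(best)
```

In this toy example the group means differ by 9, 1 and -8, so the pipeline selects `feature_a`. Because `diff()` subtracts the first row from the second, the sign of `value` depends on the row order of the groups; taking `abs()` makes the selection independent of that order.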
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Ronak Shah |
