How to find a row in a dataframe whose values in 2 columns are closest to my own values in R?
For example, I have this dataframe:
| ID | height | price |
|---|---|---|
| 1 | 10 | 12 |
| 2 | 13 | 7 |
| 3 | 4 | 33 |
| 4 | 10 | 15 |
| 5 | 8 | 49 |
| 6 | 4 | 2 |
| 7 | 5 | 11 |
And I have my own values:
height = 11
price = 14
I want to locate the row where ID is 4, because its height and price are closest to my own values. How can I achieve this in R? I've tried some dplyr functions but have had no luck so far.
Solution 1:[1]
A possible solution, using dplyr and the Euclidean distance:
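library(tidyverse)
# Target values to match against
h <- 11
p <- 14
df <- data.frame(
  ID = c(1L, 2L, 3L, 4L, 5L, 6L, 7L),
  height = c(10L, 13L, 4L, 10L, 8L, 4L, 5L),
  price = c(12L, 7L, 33L, 15L, 49L, 2L, 11L)
)
df %>%
  # Euclidean distance from each row to the target point (h, p)
  mutate(dist = sqrt((height - h)^2 + (price - p)^2)) %>%
  # Keep the row(s) with the smallest distance
  slice_min(dist) %>%
  select(ID)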
#> ID
#> 1 4
Solution 2:[2]
Assuming you want to use the Euclidean distance, here is a quick way to do it. I am using the squared distance, since it is only needed for ranking and taking the square root would not change the order.
library(dplyr)
# `df` is the data frame from the question
df |>
  mutate(dist = (11 - height)^2 + (14 - price)^2) |>
  filter(dist == min(dist))
#>   ID height price dist
#> 1  4     10    15    2
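The reason the squared distance is enough can be checked directly: `sqrt()` is monotonically increasing, so it never changes which row comes first. The following is a minimal base R sketch, assuming the data from the question; the names `sq` and `eu` are only illustrative and do not come from the answer.

```r
# Data from the question, rebuilt here so the snippet runs on its own
df <- data.frame(
  ID = 1:7,
  height = c(10, 13, 4, 10, 8, 4, 5),
  price = c(12, 7, 33, 15, 49, 2, 11)
)

sq <- (df$height - 11)^2 + (df$price - 14)^2  # squared Euclidean distance
eu <- sqrt(sq)                                 # Euclidean distance

# Both measures rank the rows in the same order ...
identical(order(sq), order(eu))

# ... so they also agree on the closest row
df$ID[which.min(sq)]
df$ID[which.min(eu)]
```

Skipping the square root saves a little computation, but the `dist` column then holds squared units, which matters if the value is reported rather than only used for ranking.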
Solution 3:[3]
This function returns the index of the first row with the minimum Euclidean distance to the given point.
dat <- read.table(text = "
ID height price
1 10 12
2 13 7
3 4 33
4 10 15
5 8 49
6 4 2
7 5 11
", header = TRUE)
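choice <- function(x, height, price){
  # Euclidean distance between two numeric vectors
  d <- function(x, y) sqrt(sum((x - y)^2))
  # Distance from each row (ID column dropped) to the target point
  y <- apply(x[-1], 1, d, y = c(height, price))
  # Position of the first minimum
  which.min(y)
}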
choice(dat, 11, 14)
#> [1] 4
Created on 2022-03-23 by the reprex package (v2.0.1)
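Note that `which.min()` returns a row position, which matches the ID here only because the IDs happen to run from 1 to 7 in order. A hedged sketch of a variant that returns the matching row itself is shown below; `closest_row` and `shuffled` are hypothetical names introduced for illustration, not part of the original answer, and the distance is computed in vectorized form rather than with `apply()`.

```r
# Data from the question, rebuilt here so the snippet runs on its own
dat <- data.frame(
  ID = 1:7,
  height = c(10, 13, 4, 10, 8, 4, 5),
  price = c(12, 7, 33, 15, 49, 2, 11)
)

# Hypothetical helper: returns the first row closest to (height, price)
closest_row <- function(x, height, price) {
  # Squared Euclidean distance, computed for all rows at once
  d2 <- (x$height - height)^2 + (x$price - price)^2
  x[which.min(d2), , drop = FALSE]
}

closest_row(dat, 11, 14)$ID
#> [1] 4

# After reordering the rows, position and ID no longer coincide
shuffled <- dat[order(dat$price), ]
pos <- which.min((shuffled$height - 11)^2 + (shuffled$price - 14)^2)
pos
#> [1] 5
closest_row(shuffled, 11, 14)$ID
#> [1] 4
```

Returning the row rather than the position keeps the result correct when the data frame has been sorted or filtered, or when the IDs are not consecutive.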
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | PaulS |
| Solution 2 | Stefano Barbi |
| Solution 3 | Rui Barradas |
