Unique combinations by group
I have the following data frame with three variables (Location, Latitude, and Longitude) within each group. I would like to calculate the Euclidean distance between all unique location combinations within each group. For instance, based on the data frame below, for group A that means the distances between (A - London and A - Zurich), (A - Zurich and A - New York), and (A - New York and A - London). Similarly, for group B: (B - New York and B - London).
The average of all these unique pairwise distances then needs to be calculated for each group.
euc_dist <- function(x1, x2){
return(sqrt(sum((x1 - x2)^2)))
}
id Group Location Latitude Longitude
1 A London 1 2
2 A New York 3 4
3 A Zurich 5 6
4 B New York 7 8
5 B New York 9 10
6 B London 11 12
The output should look like:
id Group Average Euclidean distance
1 A xx
2 B xx
Thank you in advance!
Solution 1:[1]
Here's a dplyr solution:
library(dplyr)
data.frame(Group=gl(2, 3, labels = c("A", "B")),
Latitude=seq(1, 11, 2),
Longitude=seq(2, 12, 2)) %>%
group_by(Group) %>%
summarise(mean_dist=mean(dist(cbind(Latitude, Longitude))))
(R's dist function calculates Euclidean distance by default and does so very efficiently.) The result:
# A tibble: 2 x 2
Group mean_dist
<fct> <dbl>
1 A 3.77
2 B 3.77
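As a quick sanity check, here is a minimal base R sketch showing that dist gives the same pairwise values as the hand-rolled euc_dist from the question. It uses only the group A coordinates, and the names coords, pairs, manual, and builtin are illustrative, not from the original post:

```r
# Hand-rolled Euclidean distance, as defined in the question
euc_dist <- function(x1, x2) sqrt(sum((x1 - x2)^2))

# Coordinates for group A, one row per location
coords <- cbind(Latitude = c(1, 3, 5), Longitude = c(2, 4, 6))

# All unordered row pairs: (1,2), (1,3), (2,3)
pairs <- combn(nrow(coords), 2)
manual <- apply(pairs, 2, function(p) euc_dist(coords[p[1], ], coords[p[2], ]))

# dist() stores the lower triangle column by column, so the pair order matches
builtin <- as.numeric(dist(coords))

all.equal(manual, builtin)  # TRUE
mean(manual)                # about 3.77, matching the output above
```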
I'm not totally clear on what "unique locations" means, since each location should have only a single latitude and longitude, correct?
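If the same location can appear more than once within a group (as New York does in group B of the question's data), one possible reading is to collapse each Group/Location pair to a single point before taking distances. The sketch below does that in base R by averaging the coordinates of repeated locations. That averaging step is an assumption on my part, not something the question specifies, and the names df, locs, and res are illustrative:

```r
# The question's data, including the Location column
df <- data.frame(
  Group     = rep(c("A", "B"), each = 3),
  Location  = c("London", "New York", "Zurich", "New York", "New York", "London"),
  Latitude  = seq(1, 11, 2),
  Longitude = seq(2, 12, 2)
)

# Assumption: repeated locations within a group are averaged into one point
locs <- aggregate(cbind(Latitude, Longitude) ~ Group + Location,
                  data = df, FUN = mean)

# Mean pairwise Euclidean distance per group
res <- sapply(split(locs[c("Latitude", "Longitude")], locs$Group),
              function(g) mean(dist(g)))

data.frame(Group = names(res), mean_dist = unname(res))
```

With this reading, group A is unchanged (about 3.77), while group B reduces to a single New York - London pair (about 4.24). If you would rather keep the first occurrence of each location, or treat differing coordinates as a data error, swap out the aggregate step accordingly.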
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Dubukay |
