Hierarchical Clustering related

I have a question about hierarchical clustering in R. I am trying to find out what values the dist.metric argument of the hclust function takes. I wondered whether it works like "metric" in Python, but when I tried "cosine" the result was an error. So what values does this argument accept? An example would be appreciated.



Solution 1:[1]

hclust does not take a distance metric. Its method argument selects the agglomeration (linkage) method, which the documentation describes as:

the agglomeration method to be used. This should be (an unambiguous abbreviation of) one of "ward.D", "ward.D2", "single", "complete", "average" (= UPGMA), "mcquitty" (= WPGMA), "median" (= WPGMC) or "centroid" (= UPGMC).

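This means you can use, for example, "average". The distance metric itself is chosen one step earlier, in dist(), whose method argument accepts "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Here is an example: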

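# 99 values fill a 3-row matrix evenly (100 is not a multiple of 3)
data <- matrix(rnorm(99), nrow = 3)
# The distance metric is chosen here, in dist()
d_m <- dist(data, method = "maximum")
# The linkage method is chosen here, in hclust()
hclust(d_m, method = "average")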

Call:
hclust(d = d_m, method = "average")

Cluster method   : average 
Distance         : maximum 
Number of objects: 3 

As you can see, the distance is set in dist(), and "cosine" is not among the methods it supports.

Passing dist.metric to hclust fails:

hclust(d_m, method="average", dist.metric = "cosine")    
Error in hclust(d_m, method = "average", dist.metric = "cosine") : 
  unused argument (dist.metric = "cosine")

So there is no argument called dist.metric.
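
If you still want to cluster on cosine distance, one possible workaround is to compute the dissimilarity matrix yourself and hand it to hclust as a "dist" object. The following is only a minimal sketch in base R: it assumes that 1 minus the cosine similarity between rows is the dissimilarity you want, and the names norms, cos_sim, d_cos and hc are made up for illustration.

```r
set.seed(1)
data <- matrix(rnorm(99), nrow = 3)

# Euclidean norm of each row
norms <- sqrt(rowSums(data^2))

# Cosine similarity between rows: dot products divided by the product of norms
cos_sim <- (data %*% t(data)) / outer(norms, norms)

# Assumption: use 1 - similarity as the dissimilarity,
# then convert the square matrix to a "dist" object for hclust()
d_cos <- as.dist(1 - cos_sim)

hc <- hclust(d_cos, method = "average")
print(hc)
```

Because hclust only sees the resulting dissimilarities, any measure you can put into a symmetric matrix can be used in the same way.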

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow