'Threshold in hierarchical clustering

I have a dataset that generates the chart attached at the end of the post. I want to cluster the data. Visually I identify 4 different clusters. As this data may change and so the number of clusters I'm using a hierarchical clustering in which I don't specify in advance the number of clusters to do.

The code

thresh = 1.00
clusters = hcluster.fclusterdata(df[['time', 'mean']].to_numpy(), thresh, criterion="distance")

Produces more than 12000 different clusters while when thresh is 1.02 (an increase of 0.02 points) gives only 1 cluster.

Can someone help to understand where is the problem in my code and how should I do in order to identify the 4 data groups that it is possible to visually observe.

enter image description here

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution	Source

'Threshold in hierarchical clustering

Sources

Related Questions