How to manually set K-means cluster centers?

I don't want to estimate the centers; I want to assign each object to an already defined center. How can I do that?



Solution 1:[1]

You can fit your KMeans to the desired cluster centers, and then use this model to predict your data.

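from sklearn.cluster import KMeans

cluster_centers = [[1, 1], [0, 0]]
data = [[1, 2], [1, 1], [3, 1], [10, -1]]

# Fit on the centers themselves: with one sample per cluster,
# each fitted centroid coincides with one of the given centers.
kmeans = KMeans(n_clusters=2)
kmeans.fit(cluster_centers)
kmeans.cluster_centers_
> array([[0., 0.],
         [1., 1.]])

# Assign each data point to its nearest fitted center.
kmeans.predict(data)
> array([1, 1, 1, 1])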

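Note: n_clusters has to match the number of your cluster centers. The row order of kmeans.cluster_centers_ may differ from the order in which you supplied the centers (as in the output above), so interpret the predicted labels as indices into kmeans.cluster_centers_, not into your original list.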
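
If no fitting is wanted at all, the assignment step can also be done directly by computing distances to the given centers. The following is a minimal sketch, assuming Euclidean distance (the distance KMeans uses); the helper name assign_to_centers is hypothetical and not part of scikit-learn. Here the labels are indices into the centers in the order you pass them.

```python
import numpy as np


def assign_to_centers(points, centers):
    """Return, for each point, the index of its nearest center (Euclidean)."""
    points = np.asarray(points, dtype=float)
    centers = np.asarray(centers, dtype=float)
    # Pairwise differences, shape (n_points, n_centers, n_features).
    diffs = points[:, None, :] - centers[None, :, :]
    # Euclidean distance from every point to every center.
    distances = np.linalg.norm(diffs, axis=2)
    # Index of the closest center for each point.
    return distances.argmin(axis=1)


cluster_centers = [[1, 1], [0, 0]]
data = [[1, 2], [1, 1], [3, 1], [10, -1]]

labels = assign_to_centers(data, cluster_centers)
print(labels)  # [0 0 0 0]: every point is closer to [1, 1] than to [0, 0]
```

Because nothing is fitted, the centers stay exactly as defined.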

Solution 2:[2]

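Define an array of shape (n_clusters, n_features) containing your desired centers and pass it to KMeans as the init parameter. The class signature and the basic example from the scikit-learn documentation follow; note that in that example X is the training data and the default init='k-means++' is used: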

class sklearn.cluster.KMeans(n_clusters=8, *, init='k-means++', n_init=10, max_iter=300, tol=0.0001, verbose=0, random_state=None, copy_x=True, algorithm='auto')

>>> from sklearn.cluster import KMeans
>>> import numpy as np
>>> X = np.array([[1, 2], [1, 4], [1, 0],
...               [10, 2], [10, 4], [10, 0]])
>>> kmeans = KMeans(n_clusters=2, random_state=0).fit(X)
>>> kmeans.labels_
array([1, 1, 1, 0, 0, 0], dtype=int32)
>>> kmeans.predict([[0, 0], [12, 3]])
array([1, 0], dtype=int32)
>>> kmeans.cluster_centers_
array([[10.,  2.],
       [ 1.,  2.]])

For more information: https://scikit-learn.org/stable/modules/generated/sklearn.cluster.KMeans.html
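
The documentation example above does not actually pass init. A minimal sketch of doing so, assuming the same data as above and a hypothetical array initial_centers chosen for illustration, could look like this. Note that init only sets the starting point: fit still moves the centers toward the means of the points assigned to them.

```python
import numpy as np
from sklearn.cluster import KMeans

X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# Hypothetical starting centers, one row per cluster.
initial_centers = np.array([[0.0, 0.0], [12.0, 3.0]])

# n_init=1 because an explicit init array leaves nothing to re-draw.
kmeans = KMeans(n_clusters=2, init=initial_centers, n_init=1).fit(X)

labels = kmeans.labels_
centers = kmeans.cluster_centers_

print(labels)   # [0 0 0 1 1 1]: cluster i grows from initial_centers[i]
print(centers)  # the fitted centers are the cluster means, not the initial values
```

With an explicit init, cluster index i corresponds to row i of the initial array, which keeps the label numbering predictable.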

Solution 3:[3]

One way to approach this would be to use the n_init and random_state parameters of the sklearn.cluster.KMeans class, like this:

from sklearn.cluster import KMeans

c = KMeans(n_init=1, random_state=1)

This does two things:

1) random_state=1 seeds the random number generator used for centroid initialization. This isn't the same thing as selecting the coordinates of the centroids you want, but it does make the initialization reproducible.

2) n_init=1 runs the algorithm once, from a single initialization, instead of repeating it with different centroid seeds and keeping the best result. It does not limit the number of iterations within a run; that is controlled by max_iter.

You can additionally select the number of centroids you want created by using the n_clusters parameter.

From here, fitting and predicting will assign points to clusters reproducibly, although the centers are still computed from the data rather than set by you.
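
To illustrate what the seed does and does not control, here is a small sketch, assuming a toy data set X invented for illustration. Two models built with the same random_state end up with the same centers, but those centers are whatever the algorithm finds in the data.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy data invented for this illustration.
X = np.array([[1, 2], [1, 4], [1, 0],
              [10, 2], [10, 4], [10, 0]])

# Same seed, single initialization: both runs start from the same centroids.
km_a = KMeans(n_clusters=2, n_init=1, random_state=1).fit(X)
km_b = KMeans(n_clusters=2, n_init=1, random_state=1).fit(X)

# The fitted centers agree between the two runs.
same = np.allclose(km_a.cluster_centers_, km_b.cluster_centers_)
print(same)
```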

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 TheRibosome
Solution 2 Zahra Safari-d
Solution 3 user6275647