Practical Considerations in K-Means Clustering

Lekha Priya
1 min read · Jan 30, 2020

Keep the following points in mind before you start building clusters to solve business problems.

The number of clusters into which you want to divide your data points, i.e. the value of K, has to be determined in advance.
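One common heuristic for picking K, shown here only as an illustration, is to run K-Means for several values of K and watch how the within-cluster sum of squared distances (often called inertia) falls. The sketch below is a minimal pure-Python version; the helper names `sq_dist` and `kmeans_inertia`, the toy points, and the choice of 10 random restarts are assumptions made for this example.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_inertia(points, k, seed, max_iter=100):
    """Run a basic K-Means from a seeded random start and return the inertia."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # pick k distinct data points as initial centers
    for _ in range(max_iter):
        # Assignment step: index of the nearest center for every point.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center becomes the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep the old center if a cluster is empty
        if new_centers == centers:
            break
        centers = new_centers
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

# Toy data (an assumption): three visually separate groups of four points.
points = [(1, 1), (1, 2), (2, 1), (2, 2),
          (8, 8), (8, 9), (9, 8), (9, 9),
          (1, 8), (1, 9), (2, 8), (2, 9)]

# For each K, keep the lowest inertia over 10 random restarts.
inertias = {k: min(kmeans_inertia(points, k, seed) for seed in range(10))
            for k in range(1, 7)}
for k, value in inertias.items():
    print(k, round(value, 2))
```

The value of K after which the inertia stops dropping sharply is a reasonable candidate. It is a judgment call rather than a rule, so the business context should still guide the final choice.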

The choice of the initial cluster centers can have an impact on the final cluster formation.
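To see how much the starting centers matter, the sketch below runs the same basic K-Means twice on four points that form a wide rectangle. The function names, the data, and the two sets of initial centers are assumptions chosen so that the effect is easy to follow by hand.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, init_centers, max_iter=100):
    """Basic K-Means from explicit initial centers; returns (centers, inertia)."""
    centers = list(init_centers)
    k = len(centers)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    inertia = sum(min(sq_dist(p, c) for c in centers) for p in points)
    return centers, inertia

# Four corners of a rectangle that is 4 units wide and 1 unit tall.
points = [(0, 0), (0, 1), (4, 0), (4, 1)]

# Start A: one center near each short side -> clusters split left/right.
_, good = kmeans(points, [(0, 0.5), (4, 0.5)])
# Start B: centers stacked in the middle -> clusters split bottom/top and stay that way.
_, bad = kmeans(points, [(2, 0), (2, 1)])

print(good, bad)  # 1.0 16.0
```

Both runs stop at a stable solution, but the second one is far worse. A simple defence is to run the algorithm from several different starts and keep the result with the lowest inertia.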

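The clustering process is very sensitive to the presence of outliers in the data, because each cluster center is the mean of its members and a single extreme point can pull that mean away from the rest of the cluster.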
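A small illustration of that sensitivity: a K-Means cluster center is the mean of the points assigned to it, so one extreme point can move it a long way. The `centroid` helper and the numbers below are made up for this example.

```python
def centroid(points):
    """Mean of a list of points, computed coordinate by coordinate."""
    return tuple(sum(c) / len(points) for c in zip(*points))

cluster = [(1, 1), (2, 1), (1, 2), (2, 2)]  # a tight group of four points
outlier = (50, 50)                           # one extreme observation

clean_center = centroid(cluster)
skewed_center = centroid(cluster + [outlier])

print(clean_center)   # center of the tight group
print(skewed_center)  # center after the outlier joins the cluster
```

With the outlier included, the center no longer sits near any of the original four points, which is why it is worth inspecting and treating outliers before clustering.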

Since the distance metric used in the clustering process is the Euclidean distance, you need to bring all your attributes onto the same scale; otherwise attributes with larger numeric ranges dominate the distance. This can be achieved through standardization.
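Standardization here means subtracting each attribute's mean and dividing by its standard deviation. The following is a minimal sketch using only the Python standard library; the `standardize` helper and the age/income rows are hypothetical, and it assumes no attribute is constant (a zero standard deviation would cause a division by zero).

```python
import math
from statistics import mean, pstdev

def standardize(rows):
    """Rescale every column to mean 0 and standard deviation 1 (z-scores)."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [pstdev(col) for col in columns]  # population standard deviation
    return [tuple((v - m) / s for v, m, s in zip(row, means, stds)) for row in rows]

# Hypothetical customers: (age in years, annual income in dollars).
rows = [(25, 30000), (45, 32000), (30, 90000), (50, 88000)]
scaled = standardize(rows)

# Before scaling, the distance between the first two customers is driven
# almost entirely by income, even though their ages differ by 20 years.
raw_distance = math.dist(rows[0], rows[1])
scaled_distance = math.dist(scaled[0], scaled[1])
print(raw_distance, scaled_distance)
```

After scaling, age and income contribute on comparable terms, so neither attribute decides the clusters just because of its units.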

The K-Means algorithm does not work directly with categorical data, because means and Euclidean distances are not meaningful for unordered categories.
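The problem is easy to see with a small example. Replacing categories with integer codes invents an ordering that does not exist, and the distances inherit it. One-hot encoding is a commonly used workaround, included here only as an assumption for illustration; it makes all categories equally far apart, although the resulting cluster means are still hard to interpret.

```python
import math

colors = ["red", "green", "blue"]

# Integer codes: red=0, green=1, blue=2 (an arbitrary order).
codes = {c: float(i) for i, c in enumerate(colors)}

# One-hot vectors: red=(1,0,0), green=(0,1,0), blue=(0,0,1).
one_hot = {c: tuple(1.0 if i == j else 0.0 for j in range(len(colors)))
           for i, c in enumerate(colors)}

# With integer codes, blue looks twice as far from red as green does.
d_codes_rg = abs(codes["red"] - codes["green"])
d_codes_rb = abs(codes["red"] - codes["blue"])

# With one-hot vectors, every pair of distinct colors is equally far apart.
d_hot_rg = math.dist(one_hot["red"], one_hot["green"])
d_hot_rb = math.dist(one_hot["red"], one_hot["blue"])

print(d_codes_rg, d_codes_rb)
print(d_hot_rg, d_hot_rb)
```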

The process may not converge within the maximum number of iterations allowed. You should always check for convergence.
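One way to make that check explicit is to have the clustering routine report whether the centers stopped moving before the iteration limit was reached. The sketch below is a minimal pure-Python version; the function signature, the returned `converged` flag, and the toy data are assumptions for this example.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, init_centers, max_iter=100):
    """Basic K-Means; returns (centers, iterations used, converged flag)."""
    centers = list(init_centers)
    k = len(centers)
    for n_iter in range(1, max_iter + 1):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[j])
        if new_centers == centers:
            return centers, n_iter, True   # centers stopped moving
        centers = new_centers
    return centers, max_iter, False        # iteration limit hit first

points = [(1, 1), (1, 2), (2, 1), (2, 2), (8, 8), (8, 9), (9, 8), (9, 9)]
start = [(1, 1), (1, 2)]  # deliberately poor start: both centers in one group

_, _, converged_short = kmeans(points, start, max_iter=1)
centers, n_iter, converged_long = kmeans(points, start, max_iter=100)

print(converged_short)                   # False: one iteration is not enough
print(centers, n_iter, converged_long)   # stable centers, found well within the limit
```

If you use scikit-learn's `KMeans`, a comparable check is to compare the fitted `n_iter_` attribute against the `max_iter` setting; a run that used every allowed iteration may not have converged.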
