K-Means clustering

General

K-means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters, where each observation belongs to the cluster with the nearest mean (the cluster center, or centroid), which serves as a prototype of the cluster. This results in a partitioning of the data space into Voronoi cells. K-means clustering minimizes within-cluster variances (squared Euclidean distances), not regular Euclidean distances; minimizing the latter is the more difficult Weber problem. The mean optimizes squared errors, whereas only the geometric median minimizes Euclidean distances. For instance, better Euclidean solutions can be found using k-medians and k-medoids.
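As a compact restatement of the paragraph above, the objective can be written as follows. The symbols are notation introduced here for illustration, not taken from this repository: S_1, ..., S_k are the clusters and \mu_i is the mean of the points in S_i.

$$
\underset{S_1,\dots,S_k}{\arg\min} \; \sum_{i=1}^{k} \sum_{x \in S_i} \lVert x - \mu_i \rVert^2,
\qquad
\mu_i = \frac{1}{|S_i|} \sum_{x \in S_i} x
$$

The norm is squared, which is why the mean is the optimal center for each cluster. Without the square, the optimal center would be the geometric median instead.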

The problem is computationally difficult (NP-hard); however, efficient heuristic algorithms converge quickly to a local optimum. These heuristics are usually similar to the expectation-maximization algorithm for mixtures of Gaussian distributions: both k-means and Gaussian mixture modeling use an iterative refinement approach.
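The iterative refinement idea can be sketched in a few lines of Python. This is a minimal illustration of the standard heuristic (often called Lloyd's algorithm), not the implementation in this repository; the function names, the random initialization, and the early-stop rule are assumptions made for the example. The `n_epochs` and `k` parameters only mirror the command-line arguments in spirit.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, n_epochs, seed=0):
    """Illustrative Lloyd-style k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumption: initialize centroids with k distinct points picked at random.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(n_epochs):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Stop early once the assignment no longer changes (local optimum).
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    centroids, labels = kmeans(data, k=2, n_epochs=10)
    print(centroids, labels)
```

Because the result depends on the initial centroids, the heuristic only guarantees a local optimum. A common remedy is to run it several times with different seeds and keep the solution with the lowest within-cluster sum of squares.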

Credits: Wikipedia

Usage

make

./bin/K-Means <dataset path> <N epochs> <K clusters>



eliazonta/K-Means
