Skip to content

hjian42/K-Means-and-K-Modes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 

Repository files navigation

Python Implementation of K-means clustering(PAM) and K-modes clustering(Huang)

A robust implementation of K-means and K-mode clustering algorithm with max-min normalization in Python to cluster continuous variables. K-means is for datasets with continous attributes and K-modes is for datasets with categorical attributes.

Download and Usage

To use this program you can download this package from Github and run the following command after you are under the directory of K-means:

	python kmeans.py glass.csv 2 glass.out
	
	python kmeans.py wine_data.csv 4 wine_data.out

The first argument can be any input file. The second argument is the k, the number of clusters we want the program to gather. The third argument is the name of any output file where each cluster and its centroids are written.

To run k-modes: (the dafault dataset is the mushroom.training dataset)

	python kmodes.py

The dependencies for the programs include (Python2.7):

  • pandas
  • numpy
  • sklearn

Other

Alpha version, so it might not be the best implementation.

About

Kmeans and kmodes implementation in Python

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages