amoazeni75/fuzzy-C-mean-clustering

Fuzzy C-Mean Clustering Algorithm

This project implements the fuzzy c-means clustering algorithm in Java. We first give a brief overview of the algorithm's steps, then describe the fields and methods used in the implementation.
For more details, see: https://home.deib.polimi.it/matteucc/Clustering/tutorial_html/cmeans.html

Fuzzy C-Mean Algorithm Steps

  1. Initialize the membership values
  2. Calculate the centroid values
  3. Update the membership values
  4. Check convergence

Steps 2 through 4 run in a loop. The algorithm stops after a specified number of iterations or once the error condition is satisfied.
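
For reference, the update rules behind steps 2 to 4 can be written out as below. This is the usual textbook form of fuzzy c-means, not something taken from the repository's code. The symbols follow this README's naming (n data points x_i, m clusters, fuzziness M > 1, membership matrix U with entries u_ij); the centroid symbol c_j and the iteration index t are introduced here for illustration. Which norm the code uses in the stopping test is assumed from the description of checkConvergence.

```latex
% Step 2: centroid of cluster j as a membership-weighted mean
c_j = \frac{\sum_{i=1}^{n} u_{ij}^{M}\, x_i}{\sum_{i=1}^{n} u_{ij}^{M}}

% Step 3: membership of point i in cluster j, from relative distances
u_{ij} = \frac{1}{\sum_{k=1}^{m}
         \left( \frac{\lVert x_i - c_j \rVert}{\lVert x_i - c_k \rVert} \right)^{\frac{2}{M-1}}}

% Step 4: stop once the membership matrix changes by less than epsilon
\lVert U^{(t+1)} - U^{(t)} \rVert < \varepsilon
```

A larger M spreads each point's membership more evenly across clusters, while values close to 1 approach a hard assignment.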

Source code guide

The source code contains a class named "FuzzyClustering" with several fields and methods, briefly described below.
Here are the fields of the FuzzyClustering.java class:

U matrix
- matrix of membership values with n * m dimensions (n = dataset size, m = number of clusters)
iteration
- number of iterations for which the algorithm performs its calculation
fuzziness
- value of the parameter M in the c-means formula
epsilon
- error threshold between the current membership values and those of the previous step

Next, we describe the arguments and functionality of the methods.

createRandomData
- takes the dataset size, the min and max of the range, and the number of clusters, and generates random numbers with a Gaussian distribution
assignInitialMembership
- initializes the membership values of the data
calculateClusterCenters
- calculates the centroid values
updateMembershipValues
- updates the membership values based on the current centroid values
checkConvergence
- calculates the 2-norm of the difference between the current U matrix and the previous U matrix
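
As a rough illustration of how the methods listed above could fit together, here is a minimal sketch. It is not the repository's FuzzyClustering.java. The class name `FcmSketch`, the `run` helper, every method signature, the restriction to one-dimensional data, and the small distance clamp are all assumptions made for this example. Only the method names and the roles of `fuzziness`, `epsilon`, and `iteration` come from the descriptions above.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical sketch class; signatures and 1-D data are assumptions.
final class FcmSketch {

    // Step 1: random memberships, each row normalized to sum to 1.
    static double[][] assignInitialMembership(int n, int m, Random rnd) {
        double[][] u = new double[n][m];
        for (double[] row : u) {
            double sum = 0.0;
            for (int j = 0; j < m; j++) {
                row[j] = rnd.nextDouble() + 1e-9; // avoid an all-zero row
                sum += row[j];
            }
            for (int j = 0; j < m; j++) row[j] /= sum;
        }
        return u;
    }

    // Step 2: each center is the mean of the data weighted by u^fuzziness.
    static double[] calculateClusterCenters(double[] x, double[][] u, double fuzziness) {
        int m = u[0].length;
        double[] centers = new double[m];
        for (int j = 0; j < m; j++) {
            double num = 0.0, den = 0.0;
            for (int i = 0; i < x.length; i++) {
                double w = Math.pow(u[i][j], fuzziness);
                num += w * x[i];
                den += w;
            }
            centers[j] = num / den;
        }
        return centers;
    }

    // Step 3: recompute memberships from distances to the current centers.
    static double[][] updateMembershipValues(double[] x, double[] centers, double fuzziness) {
        int m = centers.length;
        double exponent = 2.0 / (fuzziness - 1.0);
        double[][] u = new double[x.length][m];
        for (int i = 0; i < x.length; i++) {
            for (int j = 0; j < m; j++) {
                // Clamp distances so a point sitting on a center does not divide by zero.
                double dij = Math.max(Math.abs(x[i] - centers[j]), 1e-12);
                double sum = 0.0;
                for (int k = 0; k < m; k++) {
                    double dik = Math.max(Math.abs(x[i] - centers[k]), 1e-12);
                    sum += Math.pow(dij / dik, exponent);
                }
                u[i][j] = 1.0 / sum;
            }
        }
        return u;
    }

    // Step 4: 2-norm (Frobenius) of the difference between two membership matrices.
    static double checkConvergence(double[][] previous, double[][] current) {
        double sumSq = 0.0;
        for (int i = 0; i < current.length; i++) {
            for (int j = 0; j < current[i].length; j++) {
                double d = current[i][j] - previous[i][j];
                sumSq += d * d;
            }
        }
        return Math.sqrt(sumSq);
    }

    // Repeats steps 2 to 4 until the change drops below epsilon or the cap is reached.
    static double[] run(double[] x, int m, double fuzziness, double epsilon, int iteration, long seed) {
        double[][] u = assignInitialMembership(x.length, m, new Random(seed));
        double[] centers = calculateClusterCenters(x, u, fuzziness);
        for (int it = 0; it < iteration; it++) {
            double[][] next = updateMembershipValues(x, centers, fuzziness);
            double change = checkConvergence(u, next);
            u = next;
            centers = calculateClusterCenters(x, u, fuzziness);
            if (change < epsilon) break;
        }
        return centers;
    }

    public static void main(String[] args) {
        double[] x = {1.0, 1.2, 0.8, 9.0, 9.3, 8.7}; // two well-separated groups
        double[] centers = run(x, 2, 2.0, 1e-5, 100, 42L);
        Arrays.sort(centers);
        System.out.println(Arrays.toString(centers));
    }
}
```

Returning a fresh membership matrix from `updateMembershipValues` keeps the previous matrix available for the convergence check. An implementation that updates U in place would need to copy it first.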

After the algorithm runs, two files are generated: "data_set.csv" and "cluster_center.csv", containing the random data and the calculated centroids, respectively.
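
The sketch below shows one way Gaussian test data might be generated and written to such a CSV file. The class `CsvSketch`, the `writeCsv` helper, the `createRandomData` signature, the one-value-per-line layout, and the choice of standard deviation are all hypothetical. The repository's actual file format is not described here.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical sketch class; the real file layout may differ.
final class CsvSketch {

    // Draws `size` one-dimensional values around `clusters` random means in [min, max].
    static double[] createRandomData(int size, double min, double max, int clusters, Random rnd) {
        double[] means = new double[clusters];
        for (int j = 0; j < clusters; j++) {
            means[j] = min + rnd.nextDouble() * (max - min);
        }
        double spread = (max - min) / (10.0 * clusters); // assumed standard deviation
        double[] data = new double[size];
        for (int i = 0; i < size; i++) {
            // Points are assigned to the means in round-robin order.
            data[i] = means[i % clusters] + rnd.nextGaussian() * spread;
        }
        return data;
    }

    // Writes one value per line (assumed layout).
    static void writeCsv(Path file, double[] values) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double v : values) {
            lines.add(Double.toString(v));
        }
        Files.write(file, lines);
    }
}
```

A call such as `CsvSketch.writeCsv(java.nio.file.Paths.get("data_set.csv"), data)` would then write the generated values, and the same helper could write the centroids to "cluster_center.csv".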
