Rjaat/Cluster-Analysis

Proximity Measures & Cluster Analysis

Statistical approach to Cluster Analysis in Data Mining using R

Clustering:

  • Clustering is a data mining technique that partitions a set of data objects into groups, so that objects in the same group are more similar to one another than to objects in other groups.
  • Each such group is called a cluster.
  • In cluster analysis, the grouping is driven by the similarity (or distance) between data objects.
  • Once the data has been divided into groups, each group can be assigned a label. Because the groups are derived from the data rather than fixed in advance, the grouping can adapt when the data changes.
  • The process of dividing data into these groups is known as cluster analysis.
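As a minimal sketch of this idea, the snippet below groups observations with k-means from base R. The `iris` data set, the choice of three clusters, and the standardization step are assumptions made for illustration only; the README does not say which data or algorithm the report uses.

```r
# Illustrative only: iris and k = 3 are assumptions, not the report's setup.
data(iris)

# Standardize the four numeric columns so no attribute dominates the distance.
x <- scale(iris[, 1:4])

# k-means depends on random starting centers; fix the seed and use several starts.
set.seed(42)
km <- kmeans(x, centers = 3, nstart = 25)

# Cluster sizes, and how the clusters line up with the known species labels.
print(km$size)
print(table(cluster = km$cluster, species = iris$Species))
```

The species column is used only to inspect the result afterwards; the clustering itself sees just the numeric attributes.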

Cluster Analysis, Why?

  • As a data mining function, cluster analysis is a tool for gaining insight into how the data is distributed and for examining the characteristics of each cluster.
  • It makes content analysis easier.
  • It is widely used in market research, pattern recognition, data analysis, and image processing.
  • Identification of areas of similar land use in an earth observation database.
  • Classifying documents on the web for information discovery.
  • Outlier detection, such as detection of credit card fraud.
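The outlier-detection use case can be sketched with the `dbscan` package listed below, which labels points that belong to no dense region as noise (cluster `0`). The synthetic data and the `eps` and `minPts` values are assumptions chosen for illustration, and the call is guarded in case the package is not installed.

```r
# Made-up data: two tight blobs plus three isolated points far from both.
set.seed(1)
blob1 <- matrix(rnorm(100, mean = 0, sd = 0.3), ncol = 2)
blob2 <- matrix(rnorm(100, mean = 5, sd = 0.3), ncol = 2)
stray <- rbind(c(10, -10), c(-10, 10), c(15, 15))
x <- rbind(blob1, blob2, stray)   # rows 101 to 103 are the isolated points

if (requireNamespace("dbscan", quietly = TRUE)) {
  # eps and minPts are illustrative guesses, not tuned values.
  db <- dbscan::dbscan(x, eps = 1, minPts = 5)

  # DBSCAN marks noise points with cluster id 0; treat them as outlier candidates.
  outliers <- which(db$cluster == 0)
  print(outliers)
}
```

In practice `eps` is usually chosen by inspecting a k-nearest-neighbour distance plot rather than picked by hand.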

Libraries/Packages Used:

  • R :
    • tidyverse
    • ggplot2
    • cluster
    • factoextra
    • dbscan

Tasks Performed:

  • Data Visualization
  • Data Cleaning
  • Proximity Measures
  • Clustering
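The last two tasks can be sketched together: compute a proximity (distance) matrix, then cluster on it. The example uses only base R (`dist`, `hclust`, `cutree`); the five-object data set, the average-linkage method, and the cut at two clusters are assumptions for illustration, not the report's actual choices.

```r
# Small made-up data set: 5 objects described by 2 numeric attributes.
x <- matrix(c(1, 2,
              2, 2,
              8, 9,
              9, 8,
              1, 1),
            ncol = 2, byrow = TRUE,
            dimnames = list(paste0("obj", 1:5), c("a", "b")))

# Proximity measures: pairwise Euclidean and Manhattan distances.
d_euc <- dist(x, method = "euclidean")
d_man <- dist(x, method = "manhattan")
print(round(as.matrix(d_euc), 2))

# Hierarchical clustering on the Euclidean distances, cut into 2 groups.
hc <- hclust(d_euc, method = "average")
groups <- cutree(hc, k = 2)
print(groups)
```

Swapping `d_euc` for `d_man` in the `hclust` call shows how the choice of proximity measure can change the resulting tree.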

Note:

For more detail, see the attached report, 'ClusterAnalysis_Report.pdf'.
