I-DBSCAN

This is an implementation of IDBSCAN, which stands for Intersection DBSCAN.

DBSCAN

Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a classic clustering algorithm. Its major advantage is that it captures amorphous, arbitrarily shaped clusters, where other clustering algorithms, such as k-means, fail.

DBSCAN algorithm:
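
As a rough illustration, here is a minimal pure-Python sketch of the standard DBSCAN procedure. It is not the code in algorithms.py: the names `region_query` and `dbscan` are chosen here for illustration, and the brute-force neighbor search is an assumption made for brevity.

```python
import math

NOISE = -1  # label for points that belong to no cluster


def region_query(points, i, eps):
    """Return the indices of all points within eps of points[i] (including i)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point, keep expanding
    return labels


# Two tight groups of three points and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, min_pts=3))  # [0, 0, 0, 1, 1, 1, -1]
```

Each call to `region_query` scans every point, so this sketch is quadratic in the dataset size. That cost is what running DBSCAN on a reduced sample is meant to cut down.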

I-DBSCAN

Before applying DBSCAN, we locate the leaders with the improved Leader* algorithm. The goal of this approach is to reduce the cost of DBSCAN by running it on a reduced set of samples that represent the whole dataset. The main steps of a full IDBSCAN run are the following:

1. Apply Leader* to find the leaders and their corresponding followers, allowing each example to have more than one leader.
2. Sample from the intersected examples so that a leader that is dense in the original data remains dense in the resulting sub-data.
3. Apply DBSCAN to the sub-data (S_data), which contains both the leaders and the examples sampled in step 2.
4. Get the predictions for the leaders and pass each leader's prediction on to its followers.
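
The leader and follower bookkeeping in the first and last steps can be sketched as follows. This is a simplified illustration, not the Leader* implementation in this repository: it uses a plain single-pass leader selection, skips the sampling and DBSCAN steps entirely, and all function names (`find_leaders`, `assign_followers`, `propagate_labels`) are hypothetical. The `leader_labels` dictionary stands in for whatever labels DBSCAN would assign to the leaders.

```python
import math


def find_leaders(points, eps):
    """Single pass: a point becomes a leader if no existing leader is within eps."""
    leaders = []  # indices into points
    for i, p in enumerate(points):
        if all(math.dist(p, points[l]) > eps for l in leaders):
            leaders.append(i)
    return leaders


def assign_followers(points, leaders, eps):
    """Map each leader index to every point index within eps of it.

    A point lying in the overlap of two leaders is listed under both,
    which is what allows more than one leader per example.
    """
    return {
        l: [i for i, p in enumerate(points) if math.dist(p, points[l]) <= eps]
        for l in leaders
    }


def propagate_labels(leader_labels, followers, n_points):
    """Give each point the label of the first leader that claims it."""
    labels = [None] * n_points
    for l, members in followers.items():
        for i in members:
            if labels[i] is None:
                labels[i] = leader_labels[l]
    return labels


points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10)]
leaders = find_leaders(points, eps=2)
followers = assign_followers(points, leaders, eps=2)
print(leaders)    # [0, 3, 4]
print(followers)  # {0: [0, 1, 2], 3: [1, 2, 3], 4: [4]}

# Assumed output of the clustering step: leaders 0 and 3 share a cluster, 4 is noise.
leader_labels = {0: 0, 3: 0, 4: -1}
print(propagate_labels(leader_labels, followers, len(points)))  # [0, 0, 0, 0, -1]
```

In this toy example, points 1 and 2 follow both leader 0 and leader 3. Such shared followers are the "intersected" examples that the sampling step draws from.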

To run the code, open main.py and adjust the following parameters as needed:

- The dataset to use, selected by number: "abalone" - 0, "mushroom" - 1, "pendigit" - 2, "letter" - 3, "cadata" - 4, "sensorless" - 5, "shuttle" - 6.
- The algorithms to run. Several or all of them can be run at once: "IDBSCAN", "DBSCAN", "stdbscan", "hdbscan", "leader".
- flag_save - True if you wish to save the IDBSCAN clustering to a txt file.
- path - the output path, used when flag_save == True.
- verbose - True if you want more detailed results and tracking of the execution.
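
As an illustration of how these settings fit together, here is a small hypothetical helper that validates a configuration. Only `flag_save`, `path`, and `verbose` are names taken from this README. `DATASETS`, `ALGORITHMS`, and `build_config` are assumptions for this sketch and may not match what main.py actually defines.

```python
# Hypothetical sketch; the real variable names in main.py may differ.
DATASETS = {
    "abalone": 0, "mushroom": 1, "pendigit": 2, "letter": 3,
    "cadata": 4, "sensorless": 5, "shuttle": 6,
}
ALGORITHMS = ["IDBSCAN", "DBSCAN", "stdbscan", "hdbscan", "leader"]


def build_config(dataset, algorithms, flag_save=False, path=None, verbose=False):
    """Check the settings and return them as a dictionary."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    unknown = [a for a in algorithms if a not in ALGORITHMS]
    if unknown:
        raise ValueError(f"unknown algorithms: {unknown}")
    if flag_save and not path:
        raise ValueError("path is required when flag_save is True")
    return {
        "dataset": DATASETS[dataset],  # numeric index, as listed above
        "algorithms": list(algorithms),
        "flag_save": flag_save,
        "path": path,
        "verbose": verbose,
    }


config = build_config("letter", ["IDBSCAN", "DBSCAN"], verbose=True)
print(config["dataset"])  # 3
```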

#### Please refer to the "Report.pdf" file for deeper explanations of each algorithm, its implementation, and the experimental results.

Citing

Luchi, Diego, Alexandre L. Rodrigues and Flávio Miguel Varejão. “Sampling approaches for applying DBSCAN to large datasets.” Pattern Recognit. Lett. 117 (2019): 90-96.

@article{Luchi2019SamplingAF, title={Sampling approaches for applying DBSCAN to large datasets}, author={Diego Luchi and Alexandre L. Rodrigues and Fl{\'a}vio Miguel Varej{\~a}o}, journal={Pattern Recognit. Lett.}, year={2019}, volume={117}, pages={90-96} }
