Naive Bayes Algorithm

Naive Bayes classification algorithm from scratch, without using sklearn.

Naive Bayes is a probabilistic algorithm used for classification tasks. It is based on Bayes' theorem, which describes the probability of an event occurring based on prior knowledge of conditions that might be related to the event. The algorithm is called "naive" because it makes the simplifying assumption that the features used for classification are conditionally independent of each other given the class, which is often not the case in real-world scenarios.
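In symbols, for a class label y and features x₁, …, xₙ, this conditional-independence assumption lets Bayes' theorem reduce the posterior to a product of per-feature terms:

$$
P(y \mid x_1, \dots, x_n) \;\propto\; P(y)\prod_{i=1}^{n} P(x_i \mid y)
$$

where P(y) is the class prior and P(xᵢ | y) is the likelihood of feature xᵢ under class y.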

This implementation uses the Gaussian distribution to model the class-conditional distribution of each feature. The algorithm assumes that the features follow a normal distribution within each class, and estimates the per-feature mean and variance for each class from the training data. Given a new instance, the algorithm calculates the likelihood of each class using the estimated parameters, combines it with the class prior, and predicts the class with the highest resulting probability.
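Concretely, for a class y with estimated per-feature mean μ_{y,i} and variance σ²_{y,i}, each feature likelihood is modelled as a Gaussian density:

$$
P(x_i \mid y) = \frac{1}{\sqrt{2\pi\sigma_{y,i}^{2}}}\,\exp\!\left(-\frac{(x_i - \mu_{y,i})^{2}}{2\sigma_{y,i}^{2}}\right)
$$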

Implementation

The GaussianNB class provides methods for training the model and making predictions. The class has the following attributes (a minimal constructor sketch follows the list):

  • classes: an array containing the unique class labels observed in the training data.
  • class_priors: an array containing the prior probabilities of each class.
  • mean: a matrix containing the mean of the feature values for each class.
  • variance: a matrix containing the variance of the feature values for each class.
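
For orientation, a constructor holding these attributes might look like the sketch below; the names mirror the list above, but the actual initialisation in this repository may differ.

```python
import numpy as np


class GaussianNB:
    def __init__(self):
        self.classes = None       # unique class labels seen during fit
        self.class_priors = None  # prior probability P(y) for each class
        self.mean = None          # per-class, per-feature means
        self.variance = None      # per-class, per-feature variances
```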

The class has the following methods:

fit(X, y)

This method trains the Gaussian Naive Bayes model using the input data X and the corresponding target values y. The method estimates the prior probabilities, mean, and variance for each class based on the training data.
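A minimal sketch of what such a fit method might do, assuming NumPy arrays and the attributes listed above (not necessarily the exact code in this repository):

```python
import numpy as np


class GaussianNB:
    # __init__ as in the constructor sketch above

    def fit(self, X, y):
        X, y = np.asarray(X, dtype=float), np.asarray(y)
        self.classes = np.unique(y)
        n_classes, n_features = len(self.classes), X.shape[1]
        self.class_priors = np.zeros(n_classes)
        self.mean = np.zeros((n_classes, n_features))
        self.variance = np.zeros((n_classes, n_features))
        for idx, c in enumerate(self.classes):
            X_c = X[y == c]                                     # training rows of class c
            self.class_priors[idx] = X_c.shape[0] / X.shape[0]  # P(y = c)
            self.mean[idx] = X_c.mean(axis=0)                   # per-feature mean
            self.variance[idx] = X_c.var(axis=0)                # per-feature variance
        return self
```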

predict(X)

This method predicts the class labels of the input data X using the fitted model. For each instance in X, the method combines the class prior with the Gaussian likelihood of each class, and predicts the class with the highest resulting posterior probability.
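A sketch of this computation is shown below. The log of the posterior is used here purely for numerical stability, and a small epsilon guards against zero variances; the repository's code may compute raw likelihoods instead.

```python
import numpy as np


class GaussianNB:
    # fit as in the sketch above

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        predictions = []
        for x in X:
            log_posteriors = []
            for idx in range(len(self.classes)):
                var = self.variance[idx] + 1e-9          # avoid division by zero
                log_prior = np.log(self.class_priors[idx])
                # log of the product of per-feature Gaussian densities
                log_likelihood = -0.5 * np.sum(
                    np.log(2.0 * np.pi * var) + (x - self.mean[idx]) ** 2 / var
                )
                log_posteriors.append(log_prior + log_likelihood)
            # choose the class with the highest (log-)posterior
            predictions.append(self.classes[int(np.argmax(log_posteriors))])
        return np.array(predictions)
```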

confusion_matrix(X_test, y_test)

This method calculates the confusion matrix for the trained model given the test data and the corresponding true labels. The method returns a matrix whose entry (i, j) is the number of instances whose actual class was i and whose predicted class was j.
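A minimal sketch of such a confusion-matrix computation, assuming the predict method above:

```python
import numpy as np


class GaussianNB:
    # predict as in the sketch above

    def confusion_matrix(self, X_test, y_test):
        y_pred = self.predict(X_test)
        index = {label: i for i, label in enumerate(self.classes)}
        cm = np.zeros((len(self.classes), len(self.classes)), dtype=int)
        for actual, predicted in zip(np.asarray(y_test), y_pred):
            # row = actual class, column = predicted class
            cm[index[actual], index[predicted]] += 1
        return cm
```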

accuracy(X_test, y_test, threshold=0.5)

This method calculates the accuracy of the trained model on the test data. The method returns the number of correct predictions divided by the total number of predictions.
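A minimal sketch of this accuracy computation; the threshold parameter from the documented signature is kept for compatibility but left unused here, since its exact role is not described.

```python
import numpy as np


class GaussianNB:
    # confusion_matrix as in the sketch above

    def accuracy(self, X_test, y_test, threshold=0.5):
        # threshold is unused in this sketch
        y_pred = self.predict(X_test)
        return float(np.mean(y_pred == np.asarray(y_test)))
```

With the pieces above, typical usage would look like model = GaussianNB(), model.fit(X_train, y_train), then model.accuracy(X_test, y_test).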

Uses and Applications

Naive Bayes is a simple yet effective algorithm that can be used for classification tasks. It is often used in text classification tasks such as spam filtering, sentiment analysis, and document classification. It has also been applied to other domains such as image recognition, medical diagnosis, and fraud detection. Naive Bayes is particularly useful in situations where the number of features is large compared to the number of instances, as it can handle high-dimensional data efficiently.