
Build a program that, given a text message, says whether it is spam or not.


devGhiles/sms_spam_classifier


SMS Spam Classification Using a Naive Bayes Classifier

Dependencies

The program was written with Python 3.6.1 but should work with any Python 3 version. The required libraries are listed in requirements.txt. If you have pip installed, you can install them all by running the following command on the command line:

$ pip install -r requirements.txt

Description of the files

  • training.py: generates the model by training a Naive Bayes classifier.
  • main.py: classifies a text message as spam or not spam using the model produced by training.py.
  • spam.csv: the dataset used to train the classifier.

Generate the model

The first step is to generate the model by training a Naive Bayes classifier. To do so, run training.py:

$ python training.py

This will create two files: clf.pkl and vectorizer.pkl.
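To give a feel for what the training step involves, here is a minimal, dependency-free sketch of multinomial Naive Bayes with add-one smoothing. It is not the code in training.py: the toy messages, the `tokenize`, `train` and `predict` helpers, and the contents written to the two pickle files are all assumptions made for illustration. The real script reads spam.csv and may rely on the libraries listed in requirements.txt.

```python
import math
import pickle
import tempfile
from collections import Counter
from pathlib import Path


def tokenize(text):
    # Deliberately simple: lowercase and split on whitespace.
    return text.lower().split()


def train(messages, labels):
    """Fit multinomial Naive Bayes with add-one (Laplace) smoothing."""
    vocabulary = sorted({tok for msg in messages for tok in tokenize(msg)})
    word_counts = {label: Counter() for label in set(labels)}
    for msg, label in zip(messages, labels):
        word_counts[label].update(tokenize(msg))

    label_counts = Counter(labels)
    model = {"log_prior": {}, "log_likelihood": {}}
    for label, counts in word_counts.items():
        # Denominator: tokens seen in this class plus one pseudo-count per word.
        total = sum(counts.values()) + len(vocabulary)
        model["log_prior"][label] = math.log(label_counts[label] / len(labels))
        model["log_likelihood"][label] = {
            word: math.log((counts[word] + 1) / total) for word in vocabulary
        }
    return vocabulary, model


def predict(vocabulary, model, text):
    """Return the label with the highest log posterior; unknown words are ignored."""
    known = set(vocabulary)
    tokens = [tok for tok in tokenize(text) if tok in known]
    scores = {
        label: log_prior + sum(model["log_likelihood"][label][tok] for tok in tokens)
        for label, log_prior in model["log_prior"].items()
    }
    return max(scores, key=scores.get)


# Toy data standing in for spam.csv (invented for illustration).
messages = [
    "win a free prize now",
    "free cash claim now",
    "are we still meeting for lunch",
    "see you at home tonight",
]
labels = ["spam", "spam", "ham", "ham"]
vocabulary, model = train(messages, labels)

# Persist both artifacts. The file names mirror the ones mentioned above,
# but what this sketch stores in them is hypothetical.
out_dir = Path(tempfile.mkdtemp())
with open(out_dir / "vectorizer.pkl", "wb") as fh:
    pickle.dump(vocabulary, fh)
with open(out_dir / "clf.pkl", "wb") as fh:
    pickle.dump(model, fh)

print(predict(vocabulary, model, "free prize"))
print(predict(vocabulary, model, "see you at lunch"))
```

Working with log probabilities avoids numerical underflow when many small word probabilities are multiplied, and the add-one smoothing keeps a word that never appeared in one class from forcing that class's probability to zero.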

Classification of a new text message

To classify a new text message, run main.py with the path to a text file containing the message:

$ python main.py path/to/textfile
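The classification step can be pictured as: load the two saved objects, read the message from the given file, and pick the most probable label. The sketch below shows that flow with the standard library only. It is not the code in main.py: the `load_pickle` and `classify_file` helpers, the hand-written toy model, and the pickle contents are assumptions for illustration, and the real clf.pkl and vectorizer.pkl produced by training.py will most likely hold different objects.

```python
import math
import pickle
import tempfile
from pathlib import Path


def load_pickle(path):
    # Only unpickle files you trust: pickle can execute code on load.
    with open(path, "rb") as fh:
        return pickle.load(fh)


def classify_file(message_path, model_dir):
    """Read a message from a text file and return the most probable label."""
    vocabulary = load_pickle(Path(model_dir) / "vectorizer.pkl")
    model = load_pickle(Path(model_dir) / "clf.pkl")

    text = Path(message_path).read_text(encoding="utf-8")
    # Keep only words the model knows; unknown words carry no evidence.
    tokens = [tok for tok in text.lower().split() if tok in vocabulary]

    scores = {}
    for label, log_prior in model["log_prior"].items():
        log_likelihood = model["log_likelihood"][label]
        scores[label] = log_prior + sum(log_likelihood[tok] for tok in tokens)
    return max(scores, key=scores.get)


# Hypothetical stand-ins for the files written by the training step.
work_dir = Path(tempfile.mkdtemp())
toy_vocabulary = {"free", "prize", "lunch"}
toy_model = {
    "log_prior": {"spam": math.log(0.5), "ham": math.log(0.5)},
    "log_likelihood": {
        "spam": {"free": math.log(0.5), "prize": math.log(0.4), "lunch": math.log(0.1)},
        "ham": {"free": math.log(0.1), "prize": math.log(0.1), "lunch": math.log(0.8)},
    },
}
with open(work_dir / "vectorizer.pkl", "wb") as fh:
    pickle.dump(toy_vocabulary, fh)
with open(work_dir / "clf.pkl", "wb") as fh:
    pickle.dump(toy_model, fh)

# A message file playing the role of path/to/textfile.
message_file = work_dir / "message.txt"
message_file.write_text("Free prize inside", encoding="utf-8")

result = classify_file(message_file, work_dir)
print(result)
```

The important design point is that classification must apply the same text-to-features step as training, which is presumably why the training step saves two files rather than only the classifier.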

References

The dataset (the file spam.csv) was obtained from here.

For more details on how the model (multinomial Naive Bayes) was chosen, check out my Kaggle kernel here.
