This project uses the Multinomial Naive Bayes classifier to enhance movie genre classification based on metadata such as descriptions and ratings. Utilizing a dataset from Kaggle, it aims to improve content recommendation systems through accurate genre prediction.

SreecharanV/Utilizing-Multinomial-Naive-Bayes-for-Enhanced-Movie-Genre-Classification-and-Analysis

Netflix Movie Genre Classification

This repository contains a project focused on classifying movie genres using the Multinomial Naive Bayes classifier. The goal is to improve genre classification accuracy, thus enhancing recommendation systems for streaming platforms like Netflix.

Table of Contents

  1. Introduction
  2. Dataset
  3. Model
  4. Results
  5. References

Introduction

In the rapidly evolving multimedia entertainment industry, classifying movies into genres is a challenging yet essential task for effective user recommendations and content management. This project approaches the problem by applying the Multinomial Naive Bayes classifier to a diverse movie dataset.

Figure: Workflow

Dataset

The dataset used in this project is sourced from Kaggle. It contains a comprehensive collection of movie and TV show data from Netflix, including titles, directors, actors, country, year, and descriptions.

Model

The Multinomial Naive Bayes classifier is well suited to this project because it is effective for text classification and works with discrete features such as word counts. The model is trained on movie descriptions, which are converted into numerical features using Term Frequency-Inverse Document Frequency (TF-IDF) vectorization. Although the multinomial model is formally defined for integer counts, fractional TF-IDF weights also work in practice.
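As a minimal sketch of the approach described above, the following trains a TF-IDF plus Multinomial Naive Bayes pipeline with scikit-learn. The descriptions and genre labels are invented for illustration and are not taken from the Kaggle dataset; the repository's actual preprocessing and parameters may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy descriptions and genre labels, invented for illustration only.
descriptions = [
    "A detective hunts a serial killer through the city",
    "Two friends go on a hilarious road trip",
    "A police officer investigates a murder case",
    "A comedian tries funny jokes at a wedding",
]
genres = ["Thriller", "Comedy", "Thriller", "Comedy"]

# TF-IDF turns each description into a weighted term vector;
# MultinomialNB then learns per-genre term distributions from those vectors.
model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
model.fit(descriptions, genres)

# Classify an unseen description.
predicted = model.predict(["A detective investigates a murder"])
print(predicted[0])
```

Wrapping both steps in one pipeline means the vectorizer is fitted only on the training text, so the same vocabulary is reused when new descriptions are classified.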

Environment Setup

  • Python: 3.8 or newer
  • Libraries:
    • Pandas
    • NumPy
    • Scikit-learn
    • Matplotlib
    • Seaborn
    • Jupyter Notebook

Results

The model was evaluated using accuracy, precision, recall, and F1-score. The Multinomial Naive Bayes classifier showed high precision in genre classification and outperformed other classifiers such as Decision Trees, K-Nearest Neighbors, and Support Vector Machines.
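The metrics named above could be computed with scikit-learn along the following lines. This is a sketch only: the true and predicted labels are invented, and macro averaging is an assumption, since the averaging scheme used in the project is not stated.

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Invented true and predicted genres for six titles (illustration only).
y_true = ["Comedy", "Drama", "Drama", "Thriller", "Comedy", "Drama"]
y_pred = ["Comedy", "Drama", "Comedy", "Thriller", "Comedy", "Drama"]

# Fraction of titles whose predicted genre matches the true genre.
accuracy = accuracy_score(y_true, y_pred)

# Macro averaging (an assumption here) weights every genre equally,
# regardless of how many titles it has.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="macro", zero_division=0
)

print(f"accuracy={accuracy:.3f} precision={precision:.3f} "
      f"recall={recall:.3f} f1={f1:.3f}")
```

With imbalanced genres, macro and weighted averages can differ noticeably, so it helps to state which one a reported score uses.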

Visualization

Visualizations such as confusion matrices, bar plots, and line graphs were used to illustrate the model's performance. These visualizations help in understanding the strengths and limitations of the classifier.

Confusion Matrix

Figures: confusion matrices CM1 and CM2
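For readers who want to reproduce this kind of figure, a confusion matrix can be computed with scikit-learn as sketched below. The labels are invented for illustration and do not reflect the project's results.

```python
from sklearn.metrics import confusion_matrix

# Invented true and predicted genres (illustration only).
y_true = ["Comedy", "Drama", "Drama", "Thriller", "Comedy", "Drama"]
y_pred = ["Comedy", "Drama", "Comedy", "Thriller", "Comedy", "Drama"]

# Fix the label order so rows and columns are easy to read.
labels = ["Comedy", "Drama", "Thriller"]

# Rows are true genres, columns are predicted genres.
cm = confusion_matrix(y_true, y_pred, labels=labels)
print(cm)
```

Off-diagonal entries show which genres are confused with each other; here one Drama title is predicted as Comedy. The matrix can then be drawn with scikit-learn's ConfusionMatrixDisplay or a Seaborn heatmap.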

Accuracy Comparison between MNB and KNN

Figure: Accuracy Comparison

Correlation Between Genre Frequency and Model Accuracy

Figure: Genre Frequency and Model Accuracy
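One plausible way to quantify such a relationship is to compute per-genre accuracy and correlate it with how often each genre occurs. The sketch below uses Pearson correlation on invented labels; whether the project used this exact measure is an assumption.

```python
from collections import Counter

import numpy as np

# Invented true and predicted genres (illustration only).
y_true = ["Comedy", "Drama", "Drama", "Thriller", "Comedy", "Drama"]
y_pred = ["Comedy", "Drama", "Comedy", "Thriller", "Comedy", "Drama"]

genres = sorted(set(y_true))

# How often each genre occurs, and how often it is predicted correctly.
frequency = Counter(y_true)
correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)

counts = np.array([frequency[g] for g in genres], dtype=float)
per_genre_accuracy = np.array([correct[g] / frequency[g] for g in genres])

# Pearson correlation between genre frequency and per-genre accuracy.
r = np.corrcoef(counts, per_genre_accuracy)[0, 1]
print(f"Pearson r = {r:.3f}")
```

With only a handful of genres the coefficient is very sensitive to individual values, so it is best read alongside the plot rather than on its own.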

MNB Accuracy

Figure: Accuracy

References

  1. Collaborative Filtering Recommender System Based on Memory Based in Twitter Using Decision Tree Learning Classification
  2. A comprehensive survey on support vector machine classification: Applications, challenges and trends
  3. A multimodal approach for multi-label movie genre classification
  4. A Movie Recommendation System Design Using Association Rules Mining and Classification Techniques
  5. Multinomial Naïve Bayes for Documents Classification and Natural Language Processing (NLP)
  6. Installing Jupyter
  7. Anaconda Software Distribution
  8. Pandas Library
  9. NumPy: The fundamental package for scientific computing with Python
  10. scikit-learn: Machine Learning in Python
  11. Matplotlib: Visualization with Python
  12. seaborn: statistical data visualization
  13. Multilabel Genre Prediction Using Deep-Learning Frameworks
