
Text classification that identifies different authors' writing styles in Gutenberg digital books | NLP.


shahendae/Gutenberg-Digital-Books


Gutenberg-Digital-Books

This project focuses on text classification: it identifies different authors' writing styles in Gutenberg digital books and predicts which author or genre a piece of writing belongs to.

image

Data preparation and preprocessing include cleaning the data, extracting features, and transforming the text into Bag-of-Words (BOW) and TF-IDF representations, with and without n-grams.
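As a rough illustration of the two representations named above, here is a minimal from-scratch sketch using only the Python standard library. It is not the project's code: the helper names (`ngrams`, `bag_of_words`, `tf_idf`), the whitespace tokenizer, and the toy sentences are assumptions made for this example, and the weighting used is the plain `count * log(N / df)` form. Libraries such as scikit-learn apply a smoothed variant, so their numbers will differ.

```python
import math
from collections import Counter


def ngrams(tokens, n_min=1, n_max=1):
    """Return all n-grams (joined with spaces) for n in [n_min, n_max]."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(n_min, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def bag_of_words(docs, n_min=1, n_max=1):
    """Map each document to a Counter of term counts (lowercased, split on whitespace)."""
    return [Counter(ngrams(doc.lower().split(), n_min, n_max)) for doc in docs]


def tf_idf(bows):
    """Weight raw counts by log(N / document frequency), unsmoothed."""
    n_docs = len(bows)
    # Iterating a Counter yields each distinct term once, so this counts documents.
    df = Counter(term for bow in bows for term in bow)
    return [
        {term: count * math.log(n_docs / df[term]) for term, count in bow.items()}
        for bow in bows
    ]


# Hypothetical toy documents, for illustration only.
docs = ["the whale swam away", "the detective looked away"]

unigram_bow = bag_of_words(docs)        # BOW without n-grams
bigram_bow = bag_of_words(docs, 1, 2)   # BOW with unigrams and bigrams
weights = tf_idf(unigram_bow)           # TF-IDF on top of the unigram counts
```

Terms that occur in every document (here "the" and "away") receive a weight of zero, while terms specific to one document keep a positive weight, which is why TF-IDF tends to highlight author-specific vocabulary.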

For classification, Decision Tree, SVM, and KNN models were tested. The SVM model with the "linear" kernel gave the best accuracy, 98.7%; the linear kernel also provides faster performance.
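The comparison described above could be set up along the following lines with scikit-learn. This is a sketch under assumptions, not the repository's actual code: the toy passages, the author labels, the `fit_models` helper, the choice of TF-IDF with unigrams and bigrams as features, and the hyperparameters (`n_neighbors=1`, `random_state=0`) are all hypothetical. It will not reproduce the reported accuracy.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy passages and author labels, for illustration only.
texts = [
    "the whale rose from the dark sea",
    "the ship chased the whale across the sea",
    "the detective examined the muddy footprint",
    "a clue lay beside the detective on the table",
]
labels = ["author_a", "author_a", "author_b", "author_b"]

# The three model families named in the text; hyperparameters are assumptions.
models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "svm_linear": SVC(kernel="linear"),
    "knn": KNeighborsClassifier(n_neighbors=1),
}


def fit_models(texts, labels):
    """Fit one TF-IDF + classifier pipeline per model and return them by name."""
    fitted = {}
    for name, clf in models.items():
        pipe = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), clf)
        fitted[name] = pipe.fit(texts, labels)
    return fitted


fitted = fit_models(texts, labels)
for name, pipe in fitted.items():
    # Predict the author of an unseen (also hypothetical) passage.
    print(name, pipe.predict(["the whale dived under the ship"])[0])
```

On a real corpus, accuracy would be measured on a held-out split (for example with `sklearn.model_selection.train_test_split`) rather than by inspecting single predictions as done here.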

The accuracies obtained from each model are summarized below:

image
