Classifying a website based on its URL. I've implemented Stochastic Gradient Descent, Multinomial Naive Bayes, and Convolutional Neural Network classifiers to predict the category of a URL.

Shaurov05/Website-Classification

URL Based Website Classification Using Deep Learning and Word Based Multiple N-gram Models

I applied SGD and MNB classifiers to website classification in two ways: first on stemmed words extracted from the URLs, and then on word n-grams without stemming. I also implemented a CNN on unigram, bigram, and trigram models.
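As a rough illustration of the word-based n-gram features mentioned above, here is a minimal sketch using only the Python standard library. The function names `url_tokens` and `word_ngrams`, the stop list, and the example URL are assumptions made for illustration and are not taken from this repository's code.

```python
import re
from urllib.parse import urlparse


def url_tokens(url):
    """Split a URL into lowercase alphabetic word tokens."""
    # urlparse only fills netloc when a scheme is present, so add one if missing.
    parsed = urlparse(url if "://" in url else "http://" + url)
    text = (parsed.netloc + " " + parsed.path).lower()
    tokens = re.findall(r"[a-z]+", text)
    # Hypothetical stop list: URL fragments that carry little topical signal.
    stop = {"www", "com", "org", "net", "html", "htm", "index"}
    return [t for t in tokens if t not in stop]


def word_ngrams(tokens, n):
    """Return all contiguous word n-grams, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


tokens = url_tokens("http://www.example.com/sports/football-news/index.html")
unigrams = word_ngrams(tokens, 1)
bigrams = word_ngrams(tokens, 2)
trigrams = word_ngrams(tokens, 3)
```

For the stemmed variant, each token would be passed through a stemmer (for example NLTK's `PorterStemmer`) before being counted. The resulting unigram, bigram, or trigram strings could then be turned into count vectors for an SGD or MNB classifier.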

The DMOZ dataset, formerly known as the Open Directory Project (ODP), is used for this task. It contains over 1.5 million websites across 15 categories such as Sports, Arts, and Business. (You can find it here: https://www.kaggle.com/shaurov/datasets).
