A solution that should recognize and classify the news articles based on their labels

sriphaniN/news-articles-sorted

news-articles-sorting

Problem Statement:

In today’s world, data is power. With news companies storing terabytes of data on their servers, everyone is looking for insights that add value to the organization. Among the many examples of analytics driving action, news article classification stands out. Many sources on the Internet generate large volumes of news every day, and user demand for information keeps growing, so news must be classified for users to reach the information they care about quickly and effectively. A machine learning model for automated news classification could therefore be used to identify the topics of untracked news and/or make individual suggestions based on a user’s prior interests.

Approach: Techniques such as clustering and association rule-based algorithms can be applied to group similar text together. The ML algorithms learn the mapping function between the text and the tags from already categorized data.
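To illustrate how a model can learn the mapping between text and tags from categorized data, here is a minimal sketch using sklearn. The training sentences, the label names, and the choice of TF-IDF features with a Naive Bayes classifier are assumptions made for illustration; they are not necessarily what this project uses.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy, made-up training data: article text paired with a category tag.
train_texts = [
    "The team won the football match after a late goal",
    "The striker scored twice in the cup final",
    "Shares fell as the bank reported lower profits",
    "The company raised prices and profits grew",
    "The new phone ships with a faster chip and more memory",
    "Developers released a software update for the laptop",
]
train_labels = ["sport", "sport", "business", "business", "tech", "tech"]


def train_classifier(texts, labels):
    """Fit a TF-IDF + Naive Bayes pipeline on already categorized text."""
    model = make_pipeline(TfidfVectorizer(stop_words="english"), MultinomialNB())
    model.fit(texts, labels)
    return model


model = train_classifier(train_texts, train_labels)
predicted = model.predict(["The bank reported record profits"])
print(predicted[0])
```

Keeping the vectorizer and the classifier in one pipeline means new articles are transformed with the vocabulary learned during training, so the same object can be reused at prediction time.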

Data Exploration: I explored the dataset using pandas, numpy, and pandas-profiling.

Data Visualization: Plotted graphs to get insights about the dependent and independent variables.

Feature Engineering: Removed missing values and created new features based on the insights gained.
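The feature engineering step could look roughly like the sketch below. The column names `Text` and `Category`, the sample rows, and the two derived features (character count and word count) are hypothetical; the README does not say which features were actually created.

```python
import pandas as pd


def add_text_features(df, text_col="Text", label_col="Category"):
    """Drop rows with missing text or label, then add simple length features."""
    cleaned = df.dropna(subset=[text_col, label_col]).copy()
    cleaned["char_count"] = cleaned[text_col].str.len()
    cleaned["word_count"] = cleaned[text_col].str.split().str.len()
    return cleaned.reset_index(drop=True)


# Made-up sample rows; the second row has a missing article text.
articles = pd.DataFrame(
    {
        "Text": ["Stocks rallied on Monday", None, "Late goal wins the derby"],
        "Category": ["business", "sport", "sport"],
    }
)
features = add_text_features(articles)
print(features[["Category", "char_count", "word_count"]])
```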

Model Selection I: Tested all base models to check their baseline accuracy, and also plotted a residual plot to check whether a model is a good fit.
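One way to compare baseline accuracy across several models is cross-validation, sketched below. The three candidate estimators, the TF-IDF features, and the toy sentences are assumptions; the README does not list which base models were tested.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def compare_baselines(texts, labels, cv=2):
    """Return the mean cross-validated accuracy of each candidate model."""
    # Hypothetical candidate list; swap in whichever base models are relevant.
    candidates = {
        "naive_bayes": MultinomialNB(),
        "logistic_regression": LogisticRegression(max_iter=1000),
        "linear_svc": LinearSVC(),
    }
    scores = {}
    for name, estimator in candidates.items():
        pipeline = make_pipeline(TfidfVectorizer(), estimator)
        fold_scores = cross_val_score(
            pipeline, texts, labels, cv=cv, scoring="accuracy"
        )
        scores[name] = float(fold_scores.mean())
    return scores


# Toy, made-up data with four articles per category.
texts = [
    "The team won the football match",
    "The striker scored in the football final",
    "The coach praised the team after the match",
    "Fans cheered as the team scored",
    "The bank reported lower profits",
    "Shares fell as profits dropped",
    "The company reported higher profits",
    "Investors sold shares in the bank",
]
labels = ["sport"] * 4 + ["business"] * 4
baseline_scores = compare_baselines(texts, labels)
for name, accuracy in sorted(baseline_scores.items()):
    print(f"{name}: {accuracy:.2f}")
```

With a dataset this small the scores mean very little; the point is only the comparison loop, which applies unchanged to a real train set.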

Pickle File: Selected the model with the best accuracy and saved it as a pickle file.
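Selecting the best-scoring model and pickling it can be sketched as follows. The helper names `save_best_model` and `load_model`, the placeholder model objects, and the accuracy numbers are all hypothetical and stand in for real fitted estimators and measured scores.

```python
import os
import pickle
import tempfile


def save_best_model(models, scores, path):
    """Pickle the model with the highest score and return its name."""
    best_name = max(scores, key=scores.get)
    with open(path, "wb") as fh:
        pickle.dump(models[best_name], fh)
    return best_name


def load_model(path):
    """Load a pickled model; only unpickle files from a trusted source."""
    with open(path, "rb") as fh:
        return pickle.load(fh)


# Placeholder objects stand in for fitted estimators; scores are made up.
models = {
    "naive_bayes": {"name": "nb-placeholder"},
    "linear_svc": {"name": "svc-placeholder"},
}
scores = {"naive_bayes": 0.91, "linear_svc": 0.95}

path = os.path.join(tempfile.mkdtemp(), "model.pkl")
best = save_best_model(models, scores, path)
restored = load_model(path)
print(best, restored)
```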

Project Title: News Articles Sorting

Technologies: Deep Learning Technology (NLP)

Domain: Media

Project Difficulty Level: Intermediate

Technologies Used: python, nltk (word_tokenize, PorterStemmer), numpy, pandas, matplotlib, pandas_profiling, sklearn

Video link of deployment: https://github.com/sriphaniN/news-articles-sorted/blob/a006cd1a1770cbff717729bfebefb2ff3e51c6d3/project2/video-record.webm
