iamakashrout/SIH-2023

SIH-2023

Smart India Hackathon 2023

Team Avengers

Team Members:

Akash Rout
Nandini Gera
Sparsh Singh Bhatia
Ujjawal Gupta
Dhruv Goyal
Ankur Gupta

Project Name:

Automated Crawling, Categorization and Sentiment Analysis of Digital News with Incorporated Feedback System.

Problem Statement: 1329

The project addresses the need for a 360-degree feedback software for monitoring Government of India-related news stories in regional media using Artificial Intelligence and Machine Learning.

Solution Proposed

We have developed a system that automatically scrapes news from numerous online sources, covering both text articles and video news. Each fetched article is classified by the ministry whose jurisdiction it falls under and then passed through sentiment analysis, which assigns it a positive, neutral, or negative score. When negative news is detected, an alert is emailed to the concerned government department. This keeps the government updated on news events and allows quick responses when needed. The news is displayed on a user-friendly interface where the user can refresh and load the latest news on demand; if not refreshed manually, the news is refreshed automatically every hour. News articles can be fetched in English, Hindi, and multiple regional languages.
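The flow described above (classify, score sentiment, alert on negative news) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the `Article` class, the `DEPARTMENT_EMAILS` mapping, the example addresses, and the `classify` / `score_sentiment` callables are all hypothetical stand-ins for the trained models and the real department contacts. The sketch only builds the alert messages and does not send them.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import Callable, Iterable, List, Optional

# Hypothetical mapping from ministry label to contact address (placeholders).
DEPARTMENT_EMAILS = {
    "Ministry of Health": "health@example.org",
    "Ministry of Finance": "finance@example.org",
}


@dataclass
class Article:
    title: str
    text: str
    url: str
    ministry: Optional[str] = None
    sentiment: Optional[str] = None  # "positive", "neutral" or "negative"


def build_alert(article: Article,
                sender: str = "alerts@example.org") -> Optional[EmailMessage]:
    """Return an alert email for a negative article, or None if none is due."""
    recipient = DEPARTMENT_EMAILS.get(article.ministry or "")
    if article.sentiment != "negative" or recipient is None:
        return None
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Negative news alert: {article.title}"
    # Include only the first 500 characters of the article as a preview.
    msg.set_content(f"{article.title}\n{article.url}\n\n{article.text[:500]}")
    return msg


def process(articles: Iterable[Article],
            classify: Callable[[str], str],
            score_sentiment: Callable[[str], str]) -> List[EmailMessage]:
    """Label each article in place and collect alerts for the negative ones."""
    alerts = []
    for article in articles:
        article.ministry = classify(article.text)
        article.sentiment = score_sentiment(article.text)
        alert = build_alert(article)
        if alert is not None:
            alerts.append(alert)
    return alerts
```

Passing the classifier and sentiment scorer in as plain callables keeps the sketch independent of any particular model; in practice they would wrap the department and sentiment models.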

Tech Stack Used

  • AI: PyTorch, TensorFlow, and BERT libraries for building the ML models
  • Crawling: Beautiful Soup, Selenium
  • Server: Django
  • Frontend: Next.js and Tailwind CSS

Run Commands

To run the project locally:

  1. Clone the repository:
git clone https://github.com/iamakashrout/SIH-2023.git
  2. Navigate to the project directory:
cd SIH-2023
  3. Install dependencies for the client (Next.js):
cd client
npm install
  4. Start the Next.js development server:
npm run dev
  5. Install the necessary libraries and paste the contents from here into the server folder.

  6. Start the Django backend server:

python manage.py runserver

Approach Details

  • Crawled 12,000+ news articles and videos using the Python libraries Beautiful Soup and Selenium.
  • Applied clustering to these articles to group them into categories and prepare a labeled dataset.
  • Trained a DistilBERT model on this dataset to generate department predictions (accuracy: 83%).
  • Used a RoBERTa model for sentiment analysis of the news articles.
  • Sent emails about negative news to the respective departments using Nodemailer over Gmail SMTP.
  • Integrated the models and crawling functionality into a Django backend and wrote APIs for generating predictions and sentiments.
  • Connected this backend to a simple, attractive UI where the user can trigger loading of the latest news articles with their analysis.
  • Implemented video news analysis using the Selenium library: the audio is first extracted and converted into text, and classification and sentiment analysis are then applied to that text.
  • Provided the same functionality for news in Hindi and other languages using the Google Translate API.
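As an illustration of the sentiment step, a three-class classifier typically emits one raw score (logit) per class, which is turned into a label by taking a softmax and picking the most probable class. The sketch below shows only that post-processing in plain Python. The label order in `LABELS` and the function names are assumptions for illustration; the order actually used by the project's RoBERTa model is not stated here and should be checked against the model's configuration.

```python
import math
from typing import List, Sequence, Tuple

# Assumed class order of the model output; verify against the model config.
LABELS = ("negative", "neutral", "positive")


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw scores to probabilities, shifting by the max for stability."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def label_sentiment(logits: Sequence[float]) -> Tuple[str, float]:
    """Return the most probable sentiment label and its probability."""
    if len(logits) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]
```

For example, `label_sentiment([2.0, 0.5, -1.0])` selects the first class, which under the assumed order is "negative" and would trigger an alert to the relevant department.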

Screenshots

Frontend (three screenshots)

Screenshots Link

Project Links

- Abstract

- Description

- Youtube Demo
