MiniStack

Task-1

A search engine over a Stack Overflow corpus of roughly 160,000 documents.

The code is split across three notebook files:

DataExtraction collects the data
DataPreprocessing cleans and processes the data
Retrieval retrieves the top 10 documents most similar to a query
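The notebooks are not reproduced here, so the following is only a minimal sketch of how a top-10 similarity search could work. It assumes TF-IDF weighting with cosine similarity, which may not be what Retrieval actually uses, and it is written in plain Python rather than with sklearn so that it runs without extra packages. The function names (tokenize, build_index, to_vector, search) and the sample documents are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def to_vector(tokens, idf):
    """Build an L2-normalised TF-IDF vector (as a dict) from tokens."""
    tf = Counter(t for t in tokens if t in idf)
    vec = {t: count * idf[t] for t, count in tf.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def build_index(docs):
    """Return (idf, doc_vectors) for a list of document strings."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Smoothed IDF, so every known term gets a positive weight.
    idf = {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}
    vectors = [to_vector(tokens, idf) for tokens in tokenized]
    return idf, vectors


def search(query, idf, vectors, k=10):
    """Return up to k (doc_index, score) pairs, best match first."""
    q = to_vector(tokenize(query), idf)
    scores = []
    for i, vec in enumerate(vectors):
        # Both vectors are unit length, so the dot product is the cosine.
        score = sum(w * vec.get(t, 0.0) for t, w in q.items())
        if score > 0:
            scores.append((i, score))
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Hypothetical sample corpus, for illustration only.
docs = [
    "How to merge two dictionaries in Python",
    "Reverse a linked list in Java",
    "Python dictionary comprehension example",
]
idf, vectors = build_index(docs)
results = search("python dictionary", idf, vectors)
```

With this sample, the third document ranks first because it matches both query terms, and the Java document is not returned at all. On the full corpus a sparse-matrix implementation such as the one in sklearn would likely be a better fit than Python dicts.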

Required libraries and packages are

pandas, numpy, sklearn, nltk, re, os, sys, csv, xml

Dataset

The data for this project comes from the Stack Exchange Data Dump.
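As a rough illustration of the extraction step, the sketch below streams a dump file and writes the questions to CSV. It assumes the usual layout of the Stack Exchange dump's Posts.xml, where each post is a row element with attributes such as Id, PostTypeId, Title, Body and Tags, and where PostTypeId 1 marks a question. The function name extract_questions and the sample rows are hypothetical, and the actual DataExtraction notebook may work differently.

```python
import csv
import io
import xml.etree.ElementTree as ET


def extract_questions(xml_source, csv_target):
    """Stream <row> elements and write question rows to CSV.

    Returns the number of questions written.
    """
    writer = csv.writer(csv_target)
    writer.writerow(["Id", "Title", "Body", "Tags"])
    written = 0
    # iterparse reads the file incrementally instead of loading it whole.
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        # Assumption: PostTypeId "1" is a question, "2" is an answer.
        if elem.tag == "row" and elem.get("PostTypeId") == "1":
            writer.writerow([
                elem.get("Id"),
                elem.get("Title", ""),
                elem.get("Body", ""),
                elem.get("Tags", ""),
            ])
            written += 1
        elem.clear()  # release the element's attributes to save memory
    return written


# Hypothetical two-row sample standing in for a real Posts.xml file.
sample = io.BytesIO(
    b'<?xml version="1.0" encoding="utf-8"?>'
    b"<posts>"
    b'<row Id="1" PostTypeId="1" Title="Sort a list" '
    b'Body="&lt;p&gt;How?&lt;/p&gt;" Tags="&lt;python&gt;&lt;sorting&gt;" />'
    b'<row Id="2" PostTypeId="2" Body="&lt;p&gt;Use sorted().&lt;/p&gt;" />'
    b"</posts>"
)
out = io.StringIO()
count = extract_questions(sample, out)
```

When writing to a real file, open it with newline="" as the csv module recommends, and pass the path to Posts.xml (or an open binary file) as xml_source.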

Task-2

A web crawler that crawls the Stack Overflow website and finds the currently most popular technologies by collecting the tags of the newest questions asked on the site.

WebCrawler is the code for this task
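The tag-counting idea can be sketched as follows. This is not the WebCrawler notebook's code: it uses only the standard library instead of requests and bs4, it skips the download step and works on an HTML string, and it assumes that question tags appear as links whose class list contains post-tag. That class name, the names TagCollector and most_popular_tags, and the sample markup are all assumptions for illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collect the text of every <a> whose class list contains 'post-tag'."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self._in_tag_link = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "a" and "post-tag" in classes:
            self._in_tag_link = True

    def handle_endtag(self, tag):
        if tag == "a":
            self._in_tag_link = False

    def handle_data(self, data):
        # Only keep text that sits inside a tag link.
        if self._in_tag_link and data.strip():
            self.tags.append(data.strip())


def most_popular_tags(html, top_n=10):
    """Return the top_n (tag, count) pairs found in the HTML, most common first."""
    parser = TagCollector()
    parser.feed(html)
    return Counter(parser.tags).most_common(top_n)


# Hypothetical markup standing in for a downloaded "newest questions" page.
sample_html = """
<div class="s-post-summary">
  <a class="post-tag" href="/questions/tagged/python">python</a>
  <a class="post-tag" href="/questions/tagged/pandas">pandas</a>
</div>
<div class="s-post-summary">
  <a class="s-tag post-tag" href="/questions/tagged/python">python</a>
  <a class="s-link" href="/questions/1">Some question title</a>
</div>
"""
top_tags = most_popular_tags(sample_html)
```

In a real crawl, the HTML would come from fetching the newest-questions pages, and the counts from several pages would be summed before ranking.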

Required libraries are

urllib3, requests, bs4, zlib, operator, os, sys

How to run

Download the files and make sure all the files and folders are in the same directory

UI Demo

To run the code on a server:

Go to the UI-demo folder
Create a virtual environment (command on Windows: virtualenv env)
Activate the virtual environment (command on Windows: env\Scripts\activate)
Install the requirements (command: pip install -r requirements.txt)
Run python app.py in the terminal

Github repository Link

https://github.com/Saideepthi123/MiniStack

