This repository contains the code for my web scraping and topic modelling project.
The project has two main steps:
- Scraping the Trustpilot website to fetch user comments about the company named PlusDental.
- Using the scraped data for topic modelling via LDA (Latent Dirichlet Allocation).
The ETL folder contains:
- Scraper.py
- testing.py
- Transform.py
Scraper.py contains the code for web scraping. The other two files are not essential to this project.
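
As a rough illustration of the scraping step, here is a minimal sketch of pulling review text out of a page's HTML using only the standard library. It is not the code in Scraper.py. The `data-review-text` attribute, the `ReviewParser` class, the `extract_reviews` helper and the sample HTML are all hypothetical; Trustpilot's real markup is different and changes over time.

```python
from html.parser import HTMLParser


class ReviewParser(HTMLParser):
    """Collect the text of every <p> element carrying a given attribute value."""

    def __init__(self, attr_name, attr_value):
        super().__init__()
        self.attr_name = attr_name
        self.attr_value = attr_value
        self.reviews = []
        self._inside = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        # Start collecting when a matching <p> opens.
        if tag == "p" and dict(attrs).get(self.attr_name) == self.attr_value:
            self._inside = True
            self._chunks = []

    def handle_data(self, data):
        if self._inside:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        # On </p>, store the collected text with whitespace normalised.
        if tag == "p" and self._inside:
            self.reviews.append(" ".join("".join(self._chunks).split()))
            self._inside = False


def extract_reviews(html, attr_name="data-review-text", attr_value="true"):
    # Hypothetical helper: the attribute name and value are assumptions.
    parser = ReviewParser(attr_name, attr_value)
    parser.feed(html)
    parser.close()
    return parser.reviews


# Placeholder HTML standing in for a downloaded review page.
SAMPLE_HTML = """
<article><p data-review-text="true">Great service, friendly staff.</p></article>
<article><p data-review-text="true">Delivery took   too long.</p></article>
<p>Unrelated footer text</p>
"""

reviews = extract_reviews(SAMPLE_HTML)
print(reviews)
```

In a real run the HTML would come from an HTTP request to each review page, and the selector would need to match whatever the site serves at that time.
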
The Data folder contains the scraped data.
lda.ipynb is where I applied topic modelling.