mansimann/amazon-reviews-analyzer

🛒 Amazon Reviews Analyzer

This application scrapes a product’s Amazon reviews, extracts entities, and reveals sentiment.

Technologies

Java, Maven

Implementation details

jsoup

This application uses the jsoup library to scrape and parse HTML from a URL and extract product reviews using CSS selectors.
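As a rough illustration of the scraping step, the sketch below parses an HTML string with jsoup and pulls review text out with a CSS selector. The class name ReviewScrapeSketch and the selector are hypothetical; the selectors the project actually uses are not shown in this README, and Amazon's real markup differs. It assumes jsoup is on the classpath, as it is in this Maven project.

```java
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

/** Hypothetical sketch; not the project's actual scraper. */
final class ReviewScrapeSketch {
    // Assumed selector for illustration only; real pages need their own.
    static final String REVIEW_SELECTOR = "div.review span.review-text";

    /** Parses the HTML and returns the text of every element matching the selector. */
    static List<String> extractReviews(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select(REVIEW_SELECTOR).eachText();
    }

    public static void main(String[] args) {
        String html =
            "<div class=\"review\"><span class=\"review-text\">Great blender.</span></div>"
            + "<div class=\"review\"><span class=\"review-text\">Stopped working.</span></div>";
        extractReviews(html).forEach(System.out::println);
    }
}
```

To work from a live URL instead of a string, jsoup's Jsoup.connect(url).get() returns the same Document type, so the selector step stays unchanged.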

TextRazor NLP API

This application communicates with the TextRazor API to extract entities.

Google Cloud NL API

This application communicates with the Google Cloud NL API to detect sentiment score and magnitude.
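The API reports a score between -1.0 (negative) and 1.0 (positive) and a non-negative magnitude that grows with the amount of emotional content. One possible way to turn the pair into a readable label is sketched below. The class SentimentLabel and both thresholds are assumptions chosen for illustration, not values taken from this project or from Google.

```java
/** Hypothetical helper; thresholds are illustrative assumptions. */
final class SentimentLabel {
    // Assumed cut-offs; tune them against your own review data.
    static final double SCORE_THRESHOLD = 0.25;
    static final double MIXED_MAGNITUDE = 2.0;

    /** Maps a (score, magnitude) pair to a coarse label. */
    static String label(double score, double magnitude) {
        if (score >= SCORE_THRESHOLD) {
            return "positive";
        }
        if (score <= -SCORE_THRESHOLD) {
            return "negative";
        }
        // A near-zero score with high magnitude suggests strong opinions
        // in both directions cancelling out; low magnitude suggests neutral text.
        return magnitude >= MIXED_MAGNITUDE ? "mixed" : "neutral";
    }

    public static void main(String[] args) {
        System.out.println(label(0.8, 3.0));
        System.out.println(label(-0.6, 4.0));
        System.out.println(label(0.1, 0.0));
        System.out.println(label(0.0, 5.0));
    }
}
```

Keeping magnitude in the decision matters for reviews in particular, since a review that praises one feature and criticizes another can average out to a score near zero.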

Installation

First steps

After cloning the project, you must set up the following two environment variables:

  • TEXT_RAZOR_API_KEY: ***
  • GOOGLE_APPLICATION_CREDENTIALS: service-account-file.json

For details on obtaining each value, see https://www.textrazor.com/signup (TextRazor API key) and https://cloud.google.com/docs/authentication/production (Google Cloud service account credentials).
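A missing variable typically surfaces only when the first API call fails, so a quick startup check can save time. The sketch below is a hypothetical helper (EnvCheck is not part of this project) that reports which of the two variables listed above are absent or blank.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical startup check; not part of the project's code. */
final class EnvCheck {
    // Variable names taken from the setup instructions in this README.
    static final List<String> REQUIRED =
        List.of("TEXT_RAZOR_API_KEY", "GOOGLE_APPLICATION_CREDENTIALS");

    /** Returns the required variables that are absent or blank in env. */
    static List<String> missing(Map<String, String> env) {
        List<String> result = new ArrayList<>();
        for (String name : REQUIRED) {
            String value = env.get(name);
            if (value == null || value.isBlank()) {
                result.add(name);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> missing = missing(System.getenv());
        if (!missing.isEmpty()) {
            System.err.println("Missing environment variables: " + missing);
            System.exit(1);
        }
        System.out.println("Environment looks complete.");
    }
}
```

Passing the environment in as a Map keeps the check easy to test without touching the real process environment.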

You may also need to install Apache Maven (https://maven.apache.org/) on your system.

How to compile the project

mvn clean compile

How to create a binary runnable package

mvn clean compile assembly:single

How to run

mvn -q clean compile exec:java -Dexec.mainClass="service.Main"

How to run all unit tests and static checks

mvn clean compile test checkstyle:check spotbugs:check

How to run spotbugs

To inspect bug details in the SpotBugs GUI, run:

mvn spotbugs:gui

To run the analysis and write an XML report, or to fail the build when bugs are found, run:

mvn spotbugs:spotbugs
mvn spotbugs:check

For more info, see https://spotbugs.readthedocs.io/en/latest/maven.html

How to run checkstyle

The Checkstyle configuration files are in the config/ directory. The Maven Checkstyle plugin is set to use the Google code style.

mvn checkstyle:check

Running the check writes XML output to:

target/checkstyle-checker.xml
target/checkstyle-result.xml

To generate a report in HTML format, run:

mvn checkstyle:checkstyle

The report is written to:

target/site/checkstyle.html