Movie-Analytics-Big-Data

IMDb Dataset

The movies dataset includes 85,855 movies with attributes such as movie description, average rating, number of votes, and genre.

Attributes

imdbId: title ID on IMDb (integer)
Title: title name (string)
original_title: original title name (string)
year: year of release (string)
date_published: date of release (string)
Genre: movie genre (string)
Duration: duration in minutes (integer)
Country: movie country (string)
Language: movie language (string)
director: director name (string)
Writer: writer name (string)
Production_Company: production company (string)
Actors: actor names (string)
Description: plot description (string)
Avg_vote: average vote (string)
Reviews_from_users: number of reviews from users (string)
Reviews_from_critics: number of reviews from critics (string)

MovieLens Dataset

The datasets describe ratings and free-text tagging activity from MovieLens:
rating.csv contains ratings of movies by users: userId, movieId, rating, timestamp
movie.csv contains movie information: movieId, title, genres
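The two file layouts above can be read into typed records before any analysis. The following is a minimal sketch in plain Python (not Spark). It assumes both files are comma-separated with a header row, that genres are pipe-separated, and that the timestamp can be kept as text; the record classes, function names, and sample rows are hypothetical and only serve the illustration.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class Rating:
    user_id: int
    movie_id: int
    rating: float
    timestamp: str  # kept as text: the timestamp format is an assumption


@dataclass(frozen=True)
class Movie:
    movie_id: int
    title: str
    genres: tuple


def read_ratings(lines):
    """Parse rating rows from an iterable of CSV lines that starts with a header."""
    return [
        Rating(int(r["userId"]), int(r["movieId"]), float(r["rating"]), r["timestamp"])
        for r in csv.DictReader(lines)
    ]


def read_movies(lines):
    """Parse movie rows; genres are assumed to be pipe-separated."""
    return [
        Movie(int(r["movieId"]), r["title"], tuple(r["genres"].split("|")))
        for r in csv.DictReader(lines)
    ]


# Made-up sample rows in the assumed layout (not taken from the real files).
SAMPLE_RATINGS = """userId,movieId,rating,timestamp
20,1,4.0,2005-04-02 23:53:47
20,2,3.5,2005-04-02 23:31:16
"""
SAMPLE_MOVIES = """movieId,title,genres
1,Example Movie A (2000),Adventure|Comedy
2,Example Movie B (2001),Drama
"""

ratings = read_ratings(io.StringIO(SAMPLE_RATINGS))
movies = read_movies(io.StringIO(SAMPLE_MOVIES))
```

For the real files, the same functions could be given an open file handle, for example `read_ratings(open("rating.csv", newline=""))`.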

The main aim of this project is to demonstrate movie analytics using Apache Spark. The MLR model predicts the average vote from other attributes of the IMDb dataset (director, writer, genre, duration, year).
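To make the prediction task concrete, here is a minimal sketch in plain Python rather than Spark. It assumes that MLR stands for multiple linear regression, encodes genre as 0/1 indicators, and fits the weights with the normal equations. The helper names, the centering of year at 2000, and the toy rows are all hypothetical; the toy targets are generated from a known linear rule so the fit can be checked, and they are not real IMDb data.

```python
GENRES = ["Action", "Comedy", "Drama"]  # hypothetical category list


def one_hot(value, categories):
    """Encode a category as 0/1 indicators, dropping the first one as baseline."""
    return [1.0 if value == c else 0.0 for c in categories[1:]]


def encode(genre, duration, year):
    """Build one feature row; year is centered at 2000 to keep values small."""
    return one_hot(genre, GENRES) + [float(duration), float(year - 2000)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_regression(features, targets):
    """Least-squares fit via the normal equations (X^T X) w = X^T y."""
    x = [[1.0] + row for row in features]  # leading 1.0 is the intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, row):
    """Intercept plus the weighted sum of the features."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], row))


# Synthetic toy rows: (genre, duration in minutes, year).
TOY_MOVIES = [
    ("Action", 90, 2001), ("Action", 120, 2010), ("Action", 105, 1995),
    ("Comedy", 100, 2005), ("Comedy", 95, 2015),
    ("Drama", 130, 1999), ("Drama", 110, 2008),
]
# Known rule used to generate the toy targets: intercept, Comedy, Drama, duration, year.
TRUE_WEIGHTS = [5.0, 0.5, 1.0, 0.01, 0.02]

features = [encode(*movie) for movie in TOY_MOVIES]
targets = [predict(TRUE_WEIGHTS, row) for row in features]
weights = fit_linear_regression(features, targets)
```

Director and writer could be encoded with the same indicator scheme, although with many distinct names the feature matrix grows quickly, which is one reason a distributed engine is attractive for the full dataset.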

Collaborative filtering is commonly used for recommender systems. These techniques aim to fill in the missing entries of a user-item association matrix. In collaborative filtering, users and movies are described by a small set of latent factors. Here we use an ALS (alternating least squares) model to learn the latent factors and to suggest movies to user 20.
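The idea behind ALS can be sketched in plain Python, independent of Spark: fix the movie factors and solve a small regularized least-squares problem for each user, then fix the user factors and do the same for each movie, and repeat. This is an illustrative sketch under stated assumptions, not the project's implementation; the function names, the rank, the regularization value, and the toy ratings are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def dot(p, q):
    return sum(x * y for x, y in zip(p, q))


def als(ratings, rank=2, reg=0.1, iterations=10, seed=0):
    """ratings maps (user, movie) -> rating. Returns (user_factors, movie_factors)."""
    rng = random.Random(seed)
    by_user, by_movie = {}, {}
    for (u, m), r in ratings.items():
        by_user.setdefault(u, []).append((m, r))
        by_movie.setdefault(m, []).append((u, r))
    uf = {u: [rng.random() for _ in range(rank)] for u in sorted(by_user)}
    mf = {m: [rng.random() for _ in range(rank)] for m in sorted(by_movie)}

    def update(target, fixed, observed_by_key):
        # For each row, solve (F^T F + reg * I) x = F^T r over its observed ratings.
        for key, observed in observed_by_key.items():
            a = [[reg * (i == j) for j in range(rank)] for i in range(rank)]
            b = [0.0] * rank
            for other, r in observed:
                f = fixed[other]
                for i in range(rank):
                    b[i] += f[i] * r
                    for j in range(rank):
                        a[i][j] += f[i] * f[j]
            target[key] = solve(a, b)

    for _ in range(iterations):
        update(uf, mf, by_user)   # movie factors fixed, solve for users
        update(mf, uf, by_movie)  # user factors fixed, solve for movies
    return uf, mf


def objective(ratings, uf, mf, reg):
    """Squared error on observed ratings plus the L2 penalty on all factors."""
    err = sum((r - dot(uf[u], mf[m])) ** 2 for (u, m), r in ratings.items())
    penalty = sum(dot(v, v) for v in uf.values()) + sum(dot(v, v) for v in mf.values())
    return err + reg * penalty


def recommend(user, ratings, uf, mf, n=2):
    """Rank the movies the user has not rated by predicted rating."""
    seen = {m for (u, m) in ratings if u == user}
    unseen = [m for m in mf if m not in seen]
    return sorted(unseen, key=lambda m: dot(uf[user], mf[m]), reverse=True)[:n]


# Made-up toy ratings: (userId, movieId) -> rating.
toy_ratings = {
    (20, 1): 5.0, (20, 2): 4.0,
    (21, 1): 4.0, (21, 2): 5.0, (21, 3): 2.0, (21, 4): 4.5,
    (22, 2): 4.5, (22, 3): 1.5, (22, 4): 4.0,
}
uf, mf = als(toy_ratings)
recommendations = recommend(20, toy_ratings, uf, mf)
```

Each half-step minimizes the regularized objective exactly for one side while the other is held fixed, so the objective cannot increase from one iteration to the next. The penalty here is a plain L2 term; a production implementation may weight it differently.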
