
mhasegawa7045/Film_Movie_Text_Mining_Sentimental_Analysis_Machine_Learning


Film Movie Machine Learning Project: Text Mining and Sentimental Analysis

Tokenization, Topic Modeling, Sentiment Analysis, Network of Bigrams

Film Genre and Story Description Machine Learning: Data Mining and Text Mining

The purpose of this project is to see whether text mining techniques can make it easier to categorize movies using only their Descriptions, ignoring the Genre field. The dataset, IMDB_movies.csv, is stored in the data frame variable movies_desc.

  • Tokenization (TF-IDF) was used to analyze term frequencies in movie Descriptions more efficiently, so that the conceptual theme of a movie franchise can be determined even by someone who has never watched any of the films.
  • Topic Modeling was used to find the mixture of terms associated with each topic and the mixture of topics that distinguishes each document in the dataset, IMDB_movies.csv.
  • Sentiment Analysis focused on movies grouped into sentiment clusters using the bing and nrc lexicons, to see how sentiment affects Rating and Revenue.
  • The network of bigrams for the Movies dataset helps summarize how frequently occurring word pairs in movie Descriptions form term relationships and how they connect to other movies.
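
The tokenization, TF-IDF, and bigram steps listed above can be sketched in a few lines. This is a minimal illustration in standard-library Python, not the project's own code: the `descriptions` dictionary is a made-up stand-in for the Description column of movies_desc, and the function names `tokenize`, `tf_idf`, and `bigram_counts` are hypothetical. It assumes tf is a term's count divided by the description length and idf is the natural log of (number of descriptions / number of descriptions containing the term).

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for the Description column of movies_desc.
descriptions = {
    "Movie A": "A young wizard joins a school of magic",
    "Movie B": "A retired spy returns for one last mission",
    "Movie C": "A young spy infiltrates a school of magic",
}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(docs):
    """Return {doc: {term: score}} with tf = count / doc length, idf = ln(N / df)."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many descriptions each term appears.
    doc_freq = Counter(term for words in tokens.values() for term in set(words))
    scores = {}
    for name, words in tokens.items():
        counts = Counter(words)
        scores[name] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores


def bigram_counts(docs):
    """Count adjacent word pairs across all descriptions (edges of a bigram network)."""
    pairs = Counter()
    for text in docs.values():
        words = tokenize(text)
        pairs.update(zip(words, words[1:]))
    return pairs


scores = tf_idf(descriptions)
edges = bigram_counts(descriptions)
```

Terms that occur in every description (such as "a" here) get a score of zero, while terms unique to one description score highest. The weighted pairs in `edges` are what a bigram network would draw as connections between words.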

Presentation Link

https://www.youtube.com/watch?v=AbwBXCEKPAs&t=9s

References

NRC Emotion Lexicon. (n.d.). Retrieved from http://saifmohammad.com/WebPages/NRC-Emotion-Lexicon.htm

Silge, J., & Robinson, D. (2020). 2 Sentiment analysis with tidy data | Text Mining with R. Tidy Text Mining. https://www.tidytextmining.com/sentiment.html
