Can book covers help predict bestsellers using machine learning approaches?

Abstract

As the book publishing market shifts from offline to online, readers tend to base purchase decisions on book covers and metadata rather than on the actual book contents. We examine whether publishers can estimate readers’ satisfaction with books in advance, and whether metadata and book covers help predict this satisfaction.
Exploring the effects of metadata and book covers on satisfaction matters not only to publishers but also to librarians. However, most prior research on preference-based book recommendation systems, in both the book industry and library systems, has relied on review comments, ratings, or book loan records.
Thus, we explore other factors that may implicitly affect satisfaction with books. We collected book titles, authors, publishers, reviews, ratings, and covers from the “Literature and Fiction” genre in the Amazon bookstore and conducted an experiment to predict readers’ satisfaction ratings based on book reviews, metadata, and book covers. Several deep learning classifiers (CNN, ResNet, LSTM, BiLSTM, GRU, BiGRU) were employed.
Reviews alone reach a certain level of prediction performance, but adding metadata, cover images, and cover objects to a review-based predictive model slightly improves that performance. Based on these results, we confirmed that both metadata and book covers improve the prediction of readers’ perceived satisfaction.
This study is a pilot exploration of the idea that multimodal approaches can improve the prediction of the perceived satisfaction of book readers.

Data Collection

Methods – classification models

We implemented four case models according to the input data:

models with book reviews
models with book reviews and metadata
models with book reviews, metadata, and cover images
models with book reviews, metadata, cover images, and cover objects

We used CNN, LSTM, BiLSTM, GRU, and BiGRU for reviews and metadata; CNN and ResNet for cover images; and a DNN for cover objects.
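To make the model space concrete, the sketch below enumerates encoder combinations for each of the four cases. It is an illustration only: the names `CASE_BRANCHES`, `BRANCH_ENCODERS`, and `candidate_models` are hypothetical and do not come from this repository, and it assumes that reviews and metadata share a single text branch and that every listed encoder was paired with every other.

```python
from itertools import product

# Encoder families named in the text, grouped by input type.
TEXT_ENCODERS = ["CNN", "LSTM", "BiLSTM", "GRU", "BiGRU"]  # reviews and metadata
IMAGE_ENCODERS = ["CNN", "ResNet"]                          # cover images
OBJECT_ENCODERS = ["DNN"]                                   # cover objects

# Hypothetical mapping from case number to the branches it uses.
# Assumption: reviews and metadata are handled by one shared text branch.
CASE_BRANCHES = {
    1: ["text"],                     # reviews
    2: ["text"],                     # reviews + metadata
    3: ["image", "text"],            # + cover images
    4: ["image", "text", "object"],  # + cover objects
}
BRANCH_ENCODERS = {
    "text": TEXT_ENCODERS,
    "image": IMAGE_ENCODERS,
    "object": OBJECT_ENCODERS,
}


def candidate_models(case):
    """List encoder combinations for a case, named like 'CNN+LSTM+DNN'."""
    choices = [BRANCH_ENCODERS[branch] for branch in CASE_BRANCHES[case]]
    return ["+".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    for case in CASE_BRANCHES:
        print(case, candidate_models(case))
```

Under these assumptions, case 4 yields names in image+text+object order, which matches the naming of the CNN+LSTM+DNN example.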

Architecture of the fused deep learning model

(Example of CNN+LSTM+DNN)
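A fused model of this kind can be sketched as three branches whose features are concatenated before a shared classification head. The PyTorch code below is a minimal sketch, not the repository's implementation: the framework, the class name `FusedClassifier`, all layer sizes, the 80-dimensional object vector, and the two output classes are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class FusedClassifier(nn.Module):
    """Hypothetical CNN (cover image) + LSTM (text) + DNN (cover objects) model."""

    def __init__(self, vocab_size=5000, embed_dim=64, hidden_dim=64,
                 num_object_classes=80, num_labels=2):
        super().__init__()
        # Text branch: embedding followed by an LSTM over review/metadata tokens.
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Image branch: a small CNN over RGB cover images.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Object branch: a dense layer over a per-class object vector (assumed format).
        self.dnn = nn.Sequential(nn.Linear(num_object_classes, 32), nn.ReLU())
        # Fusion head: concatenate the three feature vectors, then classify.
        self.head = nn.Sequential(
            nn.Linear(hidden_dim + 32 + 32, 64), nn.ReLU(), nn.Linear(64, num_labels)
        )

    def forward(self, tokens, image, objects):
        _, (h_n, _) = self.lstm(self.embedding(tokens))
        text_feat = h_n[-1]              # (batch, hidden_dim)
        image_feat = self.cnn(image)     # (batch, 32)
        object_feat = self.dnn(objects)  # (batch, 32)
        fused = torch.cat([text_feat, image_feat, object_feat], dim=1)
        return self.head(fused)          # unnormalized class scores (logits)


if __name__ == "__main__":
    model = FusedClassifier()
    tokens = torch.randint(1, 5000, (4, 20))  # 4 texts, 20 token ids each
    image = torch.rand(4, 3, 64, 64)          # 4 RGB covers, 64x64 pixels
    objects = torch.rand(4, 80)               # 4 object vectors
    print(model(tokens, image, objects).shape)
```

Concatenation is only one way to fuse branches; it is used here because it keeps each branch independent and makes it easy to drop a modality for the smaller cases.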

Results

1. Performance comparison of machine learning and deep learning models based on book reviews

The best accuracy of the deep learning models is higher than that of the machine learning models.

2. Performance comparison of the baseline model and all other improved models

In terms of best accuracy and average accuracy, adding metadata and cover images to a review-based predictive model slightly improved performance, but adding cover objects reduced performance.
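For reference, best and average accuracy for one case can be computed as in the sketch below. The helper names `accuracy` and `summarize` are hypothetical, the sketch assumes the average is taken over the models trained for that case, and the values in the usage example are made-up placeholders, not results from this study.

```python
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def summarize(accuracies):
    """Return (best_model, best_accuracy, average_accuracy) for one case.

    `accuracies` maps a model name to its accuracy.
    """
    best_model = max(accuracies, key=accuracies.get)
    return best_model, accuracies[best_model], mean(accuracies.values())


if __name__ == "__main__":
    # Placeholder values for illustration only; not measurements from the study.
    placeholder = {"model_a": 0.5, "model_b": 0.75}
    print(summarize(placeholder))
```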

Dataset

We provide 100 sample records for testing in the 'test_data' folder, which contains both raw and preprocessed data. 'test.ipynb' is the notebook for testing.


dxlabskku/Prediction_Reader_Satisfaction
