
Added Standard VAE and Multinomial VAE models (DIFFERENT PREPROCESSING) #1194

Merged
merged 13 commits on Nov 5, 2020

Conversation

kmussalim
Contributor

Description

Added two notebooks with Standard VAE and Multinomial VAE to recommenders/examples/02_model_collaborative_filtering/

standard_vae_deep_dive
multinomial_vae_deep_dive
Also added corresponding utils for these models to recommenders/reco_utils/recommender/vae:

multinomial_vae - contains classes for the model
standard_vae - contains classes for the model
sparse_vae - contains a function for binarization and a class for obtaining the click matrix. This class is a modified version of the AffinityMatrix class (which can be found at recommenders/reco_utils/dataset/sparse.py). There are 3 additional lines of code, which can be found by searching for the comment "# LSECHANGE".
In these notebooks we use Variational Autoencoders for recommending the top-k items on the MovieLens-1M dataset.
Also, the model has an extension: annealing is used to find the optimal beta. The model with the optimal beta performs better than with a constant beta = 1.
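As a rough, hypothetical sketch of the kind of annealing schedule meant here (the exact schedule and parameter names in the notebooks may differ), beta is increased gradually during training, and the value that gives the best validation metric can then be reused as the optimal constant beta:

```python
def beta_schedule(step, total_anneal_steps=200000, anneal_cap=1.0):
    """Linearly increase the KL weight (beta) from 0 up to `anneal_cap`."""
    return min(anneal_cap, step / total_anneal_steps)

# beta at a few points during training
for step in (0, 50000, 200000, 400000):
    print(step, beta_schedule(step))
```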

DIFFERENCE FROM PREVIOUS PR:
In these notebooks we follow the authors' preprocessing procedure, i.e. we filter out ratings below 3.5. We then convert the positively rated items' ratings to 1s and unrated items to 0s in order to obtain the click matrix.

At first glance it may look like we lose information about users' preferences when we filter out low ratings, and that this may cost us model performance. However, this filtering guarantees that a movie rated below 3.5 by every user who watched it will not appear in the final click matrix, so the final click matrix only contains movies that are preferred by at least one user. If we did not apply this filter, the final click matrix would be even sparser.

Since we need the true preferences/ratings in the testing set in order to compute NDCG correctly, we found a way to recover the real ratings instead of using only "0" and "1": we keep the ratings below 3.5 in a separate dataframe and use an appropriate mapping to recover the lost preferences of the users in the test set.
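As a rough illustration of this preprocessing (not the utility code from reco_utils; the toy dataframe and column names below are assumed for the example), in pandas it could look like this:

```python
import pandas as pd

ratings = pd.DataFrame({
    "userID": [1, 1, 2, 2, 3],
    "itemID": [10, 20, 10, 30, 20],
    "rating": [5, 2, 4, 3, 4],
})

# Keep the low ratings separately so the true preferences can be recovered for NDCG.
low_ratings = ratings[ratings["rating"] < 3.5]

# Follow the paper's preprocessing: keep only the positive interactions and binarize them.
clicks = ratings[ratings["rating"] >= 3.5].copy()
clicks["rating"] = 1

# Click matrix: positively rated items become 1, everything else 0.
# Note that item 30 (never rated above 3.5) does not appear as a column at all.
click_matrix = clicks.pivot_table(index="userID", columns="itemID",
                                  values="rating", fill_value=0)
print(click_matrix)
```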

Checklist:

  • I have followed the contribution guidelines and code style for this project.
  • I have added tests covering my contributions.
  • I have updated the documentation accordingly.
  • This PR is being made to staging and not master.


@miguelgfierro
Collaborator

This PR is similar to #1191 and it looks like it has repeated code. Could you please merge the content and not repeat code?

About the other preprocessing, is this something that can be added into one notebook?

@kmussalim
Contributor Author

Miguel, this PR is indeed almost the same as PR #1191.

We could combine them into one notebook, but we believe the notebook would be too complicated.
Also, the models would not be comparable since each of them has to be trained on a different dataset.

As a result, we recommend keeping only one of the two different preprocessing approaches:

  1. PR #1191 - does not apply any filtering. To obtain the click matrix, it converts ratings above 3.5 to "1", and ratings below 3.5 as well as unrated items to "0".

  2. PR #1193 - follows the chosen paper's approach of filtering out ratings below 3.5. To obtain the click matrix, it converts the remaining ratings (i.e. 4 and 5) to "1" and everything else to "0".

@miguelgfierro
Collaborator

miguelgfierro commented Aug 27, 2020

@kmussalim about:

PR #1193 - follows the chosen paper's approach of filtering out ratings below 3.5. To obtain the click matrix, it converts the remaining ratings (i.e. 4 and 5) to "1" and everything else to "0".

I don't understand this. If you filter out ratings below 3.5, then you only have ratings 3.5, 4, 4.5 and 5. Then those that are 3.5 are set to 0 and the rest to 1? Is that what you are doing?

In the paper they say the following: "We binarize the explicit data by keeping ratings of four or higher and interpret them as implicit feedback. We only keep users who have watched at least five movies."

Another question is, why do you need two different PRs if the only difference is the binarization? Is there any other difference?

Finally, I just noticed that this PR is going to master, instead of staging

@kmussalim
Contributor Author

I don't understand this. If you filter out ratings below 3.5, then you only have ratings 3.5, 4, 4.5 and 5. Then those that are 3.5 are set to 0 and the rest to 1? Is that what you are doing?
RESPONSE: Let us explain more clearly what we are doing. We apply the filtering at the beginning in order to remove the movies that do not have any positive review (rating > 3.5). So, for example, if all the users rate a movie under 3.5, this movie will not exist in the final click matrix (we don't need it for recommendation if no one likes it). We do this because of sparsity (it yields a less sparse click matrix). The user-to-item interactions in the training set determine the movies that will exist in the final click matrix. As a consequence, using the user-to-item interactions of the training users, we set the movies that a user has rated > 3.5 to 1 and the rest of them, i.e. the movies that the user has not rated yet, to 0.
In the paper they say the following: "We binarize the explicit data by keeping ratings of four or higher and interpret them as implicit feedback. We only keep users who have watched at least five movies."
RESPONSE: In this PR we follow the authors' process described above. However, in order to calculate NDCG correctly, we found a way to restore the ratings for the data used for evaluation.

Another question is, why do you need two different PRs if the only difference is the binarization? is there any other difference?
RESPONSE: The difference is indeed only in the binarization process, and we are not sure which one is better for the repo. Once you decide which binarization is more suitable for the repo, we can close one of the pull requests.

Two options of binarization:
Option 1 (PR #1194): follow the paper's approach, i.e. keep ratings above 3.5, interpret them as implicit feedback, and restore the ratings for the evaluation part.
Option 2 (PR #1191): do not follow the paper's approach, i.e. keep all ratings, assigning "1" to ratings above 3.5 and "0" to ratings below 3.5 and to unrated items.

Finally, I just noticed that this PR is going to master, instead of staging.
RESPONSE: Is there any way to change it from master to staging? Sorry for that, my fault.

@gramhagen gramhagen changed the base branch from master to staging August 28, 2020 13:44
@gramhagen
Collaborator

updated base to staging

@miguelgfierro
Collaborator

Two options of binarization:
Option 1 (PR #1194): follow the paper's approach, i.e. keep ratings above 3.5, interpret them as implicit feedback, and restore the ratings for the evaluation part.
Option 2 (PR #1191): do not follow the paper's approach, i.e. keep all ratings, assigning "1" to ratings above 3.5 and "0" to ratings below 3.5 and to unrated items.

I don't see how they are different. What would the output be in both cases for these 2 users:

ui = np.array([[1, 5, 4, 0], [0, 0, 2, 5]])
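For what it is worth, here is a minimal numpy sketch (not code from either PR, and assuming 3.5 as the positive threshold) of how the two options could differ on such a matrix: the per-user binarization is the same, but under Option 1 items that nobody rated positively would additionally be dropped from the click matrix.

```python
import numpy as np

ui = np.array([[1, 5, 4, 0], [0, 0, 2, 5]])

# Both options: ratings above 3.5 become 1, everything else (including unrated) becomes 0.
clicks = (ui > 3.5).astype(int)
print(clicks)
# [[0 1 1 0]
#  [0 0 0 1]]

# Option 1 additionally removes items (columns) that nobody rated positively,
# so the first item would not appear in the final click matrix at all.
keep_items = clicks.any(axis=0)
print(clicks[:, keep_items])
# [[1 1 0]
#  [0 0 1]]
```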

@EvgeniaChroni
Contributor

This PR is similar to #1191 and it looks like it has repeated code. Could you please merge the content and not repeat code?

About the other preprocessing, is this something that can be added into one notebook?

In the 2nd PR, and specifically in the data filtering part, we have to make sure that:

  1. user-to-movie interactions with rating <= 3.5 are filtered out. By applying this filter we make sure that if a movie is rated below 3.5 by all the users who watched it, it will not be contained in the final click matrix. If we do not apply this filter, the final click matrix will be even sparser.
  2. the users who clicked fewer than 5 movies are filtered out.
  3. the movies which are not clicked by any user are filtered out.

In the 1st PR we did the same filtering without step 1). As a result, the train_data may contain columns with only zeros. Another drawback is that the train_data may contain movies that are not preferred by anybody, so without filter 1) it is possible that the model recommends movies that no one prefers. The final NDCG metric on the test set yields worse results for the 1st PR in comparison to the 2nd PR.
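A small, hypothetical pandas sketch of filters 2) and 3) above (filter 1) is the rating filter sketched in the PR description; the column names here are assumed, not taken from the repo utilities):

```python
import pandas as pd

def filter_users_and_items(clicks: pd.DataFrame, min_user_clicks: int = 5) -> pd.DataFrame:
    # 2) drop users who are left with fewer than `min_user_clicks` interactions
    user_counts = clicks.groupby("userID")["itemID"].transform("count")
    clicks = clicks[user_counts >= min_user_clicks]
    # 3) items with no remaining interactions never show up as columns of the
    #    click matrix, so they are effectively filtered out as well
    return clicks
```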

So, if you also agree, we recommend keeping the 2nd PR as the final choice of code and proceeding with the changes and corrections on this code.

@miguelgfierro
Collaborator

user-to-movie interactions with rating <= 3.5 are filtered out. By applying this filter we make sure that if a movie is rated below 3.5 by all the users who watched it, it will not be contained in the final click matrix. If we do not apply this filter, the final click matrix will be even sparser.

I think a clean way to approach this is to leave PR 2 with the option of filtering out nothing, so instead of 3.5, you would add None. Would that be possible?
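A tiny sketch of what that option could look like (a hypothetical helper, not existing repo code): passing None for the threshold would keep all interactions.

```python
from typing import Optional

import pandas as pd

def filter_low_ratings(ratings: pd.DataFrame, threshold: Optional[float] = 3.5) -> pd.DataFrame:
    """Drop interactions rated below `threshold`; keep everything if threshold is None."""
    if threshold is None:
        return ratings
    return ratings[ratings["rating"] >= threshold]
```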

@anargyri
Collaborator

user-to-movie interactions with rating <= 3.5 are filtered out. By applying this filter we make sure that if a movie is rated below 3.5 by all the users who watched it, it will not be contained in the final click matrix. If we do not apply this filter, the final click matrix will be even sparser.

I think a clean way to approach this is to leave PR 2 with the option of filtering out nothing, so instead of 3.5, you would add None. Would that be possible?

We also checked the code and it looks like the low ratings are used inside the validation and test sets (_te). But this is not what the paper by Liang et al. is doing. In 4.1 they say that they remove the low ratings from Movielens, so they are not used at all. As they say, they treat the data as implicit feedback i.e. there are only positive labels in the data set (e.g. they can be interpreted as clicks).

@miguelgfierro
Collaborator

Hi, I hope you are OK. Is there any progress on this project?

@miguelgfierro
Collaborator

miguelgfierro left a comment

awesome work guys! this is top 👏👏👏👏

@miguelgfierro
Collaborator

For people who have made significant contributions to the repo, we ask that they be added to the author list:

If you are ok with it, please add your names here: https://github.com/microsoft/recommenders/blob/master/AUTHORS.md

Also, would you please add the description of the algos in the main page: https://github.com/microsoft/recommenders/blob/master/README.md#algorithms and in the algo page: https://github.com/microsoft/recommenders/tree/master/examples/02_model_collaborative_filtering
