BertForPreTraining with NSP #6330

Closed
choidongyeon opened this issue Aug 7, 2020 · 4 comments

@choidongyeon
Contributor

❓ Questions & Help

Details

I am trying to train BERT from scratch following a modification of https://huggingface.co/blog/how-to-train, where I use a BertTokenizer and BertForPreTraining. The documentation for BertForPreTraining states that it has two heads on top, one for each pre-training objective (MLM and NSP), but the example provided covers only MLM.

Based on a comment by @LysandreJik in a previous issue, it seems that none of the provided datasets (e.g. LineByLineTextDataset) handle the NSP objective, and that this objective was left out because the RoBERTa paper found that NSP was not particularly helpful.

@LysandreJik additionally noted that anyone who wants to implement the NSP objective can do so by changing the dataset/training loop, and I was wondering if there were any plans to add support for NSP for the sake of completeness?

It seems that something similar to the ALBERT SOP work in PR #6168 could be done. Is this correct, and can anyone give me some guidance on moving forward?
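For readers following along, the dataset change discussed above can be sketched roughly as follows. This is only an illustration of NSP pair sampling in plain Python, not the code from PR #6168 or any library API: `make_nsp_pairs`, `documents`, and `rng` are hypothetical names, and the 50/50 split between real and random next sentences follows the original BERT recipe. The label convention (0 = B follows A, 1 = B is random) matches what BertForPreTraining documents for `next_sentence_label`.

```python
import random


def make_nsp_pairs(documents, rng=None):
    """Build (sentence_a, sentence_b, next_sentence_label) triples.

    documents: list of documents, each a list of sentence strings.
    Label 0: sentence_b really follows sentence_a in the same document.
    Label 1: sentence_b was drawn at random from a different document.
    """
    rng = rng or random.Random()
    pairs = []
    for doc_idx, doc in enumerate(documents):
        # Candidate documents to draw a "wrong" next sentence from.
        other_docs = [
            j for j in range(len(documents)) if j != doc_idx and documents[j]
        ]
        for i in range(len(doc) - 1):
            sentence_a = doc[i]
            # Swap in a random sentence about half the time, but only
            # when another document exists to sample from.
            if other_docs and rng.random() < 0.5:
                other = documents[rng.choice(other_docs)]
                sentence_b = rng.choice(other)
                label = 1
            else:
                sentence_b = doc[i + 1]
                label = 0
            pairs.append((sentence_a, sentence_b, label))
    return pairs
```

Each triple would still need to be tokenized as a sentence pair and have MLM masking applied before it is passed to the model.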

@LysandreJik
Member

Hi! Supporting the NSP objective is not on our roadmap, due to the reason you've linked and because of insufficient bandwidth.

However, similar to the work in #6168 for SOP, we're very open to contributions and would accept a PR adding the BERT NSP objective to the datacollators/datasets.
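As a rough idea of what such a dataset or data collator would have to emit per example, here is a minimal sketch in plain Python. It is an assumption-laden illustration, not the eventual contribution: `build_pair_features` and `max_len` are hypothetical names, and the tokens are kept as strings rather than vocabulary ids. The `[CLS] A [SEP] B [SEP]` layout and the 0/1 segment ids follow the standard BERT input format, and `token_type_ids` and `next_sentence_label` are named after the BertForPreTraining forward arguments.

```python
def build_pair_features(tokens_a, tokens_b, next_sentence_label, max_len=128):
    """Lay out one NSP example as [CLS] A [SEP] B [SEP] with segment ids."""
    if max_len < 3:
        raise ValueError("max_len must leave room for [CLS] and two [SEP]")
    # Reserve three positions for [CLS] and the two [SEP] tokens.
    budget = max_len - 3
    tokens_a, tokens_b = list(tokens_a), list(tokens_b)
    # Trim from the end of the longer side until the pair fits.
    while len(tokens_a) + len(tokens_b) > budget:
        longer = tokens_a if len(tokens_a) >= len(tokens_b) else tokens_b
        longer.pop()
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    # Segment 0 covers [CLS], A and the first [SEP]; segment 1 covers B
    # and the final [SEP].
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return {
        "tokens": tokens,
        "token_type_ids": token_type_ids,
        "next_sentence_label": next_sentence_label,
    }
```

A real collator would additionally convert tokens to ids, apply MLM masking to produce the MLM labels, and pad each batch.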

@choidongyeon
Contributor Author

Awesome, I've been working on something similar. Will open a PR, thanks!

@stale

stale bot commented Oct 10, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@Jetcodery

Jetcodery commented Feb 7, 2021

@choidongyeon May I ask whether the dataset work for use with the BertForPreTraining API is finished? Any example code like run_mlm.py (is there a run_mlm_nsp.py?) would help. Looking forward to your reply, thanks!

3 participants