Load BioBERT weights #135
Comments
Documenting how I finally got this to work:
Where
model = BertForTokenClassification.from_pretrained('path/to/biobert.gz', num_labels=num_labels)
self.tokenizer = BertTokenizer.from_pretrained('path/to/biobert.gz', do_lower_case=False)
Hi, we've updated all the other BioBERT weights (v1.0) to the same format as v1.1, so it should work now.
That’s great, thanks for letting me know. Is there any reason to use v1.0 if I just want the best performance possible? Or should I stick with v1.1?
For most tasks, it will be better to stick with v1.1, but v1.0 (+PubMed 200K +PMC 270K) works well too, as shown in the paper (only minor differences). Note that we haven't updated our paper with the performance of v1.1 (it will take some time).
Right. Okay great, thanks for the response!
Thanks for sharing what worked for you. I followed the steps provided and everything worked, except that I discovered (as of writing) that the files can't be inside a directory when they are compressed together; they have to sit flat at the top level of the archive.
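For anyone hitting the same thing, here is a minimal sketch of building a flat archive with Python's standard `tarfile` module. The helper names `make_flat_archive` and `archive_is_flat` are made up for illustration, and which files the loader expects inside the archive is not covered here; the sketch only shows how to keep every member at the top level.

```python
import os
import tarfile


def make_flat_archive(file_paths, archive_path):
    """Write a gzip-compressed tar whose members all sit at the archive root."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in file_paths:
            # arcname drops the directory part, so the member is stored flat.
            tar.add(path, arcname=os.path.basename(path))
    return archive_path


def archive_is_flat(archive_path):
    """Return True if no member name contains a directory component."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return all("/" not in name for name in tar.getnames())
```

Calling `archive_is_flat` on an existing archive is a quick way to check whether the directory layout is the problem before trying to load it.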
Hi @jhyuklee, the downloaded files for BioBERT-Base v1.1 (+ PubMed 1M) do not contain a .ckpt file. The archive has model.ckpt-1000000.data-00000-of-00001, model.ckpt-1000000.index, and model.ckpt-1000000.meta. Which one is the actual checkpoint file? I get this error: DataLossError: Unable to open table file biobert_v1.1_pubmed/model.ckpt-1000000.data-00000-of-00001: Data loss: not an sstable (bad magic number): perhaps your file is in a different file format and you need to use a different restore operator? I need to port BioBERT to PyTorch to be able to compare it with other SOTA/research models.
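A note on the question above: in TensorFlow's checkpoint format, none of the three files is "the" checkpoint on its own. The `.index`, `.data-*`, and `.meta` files share a prefix (here `model.ckpt-1000000`), and that prefix is what gets passed as the checkpoint path. Pointing at the `.data-00000-of-00001` file directly is a likely cause of the "not an sstable" error. Below is a small sketch that recovers the prefix from a directory listing; `find_checkpoint_prefixes` is a hypothetical helper name, not part of TensorFlow.

```python
import os


def find_checkpoint_prefixes(checkpoint_dir):
    """Return checkpoint prefixes in a directory, e.g. '<dir>/model.ckpt-1000000'."""
    prefixes = []
    for name in sorted(os.listdir(checkpoint_dir)):
        # Each checkpoint has exactly one '<prefix>.index' file next to its data shards.
        if name.endswith(".index"):
            prefixes.append(os.path.join(checkpoint_dir, name[: -len(".index")]))
    return prefixes
```

For the download described above, this should yield `biobert_v1.1_pubmed/model.ckpt-1000000`, which is the value to hand to whatever conversion or restore step expects a checkpoint path.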
Hi @JohnGiorgi, I tried the steps you mentioned, but I got this error while importing the gz file: any hints?
Are you simply looking to load BioBERT with HF Transformers? If so, you can follow this code: https://huggingface.co/monologg/biobert_v1.1_pubmed. If you search BioBERT here you can see several variants and how to load them.
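As a rough sketch of what that looks like, assuming the `transformers` package is installed and the model name from the URL above is still hosted on the Hub (`load_biobert` is a hypothetical wrapper name, not a library function):

```python
MODEL_ID = "monologg/biobert_v1.1_pubmed"  # model name taken from the URL above


def load_biobert(model_id=MODEL_ID):
    """Fetch (or read from the local cache) the tokenizer and encoder for model_id."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModel.from_pretrained(model_id)
    return tokenizer, model
```

Usage would be `tokenizer, model = load_biobert()`. For a token-classification head like the one in the original snippet, `AutoModelForTokenClassification.from_pretrained(model_id, num_labels=num_labels)` is the corresponding class in `transformers`.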
Figure out how to load BioBERT's weights.
See these links for help.