Commit

Update README.md
jhyuklee committed Apr 11, 2019
1 parent c02d9a8 commit f3d4399
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -5,9 +5,9 @@ This repository provides pre-trained weights of BioBERT, a language representati
## Downloading pre-trained weights
Go to the [releases](https://github.com/naver/biobert-pretrained/releases) section of this repository and download the pre-trained weights of BioBERT. We provide three combinations of pre-trained weights: BioBERT (+ PubMed), BioBERT (+ PMC), and BioBERT (+ PubMed + PMC). Pre-training was based on the [original BERT code](https://github.com/google-research/bert) provided by Google, and training details are described in our paper. The currently available versions of the pre-trained weights are as follows:

-* **BioBERT v1.0 (+ PubMed 200K)** - based on BERT-base (same vocabulary)
-* **BioBERT v1.0 (+ PMC 270K)** - based on BERT-base (same vocabulary)
-* **BioBERT v1.0 (+ PubMed 200K + PMC 270K)** - based on BERT-base (same vocabulary)
+* **BioBERT v1.0 (+ PubMed 200K)** - based on BERT-base-Cased (same vocabulary)
+* **BioBERT v1.0 (+ PMC 270K)** - based on BERT-base-Cased (same vocabulary)
+* **BioBERT v1.0 (+ PubMed 200K + PMC 270K)** - based on BERT-base-Cased (same vocabulary)

Make sure to specify the version of the pre-trained weights used in your work. Note that because we use the WordPiece vocabulary (`vocab.txt`) provided by Google, any new words in the biomedical corpus can be represented with subwords (for instance, Leukemia => Leu + ##ke + ##mia). Building a new subword vocabulary for BioBERT could lose compatibility with the original pre-trained BERT. More details are in the closed [issue #1](https://github.com/naver/biobert-pretrained/issues/1).
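
To illustrate how an unseen biomedical word can be covered by existing subwords, here is a minimal sketch of greedy longest-match-first WordPiece splitting. It is not the tokenizer shipped with BERT. The function name `wordpiece_tokenize` and the tiny `toy_vocab` are assumptions made for illustration; the real `vocab.txt` contains tens of thousands of entries.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup.

    Illustrative sketch only; `vocab` is any set of WordPiece strings.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink it.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that continue a word carry the "##" prefix.
                candidate = "##" + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary, chosen to reproduce the README's example.
toy_vocab = {"Leu", "##ke", "##mia", "##k", "##e"}

print(wordpiece_tokenize("Leukemia", toy_vocab))
# ['Leu', '##ke', '##mia']
```

Because the split relies only on pieces already in the vocabulary, the pre-trained embedding table can be reused unchanged, which is the compatibility point made above.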
