
Example how to pretrain lm + introduction of config_name #57

Open · wants to merge 3 commits into base: master

Conversation

PiotrCzapla
Member

I've added the ability to limit the training set so we can use a test configuration `multifit_mini_test` that executes in ~20 seconds to verify that the scripts are working.
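The idea of capping the training set for a fast smoke test can be sketched as follows (a minimal illustration with a hypothetical helper name; the PR's actual implementation may differ):

```python
# Hypothetical sketch: a "mini test" config caps the number of training
# examples so the whole pipeline runs in seconds instead of hours.
def limit_dataset(texts, limit=None):
    """Return at most `limit` examples; `None` means use the full set."""
    return texts if limit is None else texts[:limit]
```

A smoke-test config would pass a small `limit`, while the full training config leaves it as `None`.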

Why config_name?

I've added it so we know which training parameters to load for the finetune-lm and classifier steps. These parameters aren't stored along with the language model; only the parameters used to build the model are saved.
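The mechanism can be sketched as a registry of named training configs (hypothetical names and values below; only `multifit_paper_version` and `multifit_mini_test` appear in this PR, and the actual parameter values are assumptions):

```python
# Hypothetical sketch: training parameters keyed by config_name, so the
# finetune-lm and classifier stages can recover settings that are NOT
# saved alongside the pretrained language model itself.
CONFIGS = {
    "multifit_paper_version": {"bs": 64, "bptt": 70, "limit": None},
    "multifit_mini_test": {"bs": 4, "bptt": 10, "limit": 100},  # fast smoke test
}

def load_train_config(config_name):
    """Look up training parameters by config_name."""
    return CONFIGS[config_name]
```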

```diff
@@ -280,7 +288,7 @@ def train_(self, dataset_or_path, tokenizer=None, **train_config):
         print("Language model saved to", self.experiment_path)

     def validate(self):
-        raise NotImplementedError("The validation on the language model is not implemented.")
+        return "not implemented"
```
Collaborator

Do we really just want to return a string here?
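For context, the two behaviors under discussion can be sketched side by side (standalone toy classes, not the project's actual class):

```python
class ReturnsSentinel:  # hypothetical stand-in for the class in the diff
    def validate(self):
        # As in the diff: returns a sentinel string the caller must check,
        # which can silently be mistaken for a real result.
        return "not implemented"

class FailsLoudly:  # hypothetical alternative
    def validate(self):
        # Raising makes any accidental call fail immediately and visibly.
        raise NotImplementedError(
            "Validation on the language model is not implemented.")
```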

From the command line:
```
$ bash prepare_wiki.sh de
$ python -W ignore -m multifit new multifit_paper_version replace_ --name my_lm - train_ --pretrain-dataset data/wiki/de-100
```
Collaborator

@sebastianruder sebastianruder Nov 17, 2019


Looks like there's a superfluous space between - and train-. Why do we use train_ here? What is the difference between train_ and train?

Collaborator

@sebastianruder sebastianruder left a comment


Hi Piotr, thanks for adding this. Looks good in general. I've added a few comments about minor things. In general, do you think it'd be possible to add a few short docstrings to explain things like bs, bptt, limit in load_lm_databunch for people not familiar with the library?
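As a concrete illustration of the requested docstrings, a sketch assuming the conventional AWD-LSTM meanings of these parameters (the signature and defaults below are assumptions, not the project's actual code):

```python
def load_lm_databunch(path, bs=64, bptt=70, limit=None):
    """Load a language-model databunch (hypothetical signature).

    Args:
        path: directory containing the preprocessed corpus.
        bs: batch size, i.e. how many sequences are processed in parallel.
        bptt: back-propagation-through-time window, i.e. how many tokens
            the model sees per step before the gradient is truncated.
        limit: if set, keep only the first `limit` examples; used by the
            `multifit_mini_test` config to make a fast smoke test.
    """
    ...  # actual loading logic lives in the library
```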
