
Evaluate Hydra for composing config and command line arguments #807

Closed
sai-prasanna opened this issue Feb 9, 2020 · 35 comments
Labels
discussion (In a discussion stage) · feature (Is an improvement or enhancement) · help wanted (Open to be worked on) · won't fix (This will not be worked on)
Comments

@sai-prasanna commented Feb 9, 2020

🚀 Feature

We can evaluate Hydra for configuration.

Hydra is an open-source Python framework that simplifies the development of research and other complex applications. The key feature is the ability to dynamically create a hierarchical configuration by composition and override it through config files and the command line. The name Hydra comes from its ability to run multiple similar jobs - much like a Hydra with multiple heads.

Motivation

The PyTorch ImageNet training example illustrates the problem: despite being a minimal example, the number of command-line flags is already high. Some of these flags logically describe the same component and should ideally be grouped (for example, flags related to distributed training), but there is no easy way to group those flags together and consume them as a group.

Hydra provides config file composition with overrides, command-line completion, and parameter sweeps.

https://medium.com/pytorch/hydra-a-fresh-look-at-configuration-for-machine-learning-projects-50583186b710
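For concreteness, a rough sketch of the grouping problem (the flag names are taken from the ImageNet example; setup_distributed is a hypothetical consumer, not code from that repo):

from argparse import ArgumentParser

from omegaconf import OmegaConf


def setup_distributed(world_size: int, rank: int, dist_backend: str) -> None:
    ...  # hypothetical consumer that wants the whole group at once


# Flat argparse flags, as in the ImageNet example: everything lives in one namespace.
parser = ArgumentParser()
parser.add_argument("--world-size", type=int, default=1)
parser.add_argument("--rank", type=int, default=0)
parser.add_argument("--dist-backend", default="nccl")
args = parser.parse_args([])
setup_distributed(args.world_size, args.rank, args.dist_backend)

# With a composed config, the same options form a group that can be consumed as one object.
cfg = OmegaConf.create({
    "distributed": {"world_size": 1, "rank": 0, "dist_backend": "nccl"},
})
setup_distributed(**cfg.distributed)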

Pitch

Evaluate whether Hydra is more powerful than argparse, and check that its overhead doesn't make things too complex.

@sai-prasanna sai-prasanna added the "feature" and "help wanted" labels Feb 9, 2020
@hadim (Contributor) commented Feb 9, 2020

In lightning, hparams needs to be flat, so it cannot contain a dict, which can be problematic in certain cases. For example, I would like to define optimizer parameters within hparams['optimizer'].

Using hydra could probably solve this issue.
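For illustration, this is the kind of nesting I would like to be able to use (a rough sketch; the model and keys are made up):

import torch
import pytorch_lightning as pl

hparams = {
    "lr": 1e-3,          # flat params work today
    "optimizer": {       # nested group that flat hparams cannot hold
        "name": "adam",
        "weight_decay": 1e-5,
    },
}


class LitModel(pl.LightningModule):
    def __init__(self, hparams):
        super().__init__()
        self.hparams = hparams
        self.layer = torch.nn.Linear(32, 2)

    def forward(self, x):
        return self.layer(x)

    def configure_optimizers(self):
        opt_cfg = self.hparams["optimizer"]
        return torch.optim.Adam(self.parameters(),
                                lr=self.hparams["lr"],
                                weight_decay=opt_cfg["weight_decay"])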

@williamFalcon (Contributor) commented Feb 11, 2020

@hadim @sai-prasanna this sounds awesome. I'd love to support hydra. Mind looking into it and submitting a proposal here?

cc: @Borda

@Borda Borda mentioned this issue Feb 18, 2020
@tchaton (Contributor) commented Feb 18, 2020

My current ML project is using Hydra. It is pretty awesome!

Have a look: https://github.com/nicolas-chaulet/deeppointcloud-benchmarks

@Borda (Member) commented Feb 19, 2020

I think that having parameters organised in groups would make them easier to navigate; on the other hand, passing a dict is not very safe...
What about having classes for these kinds of parameter groups, which would ensure default values and resolve missing params, as is now handled by defaults...

from abc import ABC


class ParameterGroup(ABC):
    pass


class OptimizerParams(ParameterGroup):
    def __init__(self, param1: int = 123):  # ..., further params with defaults
        self.param1 = param1
        ...

@PyTorchLightning/core-contributors ^^

@hadim (Contributor) commented Feb 19, 2020

I like the idea of ParameterGroup (or using the Hydra API would be OK too).

ParameterGroup should also have a way to serialize itself to common formats such as JSON/YAML/TOML.

That being said, I think it's important to keep allowing the dict type for hparams in simpler scenarios. We could simply convert hparams to a ParameterGroup whenever a dict is detected during init. We should also be careful when a nested dict is detected: is it a new ParameterGroup or just a parameter with a dict type?
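Sketching what I mean (a rough sketch; method names like to_dict/from_dict are hypothetical):

import json
from abc import ABC

import yaml


class ParameterGroup(ABC):
    def to_dict(self) -> dict:
        # recursively unwrap nested groups so the result is plain built-in types
        return {k: (v.to_dict() if isinstance(v, ParameterGroup) else v)
                for k, v in vars(self).items()}

    def to_json(self) -> str:
        return json.dumps(self.to_dict())

    def to_yaml(self) -> str:
        return yaml.safe_dump(self.to_dict())

    @classmethod
    def from_dict(cls, d: dict) -> "ParameterGroup":
        # wrap a plain dict detected in hparams during init
        group = cls.__new__(cls)
        group.__dict__.update(d)
        return group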

@Borda (Member) commented Feb 20, 2020

YAML is great, and compared to JSON you can write comments inside... lol @williamFalcon ^^ ?

@sai-prasanna (Author)

I will look into the current configuration flow and try to evaluate the pros and cons of Hydra over it this weekend.

@mikerossgithub
Not a current lightning user, but our team is evaluating switching over.

When you look at Hydra, you might want to take a look at jsonnet too. From what I can tell, Hydra has pretty mediocre support for variables compared to jsonnet, where you can cleanly define variables and do simple math and boolean logic on them. Jsonnet also supports composing multiple configs, just like Hydra, and it can provide command-line overrides via something like jsonargparse.

Anyway, it's useful for allennlp, so it might be worth taking a look to see if it is the right thing for pytorch-lightning.

@hadim (Contributor) commented Feb 22, 2020

I looked at Hydra, and it is pretty high level and also opinionated in the way it handles configuration.

That being said, the configuration object it uses is based on omegaconf, which looks really, really nice (similar to what @Borda proposes).

So I would propose to use/support omegaconf. Hydra can then be used by anyone who wants it; there is no need to integrate it into PL, IMO.
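For anyone unfamiliar with it, a quick sketch of using omegaconf on its own (the values here are made up):

from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "optimizer": {"name": "adam", "lr": 1e-3},
    "model": {"hidden_size": 256},
})

print(cfg.optimizer.lr)                           # attribute-style access on nested groups
cfg = OmegaConf.merge(cfg, OmegaConf.from_cli())  # command line overrides, e.g. optimizer.lr=3e-4
print(cfg.pretty())                               # serialize back to YAML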

@sai-prasanna (Author)

I am an AllenNLP contributor. AllenNLP provides abstractions to make models generic, i.e. if you have a text classifier, instead of putting an LSTM inside it, you would receive an abstract Seq2SeqEncoder via the constructor and use its interface. LSTM would be a subclass of the abstract Seq2SeqEncoder class, registered under a string name. This allows one to use configuration to control which model is actually built. It relies primarily on constructor dependency injection. This is an opinionated way of doing things; IMO it won't suit pytorch lightning's ideology of providing training boilerplate and leaving the rest to the implementer (who can choose to do dependency injection or not).

I agree with @hadim's assessment that being too opinionated is the wrong choice for lightning. Instead it would be better to remove impediments that keep users from using their own config system, while perhaps providing a sane default as an example.

I have two questions for regular users of lightning.

  1. What flaws do people see in the current configuration flow in lightning, and what currently prevents users from using their configuration file/argparse system of choice in the current design?

  2. Does lightning have to concern itself with stuff like hyperparameter search?

@williamFalcon (Contributor)

  1. Ideally lightning should support any hyperparameter configuration/search library. Would welcome a PR to make that happen.

  2. @sai-prasanna you can pass models into the constructor in lightning right now, but why wouldn't you just import the model in the LightningModule and init it in the constructor?

from allennlp import lstm  # illustrative import

import pytorch_lightning as pl


class MyS2S(pl.LightningModule):
    def __init__(self, hparams):
        super().__init__()
        self.encoder = lstm()
        ...

@sai-prasanna (Author)

@williamFalcon In allennlp it is done that way so the registry can be searched using the type signatures of the constructor arguments.

@Model.register("seq2seq")
class Seq2Seq(Model):
    def __init__(self, encoder: Seq2SeqEncoder):
        ...


@Seq2SeqEncoder.register("lstm")
class LSTM(Seq2SeqEncoder):
    def __init__(self, h: int):
        ...

Now in a configuration

{
  "model": {
    "type": "seq2seq",
    "encoder": {
      "type": "lstm",
      "h": 1000
    }
  }
}

This allows multiple experiments to be run through configuration changes alone, without having to write a mapping between the configuration and the actual class.
Anyway, I would prefer that PL stay agnostic to all this, as it is now.

@williamFalcon I used PL a few months back, so I was a bit hazy. Since model creation is explicit in the current workflow, there is no need for anything more. Anyone wishing to use Hydra can use it instead of argparse to construct models easily. We could add an example of using Hydra for some complex configuration case instead of argparse and be done with it.
Would that do?

@sai-prasanna (Author)

In AllenNLP there is now an effort to get pytorch-lightning working as its trainer abstraction.

allenai/allennlp#3860

I think AllenNLP's jsonnet configuration works well for it. It would be interesting to see how this synthesis goes. AllenNLP, though focused on NLP, has very good dependency injection and configuration management abstractions that make it easy to write models that are amenable to experimentation via simple configuration changes.

With this integration, one could use AllenNLP's powerful config system for many different types of tasks (maybe with a few changes here and there).

@S-aiueo32 (Contributor)

Hi all!
The hydra support is ready after merging #1152!
I plan to send a PR containing the following changes:

  1. LightningModule explicitly accepts DictConfig as the hparams
  2. log_hyperparams converts the hparams to a built-in dict using OmegaConf.to_container(self.hparams, resolve=True) (see the sketch below)

Please correct me if I'm wrong, and suggest any other features to add!
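For point 2, a minimal sketch of the conversion I mean (the helper name is hypothetical):

from omegaconf import DictConfig, OmegaConf


def hparams_to_dict(hparams):
    # loggers expect a built-in dict, so resolve and convert a DictConfig first
    if isinstance(hparams, DictConfig):
        return OmegaConf.to_container(hparams, resolve=True)
    return dict(hparams)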

@lkhphuc commented Apr 4, 2020

Yeah, the way I'm currently using Hydra is:

  • In config.yaml
# Put all other args here
batch_size: 128
path: my_data/train

# can organize as needed
model:
  pretrained: albert-based-uncased
  num_layers: 2
  dropout: 0.5

optim:
  lr: 3e-4
  scheduler: CosineLR

# Group all trainer args under this
trainer:
  gpus: 1
  val_percent_check: 0.1
  # ... put all trainer args here ...

and then in training

class MyModule(LightningModule):
    def __init__(self, hparams: DictConfig):
        super().__init__()
        self.hparams = hparams
        self.model = BaseModel(version=self.hparams.model.version, layers=self.hparams.model.layer)

    def configure_optimizers(self):
        return Adam(self.parameters(), lr=self.hparams.optim.lr)

# ...
module = MyModule(hparams=cfg)
trainer = Trainer(**OmegaConf.to_container(cfg.trainer, resolve=True))
trainer.fit(module)

@Borda Borda added this to the 0.8.0 milestone Apr 4, 2020
@Borda Borda added the "discussion" label Apr 4, 2020
@yukw777 (Contributor) commented Apr 26, 2020

I've tried a few different approaches to using Hydra with PL, and I believe this is the cleanest:

  1. https://github.com/yukw777/leela-zero-pytorch/blob/d90d5fc93e86647638a59a7957c6c930ec4268fe/leela_zero_pytorch/train.py#L15-L47 (note the use of strict=False)
  2. https://github.com/yukw777/leela-zero-pytorch/tree/master/leela_zero_pytorch/conf

Example command:

python -m leela_zero_pytorch.train network=huge train.dataset.dir_path=some/dataset/train pl_trainer.gpus=-1

Pros:

  1. We don't need substantial changes on the PL side to integrate Hydra.
  2. You can specify all the PL related parameters in the config files or in the command line.

Cons:

  1. Not all the PL options are displayed in the help message.
  2. You do have to know a bit more about hydra (I don't think this is a big problem.)
  3. Some custom parsing logic and validation (e.g. --gpus) have to be ported.
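In sketch form, the shape of that approach looks roughly like this (paths, config keys, and the instantiate call are illustrative, not copied from the linked repo):

import hydra
import pytorch_lightning as pl
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="conf/config.yaml", strict=False)
def main(cfg: DictConfig):
    # hypothetical: build the chosen network from its config group
    network = hydra.utils.instantiate(cfg.network)
    # pass the pl_trainer group straight into the Trainer as keyword arguments
    trainer = pl.Trainer(**OmegaConf.to_container(cfg.pl_trainer, resolve=True))
    trainer.fit(network)


if __name__ == "__main__":
    main()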

@yukw777 (Contributor) commented May 6, 2020

I tried mixing argparse with hydra: https://github.com/yukw777/leela-zero-pytorch/blob/983d9568ed34ed06ebde47ecb65b1e3b2d3a37c0/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.

@Borda Borda added this to the 0.8.0 milestone May 26, 2020
@Borda Borda modified the milestones: 0.8.0, 0.9.0 Jun 9, 2020
@celsofranssa commented Jun 22, 2020

Friends,

I did something like @lkhphuc:

  • config.yaml
#configs/config.yaml
dataset:
  train_path: "resources/dataset/train.jsonl"
  test_path: "resources/dataset/train.jsonl"
  val_path: "resources/dataset/train.jsonl"

train:
  batch_size: 32
test:
  batch_size: 64
val:
  batch_size: 32

preprocessing:
  max_length: 64
  • entry point
import hydra
from omegaconf import DictConfig
from pytorch_lightning import Trainer


@hydra.main(config_path="configs/config.yaml", strict=False)
def dev_run(cfg: DictConfig):

    print(cfg.pretty())

    model = JointEncoder(hparams=cfg)  # plus code_encoder, docstring_encoder, ...
    trainer = Trainer(fast_dev_run=True)
    trainer.fit(model)


if __name__ == "__main__":
    dev_run()
  • LightningModule
class JointEncoder(LightningModule):
    """Encodes the code and docstring into an same space of embeddings."""

    def __init__(self,
                 hparams: DictConfig,
                 code_encoder,
                 docstring_encoder
                 ):
        super(JointEncoder, self).__init__()
        self.hparams = hparams
        ...
        self.loss_fn = NPairsLoss()

But I'm getting the following error:

ValueError: Unsupported config type of <class 'omegaconf.dictconfig.DictConfig'>.

@williamFalcon (Contributor)

I tried mixing argparse with hydra: https://github.com/yukw777/leela-zero-pytorch/blob/983d9568ed34ed06ebde47ecb65b1e3b2d3a37c0/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.

awesome this is great! mind adding a tutorial to the docs on this? (under hyperparameters)

@williamFalcon (Contributor)

(quoting @lkhphuc's config.yaml and LightningModule example from above)

the problem here is that you always have to list out trainer args.

args = Trainer.add_argparse_args(args)

the above will expose all Trainer arguments through argparse
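i.e. roughly this pattern (a sketch using an ArgumentParser; the --batch_size flag is just an example):

from argparse import ArgumentParser

import pytorch_lightning as pl

parser = ArgumentParser()
parser.add_argument("--batch_size", type=int, default=128)

# pulls in every Trainer flag (--gpus, --max_epochs, ...) without listing them by hand
parser = pl.Trainer.add_argparse_args(parser)
args = parser.parse_args()

trainer = pl.Trainer.from_argparse_args(args)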

@qiuhuaqi commented Aug 3, 2020

(quoting @celsofranssa's config.yaml and JointEncoder example from above, including the ValueError about DictConfig)

@ceceu self.hparams = hparams works with omegaconf==2.0.0; you need to upgrade hydra-core as well. (#2197 (comment))

@edenlightning edenlightning modified the milestones: 0.9.0, 0.9.x Aug 18, 2020
@edenlightning edenlightning modified the milestones: 0.9.x, 1.1 Sep 1, 2020
@vict0rsch
I tried mixing argparse with hydra: yukw777/leela-zero-pytorch@983d956/leela_zero_pytorch/train.py#L17-L52

This way Hydra doesn't own the logging directory structure.

(just passing by)

Note: strict=False is deprecated in Hydra 1.0 :(

@yukw777 (Contributor) commented Sep 6, 2020

@vict0rsch thanks for passing by! I actually no longer use that approach, as it is no longer the recommended one. Please check out my blog post for more details! https://link.medium.com/KaROmhBvz9

stale bot commented Oct 21, 2020

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

@stale stale bot added the "won't fix" label Oct 21, 2020
@stale stale bot closed this as completed Oct 28, 2020
@turian (Contributor) commented Dec 15, 2020

I would like to propose that this issue be re-opened. Facebook Fairseq has now standardized on Hydra.

@yukw777 (Contributor) commented Dec 15, 2020

@turian The conclusion was that PL and Hydra are orthogonal, and there isn't really a good reason for PL to integrate Hydra directly as its default config/CLI system. Hydra can be viewed as a fancier argparse and can be combined with PL as such. I've been using Hydra that way with my PL projects and it's worked very well, which you can read about in my blog posts (https://towardsdatascience.com/training-neural-networks-for-leela-zero-using-pytorch-and-pytorch-lightning-bbf588683065, https://towardsdatascience.com/keeping-up-with-pytorch-lightning-and-hydra-31e8ed70b2cc) and the project repo (https://github.com/yukw777/leela-zero-pytorch). There are also other seed projects you can use to integrate Hydra with PL.

@turian (Contributor) commented Dec 16, 2020

@yukw777 My interest isn't necessarily that PL uses Hydra as a default, but that it has optional support for it. Thank you for the links.

@hadim writes "In lightning, hparams needs to be flat". Does that mean that if I have a non-flat Hydra OmegaConf, I cannot use it in lightning unless I flatten it?

@turian (Contributor) commented Dec 16, 2020

@yukw777 @williamFalcon I am still interested in an official tutorial / doc, as was suggested in #2322

@yukw777 (Contributor) commented Dec 16, 2020

@turian ah i see. in that case, yes PL definitely supports Hydra. As you can tell from the age of this issue, Hydra's been on our radar for a while, and we've fixed a number of bugs related to its integration into PL. For example, non-flat Hydra OmegaConf is supported, so no need to flatten it.

I guess what we're missing is an official tutorial, as we just have a disjoint set of seed projects and blog posts. It is quite easy to start using Hydra with PL, though. Do you have anything in particular you'd like to see in an official tutorial?

@turian (Contributor) commented Dec 16, 2020

@yukw777 At the bare minimum, I would like to see a skeleton PL project with best practices for using Hydra, so that, for example, my Hydra configs appear in TensorBoard. I would also be interested in common pitfalls and gotchas to avoid.

Lastly, some guidance on best practices for what happens if you are loading an old checkpoint, and how to play nicely with Hydra. For example, I might have non-PL-model config (URL for the dataset or whatever). If I load an old checkpoint, what are the pros and cons of:

  1. I overwrite the entire config (including non model stuff) with the checkpoint configs.
  2. I ignore the checkpoint config and continue with the normal config, even with the model.
  3. I overwrite just the config from the model, nothing else.

So I'm mostly interested in patterns, best practices, and pitfalls to avoid.

@turian (Contributor) commented Dec 16, 2020

Also, I'd really like to see best practices for structuring your config, particularly if you are also using tools like ray.io and DVC or replicate.ai.

Another example: let's say I have two different neural networks and I want to switch between them to see what works better. Should I keep a separate optim config in each, perhaps because each network trains better with different learning rates etc.? Or should I reduce duplication and have just one high-level optim config?

Talking about hyperparameter optimization (e.g. Optuna or Ray), I might want the config file to specify both:

  • the default value
  • possible other values (in Optuna, for example, for a float config I might want the config to let me infer that I should call suggest_uniform('x', -10, 10)); see the sketch below
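Something like the following, for instance (a rough sketch; the config layout and the toy objective are made up):

import optuna
from omegaconf import OmegaConf

# hypothetical layout: a default value plus the range a sweep is allowed to explore
cfg = OmegaConf.create({
    "x": {"default": 0.0, "low": -10, "high": 10},
})


def objective(trial: optuna.Trial) -> float:
    x = trial.suggest_uniform("x", cfg.x.low, cfg.x.high)
    return (x - 2) ** 2  # stand-in for the real validation metric


study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=20)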

@ndrplz commented Dec 21, 2020

I would also be interested in having some guidance / best practice about how to best integrate Hydra with PL

@ErenBalatkan
+1, there don't seem to be any clear guidelines for integrating Hydra with PL. An official best-practice example as a starting point would help us greatly.

@celsofranssa
I would also be interested in having some guidance / best practice about how to best integrate Hydra with PL

I already have this simple example that can be expanded and generalized to something more expressive.

@igorgad (Contributor) commented Jan 30, 2021

I've come to this implementation; I think it's quite flexible:

trainer = pl.Trainer.from_argparse_args(Namespace(**OmegaConf.to_container(args, resolve=True)),
                                        callbacks=callbacks, logger=logger, enable_pl_optimizer=True)
