
[Bug] Cannot override a default config group that has been set up in a Structured Config #2039

Open
2 tasks done
jzazo opened this issue Feb 17, 2022 · 3 comments
Labels
question Hydra usage question

Comments


jzazo commented Feb 17, 2022

🐛 Bug

Description

I set up structured config schemas with nested config groups as attributes. I want to set one config group as a default value in the structured config schema, and override it in the yaml files. I get different errors depending on whether I set the default in the defaults list or directly as an initialized config.

Checklist

  • I checked on the latest version of Hydra
  • I created a minimal repro (See this for tips).

To reproduce

This is my example.py file:

#!/usr/bin/env python3
from dataclasses import dataclass, field
from typing import Any, List

import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import MISSING, DictConfig, OmegaConf

@dataclass
class CriterionConf:
    _target_: str = MISSING

@dataclass
class BCEConf(CriterionConf):
    _target_: str = "torch.nn.BCELoss"

@dataclass
class CrossEntropyConf(CriterionConf):
    _target_: str = "torch.nn.CrossEntropyLoss"

@dataclass
class RegressionConf:
    criterion: CriterionConf = MISSING

@dataclass
class Config:
    model: RegressionConf = MISSING

cs = ConfigStore.instance()
cs.store(name="base_config", node=Config)
cs.store(group="ml", name="regression", node=RegressionConf)
cs.store(group="ml", name="cross_entropy", node=CrossEntropyConf)
cs.store(group="ml", name="bce", node=BCEConf)

@hydra.main(config_path="conf", config_name="config")
def my_app(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg, resolve=True))

if __name__ == "__main__":
    my_app()

This is my conf/config.yaml file:

defaults:
  - base_config
  - /ml/regression@model
  - /ml/bce@model.criterion

This is my conf/criterion/bce.yaml:

defaults:
  - /ml/bce

This is my conf/criterion/cross_entropy.yaml:

defaults:
  - /ml/cross_entropy

If I run ./example.py (no default list yet), everything works and I get:

$ ./example.py
model:
  criterion:
    _target_: torch.nn.BCELoss

If I modify the RegressionConf with an initialized config, e.g.,

@dataclass
class RegressionConf:
    criterion: CriterionConf = CrossEntropyConf()

I get the following error:

$ ./example.py
In 'ml/bce': Validation error while composing config:
Merge error: BCEConf is not a subclass of CrossEntropyConf. value: {'_target_': 'torch.nn.BCELoss'}
    full_key:
    object_type=Config

If I modify RegressionConf with a defaults list, e.g.,

@dataclass
class RegressionConf:
    defaults: List[Any] = field(default_factory=lambda: [{"criterion": "/ml/cross_entropy"}])
    criterion: CriterionConf = MISSING

I get the following error:

$ ./example.py
In 'ml/regression': Could not find 'ml/criterion//ml/cross_entropy'

Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=file:///home/javier/repos/immunocam/immunocam-bolts/src/python/immunocam/bolts_sandbox/hydra/conf
        provider=schema, path=structured://

Expected Behavior

In the first case I would expect that the type of criterion remains CriterionConf and does not switch to CrossEntropyConf.
In the second case, I would expect that the config group /ml/cross_entropy is resolved as global rather than relative, i.e., not as ml/criterion//ml/cross_entropy, as the error seems to indicate.

System information

  • Hydra Version : 1.1.1
  • Python version : 3.8.10
  • Virtual environment type and version : poetry 1.1.12
  • Operating system : Ubuntu 20.04
@jzazo jzazo added the bug Something isn't working label Feb 17, 2022
@Jasha10 Jasha10 added question Hydra usage question and removed bug Something isn't working labels Feb 17, 2022
Jasha10 (Collaborator) commented Feb 17, 2022

Hello again @jzazo!

First, let me visualize the tree of input configs that can be accessed by Hydra:

├── config                   # file=conf/config.yaml
├── criterion
│   ├── bce                  # file=conf/criterion/bce.yaml
│   └── cross_entropy        # file=conf/criterion/cross_entropy.yaml
├── base_config              # node=Config
└── ml
    ├── bce                  # node=BCEConf
    ├── cross_entropy        # node=CrossEntropyConf
    └── regression           # node=RegressionConf

Your first example:

If I run ./example.py (no default list yet), everything works and I get:

$ ./example.py
model:
  criterion:
    _target_: torch.nn.BCELoss

Yes indeed. In this case, the following four input configs are being merged:

  • base_config
  • ml/regression (at package model)
  • ml/bce (at package model.criterion)
  • config

This can be confirmed by using the --info=defaults flag:

 $ ./example.py --info=defaults

Defaults List
*************
| Config path                 | Package             | _self_ | Parent       |
------------------------------------------------------------------------------
...
| base_config                 |                     | False  | config       |
| ml/regression               | model               | False  | config       |
| ml/bce                      | model.criterion     | False  | config       |
| config                      |                     | True   | <root>       |
------------------------------------------------------------------------------

Note that the files conf/criterion/bce.yaml and
conf/criterion/cross_entropy.yaml are not being loaded.
Instead, the defaults list from config.yaml is loading the configs ml/regression and ml/bce directly.


Your second example:

If I modify the RegressionConf with an initialized config, e.g.,

@dataclass
class RegressionConf:
    criterion: CriterionConf = CrossEntropyConf()

I get the following error:

$ ./example.py
In 'ml/bce': Validation error while composing config:
Merge error: BCEConf is not a subclass of CrossEntropyConf. value: {'_target_': 'torch.nn.BCELoss'}
    full_key:
    object_type=Config

This is expected. The issue is that an instance of BCEConf is being merged into an instance of CrossEntropyConf.
This error can be reproduced using OmegaConf: below, I've simulated merging the four input configs that appear in the above --info=defaults output:

>>> input1 = Config                             # base_config
>>> input2 = {"model": RegressionConf}          # ml/regression@model
>>> input3 = {"model": {"criterion": BCEConf}}  # ml/bce@model.criterion
>>> input4 = {}                                 # config
>>> OmegaConf.merge(input1, input2, input3, input4)
Traceback (most recent call last):
  ...
omegaconf.errors.ValidationError: Merge error: BCEConf is not a subclass of CrossEntropyConf. value: {'_target_': 'torch.nn.BCELoss'}
    full_key:
    object_type=Config

Note that the error message is exactly the same.
The failure occurs for the same reason that OmegaConf.merge(CrossEntropyConf, BCEConf) would fail: BCEConf is not a subclass of CrossEntropyConf.


Your third example:

If I modify RegressionConf with a defaults list, e.g.,

@dataclass
class RegressionConf:
    defaults: List[Any] = field(default_factory=lambda: [{"criterion": "/ml/cross_entropy"}])
    criterion: CriterionConf = MISSING

I get the following error:

$ ./example.py
In 'ml/regression': Could not find 'ml/criterion//ml/cross_entropy'

This is also expected. Basically, writing {"criterion": "/ml/cross_entropy"} means that Hydra will search for an input config called
"/ml/cross_entropy" inside of a group called "criterion". This is all done relative to the "ml" group where the
RegressionConf node is stored, so the final lookup path is "ml/criterion//ml/cross_entropy".

Instead of searching for a config at "ml/criterion//ml/cross_entropy", you need to search for "ml/cross_entropy":

@dataclass
class RegressionConf:
    defaults: List[Any] = field(default_factory=lambda: [{"/ml@criterion": "cross_entropy"}])
    criterion: CriterionConf = MISSING

The "/ml" part tells Hydra to look for a top-level group called ml; the leading slash makes the address of the "ml" group
absolute (instead of relative). The "cross_entropy" part names the config /ml/cross_entropy that
Hydra should search for, and the @criterion part means that this /ml/cross_entropy node should
be placed in the criterion package.

For this to work, you will also need to change your config.yaml file as follows:

defaults:
  - base_config
  - /ml/regression@model
  - override /ml@model.criterion: bce

Running the example:

$ ./example.py
model:
  criterion:
    _target_: torch.nn.BCELoss

You can find more documentation on how this works in the docs on the defaults list.

Note that once again the files conf/criterion/bce.yaml and
conf/criterion/cross_entropy.yaml are not being loaded. I'd recommend deleting these yaml files to avoid confusion.


A final example:

Below is a strategy for storing RegressionConf directly in the model group (instead of in ml), and
for storing CrossEntropyConf and BCEConf directly in model/criterion:

@dataclass
class RegressionConf:
    defaults: List[Any] = field(default_factory=lambda: [{"criterion": "cross_entropy"}])
    criterion: CriterionConf = MISSING

...

cs = ConfigStore.instance()
cs.store(name="base_config", node=Config)
cs.store(group="model", name="regression", node=RegressionConf)
cs.store(group="model/criterion", name="cross_entropy", node=CrossEntropyConf)
cs.store(group="model/criterion", name="bce", node=BCEConf)

With this setup, the tree of accessible input configs looks like this:

├── config                   # file=conf/config.yaml
├── base_config              # node=Config
└── model
    ├── regression           # node=RegressionConf
    └── criterion
        ├── cross_entropy    # node=CrossEntropyConf
        └── bce              # node=BCEConf

Now, change config.yaml to look like this:

defaults:
  - base_config
  - model: regression
  - override model/criterion: bce

Running the example:

 $ ./example.py
model:
  criterion:
    _target_: torch.nn.BCELoss

jzazo (Author) commented Feb 18, 2022

Thank you @Jasha10 for this awesome answer. Everything is quite clear, although I am still struggling a bit (in my real problem, see message below).

Regarding your proposed fixes, if I try to reproduce the third example with:

@dataclass
class RegressionConf:
    defaults: List[Any] = field(default_factory=lambda: [{"/ml@criterion": "cross_entropy"}])
    criterion: CriterionConf = MISSING
# config.yaml
defaults:
  - base_config
  - /ml/regression@model
  - override /ml@model.criterion: bce

My output becomes:

$ ./example.py
model:
  defaults:
  - /ml@criterion: cross_entropy
  criterion:
    _target_: torch.nn.BCELoss

In your final example, I get the following output as well:

$ ./example.py
model:
  defaults:
  - criterion: cross_entropy
  criterion:
    _target_: torch.nn.BCELoss

It seems that the defaults list persists in the returned config; why does this happen? I cannot automatically instantiate objects that carry such an attribute.

jzazo (Author) commented Feb 18, 2022

I have finally identified the source of the problems I was getting in my original setup. If I change the type of model in Config from RegressionConf to ModelConf, then things break down when the defaults list is in the structured config, but not when it is in the yaml files.

The following works:

@dataclass
class ModelConf:
    _target_: str = MISSING

@dataclass
class RegressionConf(ModelConf):
    criterion: LossConf = MISSING

@dataclass
class Config:
    model: ModelConf = MISSING
# config.yaml
defaults:
  - base_config
  - model: regression
# model/regression.yaml
defaults:
  - /model/base_regression@_here_
  - /model/loss@criterion: bce  # <-- no conflict when merging this config into criterion

resulting in:

$ ./example.py
model:
  _target_: ???
  criterion:
    _target_: torch.nn.BCELoss

The following breaks:

@dataclass
class ModelConf:
    _target_: str = MISSING

@dataclass
class Config:
    model: ModelConf = MISSING  # <-- conflict when assigning criterion to model because it assumes it's ModelConf!

@dataclass
class RegressionConf(ModelConf):
    defaults: List[Any] = field(default_factory=lambda: [{"/model/loss@criterion": "bce"}])
    criterion: LossConf = MISSING
# regression.yaml
defaults:
  - /model/base_regression@_here_

fails with:

omegaconf.errors.ConfigKeyError: Key 'criterion' not in 'ModelConf'
    full_key: model.criterion
    reference_type=ModelConf
    object_type=ModelConf

hydra.errors.ConfigCompositionException

This is the config store in both cases:

cs = ConfigStore.instance()
cs.store(name="base_config", node=Config)
cs.store(group="model", name="base_regression", node=RegressionConf)
cs.store(group="model/loss", name="cross_entropy", node=CrossEntropyConf)
cs.store(group="model/loss", name="bce", node=BCEConf)

It seems to me that the behavior differs depending on whether the defaults list is in yaml or in the structured config. In the yaml case the inferred type is RegressionConf and things work out, but in the structured config case the type is not updated to RegressionConf and remains ModelConf. Is this behavior intended or a bug? Shouldn't both cases behave the same?

Thanks for all the support!
