-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Include support to bytes
built-in type.
#845
Conversation
Hi @pedropgusmao, thanks for getting this started! Here are a few changes that I think should be made:
Edit (2021-01-30):This PR is now ready for review. Here are the enabled use-cases:
@dataclass
class Config:
data: bytes = b"binary_data"
s: str = "string data"
cfg = OmegaConf.create(Config)
assert type(cfg.data) is bytes
cfg.data = b"updated_binary_data"
# For safety's sake, there is no implicit conversion:
cfg.data = "string data" # raises ValidationError: can't convert str -> bytes
cfg.data = 123 # raises ValidationError: can't convert int -> bytes
cfg.s = b"bin_data" # raises ValidationError: can't convert bytes -> str
>>> conf = OmegaConf.create({"key": b"value"})
>>> print(conf)
{'key': b'value'}
>>> yml = OmegaConf.to_yaml(conf)
>>> print(yml)
key: !!binary |
dmFsdWU=
>>> OmegaConf.create(yml)
{'key': b'value'}
>>> # This yaml support is compatible with Hydra.
>>> OmegaConf.create({b"binary_key": 123})
{b'binary_key': 123} Note that OmegaConf is not tuned for working with big data: a config's contents may be copied multiple times during config manipulation (which would be bad if the data is big), and storing big data in a yaml file is strongly discouraged. Support for |
There are several places in OmegaConf's tests where each supported type is tested. To keep consistency with the other tests, it would be good to add
|
An open questions:
|
I have pushed two commits to this PR branch. |
Thanks @Jasha10 , Sorry, I'm not very familiar with OmegaConf's interpolation parser. Where could I learn more about it? #845 (comment) |
working on tests:
Edit: I've added a few more items to the list above |
Don't worry about the parser for now -- we can publish a followup PR if we decide to change that later. |
Hi @Jasha10 , I hope you are doing fine.
|
Hi @Jasha10 , |
Apologies for the late reply! |
Looking good, @pedropgusmao! |
FYI I'm planning to push a few more commits to round-out the testing. |
I've rebased to incorporate some changes from the Additionally, I've added two commits. The first commit ( from dataclasses import dataclass
from omegaconf import OmegaConf
@dataclass
class Conf:
s: str = "abc"
x: bytes = b"binary"
cfg = OmegaConf.create(Conf)
cfg.s = b"bin" # previously performed conversion, now this throws ValidationError
cfg.x = "string" # previously performed conversion, now this throws ValidationError The second commit ( |
Thanks for this, @Jasha10
|
Nope, I am happy with the shape of the PR.
Yes, Hydra+instantiate will support binary ( >>> instantiate({"_target_": "builtins.dict", "abc": b"123"})
{'abc': b'123'}
>>> instantiate({"_target_": "builtins.dict"}, abc=b"123")
{'abc': b'123'} Binary data can be passed as a positional argument using >>> instantiate({"_target_": "builtins.print", "_args_": [b"binary_arg", "foo"]})
b'binary_arg' foo Binary keys are not supported by >>> instantiate({"_target_": "builtins.dict", b"binary_key": "foo"})
...
TypeError: keywords must be strings
Bytes will be supported in yaml files via yaml's !!binary tag: >>> cfg = OmegaConf.create({"x": b"binary_data"})
>>> yml = OmegaConf.to_yaml(cfg)
>>> print(yml)
x: !!binary |
YmluYXJ5X2RhdGE=
>>> OmegaConf.save(cfg, "x.yaml")
>>> print(open("x.yaml", "r").read())
x: !!binary |
YmluYXJ5X2RhdGE=
>>> OmegaConf.load("x.yaml")
{'x': b'binary_data'} This will work when Hydra loads yaml files too.
Storing large binary data will not be encouraged. The Hydra+OmegaConf stack is primarily intended as a human interface rather than as an arbitrary data store. In particular, OmegaConf and Hydra are not tuned to work with big data; they may make copies of data when performing operations such as config creation and merging. Also, yaml files use base64 to encode binary data, which is less memory-efficient than something like Note that you can always use Hydra's instantiate API to invoke e.g. big_data = instantiate({"_target_": "numpy.load", "file": "my_big_data.npy"}) For small blocks of binary data, there is
Yes, automatic conversion will not be supported. Users will need to manually convert using python code or using a custom resolver.
Thank you!! |
Whoops, I closed this PR on accident, and now it seems I'm unable to reopen it. I've opened PR #872 instead. |
Implements support for
bytes
in response to #844 .