Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add default values handling. Fixes #12. #19

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Rogdham
Copy link
Contributor

@Rogdham Rogdham commented May 18, 2013

This PR adds a way to handle default values for optional keys, as requested by #12.

You can pass the default keyword argument quite everywhere to specify a default, and that default is used if the corresponding key is optional:

>>> Schema({
...         'event': str,
...         Optional('date'): Use(int, default=2012),
...     }).validate({'event': 'First commit'})
{'date': 2012, 'event': 'First commit'}

You can also use the Default class, which wraps whatever you gave as the first argument, and tries to find the appropriate default value (here 0 for an int):

>>> Schema({
...         'account': str,
...         Optional('amount'): Default(int),
...     }).validate({'account': 'Piggy bank'})
{'account': 'Piggy bank', 'amount': 0}

Of course, your need to give a proper default value, or else it complains, raising a ValueError (I think it is important that this is not a SchemaError, because that one means that the programmer made a mistake, nothing to do with the user data to validate):

>>> Use(int, default='two')
Traceback (most recent call last):
…
ValueError: Use(<type 'int'>) does not validate its default: two

And in case your optional key is a type, you can specify its default value as well:

>>> Schema({
...         Optional(int, default=42): Or(1, 2, 3, 4, default=3)
...     }).validate({})
{42: 3}

Tell me what you think!

@keleshev
Copy link
Owner

Hm, seems like default=... and Default(...) are redundant. If we introduce something like that into schema we need to pick one I think. And I'm not sure which one.

@Rogdham
Copy link
Contributor Author

Rogdham commented May 18, 2013

In this implementation, Default(x) is just a handy shortcut to make schema finds the appropriate default value from x. For example, Default(42) is the same as Schema(42, default=42), and Default(int) is the same as Schema(int, default=0).

So if we remove Default, we are just removing a shortcut, I don't think that it's a big deal.

Hence if you want to remove one, I suggest removing Default ;-)

Tell me what do you think about it and if you want me to update the PR to remove it.

Edit: an other argument for keeping default=... over Default(...), is that it is used for the keys too, to (in Optional – see last example in the first message of this PR).

@keleshev
Copy link
Owner

I never needed this feature, so I'm not totally sure which API is better yet.

@Rogdham
Copy link
Contributor Author

Rogdham commented May 18, 2013

@reedboat: as you opened the issue, could you share your thought about the API I described on this PR? Thanks!

@chadrik
Copy link

chadrik commented Apr 29, 2014

+1

@chadrik
Copy link

chadrik commented Apr 29, 2014

I prefer the default= approach, since as @Rogdham states, the Default class is just a wrapper for default=.

@Rogdham
Copy link
Contributor Author

Rogdham commented Apr 30, 2014

We seems to move for a consensus against Default, all for the default= approach. I've modified this PR to remove the Default as a consequence.

While I was at it, I also rebased on master, and made a little change to take into account #41. Should be easier to merge that way :-)

Edit:
Travis CI build failled due to:

  • some interpreters not found
  • coverage less than 100% (but no new lines added by this PR)
  • doctest sur README : prints a dicts with the key not in the same order as in the file (key order is quite random)
    Those issues are not related to this PR.

@chadrik
Copy link

chadrik commented May 18, 2014

Hey @Rogdham, I finally had some time to look over your code. I added some comments to here

Aside from my comments, I wonder if the design would be a bit cleaner with a BaseSchema class to provide the default keyword behavior instead of the handle_default decorator? I find the decorator approach to be a bit esoteric, since this type of problem is typically handled through subclassing, though it is a graceful solution if inheriting directly from object is indeed desirable. Perhaps @halst will chime in with an opinion.

chad.

@keleshev
Copy link
Owner

Sorry for taking so long. Recently I actually needed to have some default functionality with schema. Here are my thoughts:

The default value should be used only if no value is provided, not if value is not validated. The later would make errors pass silently.

"Value not provided" could mean different thing, e.g.: a dictionary key is missing altogether, or a value is None. Am I missing any other cases?

If we take only "key is missing" as a reason to apply a default, then a nice API for that would be:

Schema({Optional('key', default='DEFAULT'): str)

If we take only "value is None" as a reason to apply a default, then there are a few options to make it readable. Adding a default argument is quite readable:

Schema({'key': Schema(str, default='DEFAULT')})

But I don't want to add default keyword argument to every class. I also don't want to rely on inheritance.

Maybe we should have a Default class, or WithDefault class like:

Schema({'key': WithDefault(str, 'DEFAULT')})

Another option is to have a Default class which works like this:

Default('DEFAULT').valudate(None) => 'DEFAULT'
Default('DEFAULT').validate(1234) => SchemaError

This would allow it to be used as:

schema = Schema({'key': Or(str, Default('DEFAULT')})
schema.validate({'key': None}) => {'key': 'DEFAULT'}
schema.validate({'key': 1234}) => {'key': 1234}

I think I prefer the last variant the most. It can also be extended for cases when non-None value should be replaced, for example:

Or(str, Default('DEFAULT', when='')).validate('') => 'DEFAULT'

On the negative side, this would not work for keys that are not present.
Then, maybe, we could make Optional take Default as an argument, just for some perceived consistency? Like this:

Schema({Optional('key', Default('DEFAULT')): str})

What is your opinion? And sorry for being a shitty maintainer and leaving this pull request hanging for more than a year.

@andor44
Copy link

andor44 commented May 29, 2014

I like WithDefault the most. It is more flexible than just a default argument and doesn't reimplement certain features like Default (in the example, use a Use(lambda x:...) instead of str to catch empty string).

@Rogdham
Copy link
Contributor Author

Rogdham commented May 29, 2014

@chadrik: Let me try to sum up for clarity what you're suggesting (the main points):

  1. do not do stuff on __init__ if possible, do it at validate time so that things are faster (e.g. complex Schema trees can make whole branches not being validated at all)
  2. decorator v.s. inheritance
  3. better exception messages for clarity

Here are my answers:

  1. good suggestion, haven't thought of that, would be definitively better
  2. no preconceived idea on that, I went for a decorator because I saw that classes like Use, Or did inherit from object… see below for more on that topic
  3. sure, although I don't agree with your comment about ValueError vs SchemaError (in that case, the ValueError would have been raise if the developper made a mistake)

@halst: I guess we should consider the API with more intensity before coding anything. There has been some discussion though, which may be worth re-reading.

I'm not sure if this is the best place to discuss the API, as this is a PR on a particular API ; maybe it is better to discuss the API in #12 ?


However, there is one thing I find bizarre with some of the API you mentioned.
I'm speaking about defining the default value for a dict in the key of that dict, like in the following:

Schema({Optional('key', default='DEFAULT'): str)

I believe this is a pitfall worth considering. As you have understood, I am against it, and my preference goes to the definition of a default value in the value part of the dict.

To give a more in-depth example, consider what is in the API of this PR: default for keys as well:

>>> Schema({Optional(int, default=42): Use(str, default='The answer')}).validate({})
{42: 'The answer'}

An other point: you mentioned that using the default could be triggered when there is no value for a specific key, or when that value is None. I'm not sure if it makes sense to consider None as an absence of a value, as it is a value in itself. I am missing some use cases though.


Last but not least, you mentioned that you would prefer not relying on inheritance for defaults handling (see point 2 on top of that message). Could you elaborate on that a little bit, so that we could understand why it is not a good idea?

@keleshev
Copy link
Owner

Last but not least, you mentioned that you would prefer not relying on inheritance for defaults handling (see point 2 on top of that message). Could you elaborate on that a little bit, so that we could understand why it is not a good idea?

This is a more or less religious thingy I have: I believe in duck-typing and that types don't matter—the interface matters. We should avoid making users of schema subclass anything. I find it hard to justify this opinion, but I have one example:

Imagine a user of schema testing something. If they are required to subclass something, they are forced to make a subclass even for a throw-away scenario. However, if they are not forced to subclass, they can just pass a stub for testing.

@keleshev
Copy link
Owner

Schema({Optional(int, default=42): Use(str, default='The answer')})

In case of passing default to Optional I assumed that default will only be an argument of Optional and no other classes.

@keleshev
Copy link
Owner

@andor44 I'm not sure I follow you. I don't see much difference between WithDefault(str, 'DEFAULT') and Or(str, Default('DEFAULT')).

@andor44
Copy link

andor44 commented May 29, 2014

As you described it, I'd imagined WithDefault is like a short-circuited AND operation (if the first argument validates to a SchemaError then use the second) while Default just validates to SchemaError to anything that's not None or missing. Additionally, I think WithDefault(str, 'foo') is more readable and easier to understand than Or(str, Default('foo')).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants