
Add tests for heterogeneous arrays with additionalItems #694

Merged
merged 1 commit into main from heterogeneous-additionalItems on Oct 30, 2023

Conversation

Julian
Copy link
Member

@Julian Julian commented Oct 20, 2023

We actually don't have any of these! -- I.e. tests where the instance is heterogeneous and additionalItems has to cope with items of different types.

Ref: python-jsonschema/jsonschema#1157
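The exact tests added in this PR aren't reproduced in this thread, but a case of the kind being described might look like the sketch below. The suite-entry shape and the tiny hand-rolled checker are illustrative only (stdlib Python, not the suite's actual tests nor any real validator):

```python
# Hypothetical suite-style case: an "additionalItems" schema that must
# cope with a heterogeneous instance (draft-07 array-form "items").
case = {
    "description": "additionalItems with heterogeneous instance",
    "schema": {
        "items": [{}],                            # first item: anything goes
        "additionalItems": {"type": "integer"},   # the rest must be integers
    },
    "tests": [
        {"description": "all additional items are integers",
         "data": ["anything", 1, 2], "valid": True},
        {"description": "a non-integer additional item",
         "data": ["anything", 1, "two"], "valid": False},
    ],
}

def check_additional_items(schema, instance):
    """Toy check of array-form "items" + "additionalItems" -- just
    enough to run the case above, not a general validator."""
    prefix = schema.get("items", [])
    extra = schema.get("additionalItems", {})
    for value in instance[len(prefix):]:
        # JSON Schema integers: Python bools are ints, so exclude them.
        if extra.get("type") == "integer" and not (
            isinstance(value, int) and not isinstance(value, bool)
        ):
            return False
    return True

for test in case["tests"]:
    assert check_additional_items(case["schema"], test["data"]) == test["valid"]
```

The point of the heterogeneity is that the `additionalItems` subschema sees values of several different JSON types in a single instance, which is exactly where the linked python-jsonschema bug surfaced.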

@Julian Julian requested a review from a team as a code owner October 20, 2023 23:19
@Relequestual
Copy link
Member

I don't know how I feel about this one. Does it test something required by the specification? Maybe? But it seems more like a "you should be able to handle all the types of JSON data" sort of issue. In which case you could add MANY tests in this space, such as additionalItems: false should fail for all JSON types as the instance value. (Although that WOULD be something the spec requires.)

I think my point is, there are many tests of this class that we don't have because we wouldn't expect to need to test for them. Maybe we do. Like, should we test all types in type against all JSON types? Maybe. Offhand, I do not think we currently do.

@Julian
Copy link
Member Author

Julian commented Oct 23, 2023

Can you clarify what you mean by "maybe" on whether this is required by the spec? This seems clearly required by the spec to me, there's no special behavior here, it's just normal additionalItems behavior.

Like, should we test all types in type against all JSON types?

Those tests are here:

https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/main/tests/draft2020-12/type.json#L10-L53

and indeed test every type against every type.
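The cross-product those linked tests cover can be sketched as follows. This is a stdlib-only illustration of JSON Schema's `type` rules (with the standard subtleties that integers are numbers and booleans are not integers), not the suite's code:

```python
# One instance of each JSON type.
instances = {
    "integer": 1,
    "number": 1.5,
    "string": "foo",
    "object": {},
    "array": [],
    "boolean": True,
    "null": None,
}

def matches(type_name, value):
    """Does `value` satisfy {"type": type_name} under JSON Schema rules?"""
    if type_name == "integer":
        return isinstance(value, int) and not isinstance(value, bool)
    if type_name == "number":
        # Integers are numbers; booleans are neither.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "object":
        return isinstance(value, dict)
    if type_name == "array":
        return isinstance(value, list)
    if type_name == "boolean":
        return isinstance(value, bool)
    if type_name == "null":
        return value is None
    return False

# Every type matches its own instance and rejects every other type's
# instance, with the one overlap: "number" also accepts the integer.
for claimed in instances:
    for actual, value in instances.items():
        expected = claimed == actual or (claimed == "number" and actual == "integer")
        assert matches(claimed, value) == expected
```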

We have never turned down a test which corresponds to a "real world, actual bug" -- in general our criteria have been:

  • is it required under the spec (would love to hear why you said maybe)
  • does it seem like a good test on its own (in which case it doesn't matter whether it's a "real world bug", we just add it because it's likely to cause one if we don't)
  • OR, if it did appear as a real world bug, we minimize the bug and add whatever must be the missing test.

Fuller criteria are in #439

@Relequestual
Copy link
Member

Can you clarify what you mean by "maybe" on whether this is required by the spec? This seems clearly required by the spec to me, there's no special behavior here, it's just normal additionalItems behavior.

I mean, such a case isn't explicitly called-out behaviour.

Say an implementation expects to always have an item after the prefixed items, and it errors when it does not. Should we add a test which would cover that?

Or say an implementation incorrectly handles an items subschema result when using not. Should we then check that all keywords correctly handle not? Maybe we should, I don't know.

@Julian
Copy link
Member Author

Julian commented Oct 23, 2023

I mean, such a case isn't explicitly called out behaviour.

And do you mean:

"the spec does not explicitly call out this behavior, so an implementation may not need to support it"?

Or

"the spec does not explicitly call out this behavior, but it's obviously still required to support", and you're questioning why we add tests for such things?

The specification defines prefixItems as:

Validation succeeds if each element of the instance validates against the schema at the same position, if any. This keyword does not constrain the length of the array. If the array is longer than this keyword's value, this keyword validates only the prefix of matching length.

and items as:

This keyword applies its subschema to all instance elements at indexes greater than the length of the "prefixItems" array in the same schema object, as reported by the annotation result of that "prefixItems" keyword.
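The two quoted definitions can be illustrated concretely. This is a stdlib-only sketch under draft 2020-12 semantics; real implementations use the annotation result of prefixItems, as the quoted text says, rather than re-reading the schema like this:

```python
# Hypothetical schema: first element is a string, the rest are integers.
schema = {
    "prefixItems": [{"type": "string"}],
    "items": {"type": "integer"},
}
instance = ["label", 1, 2]

# "prefixItems": each element validates against the schema at the same
# position -- here, index 0 must be a string.
prefix_ok = all(
    isinstance(instance[i], str)
    for i in range(len(schema["prefixItems"]))
    if i < len(instance)
)

# "items": applies to all elements at indexes past the prefixItems
# length -- here, indexes 1+ must be integers (booleans excluded,
# since Python bools are ints but JSON booleans are not integers).
rest_ok = all(
    isinstance(v, int) and not isinstance(v, bool)
    for v in instance[len(schema["prefixItems"]):]
)

assert prefix_ok and rest_ok
```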

Surely you're not saying you think the spec needs to enumerate which instances those apply to? So you must mean "this behavior is required by the spec", no? (In which case, politely, I'd say that's a very deceptive way to use the word "maybe" if what you mean is "definitely yes, but I'm still questioning why we'd include the test".)

Say an implementation expects to always have an item after the preixed items, and it errors when it does not. We should add a test which would cover that?

In general when one writes tests, because test suites are not the same as formal verification, one is making predictions about the subset of all possible tests which should prevent real world bugs. It's the latter we care about of course, not the tests themselves. When a real world bug happens regardless, it's a sign something was missed, or that the prediction -- which is hard to make -- was off by a bit. So you add the test.

To answer your hypothetical scenario, I do not think such a scenario is likely, so no, I would not add it, but if someone came along and said "whoops, this happened to me, I had this bug", then I would not hesitate to add it.

Or say an implementation incorrectly handles an items subschema result when using not. Should we then check all keywords correctly handle not?

The same applies to this scenario IMO, so here no, I would not preemptively add such tests because I don't believe any implementation would have such an issue, although we've had similar ones in the past. And again, my position is and always has been "yes, we should indeed add them" if you come up with an example that meets the rest of the criteria, so hopefully I've at least been consistent.

(I honestly think this is one of the easier and I hoped most incontrovertible PRs to come through the suite, albeit I wrote it so I'm slightly biased, but that's making me particularly interested in what leads to your comment).

@Relequestual
Copy link
Member

Latter for sure.

I broadly agree with all the above.

To answer your hypothetical scenario, I do not think such a scenario is likely, so no, I would not add it, but if someone came along and said "whoops, this happened to me, I had this bug", then I would not hesitate to add it.

That's fair. This would have fallen under something I didn't think was likely.

The same applies to this scenario IMO, so here no, I would not preemptively add such tests because I don't believe any implementation would have such an issue...

We have previously added related tests alongside a reported one, when there was a pattern that could be replicated -- which I think you go on to mention in another way.

When I said "Does it test something required by the specification? Maybe?", by "required" I meant explicitly required with MUST wording (RFC 2119), and by "maybe" I just meant I hadn't gone to look today.

I honestly think this is one of the easier and I hoped most incontrovertible PRs to come through the suite...

Having a test added because some code after the validation processing causes an error feels a little odd to me. Although I can see that such code could be part of the output-format creation code, and so considered part of the spec.

I don't mean to unnecessarily nitpick. I think I just wanted to understand: if this IS a test we want to have, do we need a whole load of other tests to go with it? The answer is no, but if someone reports/adds them, then that's fine also.

@Julian
Copy link
Member Author

Julian commented Oct 23, 2023

That's fair. This would have fallen under something I didn't think was likely.

I'm not sure what I'd have thought had we considered this case ahead of time -- heterogeneous arrays can cause issues across lots of languages, I think (particularly statically typed ones!). So I'm fairly confident I'd have been comfortable merging a PR to add tests concerning them, but on the other hand fairly confident I would not have done the work to write this test myself; I'd probably have considered other things more important, so I agree at least that much.

When I said "Does it test something required by the specification? Maybe?", by "required" I meant explicitly required with MUST wording (RFC 2119).

I'm pretty sure we (FSVO "we" -- not you personally, I think) have had a related meta-discussion here in the suite repo about this kind of topic. The specification does not use MUST in many places -- correctly, I think! In the language I quoted defining prefixItems, it does not say:

Validation MUST succeed if each element of the instance validates against the schema at the same position, if any. This keyword MUST NOT constrain the length of the array. If the array is longer than this keyword's value, this keyword MUST validate only the prefix of matching length.

I would not personally advocate that it do so (nor that we adopt such practice in general when defining behavior in the spec). I also do not think other specifications generally do so -- MUST (and related words) are one way to stress required behavior. But saying "X is defined to do Y" is, I believe, accepted to be itself a firm requirement of a specification, and certainly is one this suite relies on in calling things required or not.

That's why I'd react quite firmly to "I would never call that 'maybe'". I think it is very harmful (to the suite) to propagate such a definition, and I've resisted similarly firmly whenever someone has tried to say similar things.
It erodes what I believe is the primary and most important consideration of the suite: does it represent exactly and only the firm requirements of the literal text of the specification? With precisely 1 exception (which we don't need to rehash), I believe I can personally attest "yes", and that alone is IMO 60% or more of the value of the suite (of course there's inherent value in simply having a collection of good examples, so I wouldn't claim it's anywhere near 100% of the value).

(Obviously not being sure of something I take no issue with though, so yeah not nitpicking on "maybe" too hard myself!)

@Julian
Copy link
Member Author

Julian commented Oct 23, 2023

Oh, by the way, I meant to also mention:

Having a test added because some code after the validation processing causes an error,

It's not really easy to separate the two -- in my implementation, the error happens during validation, at the point the result is produced; there isn't really a second step (as there would be if this happened during generation of some output format).

Copy link
Member

@jdesrosiers jdesrosiers left a comment


We have never turned down a test which corresponds to a "real world, actual bug"

I skimmed most of the discussion, but I'll point out that this (👆) has always been my position. While I don't think this test is likely to catch a lot of bugs, we know for a fact that it would have caught at least one. Therefore, I think it should be included.

Copy link
Member

@gregsdennis gregsdennis left a comment


These pass in my implementation

@Julian
Copy link
Member Author

Julian commented Oct 30, 2023

Thanks folks.

@Julian Julian merged commit 95fe6ca into main Oct 30, 2023
3 checks passed
@Julian Julian deleted the heterogeneous-additionalItems branch October 30, 2023 12:06