Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

several misclassified tests using URNs in $id #630

Closed
karenetheridge opened this issue Jan 8, 2023 · 12 comments
Closed

several misclassified tests using URNs in $id #630

karenetheridge opened this issue Jan 8, 2023 · 12 comments

Comments

@karenetheridge
Copy link
Member

There are several tests that were recently added to ref.json (multiple versions) using URNs in the $id keyword, but they don't do anything interesting with the $ref keyword -- the $refs are all simply json pointers. What is the purpose of these tests? Shouldn't they be in id.json?

@Julian
Copy link
Member

Julian commented Jan 8, 2023

They're there to test resolving of the reference in $ref with a URN $id. In many cases specifically to test relative resolving of the $ref URI against the URN (and doing so correctly), and in others (e.g. if it's a JSON Pointer) to simply ensure that having an URN in $id doesn't break doing so (indeed in my implementation it did until I added them).

What we have now for classification is really only a "loose" one, there's a (single I believe) test in $id which relies on $ref in id.json, and vice versa there are many more ref tests which rely on $id. Some day we may have a "hierarchy" of files perhaps.

You may want to look at #629 though (which I haven't looked at yet) to opine on whether these tests are required at all -- obviously given I added them I believe I checked and read the spec as requiring them, but that issue has some discussion, so at some point I'll double check.

@handrews
Copy link
Contributor

handrews commented Jan 8, 2023

@karenetheridge I'd say they are definitely required in the sense that any conforming implementation MUST support all 5 of the distinct URI grammars from RFC 3986, with the most important being what @Julian notes about same-document references working with any base URI.

Technically these are RFC 3986 requirements rather than JSON Schema requirements, but a.) this is critically important functionality, unlike (say) correct handling of internationalized emails in format, and b.) it is very easy to accidentally not support them because most people don't think of these cases, and some URI libraries (and most hand-coded URI handling) incorrectly doesn't support them.

So I'd argue that this is one of the very few times when pass-through requirements from another spec deserve to be treated as 1st-class required JSON Schema tests. The functionality is too important, and reasonable, well-intentioned programmers can accidentally end up not supporting them by trusting a seemingly-correct library or guide.

We don't need to cover every case, but definitely the major less-common-but-still-important cases. As far as the relationship of this to #629, here we want to cover every major thing required by RFC 3986. But #629 is arguing about treating schemes differently based on behavior that is allowed by but goes beyond RFC 3986 requirements.

I think that scheme/protocol-specific requirements are not required. On-demand retrieval is only a MAY in JSON Schema, and we are silent on URI normalization for comparison purposes (which is an implicit MAY in my book). So those would go in optional or may, unless we add a SHOULD for something somewhere on our own.

@karenetheridge
Copy link
Member Author

karenetheridge commented Jan 9, 2023

I'm not questioning the utility of testing URNs in $ids - I can see that some implementations might naively only support http*. I'm wondering why these specific tests, and why they're in ref.json -- the actual $ref is to a fragment-only uri-reference, which needn't involve the current base URI at all. The tests should instead be "$ref": "..some URN, not our own...#/some/pointer".

@handrews
Copy link
Contributor

@karenetheridge I see. I do think we should test fragment-only/same-document references with different sorts of base URIs as URI implementations are not required to treat them differently. But I agree with you that those are not the only or most interesting cases. Some possible interesting cases include:

  • A urn: that fully replaces the base URI (either another urn: or a base with a different scheme)
  • An about: URI base with a query-only $ref
  • The empty string as a $ref value is also a same-document reference (so not the most interesting, but also a weird case that might be handled wrong)

I used about: for the query-only case because I can't make heads or tails of RFC 8141's r-component and q-component sections.

If there is a rootless scheme that does not allow : in its path component, then it would be possible to have a scheme-relative reference that just replaces the path. Although we're getting pretty obscure at that point, and URNs definitely don't qualify.

@Julian
Copy link
Member

Julian commented Jan 10, 2023

@karenetheridge did my answer answer your question? For the most part most of the tests in id.json are very simple, there's only 1 that deals with ref, more of the tests that deal with ref are in ref.json, even when they involve id.

@karenetheridge
Copy link
Member Author

Not really. I've been trying to say that when using a json pointer-only $ref, it doesn't really matter what's in $id, and such a test is not likely to exercise the proper handlings of non-http URIs.

@Julian
Copy link
Member

Julian commented Jan 12, 2023

and such a test is not likely to exercise the proper handlings of non-http URIs.

I'm not quite sure what you mean by this, where the test lives of course has no bearing on how well it exercises the behavior, it's just a human aide for classification, and precisely as you say, the tests are there precisely to ensure the $id doesn't affect resolving the pointer ref in any way -- again, in my implementation indeed it accidentally did!

But all that being said -- there's no science here, I'm not saying I did a long think on which of the two it belonged in, so if you'd really like to compare and recategorize them, feel free to send a PR.

@handrews
Copy link
Contributor

@karenetheridge

Not really. I've been trying to say that when using a json pointer-only $ref, it doesn't really matter what's in $id, and such a test is not likely to exercise the proper handlings of non-http URIs.

I see your point here. RFC 3986 §5.1 notes that fragment-only references are usable without a base URI. So if the schema document is only one schema resource, the presence, absence, or scheme of the URI is irrelevant to implementations that handle fragment-only references this way.

However, implementations are not required by RFC 3986 to handle fragment-only references this way, so there is a failure mode here where a JSON Schema implementation supplies the base URI to the RFC 3986 implementation anyway, and that implementation tries to use it and fails because it does not understand rootless-syntax schemes like urn:. Or if the JSON Schema implementation rolled its own relative reference handling and failed in the same way.

This is, admittedly, a pretty darn obscure failure mode. With my test planning hat on it would be a low-priority test case, but since it's so easy I'd probably include it. A lot of people treat relative reference resolution as barely more complex than string concatenation (although they'd probably fail some other tests as well).

However, other possible relative tests using urn: (or about: or mailto: since they support more syntax components) are definitely more interesting.

@karenetheridge is right that if the most compliant approach to supporting this is used, the test covers nothing that's not already covered, but it might catch bugs for people who took a weird approach. idk how that fits into the suite. I'd be inclined to add a note that ideally it's irrelevant, but leave it.

@Julian
Copy link
Member

Julian commented Jan 14, 2023

I'm afraid I can't easily see what each of your comments, including the last one, seem to have to do with the question here (something I think @karenetheridge also is having trouble seeing given the previous confused emoji and response). I don't see @karenetheridge questioning any behavior here, this is strictly a classification question.

I'd missed this before by the way @karenetheridge:

The tests should instead be "$ref": "..some URN, not our own...#/some/pointer".

This test I of course wanted to add -- I mentioned as much on the PR adding these:

Missing is a test for an URN retrieval URL, which unfortunately we have
no way of communicating at the minute.

Which as I said we don't really have an easy way to add today, given that all we tell implementers is "we expect all the files in remotes available at http://localhost:1234/" -- we don't have a general way to communicate retrieval URLs in the current layout.

@handrews
Copy link
Contributor

@Julian my last comment was just explaining how I read RFC 3986 to allow both:

  • Using fragment-only references without even looking for a base URI, in which case whether the $id is a URN or HTTPS URI doesn't matter, and...
  • Resolving a fragment-only reference by first using a base URI to resolve it to a full URI, but then not performing a new retrieval, in which case testing with a URN $id could catch a bug in an implementation that cut corners

So, basically, the test isn't always useful, but could be.

I don't have strong feelings on whether to keep it or where it should be, so don't worry too much if I'm not making sense.

@Julian
Copy link
Member

Julian commented Jan 14, 2023

Ah got it -- sounds good, that one certainly makes sense to me now for background, thanks.

@Julian
Copy link
Member

Julian commented Aug 30, 2023

I think I'm going to close this for cleanliness given as I mentioned above that our classification system is pretty loose and this isn't terribly clear cut but I think if someone feels quite strongly that there's a better way to categorize what we have, just send a PR with some concrete reasoning.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants