Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Output nodes should be per subschema not per keyword #1249

Merged
merged 13 commits into from
Aug 28, 2022

Conversation

gregsdennis
Copy link
Member

@gregsdennis gregsdennis commented Jun 27, 2022

This is quite a change, so I'd be happy to walk anyone through it.

Summary

Following the idea from this thread and this thread, this PR changes the concept of the output unit from being based on individual output from keywords to being based on output from schemas. The benefits of this are in the thread, so I won't rehash them here. The primary difference is that errors and annotations are included in the output unit as properties (objects with keyword names for keys) rather than as nested units.

It also does a few other things:

  • removes the detailed format
  • renames the verbose format hierarchical (since there's only one now)
  • adds clarification in some of the language
  • adds passing examples to show how annotations are included

Doing this made the formats considerably more information-dense, which meant that the verbose examples now reasonably fit in the spec rather than needing to be defined in an external file.

All of this is pretty related, otherwise I would have broken it into multiple PRs.

@gregsdennis
Copy link
Member Author

I have implemented this in the schema/experiment-new-output-format branch of json-everything

@jdesrosiers jdesrosiers changed the base branch from draft-next to main July 8, 2022 15:27
@jdesrosiers
Copy link
Member

The draft-next branch has been merged and is now closed. The merge target for this PR has been changed to main. Here are the recommended steps to get your branch reabsed properly.

  1. Make sure your remote for the json-schema-org/json-schema-spec repo is up-to-date. (Example: git fetch upstream).
  2. Rebase your commits onto main. (Example: git rebase --onto upstream/main abcd123~1 (replace abcd123 with the commit hash of the first commit in your PR)).
  3. Force push the rebased branch to your fork. (Example: git push --force origin my-branch).

@gregsdennis gregsdennis force-pushed the output-nodes-should-be-per-subschema-not-per-keyword branch from 78b4cde to 742221a Compare July 9, 2022 01:31
@karenetheridge
Copy link
Member

I am strongly against parts of this PR but I am unable to spend much time going through it again in detail until the end of the week.

Copy link
Contributor

@handrews handrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've got various nitpicks and wording ideas (some of which may belong somewhere other than this PR), but overall this looks good to me. I like how this gets rid of the duplication of the three location fields.

For the purposes of this PR I'm not worrying about whether "basic" becomes mandatory, or whether there are reasons for a more substantial difference between error and annotation output.

jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Show resolved Hide resolved
@gregsdennis
Copy link
Member Author

@karenetheridge I welcome your feedback, but it's been over two weeks since you posted.

Copy link
Contributor

@handrews handrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update! While i made quite a few comments they're mostly tiny things or just an effort to standardize the terminology. Overall this is looking really good!

jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
jsonschema-core.xml Outdated Show resolved Hide resolved
All output units are included in this format.
</t>
<t>
The location properties of the root output unit MAY be omitted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does this mean exactly? I would really rather not have to special-case any output unit, they should all always provide the same set of location information.

Copy link
Member Author

@gregsdennis gregsdennis Aug 4, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the basic output, results from the root schema are also an item in the list, which means that there is no need to have location properties (evaluationPath, schemaLocation, & instanceLocation) in the root output unit (and only the root). Thus for basic it's just valid and nested at the root.

This allowance permits these properties to also be omitted from the root of the hierarchical format. The thinking behind this is that it aligns with basic, if that matters to implementors, but also that these location properties will always be

evaluationPath: "" (empty pointer)
schemaLocation: "https://example.com/schema#"
instanceLocation: "" (empty pointer)

Having them at the root isn't useful or really required.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like a design smell from trying to make two things that aren't really the same pretend like they are the same.

@gregsdennis gregsdennis marked this pull request as ready for review August 4, 2022 19:53
@jdesrosiers
Copy link
Member

I'm still not entirely sure how I feel about the approach. I think it will work fine. I think my only concern is that we would be making a drastic change without really knowing how it's going to work out. Not that what we have now is well proven either, I just don't want to be implementing a completely new output format every release. I don't know that there's really a solution to that without a crystal ball, but I wanted to mention it anyway.

@gregsdennis
Copy link
Member Author

@jdesrosiers I understand and share that concern. However, as I mentioned in my blog post, the current design was not met with much acceptance and there were a lot of questions.

Secondly, I think this aligns better with the discussion we had around how annotations are passed on / blocked.

It really needs to change. I really think that we're a lot closer to a final form.

Also, having implemented both formats, I can tell you that this is so much simpler!

@jdesrosiers
Copy link
Member

I completely agree that it needs an update. Doing nothing is not an option. My main concern is maintaining a bunch of different versions of the output format as we iterate and experiment to figure out what's going to work best. I'd feel a lot better about this if it were it's own spec and we actually treated it like a draft where each draft replaces the previous and has no backwards/forwards compatibility guarantees until things stabilize. That way I always only have to maintain the latest version.

@gregsdennis
Copy link
Member Author

I agree with what you're saying. For now, I think we can still make improvements in-place and then extract the output separately later.

@handrews
Copy link
Contributor

handrews commented Aug 7, 2022

@jdesrosiers given that we quite likely won't even put out the next revision under the IETF process, I don't think we should worry too much about what is in which document at the moment.

@gregsdennis
Copy link
Member Author

Let's continue the output-in-a-new-spec discussion in this thread. For now, it's all in Core. We'll open a new PR for further developments. I think this one is complete.

@jdesrosiers
Copy link
Member

given that we quite likely won't even put out the next revision under the IETF process, I don't think we should worry too much about what is in which document at the moment.

I don't think what process we are using makes a difference, unless your point is that it doesn't make sense to split things out until we know what process we're going to use moving forward. That makes sense, but I wasn't suggesting it be split out now, just that it's an important consideration to keep in mind. If we're going to make such a big change we need to consider the effect on implementors and that very much includes discussion of what the life-cycle of this feature will be.

I think it's a relevant an very important to thing to be thinking about now, but it's not a blocker for this PR, just an important thing to call out and follow up on later.

jsonschema-core.xml Outdated Show resolved Hide resolved
Copy link
Contributor

@handrews handrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm in favor of merging this. The open points of discussion are either not things that I'd block the PR on anyway, or I'm confident that we'll continue talking about them. What's here right now is a great improvement.

@gregsdennis gregsdennis merged commit cd06bd5 into main Aug 28, 2022
@gregsdennis gregsdennis deleted the output-nodes-should-be-per-subschema-not-per-keyword branch August 28, 2022 20:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Development

Successfully merging this pull request may close these issues.

4 participants