Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

formats.kdl: init #295211

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

formats.kdl: init #295211

wants to merge 3 commits into from

Conversation

feathecutie
Copy link
Contributor

Description of changes

Adds pkgs.formats.kdl, a Nix representation of the KDL document language using the standard pkgs.formats format.

Also inits json2kdl, a custom-built application which is required for the core functionality of pkgs.formats.kdl.

Closes #198655.

Things done

  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandboxing enabled in nix.conf? (See Nix manual)
    • sandbox = relaxed
    • sandbox = true
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 24.05 Release Notes (or backporting 23.05 and 23.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
  • Fits CONTRIBUTING.md.

Add a 👍 reaction to pull requests you find important.

@feathecutie

This comment was marked as resolved.

@feathecutie feathecutie force-pushed the formats-kdl branch 2 times, most recently from f97ae1b to 7fc1115 Compare March 12, 2024 12:22
@feathecutie feathecutie marked this pull request as ready for review March 12, 2024 22:58
@sodiboo
Copy link
Contributor

sodiboo commented Mar 14, 2024

How come we're depending on an external application to convert to kdl? It's fairly trivial to do in pure nix - and this has the advantage of letting you use the serialized kdl document within nix code without requiring import-from-derivation.

I know this, because i maintain a module for an application with a kdl config, and i use this approach to generate the kdl document: https://github.com/sodiboo/niri-flake/blob/fcc9ec28393f76d269a58db18f6c48595a0da56d/kdl.nix

Note that my implementation makes a couple opinionated decisions that may be somewhat unfit for nixpkgs (since i'm only catering to one application). But the actual serialization should be just about the same, though, as the internal representation is nearly identical.


What about type annotations? (note that in my case, the application i target explicitly rejects any documents with type annotations, so my implementation pretends they do not exist)

@AgathaSorceress mentions in a previous comment that they couldn't represent comments or type annotations in a good way. I conjecture, that it shouldn't be too hard to represent them in a way that we can serialize: simply replace all kdl-value with { type = nullOr identifier; value = kdl-value; }, and add a type field to the nodes. Yes, this internal representation is less "nice" that way, but it's necessary to fully represent KDL's data model in nix. When actually "writing" kdl documents within nix, you will likely use helper functions that can abstract away this field entirely for use cases where it is unwanted, so such code need not worry about type annotations unless the target application's configuration file actually considers them signficant

Comments are much less important, given that they are not significant for KDL's data model, but if we must support slashdash comments, we could totally have some attribute like commented = true; that is fairly trivial to implement. "freeform" comments (i.e. starting with // or delimited by /* */) are less feasible. We could have the children field be an array of kdl-nodes or some "comment" object (either a string, or disambiguated by e.g. _type field), but i find that to be unnecessary given that the only KDL consumer i know of that doesn't outright ignore comments is XML-in-KDL, and even that has an escape hatch with ! nodes.

@feathecutie
Copy link
Contributor Author

feathecutie commented Mar 15, 2024

I agree that KDL could very well be serialized entirely in Nix, and to be honest, that would be my personally preferred method too.

However, this implementation heavily follows the existing standards already set up by other formats in pkgs.formats like JSON or YAML, and both of these formats do not get serialized in Nix, even though they would arguable be even simpler to implement than KDL.
JSON gets serialized via builtins.toJSON while YAML gets serialized via a dedicated json2yaml application (provided by the remarshal package), so for this reason (and because of @AgathaSorceress's comment and reference implementation using a well maintained KDL library, instead of something we would have to write and maintain by hand), this is the path I decided to take too.
I did assume that import-from-derivation is generally frowned upon in nixpkgs, but if something as essential as pkgs.formats.yaml is also implemented this way, there has to be a good reason.

If a purely Nix based implementation is preferred, I would be up for helping to implement this (though it seems you (@sodiboo) do already have a pretty good working implementation), however this is my first nixpkgs PR so I'm unsure who is fit to "decide" this.

Regarding type annotations, I agree that they would be easy to implement, and it would definitely make sense to integrate them. I will try to do that once I have some free time in the upcoming days.

I do honestly think that being able to generate comments is just not important for a language-to-language serializer, since comments do not (or at least should not) have semantic meaning, and the semantic meaning is the main focus of the document here. In my opinion, this is out of scope.

@sodiboo
Copy link
Contributor

sodiboo commented Mar 15, 2024

I did not realize the other formats also generate derivations, since i'm not too intimate with pkgs.formats. I've only been following the KDL discussion (since that's the one that's relevant to what i'm working on) and wanted it to be the best, without thinking about the design goals for pkgs.formats

I suppose, there's not much of a use case for manipulating these formats after the fact (if you need to parse -- just don't serialize in the first place), but i do think it would be necessary for e.g. embedding one format within another. That's probably somewhat of an esoteric use case though, and i don't know of any tooling that actually wants this. Ultimately, it seems pkgs.formats is optimized for creating human-readable config files. Not just valid data in the target format; but nicely formatted data. builtins.toJSON fits my intuitive "it's just nix" (no IFD) even if it technically could be done from actual pure (exclusively) nix code, but pkgs.formats.json doesn't, because it also uses jq on the output to format the resulting file. Even pkgs.formats.elixirConf, which is actually implemented in nix, still depends on a formatter to produce the result, just like pkgs.formats.json does!

Given how elixirConf is implemented, i disagree that using json2kdl over a nix-native serializer is in any way "following standards" better, but if we're ultimately just generating a derivation, i don't think the difference actually matters too much.

@Lyndeno
Copy link
Contributor

Lyndeno commented Jul 30, 2024

Any blockers?

@feathecutie
Copy link
Contributor Author

The only remaining blocker is the type annotation support mentioned in #295211 (comment) that I never got around to implementing.

This should be a quick fix, but could technically also be implemented in a follow-up PR in order for this PR to be merged sooner.

@feathecutie feathecutie force-pushed the formats-kdl branch 2 times, most recently from bf5e571 to 9c463e5 Compare July 31, 2024 14:03
@feathecutie
Copy link
Contributor Author

Type annotation support has been added now, and seems to work fine based on initial tests. From my point of view, this PR is now ready to merge if there are no other issues.

@h7x4
Copy link
Member

h7x4 commented Aug 1, 2024

Thank you for the great work @feathecutie.

I have been thinking about merging and overriding. Considering nix attrsets are completely orderless and KDL nodes are order-sensitive, representing them as lists are the only real choice. But being able to override previously set values is one of the great strengths of the module system IMO. But this doesn't really work that well with lists.

Do you think it would be possible to create a type for the format where you could optionally use attrsets instead of lists (for applications that does not care about order), and defer type coercion until generation or until in needs to merge with something that is already a list? I'm hoping this would let the user override freeform KDL configuration.

I suppose this would not be a breaking change from the existing code, so we could merge this first and then go about extending it in another PR. But I'm interested to hear your thoughts, since you're the author of the current implementation.

@h7x4
Copy link
Member

h7x4 commented Aug 1, 2024

How come we're depending on an external application to convert to kdl? It's fairly trivial to do in pure nix - and this has the advantage of letting you use the serialized kdl document within nix code without requiring import-from-derivation.
[...]
Given how elixirConf is implemented, i disagree that using json2kdl over a nix-native serializer is in any way "following standards" better,

@sodiboo just for some context, there has been a bit of pushback on large generators written purely in nix before. I believe the main concern is evaluation speed, coupled with some other things: #208747 (comment). But I don't think there's a clear consensus

@feathecutie
Copy link
Contributor Author

feathecutie commented Aug 1, 2024

Do you think it would be possible to create a type for the format where you could optionally use attrsets instead of lists (for applications that does not care about order), and defer type coercion until generation or until in needs to merge with something that is already a list? I'm hoping this would let the user override freeform KDL configuration.

That sounds like a good idea, I'm just not sure exactly what would be the best way to implement that. Did you have any specific format of attrset in mind?

One option that would feel "natural" in my opinion would be to simply have the names of nodes be the keys of the attrset, but even disregarding order, that would still not work because of the fact that KDL allows multiple nodes with the same name.
I guess another option would be to group nodes of the same name under their name, and then have a list/attrset of nodes per node name?
Besides that, one could also just completely disregard the keys of the attrset (allowing completely arbitrary keys) and instead specify all the data of the nodes in the values. That would definitely be the most flexible option, but it would require the user to think of a lot of arbitrary key names (that don't ever actually end up being used in the end), which doesn't sound very intuitive either.

I'm also not entirely sure if this is actually the correct level to implement merging at, since the main job of this format is just to somehow translate Nix to KDL, and KDL simply does not have a generalizable concept of "merging" configurations.

Depending on how KDL is used in each specific module or usecase, it might very well be possible that configurations should be merged differently (e.g. simply concatenating all lists, or replacing all nodes of a specific name, or something else entirely). In these cases, it might make more sense to add custom options that do not directly translate to formats.kdl and instead to generate the expected Nix format for formats.kdl from these custom options.

I also realize that some modules might simply wish to use formats.kdl directly (when the expected KDL is relatively freeform and can't be translated to properly typed options) and it should still be possible to somehow properly override that. Like I said, I'm just not entirely sure how to best approach this, so any input would be welcome.

@getchoo getchoo mentioned this pull request Oct 13, 2024
13 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

request: pkgs.formats.kdl to support applications adopting KDL config language
8 participants