lib.path.subpath.{isValid,normalise}: init #205190

infinisil · 2022-12-08T21:31:36Z

Description of changes

Creates a new lib.path library component, originally proposed in #200718. This commit adds three main parts of it:

The design document covering the main design decisions for this library
A lib.path.subpath.isValid function to check whether a value is a valid subpath according to the design document
A lib.path.subpath.normalise function implementing a safe normalisation of subpaths according to the design document.

In the future, further library functions will be built on these two functions; they are the workhorses of this path library. The next functions from #200718 that can be implemented are subpath.join and append; see #210426 for progress.
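As a rough illustration of the two functions described above, the following sketch evaluates them on a few inputs. It assumes that `<nixpkgs>` on `NIX_PATH` points at a Nixpkgs revision that already contains `lib.path`; the attribute names on the left are made up for the example.

```nix
let
  # Assumption: <nixpkgs> resolves to a Nixpkgs checkout that includes lib.path.
  lib = import <nixpkgs/lib>;
  inherit (lib.path) subpath;
in
{
  # isValid never throws; it only reports whether the value is a valid subpath.
  plain = subpath.isValid "foo/bar";     # true
  notAString = subpath.isValid null;     # false
  empty = subpath.isValid "";            # false
  absolute = subpath.isValid "/foo";     # false
  parent = subpath.isValid "../foo";     # false

  # normalise returns a canonical "./"-prefixed form and throws on invalid input.
  doubleSlash = subpath.normalise "foo//bar";  # "./foo/bar"
  dotComponent = subpath.normalise "foo/./bar"; # "./foo/bar"
  trailingSlash = subpath.normalise "foo/bar/"; # "./foo/bar"
  currentDir = subpath.normalise ".";           # "./."
}
```

Evaluating the file with `nix-instantiate --eval --strict` should print the attribute set with the values noted in the comments.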

This work is sponsored by Antithesis ✨

Things done

Wrote documentation
Wrote code-covering tests
Wrote property tests

roberth

Need to fix a crucial term.

lib/path-design.md

infinisil · 2022-12-09T19:05:53Z

In #200718 we originally wanted to use path.relative.normalise, but I used path.relativeNormalise because the doc rendering didn't work with path.relative.normalise: it associated the docs with path.relative rather than path.relative.normalise. I created a PR to fix this in nixdoc (nix-community/nixdoc#27), so I'll switch this PR back to relative.normalise as originally intended and add that patch to the nixdoc build.

fricklerhandwerk

That's the cleanest, most self-explaining code I have seen in months. Only lots of nitpicks about phrasing in the docs.

I added a few suggestions which add periods to the argument lines, but maybe this is a bad idea. Specifically it makes those lines which end in ./. and the like harder to read, and I don't think it makes sense to reword those artificially. Feel free to amend them and remove the periods if you don't agree, or make the change consistent otherwise.

Most importantly I think we should be careful not to overdo the subpath thing. We can keep it generic where we can be sure it will stay generic. That would allow for minimal changes in case we ever need to extend the library towards actual relative paths in a safe, backwards-compatible way.

lib/path-design.md

lib/path.nix

doc/doc-support/lib-function-docs.nix

fricklerhandwerk

we don't even know if we need a relative normalisation yet. Personally I don't think we'll ever need it.

That's an argument in favor of not making that distinction in the name and going for path.relative.normalise as we originally intended. We can still say in lib.path.append that it takes a subpath as a second argument, as defined in the design document. If we should ever need it, relaxing relative paths and splitting out a subpath variant will not break existing code, and can be supported by warnings to consumers.

@roberth please review. Other than the last open naming question, this looks fine to merge to me.

lib/path.md

lib/path.nix

fricklerhandwerk · 2022-12-12T21:39:28Z

lib/path.nix

+  # Splits and normalises a subpath string into its components.
+  # Errors for ".." components and doesn't include "." components
+  splitSubpath = path: errorPrefix:


Suggested change

-  # Splits and normalises a subpath string into its components.
-  # Errors for ".." components and doesn't include "." components
-  splitSubpath = path: errorPrefix:
+  # Split and normalise a relative path into its components, skipping "." components.
+  split = path:

There is nothing specific to subpaths left here. Since this function is internal we may as well use a short name.

We'd run into infinite recursion with split though, because we use inherit (builtins) split and use it in this function.

splitRelPath and joinRelPath?

TIL I can copy html and github will accept that as markdown.

rel m (plural rellen, diminutive relletje n)

riot

scandal, outrage

Oh neat. Also similarly: You can select text from a GitHub comment, then click "Quote Reply" and it will insert only the selected text as a quote.

This is an internal name anyways, we don't need to care about the name that much imo :)

because we use inherit (builtins) split and use it in this function

You can rename the import to splitString! Okay, this is getting into a level of detail that suggests the implementation is good enough.

Yeah "Quote Reply" is also nice.

internal

Is it a scandal if no-one knows?

I guess what I was going to say is that "relative" isn't quite right here.

because we use inherit (builtins) split

We don't have to.

I guess the risk is someone might want to expose the function and then suddenly it's not internal anymore. Scandalous!
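The naming clash discussed in this thread can be sketched as follows. The helper name `splitRelPath` is taken from the suggestion above, but its body here is a simplified assumption (it does not skip "." components) and is not the code from this PR.

```nix
let
  # Problematic variant (left commented out): inside its own body, `split`
  # refers to the function being defined rather than to builtins.split,
  # so evaluation would recurse without terminating.
  # split = path: builtins.filter builtins.isString (split "/+" path);

  # Option 1: bind the builtin under a different name.
  splitString = builtins.split;

  # Option 2: call the builtin fully qualified, as in the comparison below.
  # Hypothetical internal helper: split on runs of slashes and drop the
  # empty match lists that builtins.split interleaves with the pieces.
  splitRelPath = path:
    builtins.filter builtins.isString (splitString "/+" path);
in
{
  components = splitRelPath "foo//bar/baz"; # [ "foo" "bar" "baz" ]
  sameWithQualifiedCall =
    builtins.filter builtins.isString (builtins.split "/+" "foo//bar/baz")
    == splitRelPath "foo//bar/baz";          # true
}
```

Either option avoids the self-reference, so the short internal name stays available if it is ever wanted.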

Should I add you, @roberth and @fricklerhandwerk, as code owners of lib/path.{nix,md}? I'm already a code owner of lib. This way we'll get notifications for future updates to this library.

lib/path.nix

fricklerhandwerk · 2022-12-12T21:54:12Z

lib/path.nix

+    if ! isString value then
+      throw "${errorPrefix}:\n    Not a string"
+    else if value == "" then
+      throw "${errorPrefix}:\n    The string is empty"
+    else if substring 0 1 value == "/" then
+      throw "${errorPrefix}:\n    The string is an absolute path because it starts with `/`"
+    else if match "(.*/)?\\.\\.(/.*)?" value != null then
+      # We don't support ".." components, see ./path.md
+      throw "${errorPrefix}:\n    The string contains a `..` component, which is not allowed in subpaths"


Suggested change

-    if ! isString value then
-      throw "${errorPrefix}:\n    Not a string"
-    else if value == "" then
-      throw "${errorPrefix}:\n    The string is empty"
-    else if substring 0 1 value == "/" then
-      throw "${errorPrefix}:\n    The string is an absolute path because it starts with `/`"
-    else if match "(.*/)?\\.\\.(/.*)?" value != null then
-      # We don't support ".." components, see ./path.md
-      throw "${errorPrefix}:\n    The string contains a `..` component, which is not allowed in subpaths"
+    if ! isString value then throw "${errorPrefix}:
+    Not a string"
+    else if value == "" then throw "${errorPrefix}:
+    The string is empty"
+    else if substring 0 1 value == "/" then throw "${errorPrefix}:
+    The string is an absolute path because it starts with `/`"
+    else if match "(.*/)?\\.\\.(/.*)?" value != null then throw "${errorPrefix}:
+    The string contains a `..` component, which is not allowed in subpaths"

If it's an ugly hack already, why not make it a creepy hack that leverages the code's own indentation! 😛

Haha damn, I'd rather not because it's really not clear how much it indents like this, but this is interesting.

We could do this though:

if ! isString value then
  throw ''
    ${errorPrefix}:
        Not a string''
else if value == "" then
  throw ''
    ${errorPrefix}:
        The string is empty''
else if substring 0 1 value == "/" then
  throw ''
    ${errorPrefix}:
        The string is an absolute path because it starts with `/`''
# We don't support ".." components, see ./path.md
else if match "(.*/)?\\.\\.(/.*)?" value != null then
  throw ''
    ${errorPrefix}:
        The string contains a `..` component, which is not allowed in subpaths''

At least this still expresses intent, rather than a potential misunderstanding that kinda worked out.
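To make the difference between the two string styles concrete, here is a small sketch comparing an escaped double-quoted message with an indented (`''`) string. The `errorPrefix` value is a placeholder chosen for the example. Indented strings strip the smallest common leading indentation, so only the extra four spaces on the second line survive.

```nix
let
  # Placeholder prefix, not the exact wording used in the library.
  errorPrefix = "example: Argument is not a valid subpath string";

  # Explicit newline and indentation via escape sequences.
  escaped = "${errorPrefix}:\n    Not a string";

  # Same message as an indented string: the common 4-space indentation is
  # removed, leaving the second line indented by 4 spaces relative to the first.
  indented = ''
    ${errorPrefix}:
        Not a string'';
in
escaped == indented # true
```

Unlike the multi-line double-quoted variant, the result here does not depend on how deeply the surrounding code happens to be indented.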

It would be good to add the value `value` to the error message when appropriate. That should make troubleshooting a lot quicker in many cases.

The value is included in the prefix:

nix-repl> path.subpath.normalise "/foo/bar"
error: lib.path.subpath.normalise: Argument "/foo/bar" is not a valid subpath string:
    The string is an absolute path because it starts with `/`

(I would like to see a streaming toPretty version that can also handle failures and early exit at some point though, see #98761 for a previous attempt of mine)

You can't catch aborts, and currently not even C++ really catches a stack overflow... and then there's non-termination of course.

Not entirely unsolvable for the purpose of error messages, in theory, but not something today's Nix will let us solve.

Perhaps a nice trick for attrsets specifically is to make use of the fact that at least the keys exist. If the attrNames are few enough, we could print them; otherwise show the "first" and "last" attribute name and perhaps make use of _type if there is one, if you're in a daring mood.

Sounds like we could have a safePrint function for such heuristics. It's a general problem.
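The attribute-name heuristic mentioned above could look roughly like this. `describeAttrs` is a hypothetical helper invented for this sketch and is not part of Nixpkgs; it relies only on `builtins.attrNames`, which does not evaluate attribute values.

```nix
let
  # Hypothetical helper: summarise an attribute set using only its names,
  # so that no attribute value is ever forced.
  describeAttrs = limit: attrs:
    let
      names = builtins.attrNames attrs;
      count = builtins.length names;
    in
    if count <= limit then
      "{ ${builtins.concatStringsSep ", " names} }"
    else
      # Too many names: show only the first and last (in sorted order).
      "{ ${builtins.head names}, ..., ${builtins.elemAt names (count - 1)} } (${toString count} attributes)";
in
describeAttrs 3 {
  a = throw "never evaluated";
  b = 2;
  c = 3;
  d = 4;
}
# "{ a, ..., d } (4 attributes)"
```

Because the values are never touched, a throwing or non-terminating attribute cannot break the error message itself, which is the property the discussion is after.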

Every time I try to do common-sense things in the Nix language it feels like writing a standard library from scratch. Even when writing a standard library from scratch.

I also think it should be good enough to account for the common errors at this point. The Nixpkgs Architecture Team can still open the topic of the standard library once the package reorg is done, and address the issue wholesale, with a targeted effort. Otherwise other targeted efforts like this one will constantly stall on side issues.

Every time I try to do common-sense things in the Nix language it feels like writing a standard library from scratch. Even when writing a standard library from scratch.

That's an indication that such code shouldn't be written in Nix at all. It's not a general purpose language and attempts to use it as such will always be painful.

In fact it's not all that painful, it just feels a bit like pioneering because the tools are not readily available.

The reason why "just don't use the Nix language for that purpose" doesn't always work is that the alternative is to use intermediate build steps. And you want to avoid those at almost all costs, because they are orders of magnitude slower than some clumsy string wrangling.

And no, I don't think building the tools into the language wholesale is the right answer to that issue. Because that's where you surely won't ever get rid of unfortunate design decisions.

edolstra · 2022-12-13T11:03:21Z

I'll repeat my earlier objection: Nix is not a suitable language for complex string manipulation, since it's way too slow for that. It's a DSL for describing compositions of packages/configurations. If we need operations for e.g. normalising paths, they should be added to the language (for instance, the CanonPath type introduced by the lazy-trees branch ensures a canonical representation of paths inside the evaluator).

Also, I would suggest doing this as a flake first. If it turns out to be sufficiently useful, it can always be added to the standard library. But once it's in the standard library, we're stuck with it forever.

lib/path.md

fricklerhandwerk · 2022-12-13T11:18:59Z

Nix is not a suitable language for complex string manipulation, since it's way too slow for that.

This is about first making it work. As @infinisil already said, if it turns out to be too slow, we can still make it fast by including it in the Nix language.

Also, I would suggest doing this as a flake first.

There is enough interest and capacity to add this to the standard library and maintain it for a reasonable amount of time. It's much less useful and practically impossible to discover outside of Nixpkgs.

Also, the goal of this PR is to clear the ground for source combinators, which also have multiple interested parties waiting for them to land.

But once it's in the standard library, we're stuck with it forever.

That's not completely true. We can have breaking changes with sufficient grace periods. Nixpkgs is easy to pin for consumers (as opposed to Nix), and can be refactored internally. The Nixpkgs Architecture Team discussed strategies for handling compatibility multiple times now.

Also, of all things, this design should rather be in the standard library as opposed to some other quite random-sounding functions which are already in there.

roberth

Impressive test. I might have done something a bit more humble like in the source combinator tests, but this is better.

lib/tests/path.sh

infinisil · 2022-12-23T20:14:44Z

Rebased on master after merge conflicts. Also reorganized the commit history.

lib/path/tests/generate.awk

edolstra · 2023-01-03T13:18:56Z

I think merging this is a mistake, and I don't think my objections have been addressed.

It's not obvious that we need this in the standard library.
Performance: Nix is not a suitable language for this kind of library - string manipulation is slow and inefficient. It would be very bad for evaluation speed / memory usage if subpath.normalise is used widely.
It imposes path semantics that may be different from Nix's. It's not helpful for Nix users if Nix and Nixpkgs have different notions of how paths behave. If there are problems with Nix's path type, it should be addressed in Nix.
It makes learning the language harder, since it introduces concepts like "subpaths" that don't appear in the Nix reference manual.
It should have been done as a flake initially, since by putting it in the standard library we're stuck with it for a long time.

infinisil · 2023-01-03T13:49:35Z

@edolstra Most of these have been addressed before:

It's not obvious that we need this in the standard library.

Agreed, it's not obvious, but it is needed for things like source combinators, which need to handle subpaths. We plan to add lib.path.append in the next iteration, which is just like <path> + ("/" + <string>) but safer; it depends on lib.path.subpath.isValid. In another iteration we plan to add something like lib.path.relativeTo, which removes an absolute prefix from a path and returns the relative subpath that's left; it depends on lib.path.subpath.normalise. Getting the semantics of the base functionality correct is very important.
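The "but safer" part can be illustrated with a short sketch. It assumes a Nixpkgs revision in which the planned lib.path.append has landed (it was added in a follow-up PR, not in this one) and that `<nixpkgs>` resolves to such a revision.

```nix
let
  # Assumption: <nixpkgs> points at a Nixpkgs revision that provides lib.path.append.
  lib = import <nixpkgs/lib>;
in
{
  # Plain concatenation resolves ".." silently and leaves the base directory.
  escaped = /foo + ("/" + "../bar"); # /bar

  # The ordinary case works the same way in both forms.
  appended = lib.path.append /foo "bar/baz"; # /foo/bar/baz

  # An invalid subpath makes append throw; tryEval turns that into a result
  # we can inspect instead of aborting evaluation.
  rejected = (builtins.tryEval (lib.path.append /foo "../bar")).success; # false
}
```

The point of building on subpath.isValid is visible in the last attribute: the surprising `/bar` result is replaced by an explicit error.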

Performance: Nix is not a suitable language for this kind of library - string manipulation is slow and inefficient. It would be very bad for evaluation speed / memory usage if subpath.normalise is used widely.

Was already addressed twice, in #200718 (comment) and #205190 (comment)

It imposes path semantics that may be different from Nix's. It's not helpful for Nix users if Nix and Nixpkgs have different notions of how paths behave. If there are problems with Nix's path type, it should be addressed in Nix.

This was partly addressed here, but is also addressed in the merged design document.

It makes learning the language harder, since it introduces concepts like "subpaths" that don't appear in the Nix reference manual.

Subpaths aren't part of the language; they're just a Nixpkgs concept.

It should have been done as a flake initially, since by putting it in the standard library we're stuck with it for a long time.

Already addressed here and here

vcunat · 2023-01-10T11:42:36Z

The tests fail with (at least) one particular random seed:

Using seed 13110, use `lib/path/tests/prop.sh 13110` to reproduce this result
lib/path/tests/prop.sh: line 82: ['']='': bad array subscript

(I could reproduce that on current master and on this merge commit introducing the test.)

infinisil · 2023-01-10T13:23:15Z

@vcunat Oh wow, thanks for that, fixed in #210042!

infinisil requested review from roberth and fricklerhandwerk December 8, 2022 21:31

infinisil requested review from edolstra and nbp as code owners December 8, 2022 21:31

github-actions bot added the 8.has: documentation label Dec 8, 2022

infinisil mentioned this pull request Dec 8, 2022

Path library design #200718

Closed

ofborg bot added 10.rebuild-darwin: 0 10.rebuild-linux: 1-10 labels Dec 8, 2022

infinisil force-pushed the lib.path.relativeNormalise branch 2 times, most recently from 3d97c55 to c2584ae Compare December 8, 2022 21:49

roberth requested changes Dec 8, 2022

roberth reviewed Dec 8, 2022

roberth reviewed Dec 8, 2022

infinisil force-pushed the lib.path.relativeNormalise branch 2 times, most recently from 8f1e221 to 6e85f9f Compare December 9, 2022 21:16

ofborg bot requested a review from tazjin December 9, 2022 21:20

ofborg bot added 10.rebuild-darwin: 1-10 10.rebuild-darwin: 1 and removed 10.rebuild-darwin: 0 labels Dec 9, 2022

fricklerhandwerk reviewed Dec 10, 2022

infinisil force-pushed the lib.path.relativeNormalise branch from eac0c14 to a568997 Compare December 12, 2022 20:14

roberth reviewed Dec 12, 2022

roberth reviewed Dec 12, 2022

fricklerhandwerk approved these changes Dec 12, 2022

infinisil changed the title ~~lib.path: init with relativeNormalise~~ lib.path: init with subpath.normalise Dec 13, 2022

edolstra reviewed Dec 13, 2022

roberth reviewed Dec 13, 2022

lib: Allow doc rendering for lib/<name>/default.nix

8496683

infinisil force-pushed the lib.path.relativeNormalise branch from 82a4aa8 to 1304d92 Compare December 23, 2022 20:13

infinisil changed the title ~~lib.path: init with subpath.normalise~~ lib.path: init with subpath.{isValid,normalise} Dec 23, 2022

infinisil changed the title ~~lib.path: init with subpath.{isValid,normalise}~~ lib.path.subpath.{isValid,normalise}: init Dec 23, 2022

infinisil requested a review from roberth December 23, 2022 20:19

ofborg bot removed the 10.rebuild-darwin: 1 label Dec 23, 2022

fricklerhandwerk reviewed Jan 3, 2023

infinisil and others added 4 commits January 3, 2023 13:19

lib.path: init README.md document

ba7ed22

Adds initial work towards a `lib.path` library

Originally proposed in #200718, but has since gone through some revisions

Co-Authored-By: Valentin Gagarin <valentin.gagarin@tweag.io>
Co-Authored-By: Robert Hensing <robert@roberthensing.nl>

lib.path.subpath.isValid: init

98fbcf1

The first path library function

lib.path.subpath.normalise: init

63dd6d2

lib.path.subpath.normalise: add property tests

0667ef5

infinisil force-pushed the lib.path.relativeNormalise branch from 1304d92 to 0667ef5 Compare January 3, 2023 12:21

ofborg bot added the 10.rebuild-darwin: 1 label Jan 3, 2023

roberth approved these changes Jan 3, 2023

roberth merged commit f61d4d3 into master Jan 3, 2023

edolstra deleted the lib.path.relativeNormalise branch January 3, 2023 12:53

This was referenced Jan 3, 2023

lib.path.append: init #208887

Merged

lib.path.subpath.join: init #209099

Merged

lib.path.difference: init #209375

Closed

infinisil mentioned this pull request Jan 10, 2023

lib/path/tests: Fix property tests when "-n" is generated #210042

Merged

This was referenced Jan 12, 2023

lib.path.hasPrefix, lib.path.removePrefix: init #210423

Closed

Path library functions #210426

Closed

This was referenced Jan 26, 2023

lib: Create a generic property testing framework #212858

Closed

Experimental python bindings NixOS/nix#7735

Draft

infinisil mentioned this pull request Mar 15, 2023

[WIP] lib.sources.predicateFilter: init #221361

Closed
