Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GC rooting misses C dependencies of a Haskell project #62

Open
shajra opened this issue Jan 10, 2021 · 20 comments
Open

GC rooting misses C dependencies of a Haskell project #62

shajra opened this issue Jan 10, 2021 · 20 comments

Comments

@shajra
Copy link

shajra commented Jan 10, 2021

Hi,

I recently went through every Nix extension of Direnv extension I could find, and did a comparative study. I found that the approach you use seems to miss creating a GC root for C-libraries of a Haskell project.

To test this out, you can clone my shajra/nix-haskell-hls project and do the following steps:

# setup, assuming nix-direnv is installed
#
git clone https://github.com/shajra/nix-haskell-hls.git
echo use_nix > nix-haskell-hls/examples/example-cabal/.envrc
direnv allow nix-haskell-hls/examples/example-cabal/.envrc
cd nix-haskell-hls/examples/example-cabal

# prove that the program compiles and runs
#
cabal run all

# get out of the direnv environment keeping dependencies alive
#
cd ..

# clear out anything not GC rooted
#
nix-collect-garbage

# go back into the environment and see that we're missing a runtime dependency
cd example-cabal
cabal run all

This will give you the following error:

Warning: The package list for 'hackage.haskell.org' does not exist. Run 'cabal
update' to download it.RemoteRepo {remoteRepoName = "hackage.haskell.org",
remoteRepoURI = http://hackage.haskell.org/, remoteRepoSecure = Just True,
remoteRepoRootKeys =
["fe331502606802feac15e514d9b9ea83fee8b6ffef71335479a2e68d84adc6b0","1ea9ba32c526d1cc91ab5e5bd364ec5e9e8cb67179a471872f6e26f0ae773d42","2c6c3627bd6c982990239487f1abd02e08a02e6cf16edb105a8012d444d870c3","0a5c7ea47cd1b15f01f5f51a33adda7e655bc0f0b0615baa8e271f4c3351e21d","51f0161b906011b52c6613376b1ae937670da69322113a246a09f807c62f6921"],
remoteRepoKeyThreshold = 3, remoteRepoShouldTryHttps = True}
Resolving dependencies...
cabal: Could not resolve dependencies:
[__0] trying: example-haskell-app-0.1.0.0 (user goal)
[__1] unknown package: text-icu (dependency of example-haskell-app)
[__1] fail (backjumping, conflict set: example-haskell-app, text-icu)
After searching the rest of the dependency tree exhaustively, these were the
goals I've had most trouble fulfilling: example-haskell-app, text-icu

The problem is that the C-library text-icu that this project depends on is being garbage collected.

You can do the steps above with the .envrc provided by the project (not doing the echo setup step above), and you shouldn't see this problem.

The .envrc file I provide uses a project I ended up making in all of this exploration called Lorelei. This project delegates very heavily to Lorri to sort out what to GC root. I think Lorri has the best algorithm I've seen. My project is an attempt to get the benefits of Lorri, but without needing a daemon running, for those people that find daemons a hassle.

Also, I believe that the Nixify approach doesn't have this problem with GC rooting C dependencies.

For the longest time I've been trying to avoid writing something like Lorelei, because I was frustrated that I'd be making yet-another-option, which can be confusing for new users. But now that it's written, I tried to pull in all the features I saw in your project, Nixify, and Sorri that I could think to. Here's what it has thus far:

  • automatically detectable files to watch (lorri/sorri style)
    • uses Lorri code to avoid implementation drift
    • provides an option to control depth of auto-detection, which can help with Nix evaluation speed
  • user-specified files to watch by mtime (nix-direnv style)
  • user-specified files to watch by content hash (Nixify style)
  • keeping the last-n environments to save time when reverting back
  • provides an option to bypass environment caching, but keep GC rooting

One thing I think your project does a lot more is integrate with the Nix community more. You're in NUR, and you're also in Nixpkgs. This is stuff I could do too. But I was hoping maybe we could join forces if you liked my approach.

There's one more difference of implementation between your project and mine. You provide a shell script directly, whereas I build mine out with Nix, allowing me to slip in references to /nix/store for any interesting executables I decide to use as helpers. This means users don't have to install tools like jq, but I can use them if I want to. Also, it locks things down more, which I think is in the spirit of Nix.

Anyway, let me know what you think. Also thanks for putting work in to this project. I definitely took ideas from it.

@shajra
Copy link
Author

shajra commented Jan 10, 2021

Oh yeah, I forgot to mention. I don't cover Flakes with my project. I just haven't gotten around to using them. I was kind of waiting for them to be released more officially. Someone I trust mentioned that his experience using them was kind of buggy, and I guess that anecdote left its impression on me. But I'm sure Flakes will harden. In the meantime, I'm always a bit conservative about turning on experimental features.

@Mic92
Copy link
Member

Mic92 commented Jan 10, 2021

I tried lorri in the past but its evaluation is very slow when the NIX_PATH points to a nixpkgs git checkout, which is why I cannot use it. Is this different in your implementation? Also I use flakes in more and more projects.

@Mic92
Copy link
Member

Mic92 commented Jan 10, 2021

How does cabal shells track c dependencies? Usually our shell environments uses NIX_CFLAGS_COMPILE/NIX_LDFLAGS, which should have garbage collection roots.

@shajra
Copy link
Author

shajra commented Jan 10, 2021

@Mic92 I have an option to make the evaluation of Lorri a lot faster. It turns out it's due to an overlay they do to get extra tracing when reading files. But that extra tracing doesn't really seem to capture much for my projects.

Also, with my project, you don't have to use Lorri's autodetection at all. If you use manual watching exclusively, I'll just use Lorri for managing the environment variable saving/loading. Sorry, I'm editing this comment because I misspoke. The evaluation of the environment always does some autodetection with Lorri. But that amount of autodetection (what's done in my patched version of Lorri) seems to be as fast as nix-shell.

@shajra
Copy link
Author

shajra commented Jan 10, 2021

@Mic92 Lorri's hack for capturing all environment variables is in this file: https://github.com/target/lorri/blob/master/src/logged-evaluation.nix

This is what I use. However, with switches, I give the user the option to use a patched version that doesn't have the tracing that making Lorri really slow on some projects: https://github.com/shajra/direnv-nix-lorelei/blob/user/shajra/next/nix/remove_trace.patch

@shajra
Copy link
Author

shajra commented Jan 10, 2021

@Mic92 to sum up, with all the options I have in Lorelei, I got speeds as fast a nix-shell call, which seems like the right benchmark for comparison. But there's options to do things exactly as Lorri does it, which makes it notably slower than that.

@Mic92
Copy link
Member

Mic92 commented Jan 10, 2021

When I started this project none of the alternatives existed and I just made a project out of the wiki article in the direnv wiki. I am happy to switch to something else if it implements the features I currently have with direnv-nix and have one project less to maintain.

@shajra
Copy link
Author

shajra commented Jan 10, 2021

@Mic92 yeah, I hear you. I was apprehensive of porting over the use_flake stuff, though, because I'm not emotionally ready to upgrade Nix on my systems just yet. Since I'm not using flakes at all, it seemed like a bad idea to try to support them.

So maybe your project can reduce its surface area to just use_flake? And then maybe in the README point to my project for non-flake usage?

I guess that makes for two projects a user might have to install. I don't know if that kind of ergonomics should be acceptable. Maybe if both projects are in NUR that could be a compromise. Because I named my function use_nix_gcrooted, I shouldn't have any naming conflicts if both projects were installed into ~/.config/direnv/lib.

@Mic92
Copy link
Member

Mic92 commented Jan 10, 2021

I think I will wait until I can deprecate nix-direnv as a whole. Because than also the home-manager module can point to a different project.

@shajra
Copy link
Author

shajra commented Jan 10, 2021

Sounds good, I'll probably update the wiki on the direnv site with a link to my project. In the meantime, you have this issue to track the bug. I think one approach you can do is to use the Nixify style of building out your environment, which makes more GC links based on the derivation.

One limitation of Nixify's approach is that you must have a Nix expression in a file. For example, you can't support nix-shell's -p option. But I think the community generally frowns upon using -p programmatically. It's a bit clever with respect to its implementation. This limitation is true for Lorri and Lorelei as well.

@bbenne10
Copy link
Contributor

Is there anything to actually do about this issue? Or can I go ahead and close it?

@shajra
Copy link
Author

shajra commented Apr 28, 2022

@bbenne10 there's a bug with this algorithm. The solution is implemented by other projects (Nixify being one of them). The author is only using flakes, which are now supported by Direnv's standard library.

I feel it wouldn't be right to just close this issue. It's genuinely a bug, and not just a design choice.

@shajra
Copy link
Author

shajra commented Apr 28, 2022

Rereading the comment history, I think wonder how close we are to the proposal to deprecate nix-direnv as a whole. Is the Direnv flakes support adequate? I haven't done much of a comparison. However, I'm personally in the process of moving to flakes. So this might be something I figure out soon enough.

@bbenne10
Copy link
Contributor

direnv's built-in support for flakes is "good enough", I think.
From my view, the shortcomings are:

  • It doesn't cache the output of nix print-dev-env as we do.
  • It doesn't symlink into gcroots so garbage collecting will clean up the available environments (I'm not sure how much I care about this, personally).
  • It doesn't handle some environment variables in the same way (notably TMPDIR and XDG_DATA_DIR).

My involvement in nix-direnv is relatively new (though I'd been using it far longer than I've been helping with it), so I lack historical context on whether these matter anymore. I'll have to leave it up to @Mic92 to make those calls.

With regards to deprecation, there are reasons to indicate that we're closer than we were in the beginning of 2021, but I don't see it happening YET. @Mic92's rust-based implementation of handling this seems like a GREAT way to handle what all of this is doing in a faster and more convenient package.

@shajra
Copy link
Author

shajra commented Apr 28, 2022

One reason I feel passionately about this is the number of times someone uses what looks like the "official" recommendation Direnv/Nix support and encounters this bug. It's already an uphill battle getting people to adopt what can seem like a Rube Goldberg machine. And this matters a lot for industrial Nix programmers. Just calling this "good enough" is a bad look, I think.

I would feel differently if this wasn't so prominently on display in Nix packages and other documentation. I did my best to be clear in the Direnv wiki, but that's only one place, and people can miss it.

@shajra
Copy link
Author

shajra commented Apr 28, 2022

One more note, when I wrote Lorelei, I really tried to make it address all features and known bugs for non-flakes usage. It does use Lorri code, but I find this approach to be pretty robust. I also worked hard to make the fact it uses Lorri at all hidden from users (including addressing performance). However, I know just declaring oneself superior is also a bad look.

Really, I just want to try my best to dispassionately assess what approach we really need to be recommending to users to maintain Nix adoption and enthusiasm. So whatever gives users the most features, ergonomics (including documentation), the least amount of bugs, and committed support/maintenance.

But I worry that right now, we have way too many buggy "good enough" scripts floating around. Though it's true that when flakes are officially supported and mainstream, this probably may go away. Unfortunately, we're not there yet (the Nix 2.0 nix command is still experimental, and it's been so since 2018, I believe).

@bbenne10
Copy link
Contributor

bbenne10 commented Apr 28, 2022

You'll note that I didn't say that our solution was "good enough". I said that the built-in direnv one is "good enough", which I think is generally true!

If you've not got a large amount of nix expressions to go through, nix-direnvs caching may not matter for you. The expression parsing is an expensive step and caching the output of nix print-dev-env speeds us up from a 6s invocation to a <1s invocation for a large mono-repo project locally.

I tried to replicate your experiment as outlined in the opening comment and received the following:

without nix-direnv:

cd ~/code/personal/nix-direnv/nix-haskell-hls 
direnv: loading ~/code/personal/nix-direnv/nix-haskell-hls/.envrc
direnv: using nix
error: nix-shell requires a single derivation
Try 'nix-shell --help' for more information.

and with nix-direnv:

cd ~/code/personal/nix-direnv/nix-haskell-hls
direnv: loading ~/code/personal/nix-direnv/nix-haskell-hls/.envrc
direnv: using nix
direnv: ([/nix/store/m1bg798cqwny33dziqwzd8qs1a4xlsng-direnv-2.31.0/bin/direnv export zsh]) is taking a while to execute. Use CTRL-C to give up.
error: '' needs to evaluate to a single derivation, but it evaluated to 13 derivations
direnv: nix-direnv: renewed cache
direnv: export ~XDG_DATA_DIRS

@shajra
Copy link
Author

shajra commented Apr 28, 2022

You'll note that I didn't say that our solution was "good enough". I said that the built-in nix one is "good enough", which I think is generally true!

You're right. I was too quick to read and respond. My apologies. Slowing down.

Okay, so regarding flakes support, I'm also catching up myself. It's taking a while to convert all my projects to flakes and gain experience with nuances.

Regarding continuing non-flakes support. Let me try to reproduce my original test case and get back. Though Nix is really reproducible, some things have stayed the same, and some things have changed. Give me a moment, though, because of some other obligations. For all my goings-off I'll try to not leave this issue lingering. I just didn't want it closed.

@bbenne10
Copy link
Contributor

I don't want to give the impression that I don't care about these use-cases. I do. I simply lack some context and the know-how to fix them! I am working to improve the situation on both of those fronts.

The reason I am bringing this up is that we have recently switched to nix print-dev-env for everything, which is a large departure from when this ticket was opened. I want to ensure that the issue is still present. Far from wanting to close prematurely, I want to re-assess what is still relevant in light of somewhat unrelated sweeping changes.

If this is an actionable item still, I'd like to gather context and figure out a way forward in the best iinterest of all involved parties - developers and consumers!

@shajra
Copy link
Author

shajra commented Apr 28, 2022

Sounds good. I'll validate the problem exists with some old version of nix-direnv, and then see if I can validate that it's not there with the latest version. I'll also revisit some questions about timings if there's a new approach. I also have some ambition to upgrade my script to the latest Lorri code, though I suspect the core algorithm there hasn't changed.

As a side note, one of my major motivations for delegating to Lorri was what seemed like a lot of popular usage. Unfortunately, a lot of people got weird glitches with the service running in the background. But I liked that the algorithm seemed to work, and that it was maintained by people who knew Nix really well. So I made Lorelei (actually there was a prior attempt by someone else called Sorri). However, not all laziness is healthy, and it's true that Lorri might not have everything figured out. Also, which codebase is actively maintained ebbs and flows. It could be Lorri now, and then Nix-direnv later.

Will report back.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants