
Rewrite Cabal in Shake #4174

Open
ezyang opened this issue Dec 18, 2016 · 11 comments

@ezyang
Contributor

ezyang commented Dec 18, 2016

The Cabal build system today is very bad at avoiding redundant work. For example, when you ask it for a rebuild, it must always rerun the preprocessors... because it doesn't know any better. cabal new-build has a complicated change tracking mechanism built on top of the old-style Setup interface, because the Cabal library is simply not set up to handle it. Sharing a common dependency-graph abstraction, i.e., Shake, would help us solve these problems.
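
To make the preprocessor example concrete, here is a minimal sketch (assumed file names and tool; not actual Cabal code) of the kind of rule Shake gives us essentially for free: the preprocessor reruns only when its input changes, instead of on every build.

```haskell
-- Minimal sketch, not Cabal code: an Alex-style preprocessor rule.  Shake
-- records the dependency on the .x source, so a rebuild with an unchanged
-- source skips this rule entirely.
import Development.Shake
import Development.Shake.FilePath

main :: IO ()
main = shakeArgs shakeOptions $ do
  want ["dist/build/Lexer.hs"]

  -- dist/build/Foo.hs is generated from src/Foo.x
  "dist/build/*.hs" %> \out -> do
    let src = "src" </> takeBaseName out <.> "x"
    need [src]
    cmd_ "alex" [src, "-o", out]
```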

Here are the major architectural points that make sense to me at this point:

  • The rewritten Shake-based Cabal would not even attempt to work with old custom setups, as it will probably be very hard to maintain library compatibility in the new world order. The new Cabal would initially just be used by cabal-install for Simple setups, and cabal-install would keep its old code to work with (now-legacy) Cabal for existing custom setups.

  • Insofar as cabal-install uses the linked Cabal library for communicating with external Cabal (type definitions, marshalling, etc.), the old Cabal will need to be kept around, so the new one should live alongside it in parallel. But the parts of legacy Cabal not used by cabal-install can be scrapped, since existing Custom setups would only link against extant Cabal releases.

  • Per-package custom build rules are still extremely useful. But since we are making a clean break in the Cabal interface, and since this is one of the oldest parts of the cabal-install/Cabal ecosystem, this would be a good chance to rethink the interface from a clean slate. For example, we can assume that cabal-install/Shake/etc. would do all the solving, so packages should only work with the dependencies they are given. @Ericson2314 is a fan of packages defining their own Shake rules, or something else high-level and declarative that will integrate wonderfully with the new Cabal (see the sketch after this list).

  • Hadrian has some good ideas about how to design build systems for Haskell code (https://www.staff.ncl.ac.uk/andrey.mokhov/Hadrian.pdf); a rewrite of Cabal in Shake should take advantage of those ideas. (In particular, they have good ways of dealing with the tons and tons of flags we need to pass to the subprograms we call.)

  • ToolCabal (https://github.com/TiborIntelSoft/ToolCabal) did this already. There are probably lessons to be learned from that code base.

  • Although Shake-ifying Setup scripts would not directly improve parallelism, it could potentially hook into a Shake library for building Haskell modules which IS parallel (this might be the driver program described in Parallelise cabal build over modules #976), and then the parallelism would propagate to the entire build. Good!

  • (More here)
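
To illustrate the per-package rules idea, here is a purely hypothetical sketch (PackageEnv and customRules are invented names, not a proposed API): instead of a full Setup.hs driver, a package would export extra Shake rules that the new Cabal splices into its own build graph, working only with inputs the solver has already resolved.

```haskell
-- Hypothetical sketch only; PackageEnv and customRules are invented names.
import Development.Shake
import Development.Shake.FilePath

-- Environment handed to the package by the new Cabal: solving has already
-- happened, so the package only sees resolved paths and dependencies.
data PackageEnv = PackageEnv
  { buildDir :: FilePath
  , srcDir   :: FilePath
  }

-- The package's entire "custom setup": plain Shake rules, no IO driver.
customRules :: PackageEnv -> Rules ()
customRules env =
  -- e.g. generate a module before compilation; Shake reruns this only when
  -- the (hypothetical) template input changes.
  (buildDir env </> "autogen" </> "PackageInfo.hs") %> \out -> do
    need [srcDir env </> "PackageInfo.hs.in"]
    copyFileChanged (srcDir env </> "PackageInfo.hs.in") out
```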

CC @Ericson2314

@gbaz
Collaborator

gbaz commented Dec 18, 2016

"so that Cabal can be boot without pulling in Shake"

Skimming the Shake deps, it looks like what it depends on, while a bit broader than what Cabal depends on, is actually a pretty minimal set. The nonstandard deps don't have any further non-standard second-order deps themselves. I don't know the current story with Cabal's deps, but my sense is they've loosened up a tiny bit. In such a circumstance, the broader Shake set of deps may not be that bad...

@phadej
Collaborator

phadej commented Dec 18, 2016

Cabal is a boot library, so extra dependencies are not so easy to add. Luckily Cabal is an upgradeable package, but the situation is still difficult.

I don't like "live in parallel": keeping the two comparable is non-trivial work.

I'm not against this task, but I would like it to be done once new-build fully replaces build. If the main drawback at the moment is "we must rerun preprocessors", then IMHO there isn't an urgent need to accomplish this.

@gbaz
Collaborator

gbaz commented Dec 18, 2016

Ah, right, it's cabal-install which can acquire deps (slightly) more freely, I suppose.

@ezyang
Contributor Author

ezyang commented Dec 18, 2016

See #4175 for another example of something that probably would get better with a Shake based build system.

@ezyang
Contributor Author

ezyang commented Dec 18, 2016

I mean, "we must always rerun preprocessors" was just one example of the many, many operations that we have to redo when we call Cabal. Another example is the 'configure' step: whenever you reconfigure (e.g., because the Cabal file changed), everything has to be redone, even though most of it doesn't change; Shake would help with that as well.
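
As a sketch of how that could work (the rule itself is invented for illustration, though the oracle mechanism is real Shake API): each piece of configure output becomes its own tracked question, so changing one input only invalidates the rules that actually asked about it.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving, TypeFamilies #-}
-- Sketch of an incremental "configure" step via a Shake oracle: the GHC
-- version is one tracked fact, and rules depending on it rerun only when the
-- answer actually changes, not on every reconfigure.
import Development.Shake
import Development.Shake.Classes

newtype GhcVersion = GhcVersion ()
  deriving (Show, Eq, Hashable, Binary, NFData)
type instance RuleResult GhcVersion = String

configureRules :: Rules ()
configureRules = do
  _ <- addOracle $ \(GhcVersion _) -> do
         Stdout out <- cmd "ghc --numeric-version"
         return (takeWhile (/= '\n') out)

  -- A hypothetical piece of configure output that depends on that fact.
  "dist/setup-config.ghc" %> \out -> do
    ver <- askOracle (GhcVersion ())
    writeFileChanged out ver
```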

@Ericson2314
Collaborator

@phadej I imagine both cabal-install and ghc-pkg need a not-too-dissimilar subset of Cabal, mainly type definitions and parsing/serializing/marshalling? Perhaps the same, largely self-contained "rump Cabal" could be kept around for them both.

@ezyang
Contributor Author

ezyang commented Jan 26, 2017

Yes. Similarly with Stack; it only needs the parsing code, not the build system.

@Ericson2314
Collaborator

Ericson2314 commented Jan 26, 2017

Ah, because they always go through an external Setup.hs, right?

@ezyang
Contributor Author

ezyang commented Jan 27, 2017

Yep.

@dmwit
Collaborator

dmwit commented Oct 9, 2019

Another thing that may be worth pondering: shake has a concept of "resources" which are used to limit parallelism. The canonical example is to have just one "linker" resource, because linkers are traditionally very memory hungry and having multiple run at a time can be a recipe for OOM death. To what extent do we want to identify a core set of resources which cabal knows about and manages across packages -- so that, e.g., we don't have five packages all in the linking phase at one time?
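
For concreteness, here is a minimal sketch of Shake's resource mechanism (real API, made-up file names): a single "linker" resource with capacity 1 shared by all link rules, so at most one link runs at a time while the rest of the build stays fully parallel.

```haskell
-- Minimal sketch of Shake resources: cap concurrent link steps at one,
-- independently of the overall --jobs parallelism.
import Development.Shake

main :: IO ()
main = shakeArgs shakeOptions $ do
  linker <- newResource "linker" 1   -- capacity 1: at most one link at a time

  "dist/bin/*" %> \out -> do
    need ["dist/objs/Main.o"]
    -- Only the link itself holds the resource; compilation stays parallel.
    withResource linker 1 $
      cmd_ "ghc" ["-o", out, "dist/objs/Main.o"]
```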

@23Skidoo
Member

I have an old branch that implements limiting the number of linking steps that can run concurrently. Maybe we should dust it off and merge.
