Using conda to manage multiple native library configurations without Anaconda or Python in the mix #911
Wow! This is well-timed! Things are coming to a head with several different proposals along these lines. If you'd like to join us for a call on Monday (5/9) morning at 9 AM Central time, your thoughts and opinions are welcome. The other related issues (I think) are: #728 #747 #848 #857, and probably more that I'm missing (sorry).
This seems relevant, if we want to try to do this without features (and I think we might): conda-forge/staged-recipes#525 (comment)
I'd love to join that call, though I can't guarantee I'll say coherent things, as 7 AM Pacific time is not the best time for that. @mcg1969 with conda-forge/staged-recipes#525 (comment), our system would have a metapackage
That's right. The dependency solver will automatically make sure that only one of the build strings will be selected, of course, and packages that don't specify one will defer to the ones that do.
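For illustration, the build-string scheme referenced above might look roughly like this (the package name, version, and build strings here are hypothetical, not taken from the linked issue):

```yaml
# Hypothetical metapackage built once per compiler configuration, with the
# configuration encoded in the build string:
package:
  name: compiler_config
  version: "1.0"
build:
  string: msvc2012rel        # or msvc2012dbg, msvc2015rel, ...

# A downstream recipe then pins a configuration via the build string in its
# requirements; the solver ensures only one build string gets selected:
#   requirements:
#     build:
#       - compiler_config 1.0 msvc2012rel
```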
Basically, what we really want is a sort of "key/value" feature capability, something that might ultimately look like
Roughly speaking, the existing feature capability is somewhat like having two packages:
By default, version 1 will be preferred, which is "off". But if you either 1) specify
That sounds great! We've run into some problems isolating environments from the Anaconda packages with the feature (e.g. it used Anaconda's zlib instead of ours; we fixed it by increasing our build number), and it would be nice for the solver to do the right thing in the face of that ambiguity. Being able to conveniently isolate environments from the Anaconda default packages is another thing to eventually get working better.
I think you'll find the improvements that we're making to channel handling will help with this second problem. We may not be all the way there yet in master, but, for instance, we're no longer "interleaving" identically-named packages from two different channels. The packages from higher-priority channels will always be preferred over lower-priority ones.
Actually, I'm wrong: the current feature capability can be implemented with just one metapackage representing "on".
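A sketch of that single metapackage representing "on" (the name is illustrative; `track_features` is the conda-build field that makes a feature opt-in):

```yaml
# Hypothetical metapackage: installing it switches the feature on, because
# the package tracks the feature.
package:
  name: mkl_on          # illustrative name
  version: "1.0"
build:
  track_features:
    - mkl
```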
This is very helpful. I'm still learning more about conda (and metapackages), but I'd eventually like to conda-ify the C++ library toolchain I'm currently using: https://github.com/cloudera/native-toolchain. The goal would be to more easily build both debug and release builds, like Mark is describing. It also needs to be able to build multiple versions of certain libraries (e.g. multiple versions of LLVM and Thrift).

As another example of a 3rd-party library toolchain: https://github.com/apache/parquet-cpp/tree/master/thirdparty, in particular https://github.com/apache/parquet-cpp/blob/master/thirdparty/set_thirdparty_env.sh. Outside of the compiler configuration, I need to be able to indicate to cmake where to look first for build/runtime dependencies. I'm also trying to figure out how to manage cmake modules in a way that would be compatible with a conda-supplied toolchain. For example, the build depends on the Thrift cmake module here: https://github.com/apache/parquet-cpp/blob/master/cmake_modules/FindThrift.cmake. I guess one solution is that you can just set
I'm sure between the lot of us we can figure something out. I won't be much help on the compiler suite, but if we need to get
Any guidance on automating a toolchain build from a directory of recipes? Or is that a DIY affair right now? What about "build everything that hasn't been built"? |
I often use directories of conda recipes like
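The directory layout itself was elided above, so as a rough sketch of automating a flat directory of recipes, something like the following Python helper could work (the layout and function names are assumptions, not the commenter's actual setup):

```python
import os
import subprocess


def find_recipes(root):
    """Return recipe directories under `root` (any subdir with a meta.yaml)."""
    return sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if os.path.isfile(os.path.join(root, name, "meta.yaml"))
    )


def build_all(root):
    """Invoke conda-build on every recipe found under `root`."""
    for recipe in find_recipes(root):
        subprocess.run(["conda", "build", recipe], check=True)
```

Note this builds recipes in alphabetical order, not dependency order, which is exactly the limitation discussed below.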
Is this along the lines of what you are looking for?
IIUC,
Good point, my example only involves packages that are not in
I don't know if there's an option to clear out the
Well, there is
To process builds of many packages, we've created a script which loads all the meta.yaml files from the package directories (with a little bit of monkey-patching in conda-build so we can submit jobs for platforms other than the one the submitter is running on), then submits all the requested jobs to process on Deadline, mirroring the package dependency structure into a job dependency structure there. For submitting rebuilds of everything, this works pretty well, but we haven't tackled doing partial rebuilds of a package and all its downstream dependencies when there's a change in the git repository for a library. Maybe loading the metadata of many packages and constructing/processing the dependency graphs is functionality to add to conda-build?
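The dependency-ordering part of such a script can be sketched with the standard library's topological sorter; the package names and graph below are toy examples, not the actual toolchain:

```python
from graphlib import TopologicalSorter  # Python 3.9+


def build_order(deps):
    """Given {package: set of in-tree build dependencies}, return a build
    order in which every package appears after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


# Toy dependency graph mimicking a small native toolchain:
order = build_order({
    "zlib": set(),
    "boost": {"zlib"},
    "thrift": {"zlib", "boost"},
})
```

A real version would extract the dependency sets from each recipe's `requirements: build:` section (e.g. via conda-build's metadata APIs) instead of hard-coding them.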
For serving the conda packages, we've got Apache serving them, and we made a trivial server in Python on port 8080 that accepts PUT requests and re-indexes the channel. In particular, this provides the place for conda to look for dependencies, and the combination of uploading to the server and having the Deadline job dependencies match the package dependencies allows the rebuilding of a bunch of packages to happen in a distributed fashion while respecting those dependencies.
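A minimal sketch of such a PUT-accepting upload server, assuming uploads land under a channel root that a web server also serves (the paths and names here are hypothetical):

```python
import os
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical channel root that Apache serves alongside this uploader.
PKG_ROOT = "/srv/conda-channel"


def save_package(root, rel_path, data):
    """Write an uploaded package file under the channel root; return its path."""
    dest = os.path.join(root, rel_path.lstrip("/"))
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(dest, "wb") as f:
        f.write(data)
    return dest


class UploadHandler(BaseHTTPRequestHandler):
    def do_PUT(self):
        length = int(self.headers["Content-Length"])
        dest = save_package(PKG_ROOT, self.path, self.rfile.read(length))
        # Re-index the platform subdirectory so conda sees the new package.
        subprocess.run(["conda", "index", os.path.dirname(dest)], check=True)
        self.send_response(201)
        self.end_headers()


# To run: HTTPServer(("", 8080), UploadHandler).serve_forever()
```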
So what I'm hearing is that if I'm unable to depend on repo.continuum.io, then I need to write my own dependency-analysis script to build the package tree manually from the bottom up? It feels like this should be part of conda.
I'm pretty sure that
@msarahan and @kalefranz definitely have more knowledge than I do about how
If you have a flat folder of recipes, conda-build will first try to download packages, but if they are not available, it will build them. It is primitive in knowing where recipes might be, so that flat folder hierarchy is currently essential. Note that it also doesn't have very good ways to specify when things should be rebuilt: if a package can be downloaded, conda-build will always prefer that. So you either have to hide potential download sources and clear existing package builds, or manually specify the recipes to be built. No ordering analysis is done for manually specified lists of recipes.
OK, so there is work that needs to be done in order to build a flat directory of recipes from scratch, in dependency order.
@msarahan so, is the easiest option to fix this to add an option to
Another option is to install conda into a purpose-built non-Anaconda Python and then emulate a local package server (so you have a single channel available on localhost), but this seems really hacky.
Unless you have all of your system's dependencies as recipes, that probably won't work. I think we should add a
Let's assume that all of the system's dependencies (including gcc) are available as recipes. This is literally what we are doing currently with a manually-maintained set of shell scripts building things from the ground up.
Wouldn't
I've added conda/conda#2901, a feature idea for conda to better support toolchain systems as described here.
I know this is 3 years old, but @mwiebe I stumbled across it while looking for ideas involving Conda over Rez for VFX. Is this still applicable, and did you have any success?
Hi @wizofe, it is still applicable, and the input we and others provided was synthesized by the Conda developers into a feature called Build Variants, documented at https://docs.conda.io/projects/conda-build/en/latest/resources/variants.html. Something like VFX Platform could be treated as a self-consistent package ecosystem, as mentioned near the bottom of https://docs.conda.io/projects/conda-build/en/latest/resources/variants.html#self-consistent-package-ecosystems. We haven't migrated our system from using environment variables as parameters to using conda build variants, but the way we have evolved it is successful: we use it to build hundreds of conda recipes across many supported operating systems, compilers, and software versions (like Python 2.7 and Python 3.6) to output thousands of packages. We index these packages into an S3 bucket, which conda supports referencing natively as a conda channel.
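For reference, the Build Variants mechanism mentioned here is driven by a `conda_build_config.yaml` file; a minimal sketch covering the Python versions named above might be (the `python` key is a standard variant key; the values are illustrative):

```yaml
# conda_build_config.yaml: each key is a variant axis; conda-build renders
# and builds the recipe once per combination of values.
python:
  - 2.7
  - 3.6
```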
We're most of the way through setting this up so we can use conda for this internally at Thinkbox. It's much better than managing 50 dependent libraries and SDKs manually, but it could still be better. I'm posting this now because I saw https://twitter.com/wesmckinn/status/725448295774969857 from @wesm, and thought I would describe some parts of what we're doing. Our solution is passable as an internal tool, but integrated support for what we're doing directly in conda and conda-build would be much nicer.
Initially we evaluated conda against rez (https://github.com/nerdvegas/rez). Rez being a tool from VFX, I thought it might make integrating with some graphics-related packages easier out of the box, but conda came out as the clear winner in the comparison. The biggest challenge in being able to use conda was finding a way to manage multiple compiler configurations. Conda-build has some hard-coded internal logic that selects a compiler based on platform and Python version, so out of the box it doesn't support this idea at all.
Example compiler configurations one might want: MSVC 2012 in release mode, MSVC 2012 in debug mode, or a release build with the /Gv flag, so numeric code uses the faster vectorcall convention.

Our solution to defining multiple compiler configurations uses a conda feature paired with a conda package of the same name. The feature gets applied to all packages built with that compiler configuration.
For example, the package `msvc2012rel` provides the feature `msvc2012rel`, and installs files to configure the compiler for builds with MSVC 2012 in release mode. Its `meta.yaml` file looks like:

and its `bld.bat` installs the file `compiler_config.bat`, which looks like the following. It's a bit messy, admittedly, but it provides environment variables that easily tell cmake and other build systems the compiler configuration.

With this system, normal conda-build recipes do not work: they must be told which compiler configuration to use, and then they need to depend on the feature tag for that compiler configuration. This is done through the `COMPILER_CONFIG` environment variable, which gets interpolated into the recipe using conda's jinja2 templating. See the `meta.yaml` file for a zlib recipe following this standard:

Every recipe's `bld.bat` and `build.sh` first source the `compiler_config.bat` or `compiler_config.sh` file provided by the compiler config package, then use the defined variables and configured compiler to build. The `bld.bat` for zlib is:

and the `build.sh` is basically identical to a typical recipe's, except for sourcing `compiler_config.sh`:

To conclude, this system is working, but it would be a lot better with first-class support from conda. What we're using isn't suitable for dropping directly into conda, but maybe the way we've done things can inspire something that is.
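As a hedged reconstruction of the zlib `meta.yaml` convention described above (all details inferred from the prose, not copied from the original file):

```yaml
# Hypothetical recipe following the convention described in this post: the
# COMPILER_CONFIG environment variable names the compiler config package,
# and the recipe both carries that feature tag and depends on the package.
package:
  name: zlib
  version: "1.2.8"

build:
  features:
    - {{ environ['COMPILER_CONFIG'] }}

requirements:
  build:
    - {{ environ['COMPILER_CONFIG'] }}
  run:
    - {{ environ['COMPILER_CONFIG'] }}
```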