-
-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: Add a metapackage to select BLAS #525
Conversation
Hi! This is the friendly automated conda-forge-linting service. I wanted to let you know that I linted all conda-recipes in your PR ( Here's what I've got... For recipes/blas_openblas:
|
2587768
to
521c63a
Compare
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( |
|
||
test: | ||
commands: | ||
- conda list blas_openblas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is a challenging package to test because it doesn't do much, but I tried to test it anyways and this seems like a simple test. However, it fails because it has not activated the _test
environment. Here is the full error message.
TEST START: blas_openblas-1.0.0-0
Fetching package metadata: ........
.Solving package specifications: .........
The following packages will be downloaded:
package | build
---------------------------|-----------------
blas_openblas-1.0.0 | 0 634 B file:///opt/conda/conda-bld/linux-64/
The following NEW packages will be INSTALLED:
blas_openblas: 1.0.0-0 file:///opt/conda/conda-bld/linux-64/
+ conda list blas_openblas
Error: Error: environment does not exist: /opt/conda/envs/_build
#
# Use 'conda create' to create an environment before listing its packages.
TESTS FAILED: blas_openblas-1.0.0-0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened this issue ( conda/conda-build#910 ) with regards to this error.
521c63a
to
dd88ce6
Compare
|
||
test: | ||
commands: | ||
- conda list -n _test blas_openblas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the environment here feels like a horrific hack and quite fragile. Unfortunately, not setting it runs into this issue ( conda/conda-build#910 ). So, not really sure there is another option ATM.
as I said:
For example, numpy, scipy, and scikit-learn all use BLAS of some sort. People may compile each of them with any BLAS library. Features try to line these all up. In some cases, the only gain is disk space and perhaps memory footprint. In other cases, mixing different builds might be harmful. This is the case with the MS Visual C Runtime. Memory that is allocated in one runtime must be deallocated in that same runtime - or else crashes happen. Bad behavior can be more subtle, too. Using features, we keep all of the libraries that Python uses compiled with with same VS version to avoid these crashes. Features have limitations. In particular, you are not able to mix different programs in a given environment if they haven't been compiled with the same compiler. This is only really a problem when shared libraries are involved, though. For now, the only solution to work around needing a program (say, Python 2.7/vc9) in a different feature environment (say Python 3.5/vc14) is to install it in a separate environment and use filesystem paths to use it. |
I understand that. What I was looking for in that question is how would you say this in one sentence or less. |
Sorry, IMHO it is not a one sentence concept. That's why we need a link to somewhere else. |
Agreed. However, we want to always have a summary here. So, we need to think of what fits there if what I wrote can't go there. Could we do |
@mcg1969, would really appreciate your feedback on this before we go too far. We are trying to use features with BLASes (I don't really see another way ATM and that is how MKL is used too). Should we also track |
I do have very strong opinions here :-) but I am slammed today. I will get to it ASAP |
The problem with features is they are binary: And what happens if want to use MKL with Python and OpenBlas with R? I can simulate the behavior I really want with metapackages. To start, I'll name the package Now, whenever I build a Python package that depends on BLAS, I make it depend on a particular build of I can handle the actual BLAS dependency in several ways.
Because I've named it I this a horrible hack? You betcha. Does it give us desirable behavior? Yep. If we like how it works, can we add syntactic sugar to |
I think I follow what you are saying here @mcg1969 and I appreciate you taking time to comment. The idea sounds quite nice actually (though I might tweak it a little) and I think it's behavior is closer to what we really want. Could you maybe draw up a simple example recipes so it might be a little easier to follow? Maybe just a bare bones recipe for the feature and a bare bones recipe for the package. One feature case should be sufficient to start I think. |
Sure, I'll work on it. Stay tuned |
Awesome. Thanks. If you don't have enough time, I could try drawing up what I understand you to mean and we could iterate. However, you want to go about it. |
Honestly, if I consider the specific case for BLAS & Python, I'd love to see this be implemented not as a metapackage but rather as middleware. That is to say, |
Before I implement an example, care to give me a gut feel about which approach you like the best of the 4 I enumerated in my (edited) comment? And note that I'm being deliberate here to put the BLAS type in the build string and not the version. This is because I don't want to have to rely on lexicographic order to decide which BLAS is the default. |
While it is clumsy, I'm kind of liking 1. We can always use jinja templates to make this feel less clumsy. The trade-offs with the other options I am having trouble getting comfortable with. |
OK, 1 is probably my second choice, but I'm partial to 4. Since they're close I'll do that here. Here's a meta.yaml for the OpenBlas version of
Will the build number get appended to the build string? I don't recall offhand, but I hope not. And a package that uses BLAS:
To implement Option 1, you just drop the |
I see. I guess I wasn't quite following your description of 4. Was worried there was more maintenance burden in there somewhere. That doesn't actually look so bad now that I'm seeing. |
I test this and it does not. Though it doesn't show up in package name at all, which doesn't seem good. |
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR ( |
Alright, so I have tried to revise this recipe with the suggestions that you have provided @mcg1969. Please let me know what you think. As the build number is not included in the filename, I felt that this needed to be handled another way so I added some jinja to configure this. Maybe it is too much, but it just makes every build number increase to some |
|
||
test: | ||
commands: | ||
- conda list -n _test blas |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Setting the environment here feels like a horrific hack and quite fragile. Unfortunately, not setting it runs into this issue ( conda/conda-build#910 ). So, not really sure there is another option ATM.
So, this doesn't seem to be winning out against build numbers. I switched to building NumPy 1.10.4 as there is a package with build number
|
Admittedly, this is probably a collision so maybe we need a better example to test this on. I'll try building my own copies of 1.11.0. |
To avoid collisions, I noted that the most recent version of
|
Well yes, but again, these are metapackages that are intended to emulate behavior that might eventually be baked into conda directly. They're not intended to be built as normal packages or behave like them. |
I'm not sure why it should be expected to. We haven't changed conda's versioning rules here. |
The build number for the |
Yeah, didn't change that.
The higher one doesn't have the build string.
So, let me understand. The purpose of the build string will be to select between |
conda doesn't use the build string during the resolution process. Its only purpose is to make the filenames unique. The build number is all that conda considers when prioritizing packages. Remember, this isn't going to behave like a feature. Installing |
All the more reason why we don't want to think of these as features, but rather as variants. |
This is all going to be easier to sort out once the channel collision fix is released (4.1). When that happens, you will be able to stop gaming your version and build numbers to compete with defaults. That said, you may find that it will be worthwhile to consider backporting these metadata tricks to earlier builds, to avoid weird cases where |
This is a pretty lengthy and hugely valuable thread which I'm keen to see continuing, but I'd like to exercise a little caution in pinning conda-forge's ability to deliver numpy, scipy etc. on an interface which is quite so non-standard. Whilst I don't believe anybody here actually likes the way features have been used for blas/mkl for numpy in defaults, the fact remains that we are going to need to maintain compatibility with the As I said, this conversation has huge value, but I don't think we need a conclusion (and conda PRs) to be able to do something useful for conda-forge in the short term. |
Well, I can assure you that there is plenty of buy-in within Continuum about the concept of variants. We may have some issues to work out when it comes to making a clean transition from the current feature-based selection approach. But variants have many uses outside of BLAS. |
This may reflect a bit of a misunderstanding about what we're trying to accomplish here. This isn't really about supporting multiple variants in the same environment---this is primarily about making sure we don't accidentally mix variants. |
The misunderstanding may arise from my desire to support different BLAS versions for different programs---for example, MKL for Python and OpenBLAS for R. But nothing we're doing here prevents that. |
Indeed not. But that is prevented in the existing features approach, and to my understanding is the reason that a Healthy debate brings with it completely orthogonal ideas which as individuals we can easily miss - you may be interested in https://github.com/pelson/conda-execute/issues/20#issuecomment-217570352 as just one of those orthogonal ideas for dealing with variants across the process boundary (this is effectively the idea of child environments, which I don't think is completely new). In order to avoid descending into completely blue sky activity we must try to remember what we are trying to achieve - we need a safe (and ideally convenient) way of packaging blas so that we can build on top of it to provide a suitable numpy, scipy etc.. Right now As I said initially, this conversation is hugely valuable, and I'm hopeful it will lead to an improved conda for both |
I'm pretty sure that's not the case. That is to say: if we were forced to stick with features, we could still have different BLAS libraries in the same environment, with Python linking to one and R linking to another.
Well yes, but even if we were to standardize on the use of a |
I certainly agree that conda forge should not get too far ahead of defaults here. But what I'm saying is that if we have unity on this proposal, and I think we can get there, then we can perform the necessary evolution in both channels, and soon. |
So, as someone who has tried to use features in various ways to package BLASes I can say from experience (please forgive the liberal analogy) it is like walking down a dark alley in Detroit. We don't want to do it and when we do it anyways we hear a voice telling us we never should have done it. This build string approach (which we have called variants) sounds very experimental, but it isn't so wild in the end. It is interesting, but it is effective and simple. This is what I love about @mcg1969's thoughts on this. We are tweaking the incredibly powerful We should not be so fearful here as this is really what we will want in the end IMHO. There is some syntactic sugar and some tweaks to the solver sure, but they shouldn't distract us from the elegance and simplicity that is already here ready for the taking. The future is now gentlemen. Let's embrace it. |
@mcg1969 - if we had the ability to express "conflicts" (in the RPM sense), would the analysis in this PR be different? Would it be easier at that point to be able to state that |
Actually, if you added these variant metapackages as dependencies of the various BLAS packages themselves (that is, We have multiple cases, however, where we want to be able to install the packages side by side but allow only one to be depended upon in a given context (Python, R, standalone, etc.) So we don't want to be locked into providing just a "conflicts" behavior. |
I really don't want that to be a driving usecase. I do not believe the machinery should, and ever will, exist within conda to deal with that in a single environment correctly. The usecase which kills the discussion for me is Jupyter's desire to install two versions of python into the same environment - py3 for Jupyter itself, and py2 for a py2k kernel. Process fork compatibility (is there a better name?) can be solved very well by having sibling or child/parent sub-environments and I believe that to be the easiest and most reliable way to express and manage these dependencies. The discussion on this use case belongs in conda/conda IMO. We shouldn't be hacking our variant names with namespaces ( @jakirkham - as the author of it, is the purpose of this PR to have this debate, or is it a critical piece of being able to move forwards with a numpy recipe? @mcg1969 - can we start looking at this problem from the other end? How do we want this to look for our users? I think it is perfectly reasonable to invent syntaxes here which could become part of conda. Here is a starter for 10 (I haven't thought this through thoroughly, and may be overloading
|
I'm afraid this isn't an option for Continuum. Conda-forge may never support it, and that is fine, but we must. |
The purpose was and still is (for me at least) to find the quickest and most correct way to move NumPy, SciPy, scikit-learn, and so many other BLAS-dependent packages forward. I'm eager to stop maintaining my own builds of these and this is the critical piece. Also, I think it is ready as is. Though I would really like some other people that use BLAS in conda to take a look and share some thoughts on it. |
Longer answer, now that I'm home from travel...
As I've indicated, we have use cases where this capability is necessary. We can debate the syntactic sugar necessary to pull it off, it is imperative that we do not artificially prevent different BLAS or CRT implementations from living side-by-side in the same environment---for instance, by embedding some sort of If the
It seems very odd to allow a single difficult usecase like this to "kill the discussion". In fact, I personally have no desire to enable two versions of Python to be installed in the same environment, even if in theory, it could be done. What I cannot accept is for this particular usecase to prevent me from installing Python and R in the same environment.
We haven't made any final movements yet on the conda syntactic sugar that helps manage variants within conda. Let's certainly continue that discussion. But I don't want to bake anything into conda until we've figured out the behavior we need. |
We're already talking about something here. I'm going to continue the discussion there. |
Let's keep the discussion over there for now. We have opted to go a different route ( #643 ) for now. This pulls ideas from here, but ends up needing to rely on features to some extent. It is what @pelson and I found to work ok for the near term given the current state of affairs. This is already in use at conda-forge now and we will be building it out. Ultimately we would like to fix how we handle BLAS and this conversation is still very important to us, but we needed to do something quickly. By doing so, I hope this will relieve some pressure as we try to figure out the right way to solve this problem in the long term. |
Adds a metapackage to enable the use of OpenBLAS. It is called
blas_openblas
. Have tried to fill in the standard pieces thus far, but it feels a bit verbose for how simple the recipe is.