
Add scipy #648
Merged · 6 commits · May 21, 2016

Conversation

@jakirkham commented May 20, 2016

Closes #516

Takes over the excellent work of @msarahan in this PR ( #516 ) so that we can get scipy into conda-forge fast.

The reason this needs to happen fast is we are kind of in limbo with having numpy packaged and having a system for BLAS linkages, but not having scipy packaged, and we are still suffering from problems with the mkl package. As a result, many additions requiring numpy and scipy are broken or in some other weird state (e.g. pinning mkl when they are BLAS invariant).

To be clear, I'm still open to comments and feedback. It just means that it has become clear that this is mission critical and most of this feedback will probably need to occur in the feedstock. Sorry in advance for any trouble this may cause.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipes/scipy) and found it was in an excellent condition.

build:
number: 100
# We lack openblas on Windows, and therefore can't build scipy there either currently.
skip: true # [win]
@jakirkham (Member Author):

Skipping Windows for now as we don't have OpenBLAS over there.

@jakirkham force-pushed the add_scipy branch 7 times, most recently from 26d772a to 930014f on May 20, 2016 18:58
@jakirkham commented May 20, 2016

So, basically, we are getting 2 test failures if scipy version 0.17.1 (same story with 0.17.0) is built against numpy version 1.10.x on Linux, though we get none locally when building against numpy version 1.11.x. I am able to reproduce these in the Docker container locally. I believe the problem is that we are mixing an MKL numpy with an OpenBLAS scipy and something funny is happening. We know the BLASes are being mixed because when we use numpy version 1.11.x it is our own build, linked against OpenBLAS, that gets used; but we don't have a 1.10.x build, so it must be pulling in a numpy 1.10.x linked against MKL. The second reason I believe this is that the two failing tests are both BLAS tests (run locally). I'm not sure why it is just these 2, but they are without a doubt BLAS tests. See below.

test_complex_dotc (test_blas.TestFBLAS1Simple) ... FAIL
test_complex_dotu (test_blas.TestFBLAS1Simple) ... FAIL

Another concerning issue is that I'm seeing a segmentation fault when running the tests locally. This would explain why there isn't more info in the log about the cause of the failures. Normally, nose would have explained the failures at the end, but we don't even see this on CircleCI, so presumably there is a segmentation fault there too. Will try to build numpy version 1.10.4 against OpenBLAS and will then return to this.
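One way to confirm mixed BLAS linkage like this is to inspect the shared-library dependencies of the compiled extension modules directly. A minimal sketch, assuming a Linux environment with an /opt/conda prefix like the tracebacks above; the exact paths and module names are illustrative, not taken from the actual build logs:

```shell
# Hypothetical diagnostic for mixed-BLAS problems: check which BLAS the
# compiled extensions actually link against. Adjust prefix/Python version
# for your environment.
check_blas_links() {
    so="$1"
    if [ -f "$so" ]; then
        # Print any OpenBLAS/MKL entries among the shared-library dependencies.
        ldd "$so" | grep -iE 'openblas|mkl' || echo "no BLAS linked directly by $so"
    else
        echo "not found: $so"
    fi
}

check_blas_links "/opt/conda/lib/python3.5/site-packages/numpy/core/multiarray.so"
check_blas_links "/opt/conda/lib/python3.5/site-packages/scipy/linalg/_fblas.so"
```

If numpy's extensions resolve to MKL while scipy's resolve to OpenBLAS, the two BLAS implementations are being mixed in one process.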

@jakirkham

FWIW, I tried testing the scipy package from defaults and it actually fails 2 tests. Keep in mind I used nothing from conda-forge; everything was from defaults. Here are the failures for the defaults copy of scipy.

======================================================================
FAIL: test_arpack.test_real_nonsymmetric_modes(False, <std-real-nonsym>, 'f', 2, 'LM', None, 0.1, <function asarray at 0x7f6e3b82eea0>, 'r')
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/opt/conda/lib/python3.5/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
    assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
  File "/opt/conda/lib/python3.5/site-packages/numpy/testing/utils.py", line 1359, in assert_allclose
    verbose=verbose, header=header)
  File "/opt/conda/lib/python3.5/site-packages/numpy/testing/utils.py", line 713, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0.000357628, atol=0.000357628
error for eigs:standard, typ=f, which=LM, sigma=0.1, mattype=asarray, OPpart=r, mode=normal
(mismatch 50.0%)
 x: array([[-0.116493+0.j, -0.054353+0.j],
       [ 0.138012+0.j, -0.058940+0.j],
       [-0.212299+0.j, -0.028156+0.j],...
 y: array([[-0.056986+0.080185j, -0.054353+0.j      ],
       [ 0.064395-0.09061j , -0.058940+0.j      ],
       [-0.125040+0.175944j, -0.028156+0.j      ],...

======================================================================
FAIL: test_arpack.test_real_nonsymmetric_modes(False, <std-real-nonsym>, 'f', 2, 'LR', None, None, <function aslinearoperator at 0x7f6e37e59bf8>, None)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/opt/conda/lib/python3.5/site-packages/nose/case.py", line 198, in runTest
    self.test(*self.arg)
  File "/opt/conda/lib/python3.5/site-packages/scipy/sparse/linalg/eigen/arpack/tests/test_arpack.py", line 259, in eval_evec
    assert_allclose(LHS, RHS, rtol=rtol, atol=atol, err_msg=err)
  File "/opt/conda/lib/python3.5/site-packages/numpy/testing/utils.py", line 1359, in assert_allclose
    verbose=verbose, header=header)
  File "/opt/conda/lib/python3.5/site-packages/numpy/testing/utils.py", line 713, in assert_array_compare
    raise AssertionError(msg)
AssertionError: 
Not equal to tolerance rtol=0.00178814, atol=0.000357628
error for eigs:standard, typ=f, which=LR, sigma=None, mattype=aslinearoperator, OPpart=None, mode=normal
(mismatch 50.0%)
 x: array([[ 0.138843+0.j, -1.071121+0.j],
       [-0.089629+0.j, -1.398013+0.j],
       [ 0.217014+0.j, -0.939684+0.j],...
 y: array([[ 0.060205+0.084715j, -1.071121+0.j      ],
       [-0.074974-0.105497j, -1.398013+0.j      ],
       [ 0.133267+0.187521j, -0.939684+0.j      ],...

----------------------------------------------------------------------
Ran 20193 tests in 238.262s

FAILED (KNOWNFAIL=100, SKIP=1655, failures=2)

@jakirkham commented May 21, 2016

Seems like the numpy built with OpenBLAS no longer triggers the same test failures as the one built with MKL did.

@jakirkham jakirkham changed the title Add scipy WIP: Add scipy May 21, 2016
@jakirkham jakirkham changed the title WIP: Add scipy Add scipy May 21, 2016
@jakirkham

CircleCI passes.

Travis CI is timing out because of the long test time, though we are confident this is building correctly. We could comment out the tests on Travis CI; however, that could allow potentially bad binaries to ship when we convert this to a feedstock. Instead, I would much rather keep the tests in place and convert to a feedstock. If there are test failures, we can handle them there; this way we don't have to worry about bad binaries shipping in those cases. Since Travis CI completed two builds within the time limit, we are also confident that the Travis CI build time should be sufficient for building a single combination of SciPy with a given Python and NumPy.

As this was already reviewed somewhat before, and not having SciPy is causing serious issues in recipes and feedstocks, we are going to send this one through now.

@jakirkham jakirkham merged commit 71dc76a into conda-forge:master May 21, 2016
@jakirkham jakirkham deleted the add_scipy branch May 21, 2016 03:14
@jakirkham

Feedback welcome at the feedstock. Sorry again for fast-tracking this.

@jakirkham commented May 21, 2016

@pv @charris @matthew-brett @rgommers @stefanv @juliantaylor @alexbrc @ev-br, just letting you know that we have added scipy to conda-forge. This will allow users to easily install this with conda ( conda install -c conda-forge scipy ). The maintenance burden here is light as all releases are built and deployed automatically with CI. Changes to how it is being built can be controlled by changing the recipe in this feedstock (repo). If you have any interest in being added as a maintainer (to make new releases for instance), we would be more than happy to add any of you that are interested to the maintainers list.

@rgommers

Thanks for pinging @jakirkham. I'm definitely interested in (a) understanding the setup here, and (b) helping out. Main questions I have now:

  • Is there a written summary somewhere of the Numpy/Scipy/BLAS strategy for the main Anaconda vs. conda-forge? I'm aware that Numpy has a special status in Anaconda, but not exactly how or why. And not sure if Scipy is treated differently from any other Conda package or not.
  • There's something going on in the Windows build file in this PR, but the comment in meta.yaml says the Windows build is completely skipped. What is the plan here? It looks like icc + gfortran from WIP: scipy #516, but that isn't going to work I think?
  • What is up with all the imports in the tests? Why is just import scipy; scipy.test() not enough?

@rgommers

Detail questions about meta.yaml:

  • why is the build number hardcoded to 200?
  • why does skip: true only skip Windows build? Is # [win] not just a comment?
  • do you want to add optional run/test dependencies? Optional run are matplotlib, scikit-umfpack. Optional test dependency is mpmath.

@ocefpaf commented May 22, 2016

I can answer a few of those:

  • why is the build number hardcoded to 200?

That is a workaround to ensure conda will pick this package instead of the default channel version. It is controversial and we don't like it. People are working on conda to improve its use with multiple channels so we can drop this.

  • why does skip: true only skip Windows build? Is # [win] not just a comment?

That is a special kind of comment called a preprocessor selector. It ensures that the line applies on Windows and is ignored everywhere else, so the skip: True will skip the build only on Windows.
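To make that concrete, here is an illustrative meta.yaml fragment (not this recipe) showing a few selector expressions conda-build understands:

```yaml
build:
  # A selector is a trailing comment of the form "# [expr]"; conda-build
  # keeps the line only on platforms where the expression is true.
  skip: true  # [win]
  # Other expressions work the same way, for example:
  #   skip: true  # [osx]
  #   skip: true  # [linux and py27]
```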

@rgommers

Thanks @ocefpaf, clear.

@jakirkham commented May 23, 2016

Thanks for pinging @jakirkham. I'm definitely interested in (a) understanding the setup here, and (b) helping out.

Of course, @rgommers. That's great. Happy to help you understand what is going on over here. Also, happy to help get you access so you can help out.

Main questions I have now:

  • Is there a written summary somewhere of the Numpy/Scipy/BLAS strategy for the main Anaconda vs. conda-forge?

On the Anaconda side, I'm not aware of a formal doc, but I can give you the lay of the land. Originally, everything on Mac was built against the Accelerate Framework. On Linux, ATLAS was used initially then there was a switch to OpenBLAS last year. On Windows, MKL has always been used. Recently, MKL was released to the defaults channel and is used as the default BLAS. The previously listed strategies (with the exception of Windows) were placed under a nomkl feature that could be enabled by installing a package of the same name.

As for conda-forge, @pelson and I sat down and came up with these ground rules for BLAS packaging, though we may modify them in the future once there are some changes to conda/conda-build w.r.t. features. In general, we are making OpenBLAS the default BLAS (Windows needs some work in this area).

I'm aware that Numpy has a special status in Anaconda, but not exactly how or why.

So, this is a really long conversation. 😄

The short answer is that NumPy and Python have special status in conda/conda-build for the purpose of ABI compatibility. As so many packages would use NumPy and often link to it, it was very important to control the version of NumPy used in an environment. I or someone else can go over how this is done in a subsequent post.

Generally how pinning of versions is done for ABI compatibility will likely change in the future. This is a product of the success of conda/conda-build and people's attempts to extend it into various domains where specifying this sort of constraint is important. Changes will start with conda-build. It is unclear how this may impact things like NumPy's pinning or how we do things at conda-forge at this time.
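As a sketch of how that control typically looked in recipes of this era (an illustrative fragment, not the actual scipy recipe), conda-build's special "x.x" version spec pins NumPy to the minor version supplied at build time:

```yaml
requirements:
  build:
    - python
    # "numpy x.x" tells conda-build to pin to the minor version passed at
    # build time (e.g. via the --numpy flag), baking the ABI into the package.
    - numpy x.x
  run:
    - python
    # The same pin at run time ensures an ABI-compatible NumPy is installed
    # alongside the built extension.
    - numpy x.x
```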

And not sure if Scipy is treated differently from any other Conda package or not.

It is not.

  • There's something going on in the Windows build file in this PR, but the comment in meta.yaml says the Windows build is completely skipped. What is the plan here? It looks like icc + gfortran from WIP: scipy #516, but that isn't going to work I think?

The main issue right now is we don't have a BLAS that we can use on Windows. Continuum uses MKL. While supporting MKL is of interest to us if possible, there are some legal and logistic hurdles we are going to have to go through. In the end, we still need to have an open source BLAS (preferably as the default BLAS).

Our plan is to use OpenBLAS on all platforms. We do that on Mac and Linux now. However, we need a build of OpenBLAS on Windows to use it there. Some discussion has been occurring on this issue ( conda-forge/openblas-feedstock#2 ). There is some discussion on Windows intermingled with other points in the PR ( #364 ) that added OpenBLAS here.

To get a reasonable build of OpenBLAS on Windows, I think the general consensus is to use the MinGW-w64 toolchain. There are a number of thoughts on how to do that exactly. This may take some time to get right.

So I'm leaning towards using a repackaging of MSYS2's OpenBLAS to get something working with our stack here soon. Then we can go back and revisit how to build OpenBLAS from scratch if we want to.

  • What is up with all the imports in the tests? Why is just import scipy; scipy.test() not enough?

It's a fair question. This basically had come from the recipe that @msarahan shared. So he may have some more thoughts on this.

The main point of import tests is just to make sure we can import the various modules ok. Normally if these fail it is some sort of linking issue. While this can be caught by the normal test suite, the import tests are a bit faster and help localize the problem a bit better. Though they tend to be more useful when getting a recipe working and less useful after.

Personally I have no strong feelings on them. So if you think they should be removed, I'm ok with that.
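For illustration, a trimmed-down test section along these lines (the module list is abbreviated and illustrative, not the full recipe) is what exercises those import checks:

```yaml
test:
  # Import checks fail fast on linking problems and localize them to a
  # submodule, without waiting on the full (slow) test suite.
  imports:
    - scipy
    - scipy.linalg
    - scipy.sparse.linalg
```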

@jakirkham

do you want to add optional run/test dependencies? Optional run are matplotlib, scikit-umfpack.

So, conda recipes don't have a concept of optional run dependencies; one either chooses to install them or not. If we add them to the run requirements, they are no longer optional. We could add a comment in the run requirements section explaining that these are optional dependencies of the package. Does that seem reasonable?

We do have matplotlib here. We don't have scikit-umfpack yet, but we should work to package it.

Optional test dependency is mpmath.

Absolutely. We should use all optional dependencies that are involved in tests. We also have mpmath here. Are matplotlib or scikit-umfpack used during testing?
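A sketch of what such a comment could look like in the run requirements (an illustrative fragment, not the recipe as merged; the notes on what uses each dependency come from this thread):

```yaml
requirements:
  run:
    - python
    - numpy x.x
    # Optional dependencies (not installed by default):
    #   matplotlib      - used by scipy.cluster.hierarchy.dendrogram
    #   scikit-umfpack  - used by some scipy.sparse.linalg functions
```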

@rgommers

So, conda recipes don't have a concept of optional run dependencies. One either chooses to install them or not. If we add them to the run requirements, they are no longer optional. We could add a comment in the run requirements section explaining that these are some option dependencies in the recipe. Does that seem reasonable?

Clear. Since MPL is only used for one function and scikit-umfpack isn't available yet a comment seems better than adding a dependency.

Optional test dependency is mpmath.

Absolutely. We should use all optional dependencies that are involved in tests. We also have mpmath here. Are matplotlib or scikit-umfpack used during testing?

Both are used, but only for a couple of functions (matplotlib for dendrogram and umfpack for some sparse.linalg functions). Not super important (but mpmath is). I'll submit a PR to the feedstock repo to add mpmath.

@rgommers

What is up with all the imports in the tests? Why is just import scipy; scipy.test() not enough?

It's a fair question. This basically had come from the recipe that @msarahan shared. So he may have some more thoughts on this.

The main point of import tests is just to make sure we can import the various modules ok. Normally if these fail it is some sort of linking issue. While this can be caught by the normal test suite, the import tests are a bit faster and help localize the problem a bit better. Though they tend to be more useful when getting a recipe working and less useful after.

Personally I have no strong feelings on them. So if you think they should be removed, I'm ok with that.

No, I see what the idea is now - that's probably helpful and it can't really hurt. Better than getting the tests run with dozens of test errors.

@rgommers

@jakirkham thanks for the long explanation on BLAS & Numpy, it's much clearer to me now.

The invocation to get a particular BLAS variant and version is indeed not very user-friendly yet, hope that's going to improve in conda. But I guess most users will be OK with MKL Numpy from the defaults channel.

@jakirkham commented May 24, 2016

thanks for the long explanation on BLAS & Numpy...

Boy, that was long. Sorry about that.

...it's much clearer to me now.

Glad it is clearer at least.

The invocation to get a particular BLAS variant and version is indeed not very user-friendly yet...

So, the numpy from conda-forge should win conflicts with defaults right now (at least that's what we were aiming for). Thus, running conda install -c conda-forge numpy should pull in our NumPy linked to OpenBLAS.

...hope that's going to improve in conda.

We do too. We have been pushing on various points in conda and conda-build to improve this. There was a fair bit of discussion in this PR ( #525 ), though it's a bit of a long read. This PR ( conda/conda#2427 ) has some interesting ideas, but is a bit more succinct.

Though there is a long history around features in general and the feeling at this point is they are broken and need fixing. While disheartening that they are a mess, it is encouraging that we are not arguing about whether they are a mess, but only how we should fix them. On that front, there are many promising ideas.

@rgommers

So, the numpy from conda-forge should win conflicts with defaults right now (at least that's what we were aiming for). Thus, running conda install -c conda-forge numpy should pull in our NumPy linked to OpenBLAS.

Hmm, doesn't work for me on a freshly installed Miniconda 3.5:

$ conda install -c conda-forge numpy
Fetching package metadata: ......
Solving package specifications: .........

Package plan for installation in environment /home/rgommers/miniconda3:

The following packages will be downloaded:

    package                    |            build
    ---------------------------|-----------------
    mkl-11.3.3                 |                0        77.6 MB
    numpy-1.11.0               |           py35_1         5.7 MB
    conda-4.0.6                |           py35_0         189 KB
    ------------------------------------------------------------
                                           Total:        83.5 MB

The following NEW packages will be INSTALLED:

    mkl:   11.3.3-0     
    numpy: 1.11.0-py35_1

The following packages will be UPDATED:

    conda: 4.0.5-py35_0 --> 4.0.6-py35_0

Also, conda search numpy --channel conda-forge only lists versions from the defaults channel, so I guess the conda-forge builds haven't propagated yet. Will try again in a few days.

@jakirkham

Really? Works for me. Hmm...

Could you please try updating conda first (i.e. conda update -y conda) and then repeat?

@rgommers

Ah, found the issue - no 32-bit installers at https://anaconda.org/conda-forge/numpy

@jakirkham

I see. Yeah, we don't currently build 32-bit Linux. Though we could investigate doing it. Certainly have given it some thought before. Are you using 32-bit regularly?

@rgommers

Yes, I'm working mostly on a 32-bit Linux desktop. It helps smoke out some issues that aren't caught otherwise (like issues due to np.float128 not being present recently). But yeah, it's not the most popular config, so I can understand that it has low prio.

@jakirkham

So, I don't think we are against using 32-bit Linux to build things. The main problem is that CircleCI doesn't have matrix support. As a result, we loop through build combos (Python versions, NumPy versions, any other parameters) in the feedstock. The net result is that some builds are very close to the CircleCI time limit, and adding 32-bit would put us over the limit in those cases. If there were proper matrix support or longer build time limits, we would be much better equipped to pursue this.
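The looping described above can be sketched roughly like this (a hypothetical script; the option names match conda-build's CLI, but the version lists and recipe path are illustrative):

```shell
# Hypothetical sketch of a feedstock build loop standing in for CI matrix
# support: every Python/NumPy combination is built serially in one CI job,
# which is why adding more combinations risks hitting the time limit.
build_all() {
    for PY in 2.7 3.4 3.5; do
        for NPY in 1.10 1.11; do
            echo "building for python=${PY} numpy=${NPY}"
            # The real step would be something like:
            # conda build ./recipe --python "${PY}" --numpy "${NPY}"
        done
    done
}

build_all
```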

@jakirkham left a comment:

Also have found this strategy to be useful with scipy. May be useful in construction of PR ( conda-forge/gsl-feedstock#18 ) for gsl, @isuruf.

test -f "${PREFIX}/lib/libopenblas.a" && ln -fs "${PREFIX}/lib/libopenblas.a" "${PREFIX}/lib/libblas.a"
test -f "${PREFIX}/lib/libopenblas.a" && ln -fs "${PREFIX}/lib/libopenblas.a" "${PREFIX}/lib/liblapack.a"
test -f "${PREFIX}/lib/libopenblas.${DYLIB_EXT}" && ln -fs "${PREFIX}/lib/libopenblas.${DYLIB_EXT}" "${PREFIX}/lib/libblas.${DYLIB_EXT}"
test -f "${PREFIX}/lib/libopenblas.${DYLIB_EXT}" && ln -fs "${PREFIX}/lib/libopenblas.${DYLIB_EXT}" "${PREFIX}/lib/liblapack.${DYLIB_EXT}"
@jakirkham:

Note that we create symlinks to the standard blas and lapack library names before the build begins. The compilers are smart enough to follow them to their true location and thus are linked to libopenblas directly.


rm -f "${PREFIX}/lib/libblas.a"
rm -f "${PREFIX}/lib/liblapack.a"
rm -f "${PREFIX}/lib/libblas.${DYLIB_EXT}"
rm -f "${PREFIX}/lib/liblapack.${DYLIB_EXT}"
@jakirkham:

We then clean them up after the build ends. This is done to ensure they don't get packaged. Plus it won't matter for anything built as they will already be linked against libopenblas.

@matthew-brett

We've just got SciPy Windows wheels to build - scipy/scipy#7616 . Most credit goes to a lot of hard work from @xoviat - a tip of the hat from here. The build machinery is all on GitHub / AppVeyor in case it's useful.
