[python] [macOS] Segfault in Dataset construction with scikit-learn==1.0.0 #4632

Closed
tony-theorem opened this issue Sep 28, 2021 · 10 comments
@tony-theorem

Description

LightGBM segfaults during the construction of any Dataset when used with scikit-learn==1.0.0 on macOS.

Minimal reproducible examples are shown below. They segfault when run with lightgbm==3.2.1 and scikit-learn==1.0.0 on macOS. Switching to Linux eliminates the segfaults, as does downgrading to scikit-learn==0.24.2.

Reproducible examples

Example 1:



import numpy as np
import lightgbm as lgbm

X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
y = np.array([0, 1, 0, 1])
lgbm.Dataset(data=X, label=y, params={"min_data_in_bin": 1}).construct()


Small dataset which exhibits the segfault behavior.


Example 2:


import numpy as np
import lightgbm as lgbm

rng = np.random.default_rng(seed=1)
n_samples = 10000
X = rng.normal(size=(n_samples, 3))
y = rng.binomial(1, 0.5, size=(n_samples,))
lgbm.Dataset(data=X, label=y).construct()


Larger dataset to demonstrate that the segfault is not the result of a particularly small dataset.
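Because a segfault kills the interpreter that triggers it, it can help to run a reproduction like the ones above in a child process and inspect the exit status from the parent. The following is a minimal sketch, not part of the original report: the helper names `run_isolated` and `died_from_segfault` are hypothetical, and the check relies on the POSIX convention that `subprocess` reports death by signal as a negative return code.

```python
import signal
import subprocess
import sys

# Example 1 from the report, kept as a string so it runs in a separate interpreter.
REPRO_CODE = """
import numpy as np
import lightgbm as lgbm

X = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 0, 1]])
y = np.array([0, 1, 0, 1])
lgbm.Dataset(data=X, label=y, params={"min_data_in_bin": 1}).construct()
"""


def run_isolated(code):
    """Run `code` in a fresh Python interpreter and return its exit status."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return completed.returncode


def died_from_segfault(returncode):
    """Return True if the exit status means the child was killed by SIGSEGV (POSIX)."""
    return returncode == -signal.SIGSEGV


if __name__ == "__main__":
    status = run_isolated(REPRO_CODE)
    if died_from_segfault(status):
        print("child process segfaulted")
    else:
        print(f"no segfault (exit status {status})")
```

A non-zero status that is not a segfault (for example, a missing package in the child environment) is reported with its exit status rather than treated as a crash.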

Environment info

macOS environment (produces segfault):

macOS-10.16-x86_64-i386-64bit
lightgbm==3.2.1
numpy==1.21.1
scikit-learn==1.0.0


Linux environment (no segfault):

Linux-4.9.0-11-amd64-x86_64-with-glibc2.31
lightgbm==3.2.1
numpy==1.21.1
scikit-learn==1.0.0
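An environment summary like the one above can be gathered programmatically. This is a sketch under assumptions: the platform strings in the report look like the output of `platform.platform()`, but the thread does not say how they were produced, and `collect_versions` is a hypothetical helper name.

```python
import platform
from importlib import metadata


def collect_versions(packages=("lightgbm", "numpy", "scikit-learn")):
    """Return the platform string and the installed version of each package."""
    report = {"platform": platform.platform()}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

Note that this reports only Python package versions. A system library such as libomp is not visible to `importlib.metadata` and has to be checked separately.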
@jameslamb jameslamb changed the title [macOS] Segfault in Dataset construction with scikit-learn==1.0.0 [python] [macOS] Segfault in Dataset construction with scikit-learn==1.0.0 Sep 28, 2021
@jameslamb
Collaborator

Thanks for the write-up @tony-theorem, and sorry you're experiencing this! I'm very surprised that downgrading scikit-learn determines whether or not this example's code segfaults 😬

I'll try to reproduce this on my Mac tonight. We need one more piece of information...how did you install these packages?

@tony-theorem
Author

I am working from Python 3.9.6. My system has libomp (version 12.0.0) installed via brew install libomp.

I created a fresh virtual environment and installed the dependencies there using pip install. I tried two different ways to install the dependencies, and both produced the segfault.

python -m pip install numpy==1.21.1 scikit-learn==1.0.0 lightgbm==3.2.1

python -m pip install numpy==1.21.1 scikit-learn==1.0.0 wheel==0.37.0
python -m pip install --no-binary :all: lightgbm==3.2.1

In the second case, the build was performed with cmake version 3.21.3.

@StrikerRUS
Collaborator

@tony-theorem

> My system has libomp (version 12.0.0) installed via brew install libomp.

LightGBM is incompatible with libomp 12.0 according to users' reports (#4229). Could you please try to downgrade it?

@tony-theorem
Author

@StrikerRUS

Downgrading to libomp==11.1 appears to have resolved the issue. Both methods of installing the dependencies into the virtual environment now result in the examples running successfully. As such, this looks like another manifestation of #4229. Thank you for your help.

As a side note, I'm rather surprised that scikit-learn==1.0.0 vs. scikit-learn==0.24.2 appears to be the trigger in this case. I had not previously had issues despite running with libomp==12.0.

@StrikerRUS
Collaborator

@tony-theorem I'm very glad that downgrading helped. Thanks a lot for getting back to us and reporting this!

@jameslamb
Collaborator

+1, excellent report! Thanks for all the effort you put into making a tight reproducible example.

@jameslamb jameslamb added the bug label Oct 2, 2021
@thomasjpfan
Contributor

thomasjpfan commented Oct 3, 2021

Thank you to everyone looking into this issue! The scikit-learn 1.0 wheels for macOS were built with libomp 12, which looks to be causing the issue.

We are currently working on a fix at scikit-learn/scikit-learn#21227, which uses libomp 11 for building. I tested these new wheels with the example in this issue and the segfault no longer appears. If everything goes smoothly, the next scikit-learn bug fix release should allow you to use scikit-learn=1.0.X with lightgbm.

@jameslamb
Collaborator

oooooo thank you very much @thomasjpfan !

@ogrisel
ogrisel commented Oct 7, 2021

For information, it should be possible to introspect the versions of the OpenMP runtime in an existing environment using:

python -m threadpoolctl -i lightgbm
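The same information is available from Python through `threadpoolctl.threadpool_info()`, which returns one dictionary per detected thread-pool library, with keys such as `user_api`, `filepath`, and `version`. The sketch below filters that list down to OpenMP runtimes and flags the case where more than one distinct runtime file is loaded. The helper names are hypothetical, and treating multiple loaded runtimes as a warning sign is an assumption drawn from this thread, not a documented diagnostic.

```python
def openmp_runtimes(info):
    """Keep only the entries that describe an OpenMP runtime."""
    return [lib for lib in info if lib.get("user_api") == "openmp"]


def has_multiple_openmp_runtimes(info):
    """Return True if more than one distinct OpenMP library file is loaded."""
    paths = {lib.get("filepath") for lib in openmp_runtimes(info)}
    return len(paths) > 1


if __name__ == "__main__":
    try:
        # Libraries must be imported first so their runtimes are loaded and visible.
        import lightgbm  # noqa: F401
        from threadpoolctl import threadpool_info
    except ImportError as exc:
        print(f"cannot inspect runtimes: {exc}")
    else:
        info = threadpool_info()
        for lib in openmp_runtimes(info):
            print(lib.get("version"), lib.get("filepath"))
        if has_multiple_openmp_runtimes(info):
            print("more than one OpenMP runtime is loaded in this process")
```

Importing scikit-learn alongside lightgbm before calling `threadpool_info()` would show whether the two packages load different copies of the runtime.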

@github-actions

This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 23, 2023