Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix evalution of linear trees with a single leaf. #3987

Merged
merged 16 commits into from
Feb 21, 2021

Conversation

mjmckp
Copy link
Contributor

@mjmckp mjmckp commented Feb 16, 2021

Note that trees without linear models at the leaf always handle num_leaves = 1 as a special case and directly output the leaf value. Linear trees were missing this special case handling, and hence would have the following issues:

  • Calling Tree::Predict or Tree::PredictByMap would cause an access violation exception attempting to access the first value of the empty split_feature_ array in GetLeaf.
  • PredictionFunLinear would either cause an access violation or go into an infinite loop when attempting to do the equivalent of GetLeaf.

Note also that PredictionFun does not need the same changes as PredictionFunLinear, since both are only called by Tree::AddPredictionToScore, which has a special case for (!is_linear_ && num_leaves_ <= 1) that precludes calling PredictionFun.

matthew-peacock and others added 16 commits November 2, 2018 13:07
…datasets.

Prior to this change, the line "score_t threshold = tmp_gradients[top_k - 1];" would generate an exception, since tmp_gradients would be empty when the cnt input value to the function is zero.
…data set given as of array of pointers to rows (as opposed to existing method LGBM_BoosterPredictForMat which requires data given as contiguous array)
Note that trees without linear models at the leaf always handle num_leaves = 1 as a special case and directly output the leaf value.  Linear trees were missing this special case handling, and hence would have the following issues:
 * Calling Tree::Predict or Tree::PredictByMap would cause an access violation exception attempting to access the first value of the empty split_feature_ array in GetLeaf.
 * PredictionFunLinear would either cause an access violation or go into an infinite loop when attempting to do the equivalent of GetLeaf.

Note also that PredictionFun does not need the same changes as PredictionFunLinear, since both are only called by Tree::AddPredictionToScore, which has a special case for (!is_linear_ && num_leaves_ <= 1) that precludes calling PredictionFun.
@guolinke
Copy link
Collaborator

cc @btrotta

@btrotta
Copy link
Collaborator

btrotta commented Feb 20, 2021

@mjmckp Thanks for finding and fixing this! I reproduced the error you describe with the following script, which causes LightGBM to crash:

import lightgbm as lgb
from sklearn.datasets import load_breast_cancer

X_train, y_train = load_breast_cancer(return_X_y=True)
train_data = lgb.Dataset(X_train, label=y_train)
params = {
    "objective": "binary",
    "linear_tree": True,
    "min_sum_hessian": 5000
}
bst = lgb.train(params, train_data, num_boost_round=5, valid_sets={train_data}, valid_names='train')
bst.predict(X_train)

Your fix looks good to me!

@jameslamb jameslamb merged commit 605c97b into microsoft:master Feb 21, 2021
@jameslamb
Copy link
Collaborator

Thanks for the fix @mjmckp !

@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 24, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants