First draft of multi-objective optimization #1455

Merged: 24 commits into development from MOO on May 12, 2022

Conversation

@mfeurer (Contributor) commented May 3, 2022

This is the very first implementation of multi-objective optimization in Auto-sklearn, solving issue #1317. It allows users passing in a list of metrics instead of a single metric. Auto-sklearn then solves a multi-objective optimization problem by using SMAC's new multi-objective capabilities.
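
For context, a minimal sketch of how the new interface is meant to be used. The dataset, time budget, and choice of metrics below are illustrative; only the fact that metric accepts a list of Scorers comes from this PR:

    import autosklearn.classification
    import autosklearn.metrics
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split

    X, y = load_breast_cancer(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

    # Passing a list of Scorers instead of a single Scorer turns the SMAC run
    # into a multi-objective optimization problem.
    automl = autosklearn.classification.AutoSklearnClassifier(
        time_left_for_this_task=120,
        metric=[autosklearn.metrics.precision, autosklearn.metrics.recall],
    )
    automl.fit(X_train, y_train)
    print(automl.leaderboard())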

This PR has several known limitations, some of which can or will be fixed in a future PR, and some for which it is not yet clear how to approach them in the first place:

  1. The ensemble optimizes only the first metric:
    This will be addressed in a future PR.
  2. _load_best_individual_model() returns the best model according to the first metric:
    It is unclear how to make this multi-objective, as this is a clear fallback solution. My two suggestions would be 1) to always return the best model according to the first metric and document that the first metric is the tie-breaking metric (currently implemented), or 2) to use the Copula mixing approach demonstrated in Figure 6 of http://proceedings.mlr.press/v119/salinas20a/salinas20a.pdf to return a scalarized solution. However, the latter can lead to unpredictable selections.
  3. score():
    This function suffers from the same issue as _load_best_individual_model(), as there is no clear definition of which model is the best.
  4. _get_runhistory_models_performance():
    This function suffers from the same issue as _load_best_individual_model(), as there is no clear definition of which model is the best.
  5. sprint_statistics():
    This function suffers from the same issue as _load_best_individual_model(), as there is no clear definition of which model is the best.
  6. refit():
    This function suffers from the same issue as _load_best_individual_model(), as there is no clear definition of which model is the best.
  7. Leaderboard:
    • The first metric is currently reported as "cost". If there is more than one metric, we report them as cost_0, cost_1, etc.
    • The entries will be sorted by the first metric by default, with all additional metrics being tie breakers
    • The current sorting ["cost", "rank"] appears to me to be equivalent to sorting by ["cost"] alone, as "rank" should have the same order. I therefore propose to use "model_id" as the secondary sort key here; see the sketch after this list.
  8. cv_results_:
    Updated to follow the output format of scikit-learn's cv_results_ (see the Attributes section in the scikit-learn documentation).
  9. Accessing the Pareto front of individual models and ensembles:
    It is so far completely unclear how to achieve this. Picking a model from the Pareto front would make Auto-sklearn non-interactive. Therefore, it might be a good idea to add a function that returns all ensembles on the Pareto front as "raw" ensemble objects that can be further used by the user, or to allow loading different models as the main model one after the other.
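
Regarding the sorting proposal in item 7, a small sketch of the intended ordering on a made-up leaderboard DataFrame (the column names follow the cost_0/cost_1 naming proposed above; nothing here is taken from the actual implementation):

    import pandas as pd

    # Toy leaderboard using the column names proposed in item 7.
    leaderboard = pd.DataFrame(
        {
            "model_id": [7, 3, 5],
            "cost_0": [0.12, 0.12, 0.15],
            "cost_1": [0.40, 0.35, 0.30],
        }
    )

    # Sort by the first metric and break ties with model_id instead of rank,
    # since rank would order models exactly like cost_0 anyway.
    print(leaderboard.sort_values(by=["cost_0", "model_id"]))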

TODOs:

  • Generalize the code and always wrap the metric in a list so that internal methods always work on dicts or lists of metrics (reduces a lot of if-statements)
  • write unit tests
  • pass a metric to meta-learning
  • fix unit tests
  • make the following functions multi-objective:
    • load_best_individual_model
    • leaderboard
    • score
    • _get_runhistory_models_performance
    • cv_results
    • sprint_statistics
    • refit
  • update examples
  • add multi-objective example
  • sub-sampling (i.e. successive halving)

@mfeurer mfeurer requested a review from KEggensperger May 3, 2022 15:04
Co-authored-by: Katharina Eggensperger <eggenspk@informatik.uni-freiburg.de>
@eddiebergman (Contributor) left a comment

Don't need to change, just if you see it and get a chance :) I think it looks a little cleaner with as few square brackets from Optional[] and Union[] as possible, but I'd merge without it. It also translates better to types in the Sphinx docs if you use the same notation in the docstrings.

@@ -210,7 +210,7 @@ def __init__(
     get_smac_object_callback: Optional[Callable] = None,
     smac_scenario_args: Optional[Mapping] = None,
     logging_config: Optional[Mapping] = None,
-    metric: Optional[Scorer] = None,
+    metric: Optional[Scorer | Sequence[Scorer]] = None,
Contributor:

Not necessary, just something to know: Optional[X] == Union[X, None] == X | None,
i.e. you could write Scorer | Sequence[Scorer] | None = None
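
For illustration, a small sketch showing that the three spellings describe the same type (with from __future__ import annotations the | syntax also works on Python < 3.10; the function names are made up):

    from __future__ import annotations

    from typing import Optional, Sequence, Union

    from autosklearn.metrics import Scorer

    # All three annotations accept a single Scorer, a sequence of Scorers, or None.
    def fit_a(metric: Optional[Union[Scorer, Sequence[Scorer]]] = None) -> None:
        ...

    def fit_b(metric: Union[Scorer, Sequence[Scorer], None] = None) -> None:
        ...

    def fit_c(metric: Scorer | Sequence[Scorer] | None = None) -> None:
        ...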

Contributor Author (mfeurer):

That looks nice, will do.

@@ -86,10 +99,10 @@ def fit_predict_try_except_decorator(


 def get_cost_of_crash(
-    metric: Union[Scorer, List[Scorer], Tuple[Scorer]]
+    metric: Union[Scorer | Sequence[Scorer]],
Contributor:

Likewise here, Union[X | Y] == X | Y; the | is essentially just the infix operator for Union, in the same way you write x + y instead of add(x, y).

i.e. metric: Scorer | Sequence[Scorer]

Contributor Author (mfeurer):

Thanks for catching that.

@eddiebergman (Contributor) left a comment

Sorry, more files I didn't see. I'm starting to wonder if it makes sense to have something like a MetricGroup class? A lot of the code changes seem to just be handling the case of one, many, or no metrics.

val_score = metric._optimum - (metric._sign * run_value.cost)
cost = run_value.cost
if not isinstance(self._metric, Scorer):
    cost = cost[0]
Contributor:

I assume this is a point of API conflict? It would be good to know about all the metrics for a model but at the end of the day, we currently only support one and so we choose the first?

Contributor Author (mfeurer):

Yes, it would be good to know about all the metrics. I will look into returning multiple metrics here (should be possible).

Contributor Author (mfeurer):

Please see my comment regarding this in the PR description at the top.


codecov bot commented May 4, 2022

Codecov Report

Merging #1455 (d432e07) into development (daa9ad6) will increase coverage by 0.00%.
The diff coverage is 81.95%.

@@              Coverage Diff              @@
##           development    #1455    +/-   ##
=============================================
  Coverage        84.31%   84.32%            
=============================================
  Files              147      147            
  Lines            11284    11397   +113     
  Branches          1934     1986    +52     
=============================================
+ Hits              9514     9610    +96     
- Misses            1256     1263     +7     
- Partials           514      524    +10     


@mfeurer (Contributor Author) commented May 9, 2022

Alright, I think the functionality of this PR is complete for now. I will add unit tests after another round of feedback.

@eddiebergman (Contributor):

Replies to the list of items in the PR first.

  1. The ensemble optimizes only the first metric:
    This will be addressed in a future PR.

Sounds good, we need to know what to do with metrics in the ensemble first.

  2. _load_best_individual_model() returns the best model according to the first metric:
    It is unclear how to make this multi-objective as this is a clear fallback solution. My only two suggestions would be 1) to always return the first metric and document that the first metric is the tie breaking metric (currently implemented), or 2) use the Copula mixing approach demonstrated in Figure 6 of http://proceedings.mlr.press/v119/salinas20a/salinas20a.pdf to return a scalarized solution. However, this can lead to unpredictable selection

An extended version of the first solution: compare (cost_0, cost_1, ...) <= (cost_0, cost_1, ...) lexicographically, so that if cost_0 is equal it then compares the second entry, and so on. This means the order of the metrics is the order of comparison, which in turn means that order must be preserved throughout.
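
A quick sketch of that comparison, assuming each model's costs are stored as a tuple in the same order as the user-supplied metrics (the run values here are made up):

    # Python's built-in tuple comparison is already lexicographic, so keeping the
    # costs in metric order gives the tie-breaking behaviour described above.
    runs = {
        "model_a": (0.12, 0.40),
        "model_b": (0.12, 0.35),  # same cost_0 as model_a, lower cost_1 -> wins the tie
        "model_c": (0.15, 0.10),
    }

    best_model = min(runs, key=runs.get)
    assert best_model == "model_b"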

  3.–6. ...

Same answer: without a method to present choices, we have to select for the user. I would go with the above solution.

  7. Leaderboard

Seems good, we could even extract the metric names through f.__name__, but this would make using the leaderboard programmatically more difficult.
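
A sketch of the two naming options for leaderboard columns; using the Scorer's name attribute here instead of f.__name__ is an assumption, not what the PR implements:

    from autosklearn.metrics import accuracy, log_loss

    metrics = [accuracy, log_loss]

    # Positional names are stable and easy to use programmatically.
    positional = [f"cost_{i}" for i in range(len(metrics))]  # ['cost_0', 'cost_1']

    # Metric-derived names are more readable but change with the metric list.
    named = [f"cost_{m.name}" for m in metrics]  # ['cost_accuracy', 'cost_log_loss']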

  8. cv_results_

Sure

  9. Ensembles

In theory there's nothing that prevents us from making a selection at predict time as we already do, using the above methods to force a selection. For giving a choice, yes, we need to somehow wrap up the ensembles with their scores. The solution below would give a lot of choice, but maybe there's a better way to present the info?

[
    ((cost_0, cost_1, ...), ens0),
    ((cost_0, cost_1, ...), ens1),
    ((cost_0, cost_1, ...), ens2),
]
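
One possible way to present that info with a bit more structure: a small named wrapper plus a Pareto filter. This is only a sketch of an assumed data structure, not an API that exists in this PR:

    from typing import Any, NamedTuple, Sequence


    class ParetoEntry(NamedTuple):
        costs: tuple  # one cost per metric, in the user-given metric order
        ensemble: Any  # the "raw" ensemble object mentioned above


    def pareto_front(entries: Sequence[ParetoEntry]) -> list:
        """Keep only entries whose costs are not dominated by another entry."""
        front = []
        for entry in entries:
            dominated = any(
                other.costs != entry.costs
                and all(o <= c for o, c in zip(other.costs, entry.costs))
                for other in entries
            )
            if not dominated:
                front.append(entry)
        return front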

I'll do a review now.

solution=y_true,
prediction=y_pred,
task_type=BINARY_CLASSIFICATION,
metrics=[autosklearn.metrics.accuracy, autosklearn.metrics.accuracy],
@eddiebergman (Contributor) commented May 12, 2022

[Non-Blocking, Q]
Using the same metric twice seems like it should raise an error to me. Given that no error is raised, the output is good though. Should we keep it as is?

Contributor Author (mfeurer):

Do you mean we should allow the same metric to be present in both metrics and scoring_functions, but not the same metric twice in one of them?

Contributor:

I guess so; I didn't really think about having the same one twice, once in metrics and once in scoring_functions.

What you stated seems reasonable to me, but not a cause to block the PR if it's difficult. I can't imagine a scenario where you would want the same metric, e.g. acc, twice in metrics, but I can imagine a scenario where you have acc in metrics and acc in scoring_functions.

The scenario where I see this being used is purely to get all scores out of an auto-sklearn run in one place, i.e. you specify acc and metric_z for the optimization with metrics, and you specify acc, balanced_acc and f1_score for the scoring_functions when you later want to evaluate the auto-sklearn run.

@mfeurer mfeurer merged commit ed1bc68 into development May 12, 2022
@mfeurer mfeurer deleted the MOO branch May 12, 2022 19:44
eddiebergman pushed a commit that referenced this pull request Aug 18, 2022
* First draft of multi-objective optimization

Co-authored-by: Katharina Eggensperger <eggenspk@informatik.uni-freiburg.de>

* Feedback from Eddie

* Make metric internally always a list

* Fix most examples

* Take further feedback into account

* Fix unit tests

* Fix one more example

* Add multi-objective example

* Simplify internal interface

* Act on further feedback

* Fix bug

* Update cv_results_ for multi-objective sklearn compliance

* Update leaderboard for multi-objective optimization

* Include Feedback from Katharina

* Take offline feedback into account

* Take offline feedback into account

* Eddie's feedback

* Fix metadata generation unit test

* Test for metrics with the same name

* Fix?

* Test CV results

* Test leaderboard for multi-objective optimization

* Last batch of unit tests added

* Include Eddie's feedback

Co-authored-by: Katharina Eggensperger <eggenspk@informatik.uni-freiburg.de>