
Encapsulated the selector training within a function and called it inside _init_ #1473

Merged
merged 11 commits into automl:development on Jun 2, 2022

Conversation

aseemk98
Contributor

No description provided.

global metrics
global selector_files
global strategies
global this_directory
Contributor

What's the reason to make these global?

Contributor Author

Those variables were being used inside the fit() function of askl2. A better solution would be to declare them as class variables instead, I guess.
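
For illustration, a minimal sketch of what moving those globals onto the class could look like (the attribute names simply mirror the globals above; this is not the actual diff):

from autosklearn.metrics import balanced_accuracy, log_loss, roc_auc

class AutoSklearn2Classifier:  # simplified stand-in for the real class
    def __init__(self):
        # Formerly module-level globals, now kept on the instance
        self.metrics = (balanced_accuracy, roc_auc, log_loss)
        self.selector_files = {}  # populated when the selectors are trained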

@@ -339,6 +335,7 @@ def __init__(
"classifier": include_estimators,
"feature_preprocessor": include_preprocessors,
}
train_selectors()
Contributor

Perhaps we can improve the startup time further by only training the selector for the metric that is given. In the case of no metric given, I guess we can just default to using all of them.

Contributor Author

I do see the metric parameter that is accepted by askl2. Should this be a list of metrics? And if nothing is passed, we could use the default ones.

@codecov

codecov bot commented May 13, 2022

Codecov Report

Merging #1473 (807175c) into development (daa9ad6) will decrease coverage by 0.16%.
The diff coverage is 92.85%.

@@               Coverage Diff               @@
##           development    #1473      +/-   ##
===============================================
- Coverage        84.31%   84.15%   -0.17%     
===============================================
  Files              147      151       +4     
  Lines            11284    11496     +212     
  Branches          1934     1995      +61     
===============================================
+ Hits              9514     9674     +160     
- Misses            1256     1288      +32     
- Partials           514      534      +20     

Impacted file tree graph

@eddiebergman
Contributor

So the reports you see at the bottom are our automated checks, which run a benchmark of tests against autosklearn. As you can see, they have currently failed. You can click on any one of them and look at the log to see which tests failed.

FAILED test/test_estimators/test_estimators.py::test_selector_file_askl2_can_be_created[None]
FAILED test/test_estimators/test_estimators.py::test_selector_file_askl2_can_be_created[/]
FAILED test/test_estimators/test_estimators.py::test_selector_file_askl2_can_be_created[/tmp]

You can run this locally using pytest -k "test_selector_file_askl2_can_be_created" test/test_estimators and debug from there. For anything you push up, we have to manually approve the test runs since you're a first-time contributor. See this section on testing to learn some more, or else have a read through pytest --help; it's got a lot of features.

The pre-commit check also fails. This is our check to ensure some standard of code quality; it will tell you about a lot of problems before any code is even run. For formatting purposes, you can just run make format and it will take care of most formatting issues. Scroll down in the Testing section to see more about it.

Let me know if something isn't clear, I can't guarantee any feedback over the weekend so don't hesitate to reach out.

@eddiebergman linked an issue May 13, 2022 that may be closed by this pull request
@aseemk98
Contributor Author

After encapsulating the selector training, pytest throws the following error: AttributeError: module 'autosklearn.experimental.askl2' has no attribute 'selector_files'. I think the testing script expects the above-mentioned global variables to be initialized for the test.

@eddiebergman
Contributor

Hi @aseemk98,

So for the globals, we can just move them to be parameters of the function, or define anything that is static and needed as a module-level variable (global). In general we would prefer to avoid the global keyword, as it's not exactly clear what is in global scope then; i.e. if it needs to be global, then just put it in global scope.
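
A minimal sketch of those two options, with hypothetical names rather than the project's actual code: static data stays as a module-level variable, and everything mutable is passed in as a parameter, so the global keyword is never needed.

from pathlib import Path

THIS_DIRECTORY = Path(__file__).parent  # static, so fine as a module-level variable

def train_selectors(selector_files, metrics):
    # Mutable state arrives as arguments instead of via `global`;
    # metrics are assumed to be plain metric names here
    for metric in metrics:
        selector_files[metric] = THIS_DIRECTORY / f"selector_{metric}.pkl"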

As for the tests, feel free to modify them as needed so things work as intended; we can review it and make sure it's still correct. Since the scope of the variables is changing, it makes sense that the tests should change too.

@aseemk98
Contributor Author

@eddiebergman,
I'm unsure why the test here is failing. When I test on my computer, pytest -k "test_selector_file_askl2_can_be_created" test/test_estimators is passing all 3 tests.

Contributor

@eddiebergman left a comment

So I would still ask that the global keywords are removed; either they should be properties of the ASKL2Classifier, or just moved explicitly into the global state if that doesn't make sense to do.

@eddiebergman
Contributor

The automated tests also seem to pass fine; the doc tests are failing due to a known issue, so I wouldn't worry about that one.

@aseemk98
Contributor Author

There was an issue in the previous commit with test case 2. That has now been fixed in the latest one. There are no global variables either; an instance of AutoSklearn2Classifier is initialized in the testing script, allowing it to use metrics and selector_files via that object.

Contributor

@eddiebergman left a comment

Hey, looks a lot cleaner now :) Thanks!

So there's one more thing I would block merging on, and it's a minor one: renaming metrics to selector_metrics. It's our fault, as we will soon be using metrics to mean multiple metrics for training and ensembling, so it would conflict with what metrics means here, which is just for training the selector.

With that changed, I would merge as is

@@ -367,6 +307,81 @@ def __init__(
allow_string_features=allow_string_features,
)

def train_selectors(self, selected_metric=None):
    self.metrics = (balanced_accuracy, roc_auc, log_loss)
Contributor

Could we call this self.selector_metrics? We have a version of multi-objective autosklearn coming soon and we use metrics to indicate a list of metrics used for training.

    metric_list = [selected_metric]
else:
    metric_list = self.metrics
for metric in metric_list:
Contributor

Minor comment: adding some spacing between different constructs like if/else and for is nice. I imagine it was clumped like this before, but there's no reason not to improve it now :) Non-blocking though, it's mergeable without it.
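
For illustration, the whole block might read like this with the spacing (and the selector_metrics rename suggested above) applied; the if line and the loop body are not part of the diff excerpt, so they are assumed here:

def train_selectors(self, selected_metric=None):
    self.selector_metrics = (balanced_accuracy, roc_auc, log_loss)

    if selected_metric is not None:
        metric_list = [selected_metric]
    else:
        metric_list = self.selector_metrics

    for metric in metric_list:
        ...  # train and cache the selector for this metric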

@eddiebergman
Contributor

If you're up for the task, you could also test that this functionality works as intended, i.e. the selector is only trained for the metric passed to ASKL2Classifier if a metric is passed to __init__. Otherwise I'll add it as an issue, as something that we need to do for the future.

@aseemk98
Contributor Author

Hi @eddiebergman,
I've renamed metrics to selector_metrics and added spacing between the if/else and for. Regarding your previous comment, should I be writing test cases of my own in the test script for the same?

@eddiebergman
Contributor

eddiebergman commented May 17, 2022

Sure, seems good as it is now. For the tests, we need to consider what we actually want to test; I can think of a few points, but if you have more, please feel free to add them!

  1. When we create an ASKL2Classifier and pass it a metric, we will have the selector trained for that metric, and no others.
  2. When we create an ASKL2Classifier with a metric like acc and we later create another ASKL2Classifier with that same metric, it won't need to retrain a selector.
  3. What should happen when no metric is passed?
  4. What happens when a metric is used that the selector doesn't support? I'm actually not sure right now, not without looking at the code; having a test is a great way to make sure this behaviour is documented in some way.

For 1., it would be good to do this for all metrics, but you can use @parametrize to prevent having to duplicate tests; here's an example:

import pytest

@pytest.mark.parametrize("name", ["Anna", "aseemk98", "apple"])
def test_something_something(name: str):
    assert name.lower().startswith("a")

For 2., it can be done along with the first test.

For 3., I think at the moment it will then train a selector for every metric, similar to the old behaviour. You might be able to squeeze these checks in with 1. and 2., but perhaps it's better to make it separate.

For 4., the test might require you to look at these examples on how to create a custom metric. I guess its behaviour should be the same as it was before, in that all selectors are trained?
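
For reference, a custom metric in auto-sklearn is built with autosklearn.metrics.make_scorer, roughly along these lines (a sketch modelled on the linked examples; the metric itself is just illustrative):

import numpy as np
import autosklearn.metrics

def error(solution, prediction):
    # Fraction of mismatched labels
    return np.mean(solution != prediction)

error_rate = autosklearn.metrics.make_scorer(
    name="error_rate",
    score_func=error,
    optimum=0,
    greater_is_better=False,
)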

You might consider some tests like:

@pytest.mark.parametrize("metric", [...])
def test_askl2_fits_selector_for_given_metric_at_init(metric):
    # Create askl2 with metric
    # Make sure when it's created, the selector file exists
    # Make sure there's no other unneeded selector trained
    # ... we might need to clean up selector files so it won't interfere with other tests

def test_askl2_when_no_metric_specified():
    # Create Askl2 with no metric
    # Make sure when it's created, all the selector files for the default metrics are created
    # ... we might need to clean up selector files so it won't interfere with other tests
    
def test_askl2_fit_with_custom_metric():
    # Create a custom metric 
    # Create askl2 with metric
    # Make sure the behaviour is .... (same as before)
    # ... we might need to clean up selector files so it won't interfere with other tests

@aseemk98
Contributor Author

Hi @eddiebergman,

Sure! I can write tests for the code; I was thinking along the same lines too. I'll create a new PR for the same! Also, if there are any other issues apart from the one I just addressed, I would love to work on those as well!

@eddiebergman
Contributor

It would be good to have the tests in the same PR, but for sure, if you're willing to help out some more, we can find some good issues! Happy to have the extra hands :)

@aseemk98
Contributor Author

Hi @eddiebergman,
For each of the tests, I will have to make sure that no selector files are present. If I don't pass a selector path, askl2 will create these files in either /tmp or /home, I think. Would it be a good idea to pass a selector path explicitly? That way we can clear the path before creating the instance.

@eddiebergman
Contributor

Yup, what you say makes sense, passing an explicit path :) You have good instincts, so in general you can go with what you think makes sense, and if it seems off during the code review, we can discuss it then!

@aseemk98
Contributor Author

Hi @eddiebergman,
I added tests for scenarios 1, 3, and 4. For Case 2, I'm unsure how to approach it. The only thing I can think of is that the time taken to create the second object with the same metric should be significantly lower, but I don't think that's a reliable way to test it, since timings can differ due to various factors. There is another thing I can think of, but it would require adding an extra variable to the askl2 class: we could set a boolean depending on this condition and access it after the instance has been created.
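
As a rough sketch, the flag-based check could look something like this (the selectors_trained attribute and its exact meaning are hypothetical, not part of the current code):

from autosklearn.experimental.askl2 import AutoSklearn2Classifier
from autosklearn.metrics import balanced_accuracy

first = AutoSklearn2Classifier(metric=balanced_accuracy)
assert first.selectors_trained       # the first instance had to train the selector

second = AutoSklearn2Classifier(metric=balanced_accuracy)
assert not second.selectors_trained  # the second instance reused the cached selector file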

@aseemk98
Contributor Author

Hi @eddiebergman,
It's been a while, hope you're well. Can we please close this issue if the code is alright?

@eddiebergman
Contributor

Hi @aseemk98,

Very sorry for the delay, we got swamped with some other work and this slipped under the radar, thanks for bumping this!

So the way you handled case 2 makes sense and in general the tests seem all good to me :) Two points however.

When I ran the online tests, they seemed to fail when calling clean_folder at the start of each test. I guess a simple check to make sure the folders exist first would fix them. Did it work for you locally?

There's also the option to use pytest fixtures, which we tend to use often. If you're not familiar, there's a built-in one called tmp_path which auto-injects a temporary pathlib path into your test. It even cleans up and deletes the directory after the test finishes. I think it's the perfect solution for this:

def test_something(tmp_path):
    # Gives you a clean directory specifically for this test
    assert tmp_path.is_dir()
    assert len(list(tmp_path.iterdir())) == 0
    # It's deleted once the test ends

We could use it like so:

@pytest.mark.parametrize(
    "metric",
    [metric for _, metric in autosklearn.metrics.CLASSIFICATION_METRICS.items()],
)
def test_askl2_fits_selector_for_given_metrics_at_init(tmp_path, metric):
    # Note that it was added as a parameter

    # No need to clean it up
    
    # Might be good to leave a comment why this was done, doesn't seem obvious from looking
    # only at the test
    with unittest.mock.patch("os.environ.get") as mock_foo:
        mock_foo.return_value = str(tmp_path)  # Unfortunately autosklearn isn't great with pathlib
        ...
        
    # No need to clean it up

There's nothing wrong with the clean_folder approach but it ends up being slightly cleaner this way and requires less code :)

Once that issue is fixed in whichever way you prefer, I'm happy to merge! Apologies for the delay once again.

Best,
Eddie

@aseemk98
Contributor Author

Hi @eddiebergman,
I did not know about the tmp_path fixture, it's pretty neat :)
Yeah, the tests were working for me locally; that's why I had committed them. I have added the tmp_path fixture in the latest commit.

@aseemk98
Contributor Author

Hi @eddiebergman,
The tests failed again. It's unable to find askl2_training_data.json for most classification metrics that I've included in test case 1. Why is that happening? Also, it's working for me locally.

@eddiebergman
Contributor

So it makes sense that it should fail for metrics other than the three predefined ones; take a look at the experimental folder, it only has three metrics in there. As far as I'm aware, it requires meta-data, which we only have available for those 3 metrics. I'm surprised it would pass locally for you then.

I took a look at the old implementation, and it seems that if a user passed in a custom metric that wasn't one of those three predefined ones, it would use the selector for 'balanced accuracy'.

if self.metric in metrics:
    metric_name = self.metric.name
    selector_file = selector_files[metric_name]
else:
    metric_name = 'balanced_accuracy'
    selector_file = selector_files[metric_name]

I'll see how to get the automated tests to run on every push for you so that you get faster feedback on it :)

@eddiebergman
Contributor

So it appears there's not much I can do about having to manually approve workflows; it's a safety thing that GitHub has no real opt-out for. In general though, the local tests should be a good indicator of the tests here. The bigger thing to figure out would be why it passes for you locally, as it really shouldn't, given there's a selector for only these 3 metrics.

@aseemk98
Contributor Author

After making changes in the test script, do I have to make changes elsewhere for them to actually be applied? Or is there a different way to run the tests? I use this command: pytest -k "test_selector_file_askl2_can_be_created" test/test_estimators

@eddiebergman
Contributor

That command should work; you should make sure you have autosklearn installed in editable mode, i.e. cd autosklearn; pip install -e ".[test,docs,examples]". Then any local changes you make will be reflected when you run the tests.

@aseemk98
Contributor Author

Yeah, that's exactly how I had installed the dependencies. Anyway, I pushed a new commit that tests only on the default metrics we have listed in experimental. Hopefully it works now.

@aseemk98
Contributor Author

@eddiebergman
Also, when I run pytest -k "test_selector_file_askl2_can_be_created" test/test_estimators, it runs only 3 tests, but the script has many more. The output looks like this:

(askl_env) (base) aseem@LAPTOP-QQ8MS8NI:/mnt/d/work/FOSS/auto-sklearn$ pytest -k "test_selector_file_askl2_can_be_created" test/test_estimators
=============================================================================================== test session starts ===============================================================================================
platform linux -- Python 3.7.4, pytest-7.1.2, pluggy-1.0.0
rootdir: /mnt/d/work/FOSS/auto-sklearn, configfile: pyproject.toml
plugins: cases-3.6.12, cov-3.0.0, forked-1.4.0, timeout-2.1.0, xdist-2.5.0
collected 57 items / 54 deselected / 3 selected

test/test_estimators/test_estimators.py ...                                                                                                                                                                 [100%]

================================================================================================ warnings summary =================================================================================================
../askl_env/lib/python3.7/site-packages/_pytest/config/__init__.py:719
  /mnt/d/work/FOSS/askl_env/lib/python3.7/site-packages/_pytest/config/__init__.py:719: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: test.fixtures.dask
    self.import_plugin(import_spec)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================================================================== 3 passed, 54 deselected, 1 warning in 13.40s =========================================================================

Is there a reason for this? Or is it running all the required tests?

@eddiebergman
Contributor

So the filter -k "test_selector_file_askl2_can_be_created" will select only the tests with that name. You can see this from how many tests it selects: collected 57 items / 54 deselected / 3 selected.

Secondly, I see that you removed the test for a custom metric; we could keep it in and make sure it preserves the old behaviour, which was to use the selector for balanced accuracy. However, since this has been going on for a while, I'll also just be happy as soon as the tests pass :)

@aseemk98
Contributor Author

One test is still failing. Everything else passed.

@aseemk98
Contributor Author

And it's failing for a part of the code that doesn't call askl2.
@eddiebergman What should I do about this? I really want to close this issue now.

@eddiebergman
Contributor

The other one is unrelated to this PR and is a known issue we have. Everything looks good; I'll merge this in once we're ready, along with other changes we have coming :)

Thanks for the PR, and sorry for the week-long delay that happened!

@aseemk98
Contributor Author

@eddiebergman thank you as well. You were super patient and helpful! This was my first contribution and it was really fun for me :)

@eddiebergman
Contributor

Glad to hear! The test part of contributing is always difficult the first time; every project has its own test setup and I still struggle with this for every new project, no worries :)

If you'd like to give any feedback on what we could improve for future contributors (finding an issue, things missing from the contribution guide, or just general feedback of any kind that might encourage people to contribute), we would love to hear it! Feel free to do so here, or you can find my email on my profile page and drop me a message there if you would prefer :)

Also, if you ever would like to contribute again, please let us know; as a benefit, we can enable workflows to run automatically for non-first-time contributors, which means there's less stalling and waiting for direct feedback!

Best,
Eddie

@aseemk98
Contributor Author

aseemk98 commented Jun 1, 2022

Hi @eddiebergman,

I would love to take on more issues with auto-sklearn, maybe something more challenging this time :)
Thank you, once again!
When will you be closing this issue?

@eddiebergman
Contributor

Hello again,

Closing this will probably be towards the end of the month; we have some other PRs and things that are currently being used for research, and we don't want to make benchmarking harder with new changes being put in.

I'll take a look today and try to find an issue that seems manageable :)

Best,
Eddie

@eddiebergman merged commit 36c40d0 into automl:development Jun 2, 2022
eddiebergman pushed a commit that referenced this pull request Aug 18, 2022
…side _init_ (#1473)

* Encapsulated the selector training within a function and called it inside __init__

* Made changes in the test script that trains selectors before testing

* Moved global variables inside the askl2 class, changes in the test script supporting that

* Committing after running pre-commit

* Fixed issue resulting into failure of case 2

* Changed metrics to selector_metrics

* Added additional tests for askl2

* Added a boolean to the class to check for re-training of selectors

* Using tmp_path instead of tmpdir

* Removed tests for all metrics

* Removed test for custom metric
Successfully merging this pull request may close these issues.

Export asklearn2 predictions to improve start up time