Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test_get_corsika_telescope_list takes too long #1130

Closed
orelgueta opened this issue Sep 3, 2024 · 8 comments · Fixed by #1183
Closed

test_get_corsika_telescope_list takes too long #1130

orelgueta opened this issue Sep 3, 2024 · 8 comments · Fixed by #1183

Comments

@orelgueta
Copy link
Contributor

The test_get_corsika_telescope_list test takes roughly 20 seconds to run. Profiling it points to excessive time spent initializing the array model, perhaps related to reading from the DB too many times.

  ncalls  tottime  percall  cumtime  percall filename:lineno(function)
        1    0.000    0.000   21.427   21.427 /workdir/external/simtools/simtools/model/array_model.py:41(__init__)
        1    0.000    0.000   20.855   20.855 /workdir/external/simtools/simtools/model/array_model.py:69(_initialize)
       60    0.004    0.000   20.773    0.346 /workdir/external/simtools/simtools/model/model_parameter.py:49(__init__)
       58    0.002    0.000   19.823    0.342 /workdir/external/simtools/simtools/model/telescope_model.py:38(__init__)
        1    0.000    0.000    0.994    0.994 /workdir/external/simtools/simtools/model/site_model.py:27(__init__)
       61    0.001    0.000    0.574    0.009 /workdir/external/simtools/simtools/db/db_handler.py:57(__init__)

Need to look into it further (in the middle of something else, opening this issue so it is not forgotten).

@GernotMaier
Copy link
Contributor

The array_model consist of the site model, the model of all telescopes and of all calibration devices. So this is naturally that all DB calls are initiated from the array_model_class.

The number of calls to the DB are reduced by having the DatabaseHandler.*cached dicts, where the idea is to not query the same parameter twice.

Obviously always good to check that this is still working.

@orelgueta
Copy link
Contributor Author

The main difference to the old style is that now we call the DB for each telescope separately. In the past we read all of the parameters of all of the telescopes in one query (which is very quick). I suggest to extend the caching functionality to do that as well with the new style. Usually the model parameter version does not change from one telescope to the next (at least for now), so when the first telescope is read, we can fill the cache with the parameters of all telescopes for that same version. The should reduce the DB calls to one in most cases.

@orelgueta
Copy link
Contributor Author

I can confirm that when running at DESY this test takes 1 second, while running it at home takes 15-20 seconds.
On GitHub this test is also the slowest at ~50 seconds (see e.g., this job). I think that adding another cache level is therefore worth it since it will shorten the running time of the GitHub tests. It's far from urgent though, considering it is just a minute in the worst case scenario.

@GernotMaier
Copy link
Contributor

Started to have a look:

  • test_get_corsika_telescope_list is the only test for corsika_config.py which actually accesses the database (all other tests are using the fixture corsika_config_no_db; I actually introduce at some point a @pytest.mark.uses_model_database() marker)
  • so no wonder that this is the slowest function in these tests...

Given that we want to test the code and not the database connection, my suggestion is to change also this function and either use the corsika_config_no_db or some mocking (aligned with the suggestion in issue #1054). OK?

Additionally:

  • we should think about dedicated benchmark tests on accessing the DB for different usage scenarios
  • suggest that this would be an integration test and not a unit test, as it tests several modules

@orelgueta
Copy link
Contributor Author

I find the comments/suggestions a little bit contradictory. I will write comments below. They are out of order, but I think follow some logic still.

  • suggest that this would be an integration test and not a unit test, as it tests several modules

If we want to change it to be an integration test, then I wouldn't use mocking but actually connect to the DB as expected in integration tests and as we do in others.

  • we should think about dedicated benchmark tests on accessing the DB for different usage scenarios

This test actually accomplished that in a way. It showed that we are accessing the DB for each telescope separately, which is something we can probably easily avoid. I agree that having a dedicated test for accessing the DB is good though, let's try to come up with it after we decide on the structure of the DB.

Given that we want to test the code and not the database connection, my suggestion is to change also this function and either use the corsika_config_no_db or some mocking (aligned with the suggestion in issue #1054). OK?

It's fine to change the unit test to use mocking, but then I would add a separate integration test that does the same, actually using the DB. I think it's important to test this in a real scenario, considering the chain of "events" this test triggers (even though it is just one command).

@orelgueta
Copy link
Contributor Author

Oh and I would wait before implementing any changes until we implement the DB restructuring. This problem might resolve itself then because we will probably access the DB once, getting the entries based on the table of production version/parameter versions.

@GernotMaier
Copy link
Contributor

Yes and no:

  • independent of any restructuring should we avoid DB calls in unit tests
  • we need a test for the DB connections (but not a unit test)

I think both points are not affected by restructuring.

@orelgueta
Copy link
Contributor Author

Yes, sorry, I didn't mean that those points would not be valid anymore, just that this specific issue of many unnecessary calls would (hopefully) be resolved. I don't think we should add another cache at the moment (like I proposed above) because it will probably be superfluous once we implement the restructuring.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
2 participants