Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST: mark all excel tests as slow #38258

Closed
wants to merge 2 commits into from

Conversation

ivanovmg
Copy link
Member

@ivanovmg ivanovmg commented Dec 3, 2020

  • xref CI: very slow tests #38091
  • tests added / passed
  • passes black pandas
  • passes git diff upstream/master -u -- "*.py" | flake8 --diff
  • whatsnew entry

Marked all excel-related tests as slow.
There are a lot of time for setup and teardown for every test.
Overall, moving these tests to "slow" category may probably save about 5 minutes of test time for "not slow" pipelines.

@ivanovmg ivanovmg marked this pull request as draft December 3, 2020 10:15
@jorisvandenbossche
Copy link
Member

Do you have an overview of timings of the different tests? Are all of them slow, or only a subset? You mention the setup and teardown is slow, but are those needed for all tests, or can this be speed up?

I think we should try to more targetted mark certain tests as slow, not entire subsets of functionality (eg we have tests for minimum versions of dependencies, which would now not be run anymore)

@ivanovmg
Copy link
Member Author

ivanovmg commented Dec 3, 2020

I ran pytest pandas/tests/io/excel/smth.py --durations=100 and observed the output like this:

0.40s call     pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_roundtrip[xlsxwriter-.xlsx]
0.27s teardown pandas/tests/io/excel/test_writers.py::TestRoundTrip::test_set_column_names_in_parameter[.ods]
0.25s teardown pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_raise_when_saving_timezones[tzlocal()-None-openpyxl-.xlsm]
0.25s call     pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_roundtrip_indexlabels[True-xlsxwriter-.xlsx]
0.25s teardown pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_to_excel_multiindex_cols[True-odf-.ods]
0.25s setup    pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_int_types[int16-xlsxwriter-.xlsx]
0.25s teardown pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_write_lists_dict[xlwt-.xls]
0.25s call     pandas/tests/io/excel/test_writers.py::TestExcelWriter::test_roundtrip_indexlabels[False-xlsxwriter-.xlsx]

So, not that these take a massive amount of time individually,
But these are rather slow as compared to milliseconds-type tests.
So, every tests takes like 0.5-0.6 seconds and there is no particular slow test, which require an individual attention.
Combined all excel tests on my machine (ran in a single process) would take about 12 minutes (I am sure that there are at least a couple of 5-minute modules).

================= 1415 passed, 125 skipped, 10 xfailed, 1 warning in 704.44s (0:11:44) ==================

As @jreback mentioned, test run time is a concern.
So I thought that moving some of quite slow tests into a "slow" category may be a way to go, since the pipelines for the "slow" tests get executed rather fast.

However, what I do not understand is how the purpose for the pipelines is selected.
What is the difference between local_slow and slow?
Many of the excel tests may be skipped if some dependencies are missing, thus it is necessary to ensure that we do not skip anything unintentionally.

@jreback
Copy link
Contributor

jreback commented Dec 3, 2020

@ivanovmg I think its ok to move some of the excel tests to slow, but we have to be really careful because there are lots of variations that are tested, currently these are reflected in the .yaml files for the fast tests. so likely some of these variations are NOT running on the slow builds.

@jreback
Copy link
Contributor

jreback commented Dec 3, 2020

I don't think this actually moves the needle either, the main culprit is this build: https://travis-ci.org/github/pandas-dev/pandas/jobs/747392896 is still taking an hour.

@ivanovmg
Copy link
Member Author

ivanovmg commented Dec 8, 2020

All the excel tests have the same duration, so there is no opportunity to select those, which are slow in particular.
I am closing this PR as the effect of improving the performance was not achieved anyway.

@ivanovmg ivanovmg closed this Dec 8, 2020
@ivanovmg ivanovmg deleted the test/excel_slow branch December 8, 2020 03:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants