TST Check file size differences #350

BenjaminBossan · 2023-04-21T15:39:17Z

Description

We already measure the runtime performance difference, but we don't check the file size difference. This PR adds another, similar script that does exactly that. The checks are run in the same workflow as the runtime performance checks.

The results are reported, showing top 10 largest differences, once in terms of absolute, once in terms of relative differences. In contrast to the runtime performance check, there is never any error raised, no matter how big the difference is, because it is unclear what, if any, difference, would count as unacceptable.

I compressed the pickle and skops files with a high compression rate. This is a reasonable choice for the benchmark because we can assume that if file size is a concern, users would choose that option.

Results

Running this on my machine, I get:

Largest absolute difference

                             name  pickle (kb)  skops (kb)    abs_diff
0      GradientBoostingClassifier    28.260742  401.458008  373.197266
1       GradientBoostingRegressor    26.522461  398.620117  372.097656
2            ExtraTreesClassifier    55.287109  204.671875  149.384766
3             ExtraTreesRegressor   155.334961  299.827148  144.492188
4          RandomForestClassifier    21.762695  158.903320  137.140625
5                 IsolationForest    83.681641  218.454102  134.772461
6            RandomTreesEmbedding    66.018555  193.952148  127.933594
7           RandomForestRegressor    96.218750  218.662109  122.443359
8  HistGradientBoostingClassifier    12.216797   95.233398   83.016602
9   HistGradientBoostingRegressor    12.311523   94.998047   82.686523

Largest relative difference

                             name  pickle (kb)  skops (kb)   rel_diff
0       GradientBoostingRegressor    26.522461  398.620117  15.029530
1      GradientBoostingClassifier    28.260742  401.458008  14.205501
2              AdaBoostClassifier     5.845703   68.863281  11.780154
3                       SparsePCA     1.147461   11.383789   9.920851
4                  FactorAnalysis     1.434570   11.860352   8.267529
5                KBinsDiscretizer     1.763672   14.403320   8.166667
6  HistGradientBoostingClassifier    12.216797   95.233398   7.795284
7   HistGradientBoostingRegressor    12.311523   94.998047   7.716189
8                   CategoricalNB     1.371094   10.341797   7.542735
9          RandomForestClassifier    21.762695  158.903320   7.301638

Out of curiosity, I also sorted by the lowest difference and there the factor could be as small as skops being 4% larger. The mean relative difference is skops being 137% larger, median is 92% larger.

The largest differences are clearly found in decision tree ensemble estimators. This is perhaps not surprising, given that those should be the largest models in general, with 100 trees by default. Still, perhaps we can come up with a way to store trees more efficiently.

We already measure the runtime performance difference, but we don't check the file size difference. This PR adds another, similar script that does exactly that. The checks are run in the same workflow as the runtime performance checks. The results are reported, showing top 10 largest differences, once in terms of absolute, once in terms of relative differences. In contrast to the runtime performance check, there is never any error raised, no matter how big the difference is, because it is unclear what, if any, difference, would count as unacceptable. For skops, the zip file is highly compressed. This is a reasonable choice for the benchmark because we can assume that if file size is a concern, users would choose that option.

BenjaminBossan · 2023-04-21T15:51:16Z

@skops-dev/maintainers ready for review

adrinjalali · 2023-04-21T16:07:20Z

I'm happy to have this, but unlike check_persistence_performance we don't really ever raise here. Are you thinking of having this mostly as something which would inform us of the current state?

BenjaminBossan · 2023-04-21T16:59:55Z

I'm happy to have this, but unlike check_persistence_performance we don't really ever raise here. Are you thinking of having this mostly as something which would inform us of the current state?

We could have this just for checking, but if you have a suggestion for a maximum acceptable difference, I can add it.

adrinjalali · 2023-04-24T15:38:15Z

Maybe check every file is like less than 1MB, like a sanity check?

BenjaminBossan · 2023-04-25T09:25:00Z

Done

BenjaminBossan added 2 commits April 21, 2023 17:22

For fairness, also compress the pickle files

0ad4b23

Raise error if any skops file exceeds 1 mb

fa7fd14

adrinjalali approved these changes Apr 25, 2023

View reviewed changes

adrinjalali changed the title ~~Check file size differences~~ TST Check file size differences Apr 25, 2023

adrinjalali merged commit 0bcde53 into skops-dev:main Apr 25, 2023

BenjaminBossan deleted the check-file-size-differences branch May 3, 2023 10:51

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TST Check file size differences #350

TST Check file size differences #350

BenjaminBossan commented Apr 21, 2023

BenjaminBossan commented Apr 21, 2023

adrinjalali commented Apr 21, 2023

BenjaminBossan commented Apr 21, 2023

adrinjalali commented Apr 24, 2023

BenjaminBossan commented Apr 25, 2023

TST Check file size differences #350

TST Check file size differences #350

Conversation

BenjaminBossan commented Apr 21, 2023

Description

Results

Largest absolute difference

Largest relative difference

BenjaminBossan commented Apr 21, 2023

adrinjalali commented Apr 21, 2023

BenjaminBossan commented Apr 21, 2023

adrinjalali commented Apr 24, 2023

BenjaminBossan commented Apr 25, 2023