MNT: refactor model card to render sections lazily #310

BenjaminBossan · 2023-03-02T16:33:08Z

Currently, the way model cards work is that if a user adds new content,
a Section object is created and added as a (sub)section. That object
contains the formatted content of that section as str, which is then
used when rendering.

So if the user adds, e.g., a table, the table is eagerly formatted to a
str and during rendering, the str is simply returned.

With this refactor, when a user adds new content, a specific type of
section is added instead, which does not eagerly format the string.

For simple text content, this is just a PlainSection which holds the content
as a str, so no big change. However, for more complex content like
tables or figures, a special section type is added that does not contain
the final content as rendered.

When the model card is rendered, the section.format() method is called
to render the actual content, i.e. the formatting is now called lazily.
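To make the eager vs. lazy distinction concrete, here is a minimal, self-contained sketch. It is not the skops implementation: the class bodies, the `render` helper, and the hand-rolled markdown table are assumptions made for illustration. Only the names `PlainSection`, `TableSection` and `format()` come from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class PlainSection:
    """Holds text that is already in its final form."""

    title: str
    content: str = ""

    def format(self) -> str:
        # Nothing to compute: the stored str is the rendered content.
        return self.content


@dataclass
class TableSection:
    """Holds the raw table; markdown is only produced inside format()."""

    title: str
    table: dict = field(default_factory=dict)

    def format(self) -> str:
        headers = list(self.table)
        rows = zip(*self.table.values())
        lines = [
            "| " + " | ".join(headers) + " |",
            "| " + " | ".join("---" for _ in headers) + " |",
        ]
        lines += ["| " + " | ".join(str(v) for v in row) + " |" for row in rows]
        return "\n".join(lines)


def render(sections) -> str:
    # Hypothetical helper: formatting happens here, at render time,
    # not at the moment a section is added.
    return "\n\n".join(f"# {s.title}\n\n{s.format()}" for s in sections)


sections = [
    PlainSection("Intro", "A toy model card."),
    TableSection("Metrics", {"metric": ["accuracy"], "value": [0.9]}),
]
rendered = render(sections)
```

Until `render` is called, the table only exists as a dict, so it can still be inspected or changed.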

The main benefit of this refactor is that the sections in a model card
are now "aware" of what type of section they are. This allows us to more
easily modify them later. E.g. let's say we want to modify the path to a
figure or a row in a table: Right now, this would require some quite
complex regex to modify the already formatted str. With this refactor,
we can just access the attributes of the section, e.g. section.path or
section.table, and modify them.
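As a rough illustration of that point, the sketch below changes a figure path through an attribute instead of a regex. The `PlotSection` here is a hypothetical stand-in written for this example; its fields and the exact markdown it emits are assumptions, not the actual class from the PR.

```python
from dataclasses import dataclass


@dataclass
class PlotSection:
    """Hypothetical stand-in: stores the figure path, not the markdown."""

    title: str
    path: str
    alt_text: str = ""

    def format(self) -> str:
        # Fall back to the title when no alt text was given (an assumption).
        return f"![{self.alt_text or self.title}]({self.path})"


section = PlotSection(title="ROC curve", path="figures/roc.png")
before = section.format()

# Changing the figure location is a plain attribute assignment; the new
# path is picked up the next time the section is formatted.
section.path = "assets/roc.png"
after = section.format()
```

With eagerly formatted strings, the same change would mean pattern matching on `![...](...)` inside the stored text.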

A side benefit could be performance, because we don't render until
it's strictly needed, but the gain is probably negligible.

From the user perspective, this refactor shouldn't change anything as
long as they were sticking with the official API. When calling .render()
or .save(), the output is the same as before. Only if the users were
directly interacting with Sections might there be a breakage.

A small feature added in this PR is that .add_plot and .add_table now take
an additional argument, description. This way, consistent with other .add_*
methods, users can now add a description to their plot or table. If a user adds
multiple plots/tables with a single call, they will all get the same
description. So if that's not desired, users need to call the methods multiple
times.

On top of that, for plots, users can now also pass the alt_text argument,
which allows them to set a custom alt text. The same caveats as for
description apply when adding multiple figures.
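The "one description per call" caveat can be sketched with a toy card. `ToyCard` and this `PlotSection` are hypothetical miniatures written for this example, not the skops classes. The keyword-only signature and the placement of the description before the image are assumptions.

```python
from dataclasses import dataclass


@dataclass
class PlotSection:
    title: str
    path: str
    description: str = ""
    alt_text: str = ""

    def format(self) -> str:
        image = f"![{self.alt_text or self.title}]({self.path})"
        # Assumption: the description, if any, is placed before the image.
        return f"{self.description}\n\n{image}" if self.description else image


class ToyCard:
    """Hypothetical miniature card, only here to show the call pattern."""

    def __init__(self) -> None:
        self.sections = {}

    def add_plot(self, *, description="", alt_text="", **kwargs: str) -> "ToyCard":
        # Every plot passed in this call shares description and alt_text.
        for title, path in kwargs.items():
            self.sections[title] = PlotSection(title, path, description, alt_text)
        return self


card = ToyCard()
# Two plots in one call: both get the same description.
card.add_plot(description="Evaluated on the test split.", roc="roc.png", pr="pr.png")
# A separate call is needed for a different description or alt text.
card.add_plot(
    description="Rows are true labels.",
    alt_text="Confusion matrix heatmap",
    confusion="cm.png",
)
```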

Another minor visible change in this PR is that the repr of the card
changes slightly. It now indicates what type of section we're dealing
with, e.g. TableSection(...), except for a PlainSection, which, same as
previously, just returns the repr of its content.
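A small sketch of the repr behaviour described above. The classes are hypothetical and the text inside the parentheses (`rows x columns`) is made up for illustration; the PR description only says that the section type is shown.

```python
class PlainSection:
    def __init__(self, title: str, content: str = "") -> None:
        self.title = title
        self.content = content

    def __repr__(self) -> str:
        # Same as before the refactor: just the repr of the stored text.
        return repr(self.content)


class TableSection:
    def __init__(self, title: str, table: dict) -> None:
        self.title = title
        self.table = table

    def __repr__(self) -> str:
        # Name the section type; the "rows x columns" summary is an assumption.
        n_rows = len(next(iter(self.table.values()), []))
        return f"{type(self).__name__}({n_rows}x{len(self.table)})"


plain_repr = repr(PlainSection("Intro", "Some text"))
table_repr = repr(
    TableSection("Metrics", {"metric": ["acc", "f1"], "value": [0.9, 0.8]})
)
```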

Comment:

The classes TableSection and PlotSection were moved further down in the
file. Therefore, the diff might look bigger than it actually is. Still, there
are some changes to the classes as well, so please review them too.

This commit is WIP because I also added new "description" parameters to the add_* methods that were missing them so far. Now, if a user adds, say, a plot, they can optionally add a description for the plot too. This functionality lacks unit tests as of now.

BenjaminBossan · 2023-03-02T17:15:48Z

@skops-dev/maintainers Ready for review. After 5 restarts or so, codecov also passed (the "uncovered line" is a NotImplementedError in a base class...). Adrin, this refactor is what we briefly discussed earlier today.

(btw., this refactor was greatly facilitated by mypy)

adrinjalali

Nice!

Are you also adding description to all add_ methods?

adrinjalali · 2023-03-03T07:03:05Z

skops/card/_markup.py

+        title, description = "", ""
+        res = TableSection(title, description, table=table).format()


I would probably have done

Suggested change

-        title, description = "", ""
-        res = TableSection(title, description, table=table).format()
+        res = TableSection(title="", description="", table=table).format()

Done (though the argument isn't always description, even if it might serve the purpose, which is why I had it that way).

Are there missing commits you haven't pushed? Here you say done, but I don't see a change.

No, I just missed this one. Should be done now.

skops/card/_model_card.py

The add_permutation_importances method now also has a description argument. I also reworked the tests for permutation importances a bit:

- added a new test for description
- put all related tests inside a test class
- moved the permutation importance computation into a fixture
- moved to the new testing method of selecting a section and checking its content exactly instead of checking a substring in model_card.render()

Now, Section is a concrete class (taking the role of PlainSection).

BenjaminBossan

I addressed your comments.

Are you also adding description to all add_ methods?

I only see add_permutation_importance, right? For add itself, it wouldn't make sense.

While making the change for permutation importance, I also reworked the tests a bit:

- put all related tests inside a test class
- moved the permutation importance computation into a fixture
- moved to the new testing method of selecting a section and checking its content exactly instead of checking a substring in model_card.render() (more precise testing, consistent with other tests)

But the content itself of the tests remains the same as previously.
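To illustrate why selecting a section and comparing its content exactly is more precise than a substring check on the rendered card, here is a toy sketch. `ToyCard`, `Section` and the section titles are hypothetical and exist only for this example; they are not the test code from the PR.

```python
class Section:
    def __init__(self, title: str, content: str) -> None:
        self.title = title
        self.content = content

    def format(self) -> str:
        return self.content


class ToyCard:
    """Hypothetical miniature card with just enough API for the example."""

    def __init__(self) -> None:
        self.sections = {}

    def add(self, **kwargs: str) -> "ToyCard":
        for title, content in kwargs.items():
            self.sections[title] = Section(title, content)
        return self

    def select(self, title: str) -> Section:
        return self.sections[title]

    def render(self) -> str:
        return "\n\n".join(
            f"# {s.title}\n\n{s.format()}" for s in self.sections.values()
        )


card = ToyCard()
# The "importances" section is wrong on purpose; "notes" happens to
# contain the expected text.
card.add(importances="WRONG", notes="see importances.png for details")

# Substring check on the whole card: passes although the section is wrong.
substring_check = "importances.png" in card.render()

# Exact check on the selected section: catches the wrong content.
exact_check = card.select("importances").format() == "importances.png"
```

The substring check is satisfied by unrelated text elsewhere in the card, while the exact check only looks at the section under test.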

BenjaminBossan · 2023-03-03T11:11:00Z

skops/card/_model_card.py

Missing from last commit.

adrinjalali

Love this.

We merged #298 and #310 shortly after each other but they contained an incompatibility that broke the fairlearn tests (the code itself was fine). This PR fixes this incompatibility. To be clear, the only change needed to fix the tests is the following:

```python
- actual_table = card.select("Metric Frame Table").content.format()
+ actual_table = card.select("Metric Frame Table").format()
```

On top, I added the `description` argument to `add_fairlearn_metric_frame`, to be consistent with all the other methods (also changed in #310), and I also added a test for it. Since we now have 2 tests, I moved the `metric_frame` variable to a fixture.

Finally, 2 small fixes:

- Added type annotation to transpose argument
- Changed order of arguments in docstring to match order in signature

BenjaminBossan added 3 commits March 2, 2023 16:24

Add missing tests and docstrings

d2585a0

Fix repr, Windows does not use /

0669b28

adrinjalali reviewed Mar 3, 2023

BenjaminBossan added 3 commits March 3, 2023 12:00

Remove Section abstract class

87547df

Now, Section is a concrete class (taking the role of PlainSection).

Address reviewer comments

18c0855

BenjaminBossan commented Mar 3, 2023

This was referenced Mar 3, 2023

FEAT Add skops space creator app #307

Merged

Ensure that all add_* methods on Card have a description keyword #309

Closed

Reviewer comment: how to init class

b66db87

Missing from last commit.

BenjaminBossan mentioned this pull request Mar 7, 2023

Annoyance with adding figures to model cards #300

Closed

adrinjalali approved these changes Mar 7, 2023

adrinjalali merged commit 9f73d47 into skops-dev:main Mar 7, 2023

BenjaminBossan mentioned this pull request Mar 7, 2023

BUG: Fix bug in tests for fairlearn + 1 more test #313

Merged

BenjaminBossan mentioned this pull request Mar 9, 2023

Make it possible to collapse/fold any section in model card #315

Closed
