Skip to content

Commit

Permalink
MNT: refactor model card to render sections lazily (#310)
Browse files Browse the repository at this point in the history
Currently, the way model cards work is that if a user adds new content,
a `Section` object is created and added as a (sub)section. That object
contains the *formatted* content of that section as `str`, which is then
used when rendering.

So if the user adds, e.g., a table, the table is eagerly formatted to a
`str` and during rendering, the `str` is simply returned.

With this refactor, when a user adds new content, a specific type of
section is added instead, which does not eagerly format the string.

For simple text content, this is just a `PlainSection` which holds the content
as a `str`, so no big change. However, for more complex content like
tables or figures, a special section type is added that does not contain
the final content as rendered.

When the model card is rendered, the `section.format()` method is called
to render the actual content, i.e. the formatting is now called lazily.

The main benefit of this refactor is that the sections in a model card
are now "aware" of what type of section they are. This allows us to more
easily modify them later. E.g. let's say we want to modify the path to a
figure or a row in a table: Right now, this would require some quite
complex regex to modify the already formatted `str`. With this refactor,
we can just access the attributes of the section, e.g. `section.path` or
`section.table`, and modify them.

A minor benefit could also be performance, because we don't render until
it's strictly needed, but that's probably very minor.

From the user perspective, this refactor shouldn't change anything as
long as they were sticking with the official API. When calling `.render()`
or `.save()`, the output is the same as before. Only if the users were
directly interacting with `Section`s might there be a breakage.

A small feature added in this PR is that `.add_plot` and `.add_table` now take
an additional argument, `description`. This way, consistent with other `.add_*`
methods, users can now add a description to their plot or table. If a user adds
multiple plots/tables with a single call, they will all get the same
description. So if that's not desired, users need to call the methods multiple
times.

On top of that, for plots, users can now also pass the `alt_text` argument,
which allows them to set a different alt texts. Same caveats as for
`description` apply when adding multiple figures.

Another minor visible change this PR brings as well is that the `repr` of
the card changes slightly. Now it indicates what type of section we're
dealing with, e.g. `TableSection(...)`, except if it's a `PlainSection`,
which, same as previously, just returns the `repr` of its content.

*Comment*:

The classes `TableSection` and `PlotSection` were moved further down in the
file. Therefore, the diff might look bigger than it actually is. Still, there
are some changes to the classes as well, so please review them too.
  • Loading branch information
BenjaminBossan authored Mar 7, 2023
1 parent 5b3a7a4 commit 9f73d47
Show file tree
Hide file tree
Showing 3 changed files with 481 additions and 290 deletions.
2 changes: 1 addition & 1 deletion skops/card/_markup.py
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ def _table(self, item) -> str:
data_transposed = zip(*body)
table = {key: val for key, val in zip(columns, data_transposed)}

res = TableSection(table).format()
res = TableSection(title="", content="", table=table).format()
return res

def _parse_div(self, item) -> str:
Expand Down
Loading

0 comments on commit 9f73d47

Please sign in to comment.