FEAT Function to visualize skops files #317

Merged
3 changes: 3 additions & 0 deletions docs/changes.rst
Original file line number Diff line number Diff line change
Expand Up @@ -25,6 +25,9 @@ v0.6
- ``add_*`` methods on :class:`.Card` now have default section names (but
``None`` is no longer valid) and no longer add descriptions by default.
:pr:`321` by `Benjamin Bossan`_.
- Add possibility to visualize a skops object and show untrusted types by using
:func:`skops.io.visualize`. For colored output, install ``rich``:
``pip install rich``. :pr:`317` by `Benjamin Bossan`_.

v0.5
----
58 changes: 58 additions & 0 deletions docs/persistence.rst
Expand Up @@ -134,6 +134,64 @@ For example, to convert all ``.pkl`` flies in the current directory:
Further help for the different supported options can be found by calling
``skops convert --help`` in a terminal.

Visualization
#############

Skops files can be visualized using :func:`skops.io.visualize`. If you have
a skops file called ``my-model.skops``, you can visualize it like this:

.. code:: python

    import skops.io as sio
    sio.visualize("my-model.skops")

The output could look like this:

.. code::

    root: sklearn.preprocessing._data.MinMaxScaler
    └── attrs: builtins.dict
        ├── feature_range: builtins.tuple
        │   ├── content: json-type(-555)
        │   └── content: json-type(123)
        ├── copy: unsafe_lib.UnsafeType [UNSAFE]
        ├── clip: json-type(false)
        └── _sklearn_version: json-type("1.2.0")

``unsafe_lib.UnsafeType`` was recognized as untrusted and marked.
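
The tree layout shown above can be reproduced with a short recursive renderer. The following is only a sketch of that idea: ``TreeNode`` and ``render`` are hypothetical names introduced here for illustration, not the classes or functions skops uses internally.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    # Hypothetical stand-in for one node of a skops file (not a skops class).
    key: str
    label: str
    is_safe: bool = True
    children: list = field(default_factory=list)


def render(node, prefix="", is_last=True, is_root=True):
    """Return the tree as a list of text lines, marking unsafe nodes."""
    text = f"{node.key}: {node.label}" + ("" if node.is_safe else " [UNSAFE]")
    if is_root:
        lines = [text]
        child_prefix = ""
    else:
        lines = [prefix + ("└── " if is_last else "├── ") + text]
        # Keep a vertical bar only while siblings still follow this node.
        child_prefix = prefix + ("    " if is_last else "│   ")
    for i, child in enumerate(node.children):
        last_child = i == len(node.children) - 1
        lines += render(child, child_prefix, last_child, is_root=False)
    return lines


root = TreeNode("root", "sklearn.preprocessing._data.MinMaxScaler", children=[
    TreeNode("attrs", "builtins.dict", children=[
        TreeNode("feature_range", "builtins.tuple", children=[
            TreeNode("content", "json-type(-555)"),
            TreeNode("content", "json-type(123)"),
        ]),
        TreeNode("copy", "unsafe_lib.UnsafeType", is_safe=False),
        TreeNode("clip", "json-type(false)"),
        TreeNode("_sklearn_version", 'json-type("1.2.0")'),
    ]),
])

lines = render(root)
print("\n".join(lines))
```

Returning a list of lines rather than printing directly keeps the renderer easy to test and leaves the choice of output (plain text or colored) to the caller.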

It's also possible to visualize the object dumped as bytes:

.. code:: python

    import skops.io as sio
    my_model = ...
    sio.visualize(sio.dumps(my_model))

There are various options to customize the output. By default, the security of
nodes is color coded if `rich <https://github.com/Textualize/rich>`_ is
installed, otherwise they all have the same color. To install ``rich``, run:

.. code::

    python -m pip install rich

or install it together with skops as an extra:

.. code::

    python -m pip install skops[rich]

To disable colors, even if ``rich`` is installed, pass ``use_colors=False`` to
:func:`skops.io.visualize`.
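
The interplay between the ``use_colors`` flag and the optional ``rich`` dependency can be pictured as a simple two-step decision. This is a minimal sketch under the assumption that colors are attempted only when both conditions hold; ``resolve_use_colors`` is a hypothetical helper, not part of skops.

```python
import importlib.util


def resolve_use_colors(use_colors: bool = True) -> bool:
    """Decide whether colored output should be attempted.

    Hypothetical helper: colors are used only if the caller asked for them
    AND the optional `rich` package can be found in the environment.
    """
    if not use_colors:
        # An explicit opt-out wins, even when rich is installed.
        return False
    return importlib.util.find_spec("rich") is not None


print(resolve_use_colors(use_colors=False))
```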

It's also possible to change which colors are used, e.g. by passing
``visualize(..., color_safe="cyan")`` to change the color for trusted nodes from
green to cyan. The ``rich`` docs list the `supported standard colors
<https://rich.readthedocs.io/en/stable/appendix/colors.html>`_.

Note that the visualization feature is intended to help understand the structure
of the object, e.g. which attributes are identified as untrusted. It is not a
replacement for a proper security check. In particular, an innocent-looking
visualization does *not* mean you can call ``sio.load(<file>, trusted=True)``
on this object. Only pass the types you really trust to the ``trusted``
argument.
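
One way to act on this advice is to compare the reported untrusted type names against an explicit allow-list before loading anything. The sketch below assumes the list of names has already been obtained (e.g. from ``skops.io.get_untrusted_types``); ``split_by_approval`` and the example type names are hypothetical.

```python
def split_by_approval(untrusted, approved):
    """Split reported type names into reviewed and unreviewed ones."""
    accepted = sorted(t for t in untrusted if t in approved)
    rejected = sorted(t for t in untrusted if t not in approved)
    return accepted, rejected


# Hypothetical report of untrusted types found in a file.
reported = ["unsafe_lib.UnsafeType", "my_pkg.MyTransformer"]
# Types that were reviewed by hand and are considered acceptable.
approved = {"my_pkg.MyTransformer"}

accepted, rejected = split_by_approval(reported, approved)
if rejected:
    print("Refusing to load, unreviewed types:", rejected)
else:
    print("Types to pass as `trusted`:", accepted)
```

Only when ``rejected`` is empty would the ``accepted`` list be a reasonable value for the ``trusted`` argument.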

Supported libraries
-------------------
1 change: 1 addition & 0 deletions setup.py
Expand Up @@ -77,6 +77,7 @@ def setup_package():
extras_require={
"docs": min_deps.tag_to_packages["docs"],
"tests": min_deps.tag_to_packages["tests"],
"rich": min_deps.tag_to_packages["rich"],
},
include_package_data=True,
)
6 changes: 4 additions & 2 deletions skops/_min_dependencies.py
Expand Up @@ -6,7 +6,8 @@
 # 'build' and 'install' is included to have structured metadata for CI.
 # It will NOT be included in setup's extras_require
 # The values are (version_spec, comma separated tags, condition)
-# tags can be: 'build', 'install', 'docs', 'examples', 'tests', 'benchmark'
+# tags can be: 'build', 'install', 'docs', 'examples', 'tests', 'benchmark',
+# 'rich'
 # example:
 # "tomli": ("1.1.0", "install", "python_full_version < '3.11.0a7'"),
 dependent_packages = {
Expand Down Expand Up @@ -34,13 +35,14 @@
     # TODO: remove condition when catboost supports python 3.11
     "catboost": ("1.0", "tests", "python_version < '3.11'"),
     "fairlearn": ("0.7.0", "docs, tests", None),
+    "rich": ("12", "tests, rich", None),
 }


 # create inverse mapping for setuptools
 tag_to_packages: dict = {
     extra: []
-    for extra in ["build", "install", "docs", "examples", "tests", "benchmark"]
+    for extra in ["build", "install", "docs", "examples", "tests", "benchmark", "rich"]
 }
 for package, (min_version, extras, condition) in dependent_packages.items():
     for extra in extras.split(", "):
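
To see why `rich` is tagged with both `tests` and `rich`, it helps to run the inversion on a small subset of the mapping. The loop body is not shown in this diff, so the requirement-string format below is an assumption made for illustration only.

```python
# Subset of the mapping from the diff: (version_spec, tags, condition).
dependent_packages = {
    "catboost": ("1.0", "tests", "python_version < '3.11'"),
    "fairlearn": ("0.7.0", "docs, tests", None),
    "rich": ("12", "tests, rich", None),
}

tag_to_packages = {extra: [] for extra in ["docs", "tests", "rich"]}
for package, (min_version, extras, condition) in dependent_packages.items():
    for extra in extras.split(", "):
        # ASSUMPTION: the real loop body is elided in the diff; this builds
        # a plain "name>=version" requirement with an optional marker.
        requirement = f"{package}>={min_version}"
        if condition:
            requirement += f"; {condition}"
        tag_to_packages[extra].append(requirement)

print(tag_to_packages["rich"])
print(tag_to_packages["tests"])
```

Under this assumption, one entry in `dependent_packages` feeds several extras, so `rich` lands in the test requirements and in the new `rich` extra.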
13 changes: 13 additions & 0 deletions skops/conftest.py
Expand Up @@ -43,3 +43,16 @@ def mock_import(name, *args, **kwargs):
yield

import matplotlib # noqa


@pytest.fixture
def rich_not_installed():
    orig_import = builtins.__import__

    def mock_import(name, *args, **kwargs):
        if name == "rich":
            raise ImportError
        return orig_import(name, *args, **kwargs)

    with patch("builtins.__import__", side_effect=mock_import):
        yield
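
The same import-blocking technique can be tried outside pytest as a plain context manager. This is a sketch for experimentation; `block_import` and `can_import` are hypothetical names, and the standard-library module `json` stands in for `rich` so the example runs without extra packages.

```python
import builtins
from contextlib import contextmanager
from unittest.mock import patch


@contextmanager
def block_import(blocked):
    """Make importing `blocked` raise ImportError inside the with-block."""
    orig_import = builtins.__import__

    def mock_import(name, *args, **kwargs):
        # Only the exact module name is matched, as in the fixture above.
        if name == blocked:
            raise ImportError(f"{blocked} is blocked")
        return orig_import(name, *args, **kwargs)

    with patch("builtins.__import__", side_effect=mock_import):
        yield


def can_import(name):
    """Return True if importing `name` currently succeeds."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


with block_import("json"):
    blocked_result = can_import("json")
restored_result = can_import("json")
print(blocked_result, restored_result)
```

Because the patch is undone when the block exits, imports behave normally again afterwards.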
3 changes: 2 additions & 1 deletion skops/io/__init__.py
@@ -1,3 +1,4 @@
 from ._persist import dump, dumps, get_untrusted_types, load, loads
+from ._visualize import visualize

-__all__ = ["dumps", "load", "loads", "dump", "get_untrusted_types"]
+__all__ = ["dumps", "load", "loads", "dump", "get_untrusted_types", "visualize"]
4 changes: 4 additions & 0 deletions skops/io/_audit.py
Expand Up @@ -287,6 +287,10 @@ def get_unsafe_set(self) -> set[str]:

        return res

    def format(self) -> str:
        """Representation of the node's content."""
        return f"{self.module_name}.{self.class_name}"


class CachedNode(Node):
    def __init__(
8 changes: 8 additions & 0 deletions skops/io/_general.py
Expand Up @@ -482,6 +482,14 @@ def get_unsafe_set(self) -> set[str]:
    def _construct(self):
        return json.loads(self.content)

    def format(self) -> str:
        """Representation of the node's content.

        Since no module is used, just show the content.

        """
        return f"json-type({self.content})"


def bytes_get_state(obj: Any, save_context: SaveContext) -> dict[str, Any]:
    f_name = f"{uuid.uuid4()}.bin"
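
The two `format` methods added in this PR give each node type its own label in the visualization. A stripped-down sketch of that pattern follows; `SketchNode` and `SketchJsonNode` are hypothetical stand-ins that keep only the attributes the `format` methods read, not the real skops classes.

```python
import json


class SketchNode:
    """Hypothetical stand-in for a generic node with a module and class name."""

    def __init__(self, module_name, class_name):
        self.module_name = module_name
        self.class_name = class_name

    def format(self):
        return f"{self.module_name}.{self.class_name}"


class SketchJsonNode(SketchNode):
    """Hypothetical stand-in for a node that stores a JSON-encoded string."""

    def __init__(self, content):
        self.content = content

    def format(self):
        # No module is involved, so only the raw JSON content is shown.
        return f"json-type({self.content})"


labels = [
    SketchNode("sklearn.preprocessing._data", "MinMaxScaler").format(),
    SketchJsonNode(json.dumps(False)).format(),
    SketchJsonNode(json.dumps("1.2.0")).format(),
]
print(labels)
```

Since the content is kept in its JSON form, booleans appear as `false` and strings keep their double quotes, which matches the example output in the documentation.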
4 changes: 2 additions & 2 deletions skops/io/_scipy.py
Expand Up @@ -40,9 +40,9 @@ def __init__(
         trusted: bool | Sequence[str] = False,
     ) -> None:
         super().__init__(state, load_context, trusted)
-        type = state["type"]
+        self.type = state["type"]
         self.trusted = self._get_trusted(trusted, [spmatrix])
-        if type != "scipy":
+        if self.type != "scipy":
             raise TypeError(
                 f"Cannot load object of type {self.module_name}.{self.class_name}"
             )