Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC Add secure persistence docs #145

Merged
merged 20 commits into from
Sep 29, 2022

Conversation

BenjaminBossan
Copy link
Collaborator

@BenjaminBossan BenjaminBossan commented Sep 19, 2022

Fixes #139

@BenjaminBossan
Copy link
Collaborator Author

@merveenoyan Ready for review.

Copy link
Collaborator

@merveenoyan merveenoyan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left minor nits, overall it looks good but I previously read the code so I'd like @stevhliu to take a look as well!

README.rst Outdated Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
@@ -28,6 +28,29 @@


def save(obj, file):
"""Save an object using the skops persistence format.

Skops aims at providing a secure persistence feature that does not rely on
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to put warnings only to load rather than save, but feel free to ignore if you think it doesn't make sense😅

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know I went overboard with warnings, but I really wanted to make sure users don't miss the fact that we're still early in development and know that it's not secure yet. Hopefully, we can fill these gaps soon and then remove the warnings. Until then, I'd rather leave them in.

BenjaminBossan and others added 3 commits September 19, 2022 17:34
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Copy link
Contributor

@stevhliu stevhliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job and super cool docs!

docs/persistence.rst Outdated Show resolved Hide resolved
caution as for pickle should be applied: Don't load from sources that you
don't trust. In the future, more security will be added.

Skops offers a way to save and load sklearn models without using :mod:`pickle`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this opening would be much stronger if you presented it as an issue with pickle + how skops addresses it. It may not be obvious to a user why they shouldn't use pickle especially if it is so common. Something like:

The pickle module is not secure, but with skops, you can securely save and load sklearn models without using pickle.

docs/persistence.rst Outdated Show resolved Hide resolved
Skops offers a way to save and load sklearn models without using :mod:`pickle`.
Pickle is the standard serialization format for sklearn and for Python in
general. It has the big advantage that it can be used for almost all Python code
but this flexibility also means that it's inherently insecure. As the Python
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it'd be a little clearer if you also explain why its flexibility means it's insecure. Maybe something like:

but this flexibility also means that it's inherently insecure because [insert reason here].

docs/persistence.rst Outdated Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
code using :func:`eval`. We do, however, have concrete plans on how to mitigate
this. Please stay updated.

On top of trying to support all of sklearn, we plan on making persistence
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
On top of trying to support all of sklearn, we plan on making persistence
On top of trying to support persistence in sklearn, we plan on making persistence

work. However, most ``numpy`` and ``scipy`` functions should work. Therefore,
you can actually save built-in functions like``numpy.sqrt``.

Goals
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'd maybe rename this section to just Roadmap and combine it with the section below.

BenjaminBossan and others added 8 commits September 22, 2022 10:23
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
- further explain pickle insecurity
- don't use value words
- wording
- merge sections
@BenjaminBossan
Copy link
Collaborator Author

@stevhliu Thanks, your input is really appreciated. I especially tend to use value words, thanks for pointing that out. I addressed all of your points.

Copy link
Member

@adrinjalali adrinjalali left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Look good as a first iteration. Thanks @BenjaminBossan

docs/persistence.rst Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
code but this flexibility also makes it inherently insecure. This is because
loading certain types of objects requires the ability to run arbitrary code,
which can be misused for malicious purposes. For example, an attacker can use it
to steal secrets from your machine or install a virus. As the Python docs say:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

link to the python docs?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about that, but this specific paragraph has no link. However, if you follow the :mod:`pickle` link just above, you will see this paragraph. Is another link to the same place needed?

docs/persistence.rst Show resolved Hide resolved
docs/persistence.rst Outdated Show resolved Hide resolved
skops/io/_persist.py Outdated Show resolved Hide resolved
skops/io/_persist.py Outdated Show resolved Hide resolved
skops/io/_persist.py Outdated Show resolved Hide resolved
skops/io/_persist.py Outdated Show resolved Hide resolved
BenjaminBossan and others added 5 commits September 26, 2022 10:45
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
- Wording
- Use a "warning" section

Also:

- Consistent use of "``pickle``" over "pickle".
Comment on lines 22 to 23
attacker can use it to steal secrets from your machine or install a virus. As
the Python docs say:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking something like this would be enough?

Suggested change
attacker can use it to steal secrets from your machine or install a virus. As
the Python docs say:
attacker can use it to steal secrets from your machine or install a virus. As
the `Python docs <https://docs.python.org/3/library/pickle.html#module-pickle>`__ say:

This whole paragraph doesn't have any links to pickle.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This results in the exact same link as :mod:`pickle` in the paragraph directly above, but anyway, I added it

@adrinjalali adrinjalali changed the title Add secure persistence docs DOC Add secure persistence docs Sep 29, 2022
@adrinjalali adrinjalali merged commit 021c8fd into skops-dev:main Sep 29, 2022
@BenjaminBossan BenjaminBossan deleted the persistence-docs branch September 29, 2022 13:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Secure persistence: Add documentation
4 participants