-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC Add secure persistence docs #145
Conversation
@merveenoyan Ready for review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left minor nits, overall it looks good but I previously read the code so I'd like @stevhliu to take a look as well!
@@ -28,6 +28,29 @@ | |||
|
|||
|
|||
def save(obj, file): | |||
"""Save an object using the skops persistence format. | |||
|
|||
Skops aims at providing a secure persistence feature that does not rely on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to put warnings only to load rather than save, but feel free to ignore if you think it doesn't make sense😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know I went overboard with warnings, but I really wanted to make sure users don't miss the fact that we're still early in development and know that it's not secure yet. Hopefully, we can fill these gaps soon and then remove the warnings. Until then, I'd rather leave them in.
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
Co-authored-by: Merve Noyan <merveenoyan@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job and super cool docs!
caution as for pickle should be applied: Don't load from sources that you | ||
don't trust. In the future, more security will be added. | ||
|
||
Skops offers a way to save and load sklearn models without using :mod:`pickle`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this opening would be much stronger if you presented it as an issue with pickle
+ how skops
addresses it. It may not be obvious to a user why they shouldn't use pickle
especially if it is so common. Something like:
The pickle
module is not secure, but with skops
, you can securely save and load sklearn models without using pickle
.
docs/persistence.rst
Outdated
Skops offers a way to save and load sklearn models without using :mod:`pickle`. | ||
Pickle is the standard serialization format for sklearn and for Python in | ||
general. It has the big advantage that it can be used for almost all Python code | ||
but this flexibility also means that it's inherently insecure. As the Python |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it'd be a little clearer if you also explain why its flexibility means it's insecure. Maybe something like:
but this flexibility also means that it's inherently insecure because [insert reason here].
docs/persistence.rst
Outdated
code using :func:`eval`. We do, however, have concrete plans on how to mitigate | ||
this. Please stay updated. | ||
|
||
On top of trying to support all of sklearn, we plan on making persistence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On top of trying to support all of sklearn, we plan on making persistence | |
On top of trying to support persistence in sklearn, we plan on making persistence |
docs/persistence.rst
Outdated
work. However, most ``numpy`` and ``scipy`` functions should work. Therefore, | ||
you can actually save built-in functions like``numpy.sqrt``. | ||
|
||
Goals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I'd maybe rename this section to just Roadmap and combine it with the section below.
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
- further explain pickle insecurity - don't use value words - wording - merge sections
@stevhliu Thanks, your input is really appreciated. I especially tend to use value words, thanks for pointing that out. I addressed all of your points. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Look good as a first iteration. Thanks @BenjaminBossan
docs/persistence.rst
Outdated
code but this flexibility also makes it inherently insecure. This is because | ||
loading certain types of objects requires the ability to run arbitrary code, | ||
which can be misused for malicious purposes. For example, an attacker can use it | ||
to steal secrets from your machine or install a virus. As the Python docs say: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
link to the python docs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about that, but this specific paragraph has no link. However, if you follow the :mod:`pickle`
link just above, you will see this paragraph. Is another link to the same place needed?
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
Co-authored-by: Adrin Jalali <adrin.jalali@gmail.com>
- Wording - Use a "warning" section Also: - Consistent use of "``pickle``" over "pickle".
docs/persistence.rst
Outdated
attacker can use it to steal secrets from your machine or install a virus. As | ||
the Python docs say: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking something like this would be enough?
attacker can use it to steal secrets from your machine or install a virus. As | |
the Python docs say: | |
attacker can use it to steal secrets from your machine or install a virus. As | |
the `Python docs <https://docs.python.org/3/library/pickle.html#module-pickle>`__ say: |
This whole paragraph doesn't have any links to pickle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This results in the exact same link as :mod:`pickle`
in the paragraph directly above, but anyway, I added it
Fixes #139