Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cloning and serialization improvements #21

Merged
merged 1 commit into from
Oct 3, 2022
Merged

Conversation

engintoklu
Copy link
Collaborator

@engintoklu engintoklu commented Sep 28, 2022

The main purpose of this pull request is to increase the correctness when cloning and/or pickling instances of Problem, Solution, SolutionBatch, ObjectArray, and ImmutableContainer. When cloning and/or pickling, two new behaviors are introduced:
(i) PyTorch tensors are cut off from their original storages (to avoid small slices from carrying their original storages with themselves); and
(ii) self references within these mentioned objects are specially handled, and therefore, many unwanted RecursionError scenarios which could previously happen are now avoided.

New function: deep_clone(...)

This new function creates a deep copy of a Python container (which can contain other containers or PyTorch tensors or numpy arrays). While traversing through the container, any PyTorch tensors will be copied using the .clone() method, which means, in the newly created clone, the PyTorch tensors will be independent copies without extra storage.

New base class: Clonable

Any subclass of Clonable is expected to override the method _get_cloned_state(...). This method is expected to return the state of the object in a dictionary. The Clonable base class provides definitions for the methods __copy__(...) and __deepcopy__(...) where the dictionary made by _get_cloned_state(...) is used to create the clone.

New base class: Serializable

This is a subclass of Clonable, which is meant to be used as a base class. In addition to the method definitions made by Clonable (__copy__ and __deepcopy__), this base class also defines __getstate__ so that the method _get_cloned_state(...) also affects how the inheriting class is pickled.

Classes inheriting from Serializable

The following classes now inherit from Serializable:

  • Problem
  • Solution
  • SolutionBatch
  • Distribution

Each of these mentioned classes have their own definitions for _get_cloned_state(...), and use deep_clone(...) for preparing their state dictionaries. Therefore, at the moment of cloning and pickling, the stored tensors are subject to the .clone() method and they are cut off from their original storages. With this feature, when a relatively small slice of a large SolutionBatch is cloned or pickled, the serialized slice will not contain the entire storage of the original batch.

New base class: ReadOnlyClonable

This is like Clonable, but requires the inheriting class to also define a method called _get_mutable_clone() so that the object knows how its mutable counterpart can be constructed. ImmutableContainer objects are now ReadOnlyClonable.

======================================

The main purpose of this commit is to increase
the correctness when cloning and/or pickling
instances of Problem, Solution, SolutionBatch,
ObjectArray, and ImmutableContainer.
When cloning and/or pickling, two new behaviors
are introduced:
(i) PyTorch tensors are cut off from their
original storages (to avoid small slices from
carrying their original storages with
themselves); and
(ii) self references within these mentioned
objects are specially handled, and therefore,
many unwanted RecursionError scenarios which
could previously happen are now avoided.

New function: `deep_clone(...)`
===============================

This new function creates a deep copy of a
Python container (which can contain other
containers or PyTorch tensors or numpy arrays).
While traversing through the container, any
PyTorch tensors will be copied using the
`.clone()` method, which means, in the newly
created clone, the PyTorch tensors will be
independent copies without extra storage.

New base class: `Clonable`
==========================

Any subclass of `Clonable` is expected to
override the method `_get_cloned_state(...)`.
This method is expected to return the state
of the object in a dictionary.
The `Clonable` base class provides definitions
for the methods `__copy__(...)` and
`__deepcopy__(...)` where the dictionary made
by `_get_cloned_state(...)` is used to create
the clone.

New base class: `Serializable`
==============================

This is a subclass of Clonable, which is meant
to be used as a base class. In addition to the
method definitions made by `Clonable` (
`__copy__` and `__deepcopy__`), this base class
also defines `__getstate__` so that the method
`_get_cloned_state(...)` also affects how the
inheriting class is pickled.

Classes inheriting from `Serializable`
======================================

The following classes now inherit from
`Serializable`:

 - Problem
 - Solution
 - SolutionBatch
 - Distribution

Each of these mentioned classes have their own
definitions for `_get_cloned_state(...)`, and
use `deep_clone(...)` for preparing their state
dictionaries. Therefore, at the moment of
cloning and pickling, the stored tensors are
subject to the `.clone()` method and they are
cut off from their original storages.
With this feature, when a relatively small
slice of a large SolutionBatch is cloned or
pickled, the serialized slice will not
contain the entire storage of the original
batch.

New base class: `ReadOnlyClonable`
==================================

This is like `Clonable`, but requires the
inheriting class to also define a method
called `_get_mutable_clone()` so that the
object knows how its mutable counterpart
can be constructed. ImmutableContainer
objects are now ReadOnlyClonable.
@engintoklu engintoklu added bug Something isn't working enhancement New feature or request labels Sep 28, 2022
Copy link
Collaborator

@NaturalGradient NaturalGradient left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excluding the gym compatibility issues described in #22 , every existing example appears to work as expected. I have also reviewed the code and have found no notable issues except for those already discussed and clarified by @engintoklu .

@flukeskywalker flukeskywalker merged commit df034a4 into master Oct 3, 2022
@Higgcz Higgcz deleted the feature/cloning branch October 11, 2022 18:46
@Higgcz Higgcz added this to the 0.3.0 milestone Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants