Convert outputs to dict #757

BulatVakhitov · 2024-07-15T11:01:13Z

This PR makes it possible to work with outputs as dicts. The main points are:

- The idea behind this PR is to stop having to memorise that the sigmoid output is at index 0, predictions at index 1, and loss at index 2. Now you can address them by name.
- If the model has multiple predictions, it is now the user's responsibility to parse them.
- You can still pass outputs as a list, string, or callable, but under the hood they are all converted to a dict. The resulting processed output is a dict if outputs is a dict, a list if outputs is a list, and a tensor otherwise.
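
The conversion described above can be illustrated with a minimal sketch. This is not the code from the PR: the helper names `normalize_outputs` and `restore_container` are hypothetical, plain strings stand in for tensors, and keying callables by `__name__` is an assumption.

```python
def normalize_outputs(outputs):
    """Convert a user-passed `outputs` (str, callable, list or dict) to a dict.

    Hypothetical helper, for illustration only.
    """
    if isinstance(outputs, dict):
        return dict(outputs)
    if isinstance(outputs, str) or callable(outputs):
        outputs = [outputs]
    # Strings are keyed by themselves, callables by their `__name__` (assumption)
    return {item if isinstance(item, str) else item.__name__: item
            for item in outputs}


def restore_container(outputs, result):
    """Return `result` in the container type that the user originally passed."""
    if isinstance(outputs, dict):
        return result
    if isinstance(outputs, list):
        return list(result.values())
    # A single string or callable: return the only computed value
    return next(iter(result.values()))


requested = ['predictions', 'loss']
# Placeholder strings stand in for real tensors
computed = {name: f'<{name} tensor>' for name in normalize_outputs(requested)}
print(restore_container(requested, computed))
# → ['<predictions tensor>', '<loss tensor>']
```

With a dict such as `{'probs': 'sigmoid'}` the same two helpers would hand back a dict keyed by `'probs'`, which is the name-based access the description is after.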

SergeyTsimfer · 2024-07-16T08:17:21Z

batchflow/models/torch/base.py

+            Each string defines a tensor to get and should be one of:
+                - pre-defined tensors, which are `predictions`, `loss`, and `predictions_{i}` for multi-output models.
+                - pre-defined operations, which are `softplus`, `sigmoid`, `sigmoid_uint8`, `sigmoid_int16`, 
+                `proba`, `labels`. Work only with len(predictions) == 1
+                - layer id, which describes how to access the layer through a series of `getattr` and `getitem` calls.
+                Allows to get intermediate activations of a neural network.
+            Each callable defines a function that should be applied to predictions.
+            If outputs are dict, then keys are strings and they are considered as output_names. The values should be
+            either callables, pre-defined tensors, pre-defined operations or layer id.


A tensor to get and should be one of:
- a string, which can be ...
  - ...
  - ...
- a callable, ...
- a sequence, where each item is one of the previous types. Result of this method is guaranteed to have the same order of elements;
- a dict, where each value is one of the previous types. Result of this method is a dictionary with the same keys and requested tensors as values.

SergeyTsimfer · 2024-07-16T08:45:53Z

batchflow/models/torch/base.py

+        if 'predictions' in outputs_dict.values():
+            # in case there are multiple output_names with the same operation == `predictions`. Same for `loss`
+            predictions_names_list = [output_name for output_name, operation in outputs_dict.items() \
+                                      if operation == 'predictions']
+            output_container.update({output_name: predictions for output_name in predictions_names_list})
+        if 'loss' in outputs_dict.values():
+            losses_names_list = [output_name for output_name, operation in outputs_dict.items() \
+                                 if operation == 'loss']
+            output_container.update({output_name: loss for output_name in losses_names_list})


for output_name, requested in outputs_dict.items():
    if requested == 'predictions':
        output_container[output_name] = predictions
    elif requested == 'loss':
        output_container[output_name] = loss

IMO, looks much simpler. Can be further reduced to 4 lines.
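
For reference, a self-contained sketch of what a shorter form of that loop might look like. The function name `collect_named_outputs` and the lookup table `available` are hypothetical, and the placeholder strings stand in for tensors; this is one possible reading of the suggestion, not the code that was merged.

```python
def collect_named_outputs(outputs_dict, predictions, loss):
    """Map every output name that requests `predictions` or `loss` to its value.

    Other requests (operations, callables, layer ids) are left for later steps.
    """
    available = {'predictions': predictions, 'loss': loss}
    return {output_name: available[requested]
            for output_name, requested in outputs_dict.items()
            if isinstance(requested, str) and requested in available}


outputs_dict = {
    'preds': 'predictions',
    'same_preds': 'predictions',   # several names may request the same tensor
    'error': 'loss',
    'probs': 'sigmoid',            # not handled here
}
output_container = collect_named_outputs(outputs_dict, predictions='P', loss='L')
print(output_container)
# → {'preds': 'P', 'same_preds': 'P', 'error': 'L'}
```

A single pass over the items also handles several names requesting the same tensor, so the separate name lists from the diff are not needed.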

SergeyTsimfer · 2024-07-16T08:47:07Z

batchflow/models/torch/base.py

+            elif isinstance(outputs, list):
+                result = list(result.values())


what about tuple / set?
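
One possible answer, sketched under assumptions rather than taken from the PR: ordered sequences get their own type back, while a set, which has no order to match results against, is returned as a dict keyed by output name. The function name `restore_like` is hypothetical, and handling of a single string or callable is left out.

```python
def restore_like(outputs, result):
    """Return `result` (dict: output name -> value) in a container mirroring `outputs`."""
    if isinstance(outputs, list):
        return list(result.values())
    if isinstance(outputs, tuple):
        return tuple(result.values())
    # dicts, sets and anything else: keep the name -> value mapping,
    # since there is no positional order to rely on
    return result


result = {'predictions': 'P', 'loss': 'L'}
print(restore_like(('predictions', 'loss'), result))
# → ('P', 'L')
```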

SergeyTsimfer · 2024-07-16T08:49:56Z

batchflow/models/torch/base.py

-    def compute_outputs(self, predictions):
-        """ Produce additional outputs, defined in the config, from `predictions`.
-        Also adds a number of aliases to predicted tensors.
+    def compute_outputs(self, predictions, operations):


Probably, can move code for adding predictions / loss here

SergeyTsimfer · 2024-07-16T08:50:51Z

batchflow/models/torch/base.py

+            elif isinstance(operation, LayerHook):
+                operation.close()
+                result = operation.activation
+            elif isinstance(operation, str) and re.match(r"predictions_[0-9]+", operation):


either don't use regexp here or compile it once at the module / class level
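
Both options can be sketched as follows. The names `PREDICTIONS_PATTERN`, `parse_prediction_index` and `parse_prediction_index_no_regex` are hypothetical. Note one deliberate difference from the diff: `fullmatch` is used instead of `match`, so a string like `predictions_0abc` is rejected.

```python
import re

# Compiled once at import time instead of on every call
PREDICTIONS_PATTERN = re.compile(r"predictions_[0-9]+")


def parse_prediction_index(operation):
    """Return the index from `predictions_{i}`, or None if it does not match."""
    if isinstance(operation, str) and PREDICTIONS_PATTERN.fullmatch(operation):
        return int(operation.rsplit('_', 1)[1])
    return None


def parse_prediction_index_no_regex(operation):
    """Same check with plain string methods, without a regular expression."""
    prefix = 'predictions_'
    if isinstance(operation, str) and operation.startswith(prefix):
        suffix = operation[len(prefix):]
        # `isascii` guards against non-ASCII digit characters
        if suffix.isascii() and suffix.isdigit():
            return int(suffix)
    return None


print(parse_prediction_index('predictions_2'))
# → 2
```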

SergeyTsimfer · 2024-07-16T08:51:22Z

batchflow/models/torch/base.py

            else:
+                if isinstance(predictions, (tuple, list)) and not len(predictions) == 1:
+                    raise ValueError('Default operations can`t be applicable to multi output predictions.')


can't be applied, maybe?

SergeyTsimfer · 2024-07-16T08:51:57Z

batchflow/models/torch/base.py

-        """ Add the hooks to all outputs that look like a layer id. """
-        result = []
-        for output_name in outputs:
+        """ Add the hooks to all outputs that look like a layer id. Also convert outputs to dict"""


SergeyTsimfer · 2024-07-16T08:53:08Z

batchflow/models/torch/base.py

+                elif callable(output):
+                    processed_outputs[output.__name__] = output


Why is that needed? callables are hashable
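
A small illustration of the point, with made-up callables: functions can be used as dict keys directly, and keying by `__name__` makes distinct lambdas collide because they all share the name `<lambda>`.

```python
def double(x):
    return x * 2


# Hypothetical user-passed callables
callables = [double, lambda x: x + 1, lambda x: x - 1]

by_name = {func.__name__: func for func in callables}
by_object = {func: func for func in callables}

print(len(by_name), len(by_object))
# → 2 3
```

Whether the object itself makes a convenient key for the user is a separate question, since the returned dict would then have to be indexed by the same function object.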

SergeyTsimfer

Other than the naming, this PR looks great. If possible, make a better distinction between internal/external variables, and it is ready to merge.

Good job:)

SergeyTsimfer · 2024-07-17T11:08:18Z

batchflow/models/torch/base.py

-    def compute_outputs(self, predictions):
-        """ Produce additional outputs, defined in the config, from `predictions`.
-        Also adds a number of aliases to predicted tensors.
+    def compute_outputs(self, predictions, operations, targets=None, loss=None):


I would say that renaming operations to requested_outputs would make things much easier to understand. Or maybe to outputs_dict, to keep in line with the rest of the code

SergeyTsimfer · 2024-07-17T11:09:21Z

batchflow/models/torch/base.py

+                elif targets is not None:
+                    targets = self.transfer_to_device(targets)
+                    loss = self.loss(predictions, targets)
+                    result[name] = loss


What do you think about keeping this part in the _predict method?

SergeyTsimfer · 2024-07-17T11:11:48Z

batchflow/models/torch/base.py

                else:
                    raise ValueError(f'Unknown type of operation `{operation}`!')
-                name = operation
-        return result, name
+        return result


    def prepare_outputs(self, outputs):


prepare_outputs sounds very much like compute_outputs, while the intent here is to "prepare the user-passed argument `outputs` to a form that we use internally". Do you see any better names?

Convert outputs to dict

97b31e1

BulatVakhitov requested a review from a team July 15, 2024 11:01

BulatVakhitov added 3 commits July 16, 2024 08:12

fix merging

65f91fd

Merge branch 'master' into fix_outputs

e23a47a

change tensor to predictions

e3b0428

SergeyTsimfer reviewed Jul 16, 2024

BulatVakhitov added 5 commits July 16, 2024 12:41

fix PR comments

c5b67e8

remove extract_outputs

38d68aa

fix docstrings

3d68e21

fix case when outputs None

23cd149

small fix

d68122c

SergeyTsimfer reviewed Jul 17, 2024

some renaming

3608ce0

BulatVakhitov requested a review from SergeyTsimfer July 19, 2024 12:36

SergeyTsimfer approved these changes Jul 22, 2024

AlexeyKozhevin approved these changes Jul 29, 2024

SergeyTsimfer merged commit 86fe1f3 into master Jul 29, 2024
37 checks passed

SergeyTsimfer deleted the fix_outputs branch July 29, 2024 08:50
