Add documentation regarding neural type system. #370

Merged: 8 commits, Feb 20, 2020
Changes from all commits
2 changes: 2 additions & 0 deletions CHANGELOG.md

@@ -70,6 +70,8 @@ To release a new version, please update the changelog as follows:

## [Unreleased]

### Added
- New Neural Type System documentation. Also added a decorator to generate docs for input/output ports.
  ([PR #370](https://github.com/NVIDIA/NeMo/pull/370)) - @okuchaiev
- New Neural Type System and its tests.
  ([PR #307](https://github.com/NVIDIA/NeMo/pull/307)) - @okuchaiev
- Named tensors tuple module's output for graph construction.
14 changes: 8 additions & 6 deletions CONTRIBUTING.md

@@ -4,9 +4,9 @@

2) Make sure you sign your commits, e.g. use ``git commit -s`` when committing.

3) Make sure all unit tests finish successfully before sending a PR: run ``python -m unittest`` from NeMo's root folder.

4) Send your Pull Request to the `master` branch.


# Collection Guidelines

@@ -28,9 +28,8 @@ Please note that CI needs to pass for all the modules and collections.

1. **Sensible**: code should make sense. If you think a piece of code might be confusing, write comments.

## Python style
We use ``black`` as our style guide. To check whether your code will pass the style check, run ``python setup.py style`` from NeMo's repo folder; if it does not pass, run ``python setup.py style --fix``.

1. Avoid wild imports: ``from X import *``, unless ``__all__`` is defined in ``X.py``.
1. Minimize the use of ``**kwargs``.

@@ -47,7 +46,10 @@

1. If a comment spans multiple lines, use ``'''`` instead of ``#``.

## Nemo style
1. Use absolute paths.
1. Before accessing something, always make sure that it exists.
1. Use the right inheritance. For example, if a module doesn't have any trainable weights, don't inherit from TrainableNM.
1. Keep naming consistent, both within NeMo and between NeMo and the external literature. E.g. use the name ``logits`` for ``log_probs``, and ``hidden_size`` for ``d_model``.
1. Make an effort to use the right Neural Types when designing your neural modules. If a type you need does not exist, you can introduce one. See the documentation on how to do this.
1. When creating input/output ports for your modules, use the ``add_port_docs`` decorator to generate docs for them.
174 changes: 138 additions & 36 deletions docs/sources/source/tutorials/neuraltypes.rst

@@ -1,63 +1,166 @@
Neural Types
============

Basics
~~~~~~

All input and output ports of every neural module in NeMo are typed.
The type system's goal is to check the compatibility of connected input/output port pairs.
The type system's constraints are checked when the user connects modules with each other, before any training or
inference is started.

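For instance (a schematic sketch; the module and port names here are hypothetical), the check fires at graph-construction time, when modules are called on each other's outputs:

.. code-block:: python

    # hypothetical modules; types are checked on these calls, not during training
    audio_signal, a_sig_length = data_layer()
    processed_signal, processed_length = preprocessor(
        input_signal=audio_signal, length=a_sig_length
    )
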
Neural Types are implemented with the Python class :class:`NeuralType<nemo.core.neural_types.NeuralType>` and helper
classes derived from :class:`ElementType<nemo.core.neural_types.ElementType>`, :class:`AxisType<nemo.core.neural_types.AxisType>`
and :class:`AxisKindAbstract<nemo.core.neural_types.AxisKindAbstract>`.

**A Neural Type contains two categories of information:**

* **axes** - represents what varying a particular axis means (e.g. batch, time, etc.)
* **elements_type** - represents the semantics and properties of what is stored inside the activations (audio signal, text embedding, logits, etc.)

To instantiate a NeuralType, you pass it the following arguments: ``axes: Optional[Tuple] = None,
elements_type: ElementType = VoidType(), optional=False``. Typically, the only place where you need to instantiate
:class:`NeuralType<nemo.core.neural_types.NeuralType>` objects is inside your module's ``input_ports`` and
``output_ports`` properties.

Consider the example below. It represents the output ports of an (audio) data layer, as used in the speech recognition collection.

.. code-block:: python

    {
        'audio_signal': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),
                                         AxisType(kind=AxisKind.Time, size=None, is_list=False)),
                                   elements_type=AudioSignal(freq=self._sample_rate)),
        'a_sig_length': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),),
                                   elements_type=LengthsType()),
        'transcripts': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),
                                        AxisType(kind=AxisKind.Time, size=None, is_list=False)),
                                  elements_type=LabelsType()),
        'transcript_length': NeuralType(axes=(AxisType(kind=AxisKind.Batch, size=None, is_list=False),),
                                        elements_type=LengthsType()),
    }

A less verbose version of exactly the same output ports looks like this:

.. code-block:: python

    {
        'audio_signal': NeuralType(('B', 'T'), AudioSignal(freq=self._sample_rate)),
        'a_sig_length': NeuralType(tuple('B'), LengthsType()),
        'transcripts': NeuralType(('B', 'T'), LabelsType()),
        'transcript_length': NeuralType(tuple('B'), LengthsType()),
    }
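
To situate these definitions, here is a minimal sketch (the class and the 16 kHz rate are illustrative, not an actual NeMo module) of where such dictionaries typically live; the corresponding ``output_ports`` property is defined the same way:

.. code-block:: python

    from nemo.backends.pytorch.nm import NonTrainableNM
    from nemo.core.neural_types import AudioSignal, LengthsType, NeuralType
    from nemo.utils.decorators import add_port_docs


    class MyAudioConsumer(NonTrainableNM):
        """Hypothetical module consuming batches of audio signals."""

        @property
        @add_port_docs()
        def input_ports(self):
            """Returns definitions of module input ports."""
            return {
                'audio_signal': NeuralType(('B', 'T'), AudioSignal(freq=16000)),
                'a_sig_length': NeuralType(tuple('B'), LengthsType()),
            }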

Neural type comparison
~~~~~~~~~~~~~~~~~~~~~~

Two :class:`NeuralType<nemo.core.neural_types.NeuralType>` objects are compared using the ``.compare`` method.
The result is a :class:`NeuralTypeComparisonResult<nemo.core.neural_types.NeuralTypeComparisonResult>`:

.. code-block:: python

    class NeuralTypeComparisonResult(Enum):
        """The result of comparing two neural type objects for compatibility.
        When comparing A.compare_to(B):"""

        SAME = 0
        LESS = 1  # A is B
        GREATER = 2  # B is A
        DIM_INCOMPATIBLE = 3  # Resize connector might fix incompatibility
        TRANSPOSE_SAME = 4  # A transpose and/or converting between lists and tensors will make them same
        CONTAINER_SIZE_MISMATCH = 5  # A and B contain different number of elements
        INCOMPATIBLE = 6  # A and B are incompatible
        SAME_TYPE_INCOMPATIBLE_PARAMS = 7  # A and B are of the same type but parametrized differently

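For example (a minimal sketch; the sample rates are arbitrary, and the last assertion reflects a reading of the enum above):

.. code-block:: python

    from nemo.core.neural_types import AudioSignal, NeuralType, NeuralTypeComparisonResult

    t1 = NeuralType(('B', 'T'), AudioSignal(freq=16000))
    t2 = NeuralType(('B', 'T'), AudioSignal(freq=16000))
    assert t1.compare(t2) == NeuralTypeComparisonResult.SAME

    # same element type, parametrized differently
    t3 = NeuralType(('B', 'T'), AudioSignal(freq=8000))
    assert t1.compare(t3) == NeuralTypeComparisonResult.SAME_TYPE_INCOMPATIBLE_PARAMS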

Special cases
~~~~~~~~~~~~~

* **Void** element types. Sometimes it is necessary to have functionality similar to ``void*`` in C/C++. That is, we still want to enforce the order and semantics of axes, but want to accept elements of any type. This is achieved by using an instance of :class:`VoidType<nemo.core.neural_types.VoidType>` as the ``elements_type`` argument (see the sketch below).
* **Big void.** This type effectively disables all type checks. It is created with ``NeuralType()``. The result of comparing it to any other type is always SAME.
* **AxisKind.Any**. This axis kind is used to represent any axis. This is useful, for example, in losses, where the same loss module can be used in different applications and therefore with different axis kinds.
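
A short sketch of the first two cases (the shapes are illustrative):

.. code-block:: python

    from nemo.core.neural_types import NeuralType, VoidType

    # void element type: axis order and semantics are still enforced,
    # but elements of any type are accepted
    flexible = NeuralType(('B', 'T'), VoidType())

    # "big void": disables type checks entirely; compares as SAME to everything
    anything = NeuralType()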

Inheritance
~~~~~~~~~~~

Type inheritance is a very powerful tool in programming. NeMo's neural types support inheritance. Consider the
following example.

**Example.** We want to represent the following: module A's output (out1) produces a mel-spectrogram
signal, while module B's output (out2) produces an MFCC spectrogram. We also want a third module, C, which can perform data
augmentation with any kind of spectrogram. With NeMo neural types, representing these semantics is easy:

.. code-block:: python

    input = NeuralType(('B', 'D', 'T'), SpectrogramType())
    out1 = NeuralType(('B', 'D', 'T'), MelSpectrogramType())
    out2 = NeuralType(('B', 'D', 'T'), MFCCSpectrogramType())

    # then the following comparison results will be generated
    input.compare(out1) == SAME
    input.compare(out2) == SAME
    out1.compare(input) == INCOMPATIBLE
    out2.compare(out1) == INCOMPATIBLE

This happens because both ``MelSpectrogramType`` and ``MFCCSpectrogramType`` inherit from the ``SpectrogramType`` class.
Notice that mel and MFCC spectrograms aren't interchangeable, which is why ``out2.compare(out1) == INCOMPATIBLE``; likewise, a generic spectrogram isn't guaranteed to be a mel spectrogram, so ``out1.compare(input) == INCOMPATIBLE``.

Advanced usage
~~~~~~~~~~~~~~

**Extending with user-defined types.** If you need to add your own element types, create a new class inheriting from
:class:`ElementType<nemo.core.neural_types.ElementType>`. Instead of using the built-in axis kinds from
:class:`AxisKind<nemo.core.neural_types.AxisKind>`, you can define your own
by creating a new Python enum inheriting from :class:`AxisKindAbstract<nemo.core.neural_types.AxisKindAbstract>`.
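
For instance, a user-defined element type might look like this (``PitchContourType`` is a hypothetical example, not part of NeMo):

.. code-block:: python

    from nemo.core.neural_types import ElementType


    class PitchContourType(ElementType):
        """Hypothetical element type: per-frame pitch values of a signal."""

        def __str__(self):
            return "pitch contour values over time"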

**Lists.** Sometimes a module's input or output should be a (possibly nested) list of Tensors. NeMo's
:class:`AxisType<nemo.core.neural_types.AxisType>` class accepts an ``is_list`` argument, which can be set to True.
Consider the example below:

.. code-block:: python

    T1 = NeuralType(
        axes=(
            AxisType(kind=AxisKind.Batch, size=None, is_list=True),
            AxisType(kind=AxisKind.Time, size=None, is_list=True),
            AxisType(kind=AxisKind.Dimension, size=32, is_list=False),
            AxisType(kind=AxisKind.Dimension, size=128, is_list=False),
            AxisType(kind=AxisKind.Dimension, size=256, is_list=False),
        ),
        elements_type=ChannelType(),
    )

In this example, the first two axes are lists. That is, the object is a list of lists of rank-3 tensors with dimensions
(32x128x256). Note that all list axes must come before any tensor axis.

.. tip::
    We strongly recommend avoiding lists where possible and using tensors instead, perhaps with padding.

See ``nemo/tests/test_neural_types.py`` for more examples.

**Named tuples (structures).** To represent struct-like objects, for example, bounding boxes in computer vision, use
the following syntax:

.. code-block:: python

    class BoundingBox(ElementType):
        def __str__(self):
            return "bounding box from detection model"

        def fields(self):
            return ("X", "Y", "W", "H")


    # also add a new, user-defined axis kind
    class AxisKind2(AxisKindAbstract):
        Image = 0


    T1 = NeuralType(
        elements_type=BoundingBox(),
        axes=(
            AxisType(kind=AxisKind.Batch, size=None, is_list=True),
            AxisType(kind=AxisKind2.Image, size=None, is_list=True),
        ),
    )

In the example above, we create a special "element type" class for ``BoundingBox``, which stores exactly 4 values.
We also add our own axis kind (Image). The final Neural Type (T1) then represents lists (for batch) of lists (for
image) of bounding boxes. Under the hood it is a list of lists of 4x1 tensors.


**Neural Types help us to debug models**
@@ -76,6 +179,5 @@

For example, a module should concatenate (add) two input tensors X and Y along dim...

A module expects an image of size 224x224 but gets 256x256. The type comparison will result in ``NeuralTypeComparisonResult.DIM_INCOMPATIBLE``.


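A minimal sketch of that mismatch (assuming the built-in ``Height``/``Width`` axis kinds; sizes chosen to match the example):

.. code-block:: python

    from nemo.core.neural_types import (
        AxisKind,
        AxisType,
        ChannelType,
        NeuralType,
        NeuralTypeComparisonResult,
    )

    expected = NeuralType(
        axes=(
            AxisType(kind=AxisKind.Batch, size=None, is_list=False),
            AxisType(kind=AxisKind.Channel, size=None, is_list=False),
            AxisType(kind=AxisKind.Height, size=224, is_list=False),
            AxisType(kind=AxisKind.Width, size=224, is_list=False),
        ),
        elements_type=ChannelType(),
    )
    received = NeuralType(
        axes=(
            AxisType(kind=AxisKind.Batch, size=None, is_list=False),
            AxisType(kind=AxisKind.Channel, size=None, is_list=False),
            AxisType(kind=AxisKind.Height, size=256, is_list=False),
            AxisType(kind=AxisKind.Width, size=256, is_list=False),
        ),
        elements_type=ChannelType(),
    )
    assert expected.compare(received) == NeuralTypeComparisonResult.DIM_INCOMPATIBLE
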
11 changes: 7 additions & 4 deletions nemo/backends/pytorch/common/losses.py

@@ -3,6 +3,7 @@

from nemo.backends.pytorch.nm import LossNM
from nemo.core.neural_types import LabelsType, LogitsType, LossType, NeuralType, RegressionValuesType
from nemo.utils.decorators import add_port_docs

__all__ = ['SequenceLoss', 'CrossEntropyLoss', 'MSELoss']

@@ -32,18 +33,16 @@ class SequenceLoss(LossNM):
    """

    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.
        """
        return {'log_probs': NeuralType(axes=('B', 'T', 'D')), 'targets': NeuralType(axes=('B', 'T'))}

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.
        """
        return {"loss": NeuralType(elements_type=LossType())}

@@ -103,6 +102,7 @@ class CrossEntropyLoss(LossNM):
    """

    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.
        """
@@ -112,6 +112,7 @@ def input_ports(self):
        }

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.

@@ -133,6 +134,7 @@ def _loss_function(self, logits, labels):

class MSELoss(LossNM):
    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.

@@ -148,6 +150,7 @@ def input_ports(self):
        }

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.
5 changes: 5 additions & 0 deletions nemo/backends/pytorch/common/rnn.py

@@ -23,6 +23,7 @@

from nemo.backends.pytorch.common.parts import Attention
from nemo.backends.pytorch.nm import TrainableNM
from nemo.core import *
from nemo.utils.decorators import add_port_docs
from nemo.utils.misc import pad_to

__all__ = ['DecoderRNN', 'EncoderRNN']

@@ -65,6 +66,7 @@ class DecoderRNN(TrainableNM):
    """

    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.
        """
@@ -78,6 +80,7 @@ def input_ports(self):
        }

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.
        """
@@ -203,6 +206,7 @@ class EncoderRNN(TrainableNM):
    """ Simple RNN-based encoder using GRU cells """

    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.
        """
@@ -214,6 +218,7 @@ def input_ports(self):
        }

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.
        """
3 changes: 3 additions & 0 deletions nemo/backends/pytorch/common/search.py

@@ -4,6 +4,7 @@

from nemo.backends.pytorch.nm import NonTrainableNM
from nemo.core.neural_types import ChannelType, NeuralType
from nemo.utils.decorators import add_port_docs

INF = float('inf')
BIG_NUM = 1e4

@@ -29,6 +30,7 @@ class GreedySearch(NonTrainableNM):
    """

    @property
    @add_port_docs()
    def input_ports(self):
        """Returns definitions of module input ports.
        """
@@ -40,6 +42,7 @@ def input_ports(self):
        }

    @property
    @add_port_docs()
    def output_ports(self):
        """Returns definitions of module output ports.