Fix broken links (#8825)
* Fix broken links

* reverting the change to on prem requirements

creating a separate PR for this

* Fix broken links

* zapier

* resolve feedback

* resolve issues

* resolve issues

* try to remove attributions by renaming it

* restore link to swagger REST API
tara-det-ai authored Feb 21, 2024
1 parent bccdf0c commit dc3e41e
Showing 45 changed files with 188 additions and 214 deletions.
2 changes: 1 addition & 1 deletion docs/get-started/architecture/introduction.rst
@@ -859,5 +859,5 @@ Reference
---------

- YAML: https://learnxinyminutes.com/docs/yaml/
- Validate YAML: http://www.yamllint.com/
- Validate YAML: https://www.yamllint.com/
- Convert YAML to JSON: https://www.json2yaml.com/convert-yaml-to-json
2 changes: 1 addition & 1 deletion docs/get-started/example-solutions/_index.rst
@@ -6,7 +6,7 @@

Start with an example machine learning model converted to Determined's APIs. Code examples are in
the ``examples/`` subdirectory of the `Determined GitHub repo
<https://github.com/determined-ai/determined/tree/master/examples>`__. Download links are below.
<https://github.com/determined-ai/determined/tree/main/examples>`__. Download links are below.

For more examples, visit the `determined-examples repo
<https://github.com/determined-ai/determined-examples/>`__.
5 changes: 2 additions & 3 deletions docs/integrations/notification/zapier.rst
@@ -19,9 +19,8 @@ The steps to set up Zapier webhook are:
Creating a Zap with Webhook
*****************************

First, you need to create a Zap with webhook. Visit the `Zapier Website
<https://zapier.com/app/zaps>`_, signup if you haven't already, and click on the **Create Zap**
button.
First, you need to create a Zap with a webhook. Visit `Zapier <https://zapier.com/>`_, sign up if
you haven't already, and click on the **Create Zap** button.

Select **Webhooks by Zapier** as the trigger and **Catch Raw Hook** as the event. Use **Catch Raw
Hook** instead of **Catch Hook** because headers are needed to verify each webhook request.
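
The verification itself amounts to recomputing an HMAC over the raw request body and comparing it
with the signature header. A minimal sketch, assuming an HMAC-SHA256 scheme and an illustrative
header name (check the webhook security settings of your cluster for the actual scheme and header):

.. code:: python

   # Illustrative sketch of signature verification inside a "Code by Zapier" step.
   # The signing scheme and header name are assumptions, not the exact Determined ones.
   import hashlib
   import hmac

   def verify(raw_body: bytes, signature: str, signing_key: str) -> bool:
       expected = hmac.new(signing_key.encode(), raw_body, hashlib.sha256).hexdigest()
       return hmac.compare_digest(expected, signature)
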
4 changes: 2 additions & 2 deletions docs/integrations/prometheus/_index.rst
@@ -26,7 +26,7 @@ can be enabled in the master configuration file.
Reference
***********

`Grafana <https://grafana.com/docs/grafana/latest/installation/>`__
`Grafana <https://grafana.com/docs/grafana/latest/setup-grafana/installation/>`__

`Prometheus <https://prometheus.io/docs/prometheus/latest/installation/>`__

@@ -156,7 +156,7 @@ source. After the Grafana server is running and the Web UI is accessible, follow
running Prometheus server address. By default, this is the machine address on port 9090.

#. After the Prometheus data source connects, import the `Determined Hardware Metrics dashboard JSON
<https://github.com/determined-ai/works-with-determined/blob/master/observability/grafana/determined-hardware-grafana.json>`__
<https://github.com/determined-ai/works-with-determined/blob/main/observability/grafana/determined-hardware-grafana.json>`__
file in **Grafana** -> **Create** -> **Import** -> **Import using panel JSON**.

*********
9 changes: 4 additions & 5 deletions docs/manage/elasticsearch-logging-backend.rst
@@ -5,15 +5,14 @@
##############################

Use this guide as a reference when considering a shift from the default logging backend to
`Elasticsearch <https://www.elastic.co/what-is/elasticsearch>`__ for optimized log storage and
analysis.
`Elasticsearch <https://www.elastic.co/elasticsearch>`__ for optimized log storage and analysis.

We'll discuss the limitations of the default logging backend and provide tips and guidelines for
migrating to Elasticsearch including how to tune Elasticsearch to work best with Determined.

`Elasticsearch <https://www.elastic.co/what-is/elasticsearch>`__ is a search engine commonly used
for storing application logs for search and analytics. Determined supports using Elasticsearch as
the storage backend for task logs. Configuring Determined to use Elasticsearch is simple; however,
`Elasticsearch <https://www.elastic.co/elasticsearch>`__ is a search engine commonly used for
storing application logs for search and analytics. Determined supports using Elasticsearch as the
storage backend for task logs. Configuring Determined to use Elasticsearch is simple; however,
managing an Elasticsearch cluster at scale is an involved task, so this guide is recommended for
users who have hit the limitations of the default logging backend.

8 changes: 4 additions & 4 deletions docs/manage/troubleshooting.rst
@@ -38,9 +38,9 @@ Make sure you back up the database and temporarily shut down the master before p

To fix this error message, locate the up migration with a suffix of ``.up.sql`` and a prefix
matching the long number in the error message in `this directory
<https://github.com/determined-ai/determined/tree/master/master/static/migrations>_` and carefully
run the SQL within the file manually against the database used by Determined. For convenience, all
the information needed to connect except the password can be found with:
<https://github.com/determined-ai/determined/tree/main/master/static/migrations>_` and carefully run
the SQL within the file manually against the database used by Determined. For convenience, all the
information needed to connect except the password can be found with:

.. code::
@@ -53,7 +53,7 @@ If this proceeds successfully, then mark the migration as successful by running
UPDATE schema_migrations SET dirty = false;
And restart the master. Otherwise, please seek assistance in the community `Slack
<https://join.slack.com/t/determined-community/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__.
<https://determined-community.slack.com/join/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__.
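
If you prefer to drive the same fix from a script, a minimal sketch (assuming ``psycopg2`` is
available; the connection details and the migration file name below are placeholders):

.. code:: python

   # Sketch: apply the matching .up.sql migration, then clear the dirty flag.
   import psycopg2

   conn = psycopg2.connect(host="DB_HOST", dbname="determined", user="postgres", password="...")
   with conn, conn.cursor() as cur:
       with open("20240101000000_example.up.sql") as f:  # hypothetical migration file name
           cur.execute(f.read())
       cur.execute("UPDATE schema_migrations SET dirty = false;")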

.. _validate-nvidia-container-toolkit:

10 changes: 5 additions & 5 deletions docs/model-dev-guide/api-guides/apis-howto/api-core-ug-basic.rst
@@ -36,7 +36,7 @@ This user guide shows you how to get started using the Core API.

Access the tutorial files via the :download:`core_api.tgz </examples/core_api.tgz>` download or
directly from the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api>`_.

*****************
Getting Started
@@ -132,7 +132,7 @@ with only a few new lines of code.

The complete ``1_metrics.py`` and ``1_metrics.yaml`` listings used in this example can be found in
the :download:`core_api.tgz </examples/core_api.tgz>` download or in the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api>`_.
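
For orientation, the metrics step reduces to a pattern like the following sketch, where
``train_one_batch`` and ``validate`` stand in for real training code:

.. code:: python

   import determined as det

   with det.core.init() as core_context:
       for steps_completed in range(100):
           loss = train_one_batch()  # placeholder for the actual training step
           core_context.train.report_training_metrics(
               steps_completed=steps_completed, metrics={"loss": loss}
           )
       core_context.train.report_validation_metrics(
           steps_completed=100, metrics={"accuracy": validate()}  # placeholder metric
       )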

.. _core-checkpoints:

@@ -212,7 +212,7 @@ trial ID in the checkpoint and use it to distinguish the two types of continues.

The complete ``2_checkpoints.py`` and ``2_checkpoints.yaml`` listings used in this example can be
found in the :download:`core_api.tgz </examples/core_api.tgz>` download or in the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api>`_.
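
The checkpointing step follows roughly this pattern (a sketch; the metadata and the ``state`` file
format are illustrative):

.. code:: python

   import pathlib

   import determined as det

   info = det.get_cluster_info()
   steps_completed = 10
   with det.core.init() as core_context:
       with core_context.checkpoint.store_path({"steps_completed": steps_completed}) as (
           path,
           storage_id,
       ):
           # Record the trial ID so a later run can distinguish a pause/resume
           # from an experiment that continues from this checkpoint.
           (pathlib.Path(path) / "state").write_text(f"{steps_completed},{info.trial.trial_id}")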

.. _core-hpsearch:

@@ -292,7 +292,7 @@ runs a train-validate-report loop:
The complete ``3_hpsearch.py`` and ``3_hpsearch.yaml`` listings used in this example can be found in
the :download:`core_api.tgz </examples/core_api.tgz>` download or in the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api>`_.
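
The search loop itself looks roughly like this sketch, with ``train_one_batch`` and ``validate``
standing in for real code:

.. code:: python

   import determined as det

   with det.core.init() as core_context:
       hparams = det.get_cluster_info().trial.hparams
       for op in core_context.searcher.operations():
           for steps_completed in range(1, op.length + 1):
               train_one_batch(hparams)         # placeholder training step
               op.report_progress(steps_completed)
           op.report_completed(validate())      # placeholder value of the searcher metric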

.. _core-distributed:

@@ -420,4 +420,4 @@ considerations are:
The complete ``4_distributed.py`` and ``3_hpsearch.yaml`` listings used in this example can be found
in the :download:`core_api.tgz </examples/core_api.tgz>` download or in the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api>`_.
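
The distributed setup reduces to building a distributed context before calling ``det.core.init()``
(a sketch, assuming the experiment is launched with Determined's ``torch_distributed`` launcher):

.. code:: python

   import determined as det

   distributed = det.core.DistributedContext.from_torch_distributed()
   with det.core.init(distributed=distributed) as core_context:
       if core_context.distributed.rank == 0:
           print("only the chief worker reports metrics and uploads checkpoints")
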
4 changes: 2 additions & 2 deletions docs/model-dev-guide/api-guides/apis-howto/api-core-ug.rst
@@ -66,7 +66,7 @@ Create a new directory.

Access the tutorial files via the :download:`core_api_pytorch_mnist.tgz
</examples/core_api_pytorch_mnist.tgz>` download link or directly from the `Github repository
<https://github.com/determined-ai/determined/tree/master/examples/tutorials/core_api_pytorch_mnist>`_.
<https://github.com/determined-ai/determined/tree/main/examples/tutorials/core_api_pytorch_mnist>`_.
These scripts have already been modified to fit the steps outlined in this tutorial.

In this initial step, we’ll run our experiment using the ``model_def.py`` script and its
@@ -521,7 +521,7 @@ skipping batch 1, warming up on batch 2, profiling batches 3 and 4, then repeati
files will be uploaded to the experiment's TensorBoard path and can be viewed under the "PyTorch
Profiler" tab in the Determined Tensorboard UI.

See `PyTorch Profiler <https://github.com/pytorch/kineto/tree/master/tb_plugin>`_ documentation for
See `PyTorch Profiler <https://github.com/pytorch/kineto/tree/main/tb_plugin>`_ documentation for
details.

.. code:: python
14 changes: 7 additions & 7 deletions docs/model-dev-guide/api-guides/apis-howto/api-pytorch-ug.rst
@@ -268,7 +268,7 @@ finding the common code snippet: ``for batch in dataloader``. In Determined,
:meth:`~determined.pytorch.PyTorchTrial.train_batch` also works with one batch at a time.

Take `this script implemented with native PyTorch
<https://github.com/pytorch/examples/blob/master/imagenet/main.py>`_ as an example. It has the
<https://github.com/pytorch/examples/blob/main/imagenet/main.py>`_ as an example. It has the
following code for the training loop.

.. code:: python
@@ -410,8 +410,8 @@ training") is easy if you follow a few rules.
- Even if you are going to ultimately return an IterableDataset, it is best to use PyTorch's
Sampler class as the basis for choosing the order of records. Operations on Samplers are quick
and cheap, while operations on data afterwards are expensive. For more details, see the
discussion of random vs sequential access `here <https://yogadl.readthedocs.io>`_. If you don't
have a custom sampler, start with a simple one:
discussion of random vs sequential access `here <https://yogadl.readthedocs.io/en/latest/>`_. If
you don't have a custom sampler, start with a simple one:

.. code:: python
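
   # Illustrative sketch, not the original listing: a simple sequential sampler
   # over a map-style dataset (``my_dataset`` is a placeholder).
   from torch.utils.data import SequentialSampler

   sampler = SequentialSampler(my_dataset)
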
@@ -568,8 +568,8 @@ Remove Pinned GPUs
Determined handles scheduling jobs on available slots. However, you need to let the Determined
library handle choosing the GPUs.

Take `this script <https://github.com/pytorch/examples/blob/master/imagenet/main.py>`_ as an
example. It has the following code to configure the GPU:
Take `this script <https://github.com/pytorch/examples/blob/main/imagenet/main.py>`_ as an example.
It has the following code to configure the GPU:

.. code:: python
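
   # Illustrative sketch, not the original listing: the kind of GPU pinning such a
   # script typically does (``args.gpu`` comes from its argument parser) and which
   # should be removed when moving to Determined.
   import torch

   torch.cuda.set_device(args.gpu)
   model = model.cuda(args.gpu)
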
@@ -585,8 +585,8 @@ To run distributed training outside Determined, you need to have code that handl
launching processes, moving models to pinned GPUs, sharding data, and reducing metrics. You need to
remove this code so that it does not conflict with the Determined library.

Take `this script <https://github.com/pytorch/examples/blob/master/imagenet/main.py>`_ as an
example. It has the following code to initialize the process group:
Take `this script <https://github.com/pytorch/examples/blob/main/imagenet/main.py>`_ as an example.
It has the following code to initialize the process group:

.. code:: python
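
   # Illustrative sketch, not the original listing: hand-rolled process-group
   # initialization of this kind (the ``args`` values come from the script's
   # argument parser) conflicts with Determined and should be removed.
   import torch.distributed as dist

   dist.init_process_group(
       backend=args.dist_backend,
       init_method=args.dist_url,
       world_size=args.world_size,
       rank=args.rank,
   )
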
@@ -15,9 +15,9 @@ In this guide, you'll learn how to use the DeepSpeed API.
| :ref:`deepspeed-reference` |
+-----------------------------------------------------------------------+

`DeepSpeed <https://deepspeed.ai/>`_ is a Microsoft library that supports large-scale, distributed
learning with sharded optimizer state training and pipeline parallelism. Determined supports
DeepSpeed with the :class:`~determined.pytorch.deepspeed.DeepSpeedTrial` API.
`DeepSpeed <https://www.deepspeed.ai/>`_ is a Microsoft library that supports large-scale,
distributed learning with sharded optimizer state training and pipeline parallelism. Determined
supports DeepSpeed with the :class:`~determined.pytorch.deepspeed.DeepSpeedTrial` API.
:class:`~determined.pytorch.deepspeed.DeepSpeedTrial` provides a way to use an automated training
loop with DeepSpeed.

@@ -23,7 +23,7 @@ engine passed to :meth:`~determined.pytorch.deepspeed.DeepSpeedTrialContext.wrap

For more advanced cases where model engines have different model parallel topologies, contact
support on the Determined `community Slack
<https://join.slack.com/t/determined-community/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`_.
<https://determined-community.slack.com/join/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`_.

*****************
Custom Reducers
@@ -126,7 +126,7 @@ instance requires no further changes to your code.

For a complete example of how to use DeepSpeed Autotune with ``DeepSpeedTrial``, visit the
`Determined GitHub Repo
<https://github.com/determined-ai/determined/tree/master/examples/deepspeed_autotune/torchvision/deepspeed_trial>`__
<https://github.com/determined-ai/determined/tree/main/examples/deepspeed_autotune/torchvision/deepspeed_trial>`__
and navigate to ``examples/deepspeed_autotune/torchvision/deepspeed_trial`` .

.. note::
@@ -164,7 +164,7 @@ so there is no need to remove the context manager after the ``dsat`` trials have

For a complete example of how to use DeepSpeed Autotune with Core API, visit the `Determined GitHub
Repo
<https://github.com/determined-ai/determined/tree/master/examples/deepspeed_autotune/torchvision/core_api>`__
<https://github.com/determined-ai/determined/tree/main/examples/deepspeed_autotune/torchvision/core_api>`__
and navigate to ``examples/deepspeed_autotune/torchvision/core_api`` .

Hugging Face Trainer
@@ -215,8 +215,8 @@ relevant code:
``dsat_reporting_context`` context manager.

To find examples that use DeepSpeed Autotune with Hugging Face Trainer, visit the `Determined GitHub
Repo <https://github.com/determined-ai/determined/tree/master/examples/hf_trainer_api>`__ and
navigate to ``examples/hf_trainer_api``.
Repo <https://github.com/determined-ai/determined/tree/main/examples/hf_trainer_api>`__ and navigate
to ``examples/hf_trainer_api``.

******************
Advanced Options
@@ -67,7 +67,7 @@ DeepSpeed training initialization consists of two steps:
#. Create the DeepSpeed model engine.

Refer to the `DeepSpeed Getting Started guide
<https://www.deepspeed.ai/getting-started/#writing-deepspeed-models/>`_ for more information.
<https://www.deepspeed.ai/getting-started/#writing-deepspeed-models>`_ for more information.

Outside of Determined, this is typically done in the following way:

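A typical sketch, assuming the standard ``deepspeed.initialize`` entry point:

.. code:: python

   import deepspeed

   # Illustrative only: build the model engine by hand outside of Determined.
   model_engine, optimizer, _, lr_scheduler = deepspeed.initialize(
       args=args,  # parsed command-line arguments (placeholder)
       model=model,
       model_parameters=model.parameters(),
   )
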
@@ -318,7 +318,7 @@ method.
passed directly into ``torch.profiler.profile``. Stepping the profiler will be handled automatically
during the training loop.

See the `PyTorch profiler plugin <https://github.com/pytorch/kineto/tree/master/tb_plugin>`_ for
See the `PyTorch profiler plugin <https://github.com/pytorch/kineto/tree/main/tb_plugin>`_ for
details.

The snippet below will profile GPU and CPU usage, skipping batch 1, warming up on batch 2, and
4 changes: 0 additions & 4 deletions docs/model-dev-guide/api-guides/batch-process-api-ug.rst
@@ -169,10 +169,6 @@ You have the option to associate your batch inference run with the
:class:`~determined.experimental.model.ModelVersion` employed during the run. This allows you to
compile custom metrics for that specific object, which can then be analyzed at a later stage.

The ``inference_example.py`` file in the `CIFAR10 Pytorch Example
<https://github.com/determined-ai/determined/tree/main/examples/computer_vision/cifar10_pytorch>`__
is an example.

Connect the :class:`~determined.experimental.checkpoint.Checkpoint` or
:class:`~determined.experimental.model.ModelVersion` to the inference run.

6 changes: 3 additions & 3 deletions docs/model-dev-guide/debug-models.rst
@@ -117,9 +117,9 @@ This step assumes you have a working local environment for training. If you do n
If your per-method checks in :ref:`Step 2 <step2>` passed but local test mode fails, your
``Trial`` subclass might not be implemented correctly. Double-check the documentation. It is also
possible that you have found a bug or an invalid assumption in the Determined software and should
`file a GitHub issue <https://github.com/determined-ai/determined/issues/new>`__ or contact
`file a GitHub issue <https://github.com/determined-ai/determined/issues>`__ or contact
Determined on `Slack
<https://join.slack.com/t/determined-community/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__.
<https://determined-community.slack.com/join/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__.

.. _step4:

@@ -298,7 +298,7 @@ interactive environment, it is submitted to the cluster and managed by Determine
has errors. Review the :ref:`experiment configuration <experiment-config-reference>`.

If you are unable to identify the cause of the problem, contact Determined `community support
<https://join.slack.com/t/determined-community/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__!
<https://determined-community.slack.com/join/shared_invite/zt-1f4hj60z5-JMHb~wSr2xksLZVBN61g_Q>`__!

.. _step8:

20 changes: 0 additions & 20 deletions docs/model-dev-guide/dtrain/dtrain-implement.rst
@@ -226,23 +226,3 @@ important details regarding ``slots_per_trial`` and the scheduler's behavior:
``slots_per_trial`` is set so that it can be scheduled within these constraints. You can also use
the CLI command ``det task list`` to check if any other tasks are using GPUs and preventing your
experiment from using all the GPUs on a machine.

***********************
Distributed Inference
***********************

PyTorch users have the option to use the existing distributed training workflow with PyTorchTrial to
accelerate their inference workloads. This workflow is not yet officially supported, therefore,
users must specify certain training-specific artifacts that are not used for inference. To run a
distributed batch inference job, create a new PyTorchTrial and follow these steps:

- Load the trained model and build the inference dataset using ``build_validation_data_loader()``.
- Specify the inference step using ``evaluate_batch()`` or ``evaluate_full_dataset()``.
- Register a dummy ``optimizer``.
- Specify a ``build_training_data_loader()`` that returns a dummy dataloader.
- Specify a no-op ``train_batch()`` that returns an empty map of metrics.

Once the new PyTorchTrial object is created, use the experiment configuration to distribute
inference in the same way as training. `cifar10_pytorch_inference
<https://github.com/determined-ai/determined/blob/master/examples/computer_vision/cifar10_pytorch_inference/>`_
serves as an example of distributed batch inference.
9 changes: 5 additions & 4 deletions docs/model-dev-guide/dtrain/dtrain-introduction.rst
@@ -8,10 +8,11 @@
How Determined Distributed Training Works
*******************************************

Determined employs data parallelism in its approach to distributed training. Data parallelism for
deep learning consists of a set of workers, where each worker is assigned to a unique compute
accelerator such as a GPU or a TPU. Each worker maintains a copy of the model parameters (weights
that are being trained), which is synchronized across all the workers at the start of training.
Determined employs data or model parallelism in its approach to distributed training. Data
parallelism for deep learning consists of a set of workers, where each worker is assigned to a
unique compute accelerator such as a GPU or a TPU. Each worker maintains a copy of the model
parameters (weights that are being trained), which is synchronized across all the workers at the
start of training.

.. image:: /assets/images/_dtrain-loop-dark.png
:class: only-dark
@@ -7,7 +7,7 @@ object in the Trial base class. This :class:`~determined.TrialContext` object ex
:func:`~determined.TrialContext.get_hparam` method that takes the hyperparameter name. For example,
to inject the value of the ``dropout_probability`` hyperparameter defined in the experiment
configuration into the constructor of a PyTorch `Dropout
<https://pytorch.org/docs/stable/nn.html#dropout>`_ layer:
<https://pytorch.org/docs/stable/generated/torch.nn.Dropout.html>`_ layer:

.. code:: python
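
   # Illustrative sketch, not the original listing (``context`` is the TrialContext):
   self.dropout = torch.nn.Dropout(context.get_hparam("dropout_probability"))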