docs: [FE-270] add PBS known issue - Cluster tab does not display GPU information #8719

Merged
jagadeesh545 merged 3 commits into main from fe-270 on Jan 20, 2024

Conversation

Contributor

@jagadeesh545 jagadeesh545 commented Jan 19, 2024

Description

Add a new PBS known issue "Cluster tab does not display GPU information" including a link to the PBS requirements section.
This is a follow-up to PR #8714.

Test Plan

Tested manually
image

Commentary (optional)

Checklist

  • Changes have been manually QA'd
  • User-facing API changes need the "User-facing API Change" label.
  • Release notes should be added as a separate file under docs/release-notes/.
    See Release Note for details.
  • Licenses should be included for new code which was copied and/or modified from any external code.

Ticket

FE-270

@cla-bot cla-bot bot added the cla-signed label Jan 19, 2024

netlify bot commented Jan 19, 2024

Deploy Preview for determined-ui canceled.

Name Link
🔨 Latest commit 8e65249
🔍 Latest deploy log https://app.netlify.com/sites/determined-ui/deploys/65aada5cbf8c8d000864a0c3

@determined-ci determined-ci requested a review from a team January 19, 2024 19:36
@determined-ci determined-ci added the documentation Improvements or additions to documentation label Jan 19, 2024
@jagadeesh545 jagadeesh545 requested a review from a team January 19, 2024 19:36
- If the ``Cluster`` tab on the DeterminedAI WebUI does not display the GPU information, there
could be an issue in the PBS configuration. Please refer to :ref:`PBS Requirements
<pbs-config-requirements>` to ensure PBS is configured properly.

Member

- If the ``Cluster`` tab in the WebUI does not display the GPU information, there
  may be an issue with the PBS configuration. Visit :ref:`PBS Requirements
  <pbs-config-requirements>` to ensure PBS is properly configured.

Member

@jagadeesh545 avoid DeterminedAI as it is reserved for talking about the company itself. Determined is reserved for talking about the product. However, we can avoid it altogether. In the future we may also avoid WebUI but for now it's the most valid term.

Contributor Author

Sure, Tara. I updated as per your suggestions.

Member

@tara-det-ai tara-det-ai left a comment

Requested edits

@@ -381,6 +381,10 @@ Some constraints are due to differences in behavior between Docker and Singulari
PBS Known Issues
******************

- If the ``Cluster`` tab on the DeterminedAI WebUI does not display the GPU information, there

Contributor

Is it DeterminedAI or Determined AI?

Is WebUI the appropriate wording to use?

Contributor Author

I removed DeterminedAI as per Tara's suggestion. I think WebUI is fine here. Please let me know why you think it might be an inappropriate word. I can change it if needed.

@@ -381,6 +381,10 @@ Some constraints are due to differences in behavior between Docker and Singulari
PBS Known Issues
******************

- If the ``Cluster`` tab on the DeterminedAI WebUI does not display the GPU information, there
could be an issue in the PBS configuration. Please refer to :ref:`PBS Requirements
<pbs-config-requirements>` to ensure PBS is configured properly.

Contributor

Where in the PBS Requirements (https://docs.determined.ai/latest/setup-cluster/slurm/slurm-requirements.html#pbs-requirements) would a user find the step necessary to get the GPU information displayed?

Contributor Author

PBS Requirements lists the important configuration steps necessary along with commands and sample outputs. That should help the users to ensure PBS is configured properly.

Contributor

If you were a user that experienced this problem, where in the PBS Requirements would you find the solution?

Contributor Author

Updated the ref link to point to the exact section for solving the issue.

@@ -381,6 +381,10 @@ Some constraints are due to differences in behavior between Docker and Singulari
PBS Known Issues
******************

- If the ``Cluster`` tab on the DeterminedAI WebUI does not display the GPU information, there
could be an issue in the PBS configuration. Please refer to :ref:`PBS Requirements

Contributor

Can we provide a little more information as to what the specific issue may be?

Contributor Author

Please let me know what kind of information you would like added, and I can add it.

Contributor

I think this documentation update is in response to the PBS GPU issue you just worked on, where the GPU information shows as "" if not all vnodes have the same GPUs, or something along those lines, right? If so, where in the documentation you are adding is the solution to that problem described?

Contributor Author

Updated the ref link to point to the exact section for solving the issue.

@determined-ci determined-ci requested a review from a team January 19, 2024 19:54
@rcorujo
Contributor

rcorujo commented Jan 20, 2024

I had been looking at the PBS Requirements in the Determined AI online documentation and didn't see any reference to GPUs. I didn't realize that you had another pull request #8714 that added GPUs to the PBS Requirements. Would have been good if this PR would have referenced that other PR, then things would have been clearer. Thanks.

@jagadeesh545
Contributor Author

> I had been looking at the PBS Requirements in the Determined AI online documentation and didn't see any reference to GPUs. I didn't realize that you had another pull request #8714 that added GPUs to the PBS Requirements. Would have been good if this PR would have referenced that other PR, then things would have been clearer. Thanks.

Sorry Rig. I updated the description with the related PR link.

@jagadeesh545 jagadeesh545 merged commit 190af1d into main Jan 20, 2024
72 of 83 checks passed
@jagadeesh545 jagadeesh545 deleted the fe-270 branch January 20, 2024 15:58
maxrussell pushed a commit that referenced this pull request Mar 21, 2024
Labels
cla-signed documentation Improvements or additions to documentation
4 participants