[develop] The service partition no longer works on Hera and Jet following the Slurm updates #1011

Closed
MichaelLueken opened this issue Feb 8, 2024 · 0 comments · Fixed by #1012
Labels
bug Something isn't working

Comments

@MichaelLueken
Collaborator

Expected behavior

The get_extrn_ics and get_extrn_lbcs tasks, as well as the tasks that retrieve verification observations, should complete successfully on all platforms.

Current behavior

These tasks now fail on Hera and Jet following the Slurm updates on both machines.

Machines affected

Currently, only Hera and Jet

Steps To Reproduce

  1. Clone and build the SRW app on either machine.
  2. Run any test that utilizes the service partition.
  3. The job will fail to launch the get_* tasks.

Detailed Description of Fix (optional)

To correct this behavior, update the SCHED_NATIVE_CMD_HPSS entry in the workflow to include either -n 1 or --ntasks 1. This fixes get_extrn_ics and get_extrn_lbcs. For the verification observation tasks, update the native entries for task_get_obs_ccpa, task_get_obs_nohrsc, task_get_obs_mrms, and task_get_obs_ndas in parm/wflow/verify_pre.yaml.
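
As a rough illustration of the change described above, the sketch below appends a single-task request to a Slurm native-options string unless one is already present. The helper name `ensure_single_task` and the sample value `--export=NONE` are hypothetical and used only for illustration. They are not taken from the SRW App code or machine files.

```python
import shlex


def ensure_single_task(native: str) -> str:
    """Return `native` with `--ntasks 1` appended unless a task count is already set.

    Hypothetical helper: it only illustrates the shape of the fix, not how the
    SRW App workflow actually builds its native scheduler options.
    """
    tokens = shlex.split(native)
    for tok in tokens:
        # Recognize the forms: -n 1, -n1, --ntasks 1, --ntasks=1
        if (
            tok in ("-n", "--ntasks")
            or tok.startswith("--ntasks=")
            or (tok.startswith("-n") and tok[2:].isdigit())
        ):
            return native  # a task count is already requested; leave untouched
    return " ".join(tokens + ["--ntasks", "1"])


# Assumed sample value for a native-options string on the service partition.
updated = ensure_single_task("--export=NONE")
```

The same idea would apply to each `native` entry of the get_obs tasks: the string is left alone if it already requests a task count, and otherwise gains an explicit request for one task.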
