Parallel IO issue #1234

Closed
Hae-CheolKim-NOAA opened this issue May 26, 2022 · 7 comments
Labels
bug Something isn't working

Comments

@Hae-CheolKim-NOAA

Description

With the new build of the code I cloned on May 23, 2022, I advanced a run for 1/12-th degree DATM (CDEPS)-MOM6-CICE6 on HERA but ran into a parallel IO issue. It looks like this has something to do with the parallel netCDF library.

...
0: Abort with message NetCDF: Variable not found in file /scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c at line 1158
40: Abort with message NetCDF: Variable not found in file /scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c at line 1158
1: Abort with message NetCDF: Variable not found in file /scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c at line 1158
2: Abort with message NetCDF: Variable not found in file /scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c at line 1158
41: Abort with message NetCDF: Variable not found in file /scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c at line 1158
...
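
Since the excerpt is truncated, it can help to tally the abort lines from the full job log and confirm that every rank fails with the same message at the same source location. The following is a minimal sketch, assuming each abort line follows the format shown above; the function name `summarize_aborts` and the regular expression are illustrative and not part of PIO or the model.

```python
import re
from collections import Counter

# Assumed line format, taken from the log excerpt above:
#   "<rank>: Abort with message <message> in file <path> at line <n>"
ABORT_RE = re.compile(
    r"^\s*(?P<rank>\d+):\s+Abort with message (?P<msg>.+?)"
    r" in file (?P<path>\S+) at line (?P<line>\d+)\s*$"
)


def summarize_aborts(lines):
    """Group abort lines by (message, source file name, source line).

    Returns a Counter of occurrences and a dict mapping each group
    to the list of MPI ranks that reported it. Lines that do not
    match the assumed format (such as "...") are ignored.
    """
    counts = Counter()
    ranks = {}
    for text in lines:
        m = ABORT_RE.match(text)
        if m is None:
            continue
        source_file = m["path"].rsplit("/", 1)[-1]  # keep only the file name
        key = (m["msg"], source_file, int(m["line"]))
        counts[key] += 1
        ranks.setdefault(key, []).append(int(m["rank"]))
    return counts, ranks


if __name__ == "__main__":
    pio = "/scratch2/NCEPDEV/nwprod/hpc-stack/src/develop/pkg/pio-2.5.3/src/clib/pio_nc.c"
    sample = [
        "...",
        f"0: Abort with message NetCDF: Variable not found in file {pio} at line 1158",
        f"40: Abort with message NetCDF: Variable not found in file {pio} at line 1158",
        f"1: Abort with message NetCDF: Variable not found in file {pio} at line 1158",
    ]
    counts, ranks = summarize_aborts(sample)
    for key, n in counts.items():
        print(n, key, sorted(ranks[key]))
```

If all ranks fall into a single group, the failure is most likely one missing variable lookup rather than several independent problems.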

To Reproduce:

What compilers/machines are you seeing this with? INTEL/HERA
Give explicit steps to reproduce the behavior.

  1. Compile the model code at /scratch2/NCEPDEV/marine/Hae-Cheol.Kim/ufs-weather-model
  2. Run the test case /scratch1/NCEPDEV/stmp2/Hae-Cheol.Kim/FV3_RT/rt_191768/GLBb0.08_038_gefs

Additional context

Add any other context about the problem here.
Directly reference any issues or PRs in this or other repositories that this is related to, and describe how they are related. Example:

  • needs to be fixed also in noaa-emc/nems/issues/<issue_number>
  • needed for noaa-emc/fv3atm/pull/<pr_number>
@Hae-CheolKim-NOAA Hae-CheolKim-NOAA added the bug Something isn't working label May 26, 2022
@DeniseWorthen
Collaborator

What is the status of this issue?

@Hae-CheolKim-NOAA
Author

Not resolved yet.

@arunchawla-NOAA

@edwardhartnett Bringing this to your attention. Do you have any suggestions for @Hae-CheolKim-NOAA?

@edwardhartnett
Contributor

This is a PIO error, not a netCDF error.

Have we tried with the latest release of PIO and netCDF?

It's complaining about a variable not being found. Which lines of code are being called?

Is there a test code that demonstrates this problem? If not, could you generate one please? That is, a single-file Fortran program which does the same thing the model is doing, and generates the same error?

@Hae-CheolKim-NOAA
Author

I tested the new source cloned on Feb 15, 2023. The issue did not recur, and I was able to compile and initialize the 1/12-th degree DATM (CDEPS)-MOM6-CICE6 on HERA. Sorry for the delay in updating the status, and thanks for your support.

@junwang-noaa
Collaborator

@Hae-CheolKim-NOAA It looks to me like the issue is resolved. Can we close it?

@Hae-CheolKim-NOAA
Author

Hae-CheolKim-NOAA commented Mar 27, 2023 via email

TerrenceMcGuinness-NOAA added a commit to NOAA-EMC/global-workflow that referenced this issue Oct 16, 2024