increase wtime_make_orog for RRFS NA 3km on WCOSS #622

Merged
chan-hoo merged 6 commits into ufs-community:develop from feature/orog_wcoss
Oct 27, 2021

Collaborator

@chan-hoo chan-hoo commented Oct 25, 2021

DESCRIPTION OF CHANGES:

  • Because make_orog exceeds its wall-clock limit and fails on WCOSS, WTIME_MAKE_OROG is increased to 1 hour.
  • PPN_RUN_FCST is removed from the WE2E configuration for RRFS NA 3km.
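
For readers unfamiliar with the setting, the change amounts to something like the excerpt below. This is only a sketch: the file it would live in and the HH:MM:SS value format are assumptions, not taken from the diff of this PR.

```shell
# Hypothetical excerpt of the WE2E test configuration for the RRFS NA 3km grid.
# Assumption: wall-clock limits are written as HH:MM:SS strings.
WTIME_MAKE_OROG="01:00:00"   # 1-hour wall-clock limit for the make_orog task

# PPN_RUN_FCST is deliberately not set here any more, so whatever value
# the workflow would otherwise choose is used instead.
echo "WTIME_MAKE_OROG=${WTIME_MAKE_OROG}"
```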

TESTS CONDUCTED:

WE2E test on WCOSS:

  • grid_RRFS_NA_3km_ics_FV3GFS_lbcs_FV3GFS_suite_RRFS_v1alpha

ISSUE:

Fixes issue mentioned in #621

@JeffBeck-NOAA
Collaborator

JeffBeck-NOAA commented Oct 25, 2021

@chan-hoo, thanks for this PR. Any idea why the orog task is timing out on WCOSS but not on other platforms? If an SDF with GSL GWD is used, extra orog fields are produced, which can cause the task to time out. Is that what you're experiencing? Otherwise I'm fine with increasing the wall clock limit (also because of what I just mentioned).

I'm not sure why PPN_RUN_FCST was set to 24 for the RRFS_NA_3km domain WE2E test. It's not set in any of the other WE2E tests. Thanks for catching that! (@gsketefian, do you know why it was set to 24 only for this domain?)

As for matching EMC's layout values for the RRFS_NA_3km domain, I had originally used the smallest layout possible that would fit within the 8 hour wall clock limit to conserve resources on NOAA HPC, since it's not being run operationally in the App. I'm torn on whether we should match the operational config or not, since it will use significantly more core hours. Thoughts, @gsketefian, @jwolff-ncar, @mkavulich, @llpcarson?

@chan-hoo
Collaborator Author

@JeffBeck-NOAA, 1) This is because 'make_grid' and 'make_orog' run in serial on WCOSS. I tried running them in parallel on both WCOSS machines, but that didn't work. 3) I'll remove this change.

@JeffBeck-NOAA
Collaborator

@chan-hoo, the grid and orog tasks are run in serial on all NOAA HPC systems, so it may just be that the WCOSS configuration is slower than other machines?

There's value in knowing the EMC operational layout configuration for the RRFS_NA_3km domain. Could you add the EMC values as comments after the smaller layout values? Something like:

LAYOUT_X="${LAYOUT_X:-18}"  # 40 - EMC operational configuration
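
As a side note on the syntax in that suggestion: `${VAR:-default}` is standard shell parameter expansion, which keeps a value that is already set and non-empty and falls back to the default otherwise. A minimal sketch of both cases follows; the names `layout_default` and `layout_override` are illustrative only and do not come from the workflow scripts.

```shell
# Case 1: LAYOUT_X is not set beforehand, so the smaller default (18) is used.
unset LAYOUT_X
LAYOUT_X="${LAYOUT_X:-18}"  # 40 - EMC operational configuration
layout_default="$LAYOUT_X"

# Case 2: a value was set earlier (for example in a user config), so it wins.
LAYOUT_X=40
LAYOUT_X="${LAYOUT_X:-18}"  # 40 - EMC operational configuration
layout_override="$LAYOUT_X"

echo "unset beforehand: ${layout_default}; set beforehand: ${layout_override}"
```

Keeping the operational value in a trailing comment documents it without changing the resources the test requests.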

@chan-hoo
Collaborator Author

chan-hoo commented Oct 25, 2021

@JeffBeck-NOAA, if so, I have no idea why. Maybe you are right. make_orog took about 45 minutes on both WCOSS machines.
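
To put the 45-minute figure next to the new limit, here is a rough sketch of the headroom arithmetic. The helper `wtime_to_minutes` is hypothetical (it is not part of the workflow), and the HH:MM:SS format of the limit is an assumption.

```shell
# Hypothetical helper: convert an HH:MM:SS wall-clock string to whole minutes.
# Seconds are ignored; one leading zero is stripped so that values such as
# "08" are not treated as octal in shell arithmetic.
wtime_to_minutes() {
  h=${1%%:*}        # text before the first colon
  rest=${1#*:}      # text after the first colon
  m=${rest%%:*}     # minutes field
  h=${h#0}
  m=${m#0}
  echo $(( ${h:-0} * 60 + ${m:-0} ))
}

WTIME_MAKE_OROG="01:00:00"   # new limit from this PR (format assumed)
observed_minutes=45          # approximate make_orog run time reported above

limit_minutes=$(wtime_to_minutes "$WTIME_MAKE_OROG")
headroom=$(( limit_minutes - observed_minutes ))
echo "limit=${limit_minutes} min, observed=${observed_minutes} min, headroom=${headroom} min"
```

With these numbers the task would have roughly a quarter of an hour to spare.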

@chan-hoo chan-hoo merged commit 43e8915 into ufs-community:develop Oct 27, 2021
@chan-hoo chan-hoo deleted the feature/orog_wcoss branch October 27, 2021 10:42