Stochastic physics is not working correctly after yaml/Python changes to the workflow #818

Closed
JeffBeck-NOAA opened this issue Jun 1, 2023 · 11 comments · Fixed by #870
Labels
bug Something isn't working Priority: HIGH

Comments

@JeffBeck-NOAA
Collaborator

JeffBeck-NOAA commented Jun 1, 2023

Expected behavior (previous correct behavior prior to the yaml/Python updates)

  1. Stochastic physics (using the single seed values) should work when running a deterministic forecast.
  2. Stochastic physics should use the date/ensmem string when running in ensemble mode.
  3. When run in ensemble mode, each member of each cycle should be using its own input.nml file within the run directory.

Current behavior

  1. Stochastic physics is not activated when DO_SPPT/SHUM/SKEB/SPP are set to "true" in deterministic mode.
  2. Stochastic physics is using the single seed values when run in ensemble mode and not the date/ensmem string values.
  3. When run in ensemble mode, each member is currently using a soft-linked input.nml file from the main experiment directory, which will result in all members having the same perturbations.

Machines affected

All machines.

Steps To Reproduce

  1. Set up a deterministic run of the SRW App and turn on DO_SPPT/SHUM/SKEB/SPP; the resulting experiment directory input.nml will not have any stochastic physics activated.
  2. Set up an ensemble run of the SRW App and turn on DO_SPPT/SHUM/SKEB/SPP; the resulting member directories will all use a single input.nml file from the experiment directory instead of a member-specific input.nml, and that file will not contain the date/ensmem string value for the seeds.

Detailed Description of Fix (optional)

Changes to the Pythonized set_FV3nml_ens_stoch_seeds.py are likely required to fix the seed problem. Modifications to when the stochastic namelist entries are applied are necessary to fix the deterministic stochastic physics problem.

Correct behavior:

Example namelist block for nam_stochy when running in deterministic mode with stochastic physics turned on (SPPT in this case):

&nam_stochy
iseed_sppt = 1
new_lscale = .true.
sppt = 0.7
sppt_logit = .true.
sppt_lscale = 150000
sppt_sfclimit = .true.
sppt_tau = 21600
spptint = 3600
use_zmtnblck = .false.
/

Example mem001 namelist block for nam_stochy when running in ensemble mode with stochastic physics turned on (SPPT in this case):

&nam_stochy
iseed_sppt = 2019061500011
new_lscale = .true.
sppt = 0.7
sppt_logit = .true.
sppt_lscale = 150000
sppt_sfclimit = .true.
sppt_tau = 21600
spptint = 3600
use_zmtnblck = .false.
/
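The mem001 seed above looks like the cycle date (YYYYMMDDHH) multiplied by 1000, plus the member number times 10, plus a one-digit index for the scheme. Here is a minimal Python sketch of that arithmetic. It is inferred only from the example value shown here; the function name `make_seed` and the `scheme_index` argument are hypothetical and are not taken from set_FV3nml_ens_stoch_seeds.py.

```python
def make_seed(cdate: int, member: int, scheme_index: int) -> int:
    """Return cdate * 1000 + member * 10 + scheme_index.

    Assumption: this pattern is inferred from the example
    iseed_sppt = 2019061500011 for mem001; it is not confirmed
    against the workflow source.
    """
    if not 0 <= scheme_index <= 9:
        raise ValueError("scheme_index must be a single digit")
    if member < 0:
        raise ValueError("member must be non-negative")
    return cdate * 1000 + member * 10 + scheme_index


# Cycle 2019061500, member 1, scheme index 1 (assumed to be SPPT)
print(make_seed(2019061500, 1, 1))  # 2019061500011
```

Under this assumed pattern, a second member of the same cycle would get a different SPPT seed (2019061500021), which is what keeps the perturbations distinct between members.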

@christinaholtNOAA @MichaelLueken @mark-a-potts @mkavulich @gsketefian @michelleharrold @willmayfield

@JeffBeck-NOAA JeffBeck-NOAA added bug Something isn't working Priority: HIGH labels Jun 1, 2023
@mkavulich
Collaborator

mkavulich commented Jun 1, 2023

@JeffBeck-NOAA Can you point me to a case for the deterministic problem you're seeing? I have added a new deterministic SPP case for my PR #811 and I observed the expected perturbations. You can see the cases I ran on Hera:

With SPP:

  • /scratch2/BMC/fv3lam/kavulich/UFS/workdir/issue_798/testing/expt_dirs/grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP_old_20230523_160515/

Without SPP:

  • /scratch2/BMC/fv3lam/kavulich/UFS/workdir/issue_798/testing/expt_dirs/grid_RRFS_CONUS_13km_ics_FV3GFS_lbcs_FV3GFS_suite_RAP

@JeffBeck-NOAA
Collaborator Author

JeffBeck-NOAA commented Jun 1, 2023

@mkavulich, here are two deterministic cases on Hera, one with SPPT, one with SPPT+SPP:

/scratch2/BMC/fv3lam/beck/FV3-LAM/expt_dirs/deter_SPPT
/scratch2/BMC/fv3lam/beck/FV3-LAM/expt_dirs/deter_SPPT+SPP

Neither has anything in the input.nml nam_stochy or nam_sppperts sections.

The config.yaml file for the second expt_dir (SPPT+SPP) is here (I just added DO_SPP=true):

/scratch2/BMC/fv3lam/beck/FV3-LAM/ufs-srweather-app/ush/config.yaml

Thanks for taking a look!

@JeffBeck-NOAA
Collaborator Author

JeffBeck-NOAA commented Jun 1, 2023

For the ensemble date/ensmem seed problem, you can look here on Hera: /scratch2/BMC/fv3lam/beck/FV3-LAM/expt_dirs/runtime_SPPT_ens

You'll see that mem001 is pointing to the input.nml_stoch file in /scratch2/BMC/fv3lam/beck/FV3-LAM/expt_dirs/runtime_SPPT_ens and not to the input.nml file in the member directory, as it needs to. It's also not using the date/ensmem string in input.nml_base, which is in the mem001 directory.

@christinaholtNOAA
Collaborator

@JeffBeck-NOAA Is there a WE2E config that I can work on to see if I can find the source of the issue?

@JeffBeck-NOAA
Collaborator Author

JeffBeck-NOAA commented Jun 14, 2023

@christinaholtNOAA, the following WE2E should illustrate the problems with stochastic physics for a deterministic configuration:

grids_extrn_mdls_suites_community/config.grid_SUBCONUS_Ind_3km_ics_HRRR_lbcs_HRRR_suite_HRRR.yaml

And the following WE2E should illustrate the problem with the seeds in ensemble mode and the soft-linked input.nml (should use one unique input.nml file per member):

wflow_features/config.community_ensemble_2mems_stoch.yaml

@christinaholtNOAA
Collaborator

Thanks @JeffBeck-NOAA. I will try to work in a debugging session with this soon, but I might not get there right away. Thanks for taking the time to clearly describe the problem above. It will be super helpful for diving in.

@JeffBeck-NOAA
Collaborator Author

Thanks, @christinaholtNOAA!

@MichaelLueken
Collaborator

@JeffBeck-NOAA -

I was able to identify the hash where the stochastic physics stopped working correctly - 7baa285, which is associated with PR #744.

I was wondering, do we want stochastic physics to work with deterministic runs? Given the pre-existing logic in scripts/exregional_run_fcst.sh, it looks like stochastic physics should only be used for ensemble runs. I noted that a deterministic run using hash b72ab14 (the hash right before 7baa285), with stochastic physics turned on, fails. The deterministic run with stochastic physics off, from the updated code in hash 7baa285, passes.

I just wanted to bring this to everyone's attention so that we can have more eyes on this.

@JeffBeck-NOAA
Collaborator Author

@MichaelLueken, thanks for digging into the hashes to identify where things changed. Stochastic physics was functional for both deterministic and ensemble runs in the shell-based SRW App when it was originally committed here and here. I'm not sure what pre-existing logic is in scripts/exregional_run_fcst.sh that you're referring to, but that was likely changed after the initial PR went in. We need to have the deterministic option because it allows users to easily test changes to seeds or turn schemes on/off and compare individual simulations.

@MichaelLueken
Collaborator

@JeffBeck-NOAA In scripts/exregional_run_fcst.sh, the following logic is present:

if [ "${DO_ENSEMBLE}" = "TRUE" ] && ([ "${DO_SPP}" = "TRUE" ] || [ "${DO_SPPT}" = "TRUE" ] || [ "${DO_SHUM}" = "TRUE" ] || \
   [ "${DO_SKEB}" = "TRUE" ] || [ "${DO_LSM_SPP}" =  "TRUE" ]); then

Before this, I'm seeing the following logic:

if [ "${DO_ENSEMBLE}" = TRUE ]; then
  set_FV3nml_stoch_params cdate="$cdate" || print_err_msg_exit "\

From the sounds of it, the input.nml file should always include stochastic parameters if DO_SPP, DO_SPPT, DO_SHUM, DO_SKEB, or DO_LSM_SPP are set to true, regardless of whether it is an ensemble or deterministic run. The presence of DO_ENSEMBLE is to correctly set the stochastic seeds for each ensemble member. If this is the case, then I think I might have an idea of how to bring back deterministic stochastic physics capability.
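The intended separation of concerns described above can be sketched in Python as follows. This is only an illustration of the decision logic, not the code in scripts/exregional_run_fcst.sh or generate_FV3LAM_wflow.py; the function name `plan_stoch_steps` and the returned keys are hypothetical, and the flags are assumed to already be parsed into booleans.

```python
STOCH_FLAGS = ("DO_SPP", "DO_SPPT", "DO_SHUM", "DO_SKEB", "DO_LSM_SPP")


def plan_stoch_steps(cfg: dict) -> dict:
    """Decide which stochastic-physics steps a run needs.

    Stochastic namelist parameters are applied whenever any scheme
    is switched on; DO_ENSEMBLE only controls whether per-member
    seeds are also set.
    """
    any_stoch = any(cfg.get(flag, False) for flag in STOCH_FLAGS)
    return {
        "apply_stoch_params": any_stoch,
        "set_member_seeds": any_stoch and cfg.get("DO_ENSEMBLE", False),
    }


# Deterministic run with SPPT on: parameters applied, no member seeds.
print(plan_stoch_steps({"DO_SPPT": True, "DO_ENSEMBLE": False}))
```

The key point is that DO_ENSEMBLE does not gate the stochastic parameters themselves, so a deterministic run with any scheme enabled still gets a populated namelist.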

For the ensembles, I think the issue is that only the input.nml_base is being updated and not input.nml itself. The run_fcst task is using input.nml itself, which is why the updated seeds present in input.nml_base aren't being used. The seed modifications made by ush/set_FV3nml_ens_stoch_seeds.py need to be applied directly to input.nml in order for the proper seeds to be used. However, when ush/set_FV3nml_ens_stoch_seeds.py updates the input.nml file directly, the contents of &nam_sppperts and &nam_stochy are removed, with the exception of the correct seeds:

&nam_sppperts
    iseed_spp = 2020081000014, 2020081000015, 2020081000016, 2020081000017,
                2020081000018
/

&nam_stochy
    iseed_shum = 2020081000012
    iseed_skeb = 2020081000013
    iseed_sppt = 2020081000011
/

It's not clear to me where the rest of the contents for these two namelist groups have gone. Once this has been addressed, I think that the ensemble issue will be cleared up.
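For illustration, the behaviour being asked for is a merge that overwrites only the seed entries and leaves every other setting in the group untouched. Here is a minimal Python sketch using plain dictionaries to stand in for the namelist. The function name `merge_seeds` is hypothetical, and this is not how ush/set_FV3nml_ens_stoch_seeds.py is implemented; it only shows the intended outcome.

```python
import copy


def merge_seeds(namelist: dict, seeds: dict) -> dict:
    """Return a copy of `namelist` with `seeds` merged in group by group.

    Existing keys that are not in `seeds` are preserved, so the
    non-seed settings of each group survive the update.
    """
    merged = copy.deepcopy(namelist)
    for group, values in seeds.items():
        merged.setdefault(group, {}).update(values)
    return merged


# Dictionaries standing in for the base namelist and the new seeds.
base = {"nam_stochy": {"iseed_sppt": 1, "sppt": 0.7, "sppt_tau": 21600}}
seeds = {"nam_stochy": {"iseed_sppt": 2020081000011}}

merged = merge_seeds(base, seeds)
print(merged["nam_stochy"])
```

Replacing a whole group with the seed dictionary, rather than updating it key by key, would produce exactly the stripped-down blocks shown above.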

@JeffBeck-NOAA
Collaborator Author

@MichaelLueken, you hit on the main problem. After taking a look, it was indeed the "DO_ENSEMBLE" part of scripts/exregional_run_fcst.sh and generate_FV3LAM_wflow.py that was the issue with stochastic physics not being used in the deterministic runs. I fixed that issue with changes to the ex-script and generate_FV3LAM_wflow.py. @willmayfield found the other problem with ensemble mode, where the wrong namelist was being referenced and was linked instead of copied into the member directories. We have put both of our changes into a branch on Will's fork of the App, and he will open a PR shortly.
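The difference between linking and copying the namelist can be shown with a small Python sketch. It is a sketch only, under the assumption that each member directory should hold its own regular input.nml file; the helper name `stage_namelist` and the file contents are hypothetical and do not reflect the actual change in the branch.

```python
import os
import shutil
import tempfile
from pathlib import Path


def stage_namelist(src: Path, member_dir: Path, copy: bool = True) -> Path:
    """Place input.nml in a member directory, by copy or by symlink."""
    member_dir.mkdir(parents=True, exist_ok=True)
    dest = member_dir / "input.nml"
    if copy:
        shutil.copyfile(src, dest)  # independent file per member
    else:
        os.symlink(src, dest)       # every member shares one file
    return dest


with tempfile.TemporaryDirectory() as tmp:
    expt = Path(tmp)
    src = expt / "input.nml"
    src.write_text("iseed_sppt = 1\n")

    mem1 = stage_namelist(src, expt / "mem001")
    mem2 = stage_namelist(src, expt / "mem002")

    # Updating mem001's copy must not change mem002's copy.
    mem1.write_text("iseed_sppt = 2019061500011\n")
    independent = mem1.read_text() != mem2.read_text()

print(independent)
```

With `copy=False`, every member's input.nml would resolve to the same file in the experiment directory, so a seed written for one member would be seen by all of them.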
