Test ERS_Lh11.C96.GFSv15p2.cheyenne_intel FAILS in restart comparison #62
Comments
I am currently working on getting the restart tests into the rt.sh regression test system, also because this was reported independently in NOAA-EMC/fv3atm#42. It would make sense to wait for the rt.sh based tests to be implemented before spending more time on this. |
@climbfuji Please let me know if you want me to do anything. |
I can confirm that with the namelist settings in the ufs_public_release branches for the GFS_v15p2 tests the restarts do not work. I am now trying to fix this; I have a few ideas about what may differ from the tests that we know are b4b reproducible in restart runs. |
@jedwards4b @mcgibbon I have a solution for this (tested on my Mac for GFSv15p2 thus far). The default namelist settings for both GFSv15p2 and GFSv16beta in the ufs_public_release branch of the ufs-weather-model repository turn on skeb, shum and sppt. The stochastic physics do not reproduce in restart runs, because the logic for dealing with restarts hasn't been implemented in the stochastic_physics repo (@pjpegion) and the model isn't writing those fields to the restart files (@DusanJovic-NOAA @junwang-noaa). My suggestion for the public release is to (a) turn off stochastic physics in the default namelists (Phil suggested this anyway, but I missed it) and (b) document that using the stochastic perturbations is an advanced feature that currently does not support b4b identical results through restarts (@ligiabernardet). For our development branches, we need to implement this capability in stochastic_physics and fv3atm in the near future. Any objections? |
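For anyone who wants to script suggestion (a), here is a minimal Python sketch that switches the stochastic physics flags off in a namelist text. The flag names `do_skeb`, `do_shum`, `do_sppt` and the `&gfs_physics_nml` group in the sample are assumptions for illustration (the comment above only names the schemes), and the helper `disable_stochastic_physics` is hypothetical, not part of any UFS tooling.

```python
import re

# Assumed namelist flag names; the thread only names the schemes themselves.
STOCHASTIC_FLAGS = ("do_skeb", "do_shum", "do_sppt")


def disable_stochastic_physics(namelist_text: str) -> str:
    """Set every assumed stochastic-physics flag to .false. where it is assigned."""
    for flag in STOCHASTIC_FLAGS:
        # Match "<flag> = <value>" at the start of a line and keep the left-hand side.
        pattern = re.compile(
            rf"^([ \t]*{re.escape(flag)}[ \t]*=[ \t]*)[^\s,!]+",
            re.IGNORECASE | re.MULTILINE,
        )
        namelist_text = pattern.sub(r"\g<1>.false.", namelist_text)
    return namelist_text


# Illustrative input only, not a complete input.nml.
example = """&gfs_physics_nml
  do_sppt = .true.
  do_shum = .true.
  do_skeb = .true.
  nstf_name = 2,1,0,0,0
/
"""

print(disable_stochastic_physics(example))
```

Only lines that assign one of the listed flags are rewritten; everything else in the namelist is passed through unchanged.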
I believe that we have already made this change for cime tests and it still fails.
|
The default configurations for this release are with all stochastic processes turned off. |
@jedwards4b Yes, I confirm that. We turned off stochastic physics but it was still not b4b. |
On my Mac, I am getting b4b identical results w/o stochastic physics. Now testing on Cheyenne with Intel. |
Just to make sure: are you modifying the nstf_name namelist entry as well for the restart runs? The usual regression tests for ufs-weather-model use 2,1,1,0,5 for coldstarts. When restarting, one needs to set the second entry (1) to 0; that is the NSST spinup flag, one of the "hidden features" (don't blame me). The input.nml we got from EMC uses 2,1,0,0,0 for coldstarts. I am now testing whether 2,0,0,0,0 works for restarts or whether we need to switch to "2,1,1,0,5" and "2,0,1,0,5". Just be patient, please. |
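To make the nstf_name change concrete, a small Python sketch that derives the restart value from the coldstart value by zeroing the second entry. The helper name `restart_nstf_name` is hypothetical, and whether the resulting settings actually give b4b restarts was still being tested at this point in the thread.

```python
def restart_nstf_name(coldstart: str) -> str:
    """Return the nstf_name string for a restart run.

    Sets the second entry (described above as the NSST spinup flag) to 0
    and leaves the other four entries untouched.
    """
    values = [v.strip() for v in coldstart.split(",")]
    if len(values) != 5:
        raise ValueError(f"expected 5 comma-separated entries, got {len(values)}")
    values[1] = "0"
    return ",".join(values)


print(restart_nstf_name("2,1,1,0,5"))  # 2,0,1,0,5
print(restart_nstf_name("2,1,0,0,0"))  # 2,0,0,0,0
```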
Dom,
This feature is not hidden, please see the document:
https://vlab.ncep.noaa.gov/redmine/projects/comfv3/wiki/_set_up_restart_run_for_FV3GFS_
|
I agree, Jun, it is not hidden to people who have access to Vlab. I am not sure if it is in the ufs-weather-model documentation for the release (I am lost wrt documentation) and I am not sure if the CIME folks know about it ... let's wait to hear from them! |
@climbfuji I have already done that. My previous tests are on Base run: Restart run: By default, stochastic physics is off. |
That's good to know, thanks. If that fails, I will test the default 2,1,1,0,5 settings. Just wait, please. |
@climbfuji when you get things working, could you attach an |
Sure. But note everyone that I will be taking this weekend off (definitely Sunday and Monday), so please don't expect any answers before Tuesday. Thanks ... |
Everyone, please see here NOAA-EMC/fv3atm#42 for the solution/namelists/... Thanks! |
@climbfuji I tried the cime test with these changes, it still fails. |
I don't think I have the time to look at the differences between your runs and mine today. Here is a copy of all the directories you need on Cheyenne:
You will be interested in the following directories:
|
I am beginning to wonder if this is related to the debug-run problems you have been seeing, i.e. the missing update to the ufs_release_v1.0 branch for chgres_cube from George Gayno and the missing compiler flags for the GNU compiler for this executable. |
This test is using the Intel compiler so I'm not sure what GNU would have to do with it. The biggest difference I see is that you are using the cubed_sphere_grid for output_grid and I am using gaussian_grid. I'm looking into this now. |
The same tests passed with the GNU compilers as well. They are identical except the modules.fv3 files. I can rerun the tests on Cheyenne with GNU and keep the rundirs, but as I said the differences will be in modules.fv3 and in the actual model output. |
@jedwards4b I tested with changing
|
@climbfuji I tested your input.nml with the CIME-built model for v15p2 and we still have differences in the restart. So at least the problem is not related to input.nml. I'll continue to dig, but let me know if you have any other ideas. The runs are in
|
I can think of
I need to get this cime setup run by myself. Will try tomorrow. |
The initial documentation is at https://ufs-mrapp.readthedocs.io/en/latest/index.html# I am still working on it, but I could find lots of information, especially in the quick start guide. |
@climbfuji I ran the cime restart test with your executable and it passed. This points to a difference in the build, perhaps in the build flags, but I also noticed that you were not using the latest model version: 6a93463 |
Yes, the code I had used for the testing didn't include the last PR. But the current PR I have and for which I reran the restart tests does (ufs-community/ufs-weather-model#33). |
I built using src/model/tests/compile_cmake.sh and it also passed the restart test - I've been studying the build since and still cannot pinpoint the difference. |
If you send me build logs (cmake and make; may have to add VERBOSE=1 to the make calls) then I can take a look. Maybe something comes to my mind wrt which files to look at when I stare at this long enough. Thanks ... |
|
This problem is fixed. The build flags to libfv3core.a were different. |
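For others chasing a similar build difference, a minimal Python sketch of one way to compare the flags on two compile lines taken from verbose build logs. The helper `flag_differences` and both command lines are hypothetical; the flags are placeholders and are not the ones that differed in this issue.

```python
import shlex


def flag_differences(cmd_a: str, cmd_b: str) -> tuple[list[str], list[str]]:
    """Return (flags only in cmd_a, flags only in cmd_b), each sorted.

    Any token starting with "-" counts as a flag; arguments that follow a
    flag as a separate token (such as the file after -o) are ignored.
    """
    flags_a = {tok for tok in shlex.split(cmd_a) if tok.startswith("-")}
    flags_b = {tok for tok in shlex.split(cmd_b) if tok.startswith("-")}
    return sorted(flags_a - flags_b), sorted(flags_b - flags_a)


# Placeholder compile lines, made up for illustration.
build_one = "ifort -O2 -g -c a.F90 -o a.o"
build_two = "ifort -O2 -g -DEXAMPLE_FLAG -c a.F90 -o a.o"

only_one, only_two = flag_differences(build_one, build_two)
print("only in first build: ", only_one)
print("only in second build:", only_two)
```

Comparing as sets ignores flag order and repetition, which is usually fine for spotting a missing or extra option but would hide cases where the order of flags matters.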
Yeah! Thanks for figuring this out, I was struggling all day to find time to look at your compile logs. |
Can you please elaborate on the fix @jedwards4b? I'm having the same issue with a different build system. |
@mcgibbon I found that the noaa build was using the flag |
This test indicates that restarts are not producing bfb results under cime testing.
The file comparisons show:
run/ERS_Lh11.C96.GFSv15p2.cheyenne_intel.20200116_085748_nk2uyu.ufsatm.atm.f011.nc.base.cprnc.out: of which 14 had non-zero differences
run/ERS_Lh11.C96.GFSv15p2.cheyenne_intel.20200116_085748_nk2uyu.ufsatm.sfc.f011.nc.base.cprnc.out: of which 125 had non-zero differences
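A minimal Python sketch for pulling the per-file counts out of summary lines like the two above. It assumes the `path: of which N had non-zero differences` shape shown here and nothing else about the cprnc output format; the helper name `differing_field_counts` is hypothetical.

```python
import re

# Matches "<path>: of which <N> had non-zero differences".
SUMMARY = re.compile(
    r"^(?P<path>\S+):\s*of which\s+(?P<count>\d+)\s+had non-zero differences"
)


def differing_field_counts(lines):
    """Map each comparison file to its number of fields with non-zero differences."""
    counts = {}
    for line in lines:
        match = SUMMARY.match(line.strip())
        if match:
            counts[match["path"]] = int(match["count"])
    return counts


report = [
    "run/ERS_Lh11.C96.GFSv15p2.cheyenne_intel.20200116_085748_nk2uyu.ufsatm.atm.f011.nc.base.cprnc.out: of which 14 had non-zero differences",
    "run/ERS_Lh11.C96.GFSv15p2.cheyenne_intel.20200116_085748_nk2uyu.ufsatm.sfc.f011.nc.base.cprnc.out: of which 125 had non-zero differences",
]

for path, count in differing_field_counts(report).items():
    print(count, path)
```

A restart test that reproduces bit for bit would report zero differing fields for every file.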