add cpld_control_wave with 1deg wave grid #385

Closed


aliabdolali
Collaborator

Description

add cpld_control_wave with 1deg wave grid.

No changes are expected in the existing tests, but a new baseline must be created to include the new test.

export atm_petlist_bounds=$APB_cpl_wwav
export ocn_petlist_bounds=$OPB_cpl_wwav
export ice_petlist_bounds=$IPB_cpl_wwav
export wav_petlist_bounds=$WPB_cpl_wwav
Collaborator

@DeniseWorthen DeniseWorthen Jan 19, 2021


This will set the tasking to the tasking when using waves, which is appropriate for the 1/4 deg model but not for 1deg. You'll need to create a new set of PE tasking for 'cpl_dflt' which is for the 1deg case, but which includes waves, maybe call it 'cpl_dflt_wwav'.


export CPLWAV='.T.'
export CPLWAV2ATM='.T.'
export RT1DEG='.T.'
Collaborator


In default_vars, we set the default resolution for each component using "OCNRES" etc. I suggest creating a default WW3 resolution of 1deg. You then wouldn't need the RT1DEG variable, because the default tests would be 1deg ww3, 1deg ocean, 1deg ice and c96 fv3.

The current ww3 tests which are all for 1/4 deg would then need to set a non-default resolution. This would be consistent w/ how we vary the resolution across the different tests.

Collaborator Author

@aliabdolali aliabdolali Jan 19, 2021


@DeniseWorthen Thanks for the suggestions. I modified the test but still get a failure.
Here are my changes:

  1. I added cpl_dflt_wwav. Do you have any suggestions for the numbers? I assumed 1deg is lighter than 1/4 deg and heavier than dflt, so I set the total to 288. What does each of these variables stand for, and where can I find their definitions?
  2. I removed export RT1DEG='.T.'
  3. I added WAVRES='1.00'

@@ -22,9 +22,13 @@ elif [[ $MACHINE_ID = wcoss_dell_p3 || $MACHINE_ID = wcoss2 ]]; then
TASKS_strnest=96 ; TPN_strnest=28 ; INPES_strnest=2 ; JNPES_strnest=4

TASKS_cpl_dflt=192; TPN_cpl_dflt=28; INPES_cpl_dflt=3; JNPES_cpl_dflt=8
-THRD_cpl_dflt=1; WPG_cpl_dflt=6; MPB_cpl_dflt="0 143"; APB_cpl_dflt="0 149"
+THRD_cpl_dflt=1; WPG_cpl_dflt=6; ePB_cpl_dflt="0 143"; APB_cpl_dflt="0 149"
Collaborator


You have a typo here and elsewhere: ePB_cpl_dflt.

WPG is the number of atm write component tasks; MPB, APB, OPB and IPB are the petlist bounds for the mediator, atmosphere, ocean and ice, respectively.

@DeniseWorthen
Collaborator

You can add the words "skip-ci" (without the quotes) to a commit message while you're making changes. That will prevent the CI tests from running. Then after all the logs are added, at that point we do the CI tests.

@aliabdolali
Collaborator Author

Thanks @DeniseWorthen, good idea. BTW, my test fails at the initialization step and I don't know why.

@DeniseWorthen
Collaborator

Point me to your run directory and I'll take a look.

@aliabdolali
Collaborator Author

@DeniseWorthen Here is the path
/scratch1/NCEPDEV/stmp2/Ali.Abdolali/FV3_RT/rt_75766/cpld_control_wave_prod
Thanks a million for your kind help.

@DeniseWorthen
Collaborator

How are you compiling and then running this test? The error I see (in err) is an odd one (/scratch1/NCEPDEV/stmp2/Ali.Abdolali/FV3_RT/rt_75766/cpld_control_wave_prod/./fv3.exe': corrupted size vs. prev_size: 0x0000000012dd46b0).

I have your branch checked out here: /scratch2/NCEPDEV/climate/Denise.Worthen/WORK/ufs_1degw3/tests. I've added an rt.test which is a temporary file containing only the compile line for ufs-cpld+waves and this single new test. I am trying that now.

I also noticed that the test is still using some of the 1/4deg configuration settings (DT_ATMOS='450'). You'll want to remove these, since you want to use the default c96mx100 settings (these are set in default_vars). See what I've done in tests/cpld_control_wave.

@DeniseWorthen
Collaborator

I was able to get the test to run (see /scratch1/NCEPDEV/stmp2/Denise.Worthen/FV3_RT/rt_23024/cpld_control_wave_prod) but there were two issues.

One, the test is trying to move the file out_grd.glo_1deg into the baseline but it can't find it. Is this supposed to be the ww3 restart file? There is a file in the run directory named 20161004.000000.restart.glo_1deg. If that is the restart, then it needs to be used as the file name in the test.

Two, I had to adjust the PE tasking for the 1deg wave test. You had created the new TASKS_cpl_dflt_wwav settings, but they were still appropriate for the 1/4 deg. For the 1deg resolution, I gave waves 12 PEs (the 1/4 deg uses 40). I only updated the hera setting, so you'll need to do the same for the others. See my changes here: /scratch2/NCEPDEV/climate/Denise.Worthen/WORK/ufs_1degw3/tests
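
The tasking arithmetic in this comment can be sketched as follows. This is only an illustration, assuming the 12 wave PEs are appended directly after the existing cpl_dflt layout (PETs 0 to 191). The name WPB_cpl_dflt_wwav and the helper variables are hypothetical, modelled on the WPB_cpl_wwav naming, and are not taken from the branch.

```shell
#!/usr/bin/env bash
# Sketch only: derive a wave PET range that follows the existing cpl_dflt layout.
# Existing cpl_dflt bounds quoted in this PR:
APB_cpl_dflt="0 149"      # atmosphere
OPB_cpl_dflt="150 179"    # ocean
IPB_cpl_dflt="180 191"    # ice

WAV_PES=12                # wave PEs given to the 1deg grid on hera

# Last PET used by the ice component (text after the final space)
ice_last=${IPB_cpl_dflt##* }

# Hypothetical names: wave bounds and total task count for cpl_dflt_wwav
wav_first=$(( ice_last + 1 ))
wav_last=$(( ice_last + WAV_PES ))
WPB_cpl_dflt_wwav="${wav_first} ${wav_last}"
TASKS_cpl_dflt_wwav=$(( wav_last + 1 ))

echo "WPB_cpl_dflt_wwav=\"${WPB_cpl_dflt_wwav}\" TASKS_cpl_dflt_wwav=${TASKS_cpl_dflt_wwav}"
```

Under that assumption the total comes to 204 tasks, which agrees with the TASKS_cpl_dflt_wwav=204 value that appears in the diff for this PR.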

@aliabdolali
Collaborator Author

@DeniseWorthen You are awesome. I'll modify the PEs for all the platforms. out_grd.glo_1deg is the gridded output, which should be compared with the baseline. I will also add 20161004.000000.restart.glo_1deg to the list of files to be compared with the baseline.
Once I confirm the test is running at my end, I will ask you to review it.
AA

@aliabdolali
Collaborator Author

@junwang-noaa @JessicaMeixner-NOAA @DeniseWorthen
With guidance and help from Denise, I prepared the cpld_control_wave test. I need to run the whole rt.conf and create the baseline. Could you tell me when it is my turn, so I can run it on Hera and Orion (the machines I have access to)?

@@ -50,6 +50,12 @@ cp @[INPUTDATA_ROOT]/CICE_FIX/@[OCNRES]/grid_cice_NEMS_mx@[OCNRES].nc .
cp @[INPUTDATA_ROOT]/CICE_FIX/@[OCNRES]/kmtu_cice_NEMS_mx@[OCNRES].nc .
cp @[INPUTDATA_ROOT]/CICE_FIX/@[OCNRES]/mesh.mx@[OCNRES].nc .

# WW3 fix/input
if [[ $CPLWAV == .T. && $CPLWAV2ATM == .T. ]]; then
Collaborator


This should be conditional on CPLWAV only: with CPLWAV2ATM=.F. we would still need mod_defs and the input file.
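
A minimal sketch of the condition this comment asks for, assuming the same `.T.`/`.F.` string convention used by the exports in this PR. The helper name needs_ww3_inputs is hypothetical and the echo stands in for the actual copy commands, which are not shown here.

```shell
#!/usr/bin/env bash
# Sketch only: stage WW3 inputs whenever waves are coupled,
# regardless of the CPLWAV2ATM setting.
CPLWAV=${CPLWAV:-.T.}
CPLWAV2ATM=${CPLWAV2ATM:-.F.}

# Hypothetical helper: succeeds when the WW3 fix/input files are needed.
needs_ww3_inputs() {
  [[ $CPLWAV == .T. ]]
}

if needs_ww3_inputs; then
  # Placeholder for the real staging commands
  echo "stage WW3 fix/input files (CPLWAV2ATM=${CPLWAV2ATM})"
fi
```

The point of the design is that CPLWAV2ATM only controls whether wave fields are passed to the atmosphere, so it should not gate the files WW3 itself needs to start.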

tests/rt.conf Outdated
@@ -128,8 +128,9 @@ RUN | fv3_ccpp_esg_HAFS_v0_hwrf_thompson_debug
# CPLD tests #
###################################################################################################################################################################################

-COMPILE | SUITES=FV3_GFS_2017_coupled,FV3_GFS_2017_satmedmf_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16beta_coupled S2S=Y | - wcoss_cray gaea.intel jet.intel | fv3 |
+COMPILE | SUITES=FV3_GFS_2017_coupled,FV3_GFS_2017_satmedmf_coupled,FV3_GFS_v15p2_coupled,FV3_GFS_v16beta_coupled S2S=Y WW3=Y | - wcoss_cray gaea.intel jet.intel | fv3 |
Collaborator


This makes the compile the same as line 162, which makes me wonder if we should either move the new test w/waves under that compile or just remove the second compile?

Collaborator Author


@JessicaMeixner-NOAA I fixed them. Thanks.

skip-ci
@@ -25,6 +25,10 @@ elif [[ $MACHINE_ID = wcoss_dell_p3 || $MACHINE_ID = wcoss2 ]]; then
THRD_cpl_dflt=1; WPG_cpl_dflt=6; MPB_cpl_dflt="0 143"; APB_cpl_dflt="0 149"
OPB_cpl_dflt="150 179"; IPB_cpl_dflt="180 191"

TASKS_cpl_dflt_wwav=204; TPN_cpl_dflt_wwav=40; INPES_cpl_dflt_wwav=3; JNPES_cpl_dflt_wwav=8
Collaborator


This should be TPN_cpl_dflt_wwav=28 (same as cpl_dflt). TPN is the number of tasks per node, which varies across machines.

@@ -167,6 +179,10 @@ elif [[ $MACHINE_ID = cheyenne.* ]]; then
THRD_cpl_dflt=1; WPG_cpl_dflt=6; MPB_cpl_dflt="0 143"; APB_cpl_dflt="0 149"
OPB_cpl_dflt="150 179"; IPB_cpl_dflt="180 191"

TASKS_cpl_dflt_wwav=204; TPN_cpl_dflt_wwav=40; INPES_cpl_dflt_wwav=3; JNPES_cpl_dflt_wwav=8
Collaborator


Same here, cheyenne is only 36/node.

@@ -205,6 +221,10 @@ elif [[ $MACHINE_ID = stampede.* ]]; then
THRD_cpl_dflt=1; WPG_cpl_dflt=6; MPB_cpl_dflt="0 143"; APB_cpl_dflt="0 149"
OPB_cpl_dflt="150 179"; IPB_cpl_dflt="180 191"

TASKS_cpl_dflt_wwav=204; TPN_cpl_dflt_wwav=40; INPES_cpl_dflt_wwav=3; JNPES_cpl_dflt_wwav=8
Collaborator

@DeniseWorthen DeniseWorthen Jan 21, 2021


and here (48)
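
To illustrate why the tasks-per-node value has to match each machine, here is a rough sketch of how the node count for the same 204 tasks changes with TPN. It assumes the node request is simply the task count divided by TPN, rounded up; the helper nodes_needed is hypothetical and the regression-test scripts may compute this differently.

```shell
#!/usr/bin/env bash
# Sketch only: nodes needed for a fixed task count at different tasks-per-node values.
TASKS_cpl_dflt_wwav=204

# Hypothetical helper: ceiling division, nodes = ceil(tasks / tpn)
nodes_needed() {
  local tasks=$1 tpn=$2
  echo $(( (tasks + tpn - 1) / tpn ))
}

# Tasks-per-node values mentioned in the review comments (28, 36 for cheyenne, 48)
for tpn in 28 36 48; do
  echo "TPN=${tpn}: $(nodes_needed "$TASKS_cpl_dflt_wwav" "$tpn") nodes"
done
```

Leaving TPN at 40 on a machine with fewer cores per node would ask for more tasks on a node than it can hold, which is why each machine block needs its own value.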


@DeniseWorthen
Collaborator

Closed.
