Update global jdas enkf diag job with COMIN/COMOUT for COM prefix #2959

Open: 16 commits to be merged into base branch develop

mingshichen-noaa
Contributor

Description

NCO has requested that each COM variable specify whether it is an input or an output. This completes that process for the global jdas enkf diagnostics job.

Refs #2451

Type of change

  • Maintenance (code refactor, clean-up, new CI test, etc.)

Change characteristics

  • Is this a breaking change (a change in existing functionality)? NO
  • Does this change require a documentation update? NO

How has this been tested?

  • Clone and build on RDHPCS
  • Cycled tests on Hercules
  • Forecast-only tests on Hercules

Checklist

  • Any dependent changes have been merged and published
  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing tests pass with my changes
  • I have made corresponding changes to the documentation if necessary

Member

@KateFriedman-NOAA left a comment

Changes look OK for the most part, but some COMOUTs need to be changed to COMINs.

@mingshichen-noaa
Contributor Author

@KateFriedman-NOAA
Thank you very much for your comments. Based on your suggestions, I have changed "COMOUT_ATMOS_ANALYSIS_DET_PREV" to "COMIN_ATMOS_ANALYSIS_DET_PREV".
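
The convention behind this rename can be sketched in plain shell: a directory the job only reads (here the previous cycle's analysis) gets a COMIN_ name, and a directory the job writes gets a COMOUT_ name. This is only an illustration under assumptions: the temporary directory layout and the file names `input.txt` and `output.txt` are hypothetical, not taken from the workflow.

```shell
# Throwaway directory tree standing in for COMROOT (hypothetical layout).
work=$(mktemp -d)

# Previous-cycle data is only read by the job, so it is named COMIN_*.
export COMIN_ATMOS_ANALYSIS_DET_PREV="${work}/prev/analysis/atmos"
# Current-cycle data is written by the job, so it is named COMOUT_*.
export COMOUT_ATMOS_ANALYSIS="${work}/curr/analysis/atmos"

mkdir -p "${COMIN_ATMOS_ANALYSIS_DET_PREV}" "${COMOUT_ATMOS_ANALYSIS}"
echo "placeholder" > "${COMIN_ATMOS_ANALYSIS_DET_PREV}/input.txt"

# Read only from COMIN_*, write only to COMOUT_*.
cp "${COMIN_ATMOS_ANALYSIS_DET_PREV}/input.txt" "${COMOUT_ATMOS_ANALYSIS}/output.txt"
copied=$(cat "${COMOUT_ATMOS_ANALYSIS}/output.txt")
echo "copied: ${copied}"

rm -rf "${work}"
```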

Member

@KateFriedman-NOAA left a comment

The requested updates, along with the other updates, look good. Thanks @mingshichen-noaa!

@WalterKolczynski-NOAA added the CI-Wcoss2-Ready label (**CM use only** PR is ready for CI testing on WCOSS) on Sep 27, 2024
@emcbot added the CI-Wcoss2-Building label (**Bot use only** CI testing is cloning/building on WCOSS) and removed the CI-Wcoss2-Ready label on Sep 27, 2024

emcbot commented Sep 27, 2024

CI Update on Wcoss2 at 09/27/24 05:05:09 AM
============================================
Cloning and Building global-workflow PR: 2959
with PID: 52099 on host: clogin03

@emcbot added the CI-Wcoss2-Running label (**Bot use only** CI testing on WCOSS for this PR is in-progress) and removed the CI-Wcoss2-Building label on Sep 27, 2024

emcbot commented Sep 27, 2024

Automated global-workflow Testing Results:

Machine: Wcoss2
Start: Fri Sep 27 05:11:33 UTC 2024 on clogin03
---------------------------------------------------
Build: Completed at 09/27/24 05:49:22 AM
Case setup: Completed for experiment C48_ATM_c55c8630
Case setup: Skipped for experiment C48mx500_3DVarAOWCDA_c55c8630
Case setup: Skipped for experiment C48_S2SWA_gefs_c55c8630
Case setup: Completed for experiment C48_S2SW_c55c8630
Case setup: Completed for experiment C96_atm3DVar_extended_c55c8630
Case setup: Skipped for experiment C96_atm3DVar_c55c8630
Case setup: Completed for experiment C96C48_hybatmaerosnowDA_c55c8630
Case setup: Completed for experiment C96C48_hybatmDA_c55c8630
Case setup: Completed for experiment C96C48_ufs_hybatmDA_c55c8630

@emcbot added the CI-Wcoss2-Failed label (**Bot use only** CI testing on WCOSS for this PR has failed) and removed the CI-Wcoss2-Running label on Sep 27, 2024

emcbot commented Sep 27, 2024

Experiment C96C48_hybatmDA_c55c8630 FAIL on Wcoss2 at 09/27/24 06:42:30 AM

Error logs:

/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/RUNTESTS/COMROOT/C96C48_hybatmDA_c55c8630/logs/2021122100/gdasanaldiag.log

Follow link here to view the contents of the above file(s): (link)

@WalterKolczynski-NOAA
Contributor

JGDAS_ATMOS_ANALYSIS_DIAG also uses the exglobal_diag.sh script, so they have to be updated at the same time.

+ JGDAS_ATMOS_ANALYSIS_DIAG[30]: declare_from_tmpl -rx COM_ATMOS_ANALYSIS
+ bash_utils.sh[35]: [[ NO == \N\O ]]
+ bash_utils.sh[35]: set +x
declare_from_tmpl :: COM_ATMOS_ANALYSIS=/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/RUNTESTS/COMROOT/C96C48_hybatmDA_c55c8630/gdas.20211221/00//analysis/atmos
+ JGDAS_ATMOS_ANALYSIS_DIAG[31]: mkdir -m 775 -p /lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/RUNTESTS/COMROOT/C96C48_hybatmDA_c55c8630/gdas.20211221/00//analysis/atmos
+ JGDAS_ATMOS_ANALYSIS_DIAG[35]: /lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/global-workflow/scripts/exglobal_diag.sh
Begin exglobal_diag.sh at Fri Sep 27 06:36:46 UTC 2024
++ exglobal_diag.sh[25]: pwd
+ exglobal_diag.sh[25]: pwd=/lfs/h2/emc/stmp/terry.mcguinness/RUNDIRS/C96C48_hybatmDA_c55c8630/gdas.2021122100/analdiag.246715
+ exglobal_diag.sh[28]: CDATE=2021122100
+ exglobal_diag.sh[29]: GDUMP=gdas
+ exglobal_diag.sh[32]: export 'CHGRP_CMD=chgrp rstprod'
+ exglobal_diag.sh[32]: CHGRP_CMD='chgrp rstprod'
+ exglobal_diag.sh[33]: export NCLEN=/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/global-workflow/ush/getncdimlen
+ exglobal_diag.sh[33]: NCLEN=/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/global-workflow/ush/getncdimlen
+ exglobal_diag.sh[34]: export CATEXEC=/apps/ops/prod/libs/intel/19.1.3.304/cray-mpich/8.1.4/ncdiag/1.0.0/bin/ncdiag_cat_serial.x
+ exglobal_diag.sh[34]: CATEXEC=/apps/ops/prod/libs/intel/19.1.3.304/cray-mpich/8.1.4/ncdiag/1.0.0/bin/ncdiag_cat_serial.x
+ exglobal_diag.sh[35]: COMPRESS=gzip
+ exglobal_diag.sh[36]: UNCOMPRESS=gunzip
+ exglobal_diag.sh[37]: APRUNCFP='mpiexec -l -np $ncmd --cpu-bind verbose,core cfp'
+ exglobal_diag.sh[40]: netcdf_diag=.true.
+ exglobal_diag.sh[41]: binary_diag=.false.
+ exglobal_diag.sh[44]: RUN=gdas
+ exglobal_diag.sh[45]: SENDECF=NO
+ exglobal_diag.sh[46]: SENDDBN=NO
+ exglobal_diag.sh[51]: export APREFIX=gdas.t00z.
+ exglobal_diag.sh[51]: APREFIX=gdas.t00z.
/lfs/h2/emc/global/noscrub/globalworkflow.ci/GFS_CI_ROOT/PR/2959/global-workflow/scripts/exglobal_diag.sh: line 52: COMOUT_ATMOS_ANALYSIS: unbound variable
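
The last line of the trace is the usual `set -u` failure: the J-job declared `COM_ATMOS_ANALYSIS`, but the shared script now reads `COMOUT_ATMOS_ANALYSIS`. A minimal sketch that reproduces the mismatch in plain shell follows. `run_script` is a hypothetical stand-in for exglobal_diag.sh, and the `/tmp/demo` path is made up for the demonstration.

```shell
#!/usr/bin/env bash
# Make sure the demo starts from a clean slate.
unset COM_ATMOS_ANALYSIS COMOUT_ATMOS_ANALYSIS

# Hypothetical stand-in for the shared ex-script. It runs in a subshell
# with `set -u`, so reading an unset variable aborts it.
run_script() (
  set -u
  echo "writing diagnostics to ${COMOUT_ATMOS_ANALYSIS}"
)

# Caller that still exports only the old COM_ name: the script aborts.
old_out=$(COM_ATMOS_ANALYSIS=/tmp/demo/analysis/atmos run_script 2>&1) && old_status=0 || old_status=$?

# Caller updated to export the name the script actually reads.
new_out=$(COMOUT_ATMOS_ANALYSIS=/tmp/demo/analysis/atmos run_script 2>&1) && new_status=0 || new_status=$?

echo "old caller exit status: ${old_status}"
echo "new caller exit status: ${new_status}"
echo "new caller output: ${new_out}"
```

Every J-job that calls a script has to export the variable names that script reads, which is why both callers need the rename in the same change.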

@mingshichen-noaa
Contributor Author

@WalterKolczynski-NOAA:
You commented: "JGDAS_ATMOS_ANALYSIS_DIAG also uses the exglobal_diag.sh script, so they have to be updated at the same time."
In fact, I have updated both JGDAS_ATMOS_ANALYSIS_DIAG and the exglobal_diag.sh script in this PR.
@RaghuReddy-NOAA:
Today you told me that the JGDAS_ENKF_* jobs are related to each other. Does this mean I should update all of the JGDAS_ENKF_* jobs and their associated bash and Python scripts in a single PR?

The JGDAS_ENKF_* jobs are JGDAS_ENKF_DIAG, JGDAS_ENKF_ECEN, JGDAS_ENKF_POST, JGDAS_ENKF_SELECT_OBS, JGDAS_ENKF_SFC, JGDAS_ENKF_SNOW_RECENTER, and JGDAS_ENKF_UPDATE.

@WalterKolczynski-NOAA
Contributor

No, you only need to do the ones that are calling the script you modify here (exglobal_diag.sh), that's just two j-job scripts:

>grep exglobal_diag.sh jobs/*
jobs/JGDAS_ATMOS_ANALYSIS_DIAG:${ANALDIAGSH:-${SCRgfs}/exglobal_diag.sh}
jobs/JGDAS_ENKF_DIAG:${ANALDIAGSH:-${SCRgfs}/exglobal_diag.sh}
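
The same check can be scripted so it lists only the calling files. The sketch below builds a throwaway `jobs/` directory to stay self-contained. The first two files copy the lines from the grep output above, and the third file's content is a made-up placeholder rather than the real job script.

```shell
# Throwaway jobs/ directory (hypothetical stand-in for the repository).
demo_dir=$(mktemp -d)
mkdir -p "${demo_dir}/jobs"
printf '%s\n' '${ANALDIAGSH:-${SCRgfs}/exglobal_diag.sh}' > "${demo_dir}/jobs/JGDAS_ATMOS_ANALYSIS_DIAG"
printf '%s\n' '${ANALDIAGSH:-${SCRgfs}/exglobal_diag.sh}' > "${demo_dir}/jobs/JGDAS_ENKF_DIAG"
# Placeholder content: this file does not mention the script.
printf '%s\n' 'placeholder line' > "${demo_dir}/jobs/JGDAS_ENKF_UPDATE"

# -l prints only the names of matching files; -F treats the pattern as a fixed string.
callers=$(cd "${demo_dir}/jobs" && grep -lF 'exglobal_diag.sh' -- * | sort)
echo "${callers}"

rm -rf "${demo_dir}"
```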

@RaghuReddy-NOAA

@WalterKolczynski-NOAA and @mingshichen-noaa I see I am tagged in the two messages above. Was that intended for Rahul?

@mingshichen-noaa
Contributor Author

@RaghuReddy-NOAA
I am sorry that my previous messages tagged you. Let me explain my question below.
Because the JGDAS_ENKF_* jobs (JGDAS_ENKF_DIAG, JGDAS_ENKF_ECEN, JGDAS_ENKF_POST, JGDAS_ENKF_SELECT_OBS, JGDAS_ENKF_SFC, JGDAS_ENKF_SNOW_RECENTER, and JGDAS_ENKF_UPDATE) are all associated with JDAS EnKF, can I create one branch that changes COM_ to COMIN_/COMOUT_ for all of them, instead of going job by job?

@DavidHuber-NOAA
Contributor

@mingshichen-noaa Is this PR ready to be reviewed/tested again?

@WalterKolczynski-NOAA removed the CI-Wcoss2-Failed label (**Bot use only** CI testing on WCOSS for this PR has failed) on Oct 18, 2024
Contributor

@WalterKolczynski-NOAA left a comment

Looks correct, but CI will need to wait until #2928 is merged and then develop merged into this PR, as that PR contains fixes needed for one of the CI tests.

@WalterKolczynski-NOAA
Contributor

@mingshichen-noaa Before we run this through automated CI, did this pass a test using the C96_atm3DVar case?

@mingshichen-noaa
Contributor Author

@WalterKolczynski-NOAA
Yesterday I tested PR #2959 on Hercules with 3 forecast-only cases (C48_ATM, C48_S2SW, C48_S2SWA_gefs) and 3 cycled cases (C96_atm3DVar, C96C48_hybatmDA, C96C48_ufs_hybatmDA).
Today I tested C96_atm3DVar on Orion.
After running "rocotostat -d C96_atm3DVar.db -w C96_atm3DVar.xml -v 10", the first lines of the output were:
CYCLE         TASK                  JOBID                        STATE       EXIT STATUS  TRIES  DURATION
202112201800  gdas_stage_ic         druby://130.18.14.111:34781  SUBMITTING  -            0      0.0
202112201800  gdas_fcst_seg0        -                            -           -            -      -
202112201800  gdas_atmos_prod_f000  -                            -           -            -      -
202112201800  gdas_atmos_prod_f003  -                            -           -            -      -
202112201800  gdas_atmos_prod_f006  -                            -           -            -      -
202112201800  gdas_atmos_prod_f009  -                            -           -            -      -

@WalterKolczynski-NOAA
Contributor

Do you have an entry for rocotorun in your crontab for each experiment? It looks like rocotorun hasn't run since the first time.
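
For reference, a crontab entry of this kind could look like the sketch below. The experiment directory and the 5-minute interval are assumptions, and the `.db` and `.xml` names follow the rocotostat command quoted earlier. The snippet only builds and prints the line. Installing it is left commented out, and in practice cron may need the full path to rocotorun and a suitable environment.

```shell
# Hypothetical experiment directory; substitute the real one.
expdir="/path/to/EXPDIR/C96_atm3DVar"
pslot="C96_atm3DVar"

# Re-run rocotorun every 5 minutes so the workflow keeps advancing
# (the interval is an assumption).
cron_entry="*/5 * * * * rocotorun -d ${expdir}/${pslot}.db -w ${expdir}/${pslot}.xml"
echo "${cron_entry}"

# To install, append the line to the current crontab (not executed here):
# (crontab -l 2>/dev/null; echo "${cron_entry}") | crontab -
```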
