Remove gfs_cyc dependency and replace CDUMP #137

Merged

@WalterKolczynski-NOAA WalterKolczynski-NOAA commented Sep 17, 2024

The dependency on gfs_cyc is removed from the global verif script. gfs_cyc is being replaced by an interval, and running on the appropriate cycles will be handled by the workflow from here on. The workflow will always run verif at 18z, regardless of whether the GFS forecast is produced for that cycle.

Additionally, the deprecated $CDUMP is replaced with the preferred $RUN.

Refs: NOAA-EMC/global-workflow#2928

`gfs_cyc` is being replaced in global-workflow, so taking this
opportunity to move cycle management from the verif script entirely.
Cycle control will be moved to the workflow manager.
`$CDUMP` is no longer used as a proxy for `$RUN`, so instances of
CDUMP are replaced.
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this pull request Sep 17, 2024
To facilitate longer and more flexible GFS cadences, the `gfs_cyc`
variable is replaced with a specified interval. Up front, this is
reflected in a change in the arguments for setup_exp to:

```
--interval <n_hours>
```

Where `n_hours` is the interval (in hours) between gfs forecasts.
`n_hours` must be a multiple of 6. If 0, no gfs will be run (only
gdas; only valid for cycled mode). The default value is 6 (every
cycle).
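
The rules for `--interval` stated above can be sketched as a small argument check. This is only an illustration under assumptions, not the actual setup_exp code: the parser name, the `--mode` flag and its choices, and the helper names are hypothetical; only `--interval`, its multiple-of-6 rule, the meaning of 0, and the default of 6 come from the description.

```python
import argparse


def interval_hours(value):
    """Parse --interval: a non-negative multiple of 6 (0 means no gfs)."""
    hours = int(value)
    if hours < 0 or hours % 6 != 0:
        raise argparse.ArgumentTypeError(
            f"--interval must be a non-negative multiple of 6, got {hours}"
        )
    return hours


def build_parser():
    # Hypothetical parser: --mode and its choices are assumptions for this sketch.
    parser = argparse.ArgumentParser(prog="setup_exp_sketch")
    parser.add_argument("--mode", choices=["cycled", "forecast-only"], default="cycled")
    # Default of 6 means a gfs forecast on every cycle.
    parser.add_argument("--interval", type=interval_hours, default=6)
    return parser


def parse(argv):
    parser = build_parser()
    args = parser.parse_args(argv)
    # An interval of 0 (gdas only) is only valid in cycled mode.
    if args.interval == 0 and args.mode != "cycled":
        parser.error("--interval 0 is only valid in cycled mode")
    return args
```

For example, `parse(["--interval", "12"])` would request a gfs forecast every other cycle, while `parse(["--interval", "7"])` would be rejected.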

In cycled mode, there is an additional argument to control which
cycle will be the first gfs cycle:

```
--sdate_gfs <YYYYMMDDHH>
```

The default if not provided is `--idate` + 6h (first full cycle).

As part of this change, some validation of the dates has been
added. `--edate` has also been made optional and defaults to
`--idate` if not provided.
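
The date defaults described above might be resolved roughly as follows. This is a sketch under assumptions: the function name and the single ordering check are illustrative, since the text does not list which validations were actually added.

```python
from datetime import datetime, timedelta

FMT = "%Y%m%d%H"  # YYYYMMDDHH, as used by the command-line arguments


def resolve_dates(idate, edate=None, sdate_gfs=None):
    """Return (idate, edate, sdate_gfs) as datetimes with defaults filled in."""
    start = datetime.strptime(idate, FMT)
    # --edate is optional and falls back to --idate.
    end = datetime.strptime(edate, FMT) if edate else start
    # The first gfs cycle defaults to the first full cycle, idate + 6h.
    if sdate_gfs:
        first_gfs = datetime.strptime(sdate_gfs, FMT)
    else:
        first_gfs = start + timedelta(hours=6)
    # Illustrative check only (an assumption, not the documented validation).
    if end < start:
        raise ValueError("--edate must not precede --idate")
    return start, end, first_gfs
```

With only `--idate 2024091700` given, this yields an end date of 2024091700 and a first gfs cycle of 2024091706.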

During `config.base` template-filling, `INTERVAL_GFS` (renamed from
`STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as
`--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as
`gfs_cyc` was being used downstream by verif-global. That has been
removed, and instead workflow will be responsible for only running
metp on the correct cycles. This also removes "do nothing" metp
tasks that exit immediately, because only the last GFS cycle in a
day would actually process verification.

Now, metp has its own cycledef and always runs at 18z, regardless
of whether gfs is running at 18z or not. This is simpler than
trying to determine the last gfs cycle of a day when it could
change from day to day. To facilitate this change, support for the
undocumented rocoto dependency tag `taskvalid` is added, as the
metp task needs to know whether the cycle has a gfsarch task or not.
metp will trigger on gfsarch completing (as before), or gdasarch
completing if there is no gfsarch.
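
The trigger choice described in this commit message can be illustrated with a short sketch. It assumes that gfs cycles are the ones falling on `SDATE_GFS` plus a whole number of intervals; the function names are hypothetical, and the real dependency is expressed in the rocoto XML rather than in a helper like this.

```python
from datetime import datetime, timedelta


def is_gfs_cycle(cycle, sdate_gfs, interval_hours):
    """True if a gfs forecast (and hence a gfsarch task) exists for this cycle."""
    if interval_hours == 0 or cycle < sdate_gfs:
        return False
    elapsed = cycle - sdate_gfs
    # A gfs cycle lies a whole number of intervals after the first gfs cycle.
    return elapsed % timedelta(hours=interval_hours) == timedelta(0)


def metp_trigger(cycle, sdate_gfs, interval_hours):
    """Name of the archive task whose completion releases metp for this cycle."""
    if is_gfs_cycle(cycle, sdate_gfs, interval_hours):
        return "gfsarch"
    return "gdasarch"
```

For a first gfs cycle at 06z and a 12-hour interval, the 18z cycle of the same day has a gfsarch task, so metp would wait on gfsarch; with a 24-hour interval it would fall back to gdasarch.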

metp tasks are no longer generated for forecast-only, as the pgbanl
files (copies of the 1p00 pgbanl files) are not generated for f-o
anyway. If metp is needed for f-o, additional work will be needed.

Additionally, a couple of EE2 issues with the metp job are resolved
(even though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Depends on NOAA-EMC/EMC_verif-global#137
Resolves NOAA-EMC#260
Refs NOAA-EMC#1299
@WalterKolczynski-NOAA WalterKolczynski-NOAA marked this pull request as ready for review September 17, 2024 04:29
ush/run_verif_global_in_global_workflow.sh
export SDATE_GFS=${SDATE_GFS:-$SDATE}
export EDATE_GFS=${EDATE_GFS:-$EDATE}
export VDATE="${VDATE:-$(echo $($NDATE -${VRFYBACK_HRS} $CDATE) | cut -c1-8)}"

cyc2run="${cyc}"
Contributor

This changes the way verif-global is run. I think this is OK since it is not going to be operational, but just noting that it will now run every GFS cycle instead of once per day.

Contributor Author

Workflow is only running it at 18z, so it will function similarly to before.

Contributor

@DavidHuber-NOAA DavidHuber-NOAA left a comment

Changes look good. Approve pending global-workflow reviews and CI testing.

WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this pull request Sep 28, 2024
WalterKolczynski-NOAA added a commit to WalterKolczynski-NOAA/global-workflow that referenced this pull request Oct 1, 2024
@malloryprow malloryprow merged commit 564e20e into NOAA-EMC:develop Oct 10, 2024
@WalterKolczynski-NOAA WalterKolczynski-NOAA deleted the feature/gfs_interval branch October 10, 2024 18:22
WalterKolczynski-NOAA added a commit to NOAA-EMC/global-workflow that referenced this pull request Oct 22, 2024
# Description
To facilitate longer and more flexible GFS cadences, the `gfs_cyc`
variable is replaced with a specified interval. Up front, this is
reflected in a change in the arguments for setup_exp to:
```
--interval <n_hours>
```
Where `n_hours` is the interval (in hours) between gfs forecasts.
`n_hours` must be a multiple of 6. If 0, no gfs will be run (only
gdas; only valid for cycled mode). The default value is 6 (every cycle).
(This is a change from current behavior of 24.)

In cycled mode, there is an additional argument to control which cycle
will be the first gfs cycle:
```
--sdate_gfs <YYYYMMDDHH>
```
The default if not provided is `--idate` + 6h (first full cycle). This
is the same as current behavior when `gfs_cyc` is 6, but may vary from
current behavior for other cadences.

As part of this change, some validation of the dates has been
added. `--edate` has also been made optional and defaults to `--idate`
if not provided.

During `config.base` template-filling, `INTERVAL_GFS` (renamed from
`STEP_GFS`) is defined as `--interval` and `SDATE_GFS` as
`--sdate_gfs`.

Some changes were necessary to the gfs verification (metp) job, as
`gfs_cyc` was being used downstream by verif-global. That has been
removed, and instead workflow will be responsible for only running metp
on the correct cycles. This also removes "do nothing" metp tasks that
exit immediately, because only the last GFS cycle in a day would
actually process verification.

Now, metp has its own cycledef and will (a) always run at 18z,
regardless of whether gfs is running at 18z or not, if the interval is
less than 24h; (b) use the same cycledef as gfs if the interval is 24h
or greater. This is simpler than trying to determine the last gfs cycle
of a day when it could change from day to day. To facilitate this
change, support for the undocumented rocoto dependency tag `taskvalid`
is added, as the metp task needs to know whether the cycle has a
gfsarch task or not. metp will trigger on gfsarch completing (as
before), or look backwards for the last gfsarch to exist.
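
Rules (a) and (b), together with the backwards search for the last gfsarch, can be sketched as below. This is an illustration under assumptions rather than the workflow's implementation: the function names are hypothetical, gfs cycles are assumed to fall on `SDATE_GFS` plus a whole number of intervals, and treating an interval of 0 as "no metp" is a guess.

```python
from datetime import datetime, timedelta


def is_gfs_cycle(cycle, sdate_gfs, interval_hours):
    """True if a gfs forecast exists for this cycle."""
    if interval_hours == 0 or cycle < sdate_gfs:
        return False
    return (cycle - sdate_gfs) % timedelta(hours=interval_hours) == timedelta(0)


def runs_metp(cycle, sdate_gfs, interval_hours):
    """Whether metp is scheduled on this cycle."""
    if interval_hours == 0:
        return False  # assumption: without gfs forecasts there is nothing to verify
    if interval_hours < 24:
        return cycle.hour == 18  # rule (a): always 18z
    return is_gfs_cycle(cycle, sdate_gfs, interval_hours)  # rule (b): same as gfs


def last_gfs_cycle(cycle, sdate_gfs, interval_hours):
    """Most recent gfs cycle at or before `cycle`, or None if there is none."""
    if interval_hours == 0 or cycle < sdate_gfs:
        return None
    step = timedelta(hours=interval_hours)
    # Floor the elapsed time to a whole number of intervals.
    return sdate_gfs + ((cycle - sdate_gfs) // step) * step
```

With a first gfs cycle at 06z and an 18-hour interval, metp still runs at 18z on the first day, and the backwards search lands on the 06z gfs cycle.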

Additionally, a couple of EE2 issues with the metp job are resolved (even
though it is not run in ops):
- verif-global update replaced `$CDUMP` with `$RUN`
- `$DATAROOT` is no longer redefined in the metp job

Also corrects some dependency issues with the extractvars job for replay and the replay CI test.

Depends on NOAA-EMC/EMC_verif-global#137
Resolves #260
Refs #1299

---------

Co-authored-by: David Huber <david.huber@noaa.gov>
EricSinsky-NOAA pushed a commit to EricSinsky-NOAA/global-workflow that referenced this pull request Oct 24, 2024