
Add explicit layerThicknessEdge variables #12

Conversation

@cbegeman (Collaborator) commented Feb 23, 2022

This PR replaces the existing layerThicknessEdge variable with two variables: layerThicknessEdgeMean, the mean of the layerThickness values at the two neighboring cell centers, and layerThicknessEdgeFlux, the edge thickness used to flux momentum and tracers. When config_thickness_flux_type = 'centered', layerThicknessEdgeFlux = layerThicknessEdgeMean; when config_thickness_flux_type = 'upwind', layerThicknessEdgeFlux is upwinded.

Prior to this PR, when config_thickness_flux_type = 'upwind', layerThicknessEdge was upwinded everywhere. This PR uses layerThicknessEdgeFlux for horizontal advection terms and layerThicknessEdgeMean elsewhere, such as in mixing parameterizations. This change is in preparation for adding additional thickness-flux options.

This PR also removes layerThicknessEdge from global stats, since it offered little information beyond the global stats for layerThickness.
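For concreteness, here is a minimal sketch of the two computations, assuming MPAS-style mesh arrays (cellsOnEdge, maxLevelEdgeTop, normalVelocity) and an illustrative routine name; the actual code in ocn_diagnostics is structured and named differently:

    ! Sketch only: names and signature are illustrative, not the real routine.
    subroutine edge_thickness_sketch(nCells, nEdges, nVertLevels, upwind, &
                                     cellsOnEdge, maxLevelEdgeTop, &
                                     normalVelocity, layerThickness, &
                                     layerThicknessEdgeMean, layerThicknessEdgeFlux)
       implicit none
       integer, intent(in) :: nCells, nEdges, nVertLevels
       logical, intent(in) :: upwind
       integer, intent(in) :: cellsOnEdge(2, nEdges), maxLevelEdgeTop(nEdges)
       real(8), intent(in) :: normalVelocity(nVertLevels, nEdges)
       real(8), intent(in) :: layerThickness(nVertLevels, nCells)
       real(8), intent(out) :: layerThicknessEdgeMean(nVertLevels, nEdges)
       real(8), intent(out) :: layerThicknessEdgeFlux(nVertLevels, nEdges)
       integer :: iEdge, k, cell1, cell2

       do iEdge = 1, nEdges
          cell1 = cellsOnEdge(1, iEdge)
          cell2 = cellsOnEdge(2, iEdge)
          do k = 1, maxLevelEdgeTop(iEdge)
             ! Mean edge thickness: average of the two neighboring cell values.
             layerThicknessEdgeMean(k, iEdge) = 0.5d0 * &
                (layerThickness(k, cell1) + layerThickness(k, cell2))
             if (.not. upwind) then
                ! 'centered': the flux thickness equals the mean.
                layerThicknessEdgeFlux(k, iEdge) = layerThicknessEdgeMean(k, iEdge)
             else if (normalVelocity(k, iEdge) > 0.0d0) then
                ! 'upwind': take the thickness from the upwind cell.
                layerThicknessEdgeFlux(k, iEdge) = layerThickness(k, cell1)
             else
                layerThicknessEdgeFlux(k, iEdge) = layerThickness(k, cell2)
             end if
          end do
       end do
    end subroutine edge_thickness_sketch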

amametjanov and others added 13 commits February 4, 2022 14:58
In a few places, we trigger the same behavior as for the CAM macro.
EAM's functions are just wrapping share_file_mod's ones, without adding
any logic. This way, we can use the module even if EAM is not used
(e.g., for SCREAM builds).

Also, fixed indentation of some if blocks in namelist_mod.F90
This can happen for SCREAM, where dt is not known at the place
where the initialization happens. It will instead be known
only when timestepping is performed.

To be safe, add a check at the beginning of prim_run_subcycle
to ensure nsplit has been set to a value >= 1.
… a noinline clause to the FFT init to fix it.
Comment on lines -291 to -292
! layerThicknessEdge is computed in compute_solve_diag, and is not available yet,
! so recompute layerThicknessEdge here.
@cbegeman (Collaborator, Author)

Noting that this statement is false. layerThicknessEdge is computed in ocn_diagnostic_solve, which is called for timeLevel=2 in each time integration scheme before ocn_vel_vmix_tend is called.

@cbegeman force-pushed the ocn/add-explicit-layerThickEdge-vars branch from 87904b9 to d0389d9 on February 23, 2022 19:45
@xylar (Collaborator) commented Feb 23, 2022

@cbegeman, and others, a general question for discussion. Are we okay with layerThicknessEdgeCenter as opposed to layerThicknessEdgeCentered? My first instinct was that it sounded like a layer thickness at the center of an edge, as opposed to one found by centered averaging. (I don't think it's actually correct to say this layer thickness comes from centered differences, since it's a mean rather than a difference.)

@cbegeman (Collaborator, Author)
@xylar I can change it to layerThicknessEdgeCentered if you think that's clearer. To me, layerThicknessEdgeCentered could also refer to the layer thickness at the center of the edge. Are you suggesting that layerThicknessEdgeMean might be a better name?

@xylar (Collaborator) commented Feb 23, 2022

Yes, I think I like layerThicknessEdgeMean. That's a great suggestion. Let's see if others have feedback before you change anything.

@vanroekel (Collaborator)
I very much like layerThicknessEdgeMean; I'd vote for that instead of Center or Centered.

@dengwirda (Collaborator)
Just stepping back from the naming conversation for a moment, I'd say the main reason we want to support upwind-type layerThicknessEdge variables is to ensure we have monotonic thickness advection in certain cases --- in other words, that it shouldn't be possible numerically to create negative-thickness layers.

I think the main terms we have to worry about are the div(u*h_flux) and div(u*h_flux*psi) tendencies in the thickness and tracer advection steps, and perhaps also the PV-flux tendencies in the momentum eqn (though we should look into the PV case in more detail).
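Schematically, with forcing and mixing terms omitted and the notation kept informal, those tendencies are

    d(h)/dt       + div( u * h_flux )       = 0    (thickness)
    d(h*psi)/dt   + div( u * h_flux * psi ) = 0    (tracer)

so h_flux is the only edge thickness that multiplies the advecting velocity in these terms.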

To aim for a minimal change, keeping all other forcing, friction and parameterisation tendencies using the existing centred scheme seems like a good place to start to me.

In terms of layerThicknessEdgeFlux itself, the intention is for this to be a blend of the centred and upwind thicknesses --- equivalent to the existing centred thickness in the deep ocean (so that we recover the energy conservation of TRiSK for global config.'s) but degrading to the upwind thickness in shallow regions to ensure we have stability.
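One plausible form of such a blend (illustrative only; the actual weighting is not pinned down in this thread) would be

    h_edgeFlux = (1 - b) * h_edgeMean + b * h_edgeUpwind,    0 <= b <= 1

with b approaching 0 in the deep ocean (recovering the centred TRiSK thickness) and approaching 1 in shallow regions.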

@cbegeman (Collaborator, Author)
@dengwirda Thanks for posting this context. I had in mind that this PR would simply introduce the new variables and aim for correctness in designating which of these variables should be deployed in which terms.

To my knowledge, folks aren't using the config_thickness_flux_type = 'upwind' option, so all the tendencies will still be identical when config_thickness_flux_type = 'centered'. In light of this, I would prefer to determine whether layerThicknessEdgeFlux or layerThicknessEdgeMean is more appropriate in the forcing and friction terms and include that in this PR. If you think that decision should only be made after testing in coastal configurations, then I suppose we would have to postpone it to the next PR. What do you think?

@dengwirda (Collaborator)
@cbegeman Yes, agree all round --- I feel that using layerThicknessEdgeFlux just in the div(u*h) and div(u*h*psi) tendencies to start with is a good option, with all other forcing, friction and parameterisation terms using the existing centred scheme. I think that's the most minimal change we can make that we can then extend for robust wetting + drying.

@cbegeman
Copy link
Collaborator Author

@dengwirda Got it. Commits c731e62 and 454069f revert my changes to horizontal mixing and cvmix so they will use the centered thicknesses.

@xylar (Collaborator) left a comment

It looks like there are still a few changes related to the move from layerThicknessEdgeCenter to layerThicknessEdgeMean. I'll take a more in-depth look early next week.

Did the allocation issue get resolved or is that something I should help look into?

Upgrading to the latest YAKL. Includes performance improvements for CUDA and HIP as well as a bug (memory leak) fix for the pool allocator. The new YAKL also has a fully functioning SYCL backend for Intel GPUs.

CUDA has what appears to be a compiler bug in the FFT routines; these have been worked around in the new YAKL. The FFT API changed by no longer requiring a parameter to init(), and this has been reflected in pressure.cpp and crm_variance_transport.cpp.

An issue with the standalone samxx has been fixed, and new machine files have been added.

This PR passes ./create_test SMS_Ld10_P6x7.ne4pg2_ne4pg2.F-MMFXX.summit_gnugpu.eam-rrtmgpxx

[BFB]
@mark-petersen
@cbegeman thanks for your detailed work here, and apologies for the delayed response. I agree with these changes. Having layerThicknessEdgeMean and layerThicknessEdgeFlux is very clear.

The global stats names follow the previous patterns, so I agree with your choices, even though names like MeanMin can be confusing. In general I think we should greatly reduce the number of variables in global stats (I only ever look at prognostic variables and KE), but that is a PR for another day. Do global statistics for layerThicknessEdgeMean and layerThicknessEdgeFlux provide more useful information than global stats for layerThicknessEdge?

I do worry about the memory footprint of MPAS-Ocean. We add 3D arrays freely, so the memory footprint becomes quite bloated; I'm guilty of that too. For example, it appears that layerThicknessEdgeMean is used in diffusive terms, forcing terms, and parameterizations. Again, this is not the purpose of this PR, but we could decide whether the extra 3D array is really worth it compared to averaging layerThickness from the neighboring cells every time we need it.

For the purpose of having a separate variable for the flux quantity for the layer thickness, defined with flags at the correct place, this all looks right to me.

@cbegeman (Collaborator, Author) commented Mar 9, 2022

@mark-petersen Thanks for giving this a thorough look!

Do global statistics for layerThicknessEdgeMean and layerThicknessEdgeFlux provide more useful information than global stats for layerThicknessEdge?

I personally don't see either of these variables being particularly useful from a global stats perspective because they are going to reflect the global stats for layerThickness very closely.

To your question about whether an additional 3D array is worth it: the short answer is that we'd have to assess this further. If we're going to go this route, my first inclination would be to consider making layerThicknessEdgeFlux temporary (allocated, computed via a diagnostic routine, and deallocated in the time integration routines and ocn_tendency). I imagine this would be less costly than averaging in all the places layerThicknessEdgeMean is needed, but we'd probably have to time it to know for sure.
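A rough sketch of that lifecycle (the driver name is hypothetical, not actual MPAS-Ocean code):

    ! Hypothetical: hold the flux thickness only for the duration of a
    ! tendency evaluation instead of keeping it in the diagnostics pool.
    subroutine tendency_driver_sketch(nEdges, nVertLevels)
       implicit none
       integer, intent(in) :: nEdges, nVertLevels
       real(8), allocatable :: layerThicknessEdgeFlux(:,:)

       allocate(layerThicknessEdgeFlux(nVertLevels, nEdges))
       ! ... fill via a diagnostic routine, pass to the tendency routines ...
       deallocate(layerThicknessEdgeFlux)
    end subroutine tendency_driver_sketch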

Should we try to flag these issues for further consideration, with a github issue or somewhere in the code?

@mark-petersen
OK, since you agree, please take layerThicknessEdgeMean and layerThicknessEdgeFlux out of global stats now. That is relevant to this PR, and they don't add value to global stats. Sorry about your time spent putting them in.

The decision between permanently allocated arrays and flops is a tricky one. Flops are free, as they say, but retrieval of variables from memory takes much longer. Here we can either retrieve one column of layerThicknessEdgeMean; or retrieve two columns of layerThickness that are not contiguous in memory, and average them. The first one might be faster because we retrieve half as many bytes from memory (@philipwjones any opinion on that?).
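A schematic comparison of the two access patterns (names mirror the discussion above; this is not actual MPAS code):

    ! Schematic comparison of the two memory-access patterns.
    subroutine access_patterns_sketch(nVertLevels, nCells, nEdges, iEdge, &
                                      cellsOnEdge, layerThickness, &
                                      layerThicknessEdgeMean, colA, colB)
       implicit none
       integer, intent(in) :: nVertLevels, nCells, nEdges, iEdge
       integer, intent(in) :: cellsOnEdge(2, nEdges)
       real(8), intent(in) :: layerThickness(nVertLevels, nCells)
       real(8), intent(in) :: layerThicknessEdgeMean(nVertLevels, nEdges)
       real(8), intent(out) :: colA(nVertLevels), colB(nVertLevels)

       ! Option A: read one contiguous column of the stored edge array.
       colA(:) = layerThicknessEdgeMean(:, iEdge)

       ! Option B: read two cell columns (generally far apart in memory)
       ! and average them on the fly; twice the bytes, plus the flops.
       colB(:) = 0.5d0 * (layerThickness(:, cellsOnEdge(1, iEdge)) + &
                          layerThickness(:, cellsOnEdge(2, iEdge)))
    end subroutine access_patterns_sketch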

I think layerThicknessEdgeFlux is worth having in permanently allocated arrays, because it depends on the numerical method and needs to be consistent across the model.

So my conclusion is, go ahead and leave those two variables as-is. Thanks!

@philipwjones (Collaborator)
You'll probably regret tagging me since this one hits two of my biggest pet peeves with MPAS and you're catching me after a long day of virtual workshop. So... (1) Registry must die and (2) if you've heard of software containers, ocean diagnostics is a software frickin' dumpster and adding any more to it is contributing to the dumpster fire that will result. Can we please start properly scoping variables? That's not directed at any of you specifically, just that we've collectively gotten lazy on this.

But in answer to the specific question, I'm not sure how the new edge thickness will be used so it's hard for me to say what's best. As a general rule, something as simple as this should be kept local and recomputed (even as a local scalar) rather than stored. But if it's used in more than a few places or needs to be consistent (like the edge flux) then it makes sense to compute once and share. Or if you're going to introduce options that require branching every time you use it, then computing once to avoid excessive branching might be appropriate. And if we're going to have both of these as shared variables, then maybe we need a thickness module so they can be explicitly use'd and dependencies are clear rather than continuing to add to the diagnostics clutter.
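A sketch of the kind of module being suggested (the name and interface are hypothetical, not an existing MPAS-Ocean module):

    module ocn_thickness_sketch
       implicit none
       private
       ! Shared edge thicknesses, scoped here rather than in the
       ! diagnostics pool, so consumers must pull them in explicitly.
       real(8), allocatable, public :: layerThicknessEdgeMean(:,:)
       real(8), allocatable, public :: layerThicknessEdgeFlux(:,:)
       public :: ocn_thickness_init, ocn_thickness_final
    contains
       subroutine ocn_thickness_init(nVertLevels, nEdges)
          integer, intent(in) :: nVertLevels, nEdges
          allocate(layerThicknessEdgeMean(nVertLevels, nEdges))
          allocate(layerThicknessEdgeFlux(nVertLevels, nEdges))
       end subroutine ocn_thickness_init
       subroutine ocn_thickness_final()
          deallocate(layerThicknessEdgeMean)
          deallocate(layerThicknessEdgeFlux)
       end subroutine ocn_thickness_final
    end module ocn_thickness_sketch

A tendency routine would then declare, for example, use ocn_thickness_sketch, only: layerThicknessEdgeFlux, making the dependency explicit instead of implicit through the diagnostics pool.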

@philipwjones (Collaborator)
Funny - I had rant ... /rant tags, with angle brackets, around much of that first paragraph, but this GitHub editor interpreted them as formatting and stripped them out...

This is a combination of 17 commits.

    OpenACC changes for time integration
    Changing pgi-summit compile flags for OpenACC
    Adding scalar get_config routines to simplify OpenACC changes
    Adding GPU vector reconstruct routines
    Adding varinp_ scalar names as alternative to config_ input variable pointer names
    Cleaning up unused use statements in EOS
    Changing input coeff in EOS from pointer to scalar
    Starting to pull OpenACC directives out of diagnostics into time integration routines
    Changing diagnostics variables to create rather than copyin
    Moving some data movement out of diagnostics
    Moving more OpenACC data movement directives out of diagnostics routine
    Moving more OpenACC data movement directives out of diagnostics
    Moving rest of OpenACC copyins out of diagnostics
    Moving OpenACC data copyout movement directives out of diagnostic routine
    Moving more OpenACC data movement out of diagnostics
    Moving more OpenACC data movement directives out of diagnostic solve
    Moving more OpenACC data movement directives out of diagnostic solve
    Moving more OpenACC data movement directives out of diagnostic solve
    Adding use of varinp scalar variable in diagnostics routine to simplify OpenACC mods
    Extending out OpenACC region in time integration
    Extending OpenACC region in time integration routines
    Extending OpenACC region to include normal Barotropic/Baroclinic Velocity calcs
    Extending OpenACC region above Freq calcs
    Moving more OpenACC data directives out of diagnostics
    Moving remaining copyouts/updates out of diagnostic solve
@cbegeman (Collaborator, Author)
@philipwjones @mark-petersen Thanks for this discussion. It seems to me that changing where the layerThicknessEdge variables are computed/stored is better suited to a separate pull request. The main goal of this PR was to remove ambiguity about where upwinded layerThicknessEdge should be used. I understand that this pull request does contribute to the ocn_diagnostics_variables dumpster, and I am willing to work on a separate PR to remove both layerThicknessEdge variables from ocn_diagnostics_variables if you feel that's a priority.

Regarding this next PR, I imagine that a separate thickness module would be appropriate so that we only compute layerThicknessEdgeFlux once, given that there is some branching involved and that the upwinded routine has if-clauses to determine the upwind direction. This next PR could also include moving the layerThicknessEdgeMean computation local to where it is used.

…ject#4825)

Fix non-standard declarations of 8-byte integers in ESMF in MPAS-Framework

This PR fixes non-standard declarations of 8-byte integers in
ESMF_BaseMod.F90. See the original PR for MPAS at
MPAS-Dev/MPAS-Model#959. The ESMF_BaseMod module contains several non-
standard declarations of 8-byte integer variables. These were identified
using the GNU and NAG compilers.

[BFB]
@cbegeman force-pushed the ocn/add-explicit-layerThickEdge-vars branch from e950cfe to 2071b66 on March 10, 2022 23:26
Youngsung Kim and others added 3 commits March 11, 2022 10:53
Changes:
* add crusher node in config_machines.xml, config_batch.xml
* create gnu_crusher.cmake
* for testing amdclang and crayclang, cmake configurations for the compilers are also added
* add CPRAMD macro in shr_infnan_mod.F90.in to use ieee arithmetic
Add gnu compiler support on Crusher and AMD.

Changes:

* add crusher configuration in config_machines.xml, config_batch.xml
* create gnu_crusher.cmake
* for testing amdclang and crayclang, cmake configurations for the compilers are also added
* add CPRAMD macro in shr_infnan_mod.F90.in to use ieee arithmetic

[BFB]
Extending OpenACC diagnostics up to time integration

Pulling OpenACC data movement directives up into top-level time
integration routines and extending OpenACC region by:
* Changing diagnostics variables to create rather than copyin
* Changing pgi-summit compile flags for OpenACC
* Adding scalar get_config routines to simplify OpenACC changes
* Adding GPU vector reconstruct routines
* Adding varinp_ scalar names as alternative to config_ input variable
  pointer names
* Cleaning up unused use statements in EOS
* Changing input coeff in EOS from pointer to scalar

[BFB]
@mark-petersen self-requested a review March 15, 2022 14:46
@xylar (Collaborator) commented Mar 15, 2022

@cbegeman, we chatted today and @philipwjones and @mark-petersen are fine with this moving over to E3SM when you're ready.

@mark-petersen left a comment

I agree with these changes. I think it is ready for an E3SM PR. Thanks @cbegeman!

@cbegeman force-pushed the ocn/add-explicit-layerThickEdge-vars branch from 2071b66 to 034b20f on March 15, 2022 16:42
@cbegeman (Collaborator, Author)

Moved to E3SM-Project#4832

@cbegeman cbegeman closed this Mar 15, 2022
mark-petersen pushed a commit that referenced this pull request May 31, 2023
cee/15.0.0 with GPU MPI buffers can crash in a system lib like this:

#4  0x00007fffe159e35b in (anonymous namespace)::do_free_with_callback(void*, void (*)(void*)) [clone .constprop.0] () from /opt/cray/pe/cce/15.0.0/cce/x86_64/lib/libtcmalloc_minimal.so.1
#5  0x00007fffe15a8f16 in tc_free () from /opt/cray/pe/cce/15.0.0/cce/x86_64/lib/libtcmalloc_minimal.so.1
#6  0x00007fffe99c2bcd in _dlerror_run () from /lib64/libdl.so.2
#7  0x00007fffe99c2481 in dlopen@@GLIBC_2.2.5 () from /lib64/libdl.so.2
#8  0x00007fffea7bce42 in _ad_cray_lock_init () from /opt/cray/pe/lib64/libmpi_cray.so.12
#9  0x00007fffed7eb37a in call_init.part () from /lib64/ld-linux-x86-64.so.2
#10 0x00007fffed7eb496 in _dl_init () from /lib64/ld-linux-x86-64.so.2
#11 0x00007fffed7dc58a in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#12 0x0000000000000001 in ?? ()
#13 0x00007fffffff42e7 in ?? ()
#14 0x0000000000000000 in ?? ()

Work around this by using cee/14.0.3.
xylar pushed a commit that referenced this pull request Nov 8, 2023
Update Icepack warnings calls, cori-haswell pes