Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhancement/runtime pft count #189

Merged
merged 119 commits into from
Sep 11, 2017
Merged

Enhancement/runtime pft count #189

merged 119 commits into from
Sep 11, 2017

Conversation

mnlevy1981
Copy link
Collaborator

On my way to runtime PFT counts, I realized it wasn't that difficult to get runtime tracer counts... but I'd like to talk about how I achieved that before pressing forward with the PFT count variables. With the changes from today, finishing #184 will let us remove marbl_sizes.F90.

This is already a pretty crowded pull request:

There are some small POP changes as well. Development started from marbl_dev_levy_n105; marbl_dev_levy_n106 uses the single init routine from 6afefd0 and marbl_dev_levy_n107 relies on POP's buildcpp script instead of MARBL for tracer counts (so ECOSYS_BASE_NT is still a build-time macro, and I also introduced CISO_NT=14, but they are used by POP rather than MARBL)

Rather than module variables, these two can be local variables in
complete_config_and_init() and intent(out) variables from the respective
forcing field index constructors.
Rather than a module variable, this can be local in the interior forcing index
constructor. Requires allocating this%tracer_id after computing
tracer_restore_cnt (which we should've done before anyway)
config -> init_configuration
init -> init_parameters_and_tracers
complete_config_and_init -> init_complete
If you are initializing via namelist and don't want to use marbl%put() or
marbl%get(), this routine runs all three stages of initialization in a single
call.

The init_from_namelist test uses it (so interface is different in the "no
namelist" test), and I also updated the stand-alone test suites so the MARBL
log ontains the subroutine names updated in the last commit rather than the old
subroutine names.
Remove ECOSYS_BASE_NT from necessary CPPDEFS, move ecosys_base_tracer_cnt,
ciso_tracer_cnt, and marbl_total_tracer_cnt to the tracer indexing type. To
remove many of the "use marbl_sizes_mod" statements, I replaced explicit
dimension sizing on intent(in) and intent(out) arrays with ":"; in other
places, I relied on additional arguments in, the size of other arrays, or the
tracer indexing type itself to get tracer module sizes.

Still need to introduce consistency check in tracer index constructor (ensure
that total tracer count = base tracer count + ciso tracer count.
If total number of tracers is not equal to base tracer count + ciso tracer
count, init routine will abort.
They are now

init_configuration -> init_phase1
init_parameters_and_tracers -> init_phase2
init_complete -> init_phase3

init() itself is unchanged (may eventually become init_allphases)
Instead of passing marbl_total_tracer_cnt as argument, associate
variable with size(tracer_names)
Updates last commit to avoid evaluating size() multiple times.
In the tracer indexing type, the component is now simply total_cnt;
elsewhere in the code, the variable is tracer_cnt. This discrepancy may
seem a little odd on face value, but the "tracer" modifier doesn't seem
necessary in the tracer index type (what else would we be counting?) and
the "total" isn't necessary elsewhere in the code (whereas in the type
we also have counts for individual tracer modules).
Makes more sense than allocating memory in the "set_parms_default"
routine
Introduced new add_tracer_index() method to the tracer index class,
which updates the total count, calls the tracer_count_type's
update_count routine for the appropriate tracer module, and then returns
the new index as intent(out). Still need to add error-checking, but that
will involve updating the constructor interface to use the log type.

I also updated the "MARBL Tracer indices" section of the log output to
add lines like

ecosys_base tracer module contains 32 tracers; indices are 1 to 32
ciso tracer module contains 14 tracers; indices are 33 to 46

(ciso line is omitted if ciso_on is .false.)
1. gcm_nl_buffer should be last non-optional argument (will make it easier if
we decide to make it optional)
2. intent(in) arguments before intent(out) arguments (only intent out is an
optional argument)
The interface for add_tracer_index is a little different, and the way we handle
errors in add_tracer_index is different from other routines
(tracer_index_constructor calls add_tracer_index repeatedly and then checks for
labort_marbl instead of checking after every call) because otherwise the
constructor code is very difficult to read.
@mnlevy1981
Copy link
Collaborator Author

@klindsay28 -- the eight commits between (and including) 9ee2746 through 66c7bfb deal with modifications that came up in our code review. There are two things I did NOT address, because I think next week's meeting will clarify the best path forward:

  1. I did not rename the catch-all initialization with the required gcm_nl_buffer argument to init_allphases(); it is still init() for now
  2. I did not introduce an easy way for GCMs to call init() with the default namelist

It turns out you can not use

   integer, dimension(size(A)) :: B

or

   integer, dimension(shape(A)) :: B

in place of

   integer, dimension(size(A,1), size(A,2)) :: B

Which we had talked about briefly, I believe for making sure tracers_local and tracers had the same dimensions.

(No action necessary right now, just leaving a note so we know where to pick up at the next review... I'm going to get back to variable PFT count now and wanted to make sure it was clear where that work began)

To prepare for introduction of new phase prior to init_phase1(), renamed

init_phase1 -> init_phase2
init_phase2 -> init_phase3
init_phase3 -> init_phase4
We have a better check in place to make sure the total tracer count matches the
sum of each tracer module count, so we don't need to explicitly compare the two
in init_phase2 (or maybe it's init_phase3 now)
If the GCM requests marbl_tracer_cnt during init, the optional argument is now
passed to the tracer index constructor.
init_phase4 now just calls marbl_mod:marbl_init_forcing_fields() instead of
going through each step.
This pulls lots of initialization code out of marbl_mod (which will help when
we later refactor marbl_mod into marbl_surface_mod and marbl_interior_mod). For
now, it helps as I create new initialization subroutines to try to clean up our
multi-phase init process.
subname was marbl_mod:[subroutine name] instead of marbl_init_mod:...
To clean up marbl_instance%init_phase2, I introduced marbl_init_log_and_timers
and marbl_init_config_vars1.
To clean up marbl_instance%init_phase3, I introduced marbl_init_config_vars2,
marbl_init_tracers and marbl_init_parameters1
marbl_instance%init_phase4 now calls marbl_init_parameters2
This means that ciso_fract_factors will always be written to the log, whereas
before it was only logged if ciso was on.
@mnlevy1981
Copy link
Collaborator Author

Based on Tuesday's meeting, I'm planning on the following path forward:

  1. All parameter setup should be in marbl_parms.F90 => merge marbl_config_mod.F90 and marbl_parms.F90

  2. Rework config_and_parms_type so that marbl_instance%user_settings%put() can be called before any of the define_variable() routines

    • This means merging %configuration and %parameters into %user_settings, so the locking mechanism and error criteria of this type will change
  3. One monolithic marbl_instance%init() instead of arbitrary init_phase#()

    • Once all the put() statements occur before we define the variables, we no longer need to pause between phases to give the GCM the opportunity to call put()
  4. POP's build-namelist will provide a new (text) file that can be slurped up and turned into user_settings%put() calls

    • Format to be determined later, but probably one variable per line, with each line containing
    varname  datatype  value
    
  5. remove all namelists from MARBL

  6. Add third category of parameters (configuration_vars, parameters, and PFT_parameters? Names will be entirely internal to MARBL, but should be easy for future developers to determine where new variables belong)

    • move *_config (autotrophs_config, etc) from config_vars to parameters (and maybe rename these variables *_define?)
    • move PFT-specific variables from parameters to PFT_parameters
  7. Move PFT count variables from being build-time specified in marbl_sizes.F90 to run-time specified in configuration_vars

    • Will let us remove marbl_sizes.F90
    • Will require allocated memory for *_config (similar to how we allocate memory for tracer_restore_vars after constructing the tracer indices)

I'm open to suggestions on a better order to do things, but am getting started on merging marbl_config_mod and marbl_parms now.

marbl_config_set_defaults, marbl_config_read_namelist,
marbl_define_config_vars, and set_derived_config are all in marbl_parms.F90
instead of marbl_config_mod.F90; marbl_config_mod.F90 just contains the
marbl_config_and_parms_type and associated methods (which should probably get
moved to marbl_internal_types.F90)
Try to be clearer about which variables can be set via put statements
Better description of what each of the marbl_settings_set_defaults_* and
marbl_settings_define_* routines are doing.
This better reflects what the variable represents. In the future, we may decide
to make grazer_prey_cnt an array of dimension zooplankton_cnt (and the max_
prefix would no longer be accurate)
marbl_internal_types.F90 doesn't actually need the PFT_cnt variables; they can
be determined based on the size of autotrophs, zooplankton, etc. This lets me
put the PFT counts in marbl_settings_mod and remove marbl_sizes altogether.
Introduced new marbl_pft_mod module that contains all the PFT types. The
autotroph_type, zooplankton_type, and grazing_type all have new
set_to_default() methods that set values based on pre-determined PFT classes.

Which classes are set is determined by the new parameter PFT_defaults. The
current default is 'CESM2', which uses 'sp', 'diat', and 'diaz' for autotrophs,
'zoo' for zooplankon, and 'sp_zoo', 'diat_zoo', 'diaz_zoo' for grazing.

If PFT_defaults = 'CESM2' then the user can not change autotroph_cnt,
zooplankton_cnt, or max_grazing_prey_cnt via put_setting() calls. This is
enforced in the add_var subroutine. Additionally, if PFT_defaults =
'user-specified' then the user MUST specify these three parameters (as well as
all the PFT parameters); this is also enforced in the add_var subroutine.

Still to do:

1. If user wants to add a new PFT_defaults option, then the three PFT_cnt
variables will need to be updated somehow. I think

call update_PFT_count()

after the add_var('PFT_defaults') call is the best option.

2. Should there be error-checking based on PFT_cnt before calling
set_to_default()?
Now have four distinct groups of parameters:

1. general_parms (was pre_tracers1)
2. PFT_counts (autotroph_cnt, zooplankton_cnt, and max_grazer_prey_cnt formerly
   in pre_tracers1)
3. PFT_derived_types (was pre_tracers2)
4. tracer_dependent (was post_tracers)

In phase (2), if PFT_defaults = 'user-specified' then all PFT counts = -1 (and
must be changed with put_setting())

Also, cleaned up some logical variable names:
1. editable -> nondefault_allowed
2. must_be_put -> nondefault_required
3. allow_edit -> allow_nodefault
4. edit_attempt -> nondefault_val

And I needed to add PFT_defaults to the list of parameters not changed in the
get_put unit test.
marbl_internal_types  -> marbl_interface_private_types
marbl_interface_types -> marbl_interface_public_types

(And lots of modules use those two...)
Since CESM generates an inputfile list, there are several parameters in MARBL
where changing the default value in the fortran code does not change the CESM
output (the value is over-written with whatever is in the inputfile). To help
users avoid making this mistake, we have added a comment in the code for each
variable that is put into marbl_in.

When MARBL can build its own inputfile (and CESM calls the marbl utility rather
than POP's build-namelist), the comment will be updated to no longer be
CESM-specific.
Introduce new marbl_utils_mod that currently has a single subroutine:
marbl_utils_str_to_array. This subroutine looks for a provided separator and
returns an array with each element containing substrings that occur between
separators, ignoring separators that appear between "" or '' blocks.

For example:

character(len=9), allocatable :: array(:)
call marbl_utils_str_to_array("ABC, DEF, GH, IJKL", ",", array)

returns

array(1) = "ABC      "
array(2) = " DEF     "
array(3) = " GH      "
array(4) = " IJKL    "

character(len=9), allocatable :: array(:)
call marbl_utils_str_to_array("ABC, 'DEF, GH', IJKL", ",", array)

returns

array(1) = "ABC      "
array(2) = " DEF, GH "
array(3) = " IJKL    "

This is used twice in put_inputfile_line; once to string out comments

call marbl_utils_str_to_array(inputfile_line, "!", line_out)
! line_out(1) contains everything up to the first ! not in "" or '' block

and once to determine if the value to the right of '=' in the inputfile is an
array.

Next steps:

1. move linear_root() from marbl_diags to marbl_utils_mod.F90
2. unit test both subroutines

One flaw right now -- if the input string produces an array element longer than
the declared length of array, instead of catching the error cleanly the program
crashes. So marbl_status_log should be an intent(inout) and that error should
be caught.
pass str and expected_array to a subroutine, nothing else needs to change from
test to test
pass x, y, and expected root to a subroutine, nothing else needs to change from
test to test
Added a few more linear root tests (included two that expect to not find root),
as well as a few more str_to_array tests. Also created strip_comments_test that
mimics what we do when parsing input file lines -- uses '!' instead of ',' as a
delimiter and makes sure that the first element of array is returned correctly.
put_setting("ciso_on", .true.)

is equivalent to

put_setting("    ciso_on", .true."

and

put_setting("   ciso_on = .true.")
No longer need gnu 5 or later, so testing 4.8 on a machine
Need -wmismatch for MPI_Bcast(), otherwise parallel build fails
1. Better output from new unit test for marbl_utils (and additional tests)
2. Better comments in marbl_utils.F90
3. Better variable names in marbl_utils
@mnlevy1981 mnlevy1981 merged commit a42bf26 into marbl-ecosys:master Sep 11, 2017
@mnlevy1981 mnlevy1981 deleted the enhancement/runtime_PFT_count branch September 12, 2017 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant