Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleanup of struct NC_Dispatch #1056

Open
DennisHeimbigner opened this issue Jul 10, 2018 · 7 comments
Open

Cleanup of struct NC_Dispatch #1056

DennisHeimbigner opened this issue Jul 10, 2018 · 7 comments

Comments

@DennisHeimbigner
Copy link
Collaborator

Proposed changes/cleanup of the NC_Dispatch structure

  • add a version number as the first field. For back compatibility
    purposes, the version number should be detectable as different
    from the current first field, the model.

  • Make it a requirement that all fields be provided, even
    if netcdf-4 is not enabled. That is, the set of fields is not
    to be conditional on any flag.

  • Hide the initialsz, basepe, chunksizehintp parameters in the
    (*create) field by moving them into the parameters argument.

  • Hide the basepe, chunksizehintp parameters in the
    (*open) field by moving them into the parameters argument.

  • The (*enddef) field has extra parameters solely for the use of MPI:
    h_minfree,v_align,v_minfree,r_align. Convert these to a single parameter
    block (like in, say, (*close)).

  • Remove (*inq_base_pe) and merge into (*inq).

  • Create a new field called (*set) that sets various parameters at the
    top-level ncid level.

  • Merge (*inq_format), (*inq_format_extended), (*set_fill), & (*set_base_pe)
    into the new (*set) operation.

  • Remove (*get_varm) and (*put_varm) and always use NCDEFAULT_get/put_varm

  • Modify (*inq_var_all) to remove the deflate arguments since they
    can be handled by the idp+nparams+params filter arguments.

  • Merge (*def_var_deflate) into (*def_var_filter)

  • Merge (*var_par_access) into (*def_var)

  • Merge (*def_var_fill) into (*def_var)

  • Remove (*show_metadata) since this is actually a debugging function
    [This actually should be removed from netcdf.h, but probably not now]

  • Merge (*inq_unlimdim) and (*inq_unlimdims)

  • (Optional) create a (*inq_grp_all) similar to (*inq)
    and merge (*inq) into (*inq_grp_all)

  • (Optional) remove (*get_vara) and (*put_vara) and convert them
    to calls to (*get/put_vars).

@edhartnett
Copy link
Contributor

These are all good, except the last one.

There are lots of cases for user defined formats, where the user defines only the get/put_vara functions, and leaves the vars functions to the NCDEFAULT version, which just calls the vara function over and over again to make up the vars call. In fact, this is how the HDF4 layer works, and also the AB Dispatch layer (part of a separate library, the first of the external user-defined formats).

So the vara functions are very useful in the dispatch layer, IMO. They will be easier to write for user-defined formats than the vars functions, for most formats.

@DennisHeimbigner
Copy link
Collaborator Author

Good point about var/vars.
We might also want to consider some more structured way to pass a void* parameter;
perhaps something like the HDF5 properties mechanism.

@edhartnett
Copy link
Contributor

There is a fundamental difference between netcdf properties and HDF5 property lists, which is that HDF5 property lists are exposed to the cold, brutal hands of the user, whereas the netcdf properties will only be touched by dispatch-layer netcdf programmers, never the end user.

Right now we have void *parameters in open and create. According to your proposed changes, we would also have them in a few other places. These amount to defacto properties lists for these functions.

How about defined structs?

So for example a struct NC_OPEN_PROPERTIES, which contained all the fields that would otherwise end up in a void *parameter in the dispatch nc_open().

New parameters can just be added to the struct without breaking any existing code.

@edhartnett
Copy link
Contributor

Seems like nc_inq_format_extended() could be moved out of the dispatch table and implemented in libdispatch in a way that would work for all dispatch layers.

@edhartnett
Copy link
Contributor

edhartnett commented Aug 19, 2018

I suggest we make all dispatch functions that take ncid as a parameter instead get a pointer to the NC object for the file.

Currently we look up the file 3 times for a call to nc_put_vara_int().

1 - We look it up (and don't use it) in nc_put_vara_int() (and this lookup is removed in a pending PR I have up.)
2 - We look it up in NC_put_vara() (dvarput.c) so we can find the correct dispatch layer.
3 - We look it up in NC4_put_vars() (hdf5var.c) when the dispatch layer is called.

The reason is that we always pass ncid, but we already have to look up NC from ncid in order to identify the dispatch layer. So we should just pass *NC (which contains the ncid), instead of passing the ncid and looking up NC again.

As we have seen, some users open thousands of files at a time. So looking up the file twice for each operation is something we should avoid.

Unfortunately, this is a big change that will touch many, many code files at once.

@DennisHeimbigner
Copy link
Collaborator Author

I remember that I did this for a reason, but I no longer remember the reason.
It may have had something to do with recursive calls into the API.
Libdap2 code uses the netcdf-3 API and does not have the dispatch table.

Let me see if I can recall the rationale.

@edhartnett
Copy link
Contributor

I ran into the recursion problem when working on PIO and netCDF. The answer is to call the nc3_* version of the functions, and not invoke the dispatch layer at all.

With the changes I am proposing, the libdap2 code should look up the NC and call the function in libsrc directly (passing pointer to NC and not ncid). The libdap2 code is only every going to want the classic version of the function, so there is no need to use the dispatch layer.

No doubt that is some effort but I don't think that looking up every file twice, for every operation, is a good solution. I would be happy to make the changes, under your supervision, if that would be helpful.

DennisHeimbigner added a commit to DennisHeimbigner/netcdf-c that referenced this issue Jul 20, 2019
Partially address: Unidata#1056

Currently, some of the entries in the dispatch table
are conditional'd on USE_NETCDF4.

As a step in upgrading the dispatch table for use
with user-defined tables, we remove that conditional.
This means that all dispatch tables must implement the
netcdf-4 specific functions even if only to make them
return NC_ENOTNC4. To simplify this, a set of default
functions are defined in libdispatch/dnotnc4.c to provide this
behavior. The file libdispatch/dnotnc3.c is also relevant to
this.

The primary fix is to modify the various dispatch tables to
remove the conditional and use the functions in
libdispatch/dnotnc4.c as appropriate. In practice, all of the
existing tables are prepared to handle this, so the only
real change is to remove the conditionals.

Misc. Unrelated fixes
1. Fix some annoying warnings in ncvalidator.

Notes:
1. This has not been tested with either pnetcdf or hdf4 enabled.
   When those are enabled, it is possible that there are still
   some conditionals that need to be fixed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants