Expose dispatch layer so that users can add their own local binary format #177
Comments
OK, I now have this working on HPC NetCDF. It's really neat. (@wkliao this is one of the interesting new features I mentioned). To see it in use, take a look at this github project: In this project, HPC NetCDF is used, and a new netcdf function is called: nc_def_user_format(). After that function is called, the AB Dispatch code is used to read the file. At the moment I've just stuck the HDF4 code in there, so instead of reading AB format, the AB Dispatch code is reading HDF4. ;-) However, that code just has to be swapped out for AB format code that does the same thing (i.e. read the file and present it in the netCDF data model). Here's how it looks in code:
I have added 2 user formats, NC_UF0 and NC_UF1, so the user can define two different formats at the same time, and use the netCDF API to interchangeably read/write them. It's a neat feature. I will announce it on the netCDF mailing list sometime this week. It's going to take time to get this into Unidata netCDF, due to the backlog of my PRs. To prevent out of order merging I'll hold off on putting this up until all or most of my currently-pending PRs are merged. |
Duplicate of comment at: #814 (comment)
|
One critical issue is telling the netcdf library where and how to load the |
But this turns out not to be necessary with this solution. The user-format dispatch layer is a library, which uses the netCDF library. (And the netCDF library does not have to be re-compiled.) Then the user writes a program, which uses the dispatch layer library he/she has just written. And that library uses netCDF. So our code does not have to deal with .so or .dll files. All functions are available at program compile time, either in the user-dispatch library or the netCDF library. |
From my point of view, this is a bad and complicated solution. |
Let me paraphrase your proposal and tell me where I have it wrong.
|
First there are some changes in netCDF:
1 - I added function nc_def_user_format() which takes a pointer to a dispatch table, a mode flag (NC_UF0 or NC_UF1), and an optional magic number (not yet used).
2 - I modified the netCDF dispatch code in dfile.c so that when it gets NC_UF0 or NC_UF1 it uses the provided user-format dispatch table. (Magic number modifications still to come.)
Now for a user to implement their own binary format:
1 - User writes a library that implements the required netCDF dispatch functions. It is not a wrapper for netCDF. (Example here: https://github.com/HPC-NetCDF/ab-dispatch). It's the same code as lives in the existing dispatch directories (like HDF4), but it lives in its own repo and library. It is not mixed with netCDF code (but does use netCDF headers). The library contains a dispatch table, with pointers to all the functions needed.
2 - Only the functions needed in a dispatch layer are required. It can easily reuse standard functions that will work (as the HDF4 layer reuses all the existing nc_inq_* functions). So not every function is implemented.
3 - The user writes a program that links to both netCDF and their own, separately compiled and installed, dispatch library. (It actually does not have to be separate from the user application, but it is separate from netCDF.)
4 - The user code calls nc_def_user_format(), with a pointer to the dispatch table in the user-format dispatch library. In this call, user also associates one of two pre-defined mode flags (NC_UF0 or NC_UF1) with the format.
5 - From then on, NC_UF0 in the mode flag means use this custom dispatch table. The netCDF library is not recompiled or wrapped.
6 - Other modeling groups, working on the same HPC with the same netCDF installation, can define their own NC_UF0, by writing and linking to their own dispatch layer. Each program can have up to two user-defined formats in use at the same time (in addition to all the usual netCDF formats).
This way the code for the user-defined dispatch layer is completely and always outside the netCDF code. Users can develop dispatch layers, exchange them, or whatever. They can write a custom dispatch layer for a proprietary format, and not release it. It's a separate library, which plugs in to netCDF. Dynamic loading is taken care of when a pointer to the dispatch table is passed to nc_def_user_format(). So this is quite different from the current situation, where each dispatch layer has to be built into the library, and then the library must be rebuilt. |
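The registration idea described above can be modelled in a few lines of plain C. This is only a toy sketch and not netCDF code: every name in it (toy_dispatch, toy_def_user_format(), toy_open(), TOY_UF0, the error values) is hypothetical, and the real dispatch table has many more entries. It shows a mode flag meaning nothing until a table is registered for it, and a read-only layer whose create function simply refuses.

```c
#include <stddef.h>

/* Toy stand-ins only; none of these names or values come from netCDF. */
#define TOY_UF0 0x0040
#define TOY_UF1 0x0080
#define TOY_NOERR 0
#define TOY_EINVAL (-1)   /* bad argument */
#define TOY_EPERM (-2)    /* operation not permitted (read-only layer) */
#define TOY_ENOFMT (-3)   /* no dispatch table registered for this flag */

/* A drastically reduced dispatch table: just two function pointers. */
typedef struct toy_dispatch {
    int (*open)(const char *path, int mode);
    int (*create)(const char *path, int mode);
} toy_dispatch;

static const toy_dispatch *uf_tables[2] = { NULL, NULL };

/* Pick the slot that a mode flag refers to, or NULL if it has none. */
static const toy_dispatch **slot_for_mode(int mode)
{
    if (mode & TOY_UF0) return &uf_tables[0];
    if (mode & TOY_UF1) return &uf_tables[1];
    return NULL;
}

/* Associate a dispatch table with one of the two user-format flags. */
int toy_def_user_format(int mode_flag, const toy_dispatch *table)
{
    const toy_dispatch **slot = slot_for_mode(mode_flag);
    if (table == NULL || slot == NULL) return TOY_EINVAL;
    *slot = table;
    return TOY_NOERR;
}

/* Route an open call through whichever table the mode flag selects. */
int toy_open(const char *path, int mode)
{
    const toy_dispatch **slot = slot_for_mode(mode);
    if (slot == NULL || *slot == NULL) return TOY_ENOFMT;
    return (*slot)->open(path, mode);
}

/* Route a create call the same way. */
int toy_create(const char *path, int mode)
{
    const toy_dispatch **slot = slot_for_mode(mode);
    if (slot == NULL || *slot == NULL) return TOY_ENOFMT;
    return (*slot)->create(path, mode);
}

/* A read-only user layer: open succeeds, create is refused. */
static int ab_open(const char *path, int mode)
{
    (void)path; (void)mode;
    return TOY_NOERR;
}

static int ab_create(const char *path, int mode)
{
    (void)path; (void)mode;
    return TOY_EPERM;
}

static const toy_dispatch ab_table = { ab_open, ab_create };
```

In this model a program would call toy_def_user_format(TOY_UF0, &ab_table) once at start-up; after that, any toy_open() call carrying TOY_UF0 in its mode is served by the AB functions, while TOY_UF1 stays unusable until something is registered for it.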
Here's a little sneak preview of some code. This is all working code, BTW.
Later, in NC_open():
Then, in my sample user-defined format library, I have:
Note the use of NC4_ and NCDEFAULT functions to lighten the load. I only implemented the functions that start AB_, and many of those just return NC_EPERM since this is read only. To use the AB dispatch layer, I do this:
|
Ok, I see. It is actually a pretty good solution. We ought to propose I might suggest we provide an extra function for the dispatch table However, this solution would not work for existing systems that One problem I see is mode flag conflicts. We have a limited set of |
This system will work for R, IDL, Fortran, and any other layer that sits on top of the C library. BUT the user-format dispatch layer must be written in C. And there must be one C call to set the user format (although we can probably do this in most languages with a void *.) Once that is done, it works completely within the C layer. If the Fortran program passes NC_UF0 in to nf_open() it will all work. Same with python, as it uses the C library. It does nothing for Java but they already have this feature. We are just trying to catch up. ;-) For most HPC users and modeling groups, this will be useful. There will be one or two non-netCDF formats, and some C programmer will write the dispatch layer, and all the fortran programmers will just link to the user-format dispatch library, just as they do to netCDF. Useful formats can even be distributed by helpful folk like @opoplawski if they become popular, and then users can just install them with some package management tool. The user does not add mode flags. I provide two: NC_UF0 and NC_UF1. This allows two different user-defined formats to be in use in the same program. There are still some mode flags available after that. I agree about the eventual mode flag scarcity problem, but we have not run out yet. So I would propose to separate any discussion of mode flag scarcity from the issue of user-defined formats. |
Some other programming notes:
|
This does not sound right. R and IDL are interpreters and have AFAIK no way There are a different set of problems for systems like fortran, java, and python. |
In thinking about this, I do not see why you need to use NC_UF0 and NC_UF1. |
Ok, I take back part of what I said. |
I believe that python, IDL, etc. already have a mechanism for dynamically loading C libraries. So they would use that method to load the dispatch library. All of these languages will let you wrap a C function that actually calls nc_def_user_format(), so it is not even necessary to wrap that in Fortran, or any other language. As part of their dispatch layer use in a python program, they must load their dispatch library, then they must call a C function that calls nc_def_user_format(). From that point on, they are good to go. Yes, for fortran and C programs the dispatch layer can just be built right into the user code. However, I am providing a sample library dispatch layer, so probably it will be easiest for users to just copy that and make it a separate library. I will also say that the primary requirement here is C and Fortran. This is for models and they are not written in Python. However, as I say, I believe it will work just fine for python too. |
To answer your questions about NC_UF0 and NC_UF1:
So those are my reasons. Not married to anything about this implementation, it's just my first pass. |
Effectively you are just reserving some # of bits (2 in your case) in the mode flag |
One thing to remember is that there is no defined mapping between those mode flags |
Yes and yes. The mode flags mean nothing until nc_def_user_format() is called to associate them with a dispatch table. If the user uses a different dispatch table and then tries to read NC_UF0, then obviously it will not work. |
@DennisHeimbigner I've been thinking about your remarks on ncdump on another thread. Now I understand. Without recompiling ncdump, it cannot understand user defined formats. I think here is the case where libraries must be dynamically loaded by our code. If a new command-line argument is added to ncdump which is the user-defined library name, then that library can be dynamically loaded in ncdump, and then ncdump can call its dispatch functions. |
Before this goes much further, we need to have |
Some other issues that need to be addressed:
|
Another issue to consider: |
OK, I will assemble some documentation. Also, in light of these conversations I will do some of the dispatch work we have discussed first, to clear away the underbrush and make this feature's implementation more clear. So let me return to this after #856. That will make the necessary interfaces much more clear. |
@DennisHeimbigner just joking about documentation! No resistance to documenting at all. I love documenting. The hard thing is to get me to shut up. After the current HDF4 changes are merged I have another HDF4-only change that incrementally moves towards some of these changes. Basically I take the HDF4 only fields out of the netCDF-4 info structs and use a void pointer, as we discussed. In the case of HDF4, the void pointer points to the HDF4 info struct. This cleans the HDF4 stuff out of the header files, and will allow me to start to isolate the internal API that is needed by the dispatch layer code. As part of that I will start to add to the internal dispatch documentation, as needed, to explain the changes. |
There has been a lot of development of the user-defined format feature in HPC NetCDF. And it's a very powerful feature. I have been writing the ab-dispatcher, a dispatch layer for the AB format used in the HYCOM model. Recently I have gotten ncdump working with user-defined formats in a general way. Now I can do an ncdump on an AB format file! It's neat! In order for this to work, the user must build the user-defined format library, and then provide information about it at netCDF compile time. Then, with the use of the new -u option, ncdump can read files of user-defined format.
|
Great job getting it to work with ncdump, this looks really interesting, congrats! |
Agreed. But (there is always a but...:-) |
There are two user-defined formats that can be used UF0 and UF1. When you build HPC netCDF you can provide an option like this: --with-uf0=AB_dispatcher (If this is provided, you must also provide LDFLAG and LADD options to point to the library that implements a dispatch table called AB_dispatcher, or the build will fail.) This will tell ncdump that user-format 0 is associated with the AB_dispatcher dispatch table. ncdump registers the new dispatch table before calling nc_open(). When running ncdump, the user must provide the -u0 option, to indicate that user-format 0 should be used for this file. This causes ncdump to use the NC_UF0 for the nc_open() mode flag. A different user format could also be used as user format 1. It's up to the user to keep straight which is which for their build. (But if they get it wrong, ncdump returns an unknown format error, so that's OK.) Not sure what you mean by probing. Does the above answer your questions? |
No. Put another way, how does the library locate the relevant dispatch table? |
The library locates the relevant dispatch table using the symbol "AB_dispatcher" (provided to netCDF configure) which is the name of the dispatch table in my dispatch library. The library is specified by adding it to the LIBS flag when building netCDF. So I can name my dispatch table whatever I like. When I build netCDF, I say --with-uf0=my_dispatch_name, and make sure that the linker can find my dispatch library. The configure defines macro UF0_DISPATCH in config.h. When ncdump is built, the symbol is substituted like this:
This will of course all be fully-documented in a new section of the docs for user-defined formats. To address some of your other recent comments:
|
I have added to my todo list an item to extend this |
Introduction
In order to enable the HYCOM model to be updated to the netCDF API, and to provide a general capability to users, I propose to expose the netCDF-C dispatch layer so that users can add their own local format. This feature, already available in the netCDF-C library internals, is not currently exposed to users.
Background
HYCOM Model and Data Format
The HYCOM modeling team would like to upgrade their model to netCDF, in part to take advantage of the ParallelIO library, which provides a standard way to ensure good performance in the I/O layer, specifically in the case of asynchronous I/O to a subset of processors using parallel I/O to write to disk.
Many legacy tools have been written based on the existing HYCOM native binary format ("AB Format"). Converting the model to netCDF would break all existing tools, leading to loss of significant investment of time and effort, and would disrupt work by forcing all necessary tools to be converted at the same time.
Providing an Upgrade Path for HYCOM
With the NetCDF-C dispatch layer exposed, an external plug-in can be provided for the netCDF-C library. Once this plug-in is in place, HYCOM programmers can convert model code to the netCDF API, while still reading and writing their local binary format.
All tools based on the netCDF-C library will transparently read the AB Format (including all tools in Fortran 77, Fortran 90, C++, Python, Perl, Mathematica, etc.). These tools will also be able to write the AB Format with the netCDF API, allowing them to be converted one at a time.
This will give HYCOM modelers an upgrade path for their existing large base of code which works with the local binary format. One by one, as time permits, these tools can also be upgraded to the netCDF API.
Eventually, when all tools are converted, the arguments of the nc_create commands in the HYCOM model can be changed to use one of the supported netCDF formats. All tools will transparently read netCDF files, and files in their local format. Historical data (in AB Format) will still be transparently available to netCDF programs. When a file is opened, the netCDF-C library will recognize it as an AB file, and use the plug-in code to read the file. User applications will be able to access the file through the netCDF API.
NetCDF-Java Plug-Ins
The netCDF Java library allows users to write a plug-in which allows them to support read/writes to their own binary format, through the netCDF Java API. The plug-in enables the netCDF library to read/write an arbitrary local format. This feature is widely used in the netCDF-Java community.
The proposed changes will bring this capability to the netCDF-C user community.
General Applicability for the NetCDF Community
Although this work will directly benefit the HYCOM model, the need for an upgrade path from a local format to netCDF is general. I propose that this become part of the netCDF release and become available to all netCDF-C library users.
Technical Approach
The solution consists of two parts: a plug-in to read and write AB Format, and the changes to the C library necessary to support use of the plug-in, and others like it.
Changes to NetCDF-C to Expose the Dispatch Layer
The C library contains a "dispatch" layer which allows for other formats to be added to the C library. This is how netCDF-4/HDF5, OPENDAP, HDF4, and the parallel-netcdf library are currently supported.
But the C library does not expose this plug-in capability. Changing the C library source code is required in order to add a new format.
For the purposes of this discussion, AB Format, and other such formats, will be called "user-defined formats".
Dispatch Layer Library
The user is responsible for building a dispatch layer library.
This library, when linked with netCDF and a user application, allows the netCDF library to understand the user-defined format. The user-defined format dispatch library implements all the necessary functions from the netCDF dispatch table. (This is a subset of the entire netCDF API).
The user-defined format is registered with a new function, nc_def_user_format().
The AB Format
The local format for the HYCOM model is called "AB Format." It consists of two files, one with metadata, and one with the data stored in binary form.
An AB "file" consists of two files on disk, a binary file with extension "a", and an ASCII text file with metadata with a "b" extension. (The names of the files are part of the format definition.)
The A file is IEEE big-endian, direct access, with a fixed record length.
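As a small illustration of what "IEEE big-endian, direct access, fixed record length" implies for reading code, the sketch below decodes one value and locates one record. It assumes 4-byte single-precision values and a host whose float is IEEE 754 binary32; neither the element size nor the actual record length is stated here, and the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Decode one IEEE 754 single-precision value stored big-endian.
 * Assembling the bits by shifting makes this independent of host byte order.
 * Assumption: the host float type is IEEE 754 binary32. */
float be_bytes_to_float(const unsigned char b[4])
{
    uint32_t bits = ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
                    ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
    float value;
    memcpy(&value, &bits, sizeof value);  /* reinterpret without aliasing issues */
    return value;
}

/* Byte offset of a zero-based record in a fixed-record-length file.
 * With direct access, record k always starts at k * record_len_bytes. */
long long record_offset(long long record_index, long long record_len_bytes)
{
    return record_index * record_len_bytes;
}
```

A reader would seek to record_offset(k, reclen), read the record into a byte buffer, and pass each group of four bytes through be_bytes_to_float().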
Converting the AB Format to the NetCDF Internal Model
For the netCDF library to read and write files in AB format, the plug-in must read the metadata file, and provide functions in accordance with the internal needs of the netCDF library. Reading and writing are handled separately.
Reading the AB Format
When reading the AB format the netCDF library needs functions which will:
Identifying AB Files
AB Files can be identified by name. When an nc_open call is made, and the file name is not found, but two files "FILENAME.a" and "FILENAME.b" are found, then the library knows it will be opening an AB file.
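The name-based check could look roughly like the following. This is a sketch under assumptions: the helper names are hypothetical, existence is tested simply by trying to open the file for reading, and the 4096-byte path buffer is an arbitrary limit.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if the path can be opened for reading, 0 otherwise. */
static int file_exists(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL) return 0;
    fclose(f);
    return 1;
}

/* Return 1 if `name` itself is absent but both name.a and name.b exist. */
int looks_like_ab_file(const char *name)
{
    char path[4096];

    /* Need room for the name, a two-character suffix, and the NUL. */
    if (strlen(name) + 3 > sizeof path) return 0;

    /* If the plain name exists, it is some other kind of file. */
    if (file_exists(name)) return 0;

    snprintf(path, sizeof path, "%s.a", name);
    if (!file_exists(path)) return 0;

    snprintf(path, sizeof path, "%s.b", name);
    return file_exists(path);
}
```

A real implementation would probably prefer stat() or access() where available, since opening a file only to close it again is a blunt way to test for existence.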
Reading AB File Metadata
Once a file has been identified as AB Format, the netCDF library will need to read and understand all the metadata pertaining to that file. This includes the names, types, and sizes of all variables, attributes, and dimensions in the file.
With AB format, code will be written to read the .b file, which holds the metadata, and provide metadata information to the netCDF library.
Reading Subsetted Arrays of AB File Data
The core data read operation of netCDF is nc_get_vara_TYPE (where TYPE varies). This call allows the user to read a subset of a data variable into an n-dimensional array. With this operation, the netCDF-C library implements most of the other read operations.
In the case of AB Format the subsetted array read will be straightforward. Based on the information in the metadata file, the exact offsets to any element of the data can be computed.
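To make the offset arithmetic concrete, here is a minimal sketch of a start/count subset read on a 2-D, row-major array held in memory. The function name and the 2-D restriction are assumptions for illustration; the actual layout of fields within an A file is not described here, and a real plug-in would read from disk rather than from a buffer.

```c
#include <stddef.h>

/* Copy a start/count block out of a 2-D row-major array, in the spirit of
 * the vara calls. Returns 0 on success, -1 if the block does not fit. */
int get_vara_2d(const float *src, const size_t dims[2],
                const size_t start[2], const size_t count[2], float *dst)
{
    /* Bounds checks written to avoid unsigned overflow. */
    if (start[0] > dims[0] || count[0] > dims[0] - start[0]) return -1;
    if (start[1] > dims[1] || count[1] > dims[1] - start[1]) return -1;

    for (size_t i = 0; i < count[0]; i++) {
        /* Element offset of the first requested value in row start[0] + i. */
        size_t row_off = (start[0] + i) * dims[1] + start[1];
        for (size_t j = 0; j < count[1]; j++)
            dst[i * count[1] + j] = src[row_off + j];
    }
    return 0;
}
```

For a file on disk, the same row_off value multiplied by the element size gives the byte position to seek to before reading count[1] contiguous values.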
Writing the AB Format
In writing the AB Format, the netCDF library must:
Creating an AB File
The config file which is read at library load will contain the mode flag which indicates the AB format.
Writing the AB Metadata File
Once the nc_create call returns, no further disk access takes place until the nc_enddef() call. During the time between nc_create() and nc_enddef() the user can define variables, attributes, and dimensions for the file.
When nc_enddef() is called, the AB Plug-in code must write the B file, which contains the metadata for the file.
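The deferred-write behaviour can be sketched as follows: definitions accumulate in memory while the file is in define mode, and nothing is written until the enddef step. All names here are hypothetical, and the one-name-per-line output is only a placeholder, since the real B file layout is not specified in this proposal.

```c
#include <stdio.h>
#include <string.h>

#define META_MAX_VARS 16
#define META_MAX_NAME 32

/* In-memory record of what has been defined so far. */
typedef struct toy_meta {
    char vars[META_MAX_VARS][META_MAX_NAME];
    int nvars;
    int define_mode;   /* 1 between create and enddef, 0 afterwards */
} toy_meta;

/* Start a new file in define mode; no disk access happens here. */
void meta_begin(toy_meta *m)
{
    memset(m, 0, sizeof *m);
    m->define_mode = 1;
}

/* Record a variable name in memory only. */
int meta_def_var(toy_meta *m, const char *name)
{
    if (!m->define_mode || m->nvars >= META_MAX_VARS) return -1;
    if (strlen(name) >= META_MAX_NAME) return -1;
    strcpy(m->vars[m->nvars++], name);
    return 0;
}

/* Leave define mode and write all collected metadata in one pass.
 * Placeholder layout: one variable name per line. */
int meta_enddef(toy_meta *m, FILE *out)
{
    if (!m->define_mode) return -1;
    for (int i = 0; i < m->nvars; i++)
        if (fprintf(out, "%s\n", m->vars[i]) < 0) return -1;
    m->define_mode = 0;
    return 0;
}
```

Collecting everything first means the metadata file is written once, with full knowledge of every variable, rather than being patched as each definition arrives.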
Only the netCDF-classic data model will be supported by the AB Format Plug-in. Other restrictions on types and objects may be necessary.
Subset Write Operation
The netCDF subset write operation nc_put_vara() is the key write operation from which other write operations are built. Once implemented for the AB format, all netCDF write operations will work for the files.
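The way other writes reduce to the subset write can be shown with a 1-D toy. This is an illustrative sketch with hypothetical names operating on an in-memory array, not the library's implementation: a single-element write is a subset write with count 1, and a whole-variable write is a subset write starting at 0 and covering the full length.

```c
#include <stddef.h>

/* The one primitive: write `count` values beginning at index `start`.
 * Returns 0 on success, -1 if the range does not fit in the variable. */
int put_vara_1d(float *var, size_t len, size_t start, size_t count,
                const float *src)
{
    if (start > len || count > len - start) return -1;
    for (size_t i = 0; i < count; i++)
        var[start + i] = src[i];
    return 0;
}

/* Single-element write, expressed as a subset write of length 1. */
int put_var1_1d(float *var, size_t len, size_t index, float value)
{
    return put_vara_1d(var, len, index, 1, &value);
}

/* Whole-variable write, expressed as a subset write over everything. */
int put_var_1d(float *var, size_t len, const float *src)
{
    return put_vara_1d(var, len, 0, len, src);
}
```

Only put_vara_1d() touches the storage, so a plug-in that implements that one operation for its format gets the other two without further format-specific code.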
Conclusion
Exposing the netCDF-C library dispatch layer will benefit the HYCOM modeling group, and also other netCDF-C users (including all users of netCDF from Fortran, Python, Perl, etc.).
Comments and feedback welcome. Please add to this issue so that all can participate in discussion.