Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

updates to openbf and closbf for eventual unified build #217

Closed
wants to merge 3 commits into from

Conversation

jbathegit
Copy link
Collaborator

Part of #78

Following @jack-woollen's diligent efforts to resolve #195, I'm now going to push additional commits which move us closer to completion of #78, as well as #20 and #79. I'm going to do this as a number of successive PRs, to make code review easier.

This first PR covers the necessary updates for subroutines closbf and openbf, and subsequent commits will contain similar updates for other routines. This may look like a lot, but the crux is that closbf and openbf are now each accessed via a Fortran90 generic interface encapsulated within a module, and that's where I'd ask you to focus your review. In contrast, most of the other changed files are to simply add use statements for these new modules within all of the other internal library routines which call closbf and openbf.

As for the test codes and utilities, those are all similarly changed as well, but in this case to add a single new use bufr_procedures statement. The new bufr_procedures module will eventually include all of the new generic interfaces, and that way the test codes and utilities only ever need to add this one new statement within their source code. In turn, this also becomes a model for how all other external applications will eventually need to similarly add just that one new statement within their source code, in order to access all of the new generic interfaces that will eventually be contained within the library.

@jbathegit
Copy link
Collaborator Author

One other design point I forgot to mention - within the _4, _d and _8 generic interface routines, you'll note that some local my_ variables are currently being declared, because we're still building separate _4, _8, and _d builds of the library. However, once all of the necessary PRs are implemented and we're able to actually consolidate down to a single library build (which will be analogous to the current _4 build), then at that point we'll only need to retain local my_ variables for any _d or _8 generic interface routines. In other words, at that point we can then go back and remove all of those same statements from within the _4 generic interface routines, and instead just pass all of the call arguments to those routines directly through and into the corresponding _body routines!

I hope that makes sense!

@jack-woollen
Copy link
Contributor

I don't see why any app compiled as 4 or d needs any additional statements to work with bufrlib_4. Obviously app codes compiled with -i8 need to add something to compensate when using bufrlib_4, but it seems just as easy to me for the app code to define necessary integer arguments to bufrlib as i*4, rather than use the module approach. Or even better, compile the app code as d. In the latter case no other changes are needed, are they? Assuming of course the app code doesn't really need all integers to be 8 bytes. What am I missing?

@jbathegit
Copy link
Collaborator Author

jbathegit commented Jun 8, 2022

Hey Jack, thanks for the quick reply. Your comments made me realize that there was an additional simplification I could do, and which I've now pushed up as another commit. This is based on the fact that we now know our final target is indeed going to be the 4-byte build, whereas prior to your work on #195, we weren't sure if our final unified build was going to be 4-byte or 8-byte. Plus we now have testing coverage in place for all of the different IN and OUT test codes across each of the _4, _8, and _d variations, which also sets up us nicely for the eventual transition to the unified 4-byte build.

Having said that, let me try to address your broader question. The overall goal here is to minimize changes to all of the countless application codes that are out there, by making the library itself agnostic w.r.t. how it's being called. In other words, to have one library build that works no matter whether it's linked to an application code which calls it using 4-byte or 8-byte integers (and/or 4-byte or 8-byte reals), and where the library itself automatically figures all of that out on the fly and adjusts accordingly, rather than requiring every user to modify their own application code and/or makefile. With the exception of the real*8 usr arrays for ufbint, ufbrep, etc. and a handful of other variables, the library up until now has never been explicit about whether application codes need to pass in 4-byte or 8-byte integers and reals, and to do so now would necessitate a major coordination effort across who-knows-how-many production and other application codes, both within and outside of NCEP. Instead, it's much less impactful to the broader community if we can create a single library build which works automatically for all such existing variations, and the only way to do that in Fortran is via the use of generic interfaces, which in turn can only be implemented inside of modules in Fortran90.

So that's the overall design philosophy here, and it's not too complicated to implement. Let me step you through the openbf case, to hopefully better explain this:

module subroutine_openbf

    private
    public openbf

    interface openbf
        module procedure openbf_4_d, openbf_8
    end interface

    contains

    subroutine openbf_4_d( lunit, io, lundx )
!       used when call arguments to openbf are 4-byte integers

        implicit none

        integer(kind=4), intent(in) :: lunit, lundx
        character(len=*), intent(in) :: io

        call openbf_body( lunit, io, lundx )

    end subroutine openbf_4_d

    subroutine openbf_8( lunit, io, lundx )
!       used when call arguments to openbf are 8-byte integers

        implicit none

        integer(kind=8), intent(in) :: lunit, lundx
        character(len=*), intent(in) :: io

        integer :: my_lunit, my_lundx 

        my_lunit = lunit
        my_lundx = lundx

        call openbf_body( my_lunit, io, my_lundx )

    end subroutine openbf_8

end module

How this works is that external codes still call the handle openbf as before, but instead of being the name of a subroutine, this handle is now a generic interface which links to the two private subroutines openbf_4_d and openbf_8. Furthermore, using the explicit declarations for lunit and lundx within each of these two private subroutines, the interface automatically figures out at run time whether it was called by the external application with lunit and lundx declared as 4-byte integers or 8-byte integers, and it then automatically passes control accordingly to the proper one of these two private subroutines. In the former case, and since we now know that the final unified target build is also going to be 4-byte integers, all we need to do is just pass these same call parameters directly through to openbf_body, which itself is the same openbf code logic that we've had in the library for years. Otherwise, in the latter case, the private subroutine first needs to copy the 8-byte integer call parameters into local variables and then pass those into openbf_body. At the moment this is superfluous because we're still building a separate _8 byte build of the library, and so these local variables and openbf_body parameters are already 8-bytes as well. But once we have a single, unified 4-byte build, then these local variables and openbf_body parameters will all be 4-byte integers, which is why we'll need to copy the 8-byte integer call parameters into these local variables before we can pass the local variables to openbf_body.

In short, I believe the former case is what you were asking about in terms of not "need(ing) any additional statements to work with bufrlib_4", and the latter case is what you were asking about in terms of "app codes compiled with -i8 need(ing) to add something to compensate when using bufrlib_4". But again, the idea is to have the single, unified build of the library figure that out and adjust automatically, rather than requiring all external application codes to change and now be explicit about how they call each of the various library routines.

I hope that makes sense(?)

@jack-woollen
Copy link
Contributor

Thanks for replying. I understand how the interface works. My problem is the need to put use module statements in every app code (every subroutine in fact) which calls any bufrlib routine. And in the bufrlib you will need to make any user accessible routines into modules, just like openbf and closbf which you have illustrated. All to solve a problem which I contend is a tempest in a teapot. All the apps that currently link to bufr_4 and bufr_d will work as is with bufr_4. That is the great majority of all app codes. I looked through prod makefiles and only found a handfull of codes which link to bufr_8. IMHO they could easily adapt to using the bufr_4 library by defining integer args to bufrlib routines as i*4, or by adapting to use the -i4 -r8 (d) compile of the app code.

If you really want to make numerous changes to bufrlib to accommodate a very small pool of bufr_8 users, there's better ways to do that I think. In order to save the great majority of users from having to change all their app codes, you could use much simpler code wrapping which would not affect any users other than those using bufr_8. It would require the use of a flag, similar to the lendat or other such flags, that would signify to bufrlib that all integers will be 8 bytes. All the user routines would then need an 8byte wrapper, but not in a module. As an example I've put a version of ireadmg which is written in this style. I tested using i8=t, along with readmg and copymg also with wrappers, to prove the concept and it worked fine. As you see I put the i8 flag in a common block just for testing but that can be in a module, controlled by a routine like datelen. The docblock is left off the example:

!--------------------------------------------------------------------------
!--------------------------------------------------------------------------
FUNCTION IREADMG8(LUNIT8,SUBSET,IDATE8)
INTEGER*8 LUNIT8,IDATE8
LUNIT=LUNIT8
IREADMG8=IREADMG(LUNIT,SUBSET,IDATE)
IDATE8=IDATE
END FUNCTION
!--------------------------------------------------------------------------
!--------------------------------------------------------------------------
FUNCTION IREADMG(LUNIT,SUBSET,IDATE)

COMMON /BYTES/ I8; LOGICAL I8
CHARACTER*8 SUBSET

IF(I8) THEN
I8=.FALSE.
IREADMG=IREADMG8(LUNIT,SUBSET,IDATE)
I8=.TRUE.
RETURN
ENDIF

CALL READMG(LUNIT,SUBSET,IDATE,IRET)
IREADMG = IRET

END_FUNCTION

@jbathegit
Copy link
Collaborator Author

I understand and agree it's not ideal that every application code which calls a library routine now needs to include a new use bufr_procedures statement, and I already brought this up in the lengthy discussion for #78. See the entries dated January 20th, where @kgerheiser's suggested example of "module bufr_mod" is analogous to what I have now with "module bufr_procedures". I also recall there was a discussion recently where we were talking about version numbers, and where it was agreed that the next library release would be 12.0.0, precisely because it would require changes to application codes to add use bufr_procedures, and therefore require an increment of the first (major) component of the version number, instead of just becoming 11.8.0. In short, I thought we'd all pretty-much accepted that as a given and settled on this approach, but maybe I misunderstood.

In fact, IMHO I don't think it's that big a deal to ask users of our library to now add a use bufr_procedures statement within their application codes. To me, this is no different than how users of many C-based libraries currently have to add an #include <library>.h statement in their application codes in order to link to subprograms within such libraries, and which seems to be a common and accepted practice, at least in the C-language world.

That said, I agree your suggested approach could also work as well, and I hear what you're saying about there not being a lot of application codes which are currently compiled using 8-byte integers. But were you just looking at the production codes on WCOSS? What about developmental users on Gaea, Hera, Orion, etc., as well as users in the wider community who may download, build, and use our library? It may not be possible to know for sure how many such users this might impact, and which is why I think we need to accommodate such users for backwards-compatibility, since we've always supported them with 8-byte library capability in the past.

So, either way, I think we have some coding updates that we need to make to the library, whether it's by adding 8-byte wrappers outside of modules as you've suggested, or by using generic interfaces inside of modules. I still personally prefer the latter approach, because I think it's cleaner and better encapsulates things together from an overall design standpoint. But I'm certainly willing to accept the will of the majority, so I'd like to ask @kgerheiser, @edwardhartnett and @rmclaren if they have any opinions on this matter or would otherwise like to chime in on this discussion?

@jack-woollen
Copy link
Contributor

Fair enough. I will say, no matter how many bufr_8 users there are out there, the wrapping method I described results in many fewer overall changes needed outside the bufrlib, and from what I see likely no more inside the bufrlib, than the module interface approach. Considering this, I think it would enable a "cleaner" transition to a one build system.

@jbathegit
Copy link
Collaborator Author

@kgerheiser @edwardhartnett @rmclaren

In moving towards a single build of the library, Jack and I have described two possible approaches in this thread w.r.t. overall code design. Do any of you have a preference or want to weigh in on this discussion?

@jbathegit
Copy link
Collaborator Author

Hey Jack, since it doesn't look like we're going to get any feedback from anyone else, then I guess it's really up to you and me to make the call on this, and in that case I'm fine with adopting your 8-byte wrapper approach. You're correct that my generic interface approach would require at least as much if not more coding changes inside the library itself, not to mention requiring additional changes to application code that wouldn't be needed with your approach. So let's go with your approach.

I don't have a strong preference for whether I8 itself should be declared in a common block or a module, though modules do seem to be the preferred approach for such things in modern Fortran, so we should probably go with that. I do think the I8 value should be set to .false. by default, and I agree that application codes should be required to use a new datelen-like routine in order to toggle it to .true. before they call any library routines with integer call parameters.

Also, for simplification and tracking purposes, and since new routines since as ireadmg8 aren't ever intended to be called from anywhere other than within ireadmg, I think it would make sense to just add the code for ireadmg8 as a separate additional routine inside of the existing ireadmg.f source file, along with the updated code for ireadmg. It should still have its own brief docblock though.

How do you want to proceed? If you'd like to push up a new branch with your ireadmg/ireadmg8 and any other existing updates and then open a new PR for that, then please go ahead and do so. Then, once that's merged and we have a prototype for moving forward, I can help you tackle all of the other user-callable library routines that will need similar updates. We can probably just put together a Google doc or similar list that you and I can work from and check off which ones each of us are working on and which ones still need to be completed. Thoughts?

@jack-woollen
Copy link
Contributor

Jeff, that sounds like a good plan. I'm happy to work together on this upgrade. I'll setup a branch with the initial changes and we can go from there.

@jbathegit jbathegit closed this Jun 24, 2022
@jbathegit
Copy link
Collaborator Author

Hey Jack, please let me know whenever you have your new branch set up and are ready to move forward with this.

Thanks, and just as an FYI, I'll be away on leave July 11-15 and 25-28.

@jbathegit
Copy link
Collaborator Author

Hey @jack-woollen, just touching base. Have you had a chance to set up a branch with the initial changes that we discussed for this eventual unified build?

@jack-woollen
Copy link
Contributor

jack-woollen commented Aug 2, 2022 via email

@jack-woollen
Copy link
Contributor

Jeff, sending more details of what I've done so far. I modified 43 subroutines out of a total of 79 I identified as needing wrappers. The two lists are attached. I am rethinking the template I used for the changes I made so far, and modified that a little. An example of how I think will be an okay form for adding the wrappers, consistent with some of your comments above can be seen in the subroutine atrcpt.f pushed into the branch I'm working in, called r8wraps, AKA "rate-wraps". Once we decide on the final format I'll proceed to change the already wrappered set to conform, and go from there.

Wrappers added so far in branch r8wraps.docx

Total set of wrappers.docx

@jbathegit
Copy link
Collaborator Author

Thanks Jack. I've been tied up with other stuff this past week, but I did start looking at your r8wraps branch a couple of days ago, and my schedule should be clear to take a much deeper dive tomorrow, including your two .docx documents.

Thanks again for continuing to work on this! Just FYI, and since this is technically now a closed PR, what we should probably do is resume this conversation over on one of the already-open issues for this topic (e.g. #78), and just reference this closed PR from there so that we can always refer back to it. But again, and just for easier tracking within GitHub, we should probably switch this main conversation thread over to one of those already-open issues and continue it there. Let me know if you disagree; otherwise, I'll plan to switch us over in the next day or so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

allow encoding and decoding of values larger than 32 bits
2 participants