Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remove the _8 build #257

Closed
edwardhartnett opened this issue Jan 10, 2023 · 21 comments
Closed

remove the _8 build #257

edwardhartnett opened this issue Jan 10, 2023 · 21 comments
Assignees
Labels

Comments

@edwardhartnett
Copy link
Contributor

The idea of doing different builds for different sized default integers or floats is a bad idea.

Instead, we should use modern Fortran features to handle different types of data. As every other Fortran project does.

We intend to get rid of _4 and _d builds eventually, but we can get rid of the _8 build now. No one uses it.

@jbathegit
Copy link
Collaborator

This is no longer an issue as of PR #250 which was merged last month. There's now only one build, and this is now reflected in the README.md as well.

@edwardhartnett
Copy link
Contributor Author

OK, but the idea of removing the _8 build was not to make code changes and add a bunch of code to support it. It was to remove it and have it be gone completely.

Users can still build the _8 style library (along with their own codes) by setting the compiler flags themselves.

The intention was that _8 would be removed from the build system. Not that it would be added to the code.

@edwardhartnett edwardhartnett reopened this Feb 4, 2023
@jbathegit
Copy link
Collaborator

@edwardhartnett that was not at all how @jack-woollen and I understood the issue. In all of our prior discussions, the plan was to modify the existing code so that only one build of the library would be needed going forward which could seamlessly work with _8 and _d application codes. Updating the Fortran code to be able to remove all of those extra builds had always been the goal as we understood it. And the _8 and _d have indeed been removed from the build system - there's no reference to them anywhere in any of the CMake files, except for the test subdirectory so we can verify that the one remaining build continues to work for any application codes built with _8 and _d compiler flags.

If what you've now described was really your vision all along, then it would have been extremely helpful if you could have chimed in with that back in February of last year when we were having these initial discussions in #78, and even more so during June-August of last year when we were having more detailed discussions about how to resolve this in #217 and were asking/begging for feedback from you and others on the various approaches we had come up with. But we heard nothing but crickets, so we went ahead and agreed on a way forward between the two of us. Are you really jumping back in now to tell us that all of the work we did to resolve this was a complete waste of our time?

@jack-woollen
Copy link
Contributor

@edwardhartnett @jbathegit Ed's idea certainly was an option discussed a year ago. It was the simplest option; only 8byte users would need to compile their own version. Reading #78, the details regarding the wrapper approach weren't really clarified until September, when a fortran only solution with an 8byte option available using slightly modified 4byte code was adopted and in progress. It was fairly well detailed after that. The additional code required to make that work wasn't that much, as shown below. The wrappers added 151 lines. The documentation and other work since then removed 600 lines. Not to mention the complete removal of the _d and _8 versions of the library, from anywhere. I also thought that was clearly the main goal of this exercise. The amount of work involved to implement the wrapper option was considerable, and should be considered. On the other hand the amount of work saved by making the transition for 8byte codes much easier is on the plus side. There's more than one way of looking at this, but I'm pretty sure that done is done.

HEAD is now at dbec6c5 Merge branch 'develop' into r8wraps
36777 171433 1271867 total without wrappers
Previous HEAD position was dbec6c5 Merge branch 'develop' into r8wraps
HEAD is now at 12eca61 Merge pull request #250 from NOAA-EMC/r8wraps
36938 171885 1273470 total with wrappers
Previous HEAD position was 12eca61 Merge pull request #250 from NOAA-EMC/r8wraps
Switched to branch 'develop'
36338 172938 1243693 total for current develop

@edwardhartnett
Copy link
Contributor Author

I apologize for not watching this issue more closely. I also apologize for any lack of clarity in the issue, although reading it, it seems quite clear. No where is there any discussion of adding the _8 capability in some other way. The title of the issue is "remove the _8 build" which seems clear.

However, the amount of work put into an incorrect solution is not relevant. We must do what is correct. If we walk down the wrong path, then we must turn around and walk back and get on the correct path.

If we have no code to deal with _8 then users can easily build a _8 version (if they want one) by setting their own flags. That was the idea of this issue.

There was never any intention to move this capability from the build into the code. The goal is simplification. No one at NOAA is using the _8 builds of any of our libraries, so we have eliminated them everywhere.

The way to deal with _4/_d in fortran is to use a module and overload the functions, so that either _4 or _d style calls can be made, all on the same version of the library. This is more complicated than just removing the _8.

The idea of removing the _8 was to simplify the situation by getting rid of a build that no one was using.

@jbathegit
Copy link
Collaborator

I don't recall where/when/how we ever established that "No one at NOAA is using the _8 builds of any of our libraries". If we had had some way of knowing that, then it certainly would have been a lot easier to just get rid of it right from the beginning and not try to build it into our code, but that wasn't the case.

And BTW, we did look at using the module approach with overloading functions. This is noted in #78, and also in #217 which is where we really debated the two options last summer and were looking for feedback before we moved forward. But nobody else bothered to chime in to our discussion in #217, so we were left to decide for ourselves. And in the end we decided to go forward with the wrapper approach, in part because the module and overload approach required a lot more coding overhead in a lot more routines, including a lot of resource-intensive array copying inside of the new module interfaces of all of the user-callable routines.

Bottom line, there was a LOT more discussion about this topic than just here in this one issue. So, at least speaking for myself, right now I'm feeling incredibly deflated and demoralized to hear that all of the work that Jack and I put into this (and, as he noted, it was a considerable amount of work!) is now being dismissed as irrelevant and a complete waste of time. I can't believe that we're only hearing now that we were, apparently, "walking down the wrong path" this entire time!?

@jbathegit
Copy link
Collaborator

jbathegit commented Feb 6, 2023

I will also point out that this particular issue, with the title remove the _8 build was only opened last month (i.e. after all of the work had already been completed and committed to the develop branch!), whereas the original issue #78, and which all of this work was based on, was opened back in November 2020.

Furthermore, the original tasking in #78 includes the comment Currently we build 3 flavors of the library, 4, 8, and D. Instead, use fortran interfaces and kinds to provide one library that handles all three cases. And that's what we did - we modified the library so that it can now handle all three cases in one single build. So how, exactly, did we somehow misinterpret that? I've looked through all of the discussion since then, and I don't see any place where we agreed that simply removing the _8 build logic altogether and pushing that responsibility onto the users was a viable option. In fact, that wouldn't even be possible for WCOSS2 users, since they aren't allowed to build and install their own copies of any NCEPLIBS libraries!

@edwardhartnett
Copy link
Contributor Author

Kyle determined that no _8 version of any NCEPLIBS was being used in any NOAA build. Does anyone know of a case where he was mistaken? Do we know of any case with anyone using the _8 version of the library?

If not, then we don't need to support an _8 build in any way, right?

@jbathegit
Copy link
Collaborator

Kyle determined that no _8 version of any NCEPLIBS was being used in any NOAA build.

He did? When was this, and could you please point me to where (i.e. in which issue discussion)?

@edwardhartnett
Copy link
Contributor Author

Do you know of any use of the _8 version of the library?

@jbathegit
Copy link
Collaborator

No, but I also don't know every code that's running everywhere linking to the library. Just because I don't know about it personally doesn't mean it doesn't exist. That's like asking me to prove a negative.

Again, how/when did Kyle determine that nobody anywhere was using the _8 build? And if this is really the case, then why didn't he (or anybody?) speak up during the entire time that Jack and I were discussing options and working on this task?

@edwardhartnett
Copy link
Contributor Author

Kyle was involved in many of the CMake build systems used at EMC - he contributed to a lot of them. So he investigated by looking at the build systems to see what was actually being linked, and he found that it was almost always the _4, sometimes the _d, and never the _8 version of any NCEPLIBS library.

As to why he and no one else spoke up, it is because we are not sufficiently involved in NCEPLIBS-bufr. We have 20+ libraries and only 3 or 4 programmers. While NCEPLIBS-bufr is very familiar to you, it is only one of many packages Kyle was involved with, and neither he nor I followed it closely enough. It's not a mistake I will make again, I hope. But if you followed the other NCEPLIBS, you would have seen how _8 was handled everywhere else. Just as you don't follow progress in any of the other NCEPLIBS projects, so Kyle did not follow progress in NCEPLIBS-bufr.

When you removed the _8 version of the build, and changed how it worked, that will also break any libraries that use it, right? In other words, it was not a backwards-compatible change to the code. So in any case, any use of the _8 by NOAA users will be a problem after the next release.

If we just remove the _8 from CMake, as we did in all the other NCEPLIBS, and it turns out to be used, we can add it back in for a new release. In that case, the users need make no changes in their code or build system. But with the case of your changes, the end user must also change their code to continue to use the _4 version as if it were _8. So they will need to change both their code and their build systems.

So it seems like a much simpler approach is to take out the _8 build, and then, if we were wrong, to add it back in again to CMake, not to the Fortran code.

@jbathegit
Copy link
Collaborator

As to why he and no one else spoke up, it is because we are not sufficiently involved in NCEPLIBS-bufr. We have 20+ libraries and only 3 or 4 programmers. While NCEPLIBS-bufr is very familiar to you, it is only one of many packages Kyle was involved with, and neither he nor I followed it closely enough. It's not a mistake I will make again, I hope. But if you followed the other NCEPLIBS, you would have seen how _8 was handled everywhere else. Just as you don't follow progress in any of the other NCEPLIBS projects, so Kyle did not follow progress in NCEPLIBS-bufr.

Fair point, and yes it's true that I haven't been following details in other NCEPLIBS projects, but I also haven't been tagged in any GitHub issues or emails where this particular topic was ever discussed. And it's not like we've been having regular monthly team meetings either. So, going forward, are you suggesting that I should just try to monitor all of those other libraries myself within GitHub and keep track of issues/discussions that way? It seems to me it would be a better and more focused use of everyone's time if we all got in the habit of informing folks by tagging them in specific GitHub issues where/when we had reached a major decision point, or where/when we really needed their input on something. That was my thinking and is why I had specifically tagged you, Kyle, and others for your input on this particular issue on several occasions before we did all of this work. And that's also why I now feel totally blindsided and demoralized that nobody bothered to speak up at that time, and before Jack and I ended up wasting many weeks of our valuable time "walking down the wrong path".

When you removed the _8 version of the build, and changed how it worked, that will also break any libraries that use it, right? In other words, it was not a backwards-compatible change to the code. So in any case, any use of the _8 by NOAA users will be a problem after the next release.

If we just remove the _8 from CMake, as we did in all the other NCEPLIBS, and it turns out to be used, we can add it back in for a new release. In that case, the users need make no changes in their code or build system. But with the case of your changes, the end user must also change their code to continue to use the _4 version as if it were _8. So they will need to change both their code and their build systems.

Yes, there is only one build now, and the way we coded it was to add a new subroutine that any 8-byte application program now needs to call one time, at the beginning of the code, to let the library know that it will be passing 8-byte integer arguments. And in turn that's why this was always planned to be a major (i.e. v12.0.0) release, since it would require such a change to any 8-byte application program. But there's no change that any 4-byte or _d application program would need to make.

@jack-woollen
Copy link
Contributor

jack-woollen commented Feb 7, 2023 via email

@jack-woollen
Copy link
Contributor

Attaching the list:

bufr8-makes.docx

@jbathegit
Copy link
Collaborator

jbathegit commented Feb 7, 2023

So at least on WCOSS2, there are still quite a few codes which depend on the _8 build of the library. As Jack suggested, maybe Kyle had only been looking inside of CMakeLists.txt files for application codes that were built with cmake.
But many WCOSS2 codes are still built directly with make.

BTW Jack, I think a good place to check cmake packages would be within any CMakeLists.txt files.

@jbathegit
Copy link
Collaborator

Ed, going forward, it sounds like we could either just leave the develop branch as it is now and have one build of the library, or we could go back and remove all of the 8-byte code that we recently added and then restore the generation of the separate _8 build of the library within cmake.

Either would work, and as Jack noted there's more than one way to look at this. There is some elegance to only having one build of the library that works everywhere, vs. having to continue to carry the extra _8 and _d builds around to every platform wherever the library is built. But it's also true that the latter case would be less of a lift for existing _8 users who could then just continue to link directly to a separate _8 build and therefore wouldn't need to change a thing.

My personal vote is to leave the develop branch as it is now, because I don't think it's a lot to ask users to add that one new subroutine call to their application code if they really want to continue to do 8-byte builds of their code. My guess is that a lot of those codes would probably work perfectly fine compiled with 4-byte integers, and that doing 8-byte builds is just a bit of legacy inertia (i.e. "that's the way we've always built our codes for the past 20+ years"). As Jack stated, what's "done is done" at this point, and the additional subroutine call is a very straightforward option for users if they really do want/need to continue doing 8-byte builds.

But that said, it also wouldn't be too onerous of a lift to go back and remove all of the recently added 8-byte code if you really feel that's a better approach from an overall software design standpoint. You have a lot more knowledge about best practices in that area, so I'd certainly welcome your input on that point. Just please bear in mind that, at least on WCOSS2, it's not an option to tell users they can just add their own compile flags to build their own 8-byte version of the library if they want; we either need to build it for them as part of the cmake instructions in the official admin build, or we just build one version of the library and let them know they now need to add one new subroutine call to their application code if they want to continue to pass 8-byte integers into the library.

@jack-woollen
Copy link
Contributor

jack-woollen commented Feb 7, 2023 via email

@jbathegit
Copy link
Collaborator

Sorry Jack, I'm not quite following what you're suggesting. Are you thinking some sort of simple wrapper where the _8 build of the library was nothing more than a call to setim8b() and then a link to the main build? I'm having trouble visualizing how that would actually work.

@jack-woollen
Copy link
Contributor

jack-woollen commented Feb 7, 2023 via email

@jbathegit
Copy link
Collaborator

Hmm, in that case I'm thinking I'd rather we just stick with the single library build. I don't think it's a big ask that users have to make a minimal code change if they want to continue building their application codes with 8-byte integers, and maybe that will even spur some of them to think about whether they really need to be using 8-byte integers in the first place ;-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants