using gsl with shared libcblas instead of javacpp's provided libgslblas #18

Closed
blueberry opened this issue Oct 22, 2014 · 10 comments

@blueberry

Hi, thank you for this fantastic project. I imported the library, and it works.

I would like to use the ATLAS BLAS implementation for the BLAS part of GSL. In C, that would mean changing the makefile to link libcblas and/or ATLAS instead of libgslcblas. However, JavaCPP provides a prepackaged build that ignores the system's files and uses the ones provided in a platform JAR.
Is there a way to specify which (shared) libraries to use instead of the ones provided inside the JARs?

If I had to build JavaCPP's GSL manually, where should I change the libcblas setting?

@saudet
Member

saudet commented Oct 23, 2014

Glad to hear it's working well :)

I'm not sure how they are making this work from C, but maybe it works the same from Java. Can you try to call something like System.loadLibrary("atlas") or System.load("/path/to/libatlas.so") before trying to use GSL?
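A minimal sketch of that suggestion, assuming the ATLAS shared library sits at an absolute path such as the /path/to/libatlas.so placeholder above. The class and method names are hypothetical. Note that the follow-up comment in this thread reports that preloading alone was not enough, because libjnigsl.so itself has to be relinked.

```java
// Hypothetical helper (not part of JavaCPP): tries to preload a native
// library before any GSL class is touched, and reports whether it worked.
final class NativeLibraryPreloader {

    // System.load() requires an absolute path and throws UnsatisfiedLinkError
    // when the file is missing or cannot be linked.
    static boolean tryLoad(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Could not load " + absolutePath + ": " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder path taken from the comment above; replace with a real one.
        String path = args.length > 0 ? args[0] : "/path/to/libatlas.so";
        System.out.println("preloaded: " + tryLoad(path));
    }
}
```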

@saudet
Member

saudet commented Oct 25, 2014

No, this isn't going to work. I see that we need to relink the libjnigsl.so file. The easiest way to go about this is to modify the Java config file and rebuild the JAR file. On Linux, Mac OS X, or other Unix variants, make a change like this in the presets/gsl.java file:

- link={"gslcblas@.0", "gsl@.0"}),
+ link={"cblas@.3", "atlas@.3", "gsl@.0"}, linkpath={"/usr/lib/atlas/", "/usr/lib64/atlas/"}),

And rebuild by calling bash cppbuild.sh install gsl and mvn install --projects gsl. It's possible to skip the cppbuild phase and use the GSL library installed on the system. In that case, we would need to add its includepath and linkpath alongside the ones for ATLAS above, but that's it. I'm not sure if there would be an easier way to make all this work out of the box... Should we package GSL with ATLAS by default?

BTW, are there any other C/C++ libraries that you would like to use or are using from Java? They might be good candidates to add to this project :) Thanks for the feedback

@blueberry
Author

Hi, thanks for the help.

Regarding atlas, it is pointless to bundle it with the jar because the whole point of atlas is to build it and fine tune it for a specific processor architecture.

I had to change plans and build my own Java BLAS wrapper because GSL is GPL, so I am not allowed to mix it with Clojure libraries, which are usually EPL-licensed.

Ideas for native libraries? I tend to think that cblas and lapacke would be good candidates, since Java still sucks at number crunching.
However, currently I have to resort to hand-written JNI because it is important to me to avoid copying arrays from jvm->native code and it seems that all autogenerated libraries do the copying. I need GetPrimitiveArrayCritical :)

Still, I am very impressed by javacpp - excellent performance :)

@saudet
Member

saudet commented Nov 1, 2014

I see, thanks for the feedback! BTW, we could easily wrap CBLAS and LAPACK with JavaCPP:
https://github.com/bytedeco/javacpp-presets/wiki/Create-New-Presets
And GetPrimitiveArrayCritical() does get used on arrays.

@saudet
Member

saudet commented Nov 1, 2014

Oh wait, you need to use GetPrimitiveArrayCritical() on all function calls? JavaCPP doesn't do that, no, because the JDK doesn't recommend it, and says it might not even do what we want it to do anyway:
http://docs.oracle.com/javase/7/docs/technotes/guides/jni/spec/functions.html#GetPrimitiveArrayCritical

When we're after performance, we should be using direct NIO buffers. Those are as fast as arrays inside both Java and in native code with JNI, and are well supported by the JDK in general.
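As a rough illustration of the direct NIO buffer approach, the sketch below allocates off-heap memory for doubles and reads and writes it from Java. The class and method names are hypothetical. On the native side, JNI code can obtain the address of such a buffer with GetDirectBufferAddress, so no array copy is involved.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical example class: shows plain java.nio usage only, no JavaCPP.
final class DirectBufferSketch {

    // Allocates off-heap storage for `count` doubles in the platform's
    // native byte order, which is what native code expects to see.
    static DoubleBuffer allocateDirectDoubles(int count) {
        return ByteBuffer.allocateDirect(count * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    // Sums the buffer with absolute get(), which leaves the position untouched.
    static double sum(DoubleBuffer values) {
        double total = 0.0;
        for (int i = 0; i < values.limit(); i++) {
            total += values.get(i);
        }
        return total;
    }

    public static void main(String[] args) {
        DoubleBuffer x = allocateDirectDoubles(4);
        for (int i = 0; i < x.capacity(); i++) {
            x.put(i, i + 1.0); // absolute put, the buffer analogue of x[i] = ...
        }
        System.out.println("direct: " + x.isDirect());
        System.out.println("sum: " + sum(x));
    }
}
```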

@blueberry
Author

I use the critical version on all arrays. It MIGHT return a copy, but in my (preliminary and simple, but run millions of times) tests it never did. On the other hand, the non-critical version always returned a copy.

I have heard different kinds of stories about critical vs. buffer, but only anecdotal ones, without an example or demonstration, and without any link to a detailed discussion. I cannot find any part in the docs where the critical call is not recommended or discouraged. There are only constraints, which are fine for my use case. fommil (netlib-java), who has a similar use case, also uses critical calls for the project, and he did not have any problems.

I am still quite open to NIO, but that also has drawbacks: an unnatural API. Instead of methods that take arrays, there would be byte buffers all the way.

I am puzzled by another thing. In all the Java libs that I tried, even the simplest JNI call with primitive arguments took several times longer than a simple JNI hello world. For example, a javacpp-gsl call to the logarithm function took, if I remember well, 25 ns (measured by the criterium library). The call to hello takes 8 ns. Where do the other 17 ns go? Even Java's Math.log takes 2 ns or so, so I expected the GSL log to be similar.
The same goes for other autogenerated wrappers - they take much more time than the handwritten versions that I rushed in an hour. One obvious reason would be that they copy arrays everywhere, but if they don't, what else could be the reason for the overhead? All those libraries are autogenerated, so finding the reason is not as simple as looking at the code, and I didn't pursue it further.
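For readers who want to reproduce this kind of per-call measurement without a benchmarking library, here is a naive sketch that times Math.log in a loop. The class name is hypothetical, and the numbers it prints are only indicative: a harness such as criterium or JMH handles warm-up and JIT effects far more carefully. To compare against a JNI wrapper, the Math.log call would be swapped for the wrapped function.

```java
// Hypothetical micro-timing sketch; not a substitute for a real benchmark harness.
final class CallTimingSketch {

    // Returns the average wall-clock nanoseconds per Math.log call.
    static double averageNanosPerCall(int iterations) {
        double sink = 0.0;

        // Warm-up pass so the JIT has a chance to compile the loop.
        for (int i = 1; i <= iterations; i++) {
            sink += Math.log(i);
        }

        long start = System.nanoTime();
        for (int i = 1; i <= iterations; i++) {
            sink += Math.log(i);
        }
        long elapsed = System.nanoTime() - start;

        // Use the accumulated value so the loops cannot be optimized away.
        if (Double.isNaN(sink)) {
            throw new IllegalStateException("unexpected NaN");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        System.out.println("ns per call (approx.): " + averageNanosPerCall(5_000_000));
    }
}
```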

I understand that nanoseconds are not that important for the majority of software, but for the building blocks of numerical code that is a considerable waste that multiplies quickly ;)

@saudet
Member

saudet commented Nov 3, 2014

In the link above they say "After calling GetPrimitiveArrayCritical, the native code should not run for an extended period of time before it calls ReleasePrimitiveArrayCritical." That sounds like a big fat warning to me. But if everything works fine in your application, that's fine I suppose. I'm just not sure what the advantages are over direct NIO buffers. I don't find calling get() and put() to be any more unnatural than brackets and equal signs :) Besides, I've implemented efficient multidimensional arrays that way, on top of a contiguous block of memory, something that we can't do with Java arrays directly anyway:
bytedeco/javacpp@4d8dc9d
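The idea of a multidimensional view over one contiguous block can be sketched as follows, assuming row-major layout. This is a hypothetical illustration using only java.nio, not the API introduced by the commit referenced above.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.DoubleBuffer;

// Hypothetical 2D view over a single direct buffer (row-major layout).
final class Matrix2DSketch {
    private final DoubleBuffer data;
    private final int rows;
    private final int cols;

    Matrix2DSketch(int rows, int cols) {
        this.rows = rows;
        this.cols = cols;
        this.data = ByteBuffer.allocateDirect(rows * cols * Double.BYTES)
                .order(ByteOrder.nativeOrder())
                .asDoubleBuffer();
    }

    // Maps (i, j) to a flat offset; rejects indices that would alias another row.
    private int offset(int i, int j) {
        if (i < 0 || i >= rows || j < 0 || j >= cols) {
            throw new IndexOutOfBoundsException("(" + i + ", " + j + ")");
        }
        return i * cols + j;
    }

    double get(int i, int j) {
        return data.get(offset(i, j));
    }

    void put(int i, int j, double value) {
        data.put(offset(i, j), value);
    }

    // Exposes the flat storage, e.g. to hand the whole block to native code.
    DoubleBuffer flat() {
        return data;
    }

    public static void main(String[] args) {
        Matrix2DSketch m = new Matrix2DSketch(2, 3);
        m.put(1, 2, 7.5);
        System.out.println("m[1][2] = " + m.get(1, 2));
        System.out.println("flat index 5 = " + m.flat().get(5));
    }
}
```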

There are a few places where the overhead might come from. In this case, I think the largest overhead would come from checking for C++ exceptions, but we'd have to check the generated code to make sure. We can disable checking for C++ exceptions with the @NoException annotation, but the Parser doesn't do that automatically right now, because like you say it's not something that most people look into. But it is possible to optimize everything to make it as fast as handwritten JNI if necessary.

@blueberry
Author

One source of the complexity might be that autogen libraries try to cover C++, which is very complex, and they have to deal with many edge cases, whereas when I choose plain C there are fewer of those things to consider, and I can also choose what applies in my use case.
Of course, the tool may provide different switches and settings, but then plain JNI code suddenly becomes simpler :)
All in all, my current (limited) experience with JNI is that it is not too complex; it only has more boilerplate than Java, and I had to learn a few tools from the C ecosystem...

saudet added the question label Nov 9, 2014
@saudet
Member

saudet commented Nov 9, 2014

Of course if you know exactly what you're doing, it's better to code everything in C, and a bit of JNI isn't going to change a whole lot. But the point of developing with Java is usually because we don't know exactly what we're doing. We build prototypes, try stuff out, debug the hell out of everything, etc, and the better the tools we have for those tasks, the more time we can spend on the essential.

For example, let's assume that we need to call the logarithm function on only one scalar very often. With the Java wrappers, we could start experimenting with that, and make sure that our algorithm works well. Then, when we know that everything is fine, but that it has become the bottleneck of our application, we make a new function in C/C++, and call that function from Java. And JavaCPP can still help here in calling that new function by abstracting the differences between platforms, and by maintaining compatibility with the rest of the native functions without effort.

saudet closed this as completed Nov 9, 2014
saudet added a commit that referenced this issue Feb 14, 2018
@saudet
Member

saudet commented Feb 14, 2018

It's been a while, but with the commit above, presets for GSL now link automatically with OpenBLAS, MKL, Accelerate, etc instead of GSL CBLAS, and they do not try to catch C++ exceptions anymore, so they should be pretty much as fast as manually written JNI wrapper code.
