NullPointerException in Parser and poor Unicode support #70

gwenn · 2016-02-14T08:48:11Z

With JavaCPP 1.1:

import org.bytedeco.javacpp.*;
import org.bytedeco.javacpp.annotation.*;

@Properties(target="org.sqlite.SQLite", value={
@Platform(include="sqlite3.h")
})
public class SQLite {
}

javac -cp javacpp.jar SQLite.java
java -jar javacpp.jar SQLite

Targeting org/sqlite/SQLite.java
Parsing sqlite3.h
Exception in thread "main" java.lang.NullPointerException
    at org.bytedeco.javacpp.tools.Parser.declarator(Parser.java:913)
    at org.bytedeco.javacpp.tools.Parser.function(Parser.java:1245)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2347)
    at org.bytedeco.javacpp.tools.Parser.group(Parser.java:1985)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2346)
    at org.bytedeco.javacpp.tools.Parser.extern(Parser.java:2272)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2345)
    at org.bytedeco.javacpp.tools.Parser.parse(Parser.java:2421)
    at org.bytedeco.javacpp.tools.Parser.parse(Parser.java:2517)
    at org.bytedeco.javacpp.tools.Builder.parse(Builder.java:67)
    at org.bytedeco.javacpp.tools.Builder.build(Builder.java:624)
    at org.bytedeco.javacpp.tools.Builder.main(Builder.java:773)

The text was updated successfully, but these errors were encountered:

saudet · 2016-02-14T08:50:17Z

Try with the latest on the master branch. There's been a lot of fixes since 1.1.

gwenn · 2016-02-14T08:56:52Z

Same error with JavaCPP version 1.2-SNAPSHOT

Targeting org/sqlite/SQLite.java
Parsing sqlite3.h
Exception in thread "main" java.lang.NullPointerException
    at org.bytedeco.javacpp.tools.Parser.declarator(Parser.java:1005)
    at org.bytedeco.javacpp.tools.Parser.function(Parser.java:1441)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2598)
    at org.bytedeco.javacpp.tools.Parser.group(Parser.java:2198)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2597)
    at org.bytedeco.javacpp.tools.Parser.extern(Parser.java:2523)
    at org.bytedeco.javacpp.tools.Parser.declarations(Parser.java:2596)
    at org.bytedeco.javacpp.tools.Parser.parse(Parser.java:2672)
    at org.bytedeco.javacpp.tools.Parser.parse(Parser.java:2769)
    at org.bytedeco.javacpp.tools.Builder.parse(Builder.java:68)
    at org.bytedeco.javacpp.tools.Builder.build(Builder.java:620)
    at org.bytedeco.javacpp.tools.Builder.main(Builder.java:769)

gwenn · 2016-02-14T09:18:22Z

A quick and dirty fix:

diff --git a/src/main/java/org/bytedeco/javacpp/tools/Parser.java b/src/main/java/org/bytedeco/javacpp/tools/Parser.java
index 9b3c8bc..2cfe4d1 100644
--- a/src/main/java/org/bytedeco/javacpp/tools/Parser.java
+++ b/src/main/java/org/bytedeco/javacpp/tools/Parser.java
@@ -1002,7 +1002,10 @@ public class Parser {
                 }
                 info = infoMap.getFirst(cppType += ")");

-                String functionType = Character.toUpperCase(originalName.charAt(0)) + originalName.substring(1);
+                String functionType = null;
+                if (originalName != null) {
+                    functionType = Character.toUpperCase(originalName.charAt(0)) + originalName.substring(1);
+                }
                 if (info != null && info.pointerTypes.length > 0) {
                     functionType = info.pointerTypes[0];
                 } else if (typedef) {

It seems that the error is related to the following line:

  void (*(*xDlSym)(sqlite3_vfs*,void*, const char *zSymbol))(void);

saudet · 2016-02-14T09:51:14Z

Thanks!! The parser is (not so) quick and (but quite) dirty anyway. (BTW, the first thing I'd like to try at this point for the parser is to see if we couldn't use Clang for that: #51)

saudet · 2016-02-14T14:16:18Z

In any case, if you get sqlite working, let me know and we'll start a "presets" out of that! Thanks

gwenn · 2016-02-14T18:55:43Z

I've created a branch here.
But JavaCPP Builder fails:
https://travis-ci.org/gwenn/sqlite-jna/builds/109200887
I'll try to fix it.

…tion pointers (issue #70)

saudet · 2016-02-15T14:30:03Z

Thanks! I'll try to get a look at that sooner rather than later...

gwenn · 2016-02-15T20:34:13Z

Ok,
https://travis-ci.org/gwenn/sqlite-jna/builds/109444251
JavaCPP Builder succeed.
And only some tests related to callback are failing.

May I ask you two questions:
a) Is there any way to tell JavaCPP that some SQLite functions expect UTF-8 encoded string (a not the default/current encoding) ?
b) Is there any way to create a FunctionPointer with an address set to -1 (which means TRANSIENT for SQLITE) ?
I tried:

    public static final Destructor TRANSIENT = new Destructor() {
        {
            this.address = -1; // SQLITE_TRANSIENT
        }

        @Override
        public void call(Pointer p) {
            throw new UnsupportedOperationException();
        }
    };

with no success.
Thanks.

gwenn · 2016-02-16T17:30:42Z

Ok,
Callbacks fixed:
https://travis-ci.org/gwenn/sqlite-jna/builds/109658416

Regards.

saudet · 2016-02-17T00:48:11Z

Great, thanks! Are you by any chance comparing the overhead of each tool (JNA, JNR, etc)? It would be nice to have some benchmarks as reference.

For the function pointer, I was starting to consider adding support for native void allocate(int i) to FunctionPointer, but you're right, we can simply overload the method and use a cast for that:

void sqlite3_result_blob(sqlite3_context c, Pointer p, int i, FunctionPointer destructor);
void sqlite3_result_blob(sqlite3_context c, Pointer p, int i, @Cast("sqlite3_destructor_type") long destructor);

It's not type safe, but it isn't in C anyway. Does it still make sense to try to cram that into FunctionPointer?

As for the character encoding, String gets passed as "modified UTF-8", because that's what we get most efficiently with JNI. To get another encoding, we can use new BytePointer(String s, String charsetName) as a @Cast("const char*") BytePointer argument. We could probably add settings and stuff, but they would need to be dynamic in some way... If you'd like to try something, any experimentation would be welcome.

gwenn · 2016-02-17T07:02:33Z

Sorry, I have no benchmark between the different binding tools.
Right now, I don't have access to www.ch-werner.de/javasqlite/ site but I think there is a benchmark
included.

And SQLite don't like the "modified UTF-8" produced by JNI. So only the BytePointer can be used.

saudet · 2016-02-17T11:50:37Z

From what I see on your branches, it doesn't look like the other tools found a better way to pass strings in a different encoding, right?

BTW, if String.getBytes(Charset) is more efficient, we could add that to BytePointer for ease of use.

gwenn · 2016-02-17T13:22:43Z

It seems that both JNA and JNR use the correct UTF-8 encoding.
https://github.com/java-native-access/jna/blob/51cab7599d40076af275fcfaffe838dfc35e51f2/src/com/sun/jna/Native.java#L98

saudet · 2016-02-17T13:58:46Z

Right, but we still need to allocate bytes on the heap. It doesn't do it in native code somehow.

Or are you saying there would be an advantage in setting the default encoding used by BytePointer? We could add a property for that, and do the same as JNA.

gwenn · 2016-02-17T18:12:47Z

Maybe I am biased but I want my program to preserve data integrity (data not corrupted by a bad encoding) even if there is an additional cost.
I guess that many JNI bindings using GetStringUTFChars are buggy (including mine).

Something like that

static native void foo(@Encoding("UTF-8")String bar);

would be perfect.

gwenn · 2016-02-19T19:15:04Z

Some benchmarks using the BenchmarkDriver available here:

=====================sqlite-jna==========================
Driver: org.sqlite.driver.JDBC
URL:jdbc:sqlite:/Users/gwen/Java/sqlite-jna/bench.sqlite

Scale factor value: 1
Number of clients: 10
Number of transactions per client: 10

Initializing dataset...DBMS: SQLite 3

* Starting Benchmark Run *

* Benchmark Report *
* Featuring <direct queries> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.485 seconds.
0 / 100 failed to complete.
Transaction rate: 206.18556701030928 txn/sec.

* Benchmark Report *
* Featuring <direct queries> <transactions> 
--------------------
Time to execute 100 transactions: 0.295 seconds.
0 / 100 failed to complete.
Transaction rate: 338.98305084745766 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.402 seconds.
0 / 100 failed to complete.
Transaction rate: 248.75621890547262 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <transactions> 
--------------------
Time to execute 100 transactions: 0.196 seconds.
0 / 100 failed to complete.
Transaction rate: 510.204081632653 txn/sec.

=====================sqlite-javacpp===========================
Driver: org.sqlite.driver.JDBC
URL:jdbc:sqlite:/Users/gwen/Java/sqlite-javacpp/bench.sqlite

Scale factor value: 1
Number of clients: 10
Number of transactions per client: 10

Initializing dataset...DBMS: SQLite 3

* Starting Benchmark Run *

* Benchmark Report *
* Featuring <direct queries> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.423 seconds.
1 / 100 failed to complete.
Transaction rate: 234.04255319148936 txn/sec.

* Benchmark Report *
* Featuring <direct queries> <transactions> 
--------------------
Time to execute 100 transactions: 0.274 seconds.
0 / 100 failed to complete.
Transaction rate: 364.963503649635 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.317 seconds.
0 / 100 failed to complete.
Transaction rate: 315.45741324921136 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <transactions> 
--------------------
Time to execute 100 transactions: 0.159 seconds.
0 / 100 failed to complete.
Transaction rate: 628.9308176100628 txn/sec.

==================sqlite-jni===============================
Driver: org.sqlite.driver.JDBC
URL:jdbc:sqlite:/Users/gwen/Java/sqlite-jni/bench.sqlite

Scale factor value: 1
Number of clients: 10
Number of transactions per client: 10

Initializing dataset...DBMS: SQLite 3

* Starting Benchmark Run *

* Benchmark Report *
* Featuring <direct queries> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.399 seconds.
0 / 100 failed to complete.
Transaction rate: 250.62656641604008 txn/sec.

* Benchmark Report *
* Featuring <direct queries> <transactions> 
--------------------
Time to execute 100 transactions: 0.28 seconds.
0 / 100 failed to complete.
Transaction rate: 357.1428571428571 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <auto-commit> 
--------------------
Time to execute 100 transactions: 0.325 seconds.
0 / 100 failed to complete.
Transaction rate: 307.6923076923077 txn/sec.

* Benchmark Report *
* Featuring <prepared statements> <transactions> 
--------------------
Time to execute 100 transactions: 0.175 seconds.
0 / 100 failed to complete.
Transaction rate: 571.4285714285714 txn/sec.

Java version: 1.8.0_60, vendor: Oracle Corporation
OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"

saudet · 2016-02-20T04:31:40Z

So, the wrapper based on JavaCPP is on average faster than the custom JNI? Cool :) Thanks for taking the time!

Of course, I understand the need to use standard UTF-8, but it's not just about UTF-8. Specifying the encoding info in an annotation like that prevents the end user from using any other encodings. In the general case, not in the case of SQLite, we often need to be able to set it at runtime, don't we?

gwenn · 2016-02-20T05:39:17Z

There are other libraries supporting only utf-8:

http://xmlsoft.org/encoding.html

xmlChar, the libxml2 data type is a byte, those bytes must be assembled as UTF-8 valid strings.

https://jansson.readthedocs.org/en/2.7/apiref.html#string

Jansson uses UTF-8 as the character encoding. All JSON strings must be valid UTF-8

saudet · 2016-02-20T06:52:39Z

Ah, why did the JDK go with "modified UTF-8"...

saudet · 2016-02-21T07:06:39Z

BTW, if you make a blog post or something about your experience using JNA, JNR, JNI, JavaCPP, BridJ, etc let me know! Thanks

…IFIED_UTF8_STRING` for old behavior) (issue #70)

saudet · 2016-04-17T11:36:18Z

About UTF-8 encoding, I've switched to String.getBytes() and new String(byte[]) by default in the latest commit, to have the default charset used. This is usually UTF-8 on Linux, but we can launch the JVM with the -Dfile.encoding=UTF-8 option to make sure that is what gets selected.

The calls are a bit slow, about 300 ns, but GetStringUTFChars() isn't exactly fast either (about 100 ns), so I think it's a good trade-off. Things are more consistent that way.

Let me know if you encounter any problems with that! Thanks for all the tests you've done :)

gwenn · 2016-04-17T11:57:32Z

Sorry, but for me, it is useless:
I don't want to use/change a global state/variable to say that sqlite3_bind_text expects UTF-8.

saudet · 2016-04-17T13:34:26Z

Well, ok, say we had an annotation and we could write something like the following:

static native void foo(@Encoding("UTF-8") String bar);

Other than saving a couple of keystrokes (that we could save using the parser anyway) what is the rationale for not doing it the way we already can as below?

static native void foo(BytePointer bar);
static void foo(String bar) { foo(new BytePointer(bar, "UTF-8")); }

Where we could wrap new BytePointer(bar, "UTF-8") in a helper method BTW. Also, sometimes, we can't get String objects from the native API, only BytePointer, so usually we do need overloads with BytePointer anyway.

In any case, if it's important enough, please do send a pull request, but unless I understand the benefit it's not something I'm going to take time to implement and maintain myself.

saudet · 2016-04-18T00:30:39Z

I've just had an idea though. We could hard code the encoding used for all String passed to and received from a given library via a macro like this @Platform(define = "STRING_BYTES_ENCODING UTF-8", ...). Now that's something that would make sense to me... What do you think?

gwenn · 2016-04-18T17:51:20Z

Good idea.
And I only use the UTF-8 methods in my binding.
But SQLite supports two encodings by default: UTF-8 (sqlite3_bind_text) and UTF-16 (sqlite3_bind_text16)...
So your solution works in my case but not in all cases.

saudet · 2016-04-19T03:52:20Z

Right, but you have to understand that in general there isn't any good options with standard Java types like primitive arrays, String or even direct NIO buffers. For example, we can't resize a ByteBuffer or modify the content of a String, so when we need to support a native function with an output argument like that, it's just easier (and more efficient than whatever hack we can come up with) to just use BytePointer (or CharPointer for UTF-16 or IntPointer for UTF-32). There are other things Java can't do, like arrays with more than Integer.MAX_VALUE elements. We could go on and on, but the ultimate solution to everything remains the same: Use an appropriate subclass of Pointer. :)

saudet · 2021-01-22T03:35:54Z

There is a good reason to provide UTF-16 support through JNI functions though: performance, see issue #442 (comment).

Thanks to @equeim, we should have this implemented soon! (Including "modified UTF-8" while we're at it.)

saudet · 2021-03-01T14:09:30Z

Ok, it's done! Thanks to @equeim and his contribution in pull #460, we can now map String arguments to, for example, UTF-8 by annotating a class with something like @Platform(define = "STRING_BYTES_CHARSET \"UTF-8\", ...), while we can still use @AsUtf16 on individual String parameters to get fast access to the underlying UTF-16 code units.

saudet · 2021-03-01T14:14:45Z

BTW, we may want to update the guide here with a new section for that:
https://github.com/bytedeco/javacpp/wiki/Mapping-Recipes

gwenn · 2021-03-08T17:51:34Z

Thanks. I will try to use sqlite3.*16 functions with @AsUtf16 annotation where possible. In these cases, I guess I can get rid of @Cast("const char*") BytePointer and use a String directly.

saudet · 2021-03-09T01:54:38Z

These changes have been released with JavaCPP 1.5.5. Enjoy!

Thanks. I will try to use sqlite3.*16 functions with @AsUtf16 annotation where possible. In these cases, I guess I can get rid of @Cast("const char*") BytePointer and use a String directly.

You can also provide overloads with CharPointer to allow passing UTF-16 data to and from off-heap memory as well.

saudet added the bug label Feb 14, 2016

saudet added a commit that referenced this issue Feb 15, 2016

* Fix NullPointerException in Parser on unexpected forms of func…

eabafa4

…tion pointers (issue #70)

saudet added a commit that referenced this issue Apr 17, 2016

* Treat all String with Charset.defaultCharset() (or define `MOD…

b8d6b97

…IFIED_UTF8_STRING` for old behavior) (issue #70)

saudet mentioned this issue Oct 9, 2017

Merging with JCuda and JOpenCL projects for better quality cuda interfaces bytedeco/javacpp-presets#475

Open

saudet added enhancement help wanted and removed bug labels Dec 15, 2018

saudet changed the title ~~NullPointerException in Parser~~ NullPointerException in Parser and poor Unicode support Dec 15, 2018

saudet mentioned this issue Jan 22, 2021

Add @Utf16String annotation to map java.lang.String to unsigned short* #442

Merged

saudet removed the help wanted label Mar 1, 2021

saudet closed this as completed Mar 9, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

NullPointerException in Parser and poor Unicode support #70

NullPointerException in Parser and poor Unicode support #70

gwenn commented Feb 14, 2016

saudet commented Feb 14, 2016

gwenn commented Feb 14, 2016

gwenn commented Feb 14, 2016

saudet commented Feb 14, 2016

saudet commented Feb 14, 2016

gwenn commented Feb 14, 2016

saudet commented Feb 15, 2016

gwenn commented Feb 15, 2016

gwenn commented Feb 16, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

gwenn commented Feb 19, 2016

saudet commented Feb 20, 2016

gwenn commented Feb 20, 2016

saudet commented Feb 20, 2016

saudet commented Feb 21, 2016

saudet commented Apr 17, 2016

gwenn commented Apr 17, 2016

saudet commented Apr 17, 2016

saudet commented Apr 18, 2016 •

edited

Loading

gwenn commented Apr 18, 2016 •

edited

Loading

saudet commented Apr 19, 2016

saudet commented Jan 22, 2021

saudet commented Mar 1, 2021

saudet commented Mar 1, 2021

gwenn commented Mar 8, 2021

saudet commented Mar 9, 2021

NullPointerException in Parser and poor Unicode support #70

NullPointerException in Parser and poor Unicode support #70

Comments

gwenn commented Feb 14, 2016

saudet commented Feb 14, 2016

gwenn commented Feb 14, 2016

gwenn commented Feb 14, 2016

saudet commented Feb 14, 2016

saudet commented Feb 14, 2016

gwenn commented Feb 14, 2016

saudet commented Feb 15, 2016

gwenn commented Feb 15, 2016

gwenn commented Feb 16, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

saudet commented Feb 17, 2016

gwenn commented Feb 17, 2016

gwenn commented Feb 19, 2016

saudet commented Feb 20, 2016

gwenn commented Feb 20, 2016

saudet commented Feb 20, 2016

saudet commented Feb 21, 2016

saudet commented Apr 17, 2016

gwenn commented Apr 17, 2016

saudet commented Apr 17, 2016

saudet commented Apr 18, 2016 • edited Loading

gwenn commented Apr 18, 2016 • edited Loading

saudet commented Apr 19, 2016

saudet commented Jan 22, 2021

saudet commented Mar 1, 2021

saudet commented Mar 1, 2021

gwenn commented Mar 8, 2021

saudet commented Mar 9, 2021

saudet commented Apr 18, 2016 •

edited

Loading

gwenn commented Apr 18, 2016 •

edited

Loading