Feature request: Libpostal data download via API #939

Closed
mrog opened this issue Sep 3, 2020 · 6 comments


mrog commented Sep 3, 2020

The libpostal library requires periodic data downloads. These downloads currently have to be done manually with a command-line tool (libpostal_data) that is only available after building libpostal from source. It would be very helpful if I could trigger a data download from Java via the API.

saudet added a commit that referenced this issue Sep 4, 2020
Member

saudet commented Sep 4, 2020

Done in commit bfbd6da! Please give it a try with the snapshots: http://bytedeco.org/builds/

/cc @Maurice-Betzel

Author

mrog commented Sep 4, 2020

Thanks for the quick fix!

I tried it on macOS 10.15.6 using OpenJDK 14.0.2, and it worked beautifully there. I'm really pleased with that!

Then I tried it on Ubuntu 18.04.4, also using OpenJDK 14.0.2, and it always fails partway through the download for some reason.

I used the same data directory on both machines, /tmp/libpostal, and it started out empty each time.
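For context, the kind of call involved can be sketched as below. This is only a sketch under assumptions: the thread does not show how the path to the bundled libpostal_data script is obtained from the preset, so `script` is a plain parameter here, and `LibpostalDataDownload` is a hypothetical helper name. Running the script through `bash` is also an assumption. The `download all <dir>` arguments are the same ones the libpostal command-line tool takes.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of libpostal or the presets.
final class LibpostalDataDownload {

    /** Builds the argument list for "libpostal_data download all <dataDir>". */
    static List<String> command(Path script, Path dataDir) {
        // Assumption: the script is a shell script and bash is on the PATH.
        return List.of("bash", script.toString(), "download", "all", dataDir.toString());
    }

    /** Runs the download, forwarding the script's output, and returns its exit code. */
    static int download(Path script, Path dataDir) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command(script, dataDir))
                .inheritIO() // progress output goes to this JVM's stdout/stderr
                .start();
        return process.waitFor();
    }
}
```

A non-zero return value from `download` would indicate that the script itself failed, as opposed to a failure when loading the native library afterwards.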

Here's the output I'm getting on Ubuntu:

Old version of datadir detected, removing...
Checking for new libpostal language classifier data file...
Failed to set filetime 1515440843 on outfile: errno 1
New libpostal language classifier data file available
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   650  100   650    0     0   4744      0 --:--:-- --:--:-- --:--:--  4744
100 48.0M  100 48.0M    0     0  5266k      0  0:00:09  0:00:09 --:--:-- 5781k
language_classifier/
language_classifier/language_classifier.dat
Checking for new libpostal data file...
Failed to set filetime 1520730226 on outfile: errno 1
New libpostal data file available
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   645  100   645    0     0   8716      0 --:--:-- --:--:-- --:--:--  8716
100 9951k  100 9951k    0     0  4462k      0  0:00:02  0:00:02 --:--:-- 5622k
address_expansions/
address_expansions/address_dictionary.dat
numex/
numex/numex.dat
transliteration/
transliteration/transliteration.dat
Checking for new libpostal parser data file...
Failed to set filetime 1515381053 on outfile: errno 1
New libpostal parser data file available
Downloading multipart: https://github.com/openvenues/libpostal/releases/download/v1.0.0/parser.tar.gz, size=752483239, num_chunks=11
Downloading part 1: filename=/tmp/libpostal/parser.tar.gz.1, offset=0, max=67108863
Downloading part 2: filename=/tmp/libpostal/parser.tar.gz.2, offset=67108864, max=134217727
Downloading part 3: filename=/tmp/libpostal/parser.tar.gz.3, offset=134217728, max=201326591
Downloading part 4: filename=/tmp/libpostal/parser.tar.gz.4, offset=201326592, max=268435455
Downloading part 5: filename=/tmp/libpostal/parser.tar.gz.5, offset=268435456, max=335544319
Downloading part 6: filename=/tmp/libpostal/parser.tar.gz.6, offset=335544320, max=402653183
Downloading part 7: filename=/tmp/libpostal/parser.tar.gz.7, offset=402653184, max=469762047
Downloading part 8: filename=/tmp/libpostal/parser.tar.gz.8, offset=469762048, max=536870911
Downloading part 9: filename=/tmp/libpostal/parser.tar.gz.9, offset=536870912, max=603979775
Downloading part 10: filename=/tmp/libpostal/parser.tar.gz.10, offset=603979776, max=671088639
Downloading part 11: filename=/tmp/libpostal/parser.tar.gz.11, offset=671088640, max=752483239
address_parser/
address_parser/address_parser_crf.dat
address_parser/address_parser_phrases.dat
address_parser/address_parser_postal_codes.dat
address_parser/address_parser_vocab.trie
Exception in thread "main" java.lang.UnsatisfiedLinkError: no jnipostal in java.library.path: [/usr/java/packages/lib, /usr/lib64, /lib64, /lib, /usr/lib]
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2680)
    at java.base/java.lang.Runtime.loadLibrary0(Runtime.java:807)
    at java.base/java.lang.System.loadLibrary(System.java:1907)
    at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1683)
    at org.bytedeco.javacpp.Loader.load(Loader.java:1300)
    at org.bytedeco.javacpp.Loader.load(Loader.java:1123)
    at org.bytedeco.libpostal.global.postal.<clinit>(postal.java:14)
    at com.opsecsecurity.genesis.nis.storage.transform.PhysicalAddressParser.initialize(PhysicalAddressParser.java:28)
    at com.opsecsecurity.genesis.nis.dataload.DataLoader.main(DataLoader.java:49)
Caused by: java.lang.UnsatisfiedLinkError: /home/mrogers/.javacpp/cache/ui-network-intelligence-service-dataload-0-SNAPSHOT-jar-with-dependencies.jar/org/bytedeco/libpostal/linux-x86_64/libjnipostal.so: /lib/x86_64-linux-gnu/libm.so.6: version `GLIBC_2.29' not found (required by /home/mrogers/.javacpp/cache/ui-network-intelligence-service-dataload-0-SNAPSHOT-jar-with-dependencies.jar/org/bytedeco/libpostal/linux-x86_64/libpostal.so.1)
    at java.base/java.lang.ClassLoader$NativeLibrary.load0(Native Method)
    at java.base/java.lang.ClassLoader$NativeLibrary.load(ClassLoader.java:2452)
    at java.base/java.lang.ClassLoader$NativeLibrary.loadLibrary(ClassLoader.java:2508)
    at java.base/java.lang.ClassLoader.loadLibrary0(ClassLoader.java:2704)
    at java.base/java.lang.ClassLoader.loadLibrary(ClassLoader.java:2637)
    at java.base/java.lang.Runtime.load0(Runtime.java:745)
    at java.base/java.lang.System.load(System.java:1871)
    at org.bytedeco.javacpp.Loader.loadLibrary(Loader.java:1633)
    ... 5 more

Is this exception caused by a problem in libpostal_data, or in libpostal-platform?
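As an aside on the multipart download in the log above: the part boundaries look consistent with a 64 MiB chunk size, a chunk count obtained by integer division, and a last part that absorbs the remainder. The sketch below reproduces the logged offset/max values under that reading. `MultipartRanges` is a hypothetical name, and the rule is inferred from the log only, not taken from the libpostal_data source.

```java
// Hypothetical illustration; the chunking rule is inferred from the log output.
final class MultipartRanges {

    // 67108864 bytes: the distance between consecutive offsets in the log.
    static final long CHUNK_SIZE = 64L * 1024 * 1024;

    /** Returns {offset, max} for each part, in order. */
    static long[][] ranges(long size) {
        // Integer division rounds down: 752483239 / 67108864 = 11.
        // A size below CHUNK_SIZE yields zero parts; that case is not covered here.
        int numChunks = (int) (size / CHUNK_SIZE);
        long[][] parts = new long[numChunks][2];
        for (int i = 0; i < numChunks; i++) {
            long offset = i * CHUNK_SIZE;
            // Every part but the last ends one byte before the next offset;
            // the last part ends at the logged "size" value.
            long max = (i == numChunks - 1) ? size : offset + CHUNK_SIZE - 1;
            parts[i][0] = offset;
            parts[i][1] = max;
        }
        return parts;
    }
}
```

Calling `MultipartRanges.ranges(752483239L)` gives 11 parts, the first covering 0 to 67108863 and the last covering 671088640 to 752483239, which matches the log.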

Author

mrog commented Sep 4, 2020

I should also point out that I built libpostal from source on the Ubuntu computer, and I can run that build of libpostal_data successfully.

Member

saudet commented Sep 5, 2020

I've deployed from my local Fedora machine for now, which is why the binaries need a newer glibc. That's going to go away once the build for CentOS is uploaded.
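To check whether a given machine is affected, one option is to compare its glibc version with the one named in the error (GLIBC_2.29). Ubuntu 18.04 ships glibc 2.27, which is older. The sketch below assumes a glibc-based Linux system where `getconf GNU_LIBC_VERSION` prints a line such as `glibc 2.27`. `GlibcVersion` is a hypothetical helper, not something provided by the presets.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical diagnostic helper.
final class GlibcVersion {

    /** Compares dotted versions such as "2.27" and "2.29" numerically, part by part. */
    static int compare(String a, String b) {
        String[] x = a.split("\\.");
        String[] y = b.split("\\.");
        for (int i = 0; i < Math.max(x.length, y.length); i++) {
            int xi = i < x.length ? Integer.parseInt(x[i]) : 0;
            int yi = i < y.length ? Integer.parseInt(y[i]) : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Extracts "2.27" from getconf output of the form "glibc 2.27". */
    static String parseGetconf(String line) {
        String trimmed = line.trim();
        return trimmed.substring(trimmed.lastIndexOf(' ') + 1);
    }

    /** Asks the system for its glibc version (glibc-based Linux only). */
    static String installed() throws IOException, InterruptedException {
        Process p = new ProcessBuilder("getconf", "GNU_LIBC_VERSION")
                .redirectErrorStream(true)
                .start();
        String out = new String(p.getInputStream().readAllBytes(), StandardCharsets.UTF_8);
        p.waitFor();
        return parseGetconf(out);
    }
}
```

If `GlibcVersion.compare(GlibcVersion.installed(), "2.29")` is negative, the snapshot binaries built on the Fedora machine would be expected to fail to load in the way shown in the stack trace.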

Author

mrog commented Sep 5, 2020

Sounds good. Thanks!

Member

saudet commented Sep 10, 2020

It's been released with version 1.5.4, and everything should work fine with that. If not, please let me know! Thanks for reporting.

saudet closed this as completed Sep 10, 2020