
ArrayIndexOutOfBoundsException in tryCity method from maxmind #68

Closed
edelgadoh opened this issue Feb 20, 2020 · 12 comments
@edelgadoh

Dear MaxMind developers,

We are using the latest version of the library:

   <dependency>
        <groupId>com.maxmind.geoip2</groupId>
        <artifactId>geoip2</artifactId>
        <version>2.13.0</version>
    </dependency>

From time to time we observe errors from the MaxMind library; you can see the stack trace below:

"stack_trace":"java.lang.ArrayIndexOutOfBoundsException: Index 34 out of bounds for length 16\n\tat com.maxmind.db.Decoder$Type.get(Decoder.java:53)\n\tat com.maxmind.db.Decoder.decode(Decoder.java:129)\n\tat com.maxmind.db.Decoder.decode(Decoder.java:88)\n\tat com.maxmind.db.Reader.resolveDataPointer(Reader.java:256)\n\tat com.maxmind.db.Reader.getRecord(Reader.java:176)\n\tat com.maxmind.geoip2.DatabaseReader.get(DatabaseReader.java:271)\n\tat com.maxmind.geoip2.DatabaseReader.tryCity(DatabaseReader.java:334)

From our analysis of the code, the error originates in Decoder.class, specifically in this part:

    static enum Type {
        EXTENDED,
        POINTER,
        UTF8_STRING,
        DOUBLE,
        BYTES,
        UINT16,
        UINT32,
        MAP,
        INT32,
        UINT64,
        UINT128,
        ARRAY,
        CONTAINER,
        END_MARKER,
        BOOLEAN,
        FLOAT;

        static final Decoder.Type[] values = values();

        private Type() {
        }

        static Decoder.Type get(int i) {
            return values[i];
        }
    }

Sometimes the `i` parameter is 36, but the enum has only 16 values.
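The failure mode can be reproduced in isolation with a plain Java enum. This is a minimal sketch: `Kind`, `VALUES`, and `get` are hypothetical stand-ins mirroring `Decoder.Type` and its lookup above, not the real MaxMind classes.

```java
// Stand-in for Decoder.Type: a fixed-size enum indexed by a control
// byte read from the database file, with no bounds checking.
public class EnumIndexDemo {
    enum Kind { EXTENDED, POINTER, UTF8_STRING, DOUBLE, BYTES, UINT16,
                UINT32, MAP, INT32, UINT64, UINT128, ARRAY, CONTAINER,
                END_MARKER, BOOLEAN, FLOAT }

    static final Kind[] VALUES = Kind.values();

    static Kind get(int i) {
        // No bounds check: a corrupt control byte (e.g. 34) lands here.
        return VALUES[i];
    }

    public static void main(String[] args) {
        System.out.println(get(2)); // UTF8_STRING: a valid index
        try {
            get(34);                // corrupt input, as in the stack trace
        } catch (ArrayIndexOutOfBoundsException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```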

Thanks

@oschwald
Member

It sounds like your database is corrupt.

@edelgadoh
Author

edelgadoh commented Feb 20, 2020

But it works successfully for about 12K requests per minute in production and generates only about 50 errors per minute. Why do you say it sounds like it is corrupted?

@oschwald
Member

The fact that it is intermittent supports the idea that it is corrupted. Although the error handling could be improved in this case to throw a nicer exception, what is happening is that it is hitting an unknown and invalid data type in the file while decoding a record. Are you storing the database in a JAR file? If so, please ensure you have binary filtering disabled.
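For reference, one way to keep Maven resource filtering from mangling the binary `.mmdb` bytes is the `nonFilteredFileExtensions` option of the standard `maven-resources-plugin`. A sketch, to be adapted to your build:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-resources-plugin</artifactId>
  <configuration>
    <!-- Treat .mmdb files as binary: copy them verbatim and never
         run property filtering over their bytes. -->
    <nonFilteredFileExtensions>
      <nonFilteredFileExtension>mmdb</nonFilteredFileExtension>
    </nonFilteredFileExtensions>
  </configuration>
</plugin>
```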

@edelgadoh
Author

edelgadoh commented Feb 20, 2020

I see. Right now we are storing the MaxMind database on a shared volume in Kubernetes, not inside a JAR file. I agree that if the MaxMind library threw a checked exception it would be easier for us to catch it properly. For now, until a possible improvement, we are going to catch the runtime exception in our code. Thank you very much!
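A defensive wrapper along these lines can contain the failure until the library throws a typed exception. This is a sketch: `CityLookup` and `lookup` are hypothetical stand-ins for code that calls `DatabaseReader.tryCity`; the real MaxMind reader is not invoked here.

```java
import java.util.Optional;

// Catch the runtime exception around a lookup call so a corrupt or
// mid-replacement database degrades to an empty result instead of
// propagating ArrayIndexOutOfBoundsException to the caller.
public class SafeLookup {
    interface CityLookup {
        String lookup(String ip); // may throw an unchecked exception
    }

    static Optional<String> tryLookup(CityLookup reader, String ip) {
        try {
            return Optional.ofNullable(reader.lookup(ip));
        } catch (ArrayIndexOutOfBoundsException e) {
            // Corrupt database bytes: degrade gracefully.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        CityLookup ok = ip -> "London";
        CityLookup broken = ip -> { throw new ArrayIndexOutOfBoundsException(34); };
        System.out.println(tryLookup(ok, "1.2.3.4"));      // Optional[London]
        System.out.println(tryLookup(broken, "1.2.3.4"));  // Optional.empty
    }
}
```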

@edelgadoh
Author

@oschwald I have a question: will a task be created to improve this exception handling?

@oschwald
Member

Yes, we plan on improving the exception to throw an InvalidDatabaseException under this condition. However, you should not be seeing this issue with a valid MMDB file. Under what conditions are you seeing this error?

@edelgadoh
Author

Hi Gregory, great to know that!

As I commented before, we are using a shared volume in Kubernetes and read the MaxMind files from a directory on it. We have two JAR files that are completely independent and only read from that path to create a bean.
Another point is that these files are updated from time to time by another team, and we suspect the MaxMind cache, but that is not conclusive yet.

As I commented before, about 98% of requests work fine; only 2% generate a runtime exception, which for now we try/catch in our code.

We are going to inspect it in more detail.
Thank you!

@oschwald
Member

If the files are not replaced atomically on update, you would very likely see errors such as this as well as various InvalidDatabaseException errors.

@edelgadoh
Author

@oschwald yeah, the files are replaced atomically.

We reproduced the error in a local environment. Steps to reproduce:

  1. make two sets of MaxMind files: the first dataset has the old files, the second dataset has the new files
  2. configure JMeter to call the MaxMind API with many IPs for 3 minutes
  3. while the process in step 2 is running, swap between the first and second datasets
  4. in our tests, we saw errors in the logs about the MaxMind database being corrupted

In our system we were using the MaxMind read option MEMORY_MAP (https://github.com/maxmind/MaxMind-DB-Reader-java#usage). After changing it to the MEMORY option, we could no longer reproduce the error in our local environment.
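The difference between the two modes comes down to whether the reader holds a private in-heap snapshot or a live view of the file. A minimal sketch with plain `java.nio` (no MaxMind classes; note that whether a memory-mapped buffer observes later writes to the file is platform-dependent, though on Linux it typically does, which is why in-place rewrites corrupt MEMORY_MAP readers):

```java
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Contrasts a MEMORY-style heap snapshot with a MEMORY_MAP-style
// live mapping while the underlying file is rewritten in place.
public class FileModeDemo {
    public static void main(String[] args) throws Exception {
        Path db = Files.createTempFile("demo", ".mmdb");
        Files.write(db, new byte[] { 'A' });

        // MEMORY-style: copy the whole file into the heap once.
        byte[] snapshot = Files.readAllBytes(db);

        // MEMORY_MAP-style: map the file; the mapping may track later writes.
        MappedByteBuffer mapped;
        try (FileChannel ch = FileChannel.open(db, StandardOpenOption.READ)) {
            mapped = ch.map(FileChannel.MapMode.READ_ONLY, 0, 1);
        }

        // Simulate a non-atomic database refresh rewriting the file in place.
        Files.write(db, new byte[] { 'B' });

        System.out.println("snapshot: " + (char) snapshot[0]); // always A
        System.out.println("mapped:   " + (char) mapped.get(0)); // platform-dependent, often B
        Files.delete(db);
    }
}
```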

@oschwald
Member

oschwald commented Apr 8, 2020

Can you share the exact mechanism used for (3) and the OS and file system?

@oschwald oschwald transferred this issue from maxmind/GeoIP2-java Apr 9, 2020
oschwald added a commit that referenced this issue Apr 9, 2020
Previously, we would throw an ArrayIndexOutOfBoundsException.

Closes #68.
@horgh horgh closed this as completed in 4273ca8 Apr 9, 2020
@edelgadoh
Author

The exact mechanism used in (3) is:

We used JMeter to make GET requests to one fixed endpoint that uses MaxMind (Spring Boot on Java, Amazon Corretto 11). We configured JMeter to create 100 threads, wait 5 seconds, and repeat 3 times. The whole process takes about 15 seconds.

During these 15 seconds of execution, we swapped between the old and new MaxMind databases (from the terminal we ran cp -r /tmp/oldMaxmind/ /project/maxmind-files/ and then cp -r /tmp/newMaxmind/ /project/maxmind-files/, at least 6 times).

My notebook runs Ubuntu 19:
Linux NI-45848-1 5.3.0-050300-generic #201909152230 SMP Sun Sep 15 22:32:54 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

Thank you!

@oschwald
Member

cp is not atomic, which explains the issue. If you want to replace the file atomically, you need to copy the new file to the same filesystem and then mv it over the old file.
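The same stage-then-rename pattern can be sketched in Java with `java.nio.file` (paths and file names here are illustrative; `Files.move` with `ATOMIC_MOVE` is the rename step):

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Atomically replace a live database file: stage the new file on the
// SAME filesystem, then rename it over the old one. Readers see either
// the complete old file or the complete new file, never a mix.
public class AtomicReplace {
    static void replace(Path newDb, Path liveDb) throws Exception {
        // Stage next to the target so the final move is a plain rename.
        Path staged = liveDb.resolveSibling(liveDb.getFileName() + ".tmp");
        Files.copy(newDb, staged, StandardCopyOption.REPLACE_EXISTING);
        Files.move(staged, liveDb,
                   StandardCopyOption.ATOMIC_MOVE,
                   StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws Exception {
        Path dir = Files.createTempDirectory("mmdb");
        Path live = dir.resolve("GeoIP2-City.mmdb");
        Path next = dir.resolve("download.mmdb");
        Files.write(live, new byte[] { 1 });
        Files.write(next, new byte[] { 2 });
        replace(next, live);
        System.out.println(Files.readAllBytes(live)[0]); // 2
    }
}
```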
