Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Initial rewrite of MMapDirectory for JDK-17 preview (incubating) Panama APIs (>= JDK-17-ea-b25) #177

Closed
wants to merge 47 commits into from

Conversation

uschindler
Copy link
Contributor

@uschindler uschindler commented Jun 8, 2021

INFO: This is a followup of #173: It's the same code base, but with API changes from JDK 17 applied

This is just a draft PR for a first insight on memory mapping improvements in JDK 17+.

Some background information: Starting with JDK-14, there is a new incubating module "jdk.incubator.foreign" that has a new, not yet stable API for accessing off-heap memory (and later it will also support calling functions using classical MethodHandles that are located in libraries like .so or .dll files). This incubator module has several versions:

This module more or less overcomes several problems:

  • ByteBuffer API is limited to 32bit (in fact MMapDirectory has to chunk in 1 GiB portions)
  • There is no official way to unmap ByteBuffers when the file is no longer used. There is a way to use sun.misc.Unsafe and forcefully unmap segments, but any IndexInput accessing the file from another thread will crush the JVM with SIGSEGV or SIGBUS. We learned to live with that and we happily apply the unsafe unmapping, but that's the main issue.

@uschindler had many discussions with the team at OpenJDK and finally with the third incubator, we have an API that works with Lucene. It was very fruitful discussions (thanks to @mcimadamore !)

With the third incubator we are now finally able to do some tests (especially performance). As this is an incubating module, this PR first changes a bit the build system:

  • disable -Werror for :lucene:core
  • add the incubating module to compiler of :lucene:core and enable it for all test builds. This is important, as you have to pass --add-modules jdk.incubator.foreign also at runtime!

The code basically just modifies MMapDirectory to use LONG instead of INT for the chunk size parameter. In addition it adds MemorySegmentIndexInput that is a copy of our ByteBufferIndexInput (still there, but unused), but using MemorySegment instead of ByteBuffer behind the scenes. It works in exactly the same way, just the try/catch blocks for supporting EOFException or moving to another segment were rewritten.

It passes all tests and it looks like you can use it to read indexes. The default chunk size is now 16 GiB (but you can raise or lower it as you like; tests are doing this). Of course you can set it to Long.MAX_VALUE, in that case every index file is always mapped to one big memory mapping. My testing with Windows 10 have shown, that this is not a good idea!!!. Huge mappings fragment address space over time and as we can only use like 43 or 46 bits (depending on OS), the fragmentation will at some point kill you. So 16 GiB looks like a good compromise: Most files will be smaller than 6 GiB anyways (unless you optimize your index to one huge segment). So for most Lucene installations, the number of segments will equal the number of open files, so Elasticsearch huge user consumers will be very happy. The sysctl max_map_count may not need to be touched anymore.

In addition, this implements readLongs in a better way than @jpountz did (no caching or arbitrary objects). Nevertheless, as the new MemorySegment API relies on final, unmodifiable classes and coping memory from a MemorySegment to a on-heap Java array, it requires us to wrap all those arrays using a MemorySegment each time (e.g. in readBytes() or readLELongs), there may be some overhead du to short living object allocations (those are NOT reuseable!!!). In short: In future we should throw away on coping/loading our stuff to heap and maybe throw away IndexInput completely and base our code fully on random access. The new foreign-vector APIs will in future also be written with MemorySegment in its focus. So you can allocate a vector view on a MemorySegment and let the vectorizer fully work outside java heap inside our mmapped files! :-)

It would be good if you could checkout this branch and try it in production.

But be aware:

  • You need JDK 17 to compile and run with Gradle (set JAVA_HOME to it)
  • The lucene-core.jar will be JDK17 class files and requires JDK-17 to execute.
  • Also you need to add --add-modules jdk.incubator.foreign to the command line of your Java program/Solr server/Elasticsearch server

It would be good to get some benchmarks, especially by @rmuir or @mikemccand.

My plan is the following:

  • report any bugs or slowness, especially with Hotspot optimizations. The last time I talked to Maurizio, he taked about Hotspot not being able to fully optimize for-loops with long instead of int, so it may take some time until the full performance is there.
  • wait until the final version of project PANAMA-foreign goes into Java's Core Library (no module needed anymore)
  • add a MR-JAR for lucene-core.jar and compile the MemorySegmentIndexInput and maybe some helper classes with JDK 18/19 (hopefully?).

…ning "buffer" to "segment"; also make the segments array final (curSegment == null when closed)
…eException: Cannot close while another thread is accessing the segment"
…ng objects to extend their functionality (like asserting in tests)
… can correctly throw AlreadyClosedEx; TODO: add a test
@uschindler uschindler self-assigned this Jun 8, 2021
@uschindler uschindler marked this pull request as draft June 8, 2021 11:13
@uschindler
Copy link
Contributor Author

It works with build 33, every hour a Jenkins build tests it: https://jenkins.thetaphi.de/job/Lucene-jdk17panama-Linux/ and https://jenkins.thetaphi.de/job/Lucene-jdk17panama-Windows/

Before 25 the API of Panama was on state of java 16, in that case you need to use the Java 16 pull request.

In addition make sure to not use any system installed Gradle but ./gradlew.

@jbhateja
Copy link

jbhateja commented Aug 3, 2021

It works with build 33, every hour a Jenkins build tests it: https://jenkins.thetaphi.de/job/Lucene-jdk17panama-Linux/ and https://jenkins.thetaphi.de/job/Lucene-jdk17panama-Windows/

Before 25 the API of Panama was on state of java 16, in that case you need to use the Java 16 pull request.

In addition make sure to not use any system installed Gradle but ./gradlew.

Thanks @uschindler for your help, I have removed my other comments from this patch to reduce the noise.

…o draft/jdk-foreign-mmap-jdk17

# Conflicts:
#	lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java
@markrmiller
Copy link
Member

Just stumbled on this. I've also recently thought a bit about Lucene JMH benchmarks that are semi equivalent to lucene-util benchmarks. I've been spending some free time here and there trying to bring some JMH to lucene-util, mainly perfasm output and async profiler, for which I also have a rough patch to use in the tests waiting for some clean up. I have some rough working stuff, but there tends to be a lot of resulting noise in the outputs that I have to look at filtering out or targeting the capture more with a more invasive integration. More than once, I've thought, it would be nicer to just offer the same suite of benchmarks within JMH and capture the other things it offers as well.

As Robert says, it's no replacement for realistic high level benchmarks in most case, especially for Solr, I've used the same unit/integration test analogy, but it's fantastic for fast, reliable feedback with superb introspection tools built in.

Also, those lambdas are damn slow until hotspot takes care of them :) I saw that slowness way back and chased what the deal was for a while, I was ready to toss them out of my toolbox, and yeah, they just need time. Then they are fine, probably not great for places that don't run a lot.

JMH will actually run the profilers during warmup runs to warm them up, but then it dumps the data, so it's nice that even profiling just captures the warm iterations if you'd like. I don't like that it also includes setup, as it says, to be sure and catch the edges. I could do without that when you are constructing indexes or something to be queried, but if you setup larger stuff like lucene-util does to have at hand already, that circumvents that.

@markrmiller
Copy link
Member

and also enables strong physical isolation of indexing and searching JVMs which have very different resources requirements! OK

This is a real issue with impact that comes up. Indexing on the replica has a surprising impact on query performance if you believe the reports. Those used to adding a bunch of replicas just for read side fan out felt the difference if they tried to migrate.

…o draft/jdk-foreign-mmap-jdk17

# Conflicts:
#	gradle/testing/defaults-tests.gradle
#	lucene/core/src/java/org/apache/lucene/util/Unwrappable.java
uschindler and others added 7 commits December 20, 2021 13:16
# Conflicts:
#	gradle/java/javac.gradle
…o draft/jdk-foreign-mmap-jdk17

# Conflicts:
#	lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java
#	lucene/core/src/test/org/apache/lucene/store/TestMmapDirectory.java
# Conflicts:
#	lucene/core/src/java/module-info.java
#	lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java
…o draft/jdk-foreign-mmap-jdk17

# Conflicts:
#	lucene/core/src/java/org/apache/lucene/store/MMapDirectory.java
#	lucene/replicator/src/test/org/apache/lucene/replicator/nrt/TestStressNRTReplication.java
@uschindler
Copy link
Contributor Author

Closing this as the JDK 19 impl was merged (#912).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants