Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid NPE in AbstractHttpClientMetricsHandler write listener #2539

Merged

Conversation

pderop
Copy link
Contributor

@pderop pderop commented Oct 10, 2022

Under heavy load, an HttpClient may get the following NPE (using H2C):

10:54:05.730 [reactor-http-nio-6] WARN r.n.h.c.AbstractHttpClientMetricsHandler - [2a3a0c7d, L:/127.0.0.1:51440 - R:/127.0.0.1:8080](H2 - 6999) Exception caught while recording metrics.
java.lang.NullPointerException: null
    at java.base/java.util.Objects.requireNonNull(Objects.java:208)
    at io.micrometer.core.instrument.ImmutableTag.<init>(ImmutableTag.java:37)
    at io.micrometer.core.instrument.Tag.of(Tag.java:31)
    at io.micrometer.core.instrument.Tags.of(Tags.java:265)
    at io.micrometer.core.instrument.AbstractTimerBuilder.tags(AbstractTimerBuilder.java:60)
    at io.micrometer.core.instrument.Timer$Builder.tags(Timer.java:358)
    at reactor.netty5.http.client.MicrometerHttpClientMetricsRecorder.lambda$recordDataSentTime$1(MicrometerHttpClientMetricsRecorder.java:72)
    at java.base/java.util.concurrent.ConcurrentHashMap.computeIfAbsent(ConcurrentHashMap.java:1708)
    at reactor.netty5.internal.util.MapUtils.computeIfAbsent(MapUtils.java:46)
    at reactor.netty5.http.client.MicrometerHttpClientMetricsRecorder.recordDataSentTime(MicrometerHttpClientMetricsRecorder.java:70)
    at reactor.netty5.http.client.AbstractHttpClientMetricsHandler.recordWrite(AbstractHttpClientMetricsHandler.java:201)
    at reactor.netty5.http.client.AbstractHttpClientMetricsHandler.lambda$write$0(AbstractHttpClientMetricsHandler.java:97)
    at io.netty5.util.concurrent.DefaultFutureListeners.notifyListeners(DefaultFutureListeners.java:77)
    at io.netty5.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:454)
    at io.netty5.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:439)
    at io.netty5.util.concurrent.DefaultPromise$NotifyListeners.run(DefaultPromise.java:422)
    at io.netty5.util.concurrent.SingleThreadEventExecutor.runTask(SingleThreadEventExecutor.java:338)
    at io.netty5.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:361)
    at io.netty5.channel.SingleThreadEventLoop.run(SingleThreadEventLoop.java:180)
    at io.netty5.util.concurrent.SingleThreadEventExecutor.lambda$doStartThread$4(SingleThreadEventExecutor.java:774)
    at io.netty5.util.internal.ThreadExecutorMap.lambda$apply$1(ThreadExecutorMap.java:68)
    at io.netty5.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.base/java.lang.Thread.run(Thread.java:833)

it seems that the following scenario may explain why we can see such NPE:

  • Netty flushes some requests (it writes requests out of the client socket to the server), but the AbstractHttpClientMetricsHandler write listener is not executed immediately after the request is flushed, the write listener is just scheduled in the event loop.
  • Then , because the server is fast, the response is already available and it seems that netty is then handling some OP_READ operations before executing any write listeners.
  • So the selector reads the response and invoke the AbstractHttpClientMetricsHandler.channelRead, which calls reset (this method will clean all class fields including uri/path).
  • And next, when the write listener is called, we get the NPE because the uri/path are null.

The proposed patch is attempting to fix this issue using a flag which may have two values:

  • REQUEST_SENT: means that the request is full written and the AbstractHttpClientMetricsHandler write listener has been executed
  • RESPONSE_RECEIVED: means that the response is fully received by the AbstractHttpClientMetricsHandler.channelRead method

So, using the above flags, and if the write listener is invoked after AbstractHttpClientMetricsHandler.channelRead method, then it will call reset itself (and in this case the channelRead method won't call reset).

This PR does not include a junit reproducer test (The issue can only be reproduced manually using a custom http client loader).

@pderop pderop added the type/bug A general bug label Oct 10, 2022
@pderop pderop self-assigned this Oct 10, 2022
@pderop pderop added this to the 2.0.x Backlog milestone Oct 11, 2022
@pderop pderop marked this pull request as draft October 11, 2022 05:56
@violetagg violetagg modified the milestones: 2.0.x Backlog, 2.0.0 Oct 11, 2022
@violetagg
Copy link
Member

@pderop ping me when this is ready for a review

@pderop
Copy link
Contributor Author

pderop commented Oct 12, 2022

I need to double check something, will do it soon.

@pderop pderop marked this pull request as ready for review October 13, 2022 15:07
@pderop
Copy link
Contributor Author

pderop commented Oct 13, 2022

@violetagg ,

Can you please review this ? I verified that the NPE still persists even when using the latest netty with the WriteSink PR

thanks.

@pderop pderop requested a review from violetagg October 13, 2022 15:09
@pderop
Copy link
Contributor Author

pderop commented Oct 13, 2022

mmm, wait, I need to double check some more things related to HTTP1.1.
turning back this PR to draft mode ...

@pderop pderop marked this pull request as draft October 13, 2022 15:13
@pderop pderop marked this pull request as ready for review October 17, 2022 09:54
@pderop
Copy link
Contributor Author

pderop commented Oct 17, 2022

@violetagg ,

so, I did tests with HTTP 1.1 and I could not reproduce the problem. can you take a look ?
thanks

@pderop
Copy link
Contributor Author

pderop commented Oct 17, 2022

@violetagg, thanks for the review.

@pderop pderop merged commit f870138 into reactor:netty5 Oct 17, 2022
pderop added a commit that referenced this pull request Oct 18, 2022
Removes NarrowingCompoundAssignment compilation warnings introduced by PR #2539.

The flags field is now an int instead of a byte.
@pderop
Copy link
Contributor Author

pderop commented Sep 15, 2023

This PR has been superseded (replaced) by #2864, which has been merged into the netty5 branch.
For the moment, I can't make a junit test which is reproducing the NPE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug A general bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants