Watcher: Configure HttpClient parallel sent requests #30130

spinscale
Contributor

The HttpClient used in Watcher is based on the Apache HTTP client. The
current client relies on many of that library's defaults, which are not
always optimal. Two of those defaults are the maximum number of total
connections and the maximum number of connections to a single route.

If one of those limits is reached, the HttpClient waits for a connection
to be released and therefore blocks. To prevent this when many requests
are being executed, this change increases the total connection limit as
well as the per-route connection limit (a route is basically an endpoint:
it contains the host and proxy information, but not a URL).
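
To make the blocking behaviour concrete, here is a minimal sketch in plain Java. It is not the Apache client and not Watcher's code: `BoundedLeasePool`, its methods, and the route strings such as `host-a:443` are hypothetical names chosen for illustration. The sketch models the two limits with semaphores and gives up after a caller-supplied wait so that the effect is easy to observe.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a pooled connection manager; not the Apache HttpClient API. */
final class BoundedLeasePool {
    private final Semaphore total;
    private final int maxPerRoute;
    // One semaphore per route; a route here is just a host string, never a full URL.
    private final Map<String, Semaphore> perRoute = new ConcurrentHashMap<>();

    BoundedLeasePool(int maxTotal, int maxPerRoute) {
        this.total = new Semaphore(maxTotal);
        this.maxPerRoute = maxPerRoute;
    }

    /** Tries to lease a connection; returns false if a limit is still exhausted after the wait. */
    boolean lease(String route, long waitMillis) {
        Semaphore routePermits = perRoute.computeIfAbsent(route, r -> new Semaphore(maxPerRoute));
        try {
            if (!routePermits.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                return false; // per-route limit reached
            }
            if (!total.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                routePermits.release(); // undo the per-route permit
                return false; // total limit reached
            }
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    /** Must only be called after a successful lease for the same route. */
    void release(String route) {
        total.release();
        perRoute.get(route).release();
    }
}
```

With a total limit of 3 and a per-route limit of 2, a third lease for the same route fails until an earlier one is released. Raising both limits is meant to make that situation rare under normal load.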

On top of that, an additional option to evict long-running connections
(which could potentially be reused after some time) has been made
configurable. Because eviction requires an additional background thread,
some changes were needed to ensure that the HttpClient is closed
properly. The eviction timeout can also be configured.
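
The reason eviction forces a proper `close()` can be sketched in plain Java. This is an illustration under assumptions, not Watcher's or the Apache client's implementation: `IdleEvictingPool`, the connection ids, and the timestamp bookkeeping are all hypothetical. The point is only that a component owning a background eviction thread has to shut that thread down when it is closed.

```java
import java.io.Closeable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration of idle eviction with a background thread. */
final class IdleEvictingPool implements Closeable {
    // connection id -> timestamp (millis) of last use
    private final Map<String, Long> lastUsed = new ConcurrentHashMap<>();
    private final ScheduledExecutorService evictor;
    private final long maxIdleMillis;

    IdleEvictingPool(long maxIdleMillis, long checkIntervalMillis) {
        this.maxIdleMillis = maxIdleMillis;
        this.evictor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "idle-connection-evictor");
            t.setDaemon(true);
            return t;
        });
        // Periodically evict entries that have been idle for too long.
        this.evictor.scheduleWithFixedDelay(
            () -> evictIdle(System.currentTimeMillis()),
            checkIntervalMillis, checkIntervalMillis, TimeUnit.MILLISECONDS);
    }

    void markUsed(String connectionId, long nowMillis) {
        lastUsed.put(connectionId, nowMillis);
    }

    /** Drops entries idle longer than maxIdleMillis; returns how many were evicted. */
    int evictIdle(long nowMillis) {
        int evicted = 0;
        for (Map.Entry<String, Long> e : lastUsed.entrySet()) {
            boolean idleTooLong = nowMillis - e.getValue() > maxIdleMillis;
            // remove(key, value) only succeeds if the entry was not refreshed in the meantime
            if (idleTooLong && lastUsed.remove(e.getKey(), e.getValue())) {
                evicted++;
            }
        }
        return evicted;
    }

    int size() {
        return lastUsed.size();
    }

    boolean isClosed() {
        return evictor.isShutdown();
    }

    /** Stops the background thread; without this the evictor would outlive its owner. */
    @Override
    public void close() {
        evictor.shutdownNow();
    }
}
```

Whoever creates such a pool also has to close it, which is why adding eviction meant touching the shutdown path as well.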

Reviewers' note: I am happy to discuss the naming of those options. I am not a fan of having the HTTP client implementation in the setting name, but if we ever change that client again (which we might once the built-in Java HTTP client becomes usable), it might make the settings easier to differentiate. I lean towards removing the "apache", though.

@elasticmachine
Collaborator

Pinging @elastic/es-core-infra

Member

@rjernst rjernst left a comment

I think we should avoid the apache in the setting names, as you mentioned in your summary.

But why do we need these to be settable by users? Could we start by changing the values we use internally, and only add the settings if and when we get enough feedback from users who need non-default values?

@spinscale
Contributor Author

I think we can go without making them configurable at the beginning. I set the number of open connections to a pretty high number so people should not run into any limits with a 'normal' usage of watcher and left a comment.

Member

@rjernst rjernst left a comment

LGTM

@Override
public void close() throws IOException {
    for (Plugin plugin : plugins) {
        plugin.close();
Member

I think this should be IOUtils.close(plugins);?
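
For context on why the loop matters: if one `plugin.close()` throws, the remaining plugins are never closed. The sketch below shows the alternative in plain Java, assuming `IOUtils.close` behaves like Lucene's helper of the same name (close every element, then rethrow the first failure). `CloseAll.closeAll` is a hypothetical stand-in written for this illustration, and it is simplified to handle only `IOException`.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical helper illustrating "close everything, then report the first failure". */
final class CloseAll {
    private CloseAll() {}

    /** Closes every resource even if some fail; rethrows the first failure with later ones suppressed. */
    static void closeAll(Iterable<? extends Closeable> resources) throws IOException {
        IOException first = null;
        for (Closeable resource : resources) {
            try {
                if (resource != null) {
                    resource.close();
                }
            } catch (IOException e) {
                if (first == null) {
                    first = e; // remember the first failure
                } else {
                    first.addSuppressed(e); // keep later failures attached to it
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }
}
```

With three resources where the second one fails, all three are still closed and the caller sees the second resource's exception.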

@spinscale
Contributor Author

@elasticmachine retest this please

@spinscale spinscale merged commit f00890e into elastic:master May 9, 2018
spinscale added a commit that referenced this pull request May 9, 2018
The HTTPClient used in watcher is based on the apache http client. The
current client is using a lot of defaults - which are not always
optimal. Two of those defaults are the maximum number of total
connections and the maximum number of connections to a single route.

If one of those limits is reached, the HTTPClient waits for a connection
to be finished thus acting in a blocking fashion. In order to prevent
this when many requests are being executed, we increase the limit of
total connections as well as the connections per route (a route is
basically an endpoint, which also contains proxy information, not
containing an URL, just hosts).

On top of that an additional option has been set to evict
long running connections, which can potentially be reused after some
time. As this requires an additional background thread, this required
some changes to ensure that the httpclient is closed properly. Also the
timeout for this can be configured.
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request May 9, 2018
…or-you

* elastic/master: (22 commits)
  Docs: Test examples that recreate lang analyzers  (elastic#29535)
  BulkProcessor to retry based on status code (elastic#29329)
  Add GET Repository High Level REST API (elastic#30362)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Add `coordinating_only` node selector (elastic#30313)
  Stop forking groovyc (elastic#30471)
  Avoid setting connection request timeout (elastic#30384)
  Use date format in `date_range` mapping before fallback to default (elastic#29310)
  Watcher: Increase HttpClient parallel sent requests (elastic#30130)
  Mute ML upgrade test (elastic#30458)
  Stop forking javac (elastic#30462)
  Client: Deprecate many argument performRequest (elastic#30315)
  Docs: Use task_id in examples of tasks (elastic#30436)
  Security: Rename IndexLifecycleManager to SecurityIndexManager (elastic#30442)
  [Docs] Fix typo in cardinality-aggregation.asciidoc (elastic#30434)
  Avoid NPE in `more_like_this` when field has zero tokens (elastic#30365)
  Build: Switch to building javadoc with html5 (elastic#30440)
  Add a quick tour of the project to CONTRIBUTING (elastic#30187)
  Reindex: Use request flavored methods (elastic#30317)
  Silence SplitIndexIT.testSplitIndexPrimaryTerm test failure. (elastic#30432)
  ...
dnhatn added a commit that referenced this pull request May 10, 2018
* master:
  Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
  add version compatibility from 6.4.0 after backport, see #30319 (#30390)
  Security: Simplify security index listeners (#30466)
  Add proper longitude validation in geo_polygon_query (#30497)
  Remove Discovery.AckListener.onTimeout() (#30514)
  Build: move generated-resources to build (#30366)
  Reindex: Fold "with all deps" project into reindex (#30154)
  Isolate REST client single host tests (#30504)
  Solve Gradle deprecation warnings around shadowJar (#30483)
  SAML: Process only signed data (#30420)
  Remove BWC repository test (#30500)
  Build: Remove xpack specific run task (#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  LLClient: Add setJsonEntity (#30447)
  Expose CommonStatsFlags directly in IndicesStatsRequest. (#30163)
  Silence IndexUpgradeIT test failures. (#30430)
  Bump Gradle heap to 1792m (#30484)
  [docs] add warning for read-write indices in force merge documentation (#28869)
  Avoid deadlocks in cache (#30461)
  Test: remove hardcoded list of unconfigured ciphers (#30367)
  mute SplitIndexIT due to #30416
  Docs: Test examples that recreate lang analyzers  (#29535)
  BulkProcessor to retry based on status code (#29329)
  Add GET Repository High Level REST API (#30362)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Add `coordinating_only` node selector (#30313)
  Stop forking groovyc (#30471)
  Avoid setting connection request timeout (#30384)
  Use date format in `date_range` mapping before fallback to default (#29310)
  Watcher: Increase HttpClient parallel sent requests (#30130)

# Conflicts:
#	x-pack/plugin/core/src/test/java/org/elasticsearch/xpack/core/LocalStateCompositeXPackPlugin.java
dnhatn added a commit that referenced this pull request May 10, 2018
* 6.x:
  Upgrade to Lucene-7.4-snapshot-6705632810 (#30519)
  Remove Discovery.AckListener.onTimeout() (#30514)
  Build: move generated-resources to build (#30366)
  Reindex: Fold "with all deps" project into reindex (#30154)
  Isolate REST client single host tests (#30504)
  Remove BWC repository test (#30500)
  Build: Remove xpack specific run task (#30487)
  AwaitsFix IntegTestZipClientYamlTestSuiteIT#indices.split tests
  LLClient: Add setJsonEntity (#30447)
  [docs] add warning for read-write indices in force merge documentation (#28869)
  Avoid deadlocks in cache (#30461)
  BulkProcessor to retry based on status code (#29329)
  Avoid setting connection request timeout (#30384)
  Test: remove hardcoded list of unconfigured ciphers (#30367)
  Add GET Repository High Level REST API (#30362)
  mute SplitIndexIT due to #30416
  Docs: Test examples that recreate lang analyzers  (#29535)
  add a comment explaining the need for RetryOnReplicaException on missing mappings
  Pass the task to broadcast actions (#29672)
  Stop forking groovyc (#30471)
  Add `coordinating_only` node selector (#30313)
  Fix accidental error in changelog
  Use date format in `date_range` mapping before fallback to default (#29310)
  Watcher: Increase HttpClient parallel sent requests (#30130)
  [Security][Tests] Azeri(Turkish) locale tripps opensaml dependency
spinscale added a commit to spinscale/elasticsearch that referenced this pull request Jul 6, 2018
spinscale added a commit that referenced this pull request Jul 9, 2018
This is a backport of #30130