syncing with original #1

Merged: 10,000 commits merged into kaypeter87:master on Sep 9, 2019

Conversation

kaypeter87 (Owner)

  • Have you signed the contributor license agreement?
  • Have you followed the contributor guidelines?
  • If submitting code, have you built your formula locally prior to submission with gradle check?
  • If submitting code, is your pull request against master? Unless there is a good reason otherwise, we prefer pull requests against master and will backport as needed.
  • If submitting code, have you checked that your submission is for an OS that we support?
  • If you are submitting this code for a class then read our policy for that.

przemekwitek and others added 30 commits August 22, 2019 20:47
…thrown when the process doesn't start correctly. (#45846)
Two examples had swapped the order of lang and code when creating a
script.

Relates #43884
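For context, a minimal sketch of the corrected argument order when constructing a script via the Java API (the field name and script body here are made up for illustration):

```java
import java.util.Collections;

import org.elasticsearch.script.Script;
import org.elasticsearch.script.ScriptType;

public class ScriptOrderExample {
    public static void main(String[] args) {
        // Script(type, lang, idOrCode, params): the language argument comes
        // before the source code, which is the order the examples had swapped.
        Script script = new Script(
            ScriptType.INLINE,
            "painless",                  // lang
            "doc['my_field'].value * 2", // code (hypothetical field)
            Collections.emptyMap());
        System.out.println(script.getIdOrCode());
    }
}
```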
Today, when rolling a new translog generation, we block all write threads until a new generation is created. This choice is perfectly fine except in a highly concurrent environment with the translog async setting. We can reduce the blocking time by pre-syncing the current generation without the write lock before rolling. The new step fsyncs most of the data of the current generation without blocking write threads.

Closes #45371
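A minimal sketch of the pattern described above (the lock and the sync/roll methods are hypothetical stand-ins for the translog internals, not the actual implementation):

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class PreSyncRollExample {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    void rollGeneration() {
        // Step 1: fsync the bulk of the current generation *before* taking
        // the write lock, so concurrent writers are not blocked for it.
        syncCurrentGeneration();

        lock.writeLock().lock();
        try {
            // Step 2: only the small tail written since the pre-sync needs
            // flushing while writers are actually blocked.
            syncCurrentGeneration();
            createNewGeneration();
        } finally {
            lock.writeLock().unlock();
        }
    }

    void syncCurrentGeneration() { /* fsync the current translog file */ }
    void createNewGeneration()   { /* open a fresh translog generation */ }
}
```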
This commit namespaces the existing processors setting under the "node"
namespace. In doing so, we deprecate the existing processors setting in
favor of node.processors.
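For illustration only, a namespaced node-scoped setting might be declared along these lines (a sketch, not the actual registration code in Elasticsearch):

```java
import org.elasticsearch.common.settings.Setting;
import org.elasticsearch.common.settings.Setting.Property;

public class NodeProcessorsExample {
    // New setting under the "node" namespace; the old top-level
    // "processors" setting is deprecated in its favor.
    static final Setting<Integer> NODE_PROCESSORS = Setting.intSetting(
        "node.processors",
        Runtime.getRuntime().availableProcessors(), // default: all available
        1,                                          // minimum value
        Property.NodeScope);
}
```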
In case of an in-progress snapshot, this endpoint was broken because
it tried to execute repository operations in the callback on a
transport thread, which is not allowed (only the generic or snapshot
pools are allowed here).
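The usual fix for this class of bug is to fork the blocking work off the transport thread, e.g. onto the generic pool. A sketch (the callback and the repository operation are hypothetical):

```java
import org.elasticsearch.threadpool.ThreadPool;

public class ForkOffTransportThreadExample {
    private final ThreadPool threadPool;

    ForkOffTransportThreadExample(ThreadPool threadPool) {
        this.threadPool = threadPool;
    }

    // Called from a transport-thread callback: instead of doing repository
    // I/O inline (forbidden on transport threads), fork to the GENERIC pool.
    void onResponseFromTransportThread(Runnable repositoryOperation) {
        threadPool.generic().execute(repositoryOperation);
    }
}
```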
This commit enables testing against JDK 14.
Customers occasionally discover a known behavior in Elasticsearch's pagination that does not appear to be documented. This warning is intended to educate customers about this behavior while still highlighting alternative solutions.
In the Sys V init scripts, we check for Java. This is not needed, since
the same check happens in elasticsearch-env when starting up. Having
this duplicate check has bitten us in the past, where we made a change
to the logic in elasticsearch-env, but missed updating it here. Since
there is no need for this duplicate check, we remove it from the Sys V
init scripts.
This commit changes the tests added in #45383 so that the fixture that emulates the S3 service now sometimes consumes the entire request body before sending an error, sometimes consumes only part of the request body, and sometimes consumes nothing. The idea is to beef up the tests that write blobs, because the client's retry logic relies on marking and resetting the blob's input stream.

This pull request also changes testWriteBlobWithRetries() so that it (rarely) tests with a large blob (up to 1 MB), which is more than the client's default read limit on input streams (131 KB).

Finally, it optimizes ZeroInputStream so that it is a bit more efficient (it now works using an internal buffer and System.arraycopy()).
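For reference, the mark/reset contract the retry logic depends on looks roughly like this (a standalone sketch, not the client code itself; the sizes are arbitrary):

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class MarkResetExample {
    public static void main(String[] args) throws IOException {
        byte[] blob = new byte[1024];
        InputStream in = new BufferedInputStream(new ByteArrayInputStream(blob));

        // Mark the stream before the first upload attempt. The read limit
        // caps how many bytes may be read before reset() becomes invalid,
        // which is why blobs larger than the limit need dedicated tests.
        in.mark(131_072);
        in.read(new byte[512]); // a failed attempt consumes part of the body

        // On retry, rewind to the mark and replay the body from the start.
        in.reset();
    }
}
```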
Closing a `RemoteClusterConnection` concurrently with trying to connect
could result in the listener being invoked twice.

This fixes
RemoteClusterConnectionTest#testCloseWhileConcurrentlyConnecting

Closes #45845
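A common way to guarantee a listener fires exactly once under such races is a compare-and-set guard. A generic sketch of the idea, not the actual fix:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Consumer;

public class OnceListenerExample {
    private final AtomicBoolean notified = new AtomicBoolean();
    private final Consumer<String> listener;

    OnceListenerExample(Consumer<String> listener) {
        this.listener = listener;
    }

    // Both the connect path and the close path may call this concurrently;
    // the CAS ensures only the first caller actually notifies the listener.
    void notifyOnce(String result) {
        if (notified.compareAndSet(false, true)) {
            listener.accept(result);
        }
    }
}
```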
* [ML][Transforms] fix doSaveState check

* removing unnecessary log statement
Previously, the stats API reported a progress percentage for
DF analytics tasks that were running and in the
`reindexing` or `analyzing` state.

This meant that when the task was `stopped`, no progress was
reported. Thus, one could not distinguish between a task that never
ran and one that completed.

In addition, there were blind spots in the progress reporting.
In particular, we did not account for when data is loaded into the
process, nor for when results are written.

This commit addresses the above issues. It changes progress
to being a list of objects, each one describing the phase
and its progress as a percentage. We currently have 4 phases:
reindexing, loading_data, analyzing, writing_results.

When the task stops, progress is persisted as a document in the
state index. The stats API now reports in-memory progress if the
task is running, or returns the persisted document (if there is one).
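Illustratively, the reported progress then has a shape along these lines (the field names are indicative, not a verbatim API response; the percentages are made up):

```json
"progress": [
  { "phase": "reindexing",      "progress_percent": 100 },
  { "phase": "loading_data",    "progress_percent": 100 },
  { "phase": "analyzing",       "progress_percent": 42 },
  { "phase": "writing_results", "progress_percent": 0 }
]
```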
…#45688)

This commit makes all the async methods in the high level client return the `Cancellable` object that the low level client now exposes.

Relates to #45379 
Closes #44802
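For example, with the high level client a caller can now hold on to the returned handle and cancel the in-flight request later (a sketch; the index name and listener are placeholders):

```java
import org.elasticsearch.action.ActionListener;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.Cancellable;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestHighLevelClient;

public class CancellableExample {
    static Cancellable search(RestHighLevelClient client,
                              ActionListener<SearchResponse> listener) {
        // Async methods now return a Cancellable instead of void.
        Cancellable cancellable = client.searchAsync(
            new SearchRequest("my-index"), RequestOptions.DEFAULT, listener);
        return cancellable; // callers may invoke cancellable.cancel() later
    }
}
```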
This PR modifies the logic in IngestService to preserve the original content type 
on the IndexRequest, such that when a document with a content type like SMILE 
is submitted to a pipeline, the resulting document that is persisted will remain in 
the original content type (SMILE in this case).
This fixes two bugs:
- A recently introduced bug where an NPE will be thrown if a catch block is empty.
- A long-standing bug where an NPE will be thrown if multiple catch blocks in a row are empty for the same try block.
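For instance, a Painless script shaped like this (hypothetical, using Painless's Java-like syntax) would previously have tripped those NPEs:

```java
// Painless source; both empty catch blocks below used to cause compiler NPEs.
try {
    int x = 1 / 0;
} catch (ArithmeticException e) {
    // empty catch block (the recently introduced NPE)
} catch (Exception e) {
    // second empty catch in a row for the same try (the long-standing NPE)
}
```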
If two translog syncs happen concurrently, then one can return before
its operations are marked as persisted. In general, this should not be
an issue; however, peer recoveries currently rely on this assumption.

Closes #29161
AbstractSimpleTransportTestCase.testTransportProfilesWithPortAndHost
expects a host to only have a single IPv4 loopback address, which isn't
necessarily the case. Allow for >= 1 address.
* [ML][Transforms] adjusting when and what to audit

* Update DataFrameTransformTask.java

* removing unnecessary audit message
jrodewig and others added 29 commits September 5, 2019 15:03
When the Elasticsearch process does not have write permissions to
upgrade the Elasticsearch keystore, we bail with an error message that
indicates there is a filesystem permissions problem. This commit
clarifies that error message by pointing out the directory where write
permissions are required, or that the user can also run the
elasticsearch-keystore upgrade command manually before starting the
Elasticsearch process. In this case, the upgrade would not be needed at
runtime, so the permissions would not be needed then.
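For reference, the manual upgrade mentioned above can be run ahead of time (the install path shown is an assumption based on a standard package layout):

```sh
# Run with write access to the keystore's directory, before starting
# the Elasticsearch service:
/usr/share/elasticsearch/bin/elasticsearch-keystore upgrade
```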
Improve the error message if stopping a transform times out.

Related: #45610
This commit fixes the usage of randomIntBetween() in the test
testWriteBlobWithRetries, where the test generates a random array
of a single byte.
Resolve the incorrect current scroll for deleted or closed index
This refactors `DataFrameAnalyticsTask` into its own class.
The task has quite a lot of functionality now, and I believe it
makes the code more readable to have it live as its own class
rather than as an inner class of the start action class.
Further investigation into #46091, expanding on #46363, to add even more
detailed logging around the retry behaviour during index creation.
simplify transform task and indexer

 - remove redundant transform id
 - move the client data frame indexer (and its builder) into a separate file
…46432)

ML users who upgrade from versions prior to 7.4 to 7.4 or later
will have ML results indices that do not have mappings for the
total_search_time_ms field.  Therefore, when searching these
indices we must tolerate this field not having a mapping.

Fixes #46437
We are seeing requests take longer than the default 30s, which leads
to requests being retried and returning unexpected failures like
"index already exists", because the initial requests that timed out
actually completed successfully.

=> Double the timeout to reduce the likelihood of the failures
described in #46091.
=> As suggested in the issue, a follow-up should probably turn off
retrying altogether.
* Fix an issue where Painless scripts were not generated correctly when
datetime functions are used for GROUPing of an INTERVAL operation.
Previously, we ignored replication for noop updates because they did
not have sequence numbers. Since #44603, we have assigned sequence
numbers to noop updates, leading them to be replicated to replicas.

This bug occurs only on 8.0, as it requires #41065 and #44603.

Closes #46366
We hit a bug where we can't partially update documents created in a
mixed cluster between 5.x and 6.x. Although this bug does not affect
7.0 or later, we should have a good test that catches this issue.

Relates #46198
@kaypeter87 kaypeter87 merged commit b19f505 into kaypeter87:master Sep 9, 2019