Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HLRC: reindex API with wait_for_completion false #35202

Merged
merged 11 commits into from
Nov 8, 2018

Conversation

pgomulka
Copy link
Contributor

@pgomulka pgomulka commented Nov 2, 2018

Extend High Level Rest Client Reindex API to support requests with
wait_for_completion=false. This method will return a TaskID and results
can be queried with Task API

refers: #27205

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

Extend High Level Rest Client Reindex API to support requests with
wait_for_completion=false. This method will return a TaskID and results
can be queried with Task API

refers: elastic#27205
@pgomulka pgomulka force-pushed the feature/reindex-wait-for-completion branch from 0195e7b to ec3c93d Compare November 2, 2018 13:31
import java.io.IOException;
import java.util.Objects;

public class ReindexSubmissionResponse extends ActionResponse implements ToXContentObject {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this I hope to be refactored with a common TaskSubmissionResponse. This could be then used in migration api and other methods with wait_for_completion=false

* @throws IOException in case there is a problem sending the request or parsing back the response
*/
public final ReindexSubmissionResponse submitReindexTask(ReindexRequest reindexRequest, RequestOptions options) throws IOException {
return performRequestAndParseEntity(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if there's any value to returning a ReindexSubmissionResponse rather than just a TaskId here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe as TaskId is good here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was considering this, but the reason I stayed with a Wrapping object is that Rest API returns taskId wrapped inside an object where TaskID is just one field. Same is in DeleteJobResponse (ML).
However I don't mind breaking this convention. It would however look a bit odd inside TaskID to have a parser that treats taskId as a field within itself. See TaskID from latest revision where I did that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmmmmmm. I see. So the idea is that if we added more than as TaskId to the response then we could stick it on this object. We can do that in the REST response because of the way the JSON is structured. The way your have it we could do that here. I like that argument. Even though it looks like an extra argument, it makes sense. It might be worth sticking on the Javadoc of that object. Somewhere.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Im +1 for a response object matching the response name (like ReindexSubmissionResponse) for this "adding things to the API" reason as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of staying with TaskSubmissionResponse with String taskId field. As this would fit into other wait_for_completion=false type methods as well.

We could in theory always make the taskSubmission abstract and return concrete responses - with all the parsing and fields being already in the abstract class. That would be a bit too complex for that I guess

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with TaskSubmissionResponse. And I'm fine with String taskId. I'd also be fine with a client-side TaskId class that is super opaque. It is an identifier that client's aren't meant to pick apart. A String is fine for that though.

* @throws IOException in case there is a problem sending the request or parsing back the response
*/
public final ReindexSubmissionResponse submitReindexTask(ReindexRequest reindexRequest, RequestOptions options) throws IOException {
return performRequestAndParseEntity(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe as TaskId is good here.


public class ReindexIT extends ESRestHighLevelClientTestCase {

//TODO taken from CrudIT - should these scenarios be moved here?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine with moving them. Especially if it lets us save some copy and pasting.

* @return the submission response
* @throws IOException in case there is a problem sending the request or parsing back the response
*/
public final TaskId submitReindexTask(ReindexRequest reindexRequest, RequestOptions options) throws IOException {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name submit is breaking the API spec contract. We have test that verifies if we are not adding any additional methods apart from those from the API spec.

RestHighLevelClientTests.testApiNamingConventions <<< FAILURES! 19:16:19 > Throwable #1: java.lang.AssertionError: Some client method doesn't match a corresponding API defined in the REST spec: [submit_reindex_task] 19:16:19 > Expected: <0> 19:16:19 > but: was <1>
https://elasticsearch-ci.elastic.co/job/elastic+elasticsearch+pull-request/1777/console

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I just hit that one too trying to return a GetTaskResponse wrapped in an Optional (following discussion on how Java clents should handle 404s)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think maybe you can teach it about these sorts of names. They aren't required for every method, but maybe we could make them required if wait_for_completion was a parameter in the spec.

make rest client test assert about submit task methods
@@ -107,6 +107,9 @@

public class RestHighLevelClientTests extends ESTestCase {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we don't run this test for subclients like tasks() migraiton() etc
maybe we should..

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! We almost certainly should. But we'd have to do some "interesting" work to get the specs in place. We probably could come up with something though.

Copy link
Contributor

@hub-cap hub-cap left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work here, love the test checks. We can merge this or you can update the TaskId stuff, assuming we agree on a type soon :P

);
}
{
// test1: create one doc in dest
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this comment mean? i see it in a few places in this file

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will rename this - I guess someone meant to reindex one document with Id 1 from source to destination

assertThat("the second parameter to submit task method [" + method + "] is the wrong type",
method.getParameterTypes()[1], equalTo(RequestOptions.class));

assertThat("submit task method [" + method + "] must have wait_for_completion parameter in rest spec",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

@pgomulka pgomulka merged commit a90ef6b into elastic:master Nov 8, 2018
jasontedor added a commit to jasontedor/elasticsearch that referenced this pull request Nov 8, 2018
* elastic/master: (25 commits)
  Fixes fast vector highlighter docs per issue 24318. (elastic#34190)
  [ML] Prevent notifications on deletion of a non existent job (elastic#35337)
  [CCR] Auto follow Coordinator fetch cluster state in system context (elastic#35120)
  Mute test for elastic#35361
  Preserve `date_histogram` format when aggregating on unmapped fields (elastic#35254)
  Test: Mute failing SSL test
  Allow unmapped fields in composite aggregations (elastic#35331)
  [RCI] Add IndexShardOperationPermits.asyncBlockOperations(ActionListener<Releasable>) (elastic#34902)
  HLRC: reindex API with wait_for_completion false (elastic#35202)
  Add docs on JNA temp directory not being noexec (elastic#35355)
  [CCR] Adjust list of dynamic index settings that should be replicated (elastic#35195)
  Replicate index settings to followers (elastic#35089)
  Rename RealmConfig.globalSettings() to settings() (elastic#35330)
  [TEST] Cleanup FileUserPasswdStoreTests (elastic#35329)
  Scripting: Add back lookup vars in score script (elastic#34833)
  watcher: Fix integration tests to ensure correct start/stop of Watcher (elastic#35271)
  Remove ALL shard check in CheckShrinkReadyStep (elastic#35346)
  Use soft-deleted docs to resolve strategy for engine operation (elastic#35230)
  [ILM] Check shard and relocation status in AllocationRoutedStep (elastic#35316)
  Ignore date ranges containing 'now' when pre-processing a percolator query (elastic#35160)
  ...
@pcsanwald
Copy link
Contributor

@pgomulka I added v7.0.0 label here, please remove if that seems incorrect.

pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Nov 13, 2018
Extend High Level Rest Client Reindex API to support requests with
wait_for_completion=false. This method will return a TaskSubmissionResult with task identifier as string and results can be queried with Task API

refers: elastic#27205
@javanna
Copy link
Member

javanna commented Nov 13, 2018

@pgomulka I think this needs backporting, the 6.6.0 label was added but I don't see these changes in the 6.x branch.

@pgomulka
Copy link
Contributor Author

pgomulka commented Nov 14, 2018

@javanna sorry about that. Are all HLRC changes supposed to be in 6.x as well? The HLRC is labeled 7.0 in clubhouse

pgomulka added a commit to pgomulka/elasticsearch that referenced this pull request Nov 14, 2018
Extend High Level Rest Client Reindex API to support requests with
wait_for_completion=false. This method will return a TaskSubmissionResult with task identifier as string and results can be queried with Task API

refers: elastic#27205
pgomulka added a commit that referenced this pull request Nov 14, 2018
Extend High Level Rest Client Reindex API to support requests with
wait_for_completion=false. This method will return a TaskSubmissionResult with task identifier as string and results can be queried with Task API

refers: #27205
Original PR against master #35202 
Back Port PR #35527
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants