Possible Bulk Update Optimizations #26802

Closed
imotov opened this issue Sep 27, 2017 · 11 comments · Fixed by #29264
Labels
discuss, :Distributed/CRUD (a catch-all label for issues around indexing, updating and getting a doc by id; not search)

Comments

@imotov (Contributor) commented Sep 27, 2017

We had several reports of slow bulk updates in 5.0 and above; some of them are summarized in #23792. The main cause of the significant slowdown seems to be the refresh that is now performed before each update operation: a bulk request with 1000 updates on the same record has to perform 1000 refreshes. This could be avoided if we combined all update operations on the same record within a shard bulk into a single update operation; that way we would only need to refresh once.

For example, if we have the following bulk request:

update rec A
update rec A
update rec B
update rec A

it currently translates into:

refresh // possibly if A was updated in the previous bulk and wasn't refreshed yet
get A
update A
index A

refresh // always
get A
update A
index A

// no refresh - because B wasn't modified in this bulk yet
get B
update B
index B

refresh // always
get A
update A
index A

If we combine all updates on the same record, we could transform it into:

refresh
get A
update A
update A
update A
index A
get B
update B
index B

We would still have an issue if index or delete operations are mixed with update operations on the same record in the same bulk request, but perhaps we can either fall back to the current slow way of doing things or optimize only as much as we can; for example, we can ignore all updates before an index operation, and all updates and index operations before a delete, etc.
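For illustration, here is a minimal self-contained sketch of the coalescing idea. The `UpdateOp` record and the in-memory `Map` standing in for the shard's refresh/get/index path are hypothetical, not Elasticsearch APIs; the point is only that the refresh-and-get cost is paid once per unique id rather than once per update:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustrative only: UpdateOp and the Map<String, String> "index" stand in for a
// scripted update and the shard's refresh/get/index path.
final class BulkUpdateCoalescer {

    record UpdateOp(String id, UnaryOperator<String> transform) {}

    static void apply(List<UpdateOp> bulk, Map<String, String> index) {
        // Group updates by document id, preserving first-seen order.
        Map<String, List<UnaryOperator<String>>> byId = new LinkedHashMap<>();
        for (UpdateOp op : bulk) {
            byId.computeIfAbsent(op.id(), k -> new ArrayList<>()).add(op.transform());
        }
        for (Map.Entry<String, List<UnaryOperator<String>>> entry : byId.entrySet()) {
            String doc = index.get(entry.getKey());        // one refresh + get per unique id
            for (UnaryOperator<String> transform : entry.getValue()) {
                doc = transform.apply(doc);                // updates applied in request order
            }
            index.put(entry.getKey(), doc);                // one index per unique id
        }
    }
}
```

With the bulk above, this would fetch and re-index A once (with three updates applied) and B once.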

@jpountz (Contributor) commented Sep 28, 2017

@imotov in "it currently translates into", I think we would not refresh before getting B, since it is necessarily in the index already due to the refresh that we did for A?

@imotov (Contributor, Author) commented Sep 28, 2017

@jpountz you are right, thanks a lot for the clarification. I have updated the issue.

@s1monw (Contributor) commented Oct 6, 2017

We spoke about this in fixit-friday. While this optimization is doable on the client side (for now), we would like to first look into a more generic optimization: holding a secondary (engine-private) index reader that is used internally and does not pay the price of refreshing caches, loading global ordinals, rebuilding parent-child relationships, etc. It would also mean that we can honor the refresh semantics if a realtime get is used. I will keep this issue open for now and open a new one to explore the opportunities.
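To illustrate the shape of that idea (this is not the actual engine code), a minimal sketch using plain Lucene NRT APIs: a reader kept private to the engine and reopened lazily for realtime gets, while the external searcher, and the caches and global ordinals built on it, is refreshed on its own schedule:

```java
import java.io.IOException;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;

// Sketch only: a reader used exclusively for realtime GETs, never exposed to search,
// so reopening it does not invalidate search caches or rebuild global ordinals.
final class InternalGetReader {
    private final IndexWriter writer;
    private DirectoryReader internalReader;

    InternalGetReader(IndexWriter writer) throws IOException {
        this.writer = writer;
        this.internalReader = DirectoryReader.open(writer);
    }

    synchronized DirectoryReader readerForGet() throws IOException {
        // Reopen only if the writer has new changes; the external searcher is untouched.
        DirectoryReader changed = DirectoryReader.openIfChanged(internalReader, writer);
        if (changed != null) {
            internalReader.close();
            internalReader = changed;
        }
        return internalReader;
    }
}
```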

s1monw added a commit to s1monw/elasticsearch that referenced this issue Oct 11, 2017
…er to disk"

Today, when ES detects it's using too much heap vs the configured indexing
buffer (default 10% of JVM heap) it opens a new searcher to force Lucene to move
the bytes to disk, clear the version map, etc.

But this has the unexpected side effect of making newly indexed/deleted
documents visible to future searches, which is not nice for users who are trying
to prevent that, e.g. elastic#3593.

This is also an indirect spinoff from elastic#26802 where we potentially pay a big
price on rebuilding caches etc. when updates / realtime gets are used. We are
refreshing the internal reader for realtime gets, which causes, for instance,
global ords to be rebuilt. I think we can gain quite a bit if we used a reader
that is only used for GETs and not for searches; that way we can also solve
the problem of searchers being refreshed unexpectedly, aside from replica recovery /
relocation.

Closes elastic#15768
Closes elastic#26912
s1monw added a commit that referenced this issue Oct 12, 2017
…er to disk (#26972)

@s1monw (Contributor) commented Oct 12, 2017

@imotov I got the index reader change I was talking about in. While batching up these kinds of updates might be a double-edged sword, I am still interested in seeing what it would look like code-wise. Would it be as simple as sorting the updates by ID, using insertion order as a secondary sort, and then running them all consecutively? I think it can be as simple as checking if the previous ID is the same and not fetching the document again? I really wonder if we can have a prototype that does the sorting in a test as a start?
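A rough sketch of what that might look like: a sort by id with the original bulk slot as the secondary key, then a linear pass that only re-fetches when the id changes. `getFromEngine`, `applyUpdate`, and `indexIntoEngine` are placeholders for the real realtime-get / script-update / index steps, not existing methods:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortedUpdateExecutor {

    // Hypothetical: just enough state to show the sort-and-compare idea.
    record PendingUpdate(String docId, int bulkSlot, String script) {}

    static void execute(List<PendingUpdate> bulk) {
        List<PendingUpdate> sorted = new ArrayList<>(bulk);
        // Primary sort: document id; secondary sort: original position in the bulk request.
        sorted.sort(Comparator.comparing(PendingUpdate::docId)
                .thenComparingInt(PendingUpdate::bulkSlot));

        String previousId = null;
        String cachedDoc = null;
        for (PendingUpdate update : sorted) {
            if (!update.docId().equals(previousId)) {
                cachedDoc = getFromEngine(update.docId()); // fetch only when the id changes
                previousId = update.docId();
            }
            cachedDoc = applyUpdate(cachedDoc, update);    // reuse the already-fetched doc
            indexIntoEngine(update.docId(), cachedDoc);
        }
    }

    // Placeholders standing in for the real realtime-get / script-update / index steps.
    private static String getFromEngine(String docId) { return ""; }
    private static String applyUpdate(String doc, PendingUpdate update) { return doc + update.script(); }
    private static void indexIntoEngine(String docId, String doc) { /* no-op in this sketch */ }
}
```

The `bulkSlot` would also make it possible to map results back to their original positions when building the bulk response.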

@imotov (Contributor, Author) commented Oct 12, 2017

@s1monw yes, I think this rearrangement shouldn't cause any differences in the final result of the bulk operation (at least at the level where we can provide any guarantees). We will need special handling for deletes (clearing the buffer), but otherwise that should address one of the issues.

We will still have an issue of indexing the same record over and over again and sending multiple copies of the same record to replicas, but I am fully sold on progress over perfection here :)

@jasontedor (Member)

> We will still have an issue of indexing the same record over and over again and sending multiple copies of the same record to replicas, but I am fully sold on progress over perfection here :)

Can you clarify what you mean here? We have to send these operations to maintain the semantics of the translog as an operation log and the semantics of sequence IDs.

@imotov (Contributor, Author) commented Oct 12, 2017

@jasontedor yes, if we reindex the same record again and again on the primary shard, we have to send it to the replicas as many times. I was just thinking that, maybe, if we index it on the primary only after all sequential operations on the same record are performed, we would only need to send one copy to the replicas. But as I said, we don't need to bundle that in; it is a somewhat separate optimization.

@jasontedor (Member)

@imotov For a shard-level bulk request, we still send a single bulk request to the replicas containing the results of all the indexing operations on the primary. That is, a shard bulk request is a single replication operation.

@imotov (Contributor, Author) commented Oct 12, 2017

@jasontedor yes, the question here is: if a bulk request contains 100 script updates on the same record, do we need to send 100 slightly different copies of this record to the replicas, or can we just send the final version and ignore all intermediate states on both the primary and the replicas?

@jasontedor (Member)

On the replica we can not do that: if we apply an operation on the primary, we have to send it to the replica, and we can not coalesce operations from the primary to the replica, as doing so would violate operation and sequence ID semantics.

s1monw added a commit that referenced this issue Oct 13, 2017
…er to disk (#26972)

@bleskes (Contributor) commented Oct 13, 2017

@jasontedor if I understand the suggestion, the idea is to do one get from the engine, apply multiple updates, and index the versioned result once. This will result in one write operation to the engine with one issued seq#. I think that's fine?

All that said, I'm really not happy with the state of the TransportBulkShardAction and its complexity. I think we should figure out what we want to do there (I don't have a good suggestion yet) before introducing more complexity.

@lcawl added the :Distributed/CRUD label and removed the :Bulk label on Feb 13, 2018
@s1monw closed this as completed in 13e19e7 on Mar 28, 2018
s1monw added a commit that referenced this issue Mar 28, 2018
We historically removed reading from the transaction log to get consistent
results from _GET calls. There was also the motivation that the read-modify-update
principle we apply should not be hidden from the user. We still agree that we should
not hide these aspects, but the impact on updates is quite significant,
especially if the same document is updated before it's written to disk and made searchable.

This change adds back the ability to read from the transaction log, but only for update calls.
Calls to the _GET API will always do a refresh if necessary to return consistent results, i.e.
if stored fields or doc values fields are requested.

Closes #26802
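For readers of this issue, a hedged sketch of the distinction the change draws; `VersionValue`, the version map, and the translog accessor below are simplified stand-ins for engine internals, not actual Elasticsearch classes:

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for engine internals; not actual Elasticsearch classes.
final class GetForUpdateSketch {

    interface Translog { byte[] readSource(long location); }
    record VersionValue(long translogLocation) {}

    private final Map<String, VersionValue> versionMap = new HashMap<>();
    private final Translog translog;

    GetForUpdateSketch(Translog translog) { this.translog = translog; }

    // Called after each indexing operation to remember where the latest source lives.
    void trackWrite(String id, long translogLocation) {
        versionMap.put(id, new VersionValue(translogLocation));
    }

    byte[] get(String id, boolean forUpdate) {
        VersionValue recent = versionMap.get(id);
        if (forUpdate && recent != null) {
            // Update path: read the not-yet-searchable source straight from the translog,
            // avoiding a refresh before every update of a recently written document.
            return translog.readSource(recent.translogLocation());
        }
        // _GET path: refresh if necessary and read from the index for consistent results.
        return refreshAndReadFromIndex(id);
    }

    private byte[] refreshAndReadFromIndex(String id) {
        return new byte[0]; // placeholder in this sketch
    }
}
```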