This repository has been archived by the owner on Sep 23, 2023. It is now read-only.

etl: do sort and file flush in another goroutine #1052

Merged
merged 9 commits into from
Sep 6, 2023

Conversation

AskAlexSharov
Collaborator

  • if the provider is in-memory: do sort+flush in the same goroutine
  • if the provider is file-based: do sort+flush in another goroutine; the Load method waits for unfinished goroutines and returns an error if one occurred inside a goroutine. In this case, also pre-allocate the new buffer with size prevBufSize/8, because the previous buffer can't be re-used.

Reason: E4 runs 8 etl collectors at the same time (for domains/history/inverted_indices), and sort.Stable is something of a bottleneck.

@AskAlexSharov AskAlexSharov added this pull request to the merge queue Sep 6, 2023
Merged via the queue into main with commit a6ad145 Sep 6, 2023
3 checks passed
@AskAlexSharov AskAlexSharov deleted the etl_sort_in_another_goroutine branch September 6, 2023 07:51
blxdyx pushed a commit to blxdyx/bsc-erigon-lib that referenced this pull request Sep 13, 2023