Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simpler new chunk key v12 #5054

Merged
merged 28 commits into from
Jan 7, 2022
Merged

Conversation

JordanRushing
Copy link
Contributor

@JordanRushing JordanRushing commented Jan 5, 2022

What this PR does / why we need it:
This PR branches #4857 and uses a simpler external key structure for chunks.

Which issue(s) this PR fixes:
N/A

Special notes for your reviewer:
N/A

Checklist

  • Documentation added
  • Tests updated
  • Add an entry in the CHANGELOG.md about the changes.

owen-d and others added 22 commits December 1, 2021 16:32
…nchmark

Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>

Fix `TestGrpcStore` to work with new chunk key structure

Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>

Add a function to SchemaConfig that returns the correct PeriodConfig for
a given time.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Add schema config to object client so it we can use the right external
key function based on the schema version.

Signed-off-by: Callum Styan <callumstyan@gmail.com>

Fix failing compactor tests by passing SchemaConfig to chunkClient

Remove empty conditional from SchemaForTime

Define schema v12; wire-in ChunkPathShardFactor and ChunkPathPeriod as configurable values

Add PeriodConfig test steps for new values ChunkPathShardFactor and ChunkPathPeriod in v12

Update test for chunk.NewExternalKey(); remove completed TODO

Set defaults for new ChunkPathPeriod and ChunkPathShardFactor SchemaConfig values; change FSObjectClient to use IdentityEncoder instead of Base64Encoder

Use IdentityEncoder everywhere we use FSObjectClient for Chunks

Add ExternalKey() function to SchemaConfig; update ObjectClient for Chunks
schemaConfig.ExternalKey change.

Signed-off-by: Callum Styan <callumstyan@gmail.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
…er parsing it; refactor key tests and benchmarks

Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
…ternalKey()`

Co-authored-by: Owen Diehl <ow.diehl@gmail.com>
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
@JordanRushing JordanRushing mentioned this pull request Jan 5, 2022
3 tasks
@JordanRushing JordanRushing self-assigned this Jan 5, 2022
return parseLegacyChunkID(userID, externalKey)
} else if strings.Count(externalKey, "/") == 2 { // v12+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This a very hot path. Do we have a benchmark for this function ? If so we can we have a look at the price of the change.

I have some idea of improvement but before I wonder how much the string.Count cost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good call! I've fixed an existing benchmark and added two more for the old parsing conditional logic:

$ go test -bench=Key -benchmem
goos: darwin
goarch: amd64
pkg: github.com/grafana/loki/pkg/storage/chunk
cpu: VirtualApple @ 2.50GHz
BenchmarkParseNewerExternalKey-8         3107478               412.6 ns/op             0 B/op          0 allocs/op
BenchmarkParseNewExternalKey-8           2605483               392.4 ns/op             0 B/op          0 allocs/op
BenchmarkParseLegacyExternalKey-8        2919925               414.9 ns/op            48 B/op          1 allocs/op
BenchmarkParseOldLegacyExternalKey-8     2884390               412.2 ns/op            48 B/op          1 allocs/op
BenchmarkParseOldNewExternalKey-8        3363462               360.0 ns/op             0 B/op          0 allocs/op
PASS
ok      github.com/grafana/loki/pkg/storage/chunk       13.586s

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you bench compare the root function too

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you bench compare the root function too

I've added additional benchmarks for the root parsing functions!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you ! If you got some time, there's definitively improvement we can make once this is merged on that root ParseExternalKey. I'm thinking we should try to use only string.Index and reslice as we go to avoid too many string.Index, I think we get it to work with a max of 2 string.Index, instead of string.Index and string.Count (which does a for loop).

We'll talk about this offline, now that we have a benchmark we can compare it, this is not urgent, but something fun you can take on.

@JordanRushing JordanRushing marked this pull request as ready for review January 6, 2022 19:02
@JordanRushing JordanRushing requested a review from a team as a code owner January 6, 2022 19:02
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Copy link
Contributor

@KMiller-Grafana KMiller-Grafana left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docs part of this PR are fine now.

Copy link
Contributor

@cyriltovena cyriltovena left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I think it's worth trying to optimize the parse function in a follow up pr, to remove those string function call

pkg/storage/chunk/schema_config.go Show resolved Hide resolved
return parseLegacyChunkID(userID, externalKey)
} else if strings.Count(externalKey, "/") == 2 { // v12+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you bench compare the root function too

JordanRushing and others added 3 commits January 6, 2022 14:26
Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
memoizes schema number calculations
@owen-d
Copy link
Member

owen-d commented Jan 6, 2022

@cyriltovena I've added 8ee35f8 to this PR.

ExternalKey() benches

Diff vs non-memoized version:

benchmark                  old ns/op     new ns/op     delta
BenchmarkExternalKey-4     538           448           -16.88%

benchmark                  old allocs     new allocs     delta
BenchmarkExternalKey-4     7              6              -14.29%

benchmark                  old bytes     new bytes     delta
BenchmarkExternalKey-4     136           112           -17.65%

Diff vs control (tip of main branch):

benchmark                  old ns/op     new ns/op     delta
BenchmarkExternalKey-4     415           448           +7.96%

benchmark                  old allocs     new allocs     delta
BenchmarkExternalKey-4     6              6              +0.00%

benchmark                  old bytes     new bytes     delta
BenchmarkExternalKey-4     112           112           +0.00%

I think it would be much cleaner to have defined our schemas like type Schema int and just had a codec for v12 <> 12, but there's a lot of code to touch there that would further complicate this PR.

Signed-off-by: Jordan Rushing <jordan.rushing@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants