Use Line Filters in StringLabelFilter #8659
Nice find! Would you have a flame graph for the Unicode case? This is quite a drop.
d27bcc0 to 225077c
@jeschkies Turns out it's super expensive to call […]
@jeschkies I dug into this today and discovered that Prometheus label-matcher semantics differ from line-filter semantics. Label matches are anchored to the beginning and end of the label value, so there is no concept of "contains a literal" for labels. This PR originally broke those semantics, so some operations that were essentially no-ops took a ton of time. I've fixed the regex simplification to account for the label semantics, and the benchmarks now make way more sense.
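To make the anchoring difference concrete, here is a minimal sketch using only Go's standard `regexp` package. The helper names `lineMatch` and `labelMatch` are hypothetical and are not taken from this PR; wrapping the pattern in `^(?:...)$` is an assumption about how a fully anchored label matcher can be emulated, not a copy of the actual implementation.

```go
package main

import (
	"fmt"
	"regexp"
)

// lineMatch mimics line-filter semantics (hypothetical helper):
// the pattern may match anywhere in the input.
func lineMatch(pattern, s string) bool {
	return regexp.MustCompile(pattern).MatchString(s)
}

// labelMatch mimics label-matcher semantics (hypothetical helper):
// the pattern must match the entire value, so it is wrapped in ^(?:...)$.
func labelMatch(pattern, value string) bool {
	return regexp.MustCompile("^(?:" + pattern + ")$").MatchString(value)
}

func main() {
	fmt.Println(lineMatch("foo", "a foo b"))      // unanchored: finds the literal inside the input
	fmt.Println(labelMatch("foo", "a foo b"))     // anchored: the whole value is not "foo"
	fmt.Println(labelMatch(".*foo.*", "a foo b")) // anchored, but the wildcards cover the rest
	fmt.Println(labelMatch("foo", "foo"))         // anchored: exact match
}
```

Under these semantics a bare literal in a label regex behaves like an equality check, which is why a simplification that treats it as "contains" changes the result.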
Converting this to a draft as I work through the implications of the above comment :)
…gex goodness we have worked on in the line filters
5c99e5b to 94685f4
OK, I think this is the last one. The filters we generate for regexes for trees whose leaf nodes are either a […] This last commit adds the […]
Note on benchmarks: the benchmarks look great because everything reduces to a noop; most regexes reduce to some pile of […]
This reverts commit 7f42137.
We've done a ton of work to optimize regexes in line filters but nothing in label filters. To avoid duplicating that work or reworking all of our filter types, this PR creates a new label filter that wraps an optimized line filter and runs it against the label value to determine whether the value matches.
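The wrapping idea can be sketched roughly as follows. Every name here (`lineFilter`, `containsFilter`, `lineFilterLabelFilter`, `Process`) is hypothetical and chosen for illustration; the real types and signatures in this PR may differ. The sketch also assumes that a missing label is treated as an empty value.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineFilter is a hypothetical stand-in for an already optimized line filter.
type lineFilter interface {
	Filter(line []byte) bool
}

// containsFilter keeps inputs that contain the literal. It corresponds to
// what a label regex such as ".*api.*" could be simplified to.
type containsFilter struct{ literal []byte }

func (f containsFilter) Filter(line []byte) bool {
	return bytes.Contains(line, f.literal)
}

// lineFilterLabelFilter (hypothetical name) reuses a line filter by
// applying it to the value of a single label instead of a log line.
type lineFilterLabelFilter struct {
	name   string
	filter lineFilter
}

// Process reports whether the label value passes the wrapped filter.
// Assumption of this sketch: a missing label is treated as an empty value.
func (f lineFilterLabelFilter) Process(labels map[string]string) bool {
	return f.filter.Filter([]byte(labels[f.name]))
}

func main() {
	f := lineFilterLabelFilter{name: "app", filter: containsFilter{literal: []byte("api")}}
	fmt.Println(f.Process(map[string]string{"app": "loki-api"}))
	fmt.Println(f.Process(map[string]string{"app": "ingester"}))
}
```

The design point is that the label filter holds no regex logic of its own, so any optimization made to the wrapped line filter carries over without duplication.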
Note on Benchmarks: the input string for this is 1114195 bytes long. The `not` test tells us that our approach is quite a lot slower when the literal pattern isn't present in the input. The size of the input really emphasizes it.
Benchmarks: