-
Notifications
You must be signed in to change notification settings - Fork 24.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Draft: Fix value scripts in composite aggs #53232
Conversation
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations) |
run elasticsearch-ci/docs |
if (value.getClass() != BytesRef.class) { | ||
throw new IllegalArgumentException("Expected BytesRef, got " + value.getClass()); | ||
} | ||
currentValue = (BytesRef) value; | ||
final SortedBinaryDocValues dvs = docValuesFunc.apply(context); | ||
currentValue = dvs.normalizeValue((BytesRef) value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hit a pretty similar bug to this in #53315. I wonder if it'd be better to have a member on this class that is a Function<BytesRef, BytesRef>
that does the normalization. That way we don't need to build the SortedBinaryDocValues
. SortedBinaryDocValues
can be a fairly heavy thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other option I see: a function similar to docValuesFunc
that given a LeafReaderContext
returns a normalizer (Function<BytesRef, BytesRef>
). I can hook this into ValueSource, that's 1 level higher compared to SortedBinaryDocValues
.
The benefit: the change is not spread all over the place, because ValueSource has all definitions in 1 file.
However, @not-napoleon is re-factoring this part of the code, it seems to me it's better to wait for this re-factoring and re-visit this problem after it's done, if we want to go this way. Maybe we even benefit from the re-factoring (or it gets harder, but I hope not ;-) ).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see what you mean. I think having the appropriate reader
for this sort of thing in ValuesSource
would be useful. Not all the time, but for composite and for the min and max optimizations and the like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fun fact: The Values Source refactor barely touches ValuesSource.java
. Most of the work happens in ValuesSourceConfig
, ValuesSourceType
, and ValueType
(and, like, 200 other files) If memory serves, the only change I made to ValuesSource
was in #48320 which merged to master some time ago.
Composite has sort of a parallel value system (e.g. CompositeValuesSourceBuilder
, CompositeValuesSourceConfig
, &c) and we've been mostly ignoring that for the refactor. Eventually we'll need to reconcile them, but that's a ways off. In particular, there's a "legacy" version of ValuesSourceConfig#resolve()
that we'll be keeping in the refactor for now, which is what CompositeValuesSourceBuilder
calls. So I would not expect the behavior to change significantly after the refactor.
Also, we're pushing hard to get the VSRefactor into master soon, so if you want to wait on that, hopefully it won't be too long.
I haven't looked too deeply into this, so I'm not sure what our options are, but IMHO the closer to those composite-specific classes we can solve this, the better. It seems like this is a composite only problem (value scripts work correctly on non-composite aggs, AFAIK), which to me says composite is doing something wrong with how it handles the WithScript
values source subclasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't looked too deeply into this, so I'm not sure what our options are, but IMHO the closer to those composite-specific classes we can solve this, the better. It seems like this is a composite only problem (value scripts work correctly on non-composite aggs, AFAIK), which to me says composite is doing something wrong with how it handles the
WithScript
values source subclasses.
It's composite only but with the special case of match_all
:
It turns out that the problem is a regression of the optimization in #28745 (easy to prove: just add a query to the repro steps I gave in #53135).
If you add a query it starts working again due to the different execution path.
Thanks for all the input! I will re-visit later and e.g. check if/how it conflicts with your re-factoring.
The problem is fairly similar to the date nanos problem the min and Max
aggs. So it isn't entirely contained. Not the same, but pretty close.
…On Tue, Mar 10, 2020, 11:14 Mark Tozzi ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In
server/src/main/java/org/elasticsearch/search/aggregations/bucket/composite/BinaryValuesSource.java
<#53232 (comment)>
:
> if (value.getClass() != BytesRef.class) {
throw new IllegalArgumentException("Expected BytesRef, got " + value.getClass());
}
- currentValue = (BytesRef) value;
+ final SortedBinaryDocValues dvs = docValuesFunc.apply(context);
+ currentValue = dvs.normalizeValue((BytesRef) value);
Fun fact: The Values Source refactor barely touches ValuesSource.java.
Most of the work happens in ValuesSourceConfig, ValuesSourceType, and
ValueType (and, like, 200 other files) If memory serves, the only change
I made to ValuesSource was in #48320
<#48320> which merged to
master some time ago.
Composite has sort of a parallel value system (e.g.
CompositeValuesSourceBuilder, CompositeValuesSourceConfig, &c) and we've
been mostly ignoring that for the refactor. Eventually we'll need to
reconcile them, but that's a ways off. In particular, there's a "legacy"
version of ValuesSourceConfig#resolve() that we'll be keeping in the
refactor for now, which is what CompositeValuesSourceBuilder calls. So I
would not expect the behavior to change significantly after the refactor.
Also, we're pushing hard to get the VSRefactor into master soon, so if you
want to wait on that, hopefully it won't be too long.
I haven't looked too deeply into this, so I'm not sure what our options
are, but IMHO the closer to those composite-specific classes we can solve
this, the better. It seems like this is a composite only problem (value
scripts work correctly on non-composite aggs, AFAIK), which to me says
composite is doing something wrong with how it handles the WithScript
values source subclasses.
—
You are receiving this because you are on a team that was mentioned.
Reply to this email directly, view it on GitHub
<#53232?email_source=notifications&email_token=AABUXIVCQ6YCU3YLG22B6U3RGZKODA5CNFSM4LDDBUD2YY3PNVWWK3TUL52HS4DFWFIHK3DMKJSXC5LFON2FEZLWNFSXPKTDN5WW2ZLOORPWSZGOCYWQJEA#discussion_r390390379>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AABUXIS72PCKLEYNQB5TI3LRGZKODANCNFSM4LDDBUDQ>
.
|
add a normalizeValue method to normalize values for scripted fields
fixes #53135
This is a draft fix for #53135 after debugging it. It turns out that the problem is a regression of the optimization in #28745 (easy to prove: just add a query to the repro steps I gave in #53135). I further found out that the script isn't applied for this optimization and that's how I fixed it. In order to do so, I introduced a new method for standalone value normalization, which does nothing if no script is used.
This fix is best effort, I do not know much about the codebase. I think it works but might not be a good implementation. This might just be a helper for creating a proper fix or I can make this a proper change (including test coverage) after getting some guidance.