
[Proposal] Query Profiling #539

Closed
jkowall opened this issue Apr 13, 2021 · 20 comments
Labels
  • community: Issues raised by community members and contributors
  • discuss: Issues intended to help drive brainstorming and decision making
  • enhancement: Enhancement or improvement to existing feature or request
  • feature: New feature or request
  • idea: Things we're kicking around
  • untriaged

Comments

@jkowall (Contributor) commented Apr 13, 2021

Requirements - What kind of business use case are you trying to solve?

  • Fixing performance issues in OpenSearch is difficult due to limited observability within the database itself.
  • To understand the performance and behavior of the database, and to act on it, we need to understand the workload that a specific node or cluster is running.
  • This data would be collected in sampled form to allow for analysis of the data.
  • In the future this data would be used to block, throttle, or otherwise control the queries.

Problem - What blocks you from solving this today?

Today the profiling capabilities are deep but basic, and they are not part of the open-source codebase. The only other way to profile OpenSearch is Java/JVM profiling, which is highly inefficient and expensive from a performance perspective.

Proposal - what do you suggest to solve the problem or improve the existing situation?

Add a basic profiler capability within OpenSearch. This would allow for basic data collection, with a pluggable architecture so the profiler can be extended with additional capabilities. Future capabilities would include acting on queries, for example blocking, throttling, or prioritizing the queries being executed.

Selection of queries to profile

  1. The initial proposal is to allow queries to be named so that they can be flagged as "always profile".
  2. Profiling can be configured to collect data on a specific query type, or to collect data on a percentage of the total volume.
  3. A named query should always be collected unless a sampling percentage is specified (e.g. 10%).
  4. If a query has no name, it should be normalized to a general definition, defined by a query match.
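
The selection rules above can be sketched in a few lines. This is a minimal sketch, not proposed API: the function and parameter names are invented, and the logic is just items 1-3 restated as code.

```python
import random

def should_profile(query_name, sample_rates, default_rate=0.0):
    """Decide whether to collect profiling data for one query.

    sample_rates maps a query name to an optional sampling fraction;
    a named query with no explicit rate is always profiled (rule 3).
    Unnamed queries fall back to the cluster-wide sampling rate (rule 2).
    """
    if query_name is not None:
        rate = sample_rates.get(query_name)
        if rate is None:
            return True                    # named => always collected...
        return random.random() < rate      # ...unless a percentage is set
    return random.random() < default_rate  # e.g. 0.10 for 10% of volume
```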

Data collected on queries

  1. Number of times the query was run
  2. Number of documents returned by the query
  3. Total time
  4. Number of shards scanned
  5. success/failed across the shards
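
As a sketch only, the per-query record implied by the list above might look like the following; all field names are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class QueryProfileRecord:
    """One normalized query's collected stats (field names invented)."""
    query_key: str           # query name, or its normalized definition
    run_count: int           # 1. number of times the query was run
    docs_returned: int       # 2. number of documents returned
    total_time_millis: int   # 3. total time
    shards_scanned: int      # 4. number of shards scanned
    shards_successful: int   # 5. success/failure across the shards
    shards_failed: int
```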

Data storage

  1. The data should be stored in OpenSearch to be accessed by other plugins or the engine itself
  2. Alternatively, the metrics could be exposed in an endpoint and be scraped with Prometheus for monitoring/observability use cases
  3. Another option is to create a new kind of node for this purpose - a profiling node. This way the performance of the other nodes would remain the same and this new profiling node would have its own indexing settings optimized for the profiling data. Additionally this could allow admins to add such a node and remove it whenever they want.
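
If option 2 were chosen, the endpoint would serve samples in the Prometheus text exposition format. A rough sketch of rendering one sample follows; the metric and label names are invented, not a proposed schema:

```python
def prometheus_sample(metric, labels, value):
    """Render one sample line in the Prometheus text exposition format."""
    label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
    return f"{metric}{{{label_str}}} {value}"

# A scraper would then read lines like the one produced here.
line = prometheus_sample(
    "opensearch_query_time_millis_total",            # invented metric name
    {"context": "dashboard-98765", "index": "logs"},  # invented labels
    1234,
)
```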

Assumptions

We assume the user is running OpenSearch and we can define a new index for the data storage.

Any open questions to address

What did we miss? We don’t want this feature to get over-engineered, but I am sure more things should be added.

One open item is where this should live. This could be an internal plugin, or it could be an external self-contained node or cluster itself. There are pros and cons to each design decision.

We appreciate your input on this concept before design and coding start.

Thank you from @jkowall and @AmiStrn

@jkowall added the enhancement label Apr 13, 2021
@rursprung (Contributor) commented:

having this as an external cluster or self-contained node would create additional maintenance overhead when managing the cluster (esp. if it's not managed with a k8s operator; i.e. either run on bare metal or on k8s but w/o an operator). thus i'd vote for this being part of the nodes (be that as a built-in feature or a plugin).

storing this in opensearch vs. exposing it to prometheus comes down to whether you want to have run-time profiling (=> then it'd probably be something for prometheus; but that'd require people to actually have prometheus (which you do in a big setup, esp. with k8s)) or want to have performance metrics of a single query you're currently working on.
for those of us working with traditional RDBMS this could be compared to looking at the explain plan of a single query (while actively working on it, i.e. literally EXPLAIN PLAN FOR SELECT * FROM DUAL) or looking at things like ASH or AWR reports.

@AmiStrn (Contributor) commented Apr 27, 2021

@rursprung I agree with your point regarding the extra node idea proposed. I was thinking this could simplify the architecture by separation of concerns, but as you correctly point out:

[it] would create additional maintenance overhead when managing the cluster

Initially, I had been aiming for run-time profiling. Aggregating data by query context over time and exposing it as a metric. This way using something like a task throttling feature we could keep harmful query contexts at bay.

WDYT about exposing the metrics this way?

@jkowall changed the title from "Proposal: Query Profiling" to "[Proposal] Query Profiling" May 3, 2021
@malpani (Contributor) commented May 3, 2021

Couple of questions/suggestions

The initial proposal would be to allow queries to be named in order to be flagged as always profile. If the query is named it should always be collected unless specified with a percentage

Would it be better to have this as a query param? eg. there is already a "profile": true
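
For reference, the existing flag is set in the search request body. A minimal sketch; the query itself is invented example data:

```python
import json

# Search body using the existing boolean "profile" flag.
search_body = {
    "profile": True,
    "query": {"match": {"message": "error"}},
}
payload = json.dumps(search_body)
```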

Data collected

num-shards, latency, doc-counts, and frequency will provide a nice high-level view. Will this include time spent on queues and shard-level latency? Is the use case here more analytics of the kinds of queries running on the cluster, or is the end goal to leverage this for query planning? If it is the latter, is the long-term plan to add more granular metrics, e.g. memory footprint, and should this proposal include them? If it is the former, then I would prefer something more generic, like a netty interceptor that could track not just search but all requests, including bulk.

@AmiStrn (Contributor) commented May 4, 2021

Would it be better to have this as a query param? eg. there is already a "profile": true

As stated at the top of the doc in the link you provided:

The Profile API is a debugging tool and adds significant overhead to search execution.

So the idea is to have the context of the query added to the query request (perhaps as a query param, but that is already more of an implementation detail).
I am currently working on a breakdown of this proposal into phases; the first phase would add a nice way to monitor the queries in the slow logs, grouped by their respective context params.
Regarding the profiler: it is far more robust than what we are going for here, and no one should expect it to run all the time on each and every query.


Will this include time spent on queues and shard level latency?

Is that something you would like monitored per query context provided?


Is the use case here more on analytics of the kind of queries running on the cluster or is the end goal is to leverage this for query planning?

The first couple of phases add the basic metrics required to track queries by context, not by their actual query. So it won't be for query planning or for analytics on the kinds of queries, but rather analytics on the query contexts running in the cluster.

For example, you could say that a query context is the origin of the request, and then scrape the metrics and see how the queries marked with that context performed. An origin could be a dashboard and its id, or a direct API call... this way if the cluster is really slow and you see that there is a massive spike in queries from a certain dashboard - you can act on this information in some way, and then fix that dashboard.

Later phases focus on the ability to derive part of the context from the query itself (categorizing them is an open question here).

@malpani (Contributor) commented May 5, 2021

Thanks for the examples; now I understand the use case better. So the target audience for this feature is primarily multi-tenant/multi-user clusters, where queries land from different dashboards/tools. The feature would allow labeling queries and tracking a source-referrer 'context', and in the first phase collect shard counts and overall latency per 'context'. My guess is this is to give insight into which dashboards/referrers are popular or slow? Is it more apt to rename the issue to something like 'query source analytics'?

Is that something you would like monitored per query context provided?

The reason I asked about time spent in the queue (at a shard level) and other finer metrics is that end-to-end time could show up as slow but might actually be due to other queries, as search queues are currently FCFS.

@AmiStrn (Contributor) commented May 6, 2021

So the target audience for this feature is primarily multi-tenant/multi-user clusters

This is a pain point for those audiences; however, labeling the searches performed by the alerting system is also a good idea, since it would let you see which alerts are causing strain on the cluster. But generally speaking, yes, this is mostly useful for monitoring multi-tenant/multi-user clusters.

Is it more apt to update the issue name to something like 'query source analytics'?

I'm fine with that idea, @jkowall WDYT? ☝️

e2e time could show up as slow but might actually be due to other queries, as currently search queues are FCFS

Thanks for this input, I'll mention it in the phase breakdown 👍

@jkowall (Contributor, Author) commented May 6, 2021

@AmiStrn I think the idea is to make some additional issues off this master issue as we break things up.

The capability to analyze the source of a query is merely one way the data will be used.

We are also going to create metrics that summarize the performance, utilization, and errors of specific queries, or of the OpenSearch engine as a whole, which goes well beyond "query source". That is the core of a query profiler, which is why we gave the feature that name. We could go deeper and generate more data that helps optimize or tune queries, or shows where time is best spent optimizing them.

@AmiStrn (Contributor) commented May 30, 2021

Breaking this feature down into several phases:

  1. Enhance existing search stats groups feature to accommodate the query contexts

    • Index stats are currently summed up from the shard level. They need to become level-aware:
      • At the index level, we should get the max query_time_in_millis of the shard-level stats.
      • At the _all level, we should see the max query_time_in_millis across all indices, which is the max of any of the shards.
  2. Add metrics collecting (by context) for queries

  3. Enable automatic query context resolving so all queries can be profiled by either the defined context header OR an auto resolved context

    • Create contexts on the fly defined by the actual query contents, for example:
      • Does the query contain a Wildcard (leading or trailing)?
      • Does the query contain nested aggregations? How many levels of aggregation are there? (“...,nested-agg-levels:3,...”)
    • Enable profiling a sample of the queries (e.g. 10% of them) if they do not contain a query context header.

(This feature is optional - cluster settings will contain a flag to enable this feature, as well as definitions regarding sampling)
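
A rough sketch of the phase-3 auto-resolution, under assumed query-DSL shapes (a top-level wildcard clause and standard "aggs" nesting); the context key names are illustrative only:

```python
def derive_context(body):
    """Build a context string from query contents (illustrative rules)."""
    parts = []
    query = body.get("query", {})
    if "wildcard" in query:
        pattern = next(iter(query["wildcard"].values()))
        if isinstance(pattern, dict):          # long form: {"value": "..."}
            pattern = pattern.get("value", "")
        parts.append("leading-wildcard" if pattern.startswith("*") else "wildcard")
    depth = agg_depth(body.get("aggs", {}))
    if depth:
        parts.append(f"nested-agg-levels:{depth}")
    return ",".join(parts)

def agg_depth(aggs):
    """How deeply aggregations nest under sub-"aggs" keys."""
    if not aggs:
        return 0
    return 1 + max(agg_depth(a.get("aggs", {})) for a in aggs.values())
```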

Future features that can build off this one - the ability to reject queries based on their context (i.e. reject all queries from dashboard-98765). This will be highly beneficial for managed environments.

(Edited based on comments by @malpani.)

@malpani (Contributor) commented Jun 2, 2021

Search DSL today already provides a stats parameter, an optional string array for tracking SearchStats per group/label. It is already integrated with the stats APIs and the slow log. Instead of a new header and new APIs, should the idea be to enhance the current stats groups functionality?
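
For reference, tagging a search with the existing stats groups is just an extra array in the request body; the group names here are invented:

```python
import json

# Search body tagged with stats groups; SearchStats are then kept per
# group and surfaced through the index stats API and the slow log.
search_body = {
    "stats": ["dashboard-98765", "tenant-acme"],
    "query": {"match_all": {}},
}
payload = json.dumps(search_body)
```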

@AmiStrn (Contributor) commented Jun 2, 2021

Thanks! To be honest, I had never used this tag. After reading the docs, it looks much better than adding a new header.
I will check it out and update the feature breakdown accordingly.

The idea is to enhance this and also, in the 3rd phase, to add the capability to add stats on the fly.

@AmiStrn (Contributor) commented Jun 8, 2021

@malpani oddly enough, the stats data is not really useful in its current state unless you want the shard-level data (which can be a massive response given a high shard count and a multi-index search).
The reason I say this is that the stats appear to sum up the data:
if you query an index with 10 shards once, the stats at the index level say your query_total is 10. Worse yet, the query_time_in_millis is a SUM (!!!) of the per-shard query_time_in_millis. So anything above shard level is being summed up.
The first phase would enhance the stats by adding a max-value metric. I will update the phases accordingly :)
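
The summing behavior described above, and the proposed max enhancement, can be illustrated with a toy rollup (the per-shard timings are invented):

```python
# Per-shard query_time_in_millis for ONE search against a 10-shard index.
shard_times = [12, 9, 15, 11, 8, 14, 10, 13, 9, 12]

# Current behavior: everything above shard level is summed, so one search
# reports query_total == 10 and an inflated cumulative time.
query_total = len(shard_times)   # 10 "queries" for a single search
summed_time = sum(shard_times)   # 113 ms reported, which no caller saw

# Phase 1 proposal: also track the max across shards; the slowest shard
# is much closer to the latency the caller actually experienced.
max_time = max(shard_times)      # 15 ms
```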

@nknize (Collaborator) commented Jun 17, 2021

Taking a quick look at this: query profiling typically has to disable the query cache to ensure valid results (i.e., you're profiling the actual query rather than measuring a cache hit). This is one of the reasons "profiling adds significant overhead". Is the same intended here?

Can you edit the original issue and include the phased approach as a check list in the description? It'll make it easier to convert this to a meta issue.

@nknize added the discuss, community, feature, idea, and untriaged labels Jun 17, 2021
@jkowall (Contributor, Author) commented Jun 17, 2021

@nknize That depends on your use case for the query profiler. Sometimes it is used as an "explain" tool for developing efficient queries; other times it is used for observability and debugging. In the latter case, I don't think any caching changes or cache disabling is required; in fact, you wouldn't want to do that at all.

I would think of the latter as more of what we were looking at, especially with the Prometheus integration we want to build for scraping the data. Being observability people, I think this is the use case we are looking to fulfill.

@AmiStrn (Contributor) commented Jun 17, 2021

@nknize I am going to use the indices search stats groups for the measurements. Cache hits and misses are a function of how the query is sent (if it is the same query you get hits, so it is usually not something that should worry us in terms of cluster health).
I second what @jkowall mentioned regarding explain vs observability.

What would you suggest regarding making this more clear in the issue description?

@nknize (Collaborator) commented Jun 18, 2021

Depends on your use case for the query profiler. Sometimes this is used as an "explain" use case for developing efficient queries while other times it's used for observability and debugging.

So this is a great point. The Explain API in OpenSearch and Lucene is something different from what you're describing; it's a way to measure how a specific document was scored within a query context. That differs from the "way to write efficient queries" use case, which aligns more with the profile parameter: an "all or nothing" query analyzer that looks not just at query performance (CPU time) but at how Lucene's rewrite mechanism formulated the final Query object.

I think what you're after here for the "observability" use case is either

  1. a new TRACE API (maybe start as a new search parameter) that enables users to trace the query request through the server and disclose "exactly" what happened without disabling query cache.

or

  2. An enhancement to the profile parameter, like @malpani was getting at, to achieve something like the TRACE described in 1. It would be nice to enhance the profile parameter to take a JSON object instead of a primitive Boolean, so users can execute selective profiling, which would include this new trace feature. In this manner you'd only disable query caching if you want "full scope" performance profiling.
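
A sketch of what that richer profile object might look like; every field name here is hypothetical (today's profile parameter accepts only a boolean):

```python
import json

# Hypothetical object-valued "profile" enabling selective profiling:
# trace the request without disabling the query cache, and only turn
# the cache off for "full scope" performance profiling.
search_body = {
    "profile": {
        "trace": True,           # hypothetical TRACE-style tracing
        "timings": False,        # hypothetical full CPU-time profiling
        "disable_cache": False,  # hypothetical; only for full profiling
    },
    "query": {"match": {"title": "opensearch"}},
}
payload = json.dumps(search_body)
```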

@nknize (Collaborator) commented Jun 18, 2021

What would you suggest regarding making this more clear in the issue description?

You can take your "Breaking the feature down into several phases" comment above and either:

  1. update the description of this issue to include it as a checklist that can link to other feature implementation issues.

Or,

  2. open a new meta issue with that as the description and use this one as the incubated "discuss" issue.

I like 1. so everything is in one place but that's just my opinion.

@nknize (Collaborator) commented Jun 18, 2021

Today the profiling capabilities are deep but basic. They are also not part of the open source codebase.

Just a minor correction that Profiling is part of the OSS codebase. Were you referring to something different?

@AmiStrn (Contributor) commented Jun 20, 2021

  1. a new TRACE API (maybe start as a new search parameter) that enables users to trace the query request through the server and disclose "exactly" what happened without disabling query cache.

This is basically what the stats param gives us. You "tag" your searches with it so that the shard search stats begin to aggregate according to the groups. The stats param expects a list of these tags.

Regarding an enhancement to the profile parameter, I will play around with the cache enable/disable to see what we could learn in such cases.

I like 1. so everything is in one place but that's just my opinion.

Thanks, I also like this option.

Profiling is part of the OSS codebase

True; the part that is not OSS is the Kibana side of this API.

@AmiStrn (Contributor) commented Jun 23, 2021

I will be opening a meta issue for the stats and profiling pain points raised in this thread.

@jkowall (Contributor, Author) commented Jun 30, 2021

We decided that this issue was not easily solvable, at least not to get the outcome we wanted. I am going to close this out for now. We are going to use some of the query naming capabilities in the code to provide more control. We will be opening an issue/proposal on query control soon.
