
[Proposal] dump query processing performance metrics from various stages #3324

Closed
himanshug opened this issue Aug 4, 2016 · 5 comments

himanshug (Contributor) commented Aug 4, 2016

While executing a query, the Druid Broker and Historicals (and realtime tasks) publish very useful metrics, such as:

at the Broker:
query/time
query/bytes
query/node/time
query/node/ttfb

at the Historicals:
query/time
query/bytes
query/segment/time
query/wait/time
...

All of these metrics carry queryId, host, etc. in their dimensions, so if Druid's metrics are ingested into another Druid cluster, users can see where all the time for a query execution was spent. We do run such a Druid cluster (the "metrics cluster") to debug performance issues.

However,

  1. Some users do not have the bandwidth to maintain another Druid cluster for metrics, and instead push aggregated metrics to monitoring systems like Graphite. With aggregation, it becomes difficult to understand performance issues for a specific queryId.

  2. Even with a Druid "metrics" cluster, it takes some time for metrics to get ingested into that cluster. Sometimes we want to do the debugging interactively, that is, send a query and see all of its performance metrics in one place. #3319 (introduce /druid/v3 query endpoint that gives query responseContext) and #3323 (WIP: optionally configure DirectDruidClient to use /druid/v3 instead of /druid/v2) enable the ability to return a large responseContext from the Broker (with the same accumulated from all Historicals).

This proposal is to enable dumping the query performance metrics into the responseContext when the query context contains a flag, "dumpPerformance".
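For illustration, a minimal sketch of a native query carrying the proposed flag; the timeseries query body here is hypothetical, and only the "dumpPerformance" context key is what this proposal would add:

{
    "queryType": "timeseries",
    "dataSource": "wikipedia",
    "intervals": ["2016-08-01/2016-08-02"],
    "granularity": "all",
    "aggregations": [{ "type": "count", "name": "rows" }],
    "context": {
        "queryId": "debug-query-1",
        "dumpPerformance": true
    }
}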

With the flag set, the end user would see a responseContext like the one below, which would be very useful for debugging query performance problems:

{
    "result": [ .... ],
    "context": {
        ....
        "broker": {
            "query/time": 783,
            "query/bytes": 1234,
            "historical1": {
                "query/node/ttfb": 124,
                "query/node/time": 567,
                "query/node/bytes": 3564
            },
            "historical2": {
                "query/node/ttfb": 379,
                "query/node/time": 685,
                "query/node/bytes": 5632
            }
        },
        "historical1": {
            "query/time": 554,
            "query/bytes": 3564,
            "segments": {
                "segment_id1": {
                     "query/segment/time": 324,
                     "query/wait/time": 87
                },
                "segment_id2": {
                     "query/segment/time": 314,
                     "query/wait/time": 79
                }
            }
        },
        "historical2": { .... }
    }
}

Depends on #3319 and #3323

himanshug self-assigned this Aug 4, 2016
erikdubbelboer (Contributor) commented
Yes please, this would be something we're interested in. See: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/druid-development/MNHqZl7weLw/3CoM1MrgBAAJ

navis (Contributor) commented Aug 4, 2016

👍

himanshug (Contributor, Author) commented
For some queries, the number of segments scanned might be very large, and that could blow up the context, so I will probably limit the number of segments reported per Historical to something like 10 (assuming the other segments behaved similarly, that much information should be enough); see the sketch below.

Also, maybe have a separate flag for only the Broker reporting its performance, plus a "detailed" flag to include the reports from Historicals too.
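A hypothetical sketch of how such a per-Historical cap could be applied while accumulating per-segment entries; the class, field, and method names are illustrative, not existing Druid code:

import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: accumulate per-segment metric entries for the
// responseContext, but stop admitting new segments once the cap is hit.
class PerSegmentMetricsAccumulator
{
    private static final int MAX_SEGMENTS_REPORTED = 10; // proposed cap

    private final Map<String, Map<String, Long>> segments = new LinkedHashMap<>();

    void report(String segmentId, String metricName, long value)
    {
        // Metrics for segments beyond the first MAX_SEGMENTS_REPORTED are dropped.
        if (segments.size() >= MAX_SEGMENTS_REPORTED && !segments.containsKey(segmentId)) {
            return;
        }
        segments.computeIfAbsent(segmentId, id -> new LinkedHashMap<>())
                .put(metricName, value);
    }

    // Snapshot to be merged into the per-Historical section of the responseContext.
    Map<String, Map<String, Long>> snapshot()
    {
        return segments;
    }
}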

github-actions (bot) commented

This issue has been marked as stale due to 280 days of inactivity.
It will be closed in 4 weeks if no further activity occurs. If this issue is still
relevant, please simply write any comment. Even if closed, you can still revive the
issue at any time or discuss it on the dev@druid.apache.org list.
Thank you for your contributions.

github-actions bot added the stale label May 30, 2023
github-actions (bot) commented

This issue has been closed due to lack of activity. If you think that
is incorrect, or the issue requires additional review, you can revive the issue at
any time.

github-actions bot closed this as not planned Jun 27, 2023