Junit report try #2

kgyrtkirk · 2023-10-05T09:22:39Z

Fixes #XXXX.

Description

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...

Release note

Key changed/added classes in this PR

MyFoo
OurBar
TheirBaz

This PR has:

* Minimize PostAggregator computations

Since a change back in 2014, the topN query has been computing all PostAggregators on all intermediate responses from leaf nodes to brokers. This causes significant slowdowns for queries with relatively expensive PostAggregators. This change rewrites the query that is pushed down so that it carries only the minimal set of PostAggregators, which makes it impossible for downstream processing to do unnecessary work. The final PostAggregators are applied at the very end.
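
As an illustration only, here is one way a "minimal set" could be computed: keep just the post-aggregators that the topN ordering metric depends on, directly or transitively. This is a sketch under stated assumptions, not the code in this change. `PostAggPruningSketch`, `PostAgg`, and `prune` are hypothetical names, and the sketch assumes a post-aggregator only reads aggregators or post-aggregators listed before it.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class PostAggPruningSketch
{
  /** Hypothetical stand-in for a post-aggregator: an output name plus the fields it reads. */
  static final class PostAgg
  {
    final String name;
    final Set<String> dependsOn;

    PostAgg(String name, Set<String> dependsOn)
    {
      this.name = name;
      this.dependsOn = dependsOn;
    }
  }

  /**
   * Returns the post-aggregators needed to compute orderingMetric, in their original order.
   * Assumes each post-aggregator depends only on aggregators or on earlier post-aggregators.
   */
  static List<PostAgg> prune(List<PostAgg> postAggs, String orderingMetric)
  {
    Set<String> needed = new HashSet<>();
    needed.add(orderingMetric);
    List<PostAgg> kept = new ArrayList<>();
    // Walk backwards so that a kept post-aggregator can mark its own inputs as needed.
    for (int i = postAggs.size() - 1; i >= 0; i--) {
      PostAgg candidate = postAggs.get(i);
      if (needed.contains(candidate.name)) {
        needed.addAll(candidate.dependsOn);
        kept.add(0, candidate);
      }
    }
    return kept;
  }
}
```

With post-aggregators `avg(sum, count)`, `expensive(sum)`, and `ratio(avg, count)`, ordering by `ratio` keeps `avg` and `ratio` and drops `expensive`. Ordering by a plain aggregator such as `sum` keeps none of them for the pushed-down query.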

…he#14662)

### Description

Previously, the `maxSegments` configured for auto kill could be ignored if an interval of data for a given datasource had more than this number of unused segments. In that case, the kill task spawned to delete unused segments in that interval could delete more than the configured `maxSegments`. Now each kill task spawned by the auto kill coordinator duty kills at most `limit` segments. This is done by adding a new config property to the `KillUnusedSegmentTask` that lets users specify this limit.
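
The effect of such a limit can be sketched as follows. This is a minimal illustration, not the task's actual implementation: `KillLimitSketch` and `segmentsToKill` are hypothetical names, segments are represented by plain string ids, and treating a missing limit as "no cap" is an assumption.

```java
import java.util.List;

final class KillLimitSketch
{
  /**
   * Returns at most limit segment ids from the unused segments of one interval.
   * A null limit is treated as "no limit" (an assumption made for this sketch).
   */
  static List<String> segmentsToKill(List<String> unusedSegmentIds, Integer limit)
  {
    if (limit == null) {
      return unusedSegmentIds;
    }
    if (limit <= 0) {
      throw new IllegalArgumentException("limit must be positive");
    }
    return unusedSegmentIds.subList(0, Math.min(limit, unusedSegmentIds.size()));
  }
}
```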

* cold tier wip
* wip
* copyedits
* wip
* copyedits
* copyedits
* wip
* wip
* update rules page
* typo
* typo
* update sidebar
* moves durable storage info to its own page in operations
* update screenshots
* add apache license
* Apply suggestions from code review Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* add query from deep storage tutorial stub
* address some of the feedback
* revert screenshot update. handled in separate pr
* load rule update
* wip tutorial
* reformat deep storage endpoints
* rest of tutorial
* typo
* cleanup
* screenshot and sidebar for tutorial
* add license
* typos
* Apply suggestions from code review Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
* rest of review comments
* clarify where results are stored
* update api reference for durablestorage context param
* Apply suggestions from code review Co-authored-by: Karan Kumar <karankumar1100@gmail.com>
* comments
* incorporate apache#14720
* address rest of comments
* missed one
* Update docs/api-reference/sql-api.md
* Update docs/api-reference/sql-api.md

---------

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
Co-authored-by: demo-kratia <56242907+demo-kratia@users.noreply.github.com>
Co-authored-by: Karan Kumar <karankumar1100@gmail.com>

…pache#14571)

* Add support for different result format
* Add tests
* Add tests
* Fix checkstyle
* Remove changes to destination
* Removed some unwanted code
* Address review comments
* Rename parameter
* Fix tests

…pache#14758)

* Add pod name to location
* Add log
* fix style
* Update extensions-contrib/kubernetes-overlord-extensions/src/main/java/org/apache/druid/k8s/overlord/KubernetesPeonLifecycle.java Co-authored-by: Suneet Saldanha <suneet@apache.org>
* Fix unit tests

---------

Co-authored-by: Suneet Saldanha <suneet@apache.org>

…4752)

* Metric to report time spent fetching and analyzing segments
* fix test
* spell check
* fix tests
* checkstyle
* remove unused variable
* Update docs/operations/metrics.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
* Update docs/operations/metrics.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>
* Update docs/operations/metrics.md Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>

---------

Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>

…uty (apache#14769)

### Description

Previously, the `KillUnusedSegments` coordinator duty, in charge of periodically deleting unused segments, could spawn an unlimited number of kill tasks for unused segments. This change adds two new coordinator dynamic configs that control the number of tasks spawned by this coordinator duty:

`killTaskSlotRatio`: Ratio of total available task slots, including autoscaling if applicable, that will be allowed for kill tasks. This limit only applies to kill tasks that are spawned automatically by the coordinator's auto kill duty. Default is 1, which allows all available task slots to be used, which is the existing behavior.

`maxKillTaskSlots`: Maximum number of tasks that will be allowed for kill tasks. This limit only applies to kill tasks that are spawned automatically by the coordinator's auto kill duty. Default is INT.MAX, which essentially allows for an unbounded number of tasks, which is the existing behavior.

I realize that we can effectively get away with just the one `killTaskSlotRatio`, but the compaction config has similar properties, and I thought it was good to have some control over the upper limit regardless of the ratio provided.

#### Release note

NEW: `killTaskSlotRatio` and `maxKillTaskSlots` coordinator dynamic config properties added that allow control of task resource usage spawned by `KillUnusedSegments` coordinator task (auto kill)
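
A rough sketch of how the two properties could combine into a single cap. This is an assumption-laden illustration, not the coordinator's actual code: `KillTaskSlotsSketch` and `availableKillTaskSlots` are hypothetical names, and rounding the ratio-based value down is an assumption.

```java
final class KillTaskSlotsSketch
{
  /**
   * Caps the slots usable by auto kill tasks with both a ratio of total capacity and an absolute maximum.
   * Rounding down is an assumption of this sketch.
   */
  static int availableKillTaskSlots(int totalTaskSlots, double killTaskSlotRatio, int maxKillTaskSlots)
  {
    if (killTaskSlotRatio < 0.0 || killTaskSlotRatio > 1.0) {
      throw new IllegalArgumentException("killTaskSlotRatio must be in [0, 1]");
    }
    if (maxKillTaskSlots < 0) {
      throw new IllegalArgumentException("maxKillTaskSlots must be >= 0");
    }
    int slotsByRatio = (int) Math.floor(totalTaskSlots * killTaskSlotRatio);
    return Math.min(slotsByRatio, maxKillTaskSlots);
  }
}
```

With the defaults described above (ratio 1 and an effectively unbounded maximum), the cap equals the total number of task slots, which matches the previous behavior.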

The current version of jackson-databind is flagged for vulnerabilities CVE-2020-28491 (although the CBOR format is not used in Druid) and CVE-2020-36518 (seems genuine, as deeply nested JSON can cause resource exhaustion). This updates the dependency to the latest version, 2.12.7, to fix these vulnerabilities.

* update RoaringBitmap to 0.9.49

  update RoaringBitmap from 0.9.0 to 0.9.49. Many optimizations and improvements have gone into recent releases of RoaringBitmap. It seems worthwhile to incorporate those.
* implement workaround for BatchIterator interface change
* add test case for BatchIteratorAdapter.advanceIfNeeded

…or querying with deep storage (apache#14944)

Currently, only the user who submitted an async query has permission to interact with the status APIs for that query. However, we often want an administrator to interact with these resources as well. Druid traditionally handles this with the STATE resource: a requesting user who has the necessary permissions on it should also be allowed to interact with the status APIs, irrespective of whether they submitted the query.

This commit pulls out some changes from apache#14407 to simplify that PR.

Changes:
- Rename `IndexerMetadataStorageCoordinator.announceHistoricalSegments` to `commitSegments`
- Rename the overloaded method to `commitSegmentsAndMetadata`
- Fix some typos

…oad (apache#15000)

With PR apache#14322, MSQ INSERT/REPLACE queries wait for segments to be loaded on the Historicals before finishing. That patch introduced a bug: the main thread had a `Thread.sleep()` that could not be interrupted by cancel calls from the Overlord. This patch addresses the problem by moving the `Thread.sleep()` into a thread of its own, so the main thread now waits on the future object of that execution. The cancel call can shut down the executor service via another method, which unblocks the main thread.
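
The pattern described here can be sketched roughly as below. This is a simplified illustration using only `java.util.concurrent`, not the code from the patch: `InterruptibleWaitSketch`, `waitForLoad`, and `cancel` are hypothetical names, and the polling loop stands in for whatever the real code checks between sleeps.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

final class InterruptibleWaitSketch
{
  private final ExecutorService exec = Executors.newSingleThreadExecutor();

  /**
   * Runs the sleep-based polling loop on a separate thread and blocks the caller on its future.
   * Returns true if polling ran to completion, false if it was cancelled or failed.
   */
  boolean waitForLoad(long totalMillis, long pollMillis)
  {
    try {
      Future<?> polling = exec.submit(() -> {
        long deadline = System.currentTimeMillis() + totalMillis;
        while (System.currentTimeMillis() < deadline) {
          Thread.sleep(pollMillis); // interrupted when cancel() shuts the executor down
        }
        return null;
      });
      polling.get();
      return true;
    }
    catch (ExecutionException | CancellationException | RejectedExecutionException e) {
      // The polling thread was interrupted, failed, or the executor was already shut down.
      return false;
    }
    catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    finally {
      exec.shutdownNow();
    }
  }

  /** Called from another thread; interrupts the sleeping worker, which unblocks waitForLoad(). */
  void cancel()
  {
    exec.shutdownNow();
  }
}
```

The design point is that the caller never sleeps itself. It only blocks on `Future.get()`, which returns as soon as the worker thread is interrupted.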

- Add `KillTaskReport` that contains stats for `numSegmentsKilled`, `numBatchesProcessed`, `numSegmentsMarkedAsUnused`
- Fix bug where exception message had no formatter but was still being passed some args.
- Add some comments regarding deprecation of `markAsUnused` flag.

Changes:
- Add task context parameter `taskLockType`. This determines the type of lock used by a batch task.
- Add new task actions for transactional replace and append of segments
- Add methods StorageCoordinator.commitAppendSegments and commitReplaceSegments
- Upgrade segments to appropriate versions when performing replace and append
- Add new metadata table `upgradeSegments` to track segments that need to be upgraded
- Add tests

This entails:
- Removing the enableUnnest flag and additional machinery
- Updating the datasource plan and frame processors to support unnest
- Adding support in MSQ for UnnestDataSource and FilteredDataSource
- CalciteArrayTest now has a MSQ test component
- Additional tests for Unnest on MSQ

These were added in apache#14977, but the implementations are incorrect, because they return null when the input arg is null. They should return false when the input is null. Remove them for now, rather than fixing them, since they're so new that they might as well never have existed.

The aggregators had incorrect types for getResultType when shouldFinalize is false. They had the finalized type, but they should have had the intermediate type. Also includes a refactor of how ExprMacroTable is handled in tests, to make it easier to add tests for this to the MSQ module. The bug was originally noticed because the incorrect result types caused MSQ queries with DS_HLL to behave erratically.

The KubernetesAndWorkerTaskRunner currently doesn't implement getTaskLocation, so tasks run by it will show an unknown TaskLocation in the Druid web console after a task has completed. Fix bug in KubernetesAndWorkerTaskRunner that manifests as missing information in the Druid web console.

Upgrade maven shade plugin to try to fix build failures

Sometimes we get maven shade errors in our integ tests because we don't run clean in between runs to clear the cache, in order to speed them up. This can lead to the error below.

Error: Failed to execute goal org.apache.maven.plugins:maven-shade-plugin:3.2.4:shade (opentelemetry-extension) on project opentelemetry-emitter: Error creating shaded jar: duplicate entry: META-INF/services/org.apache.druid.opentelemetry.shaded.io.grpc.NameResolverProvider

See: https://issues.apache.org/jira/projects/MSHADE/issues/MSHADE-425?filter=allissues

An example run that failed: https://github.com/apache/druid/actions/runs/6301662092/job/17117142375?pr=14887

According to the ticket, this is fixed by updating shade to 3.4.1. When I updated to 3.4.1, I kept running into a different issue during static checks (Caused by: java.lang.NoClassDefFoundError: com/github/rvesse/airline/parser/errors/ParseException). I had to add `createDependencyReducedPom: false` to get the build to pass. The dependency reduced pom feature was added in 3.3.0, which we were not using before, so setting it explicitly to false should not be an issue. https://issues.apache.org/jira/browse/MSHADE-36

* disable parallel builds; enable batch mode to get rid of transfer progress
* restore .m2 from setup-java if not found
* some change to sql
* add ws
* fix quote
* fix quote
* undo querytest change
* nullhandling in mvtest
* init more
* skip commitid plugin
* add-back 1.0C to build ; remove redundant skip-s from copy-resources; add comment

benchmarks/src/test/java/org/apache/druid/benchmark/FrontCodedIndexedBenchmark.java

@@ -166,11 +166,11 @@
    fileGeneric = File.createTempFile("genericIndexedBenchmark", "meta");

    smooshDirFrontCodedIncrementalBuckets = FileUtils.createTempDir();
-    fileFrontCodedIncrementalBuckets = File.createTempFile("frontCodedIndexedBenchmarkIncrementalBuckets", "meta");
+    fileFrontCodedIncrementalBuckets = File.createTempFile("frontCodedIndexedBenchmarkv1Buckets", "meta");


suneet-s and others added 30 commits August 3, 2023 06:07

Enable ServiceStatusMonitor in the examples (apache#14744)

00f1f8c

Remove unused param in MetadataResource (apache#14747)

b27d281

Retry S3 task log fetch in case of transient S3 exceptions (apache#14714)

20c48b6

Report task/pending/time metrics for k8s based ingestion (apache#14698)

3335040

Changes:
* Add and invoke `StateListener` when state changes in `KubernetesPeonLifecycle`
* Report `task/pending/time` metric in `KubernetesTaskRunner` when state moves to RUNNING

Fix the bug in getIndexInfo for mysql (apache#14750)

d31c04c

Latest aggregator factories should accept time as VectorValueSelecto… (…

0d73480

…apache#14753) Fix the queries that have latest aggregator with an expression as time column

Improve the backport missing script (apache#14723)

6ced208

Cleanup the documentation for deep storage

d6c73ca

refactor front-coded into static classes instead of using functional …

e5661a3

…interfaces (apache#14572)

* refactor front-coded into static classes instead of using functional interfaces
* shared v0 static method instead of copy

Update tutorial-kafka.md (apache#14749)

590734b

Additional dimensions for service/heartbeat (apache#14743)

62ddeaf

* Additional dimensions for service/heartbeat
* docs
* review
* review

Refactor: Cleanup coordinator duties for metadata cleanup (apache#14631)

2d8e0f2

Changes
- Add abstract class `MetadataCleanupDuty`
- Make `KillAuditLogs`, `KillCompactionConfig`, etc extend `MetadataCleanupDuty`
- Improve log and error messages
- Cleanup tests
- No functional change

Docs: Include EARLIEST_BY and LATEST_BY as supported aggregation func…

7d78133

…tions (apache#14280)

Rolling Supervisor restarts at taskDuration (apache#14396)

b624a4e

* Rolling supervisor task publishing
* add an option for number of task groups to roll over
* better
* remove docs
* oops
* checkstyle
* wip test
* undo partial test change
* remove incomplete test

Update kinesis docs (apache#14768)

bff8f9e

Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com> Co-authored-by: Katya Macedo <38017980+ektravel@users.noreply.github.com>

upgrade org.mozilla:rhino (apache#14765)

d0403f0

add new filters to unnest filter pushdown (apache#14777)

2845b6a

docs: remove experimental from schema auto-discovery (apache#14759)

8a4dabc

document expression aggregator (apache#14497)

667e4da

document new filters and stuff (apache#14760)

e57f880

Fixing typo in resultsTruncated (apache#14779)

cd817fc

Removes support for Hadoop 2 (apache#14763)

a45b25f

Removing Hadoop 2 support as discussed in https://lists.apache.org/list?dev@druid.apache.org:lte=1M:hadoop

xvrl and others added 25 commits September 20, 2023 15:52

Adding new function decode_base64_utf8 and expr macro (apache#14943)

883c269

* Adding new function decode_base64_utf8 and expr macro
* using BaseScalarUnivariateMacroFunctionExpr
* Print stack trace in case of debug in ChainedExecutionQueryRunner
* fix static check
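
For readers unfamiliar with the operation, the core of a function like `decode_base64_utf8` can be sketched with the JDK alone: decode the Base64 text to bytes, then interpret the bytes as UTF-8. This is an illustrative sketch, not the expression macro added by this commit. `DecodeBase64Utf8Sketch` is a hypothetical name, and returning null for null input is an assumption.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

final class DecodeBase64Utf8Sketch
{
  /**
   * Decodes a Base64 string and interprets the resulting bytes as UTF-8 text.
   * Null input yields null (an assumption of this sketch). Malformed Base64 input
   * makes Base64.Decoder#decode throw IllegalArgumentException.
   */
  static String decodeBase64Utf8(String encoded)
  {
    if (encoded == null) {
      return null;
    }
    byte[] bytes = Base64.getDecoder().decode(encoded);
    return new String(bytes, StandardCharsets.UTF_8);
  }
}
```

For example, `decodeBase64Utf8("aGVsbG8=")` returns `hello`.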

Use annotation to mark DecoupleIgnore (apache#15005)

e76962f

Restore tasks when lifecycle start (apache#14909)

be3f93e

* K8s tasks restore should be from lifecycle start
* add test
* add more tests
* fix test
* wait tasks restore finish when start
* fix style
* revert previous change and add comment

skip org.owasp:dependency-check on extensions-contrib modules and sup…

48b6d2a

…press false-positive gRPC CVEs (apache#15026)

Commit segments only when they are covered by active locks (apache#15027)

f7a5491

* Commit segments only when they are covered by active locks

Remove EOFException catch block from the Avro decoders (apache#15018)

ba6101a

* Remove stale comment since we're on avro version 1.11.1
* Update exception blocks. With 1.11.1, read() only throws IOException.
* Unit tests
* Cleanup and add more tests.

Revert "SQL: Plan non-equijoin conditions as cross join followed by f…

75af741

…ilter. (apache#14978)" (apache#15029) This reverts commit 4f498e6.

Add metrics for number of segments generated per task in MSQ (apache#…

7301e60

…14980) Add ingest/tombstones/count and ingest/segments/count metrics in MSQ.

MV_FILTER_ONLY may run into Exceptions in case duplicate values were …

022950a

…processed (apache#15012)

fix uploading IT docker logs to GHA artifacts (apache#15046)

fa61e65

junit-report

3a0ffd1

make some failure

a154a28

break more

69b913f

github-actions bot added the Area - Documentation label Oct 5, 2023

fox complit

0604c08

github-advanced-security bot found potential problems Oct 5, 2023

quote

d7564cb

kgyrtkirk closed this Oct 20, 2023
