fix flakyness by replacing HashSet and HashMap with LinkedHashSet and LinkedHashMap #11941

Nemesis123925 · 2023-11-02T23:19:00Z

As Stated in the issue/11939

… LinkedHashMap

Jackie-Jiang

We should not change ProjectPlanNode or AggregationFunctionUtils because it involves performance overhead to the production code. Can you try handling the order issue within the test?

walterddr

+1 on @Jackie-Jiang 's comment. best solution is to use a parsed out explanation on project node and do a lexi-sorting on the test side on both the expected and actual value

Nemesis123925 · 2023-11-04T01:40:21Z

Sure, let me try to do that. However, how can I handle the case which there could be random order for the nested sqls like line 1447 in ExplainPlanQueriesTest.java

Jackie-Jiang · 2023-11-04T18:04:44Z

I can see a simpler way is to create a HashSet in the test and use toString() to generate the PROJECT clause since for the same java version the order should be deterministic

Nemesis123925 · 2023-11-06T23:54:18Z

I try the way that Jackie Jiang proposed to use toString() of HashSet in the test to generate the PROJECT clause. It doesn't seems to work because the NonDex tool, which we use to check the falkyness, use differet and random seed for all the HashSet. So almost in every run that the order is still not the same.

I check the way that walteraddr proposed, it seems like it needs to change the validateRows functions in QueriesTestUtils.java which is also used by other functions. Is there other better options that we can do this way?

codecov-commenter · 2023-11-07T01:20:25Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (fee11d6) 61.41% compared to head (27755b8) 61.62%.
Report is 52 commits behind head on master.

Additional details and impacted files

@@             Coverage Diff              @@
##             master   #11941      +/-   ##
============================================
+ Coverage     61.41%   61.62%   +0.21%     
- Complexity      207     1150     +943     
============================================
  Files          2378     2385       +7     
  Lines        128865   129275     +410     
  Branches      19927    20016      +89     
============================================
+ Hits          79143    79671     +528     
+ Misses        44001    43808     -193     
- Partials       5721     5796      +75

Flag	Coverage Δ
custom-integration1	`<0.01% <ø> (-0.01%)`	⬇️
integration	`<0.01% <ø> (-0.01%)`	⬇️
integration1	`<0.01% <ø> (-0.01%)`	⬇️
integration2	`0.00% <ø> (ø)`
java-11	`61.57% <ø> (+0.19%)`	⬆️
java-21	`61.50% <ø> (+0.23%)`	⬆️
skip-bytebuffers-false	`61.59% <ø> (+0.19%)`	⬆️
skip-bytebuffers-true	`61.48% <ø> (+0.22%)`	⬆️
temurin	`61.62% <ø> (+0.21%)`	⬆️
unittests	`61.62% <ø> (+0.21%)`	⬆️
unittests1	`46.97% <ø> (+0.35%)`	⬆️
unittests2	`27.57% <ø> (-0.07%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Nemesis123925 · 2023-11-07T15:56:24Z

I took a look at the failed test and don't have too much clue on how to solve that. Can I get a hint?

Jackie-Jiang · 2023-11-08T00:48:48Z

I guess we need to modify the test to not test on the string value. @walterddr any suggestion?

walterddr · 2023-11-08T16:10:03Z

i would suggest instead of using QueriesTestUtils.testInterSegmentsResult, we should create a test validator specific for explained results. coupling the data result and explain result in the same assert logic is the problem here

Nemesis123925 · 2023-11-10T21:51:43Z

@walterddr Do you mean that I can create a different validateRow function for ExplainPlanQueriesTest, so we can solve the nondeternimism without changing the HashSet to LinkedHashSet?

walterddr · 2023-11-11T18:57:38Z

correct. explain query itself is not technically a "query" thus the response doesn't necessarily need to be interpretd/validated via the same method.

see https://www.postgresql.org/docs/current/using-explain.html#USING-EXPLAIN:

If you want to feed EXPLAIN's output to a program for further analysis, you should use one of its machine-readable output formats (XML, JSON, or YAML) instead.

we don't support format in explain, but i think parsing each operator and compare in unordered fashion should be suffice now. e.g. the format is always:

format:
<operator_name> ['(' <attribute> [, <attribute>] ')' ]

operator_name: 
    string

attribute:
    string | string:<attribute> | string=<attribute>

walterddr · 2023-11-11T19:02:44Z

IMO at this stage we can

add a special handling for the operators with indeterministic attribute list (sort by lex order)
the rest can be just string-match

Nemesis123925 · 2023-11-12T04:05:41Z

Got it, let me try doing it.
I also have a question such that If I created a function like ValidateExplained, do I need to write the test to test ValidateExplained function?

Nemesis123925 · 2023-11-12T04:16:38Z

Also I am aware of the code coverage change of my change. Can I ask the tool we are using to detect code coverage here? so I can validate the coverage myself before updating the change

walterddr · 2023-11-12T16:34:53Z

IMO you dont need to write test to validate test
i usually use intellij's Run with coverage to visualize inside the IDE, but you can also use mvn plugin to report such if that's more preferrable. there's no restriction here.

Thank you for considering the test validity and coverage! this is superb!

Nemesis123925 · 2023-11-13T18:06:05Z

I tried to change the LinkedHashSet and LinkedHashMap back to HashSet and HashMap, and found out that there is also nonderterminism in the PLAN_START node, which the numSegmentsForThisPlan also got randomized. While I am debuging this part, do you guys have any clue what is causing this nondeterminism?

Nemesis123925 · 2023-11-14T23:15:04Z

I found that a part of the non-determinism of numSegmentsForThisPlan happens insde of reduceOnDataTable, which in my understanding it compare and select the best query generated from the two servers, and add up the numSegments of them if the two queries are identical, which happens in extractUniqueExplainPlansAcrossServers. Due to the non-determinism inside of the PROJECT clause created by NonDex, the test which suppose to have the two generates queries identical will differ from the PROJECT field, thus be understood as different queries by extractUniqueExplainPlansAcrossServers. Thus the function will select the best one instead of adding their numSegments together, and create the nondeternism inside of the numSegmentsForThisPlan part.

For this part, I think the solution would be sorting the PROJECT clause before the two generated queries get into the extractUniqueExplainPlansAcrossServers function, so the function will add the numSegment up instead of choising the better one.

However this is not the only cause of the non-deternism of the numSegmentsForThisPlan part. I found that applying NonDex will change the numSegments part even before the two generqte queries get into the extractUniqueExplainPlansAcrossServers function. So I will be keep exploring that part.

walterddr · 2023-11-15T04:42:54Z

i think we can separate the 2 issues, IIUC the project ordering issue is independent from the numSegmentsForThisPlan

the ordering issue will occur even if only 1 server is servering, thus making sure the PROJECT node compare method in test is order invariant is the main thing.
the numSegmentsForThisPlan IMO is really a "judgement call", b/c
- technically speaking we should return ALL plans if there are multiple returns from different servers.
- however pinot doesn't really have a "logical" vs. "physical" plan concept, where "logical" is invariant of the data layout; and "physical" should be reported on every execution path.
- the result we are comparing here is neither truly logical or physical, that it only return a single plan ("as logical") but it consolidates from multiple server return ("physical").

hence my suggestion is we address the PROJECT ordering issue first in this PR and address the plan issue separately

CC @somandal who improved upon this part of the logic to chime in :-)

Nemesis123925 · 2023-11-15T06:38:09Z

Roger that

Here is the solution I suggest

In QueriesTestUtils.java, I write two new functions:

public static void testExplainSegmentsResult(BrokerResponseNative brokerResponse, ResultTable expectedResultTable) {
    assertEquals(brokerResponse.getResultTable().getDataSchema(), expectedResultTable.getDataSchema());
    validateExplainedRows(brokerResponse.getResultTable().getRows(), expectedResultTable.getRows());
  }

private static void validateExplainedRows(List<Object[]> actual, List<Object[]> expected) {
    assertEquals(actual.size(), expected.size());
    for (int j = 0; j < actual.size(); j++) {
      Object[] act = actual.get(j);
      Object[] exp = expected.get(j);
      String attributeToSort = "PROJECT";
      assertEquals(act.length, exp.length);
      if (act[0].toString().startsWith(attributeToSort)) {
        char[] act0 = act[0].toString().toCharArray();
        char[] exp0 = exp[0].toString().toCharArray();
        Arrays.sort(act0);
        Arrays.sort(exp0);
        act[0] = new String(act0);
        exp[0] = new String(exp0);
      }
      assertEquals(act, exp);
    }
  }

In ExplainPlanQueriesTest.java, I redirect the checkWithQueryExecutor to call testExplainSegmentsResult

If apply this fix, instead of fixing all 27 flaky tests in ExplainPlanQueriesTest.java, this will fix 14 of them, which are

org.apache.pinot.queries.ExplainPlanQueriesTest#testSelect
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregate
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterGroupBy
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterGroupByVerbose
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterIndex
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterIndexGroupBy
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterIndexGroupByHaving
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterIndexGroupByOrderBy
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectAggregateUsingFilterOnTextIndexColumn
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectColumnUsingFilterOnRangeIndexColumn
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectColumnsUsingFilter
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectColumnsUsingFilterOnInvertedIndexColumn
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectColumnsVariationsOfAndOperators
org.apache.pinot.queries.ExplainPlanQueriesTest#testSelectColumnsVariationsOfOrOperators

If you guys think this fix is good, I will push the new commit.

fix flakyness by replacing HashSet and HashMap with LinkedHashSet and…

a563090

… LinkedHashMap

Nemesis123925 mentioned this pull request Nov 2, 2023

[Flaky-Test] Fix Flaky tests in Pinot-Core org.apache.pinot.queries.ExplainPlanQueriesTest #11939

Open

Jackie-Jiang added the flaky-test label Nov 3, 2023

Jackie-Jiang reviewed Nov 3, 2023

View reviewed changes

walterddr reviewed Nov 3, 2023

View reviewed changes

Nemesis123925 mentioned this pull request Nov 10, 2023

Pinot Pinot-Core ExplainPlanQueriesTest Opened TestingResearchIllinois/idoft#948

Merged

fix by sorting PROJECT field

27755b8

walterddr approved these changes Nov 20, 2023

View reviewed changes

walterddr merged commit 2c88c4f into apache:master Nov 20, 2023
19 checks passed

This was referenced Nov 24, 2023

update Accepts TestingResearchIllinois/idoft#1073

Merged

Try Fix By Replacing HashSet by LinkedHashSet #12048

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix flakyness by replacing HashSet and HashMap with LinkedHashSet and LinkedHashMap #11941

fix flakyness by replacing HashSet and HashMap with LinkedHashSet and LinkedHashMap #11941

Nemesis123925 commented Nov 2, 2023

Jackie-Jiang left a comment

walterddr left a comment

Nemesis123925 commented Nov 4, 2023 •

edited

Loading

Jackie-Jiang commented Nov 4, 2023

Nemesis123925 commented Nov 6, 2023

codecov-commenter commented Nov 7, 2023 •

edited

Loading

Nemesis123925 commented Nov 7, 2023

Jackie-Jiang commented Nov 8, 2023

walterddr commented Nov 8, 2023 •

edited

Loading

Nemesis123925 commented Nov 10, 2023

walterddr commented Nov 11, 2023 •

edited

Loading

walterddr commented Nov 11, 2023 •

edited

Loading

Nemesis123925 commented Nov 12, 2023

Nemesis123925 commented Nov 12, 2023

walterddr commented Nov 12, 2023

Nemesis123925 commented Nov 13, 2023

Nemesis123925 commented Nov 14, 2023

walterddr commented Nov 15, 2023

Nemesis123925 commented Nov 15, 2023

fix flakyness by replacing HashSet and HashMap with LinkedHashSet and LinkedHashMap #11941

fix flakyness by replacing HashSet and HashMap with LinkedHashSet and LinkedHashMap #11941

Conversation

Nemesis123925 commented Nov 2, 2023

Jackie-Jiang left a comment

Choose a reason for hiding this comment

walterddr left a comment

Choose a reason for hiding this comment

Nemesis123925 commented Nov 4, 2023 • edited Loading

Jackie-Jiang commented Nov 4, 2023

Nemesis123925 commented Nov 6, 2023

codecov-commenter commented Nov 7, 2023 • edited Loading

Codecov Report

Nemesis123925 commented Nov 7, 2023

Jackie-Jiang commented Nov 8, 2023

walterddr commented Nov 8, 2023 • edited Loading

Nemesis123925 commented Nov 10, 2023

walterddr commented Nov 11, 2023 • edited Loading

walterddr commented Nov 11, 2023 • edited Loading

Nemesis123925 commented Nov 12, 2023

Nemesis123925 commented Nov 12, 2023

walterddr commented Nov 12, 2023

Nemesis123925 commented Nov 13, 2023

Nemesis123925 commented Nov 14, 2023

walterddr commented Nov 15, 2023

Nemesis123925 commented Nov 15, 2023

Nemesis123925 commented Nov 4, 2023 •

edited

Loading

codecov-commenter commented Nov 7, 2023 •

edited

Loading

walterddr commented Nov 8, 2023 •

edited

Loading

walterddr commented Nov 11, 2023 •

edited

Loading

walterddr commented Nov 11, 2023 •

edited

Loading