Testing Framework V2 #1521

ray6080 · 2023-05-08T04:54:13Z

Goal

The goal of this proposal is to improve our existing testing framework, especially the end-to-end tests:

re-organize existing tests according to the functionalities of our database (see our documentation website), and remove unnecessary legacy tests. For example, tests for COPY are currently distributed across several places, such as test/copy/copy_*.cpp, test/runner/e2e_copy_transaction_test.cpp, and test/runner/e2e_read_test.cpp. We should consolidate them so they are easier to navigate and change.
enhance our end-to-end tests. Currently, our end-to-end tests are quite simple. They cannot cover many complicated test cases, such as exceptions, multiple statements with transactions, file-based result verification, etc. We need to make our test files more expressive and also easy to write. This will allow us to convert more inline tests to end-to-end tests.
allow partial and parallel runs of specific test groups. This speeds up our test runs as we add more tests in the future, and makes it easy to run locally only the tests relevant to one's work.
find and fix critical code paths that are not covered by our tests. As we iterate, we should pay attention to codecov reports.

The general principle for our testing is to avoid testing components individually; instead, we should route all tests, when possible, end-to-end through Cypher statements.
The main reason is that, in our experience, inline tests are not easy to maintain and do not bring much extra benefit compared to end-to-end ones.

Fuzzy tests are something we're interested in, but they require more thought, and we are not yet sure when and how to add them. I'm looking into mutation-based fuzzy tests and keeping this as a TODO item for now.

Detailed Designs

Overview of Our Existing Tests

We use GTest as the underlying testing framework.
Logically, there are two testing modes: inline and end-to-end.
Inline tests require manually writing C++ code with GTest. These usually serve as unit tests for an individual module, for example the tests on the parser, binder, and optimizer. Most update tests are also inline, because they often require more complicated logic than our current end-to-end tests can express.
End-to-end tests exercise the system as a black box. This is done through Cypher statements written in test files; see test/test_files.

Generally, we have two categories of tests in terms of their functionality: read-only and updates.
The former consists mainly of read-only queries, including tests on data types, functions, expressions, and query clauses.
The latter covers DDL, COPY, and DML. Tests on updates are usually coupled with tests on transactions and recovery.

Additionally, we have specialized tests in our language bindings, such as the Python tests and the C API tests.

Basic End-to-end Tests

End-to-end tests rely on test files to describe the testing logic.
Each test file expresses tests on a specific system functionality, such as node scans.
End-to-end tests differ from inline tests in that they do not require manually writing C++ code: our testing framework automatically parses the test files and runs them accordingly.
Test files that cover similar functionality and use the same dataset can be organized in the same directory.
All of these tests under the same directory belong to the same test group.
Tests within the same test group will run on the same dataset with the same database directory.
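
As a rough illustration of the directory-based grouping described above, the following sketch buckets test file paths by their parent directory. The function name groupTestFilesByDirectory and the copy/copy.test path used in the usage note are hypothetical and not part of the existing framework.

```cpp
#include <filesystem>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: every directory that contains .test files becomes one
// test group, keyed by the directory path.
std::map<std::string, std::vector<std::string>> groupTestFilesByDirectory(
    const std::vector<std::string>& testFiles) {
    std::map<std::string, std::vector<std::string>> groups;
    for (const auto& file : testFiles) {
        const std::filesystem::path path(file);
        if (path.extension().string() != ".test") {
            continue; // Skip anything that is not a test file.
        }
        groups[path.parent_path().generic_string()].push_back(
            path.filename().generic_string());
    }
    return groups;
}
```

Given demo_db/demo_db.test, demo_db/demo_db_order.test, and a hypothetical copy/copy.test, this yields two groups, one per directory.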

Enhanced End-to-end Tests

The .test file includes two sections: a header and a body (the tests). The header contains information about the group, the test name, and the dataset. Note that the test granularity is one registered test per file. For example, in the previous version, the TEST_F block below registers a test case (GROUP) DemoDBTest with the name DemoDBTest.

TEST_F(DemoDBTest, DemoDBTest) {
    runTest(TestHelper::appendKuzuRootPath("test/test_files/demo_db/demo_db.test"));
    runTestAndCheckOrder(
        TestHelper::appendKuzuRootPath("test/test_files/demo_db/demo_db_order.test"));
}

In the new framework, two tests will be registered separately, using the header information. For instance, suppose both demo_db.test and demo_db_order.test contain the following header:

-GROUP DemoDBTest
-DATASET demo-db/csv
--

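That will be equivalent (for registration purposes) to the following. The snippet is schematic: GTest requires test names to be unique within a test suite, so in practice the two tests need distinct names, which come from each file's header: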

TEST_F(DemoDBTest, DemoDBTest) {
    runTest(TestHelper::appendKuzuRootPath("test/test_files/demo_db/demo_db.test"));
}

TEST_F(DemoDBTest, DemoDBTest) {
    runTestAndCheckOrder(
        TestHelper::appendKuzuRootPath("test/test_files/demo_db/demo_db_order.test"));
}

The -- separator ends the header and starts the body of the end-to-end test.
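
A minimal sketch of how such a header might be read, assuming each header line has the form "-KEY value" and that the header ends at the first line that is exactly "--". The TestHeader struct and the parseHeader function are hypothetical names chosen for illustration, not the framework's actual parser.

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

// Header fields named in the proposal; the struct itself is hypothetical.
struct TestHeader {
    std::string group;
    std::string test;
    std::string dataset;
};

// Reads "-KEY value" lines until the "--" separator and returns the header.
// On return, the stream is positioned at the first line of the test body.
TestHeader parseHeader(std::istream& input) {
    TestHeader header;
    std::string line;
    while (std::getline(input, line)) {
        if (line.empty()) {
            continue; // Blank lines inside the header are ignored.
        }
        if (line == "--") {
            return header;
        }
        std::istringstream tokens(line);
        std::string key, value;
        tokens >> key >> value;
        if (key == "-GROUP") {
            header.group = value;
        } else if (key == "-TEST") {
            header.test = value;
        } else if (key == "-DATASET") {
            header.dataset = value;
        } else {
            throw std::runtime_error("Unknown header line: " + line);
        }
    }
    throw std::runtime_error("Missing -- separator after header");
}
```

Failing loudly on an unknown key or a missing separator is a design choice of this sketch; it keeps a malformed test file from being silently registered under an empty group.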

Each test can also consist of several statements, each together with either the expected result, an ok indicator, or an error indicator.

The syntax of test files should be enhanced as follows:

allow multiple statements to be executed sequentially in a single test.
allow each statement to be followed not only by the expected result, but also by an indicator. The ok indicator means the statement is expected to execute successfully, and the values in QueryResult are not verified; the error indicator means the statement is expected to throw an exception during execution, and we check the exception message.
allow comparing a query result with a CSV file. This is useful when the query result is large and we don't want to bloat the test file with it.
define a statement block with several statements, which can be reused and inserted into any test in the same file.
allow running a shell command as a statement, such as :thread x, :timeout x, etc.
allow running of transactions in recovery mode. (TBD)
for loops and variable declarations. (TBD)
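
The three kinds of result lines described above (an expected row count, ok, and error) could be classified roughly as follows. This is a sketch under the assumption that a result line has the form "---- X" and has already been stripped of leading whitespace; Expectation and parseExpectation are hypothetical names.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

enum class ExpectationKind { Ok, Error, Rows };

struct Expectation {
    ExpectationKind kind;
    std::size_t numRows; // Only meaningful when kind == ExpectationKind::Rows.
};

// Classifies a result line such as "---- ok", "---- error", or "---- 16".
Expectation parseExpectation(const std::string& line) {
    const std::string prefix = "---- ";
    if (line.compare(0, prefix.size(), prefix) != 0) {
        throw std::invalid_argument("Not a result line: " + line);
    }
    const std::string rest = line.substr(prefix.size());
    if (rest == "ok") {
        return {ExpectationKind::Ok, 0};
    }
    if (rest == "error") {
        return {ExpectationKind::Error, 0};
    }
    // Anything else must be a non-negative row count.
    const bool allDigits = !rest.empty() &&
        std::all_of(rest.begin(), rest.end(),
            [](unsigned char c) { return std::isdigit(c) != 0; });
    if (!allDigits) {
        throw std::invalid_argument("Unknown indicator: " + rest);
    }
    return {ExpectationKind::Rows, static_cast<std::size_t>(std::stoul(rest))};
}
```

A runner built on this would skip value verification for Ok, compare the exception message for Error, and read numRows expected tuples (or a result file) for Rows.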

An example test file:

-GROUP Create
-TEST CreateRelTest
-DATASET tinysnb

--

-DEFINE_STATEMENT_BLOCK create_rel_set [
    -STATEMENT MATCH (a:person), (b:person) WHERE a.ID=10 AND b.ID=20 CREATE (a)-[e:knows]->(b);
    ---- ok
    -STATEMENT MATCH (a:person), (b:person) WHERE a.ID=1 AND b.ID=2 CREATE (a)-[e:knows]->(b);
    ---- ok
    -STATEMENT MATCH (a:person), (b:person) WHERE a.ID=1 AND b.ID=20 CREATE (a)-[e:knows]->(b);
    ---- error
    "Exception: Duplicate primary key"
]

-CASE CreateRelRollbackTest
-STATEMENT MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
14
-STATEMENT BEGIN TRANSACTION
---- ok
-STATEMENT_BLOCK create_rel_set
-QUERY MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
16
-STATEMENT ROLLBACK;
---- ok
-QUERY MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
14


-CASE CreateRelCommitTest
-STATEMENT MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
14
-STATEMENT BEGIN TRANSACTION
---- ok
-STATEMENT_BLOCK create_rel_set
-QUERY MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
16
-STATEMENT COMMIT;
---- ok
-QUERY MATCH (a:person)-[e:knows]->(b:person) RETURN COUNT(*)
---- 1
16


-CASE MatchTest
-COMMAND :thread 8
-STATEMENT MATCH (a:person)-[e:knows]->(b:person) RETURN a.ID, b.ID
---- 16
# Compare results with a csv file
<FILE>:test/tinysnb/match/answers/create_node_test.csv

Note that test cases in the test file are executed sequentially, so changes made in one test case can propagate to later ones.
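
To make the reuse of statement blocks concrete, here is a sketch of how a -STATEMENT_BLOCK reference might be expanded before a test case runs. It assumes blocks defined with -DEFINE_STATEMENT_BLOCK have already been collected into a map from block name to lines; expandStatementBlocks is a hypothetical helper.

```cpp
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Replaces each "-STATEMENT_BLOCK name" line with the lines of the named
// block and copies every other line through unchanged.
std::vector<std::string> expandStatementBlocks(
    const std::vector<std::string>& caseLines,
    const std::map<std::string, std::vector<std::string>>& blocks) {
    const std::string keyword = "-STATEMENT_BLOCK ";
    std::vector<std::string> expanded;
    for (const auto& line : caseLines) {
        if (line.compare(0, keyword.size(), keyword) != 0) {
            expanded.push_back(line);
            continue;
        }
        const std::string name = line.substr(keyword.size());
        const auto it = blocks.find(name);
        if (it == blocks.end()) {
            throw std::runtime_error("Undefined statement block: " + name);
        }
        expanded.insert(expanded.end(), it->second.begin(), it->second.end());
    }
    return expanded;
}
```

Expanding up front keeps the executor simple: after expansion, a test case is just a flat sequence of statements and result lines, which matches the sequential execution model.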

Partial Runs and Parallel Runs

Provide interfaces to run one or more specific test groups.
Allow manually configuring test groups to run in parallel when they are under different database directories.
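
One possible way to combine the two requirements is to first filter the groups by name and then bucket them by database directory, so that each bucket runs sequentially while different buckets may run in parallel. The sketch below only computes that plan; TestGroup, planRuns, and the directory names in the usage note are hypothetical. For name-based selection alone, GTest's existing --gtest_filter flag may already be sufficient.

```cpp
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical description of a test group and the database directory it uses.
struct TestGroup {
    std::string name;
    std::string databaseDir;
};

// Returns batches of group names. Groups in the same batch share a database
// directory and must run one after another; distinct batches can run in
// parallel. An empty selection means "run every group".
std::vector<std::vector<std::string>> planRuns(
    const std::vector<TestGroup>& groups, const std::set<std::string>& selected) {
    std::map<std::string, std::vector<std::string>> byDatabaseDir;
    for (const auto& group : groups) {
        if (!selected.empty() && selected.count(group.name) == 0) {
            continue; // Not part of this partial run.
        }
        byDatabaseDir[group.databaseDir].push_back(group.name);
    }
    std::vector<std::vector<std::string>> batches;
    for (auto& entry : byDatabaseDir) {
        batches.push_back(std::move(entry.second));
    }
    return batches;
}
```

For example, selecting Create, Match, and Copy, where Create and Match share one database directory, yields two batches: one containing Copy and one containing Create followed by Match.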

PyTest Framework

Set up a similar testing framework for pytests.
TODO(Guodong): fill more details.

TODOs

Each of the following TODO items will correspond to one or more PRs.
It's always good to split major changes into small PRs, so we can iterate more frequently as we move on.

Fuzzy Tests

Consider adding fuzzy copy tests with random graphs.

TODO(Guodong)

Conversion of current tests

test

binder: not e2e
c_api: not e2e
common: not e2e
copy: add support for comparing query results with CSV files. Convert C++ tests to end-to-end ones.
main: not e2e, except for exception_test.cpp. btw, maybe we should group all kinds of exception tests together when possible.
optimizer: not e2e
parser: It's possible to convert the content to test_files/parser/parser.test, but I'm not sure if we should keep this test here. (It's fine, let's also convert this into end-to-end tests)
processor: not e2e
runner: more info below
storage: not e2e
transaction: requires support for transaction-related statements, such as BEGIN TRANSACTION, COMMIT, ROLLBACK, etc. Need to think a bit about how to rewrite the existing C++ tests as end-to-end tests.

test/runner

e2e_update_node_test.cpp: The only obstacle to converting this file is the function getStringExceedsOverflow, which generates a long list [0, 1, 2 ... 5990]. The simplest solution is to hard-code this list into the test file. Other possible solutions are to read the values from a file or to generate them inside the test file:

# Alternative 1
-QUERY MATCH (a:person) WHERE a.ID=0 SET a.fName=[0,1,2,3,4,5,6,7,8........

or

# Alternative 2
-DEFINE LONG_LIST READ_FIXTURE("long_list_file.txt")
-QUERY MATCH (a:person) WHERE a.ID=0 SET a.fName=${LONG_LIST}

or

# Alternative 3
-DEFINE LONG_LIST [0..5990]
# or -DEFINE LONG_LIST LIST(0,5990)
-QUERY MATCH (a:person) WHERE a.ID=0 SET a.fName=${LONG_LIST}

I tend towards the third alternative since it's more flexible and might be helpful for other tests.

e2e_copy_transaction_test.cpp: leave untouched for now
e2e_create_rel_test.cpp: leave untouched for now. Update: should be fine to convert. It will require implementing commitOrRollbackConnectionAndInitDBIfNecessary on TestRunner and also FOR loops for the multiple insertion part.
e2e_ddl_test.cpp: leave untouched for now
e2e_delete_create_transaction_test.cpp: leave untouched for now
e2e_exception_test.cpp: done
e2e_read_test.cpp: done
e2e_delete_rel_test.cpp: leave untouched for now
e2e_set_transaction_test.cpp: leave untouched for now
e2e_update_rel_test.cpp: leave untouched for now
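
For the third alternative discussed under e2e_update_node_test.cpp above, the range form [0..5990] could be expanded into a list literal before the query is sent to the database. The sketch below assumes the [first..last] spelling with inclusive bounds; expandRange is a hypothetical helper, and the final syntax is still open.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Expands "[first..last]" (inclusive bounds) into "[first,...,last]".
std::string expandRange(const std::string& spec) {
    const std::size_t dots = spec.find("..");
    if (spec.size() < 6 || spec.front() != '[' || spec.back() != ']' ||
        dots == std::string::npos) {
        throw std::invalid_argument("Expected [first..last], got: " + spec);
    }
    // std::stol throws std::invalid_argument if a bound is not a number.
    const long first = std::stol(spec.substr(1, dots - 1));
    const long last = std::stol(spec.substr(dots + 2, spec.size() - dots - 3));
    if (first > last) {
        throw std::invalid_argument("Empty range: " + spec);
    }
    std::string list = "[";
    for (long value = first; value <= last; ++value) {
        if (value != first) {
            list += ',';
        }
        list += std::to_string(value);
    }
    list += ']';
    return list;
}
```

With this, expandRange("[0..3]") produces "[0,1,2,3]", and the result for [0..5990] can be substituted wherever ${LONG_LIST} appears in a query.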


andyfengHKU · 2024-06-07T19:11:12Z

The most critical part has been done. For fuzzy testing and pytest, we don't have the engineering capacity in the near future.

ray6080 assigned rfdavid May 8, 2023

rfdavid mentioned this issue May 17, 2023: Testing framework v2 #1548 (merged)

rfdavid mentioned this issue Jun 5, 2023: Test Framework: Support CSV to Parquet conversion #1611 (merged)

andyfengHKU closed this as completed Jun 7, 2024
