Consider consolidating common tests for Catalog implementations in a new crate #519

fqaiser94 · 2024-08-03T17:44:22Z

As part of the MemoryCatalog implementation, I wrote ~1500 lines worth of tests to comprehensively exercise catalog behaviour. These tests are now being reused in the SqlCatalog implementation by copy-pasting.

We should consider a more DRY approach like creating a new crate with those tests that can be called by a function, maybe something like this:

fn test_catalog<C: Catalog>(catalog: &C) {
  todo!()
}

Although it might be better to pass in a function that can setup a new catalog so that each test can work on a fresh instance e.g.

/// Takes a function that accepts a FileIO and an optional warehouse_location and returns a fresh Catalog instance
fn test_catalog<C: Catalog>(setup_catalog_fn: impl FnOnce(FileIO, Option<String>) -> C) {
  todo!()
}

The main concern I have is that the existing catalogs don't all follow exactly the same behaviour, there are small, subtle differences in behaviour like checking for tables before dropping a namespace, error messaging, etc. It may or may not be worth aligning behaviours but we can discuss all of that when we get there.

The other concern I have is developer-experience. We should still strive for clear test failures. I'm not sure a function-based approach as illustrated above will offer a good DevX but it may still be worth the compromise. In other languages, this would just be an abstract class for me but I'm not sure what the Rust equivalent is at this moment in time.

This is currently blocked by: #503
I'll probably take it on myself.

liurenjie1024 · 2024-08-04T03:37:29Z

Hi, @fqaiser94 Thanks for raising this, and I'm definitely +1 for this proposal to avoid duplicate codes!

About the overall design, inspired by your design, I think maybe we can have origanize things like following:

struct CatalogTestSuite<C> {
    catalog: C,
    file_io: FileIO
}

impl<C: Catalog> CatalogTestSuite {
  async fn test_catalog_create_empty_namespace(self) {
     ...
  }

async fn test_catalog_create_empty_hierarchy_namespace(self) {
     ...
  }
  ...
}

Then in each catalog tests module, we can just have tests to setup this CatalogTestSuite and call different method in different case:

#[tokio::test]
async fn test_create_empty_namespace() {
   let suite = CatalogTestSuite {}
   suite.test_catalog_create_empty_namespace().await
}

With this approach we have boilerplate codes in each catalog test module, but I think this is a trade off. We can use macros or libtest_mimic to remove duplicate codes as much as possible, but the problem is that it makes running single test in ide a little more difficult.

The main concern I have is that the existing catalogs don't all follow exactly the same behaviour, there are small, subtle differences in behaviour like checking for tables before dropping a namespace, error messaging, etc. It may or may not be worth aligning behaviours but we can discuss all of that when we get there.

I think there are two problems here:

Not all catalogs support exactly all test cases. For example, IIRC, hive catalog doesn't support nested namespaces, and rest catalog's support for namespaces is not determined yet.
About the subtle behaviors differentct(such as check table), I think we should align them, but we don't need to do them together, we can do them one by one once we have finished the necessary preparatory work.

fqaiser94 mentioned this issue Aug 3, 2024

SQL Catalog #503

Closed

liurenjie1024 added the enhancement New feature or request label Sep 4, 2024

liurenjie1024 added this to the 0.4.0 Release milestone Sep 4, 2024

liurenjie1024 mentioned this issue Sep 4, 2024

feat: support create partition table for non REST catalog #577

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider consolidating common tests for Catalog implementations in a new crate #519

Consider consolidating common tests for Catalog implementations in a new crate #519

fqaiser94 commented Aug 3, 2024 •

edited

Loading

liurenjie1024 commented Aug 4, 2024

Consider consolidating common tests for Catalog implementations in a new crate #519

Consider consolidating common tests for Catalog implementations in a new crate #519

Comments

fqaiser94 commented Aug 3, 2024 • edited Loading

liurenjie1024 commented Aug 4, 2024

fqaiser94 commented Aug 3, 2024 •

edited

Loading