Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider consolidating common tests for Catalog implementations in a new crate #519

Open
fqaiser94 opened this issue Aug 3, 2024 · 1 comment
Labels
enhancement New feature or request
Milestone

Comments

@fqaiser94
Copy link
Contributor

fqaiser94 commented Aug 3, 2024

Some context here: #503 (comment)

As part of the MemoryCatalog implementation, I wrote ~1500 lines worth of tests to comprehensively exercise catalog behaviour. These tests are now being reused in the SqlCatalog implementation by copy-pasting.

We should consider a more DRY approach like creating a new crate with those tests that can be called by a function, maybe something like this:

fn test_catalog<C: Catalog>(catalog: &C) {
  todo!()
}

Although it might be better to pass in a function that can setup a new catalog so that each test can work on a fresh instance e.g.

/// Takes a function that accepts a FileIO and an optional warehouse_location and returns a fresh Catalog instance
fn test_catalog<C: Catalog>(setup_catalog_fn: impl FnOnce(FileIO, Option<String>) -> C) {
  todo!()
}

The main concern I have is that the existing catalogs don't all follow exactly the same behaviour, there are small, subtle differences in behaviour like checking for tables before dropping a namespace, error messaging, etc. It may or may not be worth aligning behaviours but we can discuss all of that when we get there.

The other concern I have is developer-experience. We should still strive for clear test failures. I'm not sure a function-based approach as illustrated above will offer a good DevX but it may still be worth the compromise. In other languages, this would just be an abstract class for me but I'm not sure what the Rust equivalent is at this moment in time.

This is currently blocked by: #503
I'll probably take it on myself.

@fqaiser94 fqaiser94 mentioned this issue Aug 3, 2024
@liurenjie1024
Copy link
Collaborator

Hi, @fqaiser94 Thanks for raising this, and I'm definitely +1 for this proposal to avoid duplicate codes!

About the overall design, inspired by your design, I think maybe we can have origanize things like following:

struct CatalogTestSuite<C> {
    catalog: C,
    file_io: FileIO
}

impl<C: Catalog> CatalogTestSuite {
  async fn test_catalog_create_empty_namespace(self) {
     ...
  }

async fn test_catalog_create_empty_hierarchy_namespace(self) {
     ...
  }
  ...
}

Then in each catalog tests module, we can just have tests to setup this CatalogTestSuite and call different method in different case:

#[tokio::test]
async fn test_create_empty_namespace() {
   let suite = CatalogTestSuite {}
   suite.test_catalog_create_empty_namespace().await
}

With this approach we have boilerplate codes in each catalog test module, but I think this is a trade off. We can use macros or libtest_mimic to remove duplicate codes as much as possible, but the problem is that it makes running single test in ide a little more difficult.

The main concern I have is that the existing catalogs don't all follow exactly the same behaviour, there are small, subtle differences in behaviour like checking for tables before dropping a namespace, error messaging, etc. It may or may not be worth aligning behaviours but we can discuss all of that when we get there.

I think there are two problems here:

  1. Not all catalogs support exactly all test cases. For example, IIRC, hive catalog doesn't support nested namespaces, and rest catalog's support for namespaces is not determined yet.
  2. About the subtle behaviors differentct(such as check table), I think we should align them, but we don't need to do them together, we can do them one by one once we have finished the necessary preparatory work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

No branches or pull requests

2 participants