
move rag answer correctness metrics tests to test_metrics.py #1131

Closed
wants to merge 1 commit

Conversation


@matanor commented Aug 11, 2024

No description provided.

@elronbandel left a comment


I think this test should be in the preparation card, without the if __name__ == "__main__": guard, so it runs every time someone imports this file. The difference is that here it is an activation of an existing class (MetricPipeline and its inner metric). That way it will be covered by the preparation tests run for every PR.
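The pattern suggested here, an import-time check in the prepare file, might look roughly like this (a minimal sketch with stand-in classes; InnerMetric and the toy scoring logic are hypothetical, not the actual unitxt MetricPipeline API):

```python
# Hedged sketch: a check placed at module top level in a prepare file,
# so merely importing the file during catalog-preparation tests
# exercises the pipeline. All names below are illustrative stand-ins.

class InnerMetric:
    def compute(self, prediction, reference):
        # Toy exact-match score standing in for the real inner metric.
        return 1.0 if prediction == reference else 0.0

class MetricPipeline:
    def __init__(self, metric):
        self.metric = metric

    def score(self, prediction, reference):
        # Delegates to the wrapped inner metric.
        return self.metric.compute(prediction, reference)

pipeline = MetricPipeline(InnerMetric())

# Runs on import: any PR whose preparation tests import this file
# fails if the pipeline stops producing the expected scores.
assert pipeline.score("yes", "yes") == 1.0
assert pipeline.score("yes", "no") == 0.0
```

Because the assertions sit at module level rather than under a main guard, every catalog-preparation run that imports the file re-validates the pipeline.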

@elronbandel

Related: #1132

@matanor commented Aug 11, 2024

> I think this test should be in the preparation card, without the if __name__ == "__main__": guard, so it runs every time someone imports this file. The difference is that here it is an activation of an existing class (MetricPipeline and its inner metric). That way it will be covered by the preparation tests run for every PR.

Hmm... for context_correctness we just made the same transition in #1092 (which you advised), and it is the same kind of test (a MetricPipeline over existing metrics).

Also, if you look at https://github.com/IBM/unitxt/blame/main/prepare/metrics/rag_answer_correctness.py#L63, @eladven explicitly added the if __name__ == "__main__": statement 3 weeks ago (I think because these metrics use models).

@matanor commented Aug 14, 2024

@elronbandel @dafnapension @yoavkatz

So, what should I do with this PR? Close it?

Generally, where should tests that involve models be placed?

@yoavkatz

There are two issues here: (1) where to place the tests, and (2) when we run the tests.

The rule is that testing Python metric classes should be done in test_metrics, while checking catalog assets should be done in the prepare file where the catalog assets are added. Since in your case you are testing assets, the test should be in the prepare file.

The second point is that we cannot afford to run tests that require loading large models on every PR, because each PR spins up a new VM that would have to download those models. This is why long tests are placed under the if __name__ == "__main__": condition, so they run only when the file is invoked explicitly.
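The split being described could be sketched as follows (hypothetical names throughout; CATALOG, add_to_catalog, and expensive_model_check are stand-ins, not the real unitxt registration API):

```python
# Sketch of the guard pattern under discussion: cheap catalog
# registration runs at import time, while the expensive model-based
# check runs only when the prepare script is executed directly.

CATALOG = {}

def add_to_catalog(name, artifact):
    # Stand-in for the real catalog registration; runs on every import.
    CATALOG[name] = artifact

# Cheap: always executed, so every catalog-preparation CI run covers it.
add_to_catalog("metrics.rag.answer_correctness", {"type": "MetricPipeline"})

def expensive_model_check():
    # Stand-in for a test that would download and run a large model.
    return "model check passed"

if __name__ == "__main__":
    # Runs only via `python prepare/metrics/<file>.py`, never on a
    # plain import, so PR-time CI skips the model download entirely.
    print(expensive_model_check())
```

Importing this module registers the asset but never triggers the model check; running it directly does both, which matches the "only if explicitly run" behavior described above.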

@elronbandel - how do you suggest we avoid this while still keeping reasonable runtimes?

@matanor matanor changed the title move rag context correctness metrics tests to test_metrics.py move rag answer correctness metrics tests to test_metrics.py Aug 15, 2024
@matanor commented Aug 15, 2024

Thanks @yoavkatz. This PR tests with a model, so I will close it (and we keep the test in the prepare script, under the if __name__ == "__main__": condition).

A question about another test, please:
The test_context_correctness test is in test_metrics.py. It tests the RAG context correctness metrics, which are MetricPipeline objects.
Is this considered testing catalog assets? Should I move it to rag_context_correctness.py?

Generally, tests in test_metrics.py and tests placed in the prepare cards (without the if __name__ == "__main__": condition) are both executed with each PR, right?

@yoavkatz

> Thanks @yoavkatz. This PR tests with a model, so I will close it (and we keep the test in the prepare script, under the if __name__ == "__main__": condition).
>
> A question about another test, please: the test_context_correctness test is in test_metrics.py. It tests the RAG context correctness metrics, which are MetricPipeline objects. Is this considered testing catalog assets? Should I move it to rag_context_correctness.py?
>
> Generally, tests in test_metrics.py and tests placed in the prepare cards (without the if __name__ == "__main__": condition) are both executed with each PR, right?

Yes. It makes sense to move test_context_correctness to the prepare file.

And yes, the test_metrics tests run in the Test Library Code GitHub action, and the prepare cards run in the Test Catalog Preparation GitHub action. Both run on every PR.


3 participants