[Feature Branch][LLM Testing] Full Testing Harness for LLMs #1216

Merged
merged 20 commits into main from feature/damian/llm_testing_feature_branch
Sep 13, 2023

Contributor

@dbogunowicz dbogunowicz commented Aug 29, 2023

This PR implements the test harness for LLMs. By default, the tests are skipped so that they do not overload GitHub Actions (GHA).
To enable the tests, remove the following decorator:
@pytest.mark.skip(reason="Those tests are too heavy to run as a normal part of the CI.")
Then run:
pytest tests/deepsparse/transformers/pipelines/test_text_generation.py
Future consideration: add a config and use small toy models to make the tests extremely lightweight.
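
As an illustration only, the skip decorator described above could be made switchable without editing the test file. The sketch below is not part of this PR: it assumes a hypothetical environment variable named RUN_LLM_TESTS and a hypothetical marker name heavy_llm_test, and it replaces the unconditional skip with pytest.mark.skipif.

```python
import os

import pytest

# Hypothetical environment variable; the PR itself uses an unconditional skip.
RUN_LLM_TESTS_ENV = "RUN_LLM_TESTS"


def llm_tests_enabled(environ=None):
    """Return True when the heavy LLM tests were explicitly requested."""
    environ = os.environ if environ is None else environ
    return environ.get(RUN_LLM_TESTS_ENV, "0").lower() in {"1", "true", "yes"}


# Reusable marker (hypothetical name): skips unless the variable is set.
heavy_llm_test = pytest.mark.skipif(
    not llm_tests_enabled(),
    reason="Those tests are too heavy to run as a normal part of the CI.",
)


@heavy_llm_test
class TestTextGenerationSketch:
    def test_placeholder(self):
        # Stand-in body; the real tests live in test_text_generation.py.
        assert True
```

Under these assumptions, regular CI would keep skipping the tests, while a run such as RUN_LLM_TESTS=1 pytest tests/deepsparse/transformers/pipelines/test_text_generation.py would execute them.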

Includes PRs:

dbogunowicz and others added 8 commits August 28, 2023 08:54
* initial commit

* finish creation of helper objects

* Update tests/conftest.py

* small refactor

* [Feature Branch][LLM Testing] LLM Testing Suite (#1227)

* Update README.md

* Update src/deepsparse/yolov8/README.md

* Update text_generation.py

* quality

* readability

* all tests passing

* added some full kv cache tests

* initial commit

* ready for review

* Delete tests/deepsparse/transformers/pipelines/proposal_text_generation_tests.md
Contributor

@dsikka dsikka left a comment

How confident are we in our test coverage? Could we add tests that run with deterministic mode off, or with multiple input sequences?
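
A rough sketch of what such coverage could look like, using pytest.mark.parametrize. Everything here is hypothetical: toy_generate is a stand-in written for this example, not the deepsparse pipeline, and its parameters (deterministic, max_new_tokens, seed) are assumptions chosen for illustration.

```python
import random

import pytest

# Toy stand-in for a text-generation pipeline; every name here is hypothetical.
VOCAB = ["a", "b", "c", "d"]


def toy_generate(prompts, max_new_tokens=4, deterministic=True, seed=None):
    """Return one list of generated tokens per prompt."""
    rng = random.Random(seed)
    outputs = []
    for prompt in prompts:
        if deterministic:
            # "Greedy" choice: depends only on the prompt.
            tokens = [
                VOCAB[(len(prompt) + i) % len(VOCAB)]
                for i in range(max_new_tokens)
            ]
        else:
            # Sampling: depends on the random generator state.
            tokens = [rng.choice(VOCAB) for _ in range(max_new_tokens)]
        outputs.append(tokens)
    return outputs


@pytest.mark.parametrize("deterministic", [True, False])
@pytest.mark.parametrize("prompts", [["hello"], ["hello", "a longer prompt"]])
def test_generation_shape(prompts, deterministic):
    outputs = toy_generate(prompts, max_new_tokens=4, deterministic=deterministic)
    # One output per input sequence, each within the token budget.
    assert len(outputs) == len(prompts)
    assert all(len(tokens) <= 4 for tokens in outputs)
    if deterministic:
        # Only deterministic runs can be compared token by token.
        repeat = toy_generate(prompts, max_new_tokens=4, deterministic=True)
        assert outputs == repeat
```

The design point is that sampled output cannot be compared token by token against a ground truth, so the non-deterministic cases only check structural properties such as the number of outputs and the token budget.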

Contributor

@dsikka dsikka left a comment

  • Remove ORT ground truth class and use deepsparse pipeline instead

Member

@bfineran bfineran left a comment

LGTM overall. As discussed offline, this will need some refactors to move cleanly to a config-based method.

Contributor

@dsikka dsikka left a comment

LGTM. Is support for running this on a nightly basis still pending?

@dbogunowicz dbogunowicz merged commit 907ea83 into main Sep 13, 2023
7 of 13 checks passed
@dbogunowicz dbogunowicz deleted the feature/damian/llm_testing_feature_branch branch September 13, 2023 14:49
3 participants