[KV Cache Injection] Causal Mask for CodeGen #1676

@dbogunowicz dbogunowicz commented Jul 18, 2023

This PR:

  • implements causal mask injection for the CodeGen model
  • begins refactoring the AdditionalTransforms objects so that AdditionalTransformsBase provides more of the helper methods shared across its child classes, reducing boilerplate code. This refactoring will continue with the upcoming causal mask injections for other LLMs
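
For readers unfamiliar with the term, here is a minimal sketch of what a causal mask looks like when a KV cache is in play. It is not the code from this PR: the function name `build_causal_mask`, its parameters, and the `1 = attend, 0 = masked` convention are assumptions made for illustration, and plain Python lists stand in for the tensor that would be an input to the exported model. The integer values loosely mirror the later switch from a boolean to an int64 mask.

```python
from typing import List


def build_causal_mask(num_new_tokens: int, cache_len: int = 0) -> List[List[int]]:
    """Hypothetical helper: returns a (num_new_tokens x total_keys) mask.

    Assumed convention: 1 means the query may attend to the key, 0 means masked.
    """
    # Keys cover the cached positions plus the tokens being processed now.
    total_keys = cache_len + num_new_tokens
    mask = []
    for query_idx in range(num_new_tokens):
        # Absolute position of this query once the cached tokens are counted.
        query_pos = cache_len + query_idx
        # A query sees every key at or before its own position, nothing after.
        mask.append([1 if key_pos <= query_pos else 0 for key_pos in range(total_keys)])
    return mask


# Prompt processing: three tokens, empty cache, lower-triangular mask.
prompt_mask = build_causal_mask(num_new_tokens=3)

# Decoding: one new token with three cached tokens, everything is visible.
decode_mask = build_causal_mask(num_new_tokens=1, cache_len=3)
```

This sketch assumes the cache holds exactly `cache_len` valid entries; a padded cache would additionally need its padding positions masked out.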

For testing please refer to: neuralmagic/deepsparse#1127

@dbogunowicz dbogunowicz changed the title [WiP] [CodeGen][Causal Mask] [KV Cache Injection] Causal Mask for CodeGen Jul 20, 2023
@dbogunowicz dbogunowicz changed the base branch from main to feature/damian/refactor_injection July 20, 2023 06:41
@dbogunowicz dbogunowicz marked this pull request as ready for review July 20, 2023 06:41
@dbogunowicz dbogunowicz changed the base branch from feature/damian/refactor_injection to main July 20, 2023 06:48
@dbogunowicz dbogunowicz changed the base branch from main to feature/damian/refactor_injection July 20, 2023 06:48
@dbogunowicz dbogunowicz changed the base branch from feature/damian/refactor_injection to main July 20, 2023 07:49
@dbogunowicz dbogunowicz changed the base branch from main to feature/damian/refactor_injection July 20, 2023 07:50
@@ -1174,3 +1177,41 @@ def detach(x: Union[torch.Tensor, List, Tuple]):
        return tuple([detach(e) for e in x])
    else:
        raise ValueError("Unexpected type to detach")


def adjust_quantization_for_onnx_export(module: torch.nn.Module) -> torch.nn.Module:

this needs rebase

@bfineran bfineran left a comment

LGTM pending rebase

@dbogunowicz dbogunowicz merged commit 9368269 into feature/damian/refactor_injection Jul 25, 2023
@dbogunowicz dbogunowicz deleted the feature/damian/causal_mask_codegen branch July 25, 2023 14:35
bfineran pushed a commit that referenced this pull request Jul 27, 2023
…1677)

* initial commit

* [KV Cache Injection] Causal Mask for CodeGen (#1676)

* initial implementation; testing now

* fix a small blunder

* cleanup

---------

Co-authored-by: bogunowicz@arrival.com <bogunowicz@arrival.com>

* [KV Cache Injection] Causal Mask for OPT (#1688)

* initial implementation; testing now

* fix a small blunder

* cleanup

* initial implementation

* on to testing with deepsparse

---------

Co-authored-by: bogunowicz@arrival.com <bogunowicz@arrival.com>

* replace boolean causal mask for int64 causal mask

* better logging info

* allow transformations to be also a list

---------

Co-authored-by: bogunowicz@arrival.com <bogunowicz@arrival.com>