Feat/add t5 support #131
Conversation
@pommedeterresautee I'm still testing Triton, but I need your review for the other parts (ONNX and TensorRT conversion), and I'd like to see if those conversions run on your machine without errors.
It's only the first part of the review because things are still in progress (Triton, etc.).
```diff
@@ -26,17 +26,24 @@
 def infer_classification_pytorch(
-    model: PreTrainedModel, run_on_cuda: bool
+    model: PreTrainedModel, run_on_cuda: bool, generate_text: bool = False
```
Maybe it's misleading that we have `generate_text` in a function called `infer_classification_pytorch`.
Maybe it could be removed, depending on my comment below.
I removed it and added a dedicated inference function for text generation with appropriate parameters:

```python
def infer_text_generation(
    model: PreTrainedModel, run_on_cuda: bool, min_length: int, max_length: int, num_beams: int
) -> Callable[[Dict[str, torch.Tensor]], torch.Tensor]:
```
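For context, a minimal sketch of what the body of such a function could look like, assuming the model exposes the standard Hugging Face `generate` API (the exact arguments in the PR may differ):

```python
from typing import Callable, Dict

import torch
from transformers import PreTrainedModel


def infer_text_generation(
    model: PreTrainedModel, run_on_cuda: bool, min_length: int, max_length: int, num_beams: int
) -> Callable[[Dict[str, torch.Tensor]], torch.Tensor]:
    def infer(inputs: Dict[str, torch.Tensor]) -> torch.Tensor:
        if run_on_cuda:
            inputs = {k: v.cuda() for k, v in inputs.items()}
        with torch.inference_mode():
            # beam-search decoding, not a single forward pass
            return model.generate(
                inputs["input_ids"],
                min_length=min_length,
                max_length=max_length,
                num_beams=num_beams,
            )

    return infer
```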
src/transformer_deploy/convert.py (outdated)

```python
    if run_on_cuda:
```
Maybe move this line after `model_pytorch.eval()`.
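That is, something like this (a sketch, assuming the line in question moves the model to GPU):

```python
model_pytorch.eval()      # disable dropout etc. before anything else
if run_on_cuda:
    model_pytorch.cuda()  # then move the weights to GPU
```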
done
src/transformer_deploy/convert.py (outdated)

```python
timings = {}


def get_pytorch_infer(model: PreTrainedModel, cuda: bool, task: str):
    if commands.generative_model == "t5":
        return infer_classification_pytorch(model=model, run_on_cuda=cuda, generate_text=True)
```
This seems to be used for the benchmark and accuracy check. Do we need to do it with `.generate()`, rather than simply calling the model once like it was done before, e.g. for GPT-2?
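To illustrate the difference being discussed (a sketch; names are illustrative, not the PR code):

```python
import torch

# single forward pass, as previously done for GPT-2:
# cheap, deterministic, and enough to compare output logits for an accuracy check
with torch.inference_mode():
    logits = model(**inputs).logits

# full auto-regressive decoding: calls the model once per generated token,
# so it is much slower and measures the decoding loop as well as the model
generated_ids = model.generate(inputs["input_ids"], num_beams=2, max_length=128)
```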
src/transformer_deploy/convert.py (outdated)

```python
    if task in ["classification", "text-generation", "token-classification", "question-answering"]:
        return infer_classification_pytorch(model=model, run_on_cuda=cuda)
    if task == "embedding":
        return infer_feature_extraction_pytorch(model=model, run_on_cuda=cuda)
    raise Exception(f"unknown task: {task}")


if run_on_cuda:
```
Line duplicated?
removed
src/transformer_deploy/convert.py (outdated)

```python
)
if commands.generative_model == "t5":
    encoder_onnx_path = os.path.join(commands.output, "t5-encoder") + "/model.onnx"
    tensorrt_encoder_path = os.path.join(commands.output, "t5-encoder") + "/model.plan"
```
IMO it's better to declare variables close to where they are used.
done
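As a side note, the file name can be passed directly to `os.path.join`, which avoids the hard-coded `/` separator (a sketch of the same paths):

```python
import os

encoder_onnx_path = os.path.join(commands.output, "t5-encoder", "model.onnx")
tensorrt_encoder_path = os.path.join(commands.output, "t5-encoder", "model.plan")
```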
src/transformer_deploy/convert.py (outdated)

```python
results = (
    inference_onnx_binding(model_onnx=ort_model, inputs=inputs, device=commands.device)
    if commands.generative_model != "t5"
    else ort_model.generate(
```
Same here, do we need `generate`? It's for the accuracy check.
```python
    inference: Callable[[torch.Tensor], torch.Tensor],
    torch_type: torch.dtype = torch.float32,
):
    super().__init__(config, device, encoder_path, decoder_path, torch_type)
```
I don't understand the purpose of those arguments
removed
```python
    return model_decoder(input_ids=decoder_input_ids, encoder_hidden_states=encoder_hidden_states)


def are_equal(a: torch.Tensor, b: torch.Tensor, atol: float = fp16_default_tolerance) -> None:
```
IMO we need only one such function in the whole project.
I kept this one because the other function is in the demo folder.
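For reference, a minimal sketch of what such a comparison helper can look like (the tolerance value is an assumption, not the project's constant):

```python
import torch

fp16_default_tolerance = 0.1  # assumed value, for illustration only


def are_equal(a: torch.Tensor, b: torch.Tensor, atol: float = fp16_default_tolerance) -> None:
    assert a.shape == b.shape, f"shape mismatch: {a.shape} vs {b.shape}"
    max_diff = (a.float() - b.float()).abs().max().item()
    assert max_diff <= atol, f"tensors differ: max abs diff {max_diff} > atol {atol}"
```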
```python
for (o_dec_k, o_dev_v, o_enc_k, o_enc_v), (p_dec_k, p_dev_v, p_enc_k, p_enc_v) in zip(
    out_decoder_onnx_no_cache.past_key_values, out_decoder_pytorch.past_key_values
):
    are_equal(a=o_dec_k, b=p_dec_k)
```
Do we need to do all these quality checks? Isn't the test in convert enough? In the other convert_to_onnx we don't do this, so if we keep something like that it should be extracted, IMO.
I removed the quality check (I think the test in conversion is enough)
```python
    return inputs


def decoder_onnx_inference(
```
Code duplicated?
removed
Everything looks good to me, tested locally.
Add T5 support to the convert script (transformer-deploy).
WIP