feat: look into ONNX enhanced transformer embeddings #14

Open
davidberenstein1957 opened this issue Nov 15, 2022 · 3 comments
Labels
enhancement New feature or request

Comments

@davidberenstein1957
Owner

Creating embeddings takes roughly 50% of the inference time. allennlp/modules/token_embedders/pretrained_transformer_embedder.py holds the logic for creating these embeddings. Make sure we can call them in a faster way.
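To track whether a faster embedding path actually moves that share, a small timing helper may be useful. This is only a sketch: `median_latency` and `embedding_share` are hypothetical names, and `embed_fn` and `full_fn` stand in for whatever callables wrap the embedder alone and the full inference call.

```python
import time
from statistics import median


def median_latency(fn, *args, repeats=20, warmup=3):
    """Return the median wall-clock seconds per call of fn(*args)."""
    # Warm-up calls are discarded so one-off setup costs do not skew the result.
    for _ in range(warmup):
        fn(*args)
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return median(samples)


def embedding_share(embed_fn, full_fn, batch, repeats=20, warmup=3):
    """Fraction of the full inference latency spent in the embedding step.

    embed_fn and full_fn are placeholders (assumptions): the first should run
    only the embedder on `batch`, the second the whole inference call.
    """
    embed = median_latency(embed_fn, batch, repeats=repeats, warmup=warmup)
    full = median_latency(full_fn, batch, repeats=repeats, warmup=warmup)
    return embed / full
```

Running `embedding_share` once with the current PyTorch embedder and once with an ONNX-backed one would give a before/after comparison on the same batch.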

@davidberenstein1957 davidberenstein1957 added the enhancement New feature or request label Nov 15, 2022
@davidberenstein1957
Owner Author

https://huggingface.co/docs/transformers/serialization
XLM model architectures are supported by Hugging Face.
POC:

python -m transformers.onnx --model="microsoft/Multilingual-MiniLM-L12-H384" onnx/

from transformers import AutoTokenizer
from onnxruntime import InferenceSession

tokenizer = AutoTokenizer.from_pretrained("microsoft/Multilingual-MiniLM-L12-H384")
session = InferenceSession("onnx/model.onnx")
# ONNX Runtime expects NumPy arrays as input
inputs = tokenizer("Using DistilBERT with ONNX Runtime!", return_tensors="np")
outputs = session.run(output_names=["last_hidden_state"], input_feed=dict(inputs))
print(outputs)

@davidberenstein1957
Owner Author

Align this effort with davidberenstein1957/fast-sentence-transformers#5.

Note that model quantization may degrade model performance, because the embeddings are used for a downstream task.
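One way to gauge that risk before touching the downstream task is to compare embeddings from the full-precision model and the quantized model row by row. The sketch below is an assumption-laden illustration: `rowwise_cosine` is a hypothetical helper, and the arrays are random placeholders (384 columns to match the H384 model) with small added noise standing in for quantization error, not real model output.

```python
import numpy as np


def rowwise_cosine(reference: np.ndarray, candidate: np.ndarray) -> np.ndarray:
    """Cosine similarity between matching rows of two embedding matrices."""
    ref = reference / np.linalg.norm(reference, axis=-1, keepdims=True)
    cand = candidate / np.linalg.norm(candidate, axis=-1, keepdims=True)
    return np.sum(ref * cand, axis=-1)


# Placeholder data (assumption): in practice `reference` would come from the
# full-precision model and `candidate` from the quantized one, on the same inputs.
rng = np.random.default_rng(0)
reference = rng.normal(size=(4, 384)).astype(np.float32)
noise = rng.normal(scale=0.01, size=reference.shape).astype(np.float32)
candidate = reference + noise

sims = rowwise_cosine(reference, candidate)
print("lowest similarity:", sims.min())
```

A similarity close to 1 on every row suggests the quantized embeddings stay near the originals, but the downstream task metric remains the deciding check.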

@davidberenstein1957
Owner Author

We probably also need this feature:
davidberenstein1957/fast-sentence-transformers#7
