[Quantization speedup] Support TensorRT 8.0.0 #3866

Merged: 3 commits, Jul 9, 2021


linbinskn
Contributor

This PR adds support for the latest TensorRT version. The current quantization speedup tool is implemented against TensorRT 7.0. Starting with version 8.0, however, the TensorRT Python API separates network definition from build configuration: all low-precision settings have moved to IBuilderConfig, so our current implementation does not work with it (the problem raised in issue #3857). This PR supports the new TensorRT version and its new API.
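For readers unfamiliar with the API change, here is a minimal sketch of the TensorRT 8 style, in which precision flags are set on an IBuilderConfig rather than on the builder. It is an illustration only, not the code in this PR: the helper name `build_low_precision_config` and its parameters are hypothetical, and the `tensorrt` module is passed in as an argument so the helper can be exercised without a GPU.

```python
def build_low_precision_config(trt, builder, use_int8=True, use_fp16=False):
    """Return a builder config with the requested low-precision flags set.

    `trt` is the imported `tensorrt` module and `builder` is a `trt.Builder`.
    The helper name and signature are hypothetical, for illustration only.
    """
    # TensorRT 7 style set precision on the builder itself, e.g.
    # `builder.int8_mode = True`. From 8.0 on, these settings live on the config.
    config = builder.create_builder_config()
    if use_int8:
        # INT8 additionally needs a calibrator or per-tensor dynamic ranges.
        config.set_flag(trt.BuilderFlag.INT8)
    if use_fp16:
        config.set_flag(trt.BuilderFlag.FP16)
    return config


# Typical usage, assuming TensorRT >= 8.0 is installed and `network` is
# an already populated network definition:
#
#   import tensorrt as trt
#   builder = trt.Builder(trt.Logger(trt.Logger.WARNING))
#   config = build_low_precision_config(trt, builder, use_int8=True)
#   serialized_engine = builder.build_serialized_network(network, config)
```

Keeping the flags on the config means the same network definition can be built at different precisions without being modified.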

@J-shang
Contributor

J-shang commented Jul 1, 2021

Does this mean we are upgrading the TensorRT dependency to >= 8.0? Is this upgrade available to most users?
Also, please update the quantization speedup doc for this change.

@linbinskn
Contributor Author

> Does this mean we are upgrading the TensorRT dependency to >= 8.0? Is this upgrade available to most users?
> Also, please update the quantization speedup doc for this change.

@J-shang Good point! I have updated the quantization speedup doc. I think the separation of network definition and configuration in the latest TensorRT version is reasonable, and supporting the latest version is necessary for us. I believe most people will use the latest version, especially those who want to try mixed precision in TensorRT.

@J-shang J-shang merged commit a4760ce into microsoft:master Jul 9, 2021

3 participants