
QAT folding update #1639

Merged
merged 26 commits into main from feature/qat_folding_update on Jul 10, 2023

Conversation

@anmarques (Member) commented Jun 23, 2023

Incorporates these changes into the ONNXToDeepsparse QAT folding:

In addition, this PR fixes the following issues:

  • Relaxed the pattern matching for Convolution folding, which was leaving some convolutions unfolded
  • Allowed the MatMul weights to appear as either input 0 or input 1
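The MatMul fix above can be sketched conceptually. The snippet below is a minimal illustration, not sparseml's actual implementation: the function name, input names, and initializer set are all hypothetical. In ONNX, constant weights are stored as graph initializers, so folding logic that assumes the weights are always a MatMul's input 1 misses graphs where they appear as input 0:

```python
# Hypothetical sketch: find which input of a MatMul node holds the
# constant weights by checking the graph's initializer names.
# Before the fix, folding effectively assumed the answer was always 1.

def find_weight_input(node_inputs, initializer_names):
    """Return the index (0 or 1) of the MatMul input that is a graph
    initializer (i.e. the weights), or None if neither input is constant."""
    for idx, name in enumerate(node_inputs[:2]):
        if name in initializer_names:
            return idx
    return None

initializers = {"encoder.weight"}

# Weights as input 1 (the common layout)...
print(find_weight_input(["hidden_state", "encoder.weight"], initializers))  # 1
# ...and as input 0, which the relaxed matching now also handles.
print(find_weight_input(["encoder.weight", "hidden_state"], initializers))  # 0
```

A real implementation would operate on `onnx.GraphProto` nodes and would also need to handle the surrounding QuantizeLinear/DequantizeLinear pattern, but the weight-position check reduces to this lookup.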

Testing plan:

  • Exported the following models without QAT folding, then manually applied ONNXToDeepsparse to fold the graphs:
      • YOLOv8n base_quant
      • MobileBERT 14layer_pruned50_quant-none-vnni
      • YOLOv5s pruned50_quant
      • DistilBERT one-shot pruned quantized

@anmarques anmarques requested a review from bfineran June 23, 2023 22:13
@anmarques anmarques requested a review from natuan June 27, 2023 14:58
abhinavnmagic previously approved these changes Jul 1, 2023
natuan previously approved these changes Jul 1, 2023
@anmarques anmarques dismissed stale reviews from abhinavnmagic and natuan via 7629f5e July 5, 2023 17:59
@bfineran (Member) commented:

GHA failure unrelated - merging given extensive testing and previous reviews

@bfineran bfineran merged commit d0ba055 into main Jul 10, 2023
9 of 10 checks passed
@bfineran bfineran deleted the feature/qat_folding_update branch July 10, 2023 20:16