
[Bug] (suggested fix) mmrazor.models.algorithms.mm_architecture.MMArchitectureQuant.get_deploy_model() fails if predict mode lacks nodes from the model.quantizer.tracer.skipped_methods configuration, but the architecture quantizer.prepare(fp32_model) has these nodes. #642

Open
elisa-aleman opened this issue Apr 23, 2024 · 4 comments
Labels
bug Something isn't working

Comments

elisa-aleman commented Apr 23, 2024

Describe the bug

Looking to do QAT of the TopdownPoseEstimator model, I skipped RTMCCHead.predict and RTMCCHead.loss (via model.quantizer.tracer.skipped_methods) to avoid errors during tracing; a sketch of that configuration is below. The model trains fine as a fake-quantized model.
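
For reference, the skipping was configured roughly like this. This is only a sketch: the quantizer type, qconfig and the exact dotted paths to the RTMCCHead methods are illustrative assumptions, only the skipped_methods mechanism itself is mmrazor's.

# hedged sketch of the relevant part of the QAT config, not the exact config used
model = dict(
    type='mmrazor.MMArchitectureQuant',
    architecture=...,  # the TopdownPoseEstimator config goes here
    quantizer=dict(
        type='mmrazor.OpenVINOQuantizer',
        global_qconfig=...,
        tracer=dict(
            type='mmrazor.CustomTracer',
            # methods that break symbolic tracing are skipped by the tracer
            skipped_methods=[
                'mmpose.models.heads.RTMCCHead.predict',
                'mmpose.models.heads.RTMCCHead.loss',
            ])))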

Once running the deploy step with a local merge of the mmdeploy for_mmrazor branch into the v1.3.1 branch, and applying the patches described in #632, #633, #634 and #637, I run into an error.


  File .../mmrazor/models/algorithms/quantization/mm_architecture.py, line 362, in get_deploy_model
    observed_model.load_state_dict(quantized_state_dict)
  File .../torch/nn/modules/module.py, line 2041, in load_state_dict
    raise RuntimeError("Error(s) in loading state_dict for {}:\n\t{}".format(
RuntimeError: Error(s) in loading state_dict for GraphModule

I suggest using the tensor mode instead, but I've yet to test it.

def get_deploy_model(self):
    """Prepare for deploy to the backend with mmdeploy, which will be used
    in mmdeploy, and usually includes as follows:

    1. prepare for the float model rewritten by mmdeploy.
    2. load checkpoint consists of float weight and quantized params in
    mmrazor.
    3. post process weight fakequant for exporting .onnx that meet
    the backend's requirement.
    """
    device = next(self.parameters()).device
-   quantized_state_dict = self.qmodels['predict'].state_dict()
+   quantized_state_dict = self.qmodels['tensor'].state_dict()
    fp32_model = self.architecture
    self.quantizer.convert_batchnorm2d(fp32_model)
    observed_model = self.quantizer.prepare(fp32_model)
    observed_model.load_state_dict(quantized_state_dict)

    self.quantizer.post_process_for_deploy(
        observed_model,

EDIT: after testing this I got this error:

  File .../mmrazor/models/algorithms/quantization/mm_architecture.py, line 376, in get_deploy_model
    fakequant_new = QConfigHandler.replace_fakequant(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File .../torch/nn/modules/module.py, line 1614, in __getattr__
    raise AttributeError(...
AttributeError: 'FixedQParamsObserver' object has no attribute 'min_val'

I gather this means that, to fix this, one needs to use the predict mode state dict even with the missing keys, somehow.
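
One untested way to tolerate the missing keys when loading the predict-mode state dict would be the standard strict=False option of torch.nn.Module.load_state_dict; just a sketch, I haven't verified the resulting model:

# untested sketch: ignore keys present in the predict-mode state dict but not
# in the freshly prepared observed_model (and vice versa), and report them
missing, unexpected = observed_model.load_state_dict(
    quantized_state_dict, strict=False)
print('missing keys:', missing)
print('unexpected keys:', unexpected)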

elisa-aleman (Author) commented

Looking more into it, it seems the tensor mode fix is correct, but the next error has to do with the model containing some modules that can't be fake-quantized, such as Hard Sigmoid; these are the ops listed in torch.ao.quantization.qconfig_mapping._FIXED_QPARAMS_OP_TO_OBSERVER.
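
To see which ops fall into that category (and therefore get a FixedQParamsObserver that carries no min_val/max_val), the mapping can be printed directly. A minimal sketch; note this is a private PyTorch attribute, so its location may vary between versions:

# list the ops that PyTorch pins to fixed-qparams observers
# (Hardsigmoid, Sigmoid, Tanh, Softmax, ...)
from torch.ao.quantization.qconfig_mapping import _FIXED_QPARAMS_OP_TO_OBSERVER

for op, observer in _FIXED_QPARAMS_OP_TO_OBSERVER.items():
    print(op, '->', observer)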

Veccoy commented Jun 14, 2024

Hi! How did you use the output of the method get_deploy_model of MMArchitectureQuant to then convert the model to onnx?

I'm trying to export a quantized model using PyTorch 1.13.1, and using the MMDeploy for_mmrazor branch didn't work for me... So, looking at MMRazor, I have created a hook, called at the end of training, that gets the output of get_deploy_model (a torch ObservedGraphModule) and passes it to torch.onnx.export with some arguments (a sketch of this idea follows below).

I'm not sure whether what I'm doing is right, but I also don't understand why one would use MMDeploy if MMRazor provides quantizers with export_onnx methods...

EDIT: I have an exported ONNX file, but it seems that only the weights are quantized and not the activations.
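
Not the actual code, but a minimal sketch of the kind of hook I mean, assuming runner.model is the MMArchitectureQuant instance (possibly wrapped, e.g. by DDP); the class name, input shape and export arguments are illustrative:

# illustrative sketch only; class name, input shape and export arguments are
# assumptions, not the exact code used
import torch
from mmengine.hooks import Hook
from mmengine.registry import HOOKS


@HOOKS.register_module()
class ExportQuantizedOnnxHook(Hook):
    """Export the deployable quantized model to ONNX at the end of training."""

    def __init__(self, out_file='quantized.onnx', input_shape=(1, 3, 256, 192)):
        self.out_file = out_file
        self.input_shape = input_shape

    def after_train(self, runner):
        model = runner.model
        if hasattr(model, 'module'):  # unwrap a possible DDP wrapper
            model = model.module
        deploy_model = model.get_deploy_model()  # an ObservedGraphModule
        deploy_model.eval()
        dummy_input = torch.randn(self.input_shape)
        torch.onnx.export(
            deploy_model,
            dummy_input,
            self.out_file,
            opset_version=13,
            input_names=['input'],
            output_names=['output'])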

elisa-aleman (Author) commented

I used a merge of the most recent mmdeploy with the for_mmrazor branch.

Veccoy commented Jun 17, 2024

Which script(s) are you using? What is the purpose of having a get_deploy_model method in MMRazor that returns a GraphModule and a deploy.py script in MMDeploy that takes a checkpoint file as input? I'm confused. Moreover, we already have an export_onnx method in TorchNativeQuantizer of MMRazor, and it seems that the get_deploy_model method is never called in the QAT training of MMRazor.
