
signed-int4 support for Pack/Unpack #3359

Merged: 1 commit merged into develop on Sep 18, 2024

Conversation

lakhinderwalia (Contributor)

No description provided.


codecov bot commented Aug 12, 2024

Codecov Report

Attention: Patch coverage is 91.66667% with 4 lines in your changes missing coverage. Please review.

Project coverage is 92.14%. Comparing base (51a24d8) to head (a38549f).
Report is 30 commits behind head on develop.

Files with missing lines | Patch % | Lines
src/include/migraphx/op/pack_int4.hpp | 84.00% | 4 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #3359      +/-   ##
===========================================
- Coverage    92.16%   92.14%   -0.02%     
===========================================
  Files          504      504              
  Lines        20486    20505      +19     
===========================================
+ Hits         18880    18895      +15     
- Misses        1606     1610       +4     


0xBA,
0xDC,
0xFF,
0xFE};
Member

Add "ref" tests for "unpack" as well.

Contributor Author

There already is a test for pack_unpack. Isn't that enough?

Contributor Author

Just to clarify further: there are ref tests that check the pack operation, each followed by its unpack counterpart, and the initial and final values are compared appropriately. Thanks.

@krzysz00 (Contributor) left a comment

I think the math's fine here, yeah

On the MLIR side, I'd prefer unpack to carry an attribute indicating signedness behavior instead, but that can be special-cased at the edge

int8_t val2 = inp[in_data_multi_idx];
val2 = std::min(std::max(val2, min_4bit), max_4bit); // clip

out[i] = (val2 << 4) | (val1 & 0xf); // pack
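For readers following along, the packing step above can be sketched as a self-contained free function. This is a minimal, hypothetical flat-array version for illustration only; the actual operator in src/include/migraphx/op/pack_int4.hpp walks a shaped multi-index along the pack axis:

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Pack pairs of signed 4-bit values (stored as int8_t) into single bytes.
// Each input is clipped to the signed int4 range [-8, 7] before packing.
// Hypothetical helper name; the MIGraphX operator works on shaped buffers.
std::vector<uint8_t> pack_int4_signed(const std::vector<int8_t>& inp)
{
    const int8_t min_4bit = -8;
    const int8_t max_4bit = 7;
    std::vector<uint8_t> out(inp.size() / 2);
    for(std::size_t i = 0; i < out.size(); ++i)
    {
        // Clip both values into the representable signed int4 range.
        int8_t val1 = std::min(std::max(inp[2 * i], min_4bit), max_4bit);
        int8_t val2 = std::min(std::max(inp[2 * i + 1], min_4bit), max_4bit);
        // Cast through the unsigned type before shifting so the bit
        // manipulation is well-defined for negative values, then mask
        // each nibble and combine: val2 in the high nibble, val1 low.
        out[i] = static_cast<uint8_t>((static_cast<uint8_t>(val2) << 4) |
                                      (static_cast<uint8_t>(val1) & 0xf));
    }
    return out;
}
```

The casts through uint8_t keep the shifts well-defined for negative inputs, which is also what silences the compiler warnings discussed in this review.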
Contributor

Re these warnings, you'll want to static_cast to the unsigned type and then static_cast back.

... Heck, personally, I'd prefer that the signedness of the packed values not be indicated by int8/uint8 in the packed buffer - instead, put a little flag on unpack indicating whether the uint8s represent signed or unsigned int4s.
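The cast-through-unsigned pattern the reviewer describes can be isolated into a one-liner (a hedged sketch with a hypothetical name, not code from the PR):

```cpp
#include <cstdint>

// Left-shifting a negative signed value triggers compiler warnings and is
// historically implementation-defined/UB. Routing the value through the
// unsigned type makes the shift well-defined, then the result is cast back
// to the signed result type (two's-complement wraparound).
inline int8_t shift_left4_signed(int8_t v)
{
    return static_cast<int8_t>(static_cast<uint8_t>(v) << 4);
}
```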

out[data_idx] = val1;

data_idx[axis] += 1;
val2 >>= 4;
Contributor

Yeah, this does look like a correct implementation of signed unpack
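For reference, the signed unpack being approved here can be sketched in flat-array form (a hypothetical helper; the real operator indexes along the pack axis, and its val2 >>= 4 relies on arithmetic right shift of the signed byte to sign-extend the high nibble):

```cpp
#include <cstdint>
#include <vector>

// Unpack each byte into two signed 4-bit values (low nibble first).
std::vector<int8_t> unpack_int4_signed(const std::vector<uint8_t>& inp)
{
    std::vector<int8_t> out;
    out.reserve(inp.size() * 2);
    for(uint8_t byte : inp)
    {
        // Low nibble: move it into the top 4 bits of a signed byte, then
        // arithmetic-shift right to sign-extend it back down.
        int8_t val1 = static_cast<int8_t>(static_cast<uint8_t>(byte << 4)) >> 4;
        // High nibble: an arithmetic right shift on the signed byte
        // sign-extends it directly.
        int8_t val2 = static_cast<int8_t>(byte) >> 4;
        out.push_back(val1);
        out.push_back(val2);
    }
    return out;
}
```

On all mainstream compilers (and by definition since C++20), right shift of a negative signed integer is arithmetic, which is what makes this sign extension work.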

@migraphx-bot (Collaborator)

Test    Batch    Rate new (a38549)    Rate old (e583e7)    Diff    Compare
torchvision-resnet50 64 3,251.44 3,250.33 0.03%
torchvision-resnet50_fp16 64 6,994.49 6,992.29 0.03%
torchvision-densenet121 32 2,434.77 2,433.11 0.07%
torchvision-densenet121_fp16 32 4,052.15 4,103.85 -1.26%
torchvision-inceptionv3 32 1,635.96 1,635.59 0.02%
torchvision-inceptionv3_fp16 32 2,739.84 2,740.97 -0.04%
cadene-inceptionv4 16 776.84 776.60 0.03%
cadene-resnext64x4 16 808.20 808.06 0.02%
slim-mobilenet 64 7,457.36 7,457.10 0.00%
slim-nasnetalarge 64 208.18 207.88 0.14%
slim-resnet50v2 64 3,354.95 3,343.78 0.33%
bert-mrpc-onnx 8 1,156.68 1,151.31 0.47%
bert-mrpc-tf 1 311.10 311.19 -0.03%
pytorch-examples-wlang-gru 1 417.53 424.43 -1.63%
pytorch-examples-wlang-lstm 1 380.25 377.46 0.74%
torchvision-resnet50_1 1 818.77 812.29 0.80%
cadene-dpn92_1 1 437.20 435.71 0.34%
cadene-resnext101_1 1 382.90 383.82 -0.24%
onnx-taau-downsample 1 346.30 344.50 0.52%
dlrm-criteoterabyte 1 35.08 35.09 -0.04%
dlrm-criteoterabyte_fp16 1 58.40 58.05 0.59%
agentmodel 1 7,851.88 7,873.33 -0.27%
unet_fp16 2 57.82 58.01 -0.33%
resnet50v1_fp16 1 941.42 917.06 2.66%
resnet50v1_int8 1 973.23 920.98 5.67% 🔆
bert_base_cased_fp16 64 1,154.04 1,151.61 0.21%
bert_large_uncased_fp16 32 356.22 355.61 0.17%
bert_large_fp16 1 211.52 210.45 0.51%
distilgpt2_fp16 16 2,163.52 2,156.95 0.30%
yolov5s 1 538.96 535.99 0.55%
tinyllama 1 43.40 43.42 -0.04%
vicuna-fastchat 1 172.20 174.03 -1.06%
whisper-tiny-encoder 1 411.41 406.71 1.15%
whisper-tiny-decoder 1 435.05 425.82 2.17%

Check results before merge 🔆

@migraphx-bot (Collaborator)


     ✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

     ✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

     ✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

     ✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

     ✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

     ✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

     ✅ agentmodel: PASSED: MIGraphX meets tolerance

     ✅ unet: PASSED: MIGraphX meets tolerance

     ✅ resnet50v1: PASSED: MIGraphX meets tolerance

     ✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

     🔴 bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output


     ✅ bert_large: PASSED: MIGraphX meets tolerance

     ✅ yolov5s: PASSED: MIGraphX meets tolerance

     ✅ tinyllama: PASSED: MIGraphX meets tolerance

     ✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

     ✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

     ✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

@pfultz2 (Collaborator) left a comment

LGTM, but can we get another reviewer from the MIGraphX team?

@causten causten merged commit 5e8b60a into develop Sep 18, 2024
47 of 48 checks passed
@causten causten deleted the lw/int4_pack_signed_fixes branch September 18, 2024 12:57
Successfully merging this pull request may close these issues.

Add signed int4 support for pack_int4 operator. Currently only supports unsigned.
6 participants