
Improvement to ck integration #1859

Merged
merged 29 commits into from
Jul 2, 2023

Conversation

Collaborator

@pfultz2 pfultz2 commented Jun 20, 2023

  • Add a CI job to test CK
  • Add a MIGRAPHX_TUNE_CK env variable to only do tuning for CK
  • Continue tuning even when there are invalid configs
  • Fix a bug where parallel compilation did not use all available threads
  • Add additional tests for GEMMs using half types
  • Removed int32 as a supported type since it doesn't pass our test suite

@codecov

codecov bot commented Jun 20, 2023

Codecov Report

Merging #1859 (60e1f8e) into develop (1f827a7) will not change coverage.
The diff coverage is n/a.

❗ Current head 60e1f8e differs from the pull request's most recent head 6696d44. Consider uploading reports for commit 6696d44 to get more accurate results.

@@           Coverage Diff            @@
##           develop    #1859   +/-   ##
========================================
  Coverage    91.39%   91.39%           
========================================
  Files          419      419           
  Lines        15542    15542           
========================================
  Hits         14204    14204           
  Misses        1338     1338           

src/targets/gpu/compile_ops.cpp (resolved)
@@ -185,7 +203,10 @@ void par_compile(std::size_t n, F f)
 {
     if(n == 0)
         return;
-    par_for(n, n / value_of(MIGRAPHX_GPU_COMPILE_PARALLEL{}, n), f);
+    auto d = value_of(MIGRAPHX_GPU_COMPILE_PARALLEL{});
Member
Suggested change
-auto d = value_of(MIGRAPHX_GPU_COMPILE_PARALLEL{});
+auto d = value_of(MIGRAPHX_GPU_COMPILE_PARALLEL{}, n);

Collaborator Author

That won't work, because n will be cached when MIGRAPHX_GPU_COMPILE_PARALLEL is not set.

src/targets/gpu/jit/ck_gemm.cpp (resolved)
@migraphx-bot
Collaborator

migraphx-bot commented Jun 29, 2023

Test | Batch | Rate new (6696d4) | Rate old (d6aade) | Diff
torchvision-resnet50 64 894.85 895.65 -0.09%
torchvision-resnet50_fp16 64 5,313.14 5,315.31 -0.04%
torchvision-densenet121 32 1,123.99 1,126.78 -0.25%
torchvision-densenet121_fp16 32 3,275.44 3,282.20 -0.21%
torchvision-inceptionv3 32 592.96 593.29 -0.06%
torchvision-inceptionv3_fp16 32 2,494.23 2,517.31 -0.92%
cadene-inceptionv4 16 328.70 329.04 -0.10%
cadene-resnext64x4 16 396.92 397.15 -0.06%
slim-mobilenet 64 7,132.06 7,136.84 -0.07%
slim-nasnetalarge 64 159.84 160.01 -0.11%
slim-resnet50v2 64 1,089.43 1,090.55 -0.10%
bert-mrpc-onnx 8 718.90 719.50 -0.08%
bert-mrpc-tf 1 369.81 370.16 -0.10%
pytorch-examples-wlang-gru 1 294.66 304.54 -3.25% 🔴
pytorch-examples-wlang-lstm 1 298.45 308.91 -3.39% 🔴
torchvision-resnet50_1 1 91.44 91.91 -0.51%
torchvision-inceptionv3_1 1 128.98 128.80 0.14%
cadene-dpn92_1 1 333.88 336.62 -0.82%
cadene-resnext101_1 1 236.49 237.34 -0.36%
slim-vgg16_1 1 53.34 53.36 -0.04%
slim-mobilenet_1 1 1,485.31 1,523.34 -2.50%
slim-inceptionv4_1 1 101.96 101.62 0.34%
onnx-taau-downsample 1 316.36 316.09 0.08%
dlrm-criteoterabyte 1 21.42 21.42 0.01%
dlrm-criteoterabyte_fp16 1 39.95 39.96 -0.03%
agentmodel 1 5,413.13 5,733.29 -5.58% 🔴
unet_fp16 2 52.63 52.83 -0.37%

This build is not recommended to merge 🔴

@migraphx-bot
Collaborator


✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ torchvision-inceptionv3_1: PASSED: MIGraphX meets tolerance

🔴 cadene-dpn92_1: FAILED: MIGraphX is not within tolerance - check verbose output

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ slim-vgg16_1: PASSED: MIGraphX meets tolerance

✅ slim-mobilenet_1: PASSED: MIGraphX meets tolerance

✅ slim-inceptionv4_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

@pfultz2
Collaborator Author

pfultz2 commented Jun 29, 2023

@turneram Any feedback?

Contributor

@turneram turneram left a comment

If we want to leave int32 enabled, then adding the requirement that m, n, and k are all divisible by 4 appears to maintain correctness. We could also investigate further to get more precise criteria and then add it back in another PR.

@causten causten merged commit 3c9df3b into develop Jul 2, 2023
11 checks passed
@causten causten deleted the ci-ck branch July 2, 2023 18:57