DirectML EP gives wrong results on both Integrated and Discrete GPUs #19837

Open
AdarshAcharya5 opened this issue Mar 9, 2024 · 4 comments
Labels: ep:DML (issues related to the DirectML execution provider)

AdarshAcharya5 commented Mar 9, 2024

I'm trying to run inference on a model in C++. I get completely wrong results when I run it with the DirectML EP, but running it on the CPU works just fine.
Sample outputs:

CPU (Correct):

7.9641705378890038e-03
5.1060058176517487e-03
1.6217185184359550e-03
3.2245237380266190e-03
1.9704271107912064e-03
-2.6769749820232391e-03
-3.4305416047573090e-03
-5.9413574635982513e-03
-8.1969611346721649e-03
-5.6667793542146683e-03
-4.0454901754856110e-03
-4.5984257012605667e-03
-2.4918280541896820e-03
-5.0726905465126038e-04
-1.3493187725543976e-03

GPU | DirectML EP(Incorrect):

7.63121
2.10706
-15.587
-8.67914
-20.8112
-37.6199
-15.6217
-13.4909
45.5337
82.2263
95.6195
45.5667
124.225
188.378
167.306
142.281
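
To put a number on how far apart the two runs are, a comparison helper along these lines may help. This is only a minimal sketch: `DiffReport`, `CompareOutputs`, and the default tolerances are hypothetical names and values chosen for illustration, not part of ONNX Runtime. It also assumes the two buffers correspond index for index, which the lists above do not guarantee (they even differ in length).

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Summary of how far two output buffers diverge (hypothetical helper type).
struct DiffReport {
    double max_abs_diff = 0.0;    // largest |ref[i] - got[i]| over the common prefix
    std::size_t worst_index = 0;  // index at which that largest difference occurs
    bool within_tolerance = true; // false on any out-of-tolerance element or size mismatch
};

// Compare a reference buffer (e.g. CPU output) against a candidate (e.g. DirectML output)
// with a combined tolerance: |ref - got| <= atol + rtol * |ref|.
// The default tolerances are illustrative assumptions, not recommended values.
DiffReport CompareOutputs(const std::vector<float>& ref,
                          const std::vector<float>& got,
                          double atol = 1e-4, double rtol = 1e-3) {
    DiffReport report;
    report.within_tolerance = (ref.size() == got.size());
    const std::size_t n = std::min(ref.size(), got.size());
    for (std::size_t i = 0; i < n; ++i) {
        const double r = static_cast<double>(ref[i]);
        const double g = static_cast<double>(got[i]);
        const double diff = std::fabs(r - g);
        if (diff > report.max_abs_diff) {
            report.max_abs_diff = diff;
            report.worst_index = i;
        }
        // Written as !(diff <= tol) so that a NaN difference also counts as a failure.
        if (!(diff <= atol + rtol * std::fabs(r))) {
            report.within_tolerance = false;
        }
    }
    return report;
}
```

Feeding it the first few CPU values as `ref` and the first few DirectML values as `got` flags the mismatch right away. The same helper can be applied to intermediate tensors once they are dumped.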

Init code for reference:

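        // Verbose logging so that node placement shows up in the log.
        mEnv = new Ort::Env(ORT_LOGGING_LEVEL_VERBOSE, "test");
        mSessionOptions = new Ort::SessionOptions;
        // The DirectML EP needs sequential execution and memory pattern optimization disabled.
        mSessionOptions->SetExecutionMode(ORT_SEQUENTIAL);
        mSessionOptions->DisableMemPattern();
        // Append the DirectML EP on device 0.
        Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_DML(*mSessionOptions, 0));
        mSession = new Ort::Session((*mEnv), inModelPath, (*mSessionOptions));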

When I run inference, it logs this warning:

2024-03-09 17:31:07.4370295 [W:onnxruntime:, session_state.cc:1166 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Some nodes were not assigned to the preferred execution providers which may or may not have an negative impact on performance. e.g. ORT explicitly assigns shape related ops to CPU to improve perf.
2024-03-09 17:31:07.4474067 [W:onnxruntime:, session_state.cc:1168 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Rerunning with verbose output on a non-minimal build will show node assignments.

I reran with verbose logging, and some nodes are assigned to the CPU:

2024-03-09 17:19:38.1964077 [V:onnxruntime:, session_state.cc:1146 onnxruntime::VerifyEachNodeIsAssignedToAnEp] Node placements
2024-03-09 17:19:38.1987859 [V:onnxruntime:, session_state.cc:1152 onnxruntime::VerifyEachNodeIsAssignedToAnEp]  Node(s) placed on [DmlExecutionProvider]. Number of nodes: 77
2024-03-09 17:19:38.2023258 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.4/dconv/layers.0/layers.0.3/lstm/LSTM)
2024-03-09 17:19:38.2054732 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.4/dconv/layers.0/layers.0.3/lstm/LSTM_1)
2024-03-09 17:19:38.2087406 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.0/layers.0.4/Einsum)
2024-03-09 17:19:38.2130963 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Div (/encoder.4/dconv/layers.0/layers.0.4/Div)
2024-03-09 17:19:38.2163550 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Add (/encoder.4/dconv/layers.0/layers.0.4/Add)
2024-03-09 17:19:38.2193638 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Where (/encoder.4/dconv/layers.0/layers.0.4/Where_3)
2024-03-09 17:19:38.2225705 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Softmax (/encoder.4/dconv/layers.0/layers.0.4/Softmax)
2024-03-09 17:19:38.2268574 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Reshape (/encoder.4/dconv/layers.0/layers.0.4/Reshape_5)
2024-03-09 17:19:38.2329546 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Conv (/encoder.4/dconv/layers.0/layers.0.4/proj/Conv)
2024-03-09 17:19:38.2371909 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.4/dconv/layers.1/layers.1.3/lstm/LSTM)
2024-03-09 17:19:38.2404647 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.4/dconv/layers.1/layers.1.3/lstm/LSTM_1)
2024-03-09 17:19:38.2437037 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.1/layers.1.4/Einsum)
2024-03-09 17:19:38.2497929 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Div (/encoder.4/dconv/layers.1/layers.1.4/Div)
2024-03-09 17:19:38.2528811 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Add (/encoder.4/dconv/layers.1/layers.1.4/Add)
2024-03-09 17:19:38.2569731 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Where (/encoder.4/dconv/layers.1/layers.1.4/Where_3)
2024-03-09 17:19:38.2604761 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Softmax (/encoder.4/dconv/layers.1/layers.1.4/Softmax)
2024-03-09 17:19:38.2638485 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Reshape (/encoder.4/dconv/layers.1/layers.1.4/Reshape_5)
2024-03-09 17:19:38.2670311 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Conv (/encoder.4/dconv/layers.1/layers.1.4/proj/Conv)
2024-03-09 17:19:38.2702370 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.5/dconv/layers.0/layers.0.3/lstm/LSTM)
2024-03-09 17:19:38.2733169 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.5/dconv/layers.0/layers.0.3/lstm/LSTM_1)
2024-03-09 17:19:38.2770412 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.0/layers.0.4/Einsum)
2024-03-09 17:19:38.2805019 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Div (/encoder.5/dconv/layers.0/layers.0.4/Div)
2024-03-09 17:19:38.2834022 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Add (/encoder.5/dconv/layers.0/layers.0.4/Add)
2024-03-09 17:19:38.2946036 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Where (/encoder.5/dconv/layers.0/layers.0.4/Where_3)
2024-03-09 17:19:38.2978596 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Softmax (/encoder.5/dconv/layers.0/layers.0.4/Softmax)
2024-03-09 17:19:38.3011489 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Reshape (/encoder.5/dconv/layers.0/layers.0.4/Reshape_5)
2024-03-09 17:19:38.3043391 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Conv (/encoder.5/dconv/layers.0/layers.0.4/proj/Conv)
2024-03-09 17:19:38.3097470 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.5/dconv/layers.1/layers.1.3/lstm/LSTM)
2024-03-09 17:19:38.3128849 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   LSTM (/encoder.5/dconv/layers.1/layers.1.3/lstm/LSTM_1)
2024-03-09 17:19:38.3161082 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.1/layers.1.4/Einsum)
2024-03-09 17:19:38.3191046 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Div (/encoder.5/dconv/layers.1/layers.1.4/Div)
2024-03-09 17:19:38.3219691 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Add (/encoder.5/dconv/layers.1/layers.1.4/Add)
2024-03-09 17:19:38.3285357 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Where (/encoder.5/dconv/layers.1/layers.1.4/Where_3)
2024-03-09 17:19:38.3314117 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Softmax (/encoder.5/dconv/layers.1/layers.1.4/Softmax)
2024-03-09 17:19:38.3355328 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Reshape (/encoder.5/dconv/layers.1/layers.1.4/Reshape_5)
2024-03-09 17:19:38.3397122 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Conv (/encoder.5/dconv/layers.1/layers.1.4/proj/Conv)
2024-03-09 17:19:38.3429634 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_0_3 (DmlFusedNode_0_3)
2024-03-09 17:19:38.3465694 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_1_5 (DmlFusedNode_1_5)
2024-03-09 17:19:38.3491821 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_2_7 (DmlFusedNode_2_7)
2024-03-09 17:19:38.3522404 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_3_9 (DmlFusedNode_3_9)
2024-03-09 17:19:38.3549790 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_4_10 (DmlFusedNode_4_10)
2024-03-09 17:19:38.3590378 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_5_19 (DmlFusedNode_5_19)
2024-03-09 17:19:38.3619231 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_6_21 (DmlFusedNode_6_21)
2024-03-09 17:19:38.3647269 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_7_23 (DmlFusedNode_7_23)
2024-03-09 17:19:38.3675955 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_8_25 (DmlFusedNode_8_25)
2024-03-09 17:19:38.3711477 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_9_26 (DmlFusedNode_9_26)
2024-03-09 17:19:38.3737687 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_10_35 (DmlFusedNode_10_35)
2024-03-09 17:19:38.3775225 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_11_37 (DmlFusedNode_11_37)
2024-03-09 17:19:38.3801207 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_12_39 (DmlFusedNode_12_39)
2024-03-09 17:19:38.3833644 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_13_41 (DmlFusedNode_13_41)
2024-03-09 17:19:38.3930719 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_14_42 (DmlFusedNode_14_42)
2024-03-09 17:19:38.3961912 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_15_51 (DmlFusedNode_15_51)
2024-03-09 17:19:38.3993214 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_16_53 (DmlFusedNode_16_53)
2024-03-09 17:19:38.4025940 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_17_55 (DmlFusedNode_17_55)
2024-03-09 17:19:38.4061956 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_18_57 (DmlFusedNode_18_57)
2024-03-09 17:19:38.4088969 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_19_58 (DmlFusedNode_19_58)
2024-03-09 17:19:38.4125093 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   DmlFusedNode_20_67 (DmlFusedNode_20_67)
2024-03-09 17:19:38.4153944 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy)
2024-03-09 17:19:38.4187650 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_752)
2024-03-09 17:19:38.4218997 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_753)
2024-03-09 17:19:38.4247973 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_754)
2024-03-09 17:19:38.4277472 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_755)
2024-03-09 17:19:38.4313953 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_756)
2024-03-09 17:19:38.4667697 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_757)
2024-03-09 17:19:38.5304254 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyFromHost (Memcpy_token_758)
2024-03-09 17:19:38.5347031 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_759)
2024-03-09 17:19:38.5379085 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_760)
2024-03-09 17:19:38.5421093 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_761)
2024-03-09 17:19:38.5465824 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_762)
2024-03-09 17:19:38.5495877 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_763)
2024-03-09 17:19:38.5537753 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_764)
2024-03-09 17:19:38.5569797 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_765)
2024-03-09 17:19:38.5606784 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_766)
2024-03-09 17:19:38.5636080 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_767)
2024-03-09 17:19:38.5666540 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_768)
2024-03-09 17:19:38.5707909 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_769)
2024-03-09 17:19:38.5738488 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   MemcpyToHost (Memcpy_token_770)
2024-03-09 17:19:38.5775146 [V:onnxruntime:, session_state.cc:1152 onnxruntime::VerifyEachNodeIsAssignedToAnEp]  Node(s) placed on [CPUExecutionProvider]. Number of nodes: 8
2024-03-09 17:19:38.5821310 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.0/layers.0.4/Einsum_1)
2024-03-09 17:19:38.5856312 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.0/layers.0.4/Einsum_2)
2024-03-09 17:19:38.5928337 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.1/layers.1.4/Einsum_1)
2024-03-09 17:19:38.5973647 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.4/dconv/layers.1/layers.1.4/Einsum_2)
2024-03-09 17:19:38.6101708 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.0/layers.0.4/Einsum_1)
2024-03-09 17:19:38.6148915 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.0/layers.0.4/Einsum_2)
2024-03-09 17:19:38.6182724 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.1/layers.1.4/Einsum_1)
2024-03-09 17:19:38.6216925 [V:onnxruntime:, session_state.cc:1154 onnxruntime::VerifyEachNodeIsAssignedToAnEp]   Einsum (/encoder.5/dconv/layers.1/layers.1.4/Einsum_2)
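
A log this long is easier to read as a per-provider tally of operator types. The sketch below assumes the line layout shown above (a `Node(s) placed on [...]` header followed by `OpType (node name)` lines); `PlacementTable` and `ParseNodePlacements` are hypothetical names, and the parser has only been reasoned about against this one log format.

```cpp
#include <map>
#include <sstream>
#include <string>

// Execution provider name -> (operator type -> number of nodes).
using PlacementTable = std::map<std::string, std::map<std::string, int>>;

// Tally node placements from ONNX Runtime verbose log text.
// Assumes the format seen in this issue; other ORT versions may log differently.
PlacementTable ParseNodePlacements(const std::string& log_text) {
    const std::string header = "Node(s) placed on [";
    const std::string tag = "VerifyEachNodeIsAssignedToAnEp]";
    PlacementTable table;
    std::string current_ep;
    std::istringstream lines(log_text);
    std::string line;
    while (std::getline(lines, line)) {
        // A header line switches the provider that later node lines belong to.
        const std::string::size_type h = line.find(header);
        if (h != std::string::npos) {
            const std::string::size_type start = h + header.size();
            const std::string::size_type end = line.find(']', start);
            if (end != std::string::npos) {
                current_ep = line.substr(start, end - start);
            }
            continue;
        }
        const std::string::size_type t = line.find(tag);
        if (t == std::string::npos || current_ep.empty()) {
            continue;  // Not a node line, or no provider header seen yet.
        }
        // Node lines look like "...]   OpType (node/name)".
        std::istringstream rest(line.substr(t + tag.size()));
        std::string op_type, name;
        if (rest >> op_type >> name && name.size() >= 2 &&
            name.front() == '(' && name.back() == ')') {
            ++table[current_ep][op_type];
        }
    }
    return table;
}
```

For the log above, the interesting entry is the one for CPUExecutionProvider, since it shows which operator types cross the CPU/GPU boundary.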

Additional Info:
CPU: Intel i9-12900H
Integrated GPU: Intel(R) Iris(R) Xe Graphics
GPU: RTX 3080Ti 16GB

To reproduce

Unfortunately I can't reveal the model. :(

Urgency

It is kinda urgent due to a project deadline.

Platform

Windows

OS Version

11

ONNX Runtime Installation

Released Package

ONNX Runtime Version or Commit ID

1.17.0

ONNX Runtime API

C++

Architecture

X64

Execution Provider

DirectML

Execution Provider Library Version

1.13.1

Model File

:(

Is this a quantized model?

No

github-actions bot added the ep:CUDA, ep:DML, and platform:windows labels on Mar 9, 2024
@pranavsharma (Contributor)

@fdwr can you take a look?

sophies927 removed the ep:CUDA and platform:windows labels on Mar 14, 2024
fdwr (Contributor) commented Mar 16, 2024

@smk2007 @martinb35 (who are focused on this more so than I am now)

> Unfortunately I can't reveal the model. :(

Without the model, the problem will be a challenge to isolate, so we can only offer ideas on how you might find it yourself. Some approaches I take include:

  • Unregister DML ops in OperatorRegistration.cpp so those fall back to CPU, to narrow down which operator is causing it. If you are able to build ORT yourself, you could do a binary search by commenting out operators.
  • Print the output by building with onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS to compare CPU and GPU outputs along the way (this is a challenge because of all the graph transformations that differ between them, but it's doable).
    set(onnxruntime_DEBUG_NODE_INPUTS_OUTPUTS ON CACHE BOOL "Record and log extra diagnostic info")
    set ORT_DEBUG_NODE_IO_DUMP_SHAPE_DATA=1
    set ORT_DEBUG_NODE_IO_DUMP_OUTPUT_DATA=1
    set ORT_DEBUG_NODE_IO_DUMP_DATA_TO_FILES=1
    set ORT_DEBUG_NODE_IO_OUTPUT_DIR=o:\logs
    set ORT_DEBUG_NODE_IO_OP_TYPE_FILTER=
    set ORT_DEBUG_NODE_IO_DUMPING_DATA_TO_FILES_FOR_ALL_NODES_IS_OK=1
  • Split the model into smaller parts until finding the divergence. I used an internal tool for this (which I've been meaning to clean up and publish someday), but you should be able to achieve it with some other tool too like ONNX GraphSurgeon.

(updated 2024-03-20)

  • Set the logging level of the Ort::Env to ORT_LOGGING_LEVEL_VERBOSE.
  • Call SetGraphOptimizationLevel with GraphOptimizationLevel::ORT_DISABLE_ALL to turn off graph optimizations.
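
The binary search over operators mentioned in the first bullet can be sketched as follows. This is a hedged illustration, not ORT code: `BisectFaultyOperator` and the callback are hypothetical, and the sketch assumes exactly one operator type is at fault. In practice each callback evaluation would mean commenting out those registrations, rebuilding ORT, rerunning the model, and checking whether the DirectML output now matches the CPU output.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Narrow down a single misbehaving operator type by halving the candidate set.
// `outputs_match_when_disabled(ops)` is a hypothetical callback that should return true
// when forcing `ops` onto the CPU makes the DirectML results match the CPU results.
// Assumes exactly one culprit; returns an empty string if `candidates` is empty.
std::string BisectFaultyOperator(
    const std::vector<std::string>& candidates,
    const std::function<bool(const std::vector<std::string>&)>& outputs_match_when_disabled) {
    std::size_t lo = 0;
    std::size_t hi = candidates.size();  // The culprit lies in [lo, hi).
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        // Disable only the first half of the remaining range.
        const std::vector<std::string> disabled(
            candidates.begin() + static_cast<std::ptrdiff_t>(lo),
            candidates.begin() + static_cast<std::ptrdiff_t>(mid));
        if (outputs_match_when_disabled(disabled)) {
            hi = mid;  // Disabling this half fixed the output, so the culprit is in it.
        } else {
            lo = mid;  // Output still wrong, so the culprit is in the other half.
        }
    }
    return lo < hi ? candidates[lo] : std::string();
}
```

With the eight operator types visible in the node placement log, this takes three rebuild-and-run rounds instead of up to eight. If more than one operator is faulty, the single-culprit assumption breaks and the result should be treated with suspicion.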

@AdarshAcharya5 (Author)

Hi @fdwr. Thanks for the reply! How can I unregister ops in OperatorRegistration.cpp? Do I just change DmlGraphSupport::Supported to DmlGraphSupport::NotSupported?

fdwr (Contributor) commented Mar 18, 2024

> Hi @fdwr. Thanks for the reply! How can I unregister ops in OperatorRegistration.cpp? Do I just change DmlGraphSupport::Supported to DmlGraphSupport::NotSupported?

The easiest way is to comment out (//) the registration lines for the suspect operators (you know which operators are in your model).
