Runtime error when using multiple fusion output of `native_dropout` op #2440

ftxj · 2023-02-09T09:12:59Z

🐛 Describe the bug

The following test currently fails:

TEST_F(NVFuserTest, FusionParserBug_CUDA) {
  auto g = std::make_shared<Graph>();
  const auto graph0_string =
      R"IR(
  graph(%6 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0),
        %1 : float):
    %2 : bool = prim::Constant[value=0]()
    %7 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0) = aten::relu(%6)
    %res.2 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0), %mask.8 : Bool(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0) = aten::native_dropout(%7, %1, %2) 
    return (%res.2, %mask.8, %7))IR";
  parseIR(graph0_string, g.get());
  auto fusion = parseJitIR(g);
  FusionGuard fg(fusion.get());
  auto options = at::TensorOptions().dtype(at::kFloat).device(at::kCUDA, 0);
  at::Tensor input1 = at::randn({2708, 64}, options);

  FusionExecutor fe;
  fe.compileFusion(fusion.get(), {input1, 0.1});
}

If %mask.8 is not returned, the test succeeds.
The error indicates that the thread_predicates_ map has no entry for one of the tensors:

C++ exception with description "thread_predicates_.find(tv_inp) != thread_predicates_.end() INTERNAL ASSERT FAILED at "/workspace/pytorch/third_party/nvfuser/csrc/lower_thread_predicate.cpp":220, please report a bug to PyTorch. Thread predicate map was not initialized, couldn't find T2_l[ 0 ]
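
For comparison, the two graph variants can be written side by side. This is a minimal sketch, not nvfuser code: it only builds the IR strings and never calls the fuser. The helper names `withReturn` and `returnsMask` are hypothetical, and it assumes the passing variant still returns %res.2 and %7, which the report does not state explicitly.

```cpp
#include <cassert>
#include <string>

// Graph body copied from the failing test in this report (everything
// except the return statement).
const std::string kGraphBody = R"IR(
  graph(%6 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0),
        %1 : float):
    %2 : bool = prim::Constant[value=0]()
    %7 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0) = aten::relu(%6)
    %res.2 : Float(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0), %mask.8 : Bool(2708, 64, strides=[64, 1], requires_grad=0, device=cuda:0) = aten::native_dropout(%7, %1, %2)
)IR";

// Hypothetical helper: appends a return statement listing `outputs`.
//   withReturn("%res.2, %mask.8, %7")  -> the failing graph from the report
//   withReturn("%res.2, %7")           -> the variant assumed to pass
std::string withReturn(const std::string& outputs) {
  assert(!outputs.empty());
  return kGraphBody + "    return (" + outputs + ")";
}

// Hypothetical helper: true if the graph's return statement lists %mask.8.
bool returnsMask(const std::string& ir) {
  const auto pos = ir.rfind("return (");
  return pos != std::string::npos &&
         ir.find("%mask.8", pos) != std::string::npos;
}
```

Passing `withReturn("%res.2, %7")` to `parseIR` in place of `graph0_string` should give the variant described above as succeeding; this sketch has not been run against nvfuser.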

Versions

PyTorch version: N/A
Is debug build: N/A
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: N/A

OS: Ubuntu 20.04.4 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04) 9.4.0
Clang version: Could not collect
CMake version: version 3.22.3
Libc version: glibc-2.31

Python version: 3.8.12 | packaged by conda-forge | (default, Jan 30 2022, 23:42:07) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-4.15.0-201-generic-x86_64-with-glibc2.10
Is CUDA available: N/A
CUDA runtime version: 11.6.112
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3080 Ti
Nvidia driver version: 525.85.05
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.3.3
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.3.3
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: N/A

CPU:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 43 bits physical, 48 bits virtual
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 24
Socket(s): 1
NUMA node(s): 1
Vendor ID: AuthenticAMD
CPU family: 23
Model: 49
Model name: AMD Ryzen Threadripper 3960X 24-Core Processor
Stepping: 0
Frequency boost: enabled
CPU MHz: 2195.367
CPU max MHz: 3800.0000
CPU min MHz: 2200.0000
BogoMIPS: 7585.58
Virtualization: AMD-V
L1d cache: 768 KiB
L1i cache: 768 KiB
L2 cache: 12 MiB
L3 cache: 128 MiB
NUMA node0 CPU(s): 0-47
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Retpolines, IBPB conditional, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw ibs skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb cat_l3 cdp_l3 hw_pstate sme ssbd ibpb stibp vmmcall fsgsbase bmi1 avx2 smep bmi2 cqm rdt_a rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif umip rdpid overflow_recov succor smca

Versions of relevant libraries:
[pip3] numpy==1.22.3
[pip3] pytorch-quantization==2.1.2
[pip3] torch==1.12.0a0+2c916ef
[pip3] torch-tensorrt==1.1.0a0
[pip3] torchtext==0.12.0a0
[pip3] torchvision==0.13.0a0
[conda] magma-cuda110 2.5.2 5 local
[conda] mkl 2019.5 281 conda-forge
[conda] mkl-include 2019.5 281 conda-forge
[conda] numpy 1.22.3 py38h05e7239_0 conda-forge
[conda] pytorch-quantization 2.1.2 pypi_0 pypi
[conda] torch 1.12.0a0+2c916ef pypi_0 pypi
[conda] torch-tensorrt 1.1.0a0 pypi_0 pypi
[conda] torchtext 0.12.0a0 pypi_0 pypi
[conda] torchvision 0.13.0a0 pypi_0 pypi

ftxj changed the title ~~Runtime error when using multiple fusion output for native_dropout op~~ Runtime error when using multiple fusion output of native_dropout op Feb 9, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Runtime error when using multiple fusion output of `native_dropout` op #2440

Runtime error when using multiple fusion output of `native_dropout` op #2440

ftxj commented Feb 9, 2023 •

edited

Loading

Runtime error when using multiple fusion output of native_dropout op #2440

Runtime error when using multiple fusion output of native_dropout op #2440

Comments

ftxj commented Feb 9, 2023 • edited Loading

🐛 Describe the bug

Versions

Runtime error when using multiple fusion output of `native_dropout` op #2440

Runtime error when using multiple fusion output of `native_dropout` op #2440

ftxj commented Feb 9, 2023 •

edited

Loading