segfault with gpu enabled on python3.9 and ubuntu 21 #10210

Closed
mic-p opened this issue Aug 15, 2021 · 2 comments
Labels: stat:awaiting response (Waiting on input from the contributor)

mic-p commented Aug 15, 2021

Hi all,
I'm trying to get TensorFlow working on my laptop with:

  • Ubuntu 21 desktop, freshly installed
  • Python 3.9, TensorFlow installed with pip3 install tensorflow (v2.6.0)
  • cuDNN just downloaded from NVIDIA (8.2.2, July 6th 2021, for CUDA 11.4)
  • GPU: product: GM108M [GeForce MX130]

root@mic:~# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    N/A /  N/A |    419MiB /  2004MiB |      5%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

but when I try to use the GPU, I get a segfault.
The same code, run on the same machine but with the GPU disabled (export CUDA_VISIBLE_DEVICES="" ; python3 ts_yah.py), works like a charm.
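
For reference, here is a minimal sketch along the lines of what I'm running (not the original ts_yah.py, just an illustration): it lists the visible devices and fits a tiny LSTM, which on GPU goes through the CudnnRNN kernel that shows up in the backtrace below.

import tensorflow as tf

# With CUDA_VISIBLE_DEVICES="" this prints an empty list and the model runs on CPU.
print(tf.config.list_physical_devices('GPU'))

# A tiny LSTM; on GPU this exercises the cuDNN RNN path (the CudnnRNN kernel).
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(32, input_shape=(10, 8)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer='adam', loss='mse')

x = tf.random.normal((64, 10, 8))
y = tf.random.normal((64, 1))
model.fit(x, y, epochs=1)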

I tried to debug the script with gdb; here is the bt output:

Thread 38 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeacff9640 (LWP 3124)]
0x00007ffff7fd6ec0 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0 0x00007ffff7fd6ec0 in ?? () from /lib64/ld-linux-x86-64.so.2
#1 0x00007ffff7fdef96 in ?? () from /lib64/ld-linux-x86-64.so.2
#2 0x00007ffff7d4e288 in __GI__dl_catch_exception (exception=0x7ffeacff72a0, operate=0x7ffff7fdece0, args=0x7ffeacff72c0) at dl-error-skeleton.c:208
#3 0x00007ffff7fde6ed in ?? () from /lib64/ld-linux-x86-64.so.2
#4 0x00007ffff7fa634c in dlopen_doit (a=a@entry=0x7ffeacff74f0) at dlopen.c:66
#5 0x00007ffff7d4e288 in __GI__dl_catch_exception (exception=exception@entry=0x7ffeacff7490, operate=0x7ffff7fa62f0 <dlopen_doit>, args=0x7ffeacff74f0) at dl-error-skeleton.c:208
#6 0x00007ffff7d4e353 in __GI__dl_catch_error (objname=0x7ffe90007200, errstring=0x7ffe90007208, mallocedp=0x7ffe900071f8, operate=, args=) at dl-error-skeleton.c:227
#7 0x00007ffff7fa6b89 in _dlerror_run (operate=operate@entry=0x7ffff7fa62f0 <dlopen_doit>, args=args@entry=0x7ffeacff74f0) at dlerror.c:170
#8 0x00007ffff7fa63d8 in __dlopen (file=, mode=) at dlopen.c:87
#9 0x00007fff4c2e974b in cudnnCreate () from /usr/lib/cuda/lib64/libcudnn.so.8
#10 0x00007fff8129f770 in cudnnCreate () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#11 0x00007fff8126b7c2 in stream_executor::gpu::CudnnSupport::Init() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#12 0x00007fff8126c2d7 in stream_executor::initialize_cudnn()::{lambda(stream_executor::internal::StreamExecutorInterface*)#1}::operator()(stream_executor::internal::StreamExecutorInterface*) const [clone .isra.587] ()
from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#13 0x00007fff8613b283 in stream_executor::gpu::GpuExecutor::CreateDnn() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#14 0x00007fff91c4d189 in stream_executor::StreamExecutor::AsDnn() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#15 0x00007fff91c4d361 in stream_executor::StreamExecutor::createRnnDescriptor(int, int, int, int, int, stream_executor::dnn::RnnInputMode, stream_executor::dnn::RnnDirectionMode, stream_executor::dnn::RnnMode, stream_executor::dnn::DataType, stream_executor::dnn::AlgorithmConfig const&, float, unsigned long, stream_executor::ScratchAllocator*, bool) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#16 0x00007fff8adb2f13 in tensorflow::Status tensorflow::CudnnRNNKernelCommon::GetCachedRnnDescriptor(tensorflow::OpKernelContext*, tensorflow::(anonymous namespace)::CudnnRnnModelShapes const&, stream_executor::dnn::RnnInputMode const&, stream_executor::dnn::AlgorithmConfig const&, tensorflow::gtl::FlatMap<std::pair<tensorflow::(anonymous namespace)::CudnnRnnModelShapes, absl::lts_20210324::optional<stream_executor::dnn::AlgorithmDesc> >, tensorflow::(anonymous namespace)::RnnScratchSpace, tensorflow::(anonymous namespace)::CudnnRnnConfigHasher, tensorflow::(anonymous namespace)::CudnnRnnConfigComparator>, stream_executor::dnn::RnnDescriptor**, bool) [clone .constprop.477] ()
from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#17 0x00007fff8adb3791 in tensorflow::CudnnRNNForwardOp<Eigen::GpuDevice, float>::ComputeAndReturnAlgorithm(tensorflow::OpKernelContext*, stream_executor::dnn::AlgorithmConfig*, bool, bool, int) ()
from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#18 0x00007fff8adabb96 in tensorflow::CudnnRNNForwardOp<Eigen::GpuDevice, float>::Compute(tensorflow::OpKernelContext*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#19 0x00007fff8081a3b9 in tensorflow::BaseGPUDevice::Compute(tensorflow::OpKernel*, tensorflow::OpKernelContext*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#20 0x00007fff80910b73 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::SimplePropagatorState>::Process(tensorflow::SimplePropagatorState::TaggedNode, long) ()
from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#21 0x00007fff85dfa1b1 in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#22 0x00007fff85df6ec3 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) ()
from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#23 0x00007fff80dd9665 in tensorflow::(anonymous namespace)::PThread::ThreadFn(void*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#24 0x00007ffff7ded450 in start_thread (arg=0x7ffeacff9640) at pthread_create.c:473
#25 0x00007ffff7d0dd53 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

without gdb:
2021-08-15 17:07:58.551144: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:58.556926: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:58.557276: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2021-08-15 17:07:59.388972: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-15 17:07:59.389485: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.389844: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.390101: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856256: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856605: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856890: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.857142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1258 MB memory: -> device: 0, name: NVIDIA GeForce MX130, pci bus id: 0000:01:00.0, compute capability: 5.0
2021-08-15 17:08:00.342971: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
2021-08-15 17:08:00.343008: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
2021-08-15 17:08:00.343036: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
2021-08-15 17:08:00.499252: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
2021-08-15 17:08:00.501123: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
2021-08-15 17:08:00.560773: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/300
Segmentation fault (core dumped)

What now?
Thanks

ymodak (Contributor) commented Aug 17, 2021

TF 2.6 prebuilt binaries support CUDA 11.2 and cuDNN 8.1:
https://www.tensorflow.org/install/source#gpu
Have you built TF from source for CUDA 11.4? If you are using the pip install, can you please switch back to CUDA 11.2 and check whether the issue persists? Thanks!
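
As a quick check (a sketch using the tf.sysconfig API, assuming a pip-installed wheel), you can print the CUDA/cuDNN versions your TF binary was built against; the TF 2.6 wheel should report CUDA 11.2 and cuDNN 8:

import tensorflow as tf

# Versions the installed TensorFlow binary was compiled against;
# these must match the CUDA/cuDNN libraries installed on the system.
info = tf.sysconfig.get_build_info()
print(info['cuda_version'], info['cudnn_version'])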

ymodak added the stat:awaiting response (Waiting on input from the contributor) label Aug 17, 2021
mic-p (Author) commented Aug 18, 2021

Hi,
thanks for your comment.
I installed all the packages as prebuilt binaries: pip in the case of TF, CUDA from the Ubuntu archives, and cuDNN from the NVIDIA site.

I hadn't noticed that TF is only compatible with cuDNN 8.1, and I had installed the latest version I found on the NVIDIA website.
Now, with the right version 8.1 (CUDA 11.2), everything flies like a Concorde ;)
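
In case it helps anyone else, here is the quick sanity check I'd suggest after switching to the matching versions (a minimal sketch, not my actual script):

import tensorflow as tf

# The MX130 should show up here and the matmul should run on it without a segfault.
print(tf.config.list_physical_devices('GPU'))
with tf.device('/GPU:0'):
    a = tf.random.normal((1024, 1024))
    print(tf.reduce_sum(tf.matmul(a, a)).numpy())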

Thanks a lot and please close the issue

Michele

ymodak closed this as completed Aug 19, 2021