Hi all,
I'm trying TensorFlow on my laptop with:
root@mic:~# nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02    Driver Version: 470.57.02    CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ...  Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   48C    P0    N/A /  N/A |    419MiB /  2004MiB |      5%      Default |
|                               |                      |                  N/A |
but when I try to use the GPU, I get a segfault.
The same code, executed on the same machine but with the GPU disabled (export CUDA_VISIBLE_DEVICES="" ; python3 ts_yah.py), works like a charm.
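For reference, the same CPU-only toggle can also be done from inside the script. A minimal sketch (note that the variable must be set before the first `import tensorflow`, because TF enumerates CUDA devices during initialization; the import is left commented out here since it only matters on a machine with the GPU build installed):

```python
import os

# Hide all CUDA devices from this process. This must happen BEFORE the
# first `import tensorflow`; setting it afterwards has no effect on
# device discovery.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# import tensorflow as tf
# print(tf.config.list_physical_devices("GPU"))  # would now print []

print(os.environ["CUDA_VISIBLE_DEVICES"])  # prints an empty line
```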
I tried to debug the script with gdb; here is the bt output:
Thread 38 "python3" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffeacff9640 (LWP 3124)]
0x00007ffff7fd6ec0 in ?? () from /lib64/ld-linux-x86-64.so.2
(gdb) bt
#0  0x00007ffff7fd6ec0 in ?? () from /lib64/ld-linux-x86-64.so.2
#1  0x00007ffff7fdef96 in ?? () from /lib64/ld-linux-x86-64.so.2
#2  0x00007ffff7d4e288 in __GI__dl_catch_exception (exception=0x7ffeacff72a0, operate=0x7ffff7fdece0, args=0x7ffeacff72c0) at dl-error-skeleton.c:208
#3  0x00007ffff7fde6ed in ?? () from /lib64/ld-linux-x86-64.so.2
#4  0x00007ffff7fa634c in dlopen_doit (a=a@entry=0x7ffeacff74f0) at dlopen.c:66
#5  0x00007ffff7d4e288 in __GI__dl_catch_exception (exception=exception@entry=0x7ffeacff7490, operate=0x7ffff7fa62f0 <dlopen_doit>, args=0x7ffeacff74f0) at dl-error-skeleton.c:208
#6  0x00007ffff7d4e353 in __GI__dl_catch_error (objname=0x7ffe90007200, errstring=0x7ffe90007208, mallocedp=0x7ffe900071f8, operate=<optimized out>, args=<optimized out>) at dl-error-skeleton.c:227
#7  0x00007ffff7fa6b89 in _dlerror_run (operate=operate@entry=0x7ffff7fa62f0 <dlopen_doit>, args=args@entry=0x7ffeacff74f0) at dlerror.c:170
#8  0x00007ffff7fa63d8 in __dlopen (file=<optimized out>, mode=<optimized out>) at dlopen.c:87
#9  0x00007fff4c2e974b in cudnnCreate () from /usr/lib/cuda/lib64/libcudnn.so.8
#10 0x00007fff8129f770 in cudnnCreate () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#11 0x00007fff8126b7c2 in stream_executor::gpu::CudnnSupport::Init() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#12 0x00007fff8126c2d7 in stream_executor::initialize_cudnn()::{lambda(stream_executor::internal::StreamExecutorInterface*)#1}::operator()(stream_executor::internal::StreamExecutorInterface*) const [clone .isra.587] () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#13 0x00007fff8613b283 in stream_executor::gpu::GpuExecutor::CreateDnn() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#14 0x00007fff91c4d189 in stream_executor::StreamExecutor::AsDnn() () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#15 0x00007fff91c4d361 in stream_executor::StreamExecutor::createRnnDescriptor(int, int, int, int, int, stream_executor::dnn::RnnInputMode, stream_executor::dnn::RnnDirectionMode, stream_executor::dnn::RnnMode, stream_executor::dnn::DataType, stream_executor::dnn::AlgorithmConfig const&, float, unsigned long, stream_executor::ScratchAllocator*, bool) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#16 0x00007fff8adb2f13 in tensorflow::Status tensorflow::CudnnRNNKernelCommon::GetCachedRnnDescriptor(tensorflow::OpKernelContext*, tensorflow::(anonymous namespace)::CudnnRnnModelShapes const&, stream_executor::dnn::RnnInputMode const&, stream_executor::dnn::AlgorithmConfig const&, tensorflow::gtl::FlatMap<std::pair<tensorflow::(anonymous namespace)::CudnnRnnModelShapes, absl::lts_20210324::optional<stream_executor::dnn::AlgorithmDesc> >, tensorflow::(anonymous namespace)::RnnScratchSpace, tensorflow::(anonymous namespace)::CudnnRnnConfigHasher, tensorflow::(anonymous namespace)::CudnnRnnConfigComparator>, stream_executor::dnn::RnnDescriptor**, bool) [clone .constprop.477] () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#17 0x00007fff8adb3791 in tensorflow::CudnnRNNForwardOp<Eigen::GpuDevice, float>::ComputeAndReturnAlgorithm(tensorflow::OpKernelContext*, stream_executor::dnn::AlgorithmConfig*, bool, bool, int) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#18 0x00007fff8adabb96 in tensorflow::CudnnRNNForwardOp<Eigen::GpuDevice, float>::Compute(tensorflow::OpKernelContext*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#19 0x00007fff8081a3b9 in tensorflow::BaseGPUDevice::Compute(tensorflow::OpKernel*, tensorflow::OpKernelContext*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#20 0x00007fff80910b73 in tensorflow::(anonymous namespace)::ExecutorState<tensorflow::SimplePropagatorState>::Process(tensorflow::SimplePropagatorState::TaggedNode, long) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#21 0x00007fff85dfa1b1 in Eigen::ThreadPoolTempl<tensorflow::thread::EigenEnvironment>::WorkerLoop(int) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#22 0x00007fff85df6ec3 in std::_Function_handler<void (), tensorflow::thread::EigenEnvironment::CreateThread(std::function<void ()>)::{lambda()#1}>::_M_invoke(std::_Any_data const&) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/_pywrap_tensorflow_internal.so
#23 0x00007fff80dd9665 in tensorflow::(anonymous namespace)::PThread::ThreadFn(void*) () from /usr/local/lib/python3.9/dist-packages/tensorflow/python/../libtensorflow_framework.so.2
#24 0x00007ffff7ded450 in start_thread (arg=0x7ffeacff9640) at pthread_create.c:473
#25 0x00007ffff7d0dd53 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
Without gdb:
2021-08-15 17:07:58.551144: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:58.556926: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:58.557276: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
[PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
2021-08-15 17:07:59.388972: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-08-15 17:07:59.389485: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.389844: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.390101: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856256: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856605: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.856890: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-08-15 17:07:59.857142: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1510] Created device /job:localhost/replica:0/task:0/device:GPU:0 with 1258 MB memory: -> device: 0, name: NVIDIA GeForce MX130, pci bus id: 0000:01:00.0, compute capability: 5.0
2021-08-15 17:08:00.342971: I tensorflow/core/profiler/lib/profiler_session.cc:131] Profiler session initializing.
2021-08-15 17:08:00.343008: I tensorflow/core/profiler/lib/profiler_session.cc:146] Profiler session started.
2021-08-15 17:08:00.343036: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1614] Profiler found 1 GPUs
2021-08-15 17:08:00.499252: I tensorflow/core/profiler/lib/profiler_session.cc:164] Profiler session tear down.
2021-08-15 17:08:00.501123: I tensorflow/core/profiler/internal/gpu/cupti_tracer.cc:1748] CUPTI activity buffer flushed
2021-08-15 17:08:00.560773: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)
Epoch 1/300
Segmentation fault (core dumped)
What now?
Thanks
The TF 2.6 prebuilt binaries support CUDA 11.2 and cuDNN 8.1: https://www.tensorflow.org/install/source#gpu
Have you built TF from source for CUDA 11.4? If you are using pip install, can you please switch back to CUDA 11.2 and check whether the issue persists? Thanks!
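For anyone hitting the same crash: the mismatch can be spotted with a trivial lookup against the compatibility table linked above. A minimal sketch (only the TF 2.6 row is shown; the dict and function names here are made up for illustration):

```python
# Build/runtime combos the prebuilt wheels are tested against.
# Only the TF 2.6 row from the compatibility table is included.
TESTED_CONFIGS = {
    "2.6": {"cuda": "11.2", "cudnn": "8.1"},
}

def combo_is_tested(tf_version: str, cuda: str, cudnn: str) -> bool:
    """True only if this exact CUDA/cuDNN pair is listed for the TF release."""
    row = TESTED_CONFIGS.get(tf_version)
    return row is not None and (cuda, cudnn) == (row["cuda"], row["cudnn"])

print(combo_is_tested("2.6", "11.4", "8.2"))  # CUDA 11.4 + a newer cuDNN -> False
print(combo_is_tested("2.6", "11.2", "8.1"))  # tested combo -> True
```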
Hi,
thanks for your comment.
I installed all the packages from prebuilt binaries: pip in the case of TF, CUDA from the Ubuntu archives, and cuDNN from the NVIDIA site.
I hadn't seen that TF is only compatible with cuDNN 8.1, so I had installed the latest version found on the NVIDIA website.
Now, with the right versions (cuDNN 8.1, CUDA 11.2), everything flies like a Concorde ;)
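In case it helps the next reader, here is a hedged sketch of the check that would have caught this: compare the installed cuDNN major.minor against what the wheel was built for. The helper names and the sample version strings are illustrative, not taken from this issue:

```python
def version_tuple(v: str) -> tuple:
    """Parse 'major.minor[.patch]' into a tuple of ints, e.g. '8.1.1' -> (8, 1, 1)."""
    return tuple(int(part) for part in v.split("."))

def cudnn_matches(installed: str, required: str = "8.1") -> bool:
    # The prebuilt TF 2.6 wheel links against cuDNN 8.1; a newer
    # major.minor is not guaranteed to work, as this issue shows.
    return version_tuple(installed)[:2] == version_tuple(required)[:2]

print(cudnn_matches("8.1.1"))  # -> True
print(cudnn_matches("8.2.4"))  # hypothetical newer release -> False
```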