Failed to find dynamic library: libwarpctc.so ( dlopen: cannot load any more object with static TLS ) #7352

Closed
endy-see opened this issue Aug 1, 2019 · 1 comment


endy-see commented Aug 1, 2019

My local environment:
CentOS: release 6.9
NCCL: v2.4.7
cuda: 9.0.176
cudnn: 7.3.1
Paddle: 1.5.1
Python: 3.7.3


When I start training the ocr_recognition model with the crnn_ctc model, Paddle raises the following error:

(paddle) [ocr_recognition]# env CUDA_VISIBLE_DEVICES=0 python train.py --train_images dataset/public_data_english/train_images --train_list dataset/public_data_english/train.list --test_images dataset/public_data_english/test_images --test_list dataset/public_data_english/test.list
----------- Configuration Arguments -----------
average_window: 0.15
batch_size: 32
eval_period: 15000
init_model: None
log_period: 1000
max_average_window: 12500
min_average_window: 10000
model: crnn_ctc
parallel: False
profile: False
save_model_dir: ./models
save_model_period: 15000
skip_batch_num: 0
skip_test: False
test_images: dataset/public_data_english/test_images
test_list: dataset/public_data_english/test.list
total_step: 720000
train_images: dataset/public_data_english/train_images
train_list: dataset/public_data_english/train.list
use_gpu: True

/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/evaluator.py:71: Warning: The EditDistance is deprecated, because maintain a modified program inside evaluator cause bug easily, please use fluid.metrics.EditDistance instead.
% (self.__class__.__name__, self.__class__.__name__), Warning)
finish batch shuffle
W0801 21:22:58.187352 37850 device_context.cc:259] Please NOTE: device: 0, CUDA Capability: 61, Driver API Version: 9.2, Runtime API Version: 9.0
W0801 21:22:58.192481 37850 device_context.cc:267] device: 0, cuDNN Version: 7.3.
W0801 21:22:59.779482 37850 dynamic_loader.cc:140] Failed to find dynamic library: /paddle/build/third_party/install/warpctc/lib/libwarpctc.so (dlopen: cannot load any more object with static TLS)
W0801 21:22:59.779705 37850 dynamic_loader.cc:109] Can not find library: libwarpctc.so. The process maybe hang. Please try to add the lib path to LD_LIBRARY_PATH.
Traceback (most recent call last):
File "train.py", line 222, in <module>
main()
File "train.py", line 218, in main
train(args)
File "train.py", line 151, in train
results = train_one_batch(data)
File "train.py", line 112, in train_one_batch
fetch_list=fetch_vars)
File "/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 651, in run
use_program_cache=use_program_cache)
File "/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/executor.py", line 749, in run
exe.run(program.desc, scope, 0, True, True, fetch_var_name)
paddle.fluid.core_avx.EnforceNotMet: Invoke operator warpctc error.
Python Callstacks:
File "/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/framework.py", line 1771, in append_op
attrs=kwargs.get("attrs", None))
File "/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layer_helper.py", line 43, in append_op
return self.main_program.current_block().append_op(*args, **kwargs)
File "/home/work/software/anaconda2/envs/paddle/lib/python3.7/site-packages/paddle/fluid/layers/nn.py", line 5573, in warpctc
'use_cudnn': use_cudnn
File "/home/zhaoyanmei/models/PaddleCV/ocr_recognition/crnn_ctc_model.py", line 189, in ctc_train_net
input=fc_out, label=label, blank=num_classes, norm_by_times=True)
File "train.py", line 61, in train
args, data_shape, num_classes)
File "train.py", line 218, in main
train(args)
File "train.py", line 222, in <module>
main()
C++ Callstacks:
Failed to find dynamic library: libwarpctc.so ( dlopen: cannot load any more object with static TLS )
Please specify its path correctly using following ways:
Method. set environment variable LD_LIBRARY_PATH on Linux or DYLD_LIBRARY_PATH on Mac OS.
For instance, issue command: export LD_LIBRARY_PATH=...
Note: After Mac OS 10.11, using the DYLD_LIBRARY_PATH is impossible unless System Integrity Protection (SIP) is disabled. at [/paddle/paddle/fluid/platform/dynload/dynamic_loader.cc:166]
PaddlePaddle Call Stacks:
0 0x7fe93ff05830p void paddle::platform::EnforceNotMet::Init<char const*>(char const*, char const*, int) + 352
1 0x7fe93ff05ba9p paddle::platform::EnforceNotMet::EnforceNotMet(std::__exception_ptr::exception_ptr, char const*, int) + 137
2 0x7fe941f09f9bp paddle::platform::dynload::GetWarpCTCDsoHandle() + 1835
3 0x7fe940177be9p void std::__once_call_impl<std::_Bind_simple<paddle::platform::dynload::DynLoad__get_warpctc_version::operator()<>()::{lambda()#1} ()> >() + 9
4 0x7fe9b196fbe0p pthread_once + 80
5 0x7fe9401809b8p paddle::operators::WarpCTCFunctor<paddle::platform::CUDADeviceContext>::operator()(paddle::framework::ExecutionContext const&, float const*, float*, int const*, int const*, int const*, unsigned long, unsigned long, unsigned long, float*) + 136
6 0x7fe940183206p paddle::operators::WarpCTCKernel<paddle::platform::CUDADeviceContext, float>::Compute(paddle::framework::ExecutionContext const&) const + 2390
7 0x7fe940184ab3p std::_Function_handler<void (paddle::framework::ExecutionContext const&), paddle::framework::OpKernelRegistrarFunctor<paddle::platform::CUDAPlace, false, 0ul, paddle::operators::WarpCTCKernel<paddle::platform::CUDADeviceContext, float> >::operator()(char const*, char const*, int) const::{lambda(paddle::framework::ExecutionContext const&)#1}>::_M_invoke(std::_Any_data const&, paddle::framework::ExecutionContext const&) + 35
8 0x7fe941e6bf07p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&, paddle::framework::RuntimeContext*) const + 375
9 0x7fe941e6c2e1p paddle::framework::OperatorWithKernel::RunImpl(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) const + 529
10 0x7fe941e698dcp paddle::framework::OperatorBase::Run(paddle::framework::Scope const&, boost::variant<paddle::platform::CUDAPlace, paddle::platform::CPUPlace, paddle::platform::CUDAPinnedPlace, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_, boost::detail::variant::void_> const&) + 332
11 0x7fe94009061ep paddle::framework::Executor::RunPreparedContext(paddle::framework::ExecutorPrepareContext*, paddle::framework::Scope*, bool, bool, bool) + 382
12 0x7fe9400936bfp paddle::framework::Executor::Run(paddle::framework::ProgramDesc const&, paddle::framework::Scope*, int, bool, bool, std::vector<std::string, std::allocator<std::string> > const&, bool) + 143
13 0x7fe93fef6ebdp
14 0x7fe93ff38166p
15 0x7fe9b1f1b6e4p _PyMethodDef_RawFastCallKeywords + 612
16 0x7fe9b1f1b801p _PyCFunction_FastCallKeywords + 33
17 0x7fe9b1f777aep _PyEval_EvalFrameDefault + 21374
18 0x7fe9b1eb84f9p _PyEval_EvalCodeWithName + 761
19 0x7fe9b1f1aa27p _PyFunction_FastCallKeywords + 903
20 0x7fe9b1f738fep _PyEval_EvalFrameDefault + 5326
21 0x7fe9b1eb84f9p _PyEval_EvalCodeWithName + 761
22 0x7fe9b1f1aa27p _PyFunction_FastCallKeywords + 903
23 0x7fe9b1f738fep _PyEval_EvalFrameDefault + 5326
24 0x7fe9b1eb8db9p _PyEval_EvalCodeWithName + 3001
25 0x7fe9b1f1aa27p _PyFunction_FastCallKeywords + 903
26 0x7fe9b1f72846p _PyEval_EvalFrameDefault + 1046
27 0x7fe9b1eb8db9p _PyEval_EvalCodeWithName + 3001
28 0x7fe9b1f1aa27p _PyFunction_FastCallKeywords + 903
29 0x7fe9b1f72846p _PyEval_EvalFrameDefault + 1046
30 0x7fe9b1f1a79bp _PyFunction_FastCallKeywords + 251
31 0x7fe9b1f72846p _PyEval_EvalFrameDefault + 1046
32 0x7fe9b1eb84f9p _PyEval_EvalCodeWithName + 761
33 0x7fe9b1eb93c4p PyEval_EvalCodeEx + 68
34 0x7fe9b1eb93ecp PyEval_EvalCode + 28
35 0x7fe9b1fd1874p
36 0x7fe9b1fdbb81p PyRun_FileExFlags + 161
37 0x7fe9b1fdbd73p PyRun_SimpleFileExFlags + 451
38 0x7fe9b1fdce5fp
39 0x7fe9b1fdcf7cp _Py_UnixMain + 60
40 0x7fe9b15c3b45p __libc_start_main + 245
41 0x7fe9b1f82122p

(paddle) [ocr_recognition]#
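
The log above suggests adding the library path to LD_LIBRARY_PATH, but the dlopen message ("cannot load any more object with static TLS") is usually reported on older glibc versions when the process has run out of room for libraries that use static thread-local storage, not when the file is missing. A minimal sketch for telling the two cases apart is below. It is not a confirmed fix; the helper names `try_load` and `find_in_ld_library_path` are hypothetical, and only the library name is taken from the log.

```python
import ctypes
import os


def try_load(lib_path):
    """Try to dlopen lib_path and return (ok, message).

    On failure the message is the loader's own error text, so a
    'static TLS' failure can be told apart from a missing file.
    """
    try:
        ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
        return True, "loaded"
    except OSError as exc:
        return False, str(exc)


def find_in_ld_library_path(name):
    """Return the LD_LIBRARY_PATH directories that contain a file called name."""
    dirs = os.environ.get("LD_LIBRARY_PATH", "").split(os.pathsep)
    return [d for d in dirs if d and os.path.isfile(os.path.join(d, name))]


if __name__ == "__main__":
    # Library name taken from the log; run this in a fresh interpreter,
    # before importing paddle or any other large native library.
    name = "libwarpctc.so"
    print("directories containing it:", find_in_ld_library_path(name))
    print("load attempt:", try_load(name))
```

If the library loads in a fresh interpreter but fails inside train.py, the path is probably fine and the order in which native libraries are loaded is the more likely cause. Loading the library earlier in the process is a commonly reported workaround, but that is an assumption here and not something this issue confirms.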


Can anyone help me? Thank you in advance!

endy-see closed this as completed Aug 2, 2019

endy-see commented Aug 2, 2019

I am sorry~
