Zeroed result returned by an elementwise function #1264

Open
antonwolfy opened this issue Jun 26, 2023 · 4 comments

@antonwolfy
Collaborator

antonwolfy commented Jun 26, 2023

Sometimes the small test below

import dpctl, dpctl.tensor as dpt
import numpy


list_of_backend_str = [
    "host",
    "level_zero",
    "opencl",
]

list_of_device_type_str = [
    "host",
    "gpu",
    "cpu",
]

available_devices = [
    d for d in dpctl.get_devices() if not getattr(d, "has_aspect_host", False)
]

valid_devices = []
for device in available_devices:
    if device.default_selector_score < 0:
        pass
    elif device.backend.name not in list_of_backend_str:
        pass
    elif device.device_type.name not in list_of_device_type_str:
        pass
    else:
        valid_devices.append(device)


num = 100
for iter in range(num):
    for device_x in valid_devices:
        for device_y in valid_devices:
            arr_seq = [1, 2, 3, 4]

            x_orig = numpy.array(arr_seq)
            y_orig = (4 - x_orig) / 4

            x = dpt.asarray(arr_seq, device=device_x)
            _x = dpt.asarray(x, device=device_y)
            y = dpt.divide(dpt.subtract(4, _x), 4)

            # a filler step, sometimes required to reproduce the issue
            z = dpt.arange(0, stop=4, step=1, sycl_queue=y.sycl_queue)

            numpy.testing.assert_allclose(y_orig, dpt.asnumpy(y))

print("--------------------> Done <--------------------")

fails because dpctl.tensor returns an array of zeros, like:

$ python dpctl_repr.py
Traceback (most recent call last):
  File "/localdisk/work/antonvol/code/dpnp_dev/dpnp/dpctl_repr.py", line 49, in <module>
    numpy.testing.assert_allclose(y_orig, dpt.asnumpy(y))
  File "/localdisk/work/antonvol/soft/miniconda3/envs/dpnp_py39_ext/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 1592, in assert_allclose
    assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
  File "/localdisk/work/antonvol/soft/miniconda3/envs/dpnp_py39_ext/lib/python3.9/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/localdisk/work/antonvol/soft/miniconda3/envs/dpnp_py39_ext/lib/python3.9/site-packages/numpy/testing/_private/utils.py", line 862, in assert_array_compare
    raise AssertionError(msg)
AssertionError:
Not equal to tolerance rtol=1e-07, atol=0

Mismatched elements: 3 / 4 (75%)
Max absolute difference: 0.75
Max relative difference: inf
 x: array([0.75, 0.5 , 0.25, 0.  ])
 y: array([0., 0., 0., 0.])
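
The figures in the assertion message can be checked by hand. Below is a minimal sketch in plain Python (no dpctl or NumPy required), assuming the device returned all zeros exactly as printed above; the variable names `expected`, `actual`, `mismatched` and `max_abs_diff` are illustrative only.

```python
# Expected values, computed the same way as y_orig in the test: (4 - x) / 4
arr_seq = [1, 2, 3, 4]
expected = [(4 - x) / 4 for x in arr_seq]

# What the device returned in the failing run, per the assertion message
actual = [0.0, 0.0, 0.0, 0.0]

# Count the elements that differ and find the largest absolute difference
mismatched = sum(1 for e, a in zip(expected, actual) if e != a)
max_abs_diff = max(abs(e - a) for e, a in zip(expected, actual))

print(expected)      # [0.75, 0.5, 0.25, 0.0]
print(mismatched)    # 3
print(max_abs_diff)  # 0.75
```

The last element matches only because its expected value is itself 0, which is why the report shows 3 of 4 mismatches rather than 4 of 4. The relative difference is reported as inf because the zeroed array is passed as the second (reference) argument of assert_allclose.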

where the list of devices and versions is

$ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz 3.0 [2023.16.6.0.22_223734]
[opencl:gpu:1] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 3.0 [23.05.25593.11]
[opencl:acc:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2023.16.6.0.22_223734]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics [0x9bca] 1.3 [1.3.23726]

$ conda list dpctl
# packages in environment at /localdisk/work/antonvol/soft/miniconda3/envs/dpnp_py39_ext:
#
# Name                    Version                   Build  Channel
dpctl                     0.14.4           py39h7bf5fec_8    dppy/label/dev

The issue originally came from the dpnp verification scope, which spontaneously failed in internal CI.

On a laptop with Iris Xe, the test may instead crash or abort, like:

$ python dpctl_repr.py
malloc_consolidate(): invalid chunk size
Aborted

$ python dpctl_repr.py
Segmentation fault

The gdb output for the segmentation fault:


Thread 1 "python" received signal SIGSEGV, Segmentation fault.
0x00007ffff7cd9d8b in _int_malloc (av=av@entry=0x7ffff7e2eb80 <main_arena>, bytes=bytes@entry=24) at malloc.c:3608
3608    malloc.c: No such file or directory.
(gdb) bt
#0  0x00007ffff7cd9d8b in _int_malloc (av=av@entry=0x7ffff7e2eb80 <main_arena>, bytes=bytes@entry=24) at malloc.c:3608
#1  0x00007ffff7cdc299 in __GI___libc_malloc (bytes=24) at malloc.c:3066
#2  0x00007ffff7271a40 in operator new (sz=24) at ../../../../libstdc++-v3/libsupc++/new_op.cc:50
#3  0x00007fffce0bff3a in ?? () from /usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
#4  0x00007fffce09a20f in ?? () from /usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
#5  0x00007fffcdf2c185 in ?? () from /usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
#6  0x00007fffcde8954d in ?? () from /usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
#7  0x00007fffcda8f81a in ?? () from /usr/lib/x86_64-linux-gnu/intel-opencl/libigdrcl.so
#8  0x00007ffff758d75a in _pi_result sycl::_V1::detail::plugin::call_nocheck<(sycl::_V1::detail::PiApiKind)63, _pi_event*, _pi_event_info, unsigned long, _pi_event_status*, decltype(nullptr)>(_pi_event*, _pi_event_info, unsigned long, _pi_event_status*, decltype(nullptr)) const ()
   from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#9  0x00007ffff758a1e5 in sycl::_V1::detail::event_impl::flushIfNeeded(std::shared_ptr<sycl::_V1::detail::queue_impl> const&) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#10 0x00007ffff762b5e5 in sycl::_V1::detail::ExecCGCommand::enqueueImp() () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#11 0x00007ffff76182d7 in sycl::_V1::detail::Command::enqueue(sycl::_V1::detail::EnqueueResultT&, sycl::_V1::detail::BlockingT, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&) ()
   from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#12 0x00007ffff7642dc9 in sycl::_V1::detail::Scheduler::GraphProcessor::enqueueCommand(sycl::_V1::detail::Command*, std::shared_lock<std::shared_timed_mutex>&, sycl::_V1::detail::EnqueueResultT&, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::Command*, sycl::_V1::detail::BlockingT) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#13 0x00007ffff763c474 in sycl::_V1::detail::Scheduler::enqueueCommandForCG(std::shared_ptr<sycl::_V1::detail::event_impl>, std::vector<sycl::_V1::detail::Command*, std::allocator<sycl::_V1::detail::Command*> >&, sycl::_V1::detail::BlockingT) ()
   from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#14 0x00007ffff763bcda in sycl::_V1::detail::Scheduler::addCG(std::unique_ptr<sycl::_V1::detail::CG, std::default_delete<sycl::_V1::detail::CG> >, std::shared_ptr<sycl::_V1::detail::queue_impl> const&) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#15 0x00007ffff7679b0b in sycl::_V1::handler::finalize() () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#16 0x00007ffff76ab35e in void sycl::_V1::detail::queue_impl::finalizeHandler<sycl::_V1::handler>(sycl::_V1::handler&, sycl::_V1::detail::CG::CGTYPE const&, sycl::_V1::event&) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#17 0x00007ffff76aafdc in sycl::_V1::detail::queue_impl::submit_impl(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#18 0x00007ffff76aa406 in sycl::_V1::detail::queue_impl::submit(std::function<void (sycl::_V1::handler&)> const&, std::shared_ptr<sycl::_V1::detail::queue_impl> const&, sycl::_V1::detail::code_location const&, std::function<void (bool, bool, sycl::_V1::event&)> const*) ()
   from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#19 0x00007ffff76aa3c5 in sycl::_V1::queue::submit_impl(std::function<void (sycl::_V1::handler&)>, sycl::_V1::detail::code_location const&) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#20 0x00007fffe1b719ae in ?? () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/tensor/_tensor_impl.cpython-39-x86_64-linux-gnu.so
#21 0x00007fffe1b6b3ce in ?? () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/tensor/_tensor_impl.cpython-39-x86_64-linux-gnu.so
#22 0x00007fffe0e5f5f7 in ?? () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/tensor/_tensor_impl.cpython-39-x86_64-linux-gnu.so
#23 0x00005555556b8f76 in cfunction_call (func=0x7fffe0a8c540, args=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.9.16/Objects/methodobject.c:543
#24 0x00005555556a055c in _PyObject_MakeTpCall (tstate=0x55555593d6e0, callable=0x7fffe0a8c540, args=<optimized out>, nargs=<optimized out>, keywords=0x7fffe0a46770) at /usr/local/src/conda/python-3.9.16/Objects/typeobject.c:3876
#25 0x000055555569cb76 in _PyObject_VectorcallTstate (kwnames=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, nargsf=<optimized out>, args=<optimized out>, callable=0x7fffe0a8c540, tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>)
    at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:116
#26 _PyObject_VectorcallTstate (kwnames=0x7fffe0a46770, nargsf=<optimized out>, args=<optimized out>, callable=0x7fffe0a8c540, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:103
#27 PyObject_Vectorcall (kwnames=0x7fffe0a46770, nargsf=<optimized out>, args=<optimized out>, callable=0x7fffe0a8c540) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:127
#28 call_function (kwnames=0x7fffe0a46770, oparg=<optimized out>, pp_stack=<synthetic pointer>, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:5078
#29 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x5555570caa50, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:3538
#30 0x0000555555696b0e in _PyEval_EvalFrame (throwflag=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, f=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /usr/local/src/conda/python-3.9.16/Include/internal/pycore_ceval.h:40
#31 _PyEval_EvalCode (tstate=0x55555593d6e0, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>, kwcount=<optimized out>, kwstep=1, defs=<optimized out>, defcount=<optimized out>, kwdefs=0x0,
    closure=0x0, name=0x7ffff78e71f0, qualname=0x7fffe0a46800) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4330
#32 0x000055555569fcd2 in _PyFunction_Vectorcall (kwnames=0x0, nargsf=<optimized out>, stack=<optimized out>, func=0x7fffe08dc790) at /usr/local/src/conda/python-3.9.16/Objects/call.c:396
#33 _PyObject_FastCallDictTstate (tstate=0x55555593d6e0, callable=0x7fffe08dc790, args=<optimized out>, nargsf=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.9.16/Objects/call.c:118
#34 0x00005555556b47e9 in _PyObject_Call_Prepend (tstate=0x55555593d6e0, callable=0x7fffe08dc790, obj=0x7fffe08db2b0, args=<optimized out>, kwargs=0x0) at /usr/local/src/conda/python-3.9.16/Objects/call.c:489
#35 0x000055555578d129 in slot_tp_call (self=0x7fffe08db2b0, args=0x7fff78065940, kwds=0x0) at /usr/local/src/conda/python-3.9.16/Objects/typeobject.c:6731
#36 0x00005555556a055c in _PyObject_MakeTpCall (tstate=0x55555593d6e0, callable=0x7fffe08db2b0, args=<optimized out>, nargs=<optimized out>, keywords=0x0) at /usr/local/src/conda/python-3.9.16/Objects/typeobject.c:3876
#37 0x000055555569c9cb in _PyObject_VectorcallTstate (kwnames=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, nargsf=<optimized out>, args=<optimized out>, callable=0x7fffe08db2b0, tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>)
    at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:116
#38 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x55555599af20, callable=0x7fffe08db2b0, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:103
#39 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x55555599af20, callable=0x7fffe08db2b0) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:127
#40 call_function (kwnames=0x0, oparg=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, pp_stack=<synthetic pointer>, tstate=0x55555593d6e0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:5078
#41 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x55555599ad80, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:3490
#42 0x0000555555696b0e in _PyEval_EvalFrame (throwflag=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, f=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /usr/local/src/conda/python-3.9.16/Include/internal/pycore_ceval.h:40
#43 _PyEval_EvalCode (tstate=0x55555593d6e0, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>, kwcount=<optimized out>, kwstep=2, defs=<optimized out>, defcount=<optimized out>, kwdefs=0x0,
    closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4330
#44 0x0000555555696759 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0)
    at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4362
#45 0x00005555556966d5 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4378
#46 0x000055555574dbcb in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:828
#47 0x0000555555781510 in run_eval_code_obj (tstate=0x55555593d6e0, co=0x7ffff77ec660, globals=0x7ffff7855d80, locals=0x7ffff7855d80) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1221
#48 0x000055555577d025 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7ffff7855d80, locals=0x7ffff7855d80, flags=<optimized out>, arena=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1242
#49 0x00005555555ede2b in pyrun_file (fp=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, filename=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, start=<optimized out>,
    globals=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, locals=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, closeit=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, flags=0x7fffffffd9f8)
    at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1140
#50 0x00005555557763d1 in pyrun_simple_file (flags=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, closeit=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, filename=0x7ffff7798390, fp=0x55555593a340)
    at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:450
#51 PyRun_SimpleFileExFlags (fp=0x55555593a340, filename=<optimized out>, closeit=1, flags=0x7fffffffd9f8) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:483
#52 0x000055555577309d in pymain_run_file (cf=0x7fffffffd9f8, config=0x55555593e250) at /usr/local/src/conda/python-3.9.16/Modules/main.c:380
#53 pymain_run_python (exitcode=0x7fffffffd9f0) at /usr/local/src/conda/python-3.9.16/Modules/main.c:605
#54 Py_RunMain () at /usr/local/src/conda/python-3.9.16/Modules/main.c:684
#55 0x0000555555740287 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.9.16/Modules/main.c:1104
#56 0x00007ffff7c66083 in __libc_start_main (main=0x555555740220 <main>, argc=2, argv=0x7fffffffdc08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdbf8) at ../csu/libc-start.c:308
#57 0x000055555574016d in _start () at /usr/local/src/conda/python-3.9.16/Include/object.h:422

and another attempt under gdb:

malloc_consolidate(): invalid chunk size

Thread 1 "python" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50      ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7c64859 in __GI_abort () at abort.c:79
#2  0x00007ffff7ccf26e in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7ffff7df9298 "%s\n") at ../sysdeps/posix/libc_fatal.c:155
#3  0x00007ffff7cd72fc in malloc_printerr (str=str@entry=0x7ffff7dfb278 "malloc_consolidate(): invalid chunk size") at malloc.c:5347
#4  0x00007ffff7cd7ad8 in malloc_consolidate (av=av@entry=0x7ffff7e2eb80 <main_arena>) at malloc.c:4477
#5  0x00007ffff7cd9c83 in _int_malloc (av=av@entry=0x7ffff7e2eb80 <main_arena>, bytes=bytes@entry=1321) at malloc.c:3699
#6  0x00007ffff7cdc299 in __GI___libc_malloc (bytes=1321) at malloc.c:3066
#7  0x00007ffff7271a40 in operator new (sz=1321) at ../../../../libstdc++-v3/libsupc++/new_op.cc:50
#8  0x00007ffff7575f24 in sycl::_V1::detail::device_impl::has_extension(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) const () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#9  0x00007ffff757869d in sycl::_V1::detail::device_impl::has(sycl::_V1::aspect) const () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/../../../libsycl.so.6
#10 0x00007ffff76cda3e in DPCTLDevice_HasAspect () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/libDPCTLSyclInterface.so.0
#11 0x00007ffff771b76d in __pyx_getprop_5dpctl_12_sycl_device_10SyclDevice_has_aspect_fp16(_object*, void*) () from /home/xantvol/miniconda3/envs/dpnp_dev/lib/python3.9/site-packages/dpctl/_sycl_device.cpython-39-x86_64-linux-gnu.so
#12 0x00005555556a6e63 in _PyObject_GenericGetAttrWithDict (obj=0x7fff53d64e30, name=0x7ffff776acf0, dict=0x0, suppress=0) at /usr/local/src/conda/python-3.9.16/Objects/object.c:1201
#13 0x000055555569818b in _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x5555570c9360, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:2997
#14 0x00005555556a8da2 in _PyEval_EvalFrame (throwflag=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, f=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /usr/local/src/conda/python-3.9.16/Include/internal/pycore_ceval.h:40
#15 function_code_fastcall (tstate=0x55555593d6e0, co=0x7fffe0a3dc90, args=<optimized out>, nargs=4, globals=0x7fffe0a3bd80) at /usr/local/src/conda/python-3.9.16/Objects/call.c:330
#16 0x0000555555697e12 in _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x5555570c9300, callable=0x7fffe0a41310, tstate=0x55555593d6e0) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:118
#17 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x5555570c9300, callable=0x7fffe0a41310) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:127
#18 call_function (kwnames=0x0, oparg=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, pp_stack=<synthetic pointer>, tstate=0x55555593d6e0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:5078
#19 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x5555570c9090, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:3521
#20 0x0000555555696b0e in _PyEval_EvalFrame (throwflag=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, f=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /usr/local/src/conda/python-3.9.16/Include/internal/pycore_ceval.h:40
#21 _PyEval_EvalCode (tstate=0x55555593d6e0, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>, kwcount=<optimized out>, kwstep=1, defs=<optimized out>, defcount=<optimized out>, kwdefs=0x0,
    closure=0x0, name=0x7ffff78e71f0, qualname=0x7fffe0a47710) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4330
#22 0x000055555569fcd2 in _PyFunction_Vectorcall (kwnames=0x0, nargsf=<optimized out>, stack=<optimized out>, func=0x7fffe08dc790) at /usr/local/src/conda/python-3.9.16/Objects/call.c:396
#23 _PyObject_FastCallDictTstate (tstate=0x55555593d6e0, callable=0x7fffe08dc790, args=<optimized out>, nargsf=<optimized out>, kwargs=<optimized out>) at /usr/local/src/conda/python-3.9.16/Objects/call.c:118
#24 0x00005555556b47e9 in _PyObject_Call_Prepend (tstate=0x55555593d6e0, callable=0x7fffe08dc790, obj=0x7fffe0a44640, args=<optimized out>, kwargs=0x0) at /usr/local/src/conda/python-3.9.16/Objects/call.c:489
#25 0x000055555578d129 in slot_tp_call (self=0x7fffe0a44640, args=0x7fffe08de3c0, kwds=0x0) at /usr/local/src/conda/python-3.9.16/Objects/typeobject.c:6731
#26 0x00005555556a055c in _PyObject_MakeTpCall (tstate=0x55555593d6e0, callable=0x7fffe0a44640, args=<optimized out>, nargs=<optimized out>, keywords=0x0) at /usr/local/src/conda/python-3.9.16/Objects/typeobject.c:3876
#27 0x000055555569c9cb in _PyObject_VectorcallTstate (kwnames=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, nargsf=<optimized out>, args=<optimized out>, callable=0x7fffe0a44640, tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>)
    at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:116
#28 _PyObject_VectorcallTstate (kwnames=0x0, nargsf=<optimized out>, args=0x55555599af10, callable=0x7fffe0a44640, tstate=<optimized out>) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:103
#29 PyObject_Vectorcall (kwnames=0x0, nargsf=<optimized out>, args=0x55555599af10, callable=0x7fffe0a44640) at /usr/local/src/conda/python-3.9.16/Include/cpython/abstract.h:127
#30 call_function (kwnames=0x0, oparg=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, pp_stack=<synthetic pointer>, tstate=0x55555593d6e0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:5078
#31 _PyEval_EvalFrameDefault (tstate=<optimized out>, f=0x55555599ad80, throwflag=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:3490
#32 0x0000555555696b0e in _PyEval_EvalFrame (throwflag=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, f=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>,
    tstate=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>) at /usr/local/src/conda/python-3.9.16/Include/internal/pycore_ceval.h:40
#33 _PyEval_EvalCode (tstate=0x55555593d6e0, _co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=<optimized out>, kwcount=<optimized out>, kwstep=2, defs=<optimized out>, defcount=<optimized out>, kwdefs=0x0,
    closure=0x0, name=0x0, qualname=0x0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4330
#34 0x0000555555696759 in _PyEval_EvalCodeWithName (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kwnames=<optimized out>, kwargs=0x0, kwcount=0, kwstep=2, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0, name=0x0, qualname=0x0)
    at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4362
#35 0x00005555556966d5 in PyEval_EvalCodeEx (_co=<optimized out>, globals=<optimized out>, locals=<optimized out>, args=<optimized out>, argcount=<optimized out>, kws=<optimized out>, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0, closure=0x0) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:4378
#36 0x000055555574dbcb in PyEval_EvalCode (co=<optimized out>, globals=<optimized out>, locals=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/ceval.c:828
#37 0x0000555555781510 in run_eval_code_obj (tstate=0x55555593d6e0, co=0x7ffff77ec660, globals=0x7ffff7855d80, locals=0x7ffff7855d80) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1221
#38 0x000055555577d025 in run_mod (mod=<optimized out>, filename=<optimized out>, globals=0x7ffff7855d80, locals=0x7ffff7855d80, flags=<optimized out>, arena=<optimized out>) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1242
#39 0x00005555555ede2b in pyrun_file (fp=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, filename=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, start=<optimized out>,
    globals=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, locals=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, closeit=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, flags=0x7fffffffd9f8)
    at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:1140
#40 0x00005555557763d1 in pyrun_simple_file (flags=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, closeit=<error reading variable: dwarf2_find_location_expression: Corrupted DWARF expression.>, filename=0x7ffff7798390, fp=0x55555593a340)
    at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:450
#41 PyRun_SimpleFileExFlags (fp=0x55555593a340, filename=<optimized out>, closeit=1, flags=0x7fffffffd9f8) at /usr/local/src/conda/python-3.9.16/Python/pythonrun.c:483
#42 0x000055555577309d in pymain_run_file (cf=0x7fffffffd9f8, config=0x55555593e250) at /usr/local/src/conda/python-3.9.16/Modules/main.c:380
#43 pymain_run_python (exitcode=0x7fffffffd9f0) at /usr/local/src/conda/python-3.9.16/Modules/main.c:605
#44 Py_RunMain () at /usr/local/src/conda/python-3.9.16/Modules/main.c:684
#45 0x0000555555740287 in Py_BytesMain (argc=<optimized out>, argv=<optimized out>) at /usr/local/src/conda/python-3.9.16/Modules/main.c:1104
#46 0x00007ffff7c66083 in __libc_start_main (main=0x555555740220 <main>, argc=2, argv=0x7fffffffdc08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7fffffffdbf8) at ../csu/libc-start.c:308
#47 0x000055555574016d in _start () at /usr/local/src/conda/python-3.9.16/Include/object.h:422

where the list of devices and versions on the laptop is

$ sycl-ls
[opencl:cpu:0] Intel(R) OpenCL, 11th Gen Intel(R) Core(TM) i7-1185G7 @ 3.00GHz 3.0 [2023.16.6.0.06_160000]
[opencl:gpu:1] Intel(R) OpenCL HD Graphics, Intel(R) Graphics [0x9a49] 3.0 [22.28.23726.1]
[opencl:acc:2] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2023.16.6.0.06_160000]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) Graphics [0x9a49] 1.3 [1.3.23726]

$ conda list dpctl
# packages in environment at /home/xantvol/miniconda3/envs/dpnp_dev:
#
# Name                    Version                   Build  Channel
dpctl                     0.14.4           py39h7bf5fec_9    dppy/label/dev
@oleksandr-pavlyk
Collaborator

This sounds like a problem with a driver. I tried on Gen9 with

$ sycl-ls
[opencl:acc:0] Intel(R) FPGA Emulation Platform for OpenCL(TM), Intel(R) FPGA Emulation Device 1.2 [2023.16.6.0.22_223734]
[opencl:cpu:1] Intel(R) OpenCL, Intel(R) Core(TM) i7-10710U CPU @ 1.10GHz 3.0 [2023.16.6.0.22_223734]
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 3.0 [23.22.26516.18]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 1.3 [1.3.26516]

and the script passed each time I tried.
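
To compare the failing and passing Gen9 setups, it may help to extract the driver version (the trailing bracketed field) from each sycl-ls line. The following is a minimal sketch that assumes the line layout shown in this thread; `parse_sycl_ls` and `gpu_drivers` are hypothetical helpers, not part of dpctl or the oneAPI tooling.

```python
import re

# Matches lines such as:
# [opencl:gpu:1] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 3.0 [23.05.25593.11]
LINE_RE = re.compile(
    r"^\[(?P<backend>\w+):(?P<type>\w+):(?P<index>\d+)\]\s+.*\s+\[(?P<driver>[^\]]+)\]$"
)


def parse_sycl_ls(text):
    """Return a list of (backend, device_type, driver_version) tuples."""
    devices = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            devices.append((m.group("backend"), m.group("type"), m.group("driver")))
    return devices


def gpu_drivers(text):
    """Map (backend, 'gpu') to the driver version; assumes one GPU per backend."""
    return {(b, t): d for b, t, d in parse_sycl_ls(text) if t == "gpu"}


# GPU lines copied from the two sycl-ls listings in this thread
failing = """\
[opencl:gpu:1] Intel(R) OpenCL HD Graphics, Intel(R) UHD Graphics 3.0 [23.05.25593.11]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics [0x9bca] 1.3 [1.3.23726]
"""
passing = """\
[opencl:gpu:2] Intel(R) OpenCL Graphics, Intel(R) UHD Graphics 3.0 [23.22.26516.18]
[ext_oneapi_level_zero:gpu:0] Intel(R) Level-Zero, Intel(R) UHD Graphics 1.3 [1.3.26516]
"""

failing_drivers = gpu_drivers(failing)
passing_drivers = gpu_drivers(passing)
for key in failing_drivers:
    print(key, failing_drivers[key], "->", passing_drivers.get(key))
```

The device index is dropped from the key on purpose, since the same GPU appears as gpu:1 on one machine and gpu:2 on the other.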

The script does run into trouble on WSL with Iris Xe, where the driver behavior is a suspect.

@oleksandr-pavlyk
Collaborator

The issue @antonwolfy originally encountered with Gen9 on Linux was fixed by a Linux kernel update that picked up the most recent i915 kernel-mode driver for the integrated GPU.

The issue with WSL is also likely due to the driver, but more work is needed to pare it down to an actionable bug report to be filed against GPU stack projects.
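
When attaching environment details to such a report, the kernel release and the presence of the i915 module can be recorded along with the sycl-ls output. This is a minimal sketch, assuming a Linux system where a loaded module appears under /sys/module; `kernel_and_i915_info` is a hypothetical helper name.

```python
import platform
from pathlib import Path


def kernel_and_i915_info():
    """Collect the kernel release and whether the i915 module is loaded."""
    return {
        # Kernel release string, e.g. what `uname -r` prints
        "kernel_release": platform.release(),
        # On Linux, a loaded kernel module has a directory under /sys/module
        "i915_loaded": Path("/sys/module/i915").is_dir(),
    }


info = kernel_and_i915_info()
print(info)
```

On WSL or non-Linux systems the i915 entry is expected to be False, so the value only says whether the native kernel-mode driver is in use.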

@oleksandr-pavlyk
Collaborator

@antonwolfy is this issue still relevant?

@antonwolfy
Collaborator Author

antonwolfy commented Apr 17, 2024

@oleksandr-pavlyk, the issue is no longer visible with a newer driver. Please feel free to close the issue.
