[BUG] Vulkan Decoder crashes on Windows #239

Closed
Sn0wCrack opened this issue Jan 31, 2024 · 15 comments

@Sn0wCrack
Contributor

Describe the bug

When vulkan is selected as the decoder, the stream begins but crashes shortly after starting.

There's a brief flash of the stream, then my system hitches and freezes for a moment. No audio appears to play before the crash.

I'm connecting to a standard PS5 using the following stream settings:

(screenshots of the stream settings)

Debug Log
Please attach a log with verbose logging enabled.

chiaki_session_2024-01-31_19-35-21-223223.log

This log is trimmed a bit to remove the session handshake.

It seems to crash after the first libplacebo pass from what I can tell, though I'm not entirely familiar with how libplacebo actually works.

To Reproduce

Set the decoder to vulkan on Windows 11.

This may be specific to NVIDIA GPUs, but I'm unsure as I don't have access to an AMD or Intel dedicated GPU to test this.

Expected behavior

No crash, or the option to not select Vulkan on Windows if it does not work at all on any dedicated GPU.

Screenshots
If applicable, add screenshots to help explain your problem.

Desktop (please complete the following information):

  • OS: Windows
  • Version 11
  • Chiaki4deck Version: 1.6.2
  • CPU: Ryzen 7900X3D
  • GPU: NVIDIA RTX 3080, Driver Version 551.23
@nowrep
Contributor

nowrep commented Feb 1, 2024

Does this also happen with 720p? Note there is a bug in libplacebo that makes 1080p render incorrectly (padding at bottom) and 10bit (hdr) as full green (it's patched in Linux builds, but not Windows).

@Sn0wCrack
Contributor Author

I've tried Vulkan with H265 (HDR), H265, H264 at 720p and all crash.

As mentioned on my PR adding CUDA and QSV (#238), with cuda set as the value in the registry, everything seems to run without issue during the streaming session.

I've tested with d3d11va and that appears to work fine at 1080p or 720p using H265 or H264 as well.

@Sn0wCrack
Contributor Author

Just to note, I've also gone back and tried vulkan on a Chiaki build based on the upstream 2.2.0 and it also crashes there.

@nowrep
Contributor

nowrep commented Feb 1, 2024

Seems like nvidia driver bug then.

@Sn0wCrack
Contributor Author

This might be the case.

I spun up a debug build of chiaki4deck and reproduced the crash; it seems to be a segmentation fault in avcodec-60.dll.

Interestingly, Vulkan GPU decoding does work when using ffmpeg 6.1 directly or the latest mpv-git Windows build.

@nowrep
Contributor

nowrep commented Feb 5, 2024

Please post backtrace, with ffmpeg debug build. If it crashes in avcodec then it also may be ffmpeg bug.

@streetpea
Owner

streetpea commented Feb 6, 2024

Maybe related to this https://trac.ffmpeg.org/ticket/9171 ... also vulkan decoding seems to work fine on my amd gpu using Windows so maybe specific to nvidia

@Sn0wCrack
Contributor Author

Sn0wCrack commented Feb 6, 2024

I had thought that might have been related, as a similar thing was previously reported for mpv when I was looking for segmentation faults on Windows with ffmpeg.

Interestingly again, using filters in mpv doesn't cause a crash, however the filters don't actually work, so I imagine they may be handling the crash or filters in some fallback. However others have reported it working as intended on the latest NVIDIA drivers.


Finally got a debug build of ffmpeg n6.1.1 that matches the build that is currently in MSYS2 (MINGW64) and built chiaki4deck with it.

Settings are the same as my initial post. Once again I've tried various combinations of resolution, FPS and video codec and all crash when using the vulkan decoder.

I've confirmed this debug build of mine actually works using d3d11va as well just in case that might have caused a problem.

I've noticed this crash can take a couple code paths, but it always triggers when avcodec_send_packet is invoked.

In all cases I've tested av_vorbis_parse_init seems to be the last function in the stacktrace before the segmentation fault occurs.

Here's a backtrace for 1080p, 60fps, H265:

[avcodec-60.dll] av_vorbis_parse_init 0x00007ffe43202f4b
[avcodec-60.dll] av_vorbis_parse_init 0x00007ffe4320b65d
[avcodec-60.dll] av_vorbis_parse_init 0x00007ffe43214e49
[avcodec-60.dll] avpriv_h264_has_num_reorder_frames 0x00007ffe42b11b79
[avcodec-60.dll] avpriv_dca_parse_core_frame_header 0x00007ffe42826ba5
[avcodec-60.dll] avpriv_dca_parse_core_frame_header 0x00007ffe42827108
[avcodec-60.dll] avpriv_dca_parse_core_frame_header 0x00007ffe42827254
[avcodec-60.dll] avcodec_send_packet 0x00007ffe428275fa
[chiaki.exe] chiaki_ffmpeg_decoder_video_sample_cb ffmpegdecoder.c:125
[chiaki.exe] chiaki_video_receiver_flush_frame videoreceiver.c:204
[chiaki.exe] chiaki_video_receiver_av_packet videoreceiver.c:144
[chiaki.exe] stream_connection_takion_av streamconnection.c:1018
[chiaki.exe] stream_connection_takion_cb streamconnection.c:380
[chiaki.exe] takion_handle_packet_av takion.c:1265
[chiaki.exe] takion_handle_packet takion.c:922
[chiaki.exe] takion_thread_func takion.c:792
[chiaki.exe] win32_thread_func thread.c:20
[kernel32.dll] BaseThreadInitThunk 0x00007fff4f03257d
[ntdll.dll] RtlUserThreadStart 0x00007fff4fa2aa58
<unknown> 0x0000000000000000

@nowrep
Contributor

nowrep commented Feb 6, 2024

There are no filters used in chiaki.

The backtrace doesn't look right, anything above avcodec_send_packet is garbage. Those functions can never be called in this sequence, also all of those are for different codecs (dca, h264, vorbis) and you are decoding h265.
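One plausible explanation for frames like these (an assumption, not something confirmed in this thread) is that without debug symbols the debugger can only label an address with the nearest exported symbol at or below it, so unrelated internal functions end up named after whatever export happens to sit before them. A minimal sketch of that lookup, using a made-up export table (`ExportSym`, `demo_exports` and `nearest_export` are hypothetical names, and the addresses are invented):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical export table entry: only exported functions are listed. */
typedef struct {
    uint64_t    addr;
    const char *name;
} ExportSym;

/* Made-up table, sorted by ascending address. */
static const ExportSym demo_exports[] = {
    { 0x1000, "export_a" },
    { 0x5000, "export_b" },
    { 0x9000, "export_c" },
};

/*
 * Return the export with the highest address that is <= addr,
 * or NULL if addr lies below the first export.
 * The table must be sorted by ascending address.
 */
static const ExportSym *nearest_export(const ExportSym *tab, size_t n,
                                       uint64_t addr)
{
    const ExportSym *best = NULL;
    assert(tab != NULL || n == 0);
    for (size_t i = 0; i < n && tab[i].addr <= addr; i++)
        best = &tab[i];
    return best;
}
```

With this table, an internal function living at 0x4fff would be reported as `export_a` even though it has nothing to do with it, which matches the pattern of several different addresses all carrying the same name.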

@Sn0wCrack
Contributor Author

I did think myself that was a little weird, but I am very out of my depth with some of this stuff so I'm learning as I go here.

I've managed to get what looks like a much more believable stacktrace:

[avcodec-60.dll] ff_vk_exec_add_dep_frame vulkan.c:610
[avcodec-60.dll] ff_vk_decode_frame vulkan_decode.c:527
[avcodec-60.dll] vk_hevc_end_frame vulkan_hevc.c:909
[avcodec-60.dll] hevc_decode_frame hevcdec.c:3360
[avcodec-60.dll] decode_simple_internal decode.c:430
[avcodec-60.dll] decode_simple_receive_frame decode.c:609
[avcodec-60.dll] decode_receive_frame_internal decode.c:637
[avcodec-60.dll] avcodec_send_packet decode.c:734
[chiaki.exe] chiaki_ffmpeg_decoder_video_sample_cb ffmpegdecoder.c:125
[chiaki.exe] chiaki_video_receiver_flush_frame videoreceiver.c:204
[chiaki.exe] chiaki_video_receiver_av_packet videoreceiver.c:144
[chiaki.exe] stream_connection_takion_av streamconnection.c:1018
[chiaki.exe] stream_connection_takion_cb streamconnection.c:380
[chiaki.exe] takion_handle_packet_av takion.c:1265
[chiaki.exe] takion_handle_packet takion.c:922
[chiaki.exe] takion_thread_func takion.c:792
[chiaki.exe] win32_thread_func thread.c:20
[kernel32.dll] BaseThreadInitThunk 0x00007fff4f03257d
[ntdll.dll] RtlUserThreadStart 0x00007fff4fa2aa58
<unknown> 0x0000000000000000

@nowrep
Contributor

nowrep commented Feb 6, 2024

This looks good now, thanks. Just to make sure, is this ffmpeg 6.1.1 (official release) or did you build from git? (I see, it's 6.1.1)

@nowrep
Contributor

nowrep commented Feb 6, 2024

I can't reproduce even if I force use of layered dpb. Can you please try this ffmpeg patch:

diff --git a/libavutil/vulkan.c b/libavutil/vulkan.c
index bf8456b..2c9ff96 100644
--- a/libavutil/vulkan.c
+++ b/libavutil/vulkan.c
@@ -607,6 +607,9 @@ int ff_vk_exec_add_dep_frame(FFVulkanContext *s, FFVkExecContext *e, AVFrame *f,
     uint32_t *queue_family_dst;
     VkAccessFlagBits *access_dst;
 
+    if (!f || !f->hw_frames_ctx)
+        return 1;
+
     AVHWFramesContext *hwfc = (AVHWFramesContext *)f->hw_frames_ctx->data;
     AVVulkanFramesContext *vkfc = hwfc->hwctx;
     AVVkFrame *vkf = (AVVkFrame *)f->data[0];
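For anyone following along, the guard in the patch can be sketched in isolation. This uses mock stand-ins rather than the real FFmpeg types (`MockBufferRef`, `MockFrame` and `add_dep_frame` are hypothetical names), and only shows the shape of the check: return early before the first dereference of `hw_frames_ctx`. What the real callers do with the return value is not shown in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real AVBufferRef / AVFrame. */
typedef struct { void *data; } MockBufferRef;
typedef struct { MockBufferRef *hw_frames_ctx; } MockFrame;

/*
 * Mirrors the shape of the guard in the patch:
 * returns 1 if the frame (or its hw frames context) is missing,
 * otherwise stores the context data in *out_ctx and returns 0.
 */
static int add_dep_frame(const MockFrame *f, void **out_ctx)
{
    assert(out_ctx != NULL);

    /* Without this check, the dereference below would fault on NULL. */
    if (!f || !f->hw_frames_ctx)
        return 1;

    *out_ctx = f->hw_frames_ctx->data;
    return 0;
}
```

Calling `add_dep_frame` with a NULL frame, or with a frame whose `hw_frames_ctx` is NULL, returns 1 and leaves `*out_ctx` untouched instead of crashing.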

@Sn0wCrack
Contributor Author

I can't reproduce even if I force use of layered dpb. Can you please try this ffmpeg patch:


Applied that patch, recompiled and yup, it seems to be working fine in my testing so far.

I'm not noticing any hitching, stuttering or any artifacting. And I can confirm it does seem to be using my GPU successfully too.

@nowrep
Contributor

nowrep commented Feb 7, 2024

Thanks, please open ffmpeg bug report with the backtrace.

@Sn0wCrack
Contributor Author

Just a heads up I made the ticket upstream a couple days ago: https://trac.ffmpeg.org/ticket/10847

archlinux-github pushed a commit to archlinux/aur that referenced this issue May 1, 2024