Lwt.pause deadlock on spt #97

Open
Firobe opened this issue Sep 19, 2024 · 5 comments

@Firobe
Member

Firobe commented Sep 19, 2024

open Lwt.Syntax

module Hello (T : Mirage_time.S) = struct
    let f1c = ref 0
    let f2c = ref 0

    let rec f1 () = 
        Logs.info (fun f -> f "f1 (%d)" !f1c);
        incr f1c;
        let* () = Lwt.pause () in
        f1 ()

    let rec f2 () = 
        Logs.info (fun f -> f "f2 (%d)" !f2c);
        incr f2c;
        let* () = Lwt.pause () in
        f2 ()

    let start _ =
        Logs.info (fun f -> f "Start");
        let* _ = Lwt.all [f1 (); f2 ()] in Lwt.return_unit
end

The above unikernel, when compiled and executed for the hvt target, produces the expected output:

2024-09-19T09:27:21-00:00: [INFO] [application] Start
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (0)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (0)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (1)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (1)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (2)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (2)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (3)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (3)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (4)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (4)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (5)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (5)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (6)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (6)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (7)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (7)
2024-09-19T09:27:21-00:00: [INFO] [application] f2 (8)
2024-09-19T09:27:21-00:00: [INFO] [application] f1 (8)
[...]

However, when compiled and executed for the spt target, fibers that have "paused" are never woken up, and the application deadlocks:

2024-09-19T09:32:32-00:00: [INFO] [application] Start
2024-09-19T09:32:32-00:00: [INFO] [application] f2 (0)
2024-09-19T09:32:32-00:00: [INFO] [application] f1 (0)
[no further logs]
@Firobe changed the title from "Lwt.pause deadlock on sp" to "Lwt.pause deadlock on spt" on Sep 19, 2024
@reynir
Member

reynir commented Sep 19, 2024

It seems to me in this case we end up here with a timeout of 0L and call solo5_yield:

mirage-solo5/lib/main.ml

Lines 51 to 58 in 0d9b87b

let timeout =
  if Lwt.paused_count () > 0 then 0L
  else
    match Time.select_next () with
    | None -> Int64.add (Time.time ()) (Duration.of_day 1)
    | Some tm -> tm
in
let ready_set = solo5_yield timeout in

Then here I lose track of what happens (it calls some assembly)
https://github.com/Solo5/solo5/blob/5478c255911e1dae6c9ab1f448daddf0fb46ea37/bindings/spt/net.c#L91-L114
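
To make the timeout choice quoted above easier to reason about, here is a minimal standalone model of it. This is only a sketch: choose_timeout and one_day_ns are hypothetical names, and the arguments paused, next_timer and now stand in for Lwt.paused_count (), Time.select_next () and Time.time (), assuming times are in nanoseconds.

```ocaml
(* Hypothetical model of the timeout selection in mirage-solo5/lib/main.ml.
   The real code reads these values from Lwt and the Time module; here they
   are plain arguments so the logic can be checked in isolation. *)
let one_day_ns = Int64.mul 86_400L 1_000_000_000L

let choose_timeout ~paused ~next_timer ~now =
  if paused > 0 then 0L (* paused fibers exist: ask not to block at all *)
  else
    match next_timer with
    | None -> Int64.add now one_day_ns (* nothing scheduled: wait up to a day *)
    | Some tm -> tm (* otherwise wake at the next timer deadline *)

let () =
  (* With both f1 and f2 paused, the value passed to solo5_yield is 0L. *)
  assert (choose_timeout ~paused:2 ~next_timer:None ~now:1_000L = 0L);
  assert (choose_timeout ~paused:0 ~next_timer:(Some 5_000L) ~now:1_000L = 5_000L)
```

In the reproducer from this issue, both fibers are paused on every iteration, so this branch always yields 0L.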

@reynir
Member

reynir commented Sep 19, 2024

Ok, it seems if you call timerfd_settime() with all zeroes it disarms the timer. I'll look into a fix in solo5 spt bindings...
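
The interaction between a 0L timeout and the disarm rule can be sketched as a small pure model. It is not the solo5 code and not necessarily what the eventual fix does: to_it_value, disarms and to_armed_it_value are hypothetical helpers, and clamping to 1ns is just one assumed way to keep the timer armed.

```ocaml
(* Hypothetical model, not the solo5 bindings: split a nanosecond value into
   the (seconds, nanoseconds) pair that an it_value field would carry. *)
let to_it_value ns =
  (Int64.div ns 1_000_000_000L, Int64.rem ns 1_000_000_000L)

(* timerfd_settime() treats an it_value whose fields are both zero as a
   request to disarm the timer, so it never expires. *)
let disarms (sec, nsec) = sec = 0L && nsec = 0L

(* Assumed workaround: bump a non-positive value to 1ns so the timer stays
   armed and expires (almost) immediately. *)
let to_armed_it_value ns = to_it_value (if ns <= 0L then 1L else ns)

let () =
  assert (disarms (to_it_value 0L));
  assert (not (disarms (to_armed_it_value 0L)))
```

Under this model, a timeout of 0L maps to an all-zero it_value, which matches the observed behaviour: the timer is disarmed and the paused fibers are never resumed.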

@hannesm
Member

hannesm commented Sep 19, 2024

What does hvt do in contrast? Shouldn't that be similar code?

@reynir
Member

reynir commented Sep 19, 2024

So hvt had a similar issue:
Solo5/solo5@97a66f6

I tried applying the same fix in reynir/solo5@683529b, but in my test it doesn't seem to help. Then again, when I run strace the call still seems to be made with a zero it_value, so I'm doubting my test...

@reynir
Member

reynir commented Sep 19, 2024

It was dune cache that was causing trouble...
