Skip to content

Commit

Permalink
container stop: handle race
Browse files Browse the repository at this point in the history
There is an inherent race when stopping/killing a container with other
processes attempting to do the same and also with the container having
exited in the meantime.  In those cases, the OCI runtime may fail to
kill the container as it has already exited.

Handle those races by first checking if the container state has changed
before returning the error.

[NO NEW TESTS NEEDED] - as it's a hard to test race.

Fixes: containers#18452
Signed-off-by: Valentin Rothberg <vrothberg@redhat.com>
  • Loading branch information
vrothberg committed May 4, 2023
1 parent 4acc1a1 commit 8d0050a
Showing 1 changed file with 11 additions and 6 deletions.
17 changes: 11 additions & 6 deletions libpod/container_internal.go
Original file line number Diff line number Diff line change
Expand Up @@ -1369,12 +1369,6 @@ func (c *Container) stop(timeout uint) error {
}
}

// We have to check stopErr *after* we lock again - otherwise, we have a
// change of panicking on a double-unlock. Ref: GH Issue 9615
if stopErr != nil {
return stopErr
}

// Since we're now subject to a race condition with other processes who
// may have altered the state (and other data), let's check if the
// state has changed. If so, we should return immediately and leave
Expand All @@ -1387,6 +1381,17 @@ func (c *Container) stop(timeout uint) error {
return nil
}

// We have to check stopErr *after* we lock again - otherwise, we have a
// chance of panicking on a double-unlock (see #9615).
//
// If the state has changed, stopErr is ignored (see #18452) since we
// should optimistically assume that another process has already
// stopped/killed the container or that the container has exited in the
// meantime.
if stopErr != nil {
return stopErr
}

c.newContainerEvent(events.Stop)
return c.waitForConmonToExitAndSave()
}
Expand Down

0 comments on commit 8d0050a

Please sign in to comment.