-
Notifications
You must be signed in to change notification settings - Fork 166
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kola: Add RHCOS LUKS upgrade test #1467
Conversation
I really think osmet is key for the RHCOS Live ISO experience and we don't have that right now because of the LUKS container thing. I thought then about putting the LUKS header in `/boot` and dropping the LUKS container by default (that came up before) so we could also skip the dm-linear thing. That way it'd be the same as FCOS. I started looking at that code but then I was reminded of a moment of fear I had when something related came up last week - we have absolutely no testing of upgrades around it, and that is *highly* likely to break if we start trying to make any fundamental changes. This paves some of the infrastructure for a bit more "real" upgrade testing where we boot the old OS with a target Ignition (in this case the LUKS config), then reboot into the new OSTree commit. This is in contrast to the "synthetic" `offline-upgrade` which directly overlays the new OSTree commit. Run this test via e.g.: `kola run-upgrade --ignition-version v2 -b rhcos -v --find-parent-image --qemu-image-dir tmp/` which needs to be plumbed into the pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, happy to see an upgrade test for RHCOS! Just some minor comments, though LGTM as is too.
req, err := http.NewRequest("GET", url, nil) | ||
if err != nil { | ||
return "", err | ||
} | ||
|
||
resp, err := http.DefaultClient.Do(req) | ||
if err != nil { | ||
return "", err | ||
} | ||
defer resp.Body.Close() | ||
if resp.StatusCode != http.StatusOK { | ||
return "", fmt.Errorf("url %s status: %s", url, resp.Status) | ||
} | ||
gzr, err := gzip.NewReader(resp.Body) | ||
if err != nil { | ||
return "", err | ||
} | ||
dst, err := os.OpenFile(decompressedDest, os.O_WRONLY|os.O_CREATE, 0666) | ||
if err != nil { | ||
return "", err | ||
} | ||
defer dst.Close() | ||
prefix := filepath.Base(decompressedDest) | ||
if _, err := util.CopyProgress(capnslog.INFO, prefix, dst, gzr, resp.ContentLength); err != nil { | ||
return "", err | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we put this in the sdk as e.g. DownloadCompressedFile
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked at this and yes it would be a good cleanup but I think we want to rip all this out and just fork off coreos-installer download
or so, right? Can I just do that as a followup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, will coreos-installer download
eventually learn to download RHCOS artifacts too?
Hmm, looks like kola flakes. I restarted CI. /lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: cgwalters, jlebon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pulled the PR and ran locally:
- When I specify the
--qemu-image-dir builds/latest/x86_64
, it seems to download the image from the remote (which makes sense when there's no local build), though I was expecting it to pick up the already built local image. Not a blocker just my thought when I'm running the test. - The test seems to hang at the
rhcos.upgrade.luks/upgrade-from-previous
stage for me..
[coreos-assembler]$ /coreos-assembler/mantle/bin/kola run-upgrade --ignition-version v2 -b rhcos -v --find-parent-image --qemu-image-dir builds/latest/x86_64
2020-05-21T19:08:49Z cli: Started logging at level INFO
2020-05-21T19:08:49Z cli: Started logging at level INFO
rhcos-43.81.202003111353.0-qemu.x86_64.qcow2: [ ] 2.15 GB/804 MB
=== RUN rhcos.upgrade.luks
=== RUN rhcos.upgrade.luks/setup
=== RUN rhcos.upgrade.luks/upgrade-from-previous
- Minor:
2.15 GB/804 MB
->2.15 GB/2.15 GB
The goal here is explicitly to test upgrades from an old image (in this case, RHCOS 4.3).
Yeah that one should be fixed when we use
Hmm, it might take 20s-30s depending on your I/O speed but shouldn't be minutes. Can you look in |
That makes sense 👍 The possible cause is:
full log on `rhcos.upgrade.luks/upgrade-from-previous`
|
Nah that one is a red herring - see ostreedev/ostree#2093 The real error is:
Which is is actually concerning because that's the exact failure mode we'd see if we'd somehow broken LUKS during upgrades. Hum...just a sanity check, did you perhaps test this out from a fcos (not rhcos) buildroot? Then you'd be trying to upgrade from rhcos+luks to focs which...oh yes, it's right there in the output:
Yeah that's just not going to work 😄 It probably works without LUKS though - supporting "rebasing" between e.g. RHEL and Fedora was a top-level goal of ostree from the start, we just broke that because the LUKS stuff is currently RHCOS specific but eventually we'll unify! |
Or to restate, please give this a try from a rhcos build! |
Exactly the issue, sorry for the noise :^(
|
I really think osmet is key for the RHCOS Live ISO experience and
we don't have that right now because of the LUKS container
thing.
I thought then about putting the LUKS header in
/boot
and dropping the LUKS container by default
(that came up before) so we could also skip the dm-linear
thing. That way it'd be the same as FCOS.
I started looking at that code but then I was reminded of a
moment of fear I had when something related came up last week -
we have absolutely no testing of upgrades around it, and that
is highly likely to break if we start trying to make any
fundamental changes.
This paves some of the infrastructure for a bit more "real" upgrade
testing where we boot the old OS with a target Ignition (in
this case the LUKS config), then reboot into the new OSTree
commit. This is in contrast to the "synthetic"
offline-upgrade
which directly overlays the new OSTree commit.
Run this test via e.g.:
kola run-upgrade --ignition-version v2 -b rhcos -v --find-parent-image --qemu-image-dir tmp/
which needs to be plumbed into the pipeline.