Teal #119

Merged
merged 6 commits into from
May 13, 2022

@mudler mudler commented May 12, 2022

  • Don't be scared by the diff, it's mostly removals
  • Splits os2 into ros-installer, the golang code is gone
  • Base image switched to SLE Micro for Rancher, with the elemental binaries included
  • Framework files are now tracked individually and statically (we could go with git submodules, but wanted to keep it simple for now), allowing sandboxed builds
  • Adds a CI workflow which keeps the framework static files mentioned above in sync with cos-toolkit by opening PRs
  • Should be ready to be built with obs/ibs @kkaempf - it also replaces the os2-framework package with a single Dockerfile that can be built directly from obs
  • Temporarily drops SELinux: SLE Micro for Rancher supports it, but as we don't have profiles for it yet, booting fails
  • Might need a follow-up, the PR pipeline should work, but yeah :)
  • All binaries are expected to be provided as part of the base image. This repo is now the "end" Dockerfile which just applies the customizations from the framework - so for testing, it is enough to provide a different base image with different binaries (e.g. a pinned elemental-cli version)

It is better to browse it directly: https://github.com/rancher-sandbox/os2/tree/teal as most things got simplified or dropped

Draft, as I still have to test this locally and am trying to wire up the CI

Supersedes #115
Part of #94

Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>

mudler commented May 12, 2022

Argh, the local ISO works: it boots and installs fine, but the packer build on qemu has no network when it runs.. @Itxaka does that ring a bell?!

Itxaka commented May 12, 2022

What packer version? I see 1.7.4 in https://github.com/rancher-sandbox/os2/blob/master/ros-image-build#L32 but I only tested properly up to 1.7.3...weird that it worked before...

Itxaka commented May 12, 2022

Does this work locally? I mean, is it only failing on CI? Currently cloning....

Itxaka commented May 12, 2022

weird, also fails locally with packer 1.7.3 and 1.7.4 ......

Also if you vnc into the machine and manually add the address (ip address add 10.0.2.15/255.255.255.0 dev eth0) then packer can connect correctly.

I would have a look at the slirp package which provides the user networking for qemu (libslirp0), as the package versions in SLE Micro and openSUSE are different
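
For reference, the manual workaround above uses a dotted netmask; 10.0.2.15 is the address QEMU user-mode networking normally hands out over DHCP. A minimal sketch of the same command in CIDR form follows. The helper name `mask_to_prefix` is made up for this illustration, and it simply counts set bits, so it assumes a valid contiguous netmask.

```shell
# Hypothetical helper: convert a dotted netmask into a CIDR prefix length.
# Counts the set bits per octet; assumes the mask is valid and contiguous.
# The function body runs in a subshell so IFS and variables do not leak.
mask_to_prefix() (
    IFS=.
    prefix=0
    for octet in $1; do
        while [ "$octet" -gt 0 ]; do
            prefix=$((prefix + (octet & 1)))
            octet=$((octet >> 1))
        done
    done
    echo "$prefix"
)

# Print (not run) the equivalent workaround command in CIDR notation.
echo "ip address add 10.0.2.15/$(mask_to_prefix 255.255.255.0) dev eth0"
# prints: ip address add 10.0.2.15/24 dev eth0
```

Running the printed command still needs root inside the guest, exactly like the dotted-netmask form.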

mudler commented May 12, 2022

yep, indeed it's weird: running locally with qemu or vbox just works, it is only with packer/qemu (networking settings, I guess?) that it fails.. it reminded me of your issue when it was not getting the IP (as it looks like the same thing here)

gotta check, but for sure it has to do with the base image, as master is green and tests passed just today.. so the image somehow isn't playing well with the packer settings

@mudler mudler mentioned this pull request May 13, 2022

mudler commented May 13, 2022

> weird, also fails locally with packer 1.7.3 and 1.7.4 ......
>
> Also if you vnc into the machine and manually add the address (ip address add 10.0.2.15/255.255.255.0 dev eth0) then packer can connect correctly.
>
> I would have a look at the slirp package which provides the user networking for qemu (libslirp0) as the versions between the package in sle micro and opensuse are different

rancher-822:~ # rpm -qa | grep libslirp
rancher-822:~ # 

:O

Itxaka commented May 13, 2022

libslirp for 15.4 received a dhcp fix patch a few days ago: https://build.opensuse.org/package/show/openSUSE:Leap:15.4/libslirp

The same lib doesn't have any patches on 15.3 :/

I found out that if you downgrade libslirp to libslirp0-4.3.1-150300.2.7.1 instead of libslirp0-4.3.1-150300.3.3.1 then packer works, at least locally :/

diff --git a/ros-image-build b/ros-image-build
index 3e0d51f..6222e8c 100755
--- a/ros-image-build
+++ b/ros-image-build
@@ -24,6 +24,7 @@ RUN sed -i -s 's/^# rpm.install.excludedocs/rpm.install.excludedocs/' /etc/zypp/
 RUN zypper ref
 ENV LUET_NOLOCK=true
 # Copy luet from the official images
+RUN zypper in -y libslirp0-4.3.1-150300.2.7.1
 RUN zypper in -y squashfs xorriso curl unzip git qemu-arm qemu-x86 qemu-tools tar pigz go1.16 qemu-uefi-aarch64 mtools rsync
 RUN cd /usr/sbin && \
     rm packer && \
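
As a rough illustration of the pin above, a CI step could flag a libslirp0 release newer than the one reported as working in this thread. This is only a sketch: the helper names `release_of` and `not_newer_than` are made up, `sort -V` is a GNU coreutils option, and its ordering only approximates RPM's own version comparison.

```shell
# Hypothetical helpers, not part of the repository.

# Print the RELEASE part of a NAME-VERSION-RELEASE string.
release_of() {
    printf '%s\n' "${1##*-}"
}

# Succeed when release $1 sorts at or before release $2 (GNU sort -V ordering).
not_newer_than() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

known_good=150300.2.7.1                    # reported as working, at least locally
installed=libslirp0-4.3.1-150300.3.3.1     # reported as failing with packer

if not_newer_than "$(release_of "$installed")" "$known_good"; then
    echo "libslirp release looks fine"
else
    echo "suspect libslirp release"
fi
# prints: suspect libslirp release
```

On a real system the `installed` string would come from something like `rpm -q libslirp0`, which also appends the architecture, so the parsing would need adjusting.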

Itxaka commented May 13, 2022

I'm talking about make ci here btw, but I don't see what has really changed in your PR that could affect that part? Meanwhile master seems to be passing :/

mudler commented May 13, 2022

> libslirp for 15.4 received a dhcp fix patch a few days ago: https://build.opensuse.org/package/show/openSUSE:Leap:15.4/libslirp
>
> The same lib doesn't have any patches on 15.3 :/
>
> I found out that if you downgrade libslirp to libslirp0-4.3.1-150300.2.7.1 instead of libslirp0-4.3.1-150300.3.3.1 then packer works, at least locally :/
>
> diff --git a/ros-image-build b/ros-image-build
> index 3e0d51f..6222e8c 100755
> --- a/ros-image-build
> +++ b/ros-image-build
> @@ -24,6 +24,7 @@ RUN sed -i -s 's/^# rpm.install.excludedocs/rpm.install.excludedocs/' /etc/zypp/
>  RUN zypper ref
>  ENV LUET_NOLOCK=true
>  # Copy luet from the official images
> +RUN zypper in -y libslirp0-4.3.1-150300.2.7.1
>  RUN zypper in -y squashfs xorriso curl unzip git qemu-arm qemu-x86 qemu-tools tar pigz go1.16 qemu-uefi-aarch64 mtools rsync
>  RUN cd /usr/sbin && \
>      rm packer && \

Thanks for the input! blargh. that's so... weird that it doesn't fail with an openSUSE based image.. it's all pinned to leap 15.3 on master.. maybe that's the reason, old packages..

mudler commented May 13, 2022

> I'm talking about make ci here btw, but I don't see what has really changed in your PR that could affect that part? Meanwhile master seems to be passing :/

yep yep, only make ci fails now. The PR mostly changes the base image, for the rest it keeps everything the same (settings, config, etc). It just drops the go code part

ISO boots fine, everything seems to work.. it's just packer which is failing on setting up network properly

Itxaka commented May 13, 2022

very weird because I can see it installing the proper version of libslirp on the ci build on master: https://github.com/rancher-sandbox/os2/runs/6405828614?check_suite_focus=true#step:4:2854

So on this patch the base image has changed somehow. In my tests the base image which runs packer was based on opensuse 15.4, while on master it seems to be 15.3? Is this because we now use the elemental image as TOOLS and that may be based on opensuse 15.4?

EDIT: yeah, the elemental image is based on 15.4, so it may well be that libslirp is broken on 15.4 https://github.com/rancher-sandbox/elemental/blob/main/Dockerfile#L3

mudler commented May 13, 2022

> very weird because I can see it installing the proper version of libslirp on the ci build on master: https://github.com/rancher-sandbox/os2/runs/6405828614?check_suite_focus=true#step:4:2854
>
> So on this patch the base image has changed somehow. In my tests the base image which runs packer was based on opensuse 15.4, while on master it seems to be 15.3? Is this because we now use the elemental image as TOOLS and that may be based on opensuse 15.4?

yup most likely so, I'm using the elemental image here as tools to keep it tight and simply bump it if we ever need to change tooling (at the end, this makes sense just for CI runs and local development, so..). If we had a latest tag I'd even have attached to it 😅

mudler commented May 13, 2022

mmm packer still times out here, but I've bumped it.. maybe I was too optimistic...

Itxaka commented May 13, 2022

yep, even with a fixed libslirp, the qemu plugin on 1.8.0 seems to be broken for user networking.

Also I just noticed..the zypper in stanza in the elemental Dockerfile is cached....so we don't get any system updates on the docker image build...so we may have a broken libslirp in there, while the newest version has a fix for dhcp but we are not bundling it. ARGH.
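
To illustrate the caching point: when the layer holding `zypper in` is reused from cache, package updates published since that layer was built are never picked up. One generic way around it is to rebuild without the layer cache. The sketch below only prints such a command; `--no-cache` and `--pull` are standard `docker build` flags, while the function name, image tag and build context are placeholders and not what the elemental CI actually uses.

```shell
# Hypothetical helper: print (rather than run) a docker build invocation
# that ignores cached layers and re-pulls the base image.
uncached_build_cmd() {
    tag=$1
    context=${2:-.}
    printf 'docker build --no-cache --pull -t %s %s\n' "$tag" "$context"
}

uncached_build_cmd tools:local
# prints: docker build --no-cache --pull -t tools:local .
```

The trade-off is a slower build, since every layer is rebuilt each time.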

mudler commented May 13, 2022

> yep, even with a fixed libslirp, the qemu plugin on 1.8.0 seems to be broken for user networking.
>
> Also I just noticed..the zypper in stanza in the elemental Dockerfile is cached....so we don't get any system updates on the docker image build...so we may have a broken libslirp in there, while the newest version has a fix for dhcp but we are not bundling it. ARGH.

I'll just stick with leap 15.3 for now, in the end this could be done in so many ways... 🤷‍♂️

mudler commented May 13, 2022

Basing on leap 15.3 seems to work, thanks @Itxaka !

kkaempf commented May 13, 2022

> The same lib doesn't have any patches on 15.3 :/
>
> I found out that if you downgrade libslirp to libslirp0-4.3.1-150300.2.7.1 instead of libslirp0-4.3.1-150300.3.3.1 then packer works, at least locally :/

That should be raised via bugzilla.opensuse.org

@mudler mudler marked this pull request as ready for review May 13, 2022 14:15

mudler commented May 13, 2022

Looks good here, good for review @mjura @Itxaka @davidcassany

Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>
SLE Micro for Rancher has selinux supports, but we don't have profiles
for it yet.

Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>
Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>
Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>
Signed-off-by: Ettore Di Giacinto <edigiacinto@suse.com>
@mudler mudler enabled auto-merge (squash) May 13, 2022 14:38
@davidcassany

> • framework files are now tracked individually and statically (we could go with git submodules, but wanted to keep it simple for now) allowing sandboxed builds

Shouldn't it be enough to consume the latest luet packages in the Dockerfile instead? Is all this to avoid luet install in the Dockerfile so it can be built in OBS?

@davidcassany davidcassany left a comment

Looks good to me, even though I have doubts that exposing the framework data as is here is sustainable.

I'd like to understand under which constraints this static bunch of files is required and discuss possible alternatives. In any case I believe this is disconnected from the teal flavor, so to start moving on with teal this is good.

mudler commented May 13, 2022

> • framework files are now tracked individually and statically (we could go with git submodules, but wanted to keep it simple for now) allowing sandboxed builds
>
> Shouldn't it be enough to consume the latest luet packages in the Dockerfile instead? Is all this to avoid luet install in the Dockerfile so it can be built in OBS?

yes exactly, note this is not strictly required here, we could have used OBS services or other means like rancher/elemental-toolkit#1305 . I actually wanted to do a separate repo with the static files as I'm a fan of git submodules, but I do understand it might be a blocker, so I preferred to vendor them here and treat them as a static asset to keep it KISS for now. I wanted to make this repository usable in a way that https://github.com/rancher-sandbox/rancher-node-image can be replaced directly with this repository, to avoid maintaining both.

Note there is a mechanism to sync back to cOS so there is no need to do anything on our side besides merging PRs afterwards

Also... for local iteration we could have a switch that fetches the assets from cOS in the Dockerfile, but I'd prefer to do that in a follow up

mudler commented May 14, 2022

See #125 for an example of sync-up PR

mudler pushed a commit that referenced this pull request Jun 3, 2022
Signed-off-by: David Cassany <dcassany@suse.com>