Ignition: Fails to connect to external server to fetch other config to replace itself #474

Open
vboufleur opened this issue May 5, 2020 · 9 comments

vboufleur commented May 5, 2020

Hi all!

I'm working on a bash script that will take over cloud Ubuntu 16.04 instances and install CoreOS on top of them. I boot the CoreOS Live ISO on a VPS (OVH) with (1) a base Ignition config that embeds a bash script. The script calls coreos-installer and passes it (2) a second Ignition config, which in turn sources (3) an external Ignition config.

But this second config file is failing to fetch the third, networked, external one. First it was set to source the file with a direct IP:

variant: fcos
version: 1.0.0
ignition:
  config:
    replace:
      source: http://54.39.179.16/ignition.network.json

The link works: http://54.39.179.16/ignition.network.json

But this failed:
[Screenshot: Screen Shot 2020-05-05 at 15 50 02]

Then I tried with DNS:

variant: fcos
version: 1.0.0
ignition:
  config:
    replace:
      source: http://devops.ipbdev.com/ignition.network.json

It fails too:
[Screenshot: Screen Shot 2020-05-05 at 16 36 47]

Here's the source file.

Base config embedded in the Live ISO, containing (1) and (2):

variant: fcos
version: 1.0.0
systemd:
  units:
  - name: run-coreos-installer.service
    enabled: true
    contents: |
      [Unit]
      After=network-online.target
      Wants=network-online.target
      Before=systemd-user-sessions.service
      OnFailure=emergency.target
      OnFailureJobMode=replace-irreversibly
      [Service]
      RemainAfterExit=yes
      Type=oneshot
      ExecStart=/usr/local/bin/run-coreos-installer
      ExecStartPost=/usr/bin/systemctl --no-block reboot
      StandardOutput=kmsg+console
      StandardError=kmsg+console
      [Install]
      WantedBy=multi-user.target
storage:
  files:
    - path: /home/core/config.ign
      # A basic Ignition config that will replace itself with our network ignition file
      contents:
        inline: |
          {
            "ignition": {
              "config": {
                "replace": {
                  "source": "http://devops.ipbdev.com/ignition.network.json",
                  "verification": {}
                }
              },
              "security": {
                "tls": {}
              },
              "timeouts": {},
              "version": "3.0.0"
            },
            "passwd": {},
            "storage": {},
            "systemd": {}
          }
    - path: /usr/local/bin/run-coreos-installer
      mode: 0755
      contents:
        inline: |
          #!/usr/bin/bash
          set -x
          main() {
            # Some custom arguments for firstboot
            firstboot_args="console=tty0"

            ignition_file="/home/core/config.ign"

            # TODO: Change using stream 'stable' for a defined image, that we host, like below.
            # image_url="https://54.39.179.16/modified.iso"

            # Dynamically detect which device to install to.
            # This represents something an admin may want to do to share the
            # same installer automation across various hardware.
            # TODO: For the takeover script this value would need be dynamically set to where / is mounted on the system
            if [ -b /dev/sda ]; then
              install_device='/dev/sda'
            elif [ -b /dev/nvme0 ]; then
              install_device='/dev/nvme0'
            else
              echo "Can't find appropriate device to install to" 1>&2
              echo 'failure'
              return 1
            fi

            # Call out to the installer
            cmd="coreos-installer install --firstboot-args=${firstboot_args}"
            cmd+=" --stream=stable --ignition=${ignition_file}"
            cmd+=" ${install_device}"
            if $cmd; then
              echo "Install Succeeded!"
              echo 'success'
              return 0
            else
              echo "Install Failed!"
              echo 'failure'
              return 1
            fi
          }
          main
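
The TODO in the script about installing to wherever / is mounted could be approached as below. This is a minimal sketch and not part of the original script: `parent_disk` is a hypothetical helper that maps a partition path to its parent disk by name alone, assuming the usual Linux naming (`sda1` on `sda`, `nvme0n1p1` on `nvme0n1`). Note that on typical systems the NVMe block device is the namespace node such as `/dev/nvme0n1`, while `/dev/nvme0` is the controller's character device, so a `-b /dev/nvme0` test may never match.

```bash
#!/usr/bin/bash
# Sketch only: derive the parent disk from a partition path by its name.
# parent_disk is a hypothetical helper, not part of coreos-installer.
parent_disk() {
  local part="$1"
  # NVMe and MMC partitions use a "p<N>" suffix, e.g. nvme0n1p2, mmcblk0p1.
  local p_suffix_re='^(/dev/(nvme[0-9]+n[0-9]+|mmcblk[0-9]+))p[0-9]+$'
  # SCSI/virtio style partitions append the number directly, e.g. sda1, vda2.
  local plain_re='^(/dev/[a-z]+)[0-9]+$'

  if [[ "$part" =~ $p_suffix_re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  elif [[ "$part" =~ $plain_re ]]; then
    printf '%s\n' "${BASH_REMATCH[1]}"
  else
    # Already a whole disk, or a name this heuristic does not understand.
    printf '%s\n' "$part"
  fi
}

parent_disk /dev/sda1        # prints /dev/sda
parent_disk /dev/nvme0n1p2   # prints /dev/nvme0n1
```

On the running Ubuntu system, the input could come from `findmnt -n -o SOURCE /`, for example `install_device="$(parent_disk "$(findmnt -n -o SOURCE /)")"`. Setups where / sits on LVM or a device-mapper node would need different handling.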

Any help would be dearly appreciated.

Shoutout to @dustymabe, who wrote the wonderful article that inspired the script above: https://dustymabe.com/2020/04/04/automating-a-custom-install-of-fedora-coreos/

@vboufleur
Author

vboufleur commented May 5, 2020

I'm serving the files over the Web with Nginx, default settings.

@jlebon
Member

jlebon commented May 5, 2020

Hmm, does the output show whether NetworkManager tries to bring up networking? What version of the FCOS live ISO are you using? Might be a regression from coreos/fedora-coreos-config#326. One sanity-check is (if you have access to the kernel cmdline) to add rd.neednet=1 and see if it works.
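
One way to run that sanity check from a shell on the booted system is to look for the karg in `/proc/cmdline`. A minimal sketch, assuming bash; `has_karg` is a hypothetical helper name, not an existing tool:

```bash
#!/usr/bin/bash
# Sketch only: report whether a kernel argument is present on the cmdline.
# has_karg is a hypothetical helper; it matches whole words only.
has_karg() {
  local cmdline="$1" karg="$2" word
  local -a words
  # Split on whitespace without glob expansion.
  read -ra words <<< "$cmdline"
  for word in "${words[@]}"; do
    [ "$word" = "$karg" ] && return 0
  done
  return 1
}

if has_karg "$(cat /proc/cmdline)" "rd.neednet=1"; then
  echo "rd.neednet=1 is set"
else
  echo "rd.neednet=1 is missing"
fi
```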

@vboufleur
Author

Adding rd.neednet=1 to the kernel command line solved it for me. Thanks!

A doc page detailing all possible kernel command line options for Fedora CoreOS would be great. It would help other people that stumble on this issue.

@jlebon
Member

jlebon commented May 6, 2020

Re-opening. We need to double check that one doesn't have to add rd.neednet=1 if an Ignition config is embedded.

jlebon reopened this May 6, 2020
@dustymabe
Member

Hey @vboufleur, what version of the Live ISO are you using? The filename should suffice.

@vboufleur
Author

@dustymabe this is the ISO version: fedora-coreos-31.20200407.3.0-live.x86_64.iso

@ingobecker

I'm having a similar problem. I have embedded an Ignition config into an image; it looks like this:

variant: fcos
version: 1.0.0
ignition:
  config:
    replace:
      source: http://169.254.169.254/hetzner/v1/user-data

In my case rd.neednet=1 is present in the kernel cmdline. I'm not sure whether an IPv4 link-local (ipv4ll) address can be used here, but this pattern would make it possible to use Hetzner's user_data endpoint without modifying the Ignition code. The errors are similar to those of @vboufleur.

@ingobecker

OK, I debugged my problem. It was just a typo in the source URL (it should end with userdata instead of user-data). Sorry for that.

@dustymabe
Member

> Re-opening. We need to double check that one doesn't have to add rd.neednet=1 if an Ignition config is embedded.

OK, I looked at this a bit today (sorry for the delay). From what I understand, it isn't the install boot that needs the network, but the subsequent first boot (the Ignition boot) of the installed system. I think the tricky part here is that passing any --firstboot-args to coreos-installer overwrites the default networking kargs (which default to ip=dhcp,dhcp6 rd.neednet=1). We need to decide whether this is a bug, though I will note the problem will probably go away once we implement #460.

@vboufleur a workaround for now is to add ip=dhcp,dhcp6 rd.neednet=1 to your firstboot kargs so they'll get added. Be careful doing that in the script in the fcct from my blog post, though, as the quoting gets tricky in bash. I probably should have used more than one arg in that example.
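
The quoting trouble comes from building the command as one string and expanding it unquoted (`if $cmd`), which splits a multi-word `--firstboot-args` value on whitespace. A minimal sketch of one way around that, assuming bash, is to build the command as an array. The helper `build_install_cmd` and the `install_cmd` array are hypothetical names; the flags and kargs are the ones already used in this thread.

```bash
#!/usr/bin/bash
# Sketch only: keep a multi-word --firstboot-args value as a single
# argument by building the command as a bash array.

# console karg from the original script plus the networking kargs
# suggested in the workaround.
firstboot_args="console=tty0 ip=dhcp,dhcp6 rd.neednet=1"

# Hypothetical helper: fills the global array install_cmd.
build_install_cmd() {
  local ignition_file="$1" install_device="$2"
  install_cmd=(
    coreos-installer install
    "--firstboot-args=${firstboot_args}"
    --stream=stable
    "--ignition=${ignition_file}"
    "${install_device}"
  )
}

build_install_cmd /home/core/config.ign /dev/sda

# Print one argument per line to confirm the kargs were not split.
printf '%s\n' "${install_cmd[@]}"

# To actually run the install, expand the array quoted:
#   if "${install_cmd[@]}"; then echo "Install Succeeded!"; fi
```

The sketch only prints the arguments, so it can be checked without touching a disk.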
