kola: Add --no-net flag #1478

cgwalters · 2020-05-24T15:17:30Z

The Docker Hub recently decided that us pulling the nginx image
over and over many times a day was abusive and started rate limiting
us.

For OSTree's test suite at least, there's no good reason for us
to run podman's tests at all. Similarly for e.g. Ignition.

And just as a general rule I think it's useful to cleanly separate
tests that can be run fully offline from those that require
Internet access.

sohankunkerkar · 2020-05-25T00:00:20Z

And the CI failed due to:

--- FAIL: podman.workflow (274.41s)
    --- FAIL: podman.workflow/run (251.35s)
            cluster.go:141: Trying to pull docker.io/library/nginx...
            cluster.go:141:   too many requests to registry
            cluster.go:141: Error: unable to pull docker.io/library/nginx: unable to pull image: Error parsing image configuration: too many requests to registry
            cluster.go:162: "sudo podman run -d -p 80:80 -v /tmp/tmp.kedL9Z7bVf/index.html:/usr/share/nginx/html/index.html:z docker.io/library/nginx" failed: output , status Process exited with status 125

sohankunkerkar · 2020-05-25T00:07:03Z

For OSTree's test suite at least, there's no good reason for us
to run podman's tests at all. Similarly for e.g. Ignition.

+1
I've seen this error a couple of times in Ignition CI but wasn't sure about the fix. Thanks for putting up this PR.

sohankunkerkar · 2020-05-25T00:08:21Z

/approve

Temporarily blacklist this test for now until we're off Docker Hub's naughty list. This is blocking CI and pipelines. See discussions in coreos/coreos-assembler#1478 for a longer-term approach on tests that require Internet access in general.

jlebon · 2020-05-25T16:05:08Z

Hmm actually... since this is (hopefully) temporary, maybe let's just blacklist the tests for now? That way we don't have to spread this new switch everywhere it's needed. OK, did that in: coreos/fedora-coreos-config#425.

I agree with the motivation of this PR, though.

cgwalters · 2020-05-26T12:55:24Z

I'm happy with the CI workaround for now, but:

My suggestion is to tag those tests with "net.dockerio.unauth" and to change the test-runner to always run the default tests (i.e. those with empty tag set) plus any tags specified on the CLI.
That could also allow us in the future to have something like "--fallible-tags" to allow soft-failures/flakes.

So far kola itself has not attached any semantic meaning to tags. I'm not opposed to that, but it'd be a notable change.

Note that we also already have the "requires Internet" metadata on these tests.

I also laid out a rationale for why I think we want to support offline tests in general, and not just work around this specific Docker Hub issue:

And just as a general rule I think it's useful to cleanly separate tests that can be run fully offline from those that require Internet access.

To elaborate on that, all tests that require Internet are inherently flaky to some degree. Also I want to be able to e.g. run coreos-assembler while I'm on an airplane too.

(There have also been parallel discussions about enhancing the Kubernetes and OpenShift test suites to support being run disconnected, so one can validate a disconnected environment. This is particularly tricky for OpenShift's tests, which talk to GitHub and Docker Hub and quay and...)
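
The tag-based selection rule quoted earlier in this comment (always run tests with an empty tag set, plus tests matching any tag given on the CLI) could be sketched roughly as below. This is only an illustration: the `Test` struct, `selectTests`, and the test name `example.offline` are hypothetical, and treating "any matching tag" as sufficient is an assumption. As noted above, kola itself has not attached semantic meaning to tags so far.

```go
package main

import "fmt"

// Test is a simplified stand-in for a kola test (assumption, not the real type).
type Test struct {
	Name string
	Tags []string
}

// selectTests returns the names of tests that have no tags at all,
// plus tests carrying at least one of the tags requested on the CLI.
func selectTests(tests []Test, cliTags []string) []string {
	wanted := make(map[string]bool, len(cliTags))
	for _, tag := range cliTags {
		wanted[tag] = true
	}

	var selected []string
	for _, t := range tests {
		if len(t.Tags) == 0 {
			// Default tests always run.
			selected = append(selected, t.Name)
			continue
		}
		for _, tag := range t.Tags {
			if wanted[tag] {
				selected = append(selected, t.Name)
				break
			}
		}
	}
	return selected
}

func main() {
	tests := []Test{
		{Name: "example.offline"},
		{Name: "podman.workflow", Tags: []string{"net.dockerio.unauth"}},
	}
	fmt.Println(selectTests(tests, nil))                             // [example.offline]
	fmt.Println(selectTests(tests, []string{"net.dockerio.unauth"})) // [example.offline podman.workflow]
}
```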

lucab · 2020-05-26T13:12:16Z

Ah, I missed the fact that RequiresInternetAccess fields are already in place. Then it looks like all relevant parties already agreed on this direction while plumbing coreos/mantle#967. I'll thus retire my comment above, sorry for the noise.

cgwalters · 2020-05-26T13:24:11Z

/test sanity

jlebon · 2020-05-26T13:57:39Z

mantle/kola/harness.go

 	var blacklisted bool
 	noPattern := hasString("*", patterns)
 	for name, t := range tests {
+		for _, flag := range t.Flags {


Hmm, shouldn't we reset noNetFiltered to false before this? Otherwise once it's set to true, it'll always remain true, no? Which means it would skip every test after a networking test when --no-net is specified.

Or better yet, just move the variable declaration into the for-loop. I guess we could do the same for blacklisted too.

cgwalters · 2020-05-26

Oooh great catch! I'd tested that we didn't run the network tests and that other tests were run, but not that we still listed all of the tests...

cgwalters · 2020-06-04

Happened to still have this tab open, and one thing I wanted to say here is that this is an excellent example of why we have code review: nothing would be checking today that we're silently not running some tests, and it'd be painful to find later.
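
The bug raised in this review thread can be reproduced in isolation. The following is a minimal sketch, not the actual `harness.go` code: `Test`, `Flag`, `needsNet`, `filterBuggy`, `filterFixed`, and `sortedNames` are hypothetical names for illustration, and only `noNetFiltered`, `t.Flags`, `RequiresInternetAccess`, and `--no-net` come from the thread. The names are sorted before iterating so the demo is deterministic, whereas the real loop ranges over a map.

```go
package main

import (
	"fmt"
	"sort"
)

// Flag and Test are simplified stand-ins for kola's real types (assumption).
type Flag int

const RequiresInternetAccess Flag = iota

type Test struct {
	Flags []Flag
}

// needsNet reports whether a test carries the RequiresInternetAccess flag.
func needsNet(t Test) bool {
	for _, flag := range t.Flags {
		if flag == RequiresInternetAccess {
			return true
		}
	}
	return false
}

// filterBuggy declares noNetFiltered once, outside the loop, so after the
// first network test sets it, every later test is skipped as well.
func filterBuggy(names []string, tests map[string]Test, noNet bool) []string {
	var kept []string
	var noNetFiltered bool
	for _, name := range names {
		if noNet && needsNet(tests[name]) {
			noNetFiltered = true
		}
		if noNetFiltered {
			continue
		}
		kept = append(kept, name)
	}
	return kept
}

// filterFixed scopes noNetFiltered to a single iteration, as suggested
// in the review, so one network test cannot affect the tests after it.
func filterFixed(names []string, tests map[string]Test, noNet bool) []string {
	var kept []string
	for _, name := range names {
		noNetFiltered := noNet && needsNet(tests[name])
		if noNetFiltered {
			continue
		}
		kept = append(kept, name)
	}
	return kept
}

// sortedNames returns the map keys in sorted order.
func sortedNames(tests map[string]Test) []string {
	names := make([]string, 0, len(tests))
	for name := range tests {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func main() {
	tests := map[string]Test{
		"a.offline": {},
		"b.network": {Flags: []Flag{RequiresInternetAccess}},
		"c.offline": {},
	}
	names := sortedNames(tests)
	fmt.Println(filterBuggy(names, tests, true)) // [a.offline]
	fmt.Println(filterFixed(names, tests, true)) // [a.offline c.offline]
}
```

With the buggy variant, `c.offline` is silently dropped even though it needs no network, which is the kind of failure that no automated check would have flagged.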

jlebon · 2020-05-26T15:17:25Z

/lgtm

openshift-ci-robot · 2020-05-26T15:17:28Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon, sohankunkerkar

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [cgwalters,jlebon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

cgwalters · 2020-05-27T21:53:40Z

(There have also been parallel discussions about enhancing the Kubernetes and OpenShift test suites to support being run disconnected, so one can validate a disconnected environment. This is particularly tricky for OpenShift's tests, which talk to GitHub and Docker Hub and quay and...)

Just to xref, openshift/origin#24887

In coreos#1478 we discussed Internet access and tests, adding a flag to disable tests which are flagged as requiring it. This flips things around and *enforces* no Internet access for qemu tests by default unless the test is flagged as requiring it. Unsurprisingly it turns out several tests are missing the flag. Another issue here is that external tests don't have a way to influence flags. Continuing our "mimic Debian autopkgtest where applicable" trend, support a `needs-internet` tag, which works for external tests since they can provide flags.
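
As a rough illustration of the default-deny behavior described in that commit message, the sketch below grants Internet access only when a test either carries a built-in flag or lists the `needs-internet` tag. Everything except the tag name is an assumption for illustration: the `Test` struct, its `RequiresInternet` field, and `allowInternet` are hypothetical and do not reflect how kola actually wires this into qemu.

```go
package main

import "fmt"

// needsInternetTag is the tag named in the commit message above.
const needsInternetTag = "needs-internet"

// Test is a simplified stand-in for a kola test (assumption).
type Test struct {
	Name             string
	RequiresInternet bool     // hypothetical field standing in for the built-in flag
	Tags             []string // tags supplied by the test, including external tests
}

// allowInternet returns true only for tests that opted in, either through
// the flag or through the needs-internet tag. Everything else is offline.
func allowInternet(t Test) bool {
	if t.RequiresInternet {
		return true
	}
	for _, tag := range t.Tags {
		if tag == needsInternetTag {
			return true
		}
	}
	return false
}

func main() {
	tests := []Test{
		{Name: "builtin.offline"},
		{Name: "builtin.network", RequiresInternet: true},
		{Name: "ext.network", Tags: []string{needsInternetTag}},
	}
	for _, t := range tests {
		fmt.Printf("%s: internet=%v\n", t.Name, allowInternet(t))
	}
}
```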

Sometime between podman 1.x and 2.x podman started putting full 64 character IDs into the json output. Dynamically detect the length of the ID and compare that number of characters. We haven't noticed this test failing because we've been denylisting the test in the pipeline: coreos#1478
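
The length-agnostic comparison described in that commit message might look something like the following. This is a sketch under assumptions, not the code from the fix: `sameContainerID` is a hypothetical helper, and the sample ID is made up. It takes the length of the ID reported in the JSON output and compares only that many leading characters of the known full ID, so it works whether the output uses a 12-character short ID or the full 64 characters.

```go
package main

import "fmt"

// sameContainerID reports whether reportedID (short or full) refers to the
// same container as fullID, by comparing only len(reportedID) characters.
func sameContainerID(reportedID, fullID string) bool {
	n := len(reportedID)
	if n == 0 || n > len(fullID) {
		return false
	}
	return fullID[:n] == reportedID
}

func main() {
	// Made-up 64-character ID for demonstration purposes.
	full := "0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"

	fmt.Println(sameContainerID(full[:12], full))      // true
	fmt.Println(sameContainerID(full, full))           // true
	fmt.Println(sameContainerID("ffffffffffff", full)) // false
}
```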

cgwalters mentioned this pull request May 24, 2020

pull: Add support for sign-verify=<list> ostreedev/ostree#2105

Merged

openshift-ci-robot added the approved label May 25, 2020

jlebon mentioned this pull request May 25, 2020

kola-blacklist: add fcos.internet and podman.workflow coreos/fedora-coreos-config#425

Merged

jlebon reviewed May 26, 2020

cgwalters force-pushed the nonet-tests branch from 88a658a to 7ec7aef Compare May 26, 2020 15:02

openshift-ci-robot assigned jlebon May 26, 2020

openshift-ci-robot added the lgtm label May 26, 2020

openshift-merge-robot merged commit 4f2a1e2 into coreos:master May 26, 2020

cgwalters mentioned this pull request Jun 4, 2020

tests: Enforce no Internet in qemu unless flagged #1514

Merged

This was referenced Oct 5, 2020

mantle: fix podman.workflow stop test #1758

Merged

Define and ship a "testutils" container #1645

Closed

Adam0Brien mentioned this pull request Mar 6, 2023

F38 Changes: Kola test for shorter shutdown timer coreos/fedora-coreos-config#2247

Merged

c4rt0 mentioned this pull request Feb 14, 2024

rawhide: ext.config.var-mount.scsi-id fails coreos/fedora-coreos-tracker#1670

Closed
