Support running e2e tests on AWS (#45)
eaudetcobello authored Oct 23, 2024
1 parent 1012cfc commit 9c70379
Showing 14 changed files with 572 additions and 14 deletions.
3 changes: 2 additions & 1 deletion Makefile
@@ -99,7 +99,8 @@ GINKGO_NODES ?= 1 # GINKGO_NODES is the number of parallel nodes to run
GINKGO_TIMEOUT ?= 2h
GINKGO_POLL_PROGRESS_AFTER ?= 60m
GINKGO_POLL_PROGRESS_INTERVAL ?= 5m
-E2E_CONF_FILE ?= $(TEST_DIR)/e2e/config/ck8s-docker.yaml
+E2E_INFRA ?= docker
+E2E_CONF_FILE ?= $(TEST_DIR)/e2e/config/ck8s-$(E2E_INFRA).yaml
SKIP_RESOURCE_CLEANUP ?= false
USE_EXISTING_CLUSTER ?= false
GINKGO_NOCOLOR ?= false
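
With this change the default config file follows the selected infrastructure. As a rough illustration of how the default resolves, the sketch below mirrors the Makefile expression in shell. The function name `e2e_conf_file` is hypothetical and not part of the repository, and `TEST_DIR` defaulting to `test` is an assumption.

```shell
# Hypothetical helper that mirrors the Makefile default:
#   E2E_CONF_FILE ?= $(TEST_DIR)/e2e/config/ck8s-$(E2E_INFRA).yaml
# Arguments: [infra] [test_dir]; "docker" and "test" are used when omitted
# ("test" as the value of TEST_DIR is an assumption made for this sketch).
e2e_conf_file() {
    infra="${1:-docker}"
    test_dir="${2:-test}"
    printf '%s/e2e/config/ck8s-%s.yaml\n' "$test_dir" "$infra"
}

# Usage:
#   e2e_conf_file aws
#   e2e_conf_file
```

Setting `E2E_CONF_FILE` explicitly on the `make` command line still overrides the derived default, since both variables are assigned with `?=`.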
67 changes: 67 additions & 0 deletions test/e2e/README.md
@@ -21,6 +21,73 @@ To run a specific e2e test, such as `[PR-Blocking]`, use the `GINKGO_FOCUS` envi
make GINKGO_FOCUS="\\[PR-Blocking\\]" test-e2e # only run e2e test with `[PR-Blocking]` in its spec name
```

### Use an existing cluster as the management cluster

This is useful if you want to use a cluster managed by Tilt.

```shell
make USE_EXISTING_CLUSTER=true test-e2e
```

### Run e2e tests on AWS

To run the tests on AWS, you need to set the `AWS_B64ENCODED_CREDENTIALS` environment variable.

Then, you can run:

```shell
make E2E_INFRA=aws test-e2e
```
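
Because the run cannot succeed without credentials, it can help to fail fast before `make` starts. The following is a minimal sketch of such a check. The wrapper `run_aws_e2e` and the `MAKE_CMD` override are hypothetical names introduced here, not part of the repository.

```shell
# Hypothetical wrapper: refuse to start the AWS e2e run when the
# credentials variable is missing or empty.
# In Cluster API Provider AWS the value is typically produced with
# `clusterawsadm bootstrap credentials encode-as-profile`.
run_aws_e2e() {
    if [ -z "${AWS_B64ENCODED_CREDENTIALS:-}" ]; then
        echo "AWS_B64ENCODED_CREDENTIALS is not set" >&2
        return 1
    fi
    # MAKE_CMD is an assumption that allows a dry run, e.g. MAKE_CMD=echo.
    ${MAKE_CMD:-make} E2E_INFRA=aws test-e2e
}

# Usage:
#   run_aws_e2e                 # runs: make E2E_INFRA=aws test-e2e
#   MAKE_CMD=echo run_aws_e2e   # prints the make arguments instead of running them
```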

Note: The remediation tests currently do not pass on cloud providers. We recommend excluding these tests from your test runs.

For more information, please refer to the following:

[Kubernetes Slack Discussion](https://kubernetes.slack.com/archives/C8TSNPY4T/p1680525266510109)

[GitHub Issue #4198](https://github.com/kubernetes-sigs/cluster-api-provider-aws/issues/4198)

### Running the tests with Tilt

This section explains how to run the E2E tests on AWS using a management cluster run by Tilt.

This section assumes you have *kind* and *Docker* installed (see [Prerequisites](https://cluster-api.sigs.k8s.io/developer/tilt#prerequisites)).

First, clone the upstream cluster-api and cluster-api-provider-aws repositories.
```shell
git clone https://github.com/kubernetes-sigs/cluster-api.git
git clone https://github.com/kubernetes-sigs/cluster-api-provider-aws.git
```

Next, you need to create a `tilt-settings.yaml` file inside the `cluster-api` directory.
Any `kustomize_substitutions` you provide in this file are automatically applied to the *management cluster*.
```yaml
default_registry: "ghcr.io/canonical/cluster-api-k8s"
provider_repos:
- ../cluster-api-k8s
- ../cluster-api-provider-aws
enable_providers:
- aws
- ck8s-bootstrap
- ck8s-control-plane
```
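
The settings shown above do not include any `kustomize_substitutions`. As a sketch of how one could be added, the function below writes the same file and passes the AWS credentials through as a substitution. The function name `write_tilt_settings` is hypothetical, and whether your setup needs this particular substitution is an assumption.

```shell
# Hypothetical helper: write a tilt-settings.yaml that also hands the AWS
# credentials to the management cluster through kustomize_substitutions.
# Argument: output path (defaults to ./tilt-settings.yaml).
write_tilt_settings() {
    out="${1:-tilt-settings.yaml}"
    if [ -z "${AWS_B64ENCODED_CREDENTIALS:-}" ]; then
        echo "AWS_B64ENCODED_CREDENTIALS is not set" >&2
        return 1
    fi
    cat > "$out" <<EOF
default_registry: "ghcr.io/canonical/cluster-api-k8s"
provider_repos:
- ../cluster-api-k8s
- ../cluster-api-provider-aws
enable_providers:
- aws
- ck8s-bootstrap
- ck8s-control-plane
kustomize_substitutions:
  AWS_B64ENCODED_CREDENTIALS: "${AWS_B64ENCODED_CREDENTIALS}"
EOF
}

# Usage (from the cluster-api directory):
#   write_tilt_settings tilt-settings.yaml
```

The generated file contains the credentials in plain base64, so it should not be committed.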

Tilt knows how to run the AWS provider controllers because the `cluster-api-provider-aws` repository has a `tilt-provider.yaml` file at its root. Canonical Kubernetes also provides this file at the root of its repository. The CK8s provider names, `ck8s-bootstrap` and `ck8s-control-plane`, are defined in the CK8s `tilt-provider.yaml` file.

Next, you have to customize the variables that will be substituted into the cluster templates applied by the tests (these are under `test/e2e/data/infrastructure-aws`). You can customize the variables in the `test/e2e/config/ck8s-aws.yaml` file under the `variables` key.

Finally, in one terminal, go into the `cluster-api` directory and run `make tilt-up`. You should see a kind cluster being created, followed by a message indicating that Tilt is available at a certain address.

In a second terminal in the `cluster-api-k8s` directory, run `make USE_EXISTING_CLUSTER=true test-e2e`.

### Cleaning up after an e2e test

The test framework does its best to clean up resources after a test suite, but cloud resources
may still be left over. This is especially problematic if you run the tests multiple times
while iterating on development (see [Cluster API Book - Tear down](https://cluster-api.sigs.k8s.io/developer/e2e#tear-down)).

You can use a tool like [aws-nuke](https://github.com/eriksten/aws-nuke) to clean up your AWS account after a test.

## Develop an e2e test

Refer to [Developing E2E tests](https://cluster-api.sigs.k8s.io/developer/e2e) for a complete guide to developing e2e tests.
3 changes: 2 additions & 1 deletion test/e2e/cluster_upgrade_test.go
@@ -22,6 +22,7 @@ package e2e
 import (
 	. "github.com/onsi/ginkgo/v2"
 	"k8s.io/utils/ptr"
+	"sigs.k8s.io/cluster-api/test/framework/clusterctl"
 )

var _ = Describe("Workload cluster upgrade [CK8s-Upgrade]", func() {
@@ -33,7 +34,7 @@ var _ = Describe("Workload cluster upgrade [CK8s-Upgrade]", func() {
BootstrapClusterProxy: bootstrapClusterProxy,
ArtifactFolder: artifactFolder,
SkipCleanup: skipCleanup,
-InfrastructureProvider: ptr.To("docker"),
+InfrastructureProvider: ptr.To(clusterctl.DefaultInfrastructureProvider),
ControlPlaneMachineCount: ptr.To[int64](3),
WorkerMachineCount: ptr.To[int64](1),
}
128 changes: 128 additions & 0 deletions test/e2e/config/ck8s-aws.yaml
@@ -0,0 +1,128 @@
---
managementClusterName: capi-test

# E2E test scenario using local dev images and manifests built from the source tree for following providers:
# - bootstrap ck8s
# - control-plane ck8s
images:
  # Use local dev images built source tree;
  - name: ghcr.io/canonical/cluster-api-k8s/controlplane-controller:dev
    loadBehavior: mustLoad
  - name: ghcr.io/canonical/cluster-api-k8s/bootstrap-controller:dev
    loadBehavior: mustLoad

# List of providers that will be installed into the management cluster
# See InitManagementClusterAndWatchControllerLogs function call
providers:
  - name: cluster-api
    type: CoreProvider
    versions:
      - name: v1.8.4
        value: https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.8.4/core-components.yaml
        type: url
        contract: v1beta1
        files:
          - sourcePath: "../data/shared/v1beta1/metadata.yaml"
        replacements:
          - old: "imagePullPolicy: Always"
            new: "imagePullPolicy: IfNotPresent"
  - name: aws
    type: InfrastructureProvider
    versions:
      - name: v2.6.1
        value: "https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v2.6.1/infrastructure-components.yaml"
        type: url
        contract: v1beta2
        files:
          - sourcePath: "../data/shared/v1beta1_aws/metadata.yaml"
        replacements:
          - old: "imagePullPolicy: Always"
            new: "imagePullPolicy: IfNotPresent"

      # when bootstrapping with tilt, it will use
      # https://github.com/kubernetes-sigs/cluster-api/blob/main/hack/tools/internal/tilt-prepare/main.go
      # name here should match defaultProviderVersion
      - name: v1.9.99
        value: "https://github.com/kubernetes-sigs/cluster-api-provider-aws/releases/download/v2.6.1/infrastructure-components.yaml"
        type: url
        contract: v1beta2
        files:
          - sourcePath: "../data/shared/v1beta1_aws/metadata.yaml"
        replacements:
          - old: "imagePullPolicy: Always"
            new: "imagePullPolicy: IfNotPresent"
    files:
      - sourcePath: "../data/infrastructure-aws/cluster-template.yaml"
  - name: ck8s
    type: BootstrapProvider
    versions:
      # Could add older release version for upgrading test, but
      # by default, will only use the latest version defined in
      # ${ProjectRoot}/metadata.yaml to init the management cluster
      # this version should be updated when ${ProjectRoot}/metadata.yaml
      # is modified
      - name: v0.1.99 # next; use manifest from source files
        value: "../../../bootstrap/config/default"
        replacements:
          - old: "ghcr.io/canonical/cluster-api-k8s/bootstrap-controller:latest"
            new: "ghcr.io/canonical/cluster-api-k8s/bootstrap-controller:dev"
    files:
      - sourcePath: "../../../metadata.yaml"
  - name: ck8s
    type: ControlPlaneProvider
    versions:
      - name: v0.1.99 # next; use manifest from source files
        value: "../../../controlplane/config/default"
        replacements:
          - old: "ghcr.io/canonical/cluster-api-k8s/controlplane-controller:latest"
            new: "ghcr.io/canonical/cluster-api-k8s/controlplane-controller:dev"
    files:
      - sourcePath: "../../../metadata.yaml"

# These variables replace the variables in test/e2e/data/infrastructure-aws manifests
# They are used during clusterctl generate cluster
variables:
  KUBERNETES_VERSION_MANAGEMENT: "v1.30.0"
  KUBERNETES_VERSION: "v1.30.0"
  KUBERNETES_VERSION_UPGRADE_TO: "v1.30.1"
  IP_FAMILY: "IPv4"
  KIND_IMAGE_VERSION: "v1.30.0"
  AWS_CONTROL_PLANE_INSTANCE_TYPE: t3.large
  AWS_NODE_INSTANCE_TYPE: t3.large
  AWS_PUBLIC_IP: true
  AWS_CREATE_BASTION: true
  AWS_SSH_KEY_NAME: "default"
  AWS_AMI_ID: "ami-01b139e6226d65e4f"
  AWS_CONTROL_PLANE_ROOT_VOLUME_SIZE: 16
  AWS_NODE_ROOT_VOLUME_SIZE: 16
  AWS_REGION: "us-east-2"
  AWS_CCM_IMAGE: "registry.k8s.io/provider-aws/cloud-controller-manager:v1.28.3"
  # https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/test/e2e/data/e2e_conf.yaml#L203C1-L205C27
  # There is some work to be done here on figuring out which experimental features
  # we want to enable/disable.
  EXP_CLUSTER_RESOURCE_SET: "true"
  EXP_MACHINE_SET_PREFLIGHT_CHECKS: "false"
  CLUSTER_TOPOLOGY: "true"
  CAPA_LOGLEVEL: "4"

intervals:
  # Ref: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/test/e2e/data/e2e_conf.yaml
  default/wait-machines: [ "35m", "10s" ]
  default/wait-cluster: [ "35m", "10s" ]
  default/wait-control-plane: [ "35m", "10s" ]
  default/wait-worker-nodes: [ "35m", "10s" ]
  conformance/wait-control-plane: [ "35m", "10s" ]
  conformance/wait-worker-nodes: [ "35m", "10s" ]
  default/wait-controllers: [ "35m", "10s" ]
  default/wait-delete-cluster: [ "35m", "10s" ]
  default/wait-machine-upgrade: [ "35m", "10s" ]
  default/wait-contolplane-upgrade: [ "35m", "10s" ]
  default/wait-machine-status: [ "35m", "10s" ]
  default/wait-failed-machine-status: [ "35m", "10s" ]
  default/wait-infra-subnets: [ "5m", "30s" ]
  default/wait-machine-pool-nodes: [ "35m", "10s" ]
  default/wait-machine-pool-upgrade: [ "35m", "10s" ]
  default/wait-create-identity: [ "3m", "10s" ]
  default/wait-job: [ "35m", "10s" ]
  default/wait-deployment-ready: [ "35m", "10s" ]
  default/wait-loadbalancer-ready: [ "5m", "30s" ]
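
Each interval entry is a pair: a total timeout followed by a polling period. To illustrate that meaning only, here is a small shell sketch of polling with a timeout. `wait_until` is a hypothetical function and takes plain seconds rather than duration strings. The test framework itself is written in Go and does not use this code.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND every INTERVAL_SECONDS until it succeeds, giving up once
# more than TIMEOUT_SECONDS of waiting would be needed.
# INTERVAL_SECONDS must be greater than zero.
wait_until() {
    timeout="$1"
    interval="$2"
    shift 2
    elapsed=0
    while ! "$@"; do
        elapsed=$((elapsed + interval))
        if [ "$elapsed" -gt "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
    done
    return 0
}

# Usage: an entry such as [ "5m", "30s" ] corresponds roughly to
#   wait_until 300 30 some_readiness_check
# where some_readiness_check is a placeholder for your own command.
```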
8 changes: 4 additions & 4 deletions test/e2e/config/ck8s-docker.yaml
@@ -15,8 +15,8 @@ providers:
   - name: cluster-api
     type: CoreProvider
     versions:
-      - name: v1.6.2
-        value: https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/core-components.yaml
+      - name: v1.8.4
+        value: https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.8.4/core-components.yaml
         type: url
         files:
           - sourcePath: "../data/shared/v1beta1/metadata.yaml"
@@ -28,8 +28,8 @@ providers:
     versions:
       # By default, will use the latest version defined in ../data/shared/v1beta1/metadata.yaml
       # to init the management cluster
-      - name: v1.6.2 # used during e2e-test
-        value: https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.6.2/infrastructure-components-development.yaml
+      - name: v1.8.4 # used during e2e-test
+        value: https://github.com/kubernetes-sigs/cluster-api/releases/download/v1.8.4/infrastructure-components-development.yaml
         type: url
         files:
           - sourcePath: "../data/shared/v1beta1/metadata.yaml"
2 changes: 1 addition & 1 deletion test/e2e/create_test.go
@@ -48,7 +48,7 @@ var _ = Describe("Workload cluster creation", func() {
Expect(e2eConfig.Variables).To(HaveKey(KubernetesVersion))

clusterName = fmt.Sprintf("capick8s-create-%s", util.RandomString(6))
-infrastructureProvider = "docker"
+infrastructureProvider = clusterctl.DefaultInfrastructureProvider

// Setup a Namespace where to host objects for this spec and create a watcher for the namespace events.
namespace, cancelWatches = setupSpecNamespace(ctx, specName, bootstrapClusterProxy, artifactFolder)