
Start rendering assets using cluster-kube-apiserver operator renderer #322

Merged

Conversation

@mfojtik (Contributor, Author) commented Sep 25, 2018

This PR will start generating assets using the new cluster-kube-apiserver-operator image.
Along with the manifests and bootstrap static pods that should be used in future by bootkube start, it will provide the secrets and configmaps we can feed to the openshift-api-operator.

[mfojtik@dev-centos generated]$ docker run -v $(pwd)/tls:/assets --rm openshift/origin-cluster-kube-apiserver-operator:latest /usr/bin/cluster-kube-apiserver-operator render --asset-input-dir=/assets --asset-output-dir=/assets/kube-apiserver-bootstrap --config-output-file=/assets/kube-apiserver-bootstrap/config --config-override-file=/usr/share/bootkube/manifests/config/config-overrides.yaml
Writing asset: /assets/kube-apiserver-bootstrap/bootstrap-manifests/kube-apiserver-pod.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/etcd-service.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/openshift-kube-apiserver-ns.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/secret-aggregator-client.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/secret-etcd-client.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/secret-serving-cert.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-client-ca.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-kubelet-serving-ca.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-sa-token-signing-certs.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/kube-apiserver-daemonset.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/secret-kubelet-client.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-aggregator-client-ca.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-etcd-serving-ca.yaml
Writing asset: /assets/kube-apiserver-bootstrap/manifests/configmap-kube-apiserver-config.yaml

/cc @deads2k
/cc @juanvallejo
/cc @sttts

@openshift-ci-robot openshift-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 25, 2018

# shellcheck disable=SC2154
podman run \
--volume "$PWD:/assets:z" \
@mfojtik (Contributor, Author) commented:

this assumes the $PWD has the generated/tls secrets, is that assumption correct?

A reviewer (Member) replied:

> this assumes the $PWD has the generated/tls secrets, is that assumption correct?

The working directory is set here, so the generated TLS will be in ${PWD}/tls. So no generated directory, but I think you're handling this correctly.

@mfojtik (Contributor, Author) replied:

yeah, ${PWD}/tls sounds like what I want, thanks!
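The assumption settled in this thread can be made explicit with a small pre-flight check before mounting $PWD into the render container. This is an illustrative sketch, not part of the PR; the `assets_ready` function name is hypothetical:

```shell
#!/bin/sh
# Illustrative pre-flight check for the thread above: confirm the generated
# TLS assets exist (and are non-empty) under ${PWD}/tls before running the
# render container. Not part of the PR; assets_ready is a hypothetical name.
assets_ready() {
    dir=${1:-"$PWD/tls"}
    # The directory must exist and contain at least one entry.
    [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]
}
```

With a check like this in place, the podman invocation from the review hunk would only run once the TLS directory has actually been populated.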

@mfojtik mfojtik force-pushed the add-kube-api-server branch 4 times, most recently from db98924 to 74084ed Compare September 26, 2018 18:36
@openshift-bot openshift-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 30, 2018
@openshift-ci-robot openshift-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Oct 1, 2018
@openshift-bot openshift-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 1, 2018
@mfojtik (Contributor, Author) commented Oct 1, 2018

@smarterclayton @deads2k I updated this to fit into the new Go templates and also added the controller-manager render. Once we are confident that our operators provide the same experience as the kube-core control plane, we can switch over simply by copying the manifests we need.

Also, this demonstrates that our operator rendering functionality works (we should probably think about the image versions in bootkube.go, as currently :latest is used).

@deads2k (Contributor) commented Oct 1, 2018

@abhinavdahiya this is needed to set up the resources used by our operators. Can you have a look?

/assign @abhinavdahiya

@abhinavdahiya (Contributor) commented:

This PR is more useful if we:

  1. drop the corresponding things from the tectonic-operators kube-core-operator and bump it here
  2. use the new rendered assets for bootstrapping.

Otherwise this is an unused code path.
cc @crawford

@deads2k (Contributor) commented Oct 1, 2018

> This PR is more useful if we:
>
>   1. drop the corresponding things from the tectonic-operators kube-core-operator and bump it here
>   2. use the new rendered assets for bootstrapping.
>
> Otherwise this is an unused code path.
> cc @crawford

We create additional/different resources. Doesn't this start producing those? We want to enable new things and see them work before removing old.

@abhinavdahiya (Contributor) commented:

> We create additional/different resources. Doesn't this start producing those? We want to enable new things and see them work before removing old.

I meant to say these new files are rendered on disk but not actually used.

@deads2k (Contributor) commented Oct 1, 2018

> I meant to say these new files are rendered on disk but not actually used.

Where do we put them to have them created?

@abhinavdahiya (Contributor) commented:

@deads2k

if [ ! -d kco-bootstrap ]
then
	echo "Rendering Kubernetes core manifests..."
	# shellcheck disable=SC2154
	podman run \
		--volume "$PWD:/assets:z" \
		--volume /etc/kubernetes:/etc/kubernetes:z \
		"{{.KubeCoreRenderImage}}" \
		--config=/assets/kco-config.yaml \
		--output=/assets/kco-bootstrap
	cp --recursive kco-bootstrap/bootstrap-configs /etc/kubernetes/bootstrap-configs
	cp --recursive kco-bootstrap/bootstrap-manifests .
	cp --recursive kco-bootstrap/manifests .
fi

kube-core-operator renders its assets into 3 dirs:

$ ls -l /opt/tectonic/kco-bootstrap/
bootstrap-configs
bootstrap-manifests
manifests
  • bootstrap-configs
    This is copied to /etc/kubernetes/bootstrap-configs and is used by the bootstrap control plane.

  • bootstrap-manifests
    This is copied to /opt/tectonic; it is then used by bootkube start as the source of the bootstrap control plane's static pods.

  • manifests
    This is merged using cp with the already existing /opt/tectonic/manifests dir. /opt/tectonic/manifests is used by bootkube start to push manifests into the cluster when the API is up.

  • /opt/tectonic/tls and /opt/tectonic/auth
    These directories hold the TLS assets and the kubeconfig for the bootstrap control plane, respectively. These are also used by bootkube start.

https://github.com/kubernetes-incubator/bootkube/blob/master/pkg/bootkube/bootstrap.go#L28
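The copy steps described above can be sketched as a single helper. This is a hedged sketch: `copy_rendered_assets` and its parameters are illustrative names, but the three directory names follow the kco-bootstrap layout from this comment:

```shell
#!/bin/sh
# Sketch of the post-render copy steps described above. copy_rendered_assets
# and its parameters are illustrative; the three directory names follow the
# kco-bootstrap layout from this comment.
copy_rendered_assets() {
    render_dir=$1    # e.g. /opt/tectonic/kco-bootstrap
    tectonic_dir=$2  # e.g. /opt/tectonic
    kube_dir=$3      # e.g. /etc/kubernetes

    # bootstrap-configs feed the bootstrap control plane.
    cp --recursive "$render_dir/bootstrap-configs" "$kube_dir/bootstrap-configs"

    # bootstrap-manifests are the static pods that `bootkube start` runs.
    cp --recursive "$render_dir/bootstrap-manifests" "$tectonic_dir/"

    # manifests merge into the already existing manifests dir, which
    # `bootkube start` pushes into the cluster once the API is up.
    cp --recursive "$render_dir/manifests" "$tectonic_dir/"
}
```

Note that GNU `cp --recursive` merges a source directory into an existing destination directory of the same name, which is what the manifests step relies on.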

@abhinavdahiya (Contributor) commented Oct 3, 2018

@deads2k @mfojtik any progress on this?

You can now get the image for your operator using https://github.com/openshift/installer/blob/master/pkg/asset/ignition/content/bootkube.go#L32

@mfojtik (Contributor, Author) commented Oct 3, 2018

@abhinavdahiya updated, I think this can merge (even if it is a no-op for now) and we can figure out which manifest to copy where as a second step.

/cc @smarterclayton
/cc @deads2k

@abhinavdahiya (Contributor) commented:

/approve

--config-output-file=/assets/kube-controller-manager-bootstrap/config

# TODO: copy the bootstrap manifests to replace kube-core-operator
cp --recursive kube-apiserver-bootstrap/manifests/openshift-kube-controller-manager-ns.yaml manifests/00_openshift-kube-controller-manager-ns.yaml
A reviewer (Contributor) suggested:

cp --recursive kube-controller-manager-bootstrap/manifests/openshift-kube-controller-manager-ns.yaml manifests/00_openshift-kube-controller-manager-ns.yaml

@mfojtik (Contributor, Author) commented Oct 3, 2018

still having bootkube fail the first time with

bootkube.sh[808]: cp: cannot stat ‘kube-apiserver-bootstrap/manifests/openshift-kube-controller-manager-ns.yaml’: No such file or directory

@sjenning fixed, I hate bash...

@abhinavdahiya can you re-tag please, hopefully last time...
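The failure above is the classic "copy a file the renderer has not produced yet" trap. A defensive sketch of that copy follows; it is illustrative only, and this guard is not the actual fix that landed in the PR:

```shell
#!/bin/sh
# Illustrative guard around the kind of copy that failed above: only copy a
# rendered manifest when it actually exists. copy_if_rendered is a
# hypothetical helper, not the PR's actual fix.
copy_if_rendered() {
    src=$1
    dst=$2
    if [ -f "$src" ]; then
        cp "$src" "$dst"
    else
        # Surfaces the "cannot stat" failure mode without aborting bootkube.
        echo "warning: $src not rendered yet, skipping" >&2
        return 1
    fi
}
```

For example, the namespace-manifest copy from the review comment could be wrapped as `copy_if_rendered kube-controller-manager-bootstrap/manifests/openshift-kube-controller-manager-ns.yaml manifests/00_openshift-kube-controller-manager-ns.yaml || true`.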

@abhinavdahiya (Contributor) commented:

/lgtm

@openshift-ci-robot openshift-ci-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 3, 2018
@openshift-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: abhinavdahiya, mfojtik

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@abhinavdahiya (Contributor) commented:

/retest

1 similar comment
@mfojtik (Contributor, Author) commented Oct 3, 2018

/retest

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@wking (Member) commented Oct 3, 2018

With multiple folks (including the bot ;) banging away on /retest, it's helpful (for me at least) to drop a few lines from the error you saw into a comment here (e.g. here). That makes it easier to see whether we're hitting the same error each time, in which case it's likely a real bug and not a temporary flake.

@wking (Member) commented Oct 3, 2018

e2e:

error: .status.conditions accessor error: Failure is of the type string, expected map[string]interface{}
timeout waiting for router to be available
2018/10/03 20:56:49 Container test in pod e2e-aws failed, exit code 1, reason Error

We've seen that before, e.g. here. It's a wait flake.

/retest

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@mfojtik (Contributor, Author) commented Oct 4, 2018

smoke:

could not wait for pod to complete: could not wait for pod completion: the pod ci-op-ytkl853l/e2e-aws-smoke failed after 2h3m33s (failed containers: setup, test):  unknown

Container setup exited with code 1, reason Error
Container test exited with code 1, reason Error
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Waiting for API at https://ci-op-0gmd4k3x-1d3f3-api.origin-ci-int-aws.dev.rhcloud.com:6443 to respond ...
Another process exited
2018/10/04 06:28:21 Container test in pod e2e-aws failed, exit code 1, reason Error

I suspect this is this PR's fault?

@mfojtik (Contributor, Author) commented Oct 4, 2018

/retest

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@mfojtik (Contributor, Author) commented Oct 4, 2018

failure was:

Error: Error applying plan:

3 error(s) occurred:

* module.bootstrap.aws_iam_role.bootstrap: 1 error(s) occurred:

* aws_iam_role.bootstrap: Error creating IAM Role ci-op-0gmd4k3x-1d3f3-bootstrap-role: EntityAlreadyExists: Role with name ci-op-0gmd4k3x-1d3f3-bootstrap-role already exists.
	status code: 409, request id: 64ced276-c7af-11e8-b362-6b3d06569f8f
* module.masters.aws_iam_role.master_role: 1 error(s) occurred:

* aws_iam_role.master_role: Error creating IAM Role ci-op-0gmd4k3x-1d3f3-master-role: EntityAlreadyExists: Role with name ci-op-0gmd4k3x-1d3f3-master-role already exists.
	status code: 409, request id: 64ceab1c-c7af-11e8-9915-c37dbaece9fa
* module.iam.aws_iam_role.worker_role: 1 error(s) occurred:

* aws_iam_role.worker_role: Error creating IAM Role ci-op-0gmd4k3x-1d3f3-worker-role: EntityAlreadyExists: Role with name ci-op-0gmd4k3x-1d3f3-worker-role already exists.
	status code: 409, request id: 64ce5d08-c7af-11e8-b78d-41b07c69b921

/retest

@mfojtik (Contributor, Author) commented Oct 4, 2018

failed with:

which: no extended.test in (/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin)
/bin/bash: line 93: ginkgo: command not found

/retest

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

2 similar comments
@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

@wking (Member) commented Oct 4, 2018

/hold

Waiting on #415 to unstick CI.

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 4, 2018
@wking (Member) commented Oct 4, 2018

#415 is in.

/hold cancel
/retest

@openshift-ci-robot openshift-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 4, 2018
@openshift-ci-robot (Contributor) commented Oct 4, 2018

@mfojtik: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
ci/prow/e2e-aws-smoke 74084ed link /test e2e-aws-smoke

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot (Contributor) commented:

/retest

Please review the full test history for this PR and help us cut down flakes.

Labels: approved, lgtm, size/M

8 participants