Link etcd certificates for calico-node error #3464

forkballpitch · 2018-10-06T14:07:44Z

I've got an error here.

Help me, please.

failed: [node5] (item={u's': u'node-node5.pem', u'd': u'cert.crt'}) => {"changed": false, "item": {"d": "cert.crt", "s": "node-node5.pem"}, "msg": "Error while linking: [Errno 2] No such file or directory", "path": "/etc/calico/certs/cert.crt", "state": "absent"}

My hosts.ini file is:

[k8s-cluster:children]
kube-master
kube-node

[all]
node1 ansible_host=209.XXX.188.XX ip=209.XXX.188.XX
node2 ansible_host=209.XXX.188.XXX ip=209.XXX.188.XX
node3 ansible_host=209.XXX.188.XXX ip=209.XXX.188.XX
node4 ansible_host=209.XXX.188.XXX ip=209.XXX.188.XX
node5 ansible_host=209.XXX.188.XXX ip=209.XXX.188.XX

[kube-master]
node1
node2
node3

[kube-node]
node4
node5

[etcd]
node1
node2
node3

[calico-rr]

[vault]
node1
node2
node3

mirwan · 2018-10-07T14:42:46Z

@forkballpitch Could you provide the information listed in the issue template (OS, distribution, ..., command line) and the name of the task that raises the error?

forkballpitch · 2018-10-08T00:10:48Z

@mirwan I just cloned the source from "https://github.com/kubernetes-incubator/kubespray.git"
and added more worker servers.
If you need more information, please tell me.
Thank you!

OS: Ubuntu 16.04.4
kubespray version: latest
command line: ansible-playbook -b -v -i inventory/prod/hosts.ini cluster.yml

bartlaarhoven · 2018-10-08T13:43:02Z

I'm having the same problem. Ubuntu 16.04 clean installs on both kubespray host and kube nodes, kubespray pulled from git, command line:

ansible-playbook -i inventory/kube-cluster-01/hosts.ini cluster.yml

mirwan · 2018-10-08T18:50:54Z

First can you confirm that:

the (last) failed task reported is "Calico | Link etcd certificates for calico-node"
cert_management is set to "vault"
ansible-playbook has been executed with "-b"
the source file for the link does not actually exist (e.g. /etc/ssl/etcd/ssl/node-node5.pem)

If so, could you check whether any task failed earlier (on the etcd servers during cert generation, memory checks, ...)?
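
The last point of the checklist can be verified directly on the failing node. What follows is only a minimal sketch, assuming the source directory /etc/ssl/etcd/ssl and the file names node-<host>.pem and node-<host>-key.pem that appear in this thread; the function name and the cert_dir parameter are hypothetical and not part of kubespray.

```python
import os


def missing_link_sources(hostname, cert_dir="/etc/ssl/etcd/ssl"):
    """Return the node cert/key paths under cert_dir that do not exist.

    The file-name pattern and the default directory are taken from the
    paths quoted in this thread; adjust them if your layout differs.
    """
    expected = [
        os.path.join(cert_dir, "node-%s.pem" % hostname),
        os.path.join(cert_dir, "node-%s-key.pem" % hostname),
    ]
    # A link task can only succeed if its source file is really there.
    return [path for path in expected if not os.path.isfile(path)]


# Example: run on the failing node, passing its inventory name.
for path in missing_link_sources("node5"):
    print("missing:", path)
```

An empty result would suggest the link sources are in place and the problem lies elsewhere; a non-empty one points back at certificate generation or sync.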

bartlaarhoven · 2018-10-08T19:18:08Z

For me:

I attached the output of the failed last task: kubespray-failed-last-task.txt
cert_management was unset (commented out in inventory/kube-cluster-01/group_vars/all/all.yml)
the command was ansible-playbook -i inventory/kube-cluster-01/hosts.ini cluster.yml, so no -b
on the failed nodes, the source files indeed do not exist

Other possibly related errors or warnings are:

TASK [kubernetes/secrets : Check_certs | Set 'sync_certs' to true on nodes] ***********************************************************************************************************************
Monday 08 October 2018  17:03:51 +0200 (0:00:04.885)       0:05:34.603 ********
 [WARNING]: when statements should not include jinja2 templating delimiters such as {{ }} or {% %}. Found: inventory_hostname in groups['kube-node'] and inventory_hostname != groups['kube-master'][0] and (not item in kubecert_node.files | map(attribute='path') | map("basename") | list or kubecert_node.files | selectattr("path", "equalto", "{{ kube_cert_dir }}/{{ item }}") | map(attribute="checksum")|first|default('') != kubecert_master.files | selectattr("path", "equalto", "{{ kube_cert_dir }}/{{ item }}") | map(attribute="checksum")|first|default(''))

but also in the same task:

ok: [node5] => (item=node-node5-key.pem)

I didn't find any failed tasks.

Does this help?

bartlaarhoven · 2018-10-08T19:37:51Z

Additional notes:

node1, node2 and node3 have vault and etcd labels
the node-node5.pem file does exist on node1, node2 and node3 in /etc/ssl/etcd/ssl/ (and so do the other missing files)
on the other nodes like node5, the /etc/ssl/etcd/ssl directory contains ca.pem, node-node1-key.pem and node-node1.pem. That's it.

I'm completely new to ansible and trying kubespray for the first time, so I'd love to help out but I'm still figuring out how it works.
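
The directory listing described above can be compared against what a node would be expected to hold. This is a rough sketch, assuming (this is an assumption, not something confirmed in the thread) that each node should have ca.pem plus its own node-<host>.pem and node-<host>-key.pem; the function name is hypothetical.

```python
def audit_cert_listing(hostname, listing):
    """Compare a directory listing with the files assumed for hostname."""
    # Assumption: a node needs the CA plus its own certificate and key.
    expected = {
        "ca.pem",
        "node-%s.pem" % hostname,
        "node-%s-key.pem" % hostname,
    }
    present = set(listing)
    return {
        "missing": sorted(expected - present),
        "unexpected": sorted(present - expected),
    }


# Listing reported for node5 in this thread.
report = audit_cert_listing(
    "node5", ["ca.pem", "node-node1-key.pem", "node-node1.pem"]
)
print(report)
```

For the node5 listing, both node5 files show up as missing and both node1 files as unexpected, which matches the symptom that another node's certificate landed there.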

mirwan · 2018-10-08T21:59:33Z

First, I think you must use the -b flag (the documentation is being updated accordingly).
Then, if cert_management is not set in group_vars, there is no need to populate the vault group, as cert management defaults to "script".
Anyway, if the node5 cert and key do not exist, it certainly means that they were either not generated or not synced to node5. Can you look at the whole playbook output and see whether the "Gen_certs | run cert generation script", "Gen_certs | Gather etcd node certs" and "Gen_certs | Write etcd node certs" tasks ran properly?

forkballpitch · 2018-10-09T02:45:27Z

There is something I don't understand. The first ini file below produces the error and the second one does not.
The error is "can not find /etc/calico/certs/cert.crt".
I have kubespray pulled from git; command line:

ansible-playbook -b -v -i inventory/prod/hosts.ini cluster.yml

hosts.ini (error on node4)

[k8s-cluster:children]
kube-master
kube-node

[all]
node1 ansible_host=~ ip=~
node2 ansible_host=~ ip=~
node3 ansible_host=~ ip=~
node4 ansible_host=~ ip=~

[kube-master]
node1
node2

[kube-node]
node1
node2
node3
node4

[etcd]
node1
node2
node3

[calico-rr]

hosts.ini (no error; I removed node1 to node3 from the [kube-node] section)

[k8s-cluster:children]
kube-master
kube-node

[all]
node1 ansible_host=~ ip=~
node2 ansible_host=~ ip=~
node3 ansible_host=~ ip=~
node4 ansible_host=~ ip=~

[kube-master]
node1
node2

[kube-node]

node4

[etcd]
node1
node2
node3

[calico-rr]

And it works!
root@k-01:~/kubespray# kubectl get nodes
NAME    STATUS   ROLES         AGE   VERSION
node1   Ready    master,node   14m   v1.12.1
node2   Ready    master,node   14m   v1.12.1
node3   Ready    node          14m   v1.12.1
node4   Ready    node          14m   v1.12.1

mirwan · 2018-10-09T06:08:15Z

@forkballpitch I didn't think a server could be in kube-node and in etcd/kube-master at the same time. The doc says it can; I will look into it.
@bartlaarhoven Maybe it is the same for you?

mirwan · 2018-10-09T08:57:19Z

Actually, mixing masters/etcd and workload (i.e. nodes) is not a best practice in production.
As long as you have enough servers, you should keep nodes on one side and masters and/or etcd on the other.
Our current CI only handles mixing master/etcd with nodes when deploying a cluster of 3 nodes or fewer.
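
To see which hosts an inventory places in both the workload and the master/etcd groups, a small check along these lines may help. It is a simplified sketch and not how Ansible itself parses inventories: it ignores :children and :vars semantics, and both function names are hypothetical. The sample is the group part of the first (failing) inventory posted above.

```python
def parse_groups(inventory_text):
    """Map each [group] header to the host names listed under it."""
    groups, current = {}, None
    for raw in inventory_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups.setdefault(current, [])
        elif current is not None:
            # The host name is the first token; host variables follow it.
            groups[current].append(line.split()[0])
    return groups


def hosts_with_mixed_roles(groups):
    """Return hosts that are in kube-node and also in kube-master or etcd."""
    control = set(groups.get("kube-master", [])) | set(groups.get("etcd", []))
    return sorted(control & set(groups.get("kube-node", [])))


INVENTORY = """
[kube-master]
node1
node2

[kube-node]
node1
node2
node3
node4

[etcd]
node1
node2
node3
"""

print(hosts_with_mixed_roles(parse_groups(INVENTORY)))
```

For this inventory the check reports node1, node2 and node3; for the second inventory, where [kube-node] contains only node4, it reports nothing.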

mirwan · 2018-10-09T09:00:09Z

@forkballpitch By the way, have you reset your servers (with the reset.yml playbook) between your deployments with the two inventories? kubectl get nodes should not report node1 and node2 as nodes.

bartlaarhoven · 2018-10-09T11:43:12Z

I've played around with Ansible and kubespray and opened #3486 as that is what fixed it for me.

dkozlov · 2018-10-10T05:57:40Z

I have reproduced this issue with ansible==2.7.0.
As a workaround, you can install ansible==2.6.3.

mirwan · 2018-10-11T07:34:35Z

@bartlaarhoven Regarding @dkozlov 's comment, what version of ansible are you using?

tadeugr · 2018-10-11T20:58:20Z

@bartlaarhoven Regarding @dkozlov 's comment, what version of ansible are you using?

@mirwan, I'm having the same problem and can confirm that Kubespray revision 3b750ca returns this error when using Ansible 2.7.0.

It works with Ansible 2.6.3 as dkozlov said.
It also works with Ansible 2.6.5.
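
The version reports collected so far can be turned into a small pre-flight check. This sketch only encodes the versions named in this thread (2.7.0 failing; 2.6.3 and 2.6.5 working) and treats everything else as unknown; the function and constant names are hypothetical, and the banner format is assumed to be the "ansible-playbook 2.7.0" first line of ansible-playbook --version.

```python
# Versions named in this thread; anything else is not covered by it.
REPORTED_BROKEN = {(2, 7, 0)}
REPORTED_WORKING = {(2, 6, 3), (2, 6, 5)}


def version_from_banner(first_line):
    """Extract a version tuple from a line such as 'ansible-playbook 2.7.0'."""
    # Only plain numeric versions are handled in this sketch.
    return tuple(int(part) for part in first_line.split()[-1].split("."))


def classify(version):
    """Say what this thread reports about the given Ansible version."""
    if version in REPORTED_BROKEN:
        return "reported broken"
    if version in REPORTED_WORKING:
        return "reported working"
    return "unknown"


print(classify(version_from_banner("ansible-playbook 2.7.0")))
```

Keeping the lists explicit avoids guessing about releases nobody in the thread has tested.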

bartlaarhoven · 2018-10-12T06:01:04Z

@dkozlov @mirwan I've used the most recent version of Ansible (fresh install)

ansible-playbook 2.7.0
  config file = None
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
  executable location = /usr/local/bin/ansible-playbook
  python version = 2.7.12 (default, Dec  4 2017, 14:50:18) [GCC 5.4.0 20160609]

mirwan · 2018-10-12T07:52:45Z

I was able to reproduce the issue with Ansible 2.7.
It seems that Ansible gets confused at the task "etcd : Gen_certs | Write etcd node certs" (the cert from one node is written both on that node and on another one).
By the way, the task "etcd : Gen_certs | Get etcd certificate serials" wrongly succeeds for the node with the wrong cert.
I'm looking into it.
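
One way to spot a certificate that ended up on more than one node is to compare digests of the node certificates collected from each machine. This is a minimal sketch: the input mapping (node name to the bytes of the node cert found on that machine) is something you would have to gather yourself, and the function name is hypothetical.

```python
import hashlib
from collections import defaultdict


def nodes_sharing_a_cert(certs_by_node):
    """Group nodes whose collected certificate bytes are identical."""
    by_digest = defaultdict(list)
    for node, pem_bytes in certs_by_node.items():
        by_digest[hashlib.sha256(pem_bytes).hexdigest()].append(node)
    # Each node should have its own cert, so any group of two or more
    # nodes with the same digest is suspicious.
    return sorted(sorted(nodes) for nodes in by_digest.values() if len(nodes) > 1)


# Placeholder bytes stand in for real PEM contents.
sample = {"node1": b"cert-A", "node4": b"cert-B", "node5": b"cert-A"}
print(nodes_sharing_a_cert(sample))
```

An empty list means every collected certificate is distinct; it does not prove each one was issued for the node holding it.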

mirwan · 2018-10-12T09:13:01Z

I think we are currently hitting this issue: ansible/ansible#46600
Maybe a fix would consist of using another Ansible module...

bartlaarhoven · 2018-10-12T09:15:04Z

I have issues signing the collaboration document (as it should be from my company etc.) but I'd like to point again to my PR #3486 as that fixed it for me in Ansible 2.7 and it uses the same way of distributing certificates as in other parts of kubespray.

mirwan · 2018-10-12T12:03:49Z

@bartlaarhoven I'm currently testing your branch ;-)

caruccio · 2018-10-15T20:52:16Z

hey @mirwan any news on this topic? This is a show stopper for me...

mirwan · 2018-10-15T21:42:09Z

@caruccio There's only one step left before merging PR #3486 (and I guess you know what's left to be done and certainly why this step cannot be skipped). In the meantime, downgrading to Ansible 2.6 could do the trick.

caruccio · 2018-10-15T21:46:39Z

I see... I live in Brazil and I really know what bureaucracy means for life on earth.

thiguetta · 2018-10-22T14:05:03Z

I'm still facing this problem on v2.7 and master.
Any updates on this?

bartlaarhoven · 2018-10-22T14:16:51Z

@mirwan Do you have a contact point for me at TLF to get me another agreement?

ant31 · 2018-10-22T15:46:40Z

@thiguetta as said, it's a bug in ansible 2.7, it's not something we can fix in kubespray.
The only update we have is to use ansible 2.6.x until the ansible team fixes the issue.

mirwan · 2018-10-22T18:42:17Z

@bartlaarhoven I don't have any contact point except the one mentioned by the bot (helpdesk@rt.linuxfoundation.org) :-/

itsecforu · 2019-11-25T07:54:58Z

I fixed it with this solution:
http://itisgood.ru/2019/11/25/ispravljaem-oshibku-error-error-accessing-the-calico-datastore-could-not-initialize-etcdv3-client-open-calico-secrets-cert-crt-no-such-file-or-directory/

mirwan added the triage/needs-information label Oct 7, 2018

bartlaarhoven mentioned this issue Oct 9, 2018

Distribute node etcd certificates like it's done in kubernetes/secrets #3486

Merged

mirwan mentioned this issue Oct 12, 2018

Replace shell with command in order to allow the task to fail when openssl x509 does not return zero #3516

Merged

mirwan removed the triage/needs-information label Oct 15, 2018

This was referenced Oct 22, 2018

Playbook fails on "Get etcd certificate serials" #3570

Closed

unable to load certificate (kubespray setup on AWS) #3571

Closed

Zefool mentioned this issue Oct 25, 2018

Limit ansible version to side-step templating bugs #3587

Closed

ant31 closed this as completed in #3486 Oct 29, 2018
