Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

etcd v3.2.5 tls broken or a breaking change is introduced #8534

Closed
raoofm opened this issue Sep 8, 2017 · 30 comments
Closed

etcd v3.2.5 tls broken or a breaking change is introduced #8534

raoofm opened this issue Sep 8, 2017 · 30 comments
Milestone

Comments

@raoofm
Copy link
Contributor

raoofm commented Sep 8, 2017

carrying this over from coreos/etcd-operator#1384

May be related to #8268

@heyitsanthony
Copy link
Contributor

@raoofm what was the version when it was last working?

@heyitsanthony
Copy link
Contributor

OK, I see 3.1.8. Nevermind

@raoofm
Copy link
Contributor Author

raoofm commented Sep 8, 2017

@raoofm
Copy link
Contributor Author

raoofm commented Sep 11, 2017

@heyitsanthony did you get a chance to check this

@gyuho
Copy link
Contributor

gyuho commented Sep 11, 2017

etcdmain: rejected connection from "10.39.167.136:44976" (tls: "10.39.167.136" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])

etcdmain: rejected connection from "10.38.202.130:45086" (tls: "10.38.202.130" does not match any of DNSNames [".etcd-dev-cluster.etcd-operator.svc"])

Why some DNS names don't have *. (from your logs)?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 11, 2017

@gyuho sorry it is the formatting issue. It italicized it from 1 star to the other

@heyitsanthony
Copy link
Contributor

@raoofm you filed the bug friday afternoon at 3:40PM, it's monday morning now. Did you expect me to work after hours on this?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 11, 2017

you filed the bug friday afternoon at 3:40PM, it's monday morning now. Did you expect me to work after hours on this?

😄 no clue on what time zone you work on dude. It was just a friendly reminder to check if it is something that changed with etcd or have I messed up something with my certs. You can take your time and of-course your time and work is greatly appreciated.

@heyitsanthony heyitsanthony added this to the v3.3.0 milestone Sep 12, 2017
@heyitsanthony heyitsanthony self-assigned this Sep 12, 2017
@rockwyc992
Copy link

It could work on my machine with v3.2.7 & v3.1.8, but it could not work with v3.2.0

with v3.2.7

Sep 13 16:53:44 K8s-master etcd[11080]: Go OS/Arch: linux/amd64
Sep 13 16:53:44 K8s-master etcd[11080]: Go Version: go1.9
Sep 13 16:53:44 K8s-master etcd[11080]: Git SHA: GitNotFound
Sep 13 16:53:44 K8s-master etcd[11080]: etcd Version: 3.2.7

with v3.1.8

Sep 13 01:09:06 K8s-master etcd[8490]: Go OS/Arch: linux/amd64
Sep 13 01:09:06 K8s-master etcd[8490]: Go Version: go1.9
Sep 13 01:09:06 K8s-master etcd[8490]: Git SHA: GitNotFound
Sep 13 01:09:06 K8s-master etcd[8490]: etcd Version: 3.1.8

with v3.2.0

Sep 13 00:46:21 K8s-master etcd[6592]: 30f4389d6d822e9 became candidate at term 1304
Sep 13 00:46:21 K8s-master etcd[6592]: 30f4389d6d822e9 is starting a new election at term 1303
Sep 13 00:46:20 K8s-master etcd[6592]: rejected connection from "192.168.254.110:40518" (tls: "192.168.254.110" does not match any of DNSNames ["kubernetes.mine.tw" ...
Sep 13 00:46:20 K8s-master etcd[6592]: rejected connection from "192.168.254.110:40506" (tls: "192.168.254.110" does not match any of DNSNames ["kubernetes.mine.tw" ...
Sep 13 00:46:19 K8s-master etcd[6592]: rejected connection from "192.168.254.110:40468" (tls: "192.168.254.110" does not match any of DNSNames ["kubernetes.mine.tw" ...
Sep 13 00:46:19 K8s-master etcd[6592]: rejected connection from "192.168.254.110:40474" (tls: "192.168.254.110" does not match any of DNSNames ["kubernetes.mine.tw" ...
Sep 13 00:46:19 K8s-master etcd[6592]: 30f4389d6d822e9 [logterm: 1, index: 4] sent MsgVote request to 905789964ad0bb84 at term 1303
Sep 13 00:46:19 K8s-master etcd[6592]: 30f4389d6d822e9 [logterm: 1, index: 4] sent MsgVote request to da3a7ef3a236533 at term 1303
Sep 13 00:46:19 K8s-master etcd[6592]: 30f4389d6d822e9 [logterm: 1, index: 4] sent MsgVote request to 41a34e91fc16717 at term 1303
Sep 13 00:46:19 K8s-master etcd[6592]: 30f4389d6d822e9 received MsgVoteResp from 30f4389d6d822e9 at term 1303
Sep 13 00:46:19 K8s-master etcd[6592]: 30f4389d6d822e9 became candidate at term 1303
.......
Sep 13 00:46:19 K8s-master etcd[6592]: ClientTLS: cert = /etc/kubernetes/ssl/kubernetes.pem, key = /etc/kubernetes/ssl/kubernetes-key.pem, ca = /etc/kubernetes/ssl/kubernetes.pem, trusted-ca = /etc/kubernetes/ssl/ca.pem, client-cert-auth = false
Sep 13 00:46:19 K8s-master etcd[6592]: starting server... [version: 3.2.0, cluster version: to_be_decided]
Sep 13 00:46:19 K8s-master etcd[6592]: simple token is not cryptographically signed
Sep 13 00:46:19 K8s-master etcd[6592]: advertise client URLs = https://192.168.254.100:2379
Sep 13 00:46:19 K8s-master etcd[6592]: peerTLS: cert = /etc/kubernetes/ssl/kubernetes.pem, key = /etc/kubernetes/ssl/kubernetes-key.pem, ca = /etc/kubernetes/ssl/kubernetes.pem, trusted-ca = /etc/kubernetes/ssl/ca.pem, client-cert-auth = false
Sep 13 00:46:19 K8s-master etcd[6592]: the server is already initialized as member before, starting as etcd member...
Sep 13 00:46:19 K8s-master etcd[6592]: Go OS/Arch: linux/amd64
Sep 13 00:46:19 K8s-master etcd[6592]: Go Version: go1.8.3
Sep 13 00:46:19 K8s-master etcd[6592]: Git SHA: 66722b1
Sep 13 00:46:19 K8s-master etcd[6592]: etcd Version: 3.2.0

And I generate key pair with cfssl

ca-csr.json

{
    "CN": "kubernetes",
    "key": {
        "algo": "rsa",
        "size": 8192
    },
    "names": [
        {
            "C": "TW",
            "ST": "Some City",
            "L": "MINE",
            "O": "k8s",
            "OU": "System"
        }
    ]
}

kubenetes-csr.json

{
    "CN": "kubernetes",
    "hosts": [
        "192.168.254.100",
        "192.168.254.110",
        "192.168.254.120",
        "192.168.254.130",
        "kubernetes.mine.tw",
        "k8s.mine.tw",
        "127.0.0.1",
        "10.254.0.1",
        "kubernetes",
        "kubernetes.default",
        "kubernetes.default.svc",
        "kubernetes.default.svc.k8s.mine.tw",
        "kubernetes.default.svc.k8s.mine.tw"
    ],
    "key": {
        "algo": "rsa",
        "size": 8192
    },
    "names": [
        {
            "C": "TW",
            "ST": "Some City",
            "L": "MINE",
            "O": "k8s",
            "OU": "System"
        }
    ]
}

@heyitsanthony
Copy link
Contributor

@rockwyc992 there were TLS changes in 3.2 that were not in 3.1 that introduced some bugs, 3.2.0 isn't expected to work and there have been patch releases since then that fixes it.

@raoofm
Copy link
Contributor Author

raoofm commented Sep 13, 2017

@heyitsanthony v3.2.7 fails with the same reasosn as well

@raoofm
Copy link
Contributor Author

raoofm commented Sep 13, 2017

2017-09-13 15:17:44.727680 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_PROTO=tcp
2017-09-13 15:17:44.727776 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_PORT=2379
2017-09-13 15:17:44.740336 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT=tcp://100.71.29.46:2379
2017-09-13 15:17:44.740408 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_PORT=2379
2017-09-13 15:17:44.740427 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP=tcp://100.70.168.89:19999
2017-09-13 15:17:44.740440 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_PROTO=tcp
2017-09-13 15:17:44.740457 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_PORT=19999
2017-09-13 15:17:44.740469 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT=tcp://100.70.168.89:19999
2017-09-13 15:17:44.740481 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP=tcp://100.68.26.102:2379
2017-09-13 15:17:44.740493 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP=tcp://100.71.29.46:2379
2017-09-13 15:17:44.740506 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_ADDR=100.70.168.89
2017-09-13 15:17:44.740518 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_HOST=100.68.26.102
2017-09-13 15:17:44.740552 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_PORT_BACKUP_SERVICE=19999
2017-09-13 15:17:44.740564 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_PORT=19999
2017-09-13 15:17:44.740576 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_HOST=100.71.29.46
2017-09-13 15:17:44.740588 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_PORT_CLIENT=2379
2017-09-13 15:17:44.740600 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_PROTO=tcp
2017-09-13 15:17:44.740614 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_ADDR=100.68.26.102
2017-09-13 15:17:44.740626 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_PORT_CLIENT=2379
2017-09-13 15:17:44.740637 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT=tcp://100.68.26.102:2379
2017-09-13 15:17:44.740649 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_PORT=2379
2017-09-13 15:17:44.740660 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_HOST=100.70.168.89
2017-09-13 15:17:44.740672 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_ADDR=100.71.29.46
2017-09-13 15:17:44.740699 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_PORT=2379
2017-09-13 15:17:44.740731 I | etcdmain: etcd Version: 3.2.7
2017-09-13 15:17:44.740743 I | etcdmain: Git SHA: bb66589
2017-09-13 15:17:44.740754 I | etcdmain: Go Version: go1.8.3
2017-09-13 15:17:44.740765 I | etcdmain: Go OS/Arch: linux/amd64
2017-09-13 15:17:44.740776 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2017-09-13 15:17:44.740823 N | etcdmain: the server is already initialized as member before, starting as etcd member...
2017-09-13 15:17:44.740873 I | embed: peerTLS: cert = /etc/etcdtls/member/peer-tls/peer.crt, key = /etc/etcdtls/member/peer-tls/peer.key, ca = , trusted-ca = /etc/etcdtls/member/peer-tls/peer-ca.crt, client-cert-auth = true
2017-09-13 15:17:44.741634 I | embed: listening for peers on https://0.0.0.0:2380
2017-09-13 15:17:44.741713 I | embed: listening for client requests on 0.0.0.0:2379
2017-09-13 15:17:44.743173 I | etcdserver: recovered store from snapshot at index 1
2017-09-13 15:17:44.746319 I | etcdserver: name = etcd-dev-cluster-0012
2017-09-13 15:17:44.746330 I | etcdserver: data dir = /var/etcd/data
2017-09-13 15:17:44.746335 I | etcdserver: member dir = /var/etcd/data/member
2017-09-13 15:17:44.746339 I | etcdserver: heartbeat = 100ms
2017-09-13 15:17:44.746342 I | etcdserver: election = 1000ms
2017-09-13 15:17:44.746345 I | etcdserver: snapshot count = 100000
2017-09-13 15:17:44.746353 I | etcdserver: advertise client URLs = https://etcd-dev-cluster-0012.etcd-dev-cluster.etcd-operator.svc:2379
2017-09-13 15:17:44.747248 I | etcdserver: restarting member 4b2bc20494bab7a9 in cluster e99b721a3b645472 at commit index 1
2017-09-13 15:17:44.747325 I | raft: 4b2bc20494bab7a9 became follower at term 1
2017-09-13 15:17:44.747357 I | raft: newRaft 4b2bc20494bab7a9 [peers: [4b2bc20494bab7a9], term: 1, commit: 1, applied: 1, lastindex: 1, lastterm: 1]
2017-09-13 15:17:44.747518 I | etcdserver/membership: added member 4b2bc20494bab7a9 [https://etcd-dev-cluster-0012.etcd-dev-cluster.etcd-operator.svc:2380] to cluster e99b721a3b645472 from store
2017-09-13 15:17:44.765694 W | auth: simple token is not cryptographically signed
2017-09-13 15:17:44.770025 I | etcdserver: starting server... [version: 3.2.7, cluster version: to_be_decided]
2017-09-13 15:17:44.770084 I | embed: ClientTLS: cert = /etc/etcdtls/member/server-tls/server.crt, key = /etc/etcdtls/member/server-tls/server.key, ca = , trusted-ca = /etc/etcdtls/member/server-tls/server-ca.crt, client-cert-auth = true
2017-09-13 15:17:44.818324 I | etcdmain: rejected connection from "10.36.249.250:42572" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:44.870662 I | etcdmain: rejected connection from "10.36.249.250:42570" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:44.980004 I | etcdmain: rejected connection from "10.36.249.250:42576" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.035089 I | etcdmain: rejected connection from "10.36.249.250:42586" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.123654 I | etcdmain: rejected connection from "10.36.249.250:42592" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.178069 I | etcdmain: rejected connection from "10.36.249.250:42594" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.272815 I | etcdmain: rejected connection from "10.36.249.250:42596" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.326716 I | etcdmain: rejected connection from "10.36.249.250:42598" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.418885 I | etcdmain: rejected connection from "10.36.249.250:42600" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.511501 I | etcdmain: rejected connection from "10.36.249.250:42602" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.547771 I | raft: 4b2bc20494bab7a9 is starting a new election at term 1
2017-09-13 15:17:45.547853 I | raft: 4b2bc20494bab7a9 became candidate at term 2
2017-09-13 15:17:45.547887 I | raft: 4b2bc20494bab7a9 received MsgVoteResp from 4b2bc20494bab7a9 at term 2
2017-09-13 15:17:45.547945 I | raft: 4b2bc20494bab7a9 became leader at term 2
2017-09-13 15:17:45.547968 I | raft: raft.node: 4b2bc20494bab7a9 elected leader 4b2bc20494bab7a9 at term 2
2017-09-13 15:17:45.548612 I | etcdserver: setting up the initial cluster version to 3.2
2017-09-13 15:17:45.548706 I | etcdserver: published {Name:etcd-dev-cluster-0012 ClientURLs:[https://etcd-dev-cluster-0012.etcd-dev-cluster.etcd-operator.svc:2379]} to cluster e99b721a3b645472
2017-09-13 15:17:45.548823 I | embed: ready to serve client requests
2017-09-13 15:17:45.549408 I | embed: serving client requests on [::]:2379
2017-09-13 15:17:45.554466 N | etcdserver/membership: set the initial cluster version to 3.2
2017-09-13 15:17:45.554620 I | etcdserver/api: enabled capabilities for version 3.2
2017-09-13 15:17:45.571737 I | etcdserver/api/v3rpc: Failed to dial 0.0.0.0:2379: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-09-13 15:17:45.619541 I | etcdmain: rejected connection from "10.36.249.250:42604" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.704822 I | etcdmain: rejected connection from "10.36.249.250:42606" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.764438 I | etcdmain: rejected connection from "10.36.249.250:42608" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.796317 I | etcdmain: rejected connection from "10.36.249.250:42610" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.861066 I | etcdmain: rejected connection from "10.36.249.250:42612" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:45.961337 I | etcdmain: rejected connection from "10.36.249.250:42614" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.066964 I | etcdmain: rejected connection from "10.36.249.250:42616" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.127851 I | etcdmain: rejected connection from "10.36.249.250:42618" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.240670 I | etcdmain: rejected connection from "10.36.249.250:42620" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.307218 I | etcdmain: rejected connection from "10.36.249.250:42622" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.392842 I | etcdmain: rejected connection from "10.36.249.250:42626" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.468529 I | etcdmain: rejected connection from "10.36.249.250:42628" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.534494 I | etcdmain: rejected connection from "10.36.249.250:42630" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.593796 I | etcdmain: rejected connection from "10.36.249.250:42632" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.673673 I | etcdmain: rejected connection from "10.36.249.250:42634" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.723870 I | etcdmain: rejected connection from "10.36.249.250:42636" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])
2017-09-13 15:17:46.852846 I | etcdmain: rejected connection from "10.36.249.250:42638" (EOF)
2017-09-13 15:17:54.960160 I | etcdserver/membership: added member 6fd30011e031a163 [https://etcd-dev-cluster-0013.etcd-dev-cluster.etcd-operator.svc:2380] to cluster e99b721a3b645472
2017-09-13 15:17:54.960260 I | rafthttp: starting peer 6fd30011e031a163...
2017-09-13 15:17:54.960309 I | rafthttp: started HTTP pipelining with peer 6fd30011e031a163
2017-09-13 15:17:54.963752 I | rafthttp: started peer 6fd30011e031a163
2017-09-13 15:17:54.967406 I | rafthttp: added peer 6fd30011e031a163
2017-09-13 15:17:54.967823 I | rafthttp: started streaming with peer 6fd30011e031a163 (writer)
2017-09-13 15:17:54.967869 I | rafthttp: started streaming with peer 6fd30011e031a163 (writer)
2017-09-13 15:17:54.967891 I | rafthttp: started streaming with peer 6fd30011e031a163 (stream MsgApp v2 reader)
2017-09-13 15:17:54.968415 I | rafthttp: started streaming with peer 6fd30011e031a163 (stream Message reader)
2017-09-13 15:17:56.547762 W | raft: 4b2bc20494bab7a9 stepped down to follower since quorum is not active
2017-09-13 15:17:56.547793 I | raft: 4b2bc20494bab7a9 became follower at term 2
2017-09-13 15:17:56.547801 I | raft: raft.node: 4b2bc20494bab7a9 lost leader 4b2bc20494bab7a9 at term 2
2017-09-13 15:17:56.781709 I | etcdmain: rejected connection from "10.36.249.250:43034" (tls: "10.36.249.250" does not match any of DNSNames ["*.etcd-dev-cluster.etcd-operator.svc"])

@heyitsanthony
Copy link
Contributor

heyitsanthony commented Sep 13, 2017

@raoofm it looks like rdns isn't set up right, what does dig -x 10.36.249.250 give?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 13, 2017

/ # dig +x 10.36.249.250 
/bin/sh: dig: not found
/ # nslookup 10.36.249.250
nslookup: can't resolve '(null)': Name does not resolve

Name:      10.36.249.250
Address 1: 10.36.249.250 etcd-dev-cluster-0000.etcd-dev-cluster.etcd-operator.svc.cluster.local

@heyitsanthony
Copy link
Contributor

@raoofm *.etcd-dev-cluster.etcd-operator.svc" doesn't match *.etcd-dev-cluster.etcd-operator.svc.cluster.local

@raoofm
Copy link
Contributor Author

raoofm commented Sep 13, 2017

@heyitsanthony agree, thnx

probably etcd-operator needs to correct their doc for cluster-tls
I'll add this to the cert host using cfssl and regenerate

@raoofm
Copy link
Contributor Author

raoofm commented Sep 13, 2017

https://github.com/coreos/etcd-operator/blob/master/doc/user/cluster_tls.md

peer.crt: peer communication cert. The certificate should allow wildcard domain *.${clusterName}.${namespace}.svc
server.crt: etcd server's client communication cert. The certificate should allow wildcard domain *.${clusterName}.${namespace}.svc, ${clusterName}-client.${namespace}.svc, and localhost

@heyitsanthony
Copy link
Contributor

Closing since this is not an etcd bug.

raoofm added a commit to raoofm/etcd-operator that referenced this issue Sep 14, 2017
doc: cluster_tls, wildcard domain correction

based on the discussion etcd-io/etcd#8534 (comment)
@hongchaodeng
Copy link
Contributor

hongchaodeng commented Sep 14, 2017

@raoofm

nslookup 10.36.249.250

Reverse DNS lookup like this will give the full domain name, even if before expanded it ends with ".svc" only. That's not an evidence for the URL the peer is talking to.

Can you print out the etcd container's command line to show the etcd peers' URLs ?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 14, 2017

@hongchaodeng @heyitsanthony @xiang90
Tried again with operator 0.5.2 and etcd 3.2.7

my current setup, using cfssl
ca-config.json

{
    "signing": {
        "default": {
            "expiry": "87600h"
        },
        "profiles": {
            "server": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "87600h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}

client.json

{
    "CN": "client",
    "hosts": [],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": ".....",
            "ST": ".....",
            "O": " .....",
            "OU": ".....",
            "OU": "....."
        }
    ]
}

peer.json

{
    "CN": "peer",
    "hosts": [
        "*.etcd-dev-cluster.etcd.svc"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": ".....",
            "ST": ".....",
            "O": ".....",
            "OU": ".....",
            "OU": "....."
        }
    ]
}

server.json

{
    "CN": "server",
    "hosts": [
        "*.etcd-dev-cluster.etcd.svc",
        "etcd-dev-cluster-client.etcd.svc",
        "localhost"
    ],
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": ".....",
            "ST": ".....",
            "O": ".....",
            "OU": ".....",
            "OU": "....."
        }
    ]
}

etcd-operator logs

time="2017-09-14T21:19:01Z" level=info msg="etcd-operator Version: 0.5.2" 
time="2017-09-14T21:19:01Z" level=info msg="Git SHA: 0fa2f7d" 
time="2017-09-14T21:19:01Z" level=info msg="Go Version: go1.8.1" 
time="2017-09-14T21:19:01Z" level=info msg="Go OS/Arch: linux/amd64" 
time="2017-09-14T21:19:19Z" level=info msg="Event(v1.ObjectReference{Kind:"Endpoints", Namespace:"etcd", Name:"etcd-operator", UID:"fac80426-9986-11e7-bb7e-124eda747fa6", APIVersion:"v1", ResourceVersion:"34513730", FieldPath:""}): type: 'Normal' reason: 'LeaderElection' etcd-operator-3926925624-1w3jk became leader" 
2017-09-14 21:20:40.017154 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
time="2017-09-14T21:20:40Z" level=info msg="creating cluster with Spec:" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="{" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "size": 3," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "baseImage": "quay.io/coreos/etcd"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "version": "3.2.7"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "pod": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "resources": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "limits": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "cpu": "300m"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "memory": "200Mi"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "requests": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "cpu": "200m"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "memory": "100Mi"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "backup": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "pod": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "resources": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "limits": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                    "cpu": "50m"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                    "memory": "50Mi"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "requests": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                    "cpu": "20m"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                    "memory": "20Mi"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "storageType": "PersistentVolume"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "pv": {" cluster-name=etcd-dev-cluster pkg=cluster
time="2017-09-14T21:20:40Z" level=info msg="            "volumeSizeInMB": 5120," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "storageClass": """ cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "backupIntervalInSecond": 30," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "maxBackups": 5," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "autoDelete": true" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    "TLS": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        "static": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "member": {" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "peerSecret": "etcd-peer-tls"," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="                "serverSecret": "etcd-server-tls"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            }," cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="            "operatorSecret": "etcd-client-tls"" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="        }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="    }" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=info msg="}" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:40Z" level=warning msg="failed to check service etcd-dev-cluster-client-service-lb: no owner" pkg=gc 
time="2017-09-14T21:20:44Z" level=info msg="backup sidecar deployment and service created" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:44Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0000)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:44Z" level=info msg="start running..." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:52Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:52Z" level=info msg="running members: etcd-dev-cluster-0000" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:52Z" level=info msg="cluster membership: etcd-dev-cluster-0000" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:52Z" level=info msg="added member (etcd-dev-cluster-0001)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:52Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:20:57Z" level=warning msg="failed to update local backup service status: get service status (etcd-dev-cluster-backup-sidecar:19999) failed: Get http://etcd-dev-cluster-backup-sidecar:19999/v1/status: context deadline exceeded" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:02Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0001.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0001.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="running members: etcd-dev-cluster-0000" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="cluster membership: etcd-dev-cluster-0000,etcd-dev-cluster-0001" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="Disaster recovery" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="pods are still running (etcd-dev-cluster-0000). Will try to make a latest backup from one of them." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="made a latest backup" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0002)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:10Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:15Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0002.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0002.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:24Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:24Z" level=info msg="running members: etcd-dev-cluster-0002" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:24Z" level=info msg="cluster membership: etcd-dev-cluster-0002" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:24Z" level=info msg="added member (etcd-dev-cluster-0003)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:24Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:29Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="running members: etcd-dev-cluster-0002" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="cluster membership: etcd-dev-cluster-0002,etcd-dev-cluster-0003" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="Disaster recovery" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="pods are still running (etcd-dev-cluster-0002). Will try to make a latest backup from one of them." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="made a latest backup" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0004)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:37Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:42Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0004.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0004.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:50Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-dev-cluster-0004])" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:21:58Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-dev-cluster-0004])" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:06Z" level=info msg="skip reconciliation: running ([]), pending ([etcd-dev-cluster-0004])" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:14Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:14Z" level=info msg="running members: etcd-dev-cluster-0004" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:14Z" level=info msg="cluster membership: etcd-dev-cluster-0004" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:14Z" level=info msg="added member (etcd-dev-cluster-0005)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:14Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:19Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0005.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0005.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="running members: etcd-dev-cluster-0004" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="cluster membership: etcd-dev-cluster-0004,etcd-dev-cluster-0005" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="Disaster recovery" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="pods are still running (etcd-dev-cluster-0004). Will try to make a latest backup from one of them." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="made a latest backup" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0006)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:28Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:33Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0006.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0006.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:41Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:41Z" level=info msg="running members: etcd-dev-cluster-0006" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:41Z" level=info msg="cluster membership: etcd-dev-cluster-0006" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:41Z" level=info msg="added member (etcd-dev-cluster-0007)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:41Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:46Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0007.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0007.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="running members: etcd-dev-cluster-0006" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="cluster membership: etcd-dev-cluster-0006,etcd-dev-cluster-0007" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="Disaster recovery" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="pods are still running (etcd-dev-cluster-0006). Will try to make a latest backup from one of them." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="made a latest backup" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0008)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:22:54Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:07Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:07Z" level=info msg="running members: etcd-dev-cluster-0008" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:07Z" level=info msg="cluster membership: etcd-dev-cluster-0008" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:07Z" level=info msg="added member (etcd-dev-cluster-0009)" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:07Z" level=info msg="Finish reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:13Z" level=warning msg="health check of etcd member (https://etcd-dev-cluster-0009.etcd-dev-cluster.etcd.svc:2379) failed: failed to create etcd client for https://etcd-dev-cluster-0009.etcd-dev-cluster.etcd.svc:2379: grpc: timed out when dialing" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="Start reconciling" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="running members: etcd-dev-cluster-0008" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="cluster membership: etcd-dev-cluster-0008,etcd-dev-cluster-0009" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="Disaster recovery" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="pods are still running (etcd-dev-cluster-0008). Will try to make a latest backup from one of them." cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="made a latest backup" cluster-name=etcd-dev-cluster pkg=cluster 
time="2017-09-14T21:23:21Z" level=info msg="cluster created with seed member (etcd-dev-cluster-0010)" cluster-name=etcd-dev-cluster pkg=cluster 

etcd -0000 logs

....
.............
...............
2017-09-14 21:20:46.316224 I | etcdserver: advertise client URLs = https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2379
2017-09-14 21:20:46.316266 I | etcdserver: initial advertise peer URLs = https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2380
2017-09-14 21:20:46.316291 I | etcdserver: initial cluster = etcd-dev-cluster-0000=https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2380
2017-09-14 21:20:46.322341 I | etcdserver: starting member 1c0ed140248ca8b6 in cluster 843c9aff6d4417d4
2017-09-14 21:20:46.322409 I | raft: 1c0ed140248ca8b6 became follower at term 0
2017-09-14 21:20:46.322435 I | raft: newRaft 1c0ed140248ca8b6 [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2017-09-14 21:20:46.322455 I | raft: 1c0ed140248ca8b6 became follower at term 1
2017-09-14 21:20:46.337632 W | auth: simple token is not cryptographically signed
2017-09-14 21:20:46.344569 I | etcdserver: starting server... [version: 3.2.7, cluster version: to_be_decided]
2017-09-14 21:20:46.344679 I | embed: ClientTLS: cert = /etc/etcdtls/member/server-tls/server.crt, key = /etc/etcdtls/member/server-tls/server.key, ca = , trusted-ca = /etc/etcdtls/member/server-tls/server-ca.crt, client-cert-auth = true
2017-09-14 21:20:46.346443 I | etcdserver/membership: added member 1c0ed140248ca8b6 [https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2380] to cluster 843c9aff6d4417d4
2017-09-14 21:20:46.722783 I | raft: 1c0ed140248ca8b6 is starting a new election at term 1
2017-09-14 21:20:46.722963 I | raft: 1c0ed140248ca8b6 became candidate at term 2
2017-09-14 21:20:46.723008 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 2
2017-09-14 21:20:46.723042 I | raft: 1c0ed140248ca8b6 became leader at term 2
2017-09-14 21:20:46.723127 I | raft: raft.node: 1c0ed140248ca8b6 elected leader 1c0ed140248ca8b6 at term 2
2017-09-14 21:20:46.723746 I | etcdserver: setting up the initial cluster version to 3.2
2017-09-14 21:20:46.723857 I | etcdserver: published {Name:etcd-dev-cluster-0000 ClientURLs:[https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2379]} to cluster 843c9aff6d4417d4
2017-09-14 21:20:46.723922 I | embed: ready to serve client requests
2017-09-14 21:20:46.724209 I | embed: serving client requests on [::]:2379
2017-09-14 21:20:46.729567 N | etcdserver/membership: set the initial cluster version to 3.2
2017-09-14 21:20:46.729647 I | etcdserver/api: enabled capabilities for version 3.2
2017-09-14 21:20:46.743419 I | etcdserver/api/v3rpc: Failed to dial 0.0.0.0:2379: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-09-14 21:20:52.363352 I | etcdserver/membership: added member 1e227609e8b9cb1c [https://etcd-dev-cluster-0001.etcd-dev-cluster.etcd.svc:2380] to cluster 843c9aff6d4417d4
2017-09-14 21:20:52.363447 I | rafthttp: starting peer 1e227609e8b9cb1c...
2017-09-14 21:20:52.363486 I | rafthttp: started HTTP pipelining with peer 1e227609e8b9cb1c
2017-09-14 21:20:52.367078 I | rafthttp: started peer 1e227609e8b9cb1c
2017-09-14 21:20:52.416524 I | rafthttp: added peer 1e227609e8b9cb1c
2017-09-14 21:20:52.416925 I | rafthttp: started streaming with peer 1e227609e8b9cb1c (writer)
2017-09-14 21:20:52.417102 I | rafthttp: started streaming with peer 1e227609e8b9cb1c (writer)
2017-09-14 21:20:52.417134 I | rafthttp: started streaming with peer 1e227609e8b9cb1c (stream MsgApp v2 reader)
2017-09-14 21:20:52.417734 I | rafthttp: started streaming with peer 1e227609e8b9cb1c (stream Message reader)
2017-09-14 21:20:53.579634 I | etcdmain: rejected connection from "10.38.203.115:45948" (tls: "10.38.203.115" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:20:53.722852 W | raft: 1c0ed140248ca8b6 stepped down to follower since quorum is not active
2017-09-14 21:20:53.722961 I | raft: 1c0ed140248ca8b6 became follower at term 2
2017-09-14 21:20:53.722987 I | raft: raft.node: 1c0ed140248ca8b6 lost leader 1c0ed140248ca8b6 at term 2
2017-09-14 21:20:55.022758 I | raft: 1c0ed140248ca8b6 is starting a new election at term 2
2017-09-14 21:20:55.022865 I | raft: 1c0ed140248ca8b6 became candidate at term 3
2017-09-14 21:20:55.022916 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 3
2017-09-14 21:20:55.022947 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 3
2017-09-14 21:20:56.722789 I | raft: 1c0ed140248ca8b6 is starting a new election at term 3
2017-09-14 21:20:56.722924 I | raft: 1c0ed140248ca8b6 became candidate at term 4
2017-09-14 21:20:56.722954 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 4
2017-09-14 21:20:56.722981 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 4
2017-09-14 21:20:57.417787 W | rafthttp: health check for peer 1e227609e8b9cb1c could not connect: <nil>
2017-09-14 21:20:58.022766 I | raft: 1c0ed140248ca8b6 is starting a new election at term 4
2017-09-14 21:20:58.022853 I | raft: 1c0ed140248ca8b6 became candidate at term 5
2017-09-14 21:20:58.022880 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 5
2017-09-14 21:20:58.022934 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 5
2017-09-14 21:20:59.622759 I | raft: 1c0ed140248ca8b6 is starting a new election at term 5
2017-09-14 21:20:59.622828 I | raft: 1c0ed140248ca8b6 became candidate at term 6
2017-09-14 21:20:59.622850 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 6
2017-09-14 21:20:59.622870 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 6
2017-09-14 21:21:00.722766 I | raft: 1c0ed140248ca8b6 is starting a new election at term 6
2017-09-14 21:21:00.722872 I | raft: 1c0ed140248ca8b6 became candidate at term 7
2017-09-14 21:21:00.722924 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 7
2017-09-14 21:21:00.722953 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 7
2017-09-14 21:21:02.122757 I | raft: 1c0ed140248ca8b6 is starting a new election at term 7
2017-09-14 21:21:02.122864 I | raft: 1c0ed140248ca8b6 became candidate at term 8
2017-09-14 21:21:02.122914 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 8
2017-09-14 21:21:02.122944 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 8
2017-09-14 21:21:02.417986 W | rafthttp: health check for peer 1e227609e8b9cb1c could not connect: dial tcp 10.38.203.115:2380: i/o timeout
2017-09-14 21:21:03.222742 I | raft: 1c0ed140248ca8b6 is starting a new election at term 8
2017-09-14 21:21:03.222845 I | raft: 1c0ed140248ca8b6 became candidate at term 9
2017-09-14 21:21:03.222895 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 9
2017-09-14 21:21:03.222925 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 9
2017-09-14 21:21:04.622745 I | raft: 1c0ed140248ca8b6 is starting a new election at term 9
2017-09-14 21:21:04.622850 I | raft: 1c0ed140248ca8b6 became candidate at term 10
2017-09-14 21:21:04.622901 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 10
2017-09-14 21:21:04.622931 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 10
2017-09-14 21:21:06.228401 I | raft: 1c0ed140248ca8b6 is starting a new election at term 10
2017-09-14 21:21:06.228476 I | raft: 1c0ed140248ca8b6 became candidate at term 11
2017-09-14 21:21:06.228530 I | raft: 1c0ed140248ca8b6 received MsgVoteResp from 1c0ed140248ca8b6 at term 11
2017-09-14 21:21:06.228558 I | raft: 1c0ed140248ca8b6 [logterm: 2, index: 5] sent MsgVote request to 1e227609e8b9cb1c at term 11

etcd-0001 logs


2017-09-14 21:20:53.465235 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_PROTO=tcp
2017-09-14 21:20:53.465355 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_PORT=2379
2017-09-14 21:20:53.465379 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT=tcp://100.67.127.51:2379
2017-09-14 21:20:53.465397 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP=tcp://100.68.235.175:19999
2017-09-14 21:20:53.465414 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_PORT=2379
2017-09-14 21:20:53.465431 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_PROTO=tcp
2017-09-14 21:20:53.465471 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT=tcp://100.68.235.175:19999
2017-09-14 21:20:53.465491 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_PORT=19999
2017-09-14 21:20:53.465513 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP=tcp://100.70.9.86:2379
2017-09-14 21:20:53.465530 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP=tcp://100.67.127.51:2379
2017-09-14 21:20:53.465548 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_ADDR=100.68.235.175
2017-09-14 21:20:53.465564 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_HOST=100.70.9.86
2017-09-14 21:20:53.465581 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_PORT_BACKUP_SERVICE=19999
2017-09-14 21:20:53.465621 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_PORT=19999
2017-09-14 21:20:53.465639 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_PORT_19999_TCP_PROTO=tcp
2017-09-14 21:20:53.465658 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_HOST=100.67.127.51
2017-09-14 21:20:53.465674 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_PORT_CLIENT=2379
2017-09-14 21:20:53.465690 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_ADDR=100.70.9.86
2017-09-14 21:20:53.465706 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_SERVICE_PORT_CLIENT=2379
2017-09-14 21:20:53.465723 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_PORT=2379
2017-09-14 21:20:53.465739 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT=tcp://100.70.9.86:2379
2017-09-14 21:20:53.465775 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_PORT_2379_TCP_PORT=2379
2017-09-14 21:20:53.465795 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_CLIENT_SERVICE_LB_PORT_2379_TCP_ADDR=100.67.127.51
2017-09-14 21:20:53.465811 W | pkg/flags: unrecognized environment variable ETCD_DEV_CLUSTER_BACKUP_SIDECAR_SERVICE_HOST=100.68.235.175
2017-09-14 21:20:53.465844 I | etcdmain: etcd Version: 3.2.7
2017-09-14 21:20:53.465861 I | etcdmain: Git SHA: bb66589
2017-09-14 21:20:53.465877 I | etcdmain: Go Version: go1.8.3
2017-09-14 21:20:53.465892 I | etcdmain: Go OS/Arch: linux/amd64
2017-09-14 21:20:53.465908 I | etcdmain: setting maximum number of CPUs to 1, total number of available CPUs is 1
2017-09-14 21:20:53.465971 I | embed: peerTLS: cert = /etc/etcdtls/member/peer-tls/peer.crt, key = /etc/etcdtls/member/peer-tls/peer.key, ca = , trusted-ca = /etc/etcdtls/member/peer-tls/peer-ca.crt, client-cert-auth = true
2017-09-14 21:20:53.497754 I | embed: listening for peers on https://0.0.0.0:2380
2017-09-14 21:20:53.497861 I | embed: listening for client requests on 0.0.0.0:2379
2017-09-14 21:20:53.580736 W | etcdserver: could not get cluster response from https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2380: Get https://etcd-dev-cluster-0000.etcd-dev-cluster.etcd.svc:2380/members: EOF
2017-09-14 21:20:53.590625 C | etcdmain: cannot fetch cluster info from peer urls: could not retrieve cluster information from the given urls

etcd-0002 logs

2017-09-14 21:21:14.358635 I | etcdserver: data dir = /var/etcd/data
2017-09-14 21:21:14.358640 I | etcdserver: member dir = /var/etcd/data/member
2017-09-14 21:21:14.358644 I | etcdserver: heartbeat = 100ms
2017-09-14 21:21:14.358648 I | etcdserver: election = 1000ms
2017-09-14 21:21:14.358651 I | etcdserver: snapshot count = 100000
2017-09-14 21:21:14.358662 I | etcdserver: advertise client URLs = https://etcd-dev-cluster-0002.etcd-dev-cluster.etcd.svc:2379
2017-09-14 21:21:14.397586 I | etcdserver: restarting member b6200abd26c48f5a in cluster e9226534935de69 at commit index 1
2017-09-14 21:21:14.397650 I | raft: b6200abd26c48f5a became follower at term 1
2017-09-14 21:21:14.397671 I | raft: newRaft b6200abd26c48f5a [peers: [b6200abd26c48f5a], term: 1, commit: 1, applied: 1, lastindex: 1, lastterm: 1]
2017-09-14 21:21:14.397849 I | etcdserver/membership: added member b6200abd26c48f5a [https://etcd-dev-cluster-0002.etcd-dev-cluster.etcd.svc:2380] to cluster e9226534935de69 from store
2017-09-14 21:21:14.402520 W | auth: simple token is not cryptographically signed
2017-09-14 21:21:14.407847 I | etcdserver: starting server... [version: 3.2.7, cluster version: to_be_decided]
2017-09-14 21:21:14.407908 I | embed: ClientTLS: cert = /etc/etcdtls/member/server-tls/server.crt, key = /etc/etcdtls/member/server-tls/server.key, ca = , trusted-ca = /etc/etcdtls/member/server-tls/server-ca.crt, client-cert-auth = true
2017-09-14 21:21:14.499424 I | etcdmain: rejected connection from "10.38.202.214:47770" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.506826 I | etcdmain: rejected connection from "10.38.202.214:47768" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.539190 I | etcdmain: rejected connection from "10.38.202.214:47772" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.634228 I | etcdmain: rejected connection from "10.38.202.214:47776" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.647186 I | etcdmain: rejected connection from "10.38.202.214:47774" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.698119 I | raft: b6200abd26c48f5a is starting a new election at term 1
2017-09-14 21:21:14.698250 I | raft: b6200abd26c48f5a became candidate at term 2
2017-09-14 21:21:14.698297 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 2
2017-09-14 21:21:14.698330 I | raft: b6200abd26c48f5a became leader at term 2
2017-09-14 21:21:14.698375 I | raft: raft.node: b6200abd26c48f5a elected leader b6200abd26c48f5a at term 2
2017-09-14 21:21:14.699481 I | etcdserver: setting up the initial cluster version to 3.2
2017-09-14 21:21:14.699625 I | etcdserver: published {Name:etcd-dev-cluster-0002 ClientURLs:[https://etcd-dev-cluster-0002.etcd-dev-cluster.etcd.svc:2379]} to cluster e9226534935de69
2017-09-14 21:21:14.699689 I | embed: ready to serve client requests
2017-09-14 21:21:14.699954 I | embed: serving client requests on [::]:2379
2017-09-14 21:21:14.705152 N | etcdserver/membership: set the initial cluster version to 3.2
2017-09-14 21:21:14.705228 I | etcdserver/api: enabled capabilities for version 3.2
2017-09-14 21:21:14.711378 I | etcdserver/api/v3rpc: Failed to dial 0.0.0.0:2379: connection error: desc = "transport: remote error: tls: bad certificate"; please retry.
2017-09-14 21:21:14.800254 I | etcdmain: rejected connection from "10.38.202.214:47780" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.806614 I | etcdmain: rejected connection from "10.38.202.214:47778" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.937541 I | etcdmain: rejected connection from "10.38.202.214:47782" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:14.955626 I | etcdmain: rejected connection from "10.38.202.214:47784" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.070608 I | etcdmain: rejected connection from "10.38.202.214:47786" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.096663 I | etcdmain: rejected connection from "10.38.202.214:47790" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.194117 I | etcdmain: rejected connection from "10.38.202.214:47792" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.211671 I | etcdmain: rejected connection from "10.38.202.214:47794" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.309966 I | etcdmain: rejected connection from "10.38.202.214:47796" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.327361 I | etcdmain: rejected connection from "10.38.202.214:47798" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.426642 I | etcdmain: rejected connection from "10.38.202.214:47800" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.442365 I | etcdmain: rejected connection from "10.38.202.214:47802" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.548498 I | etcdmain: rejected connection from "10.38.202.214:47806" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.571123 I | etcdmain: rejected connection from "10.38.202.214:47808" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.670642 I | etcdmain: rejected connection from "10.38.202.214:47810" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.686261 I | etcdmain: rejected connection from "10.38.202.214:47812" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.797423 I | etcdmain: rejected connection from "10.38.202.214:47814" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:15.812265 I | etcdmain: rejected connection from "10.38.202.214:47816" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:24.041624 I | etcdserver/membership: added member a98ac3d227cfaf71 [https://etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc:2380] to cluster e9226534935de69
2017-09-14 21:21:24.041729 I | rafthttp: starting peer a98ac3d227cfaf71...
2017-09-14 21:21:24.041774 I | rafthttp: started HTTP pipelining with peer a98ac3d227cfaf71
2017-09-14 21:21:24.045469 I | rafthttp: started peer a98ac3d227cfaf71
2017-09-14 21:21:24.045536 I | rafthttp: added peer a98ac3d227cfaf71
2017-09-14 21:21:24.056364 I | rafthttp: started streaming with peer a98ac3d227cfaf71 (writer)
2017-09-14 21:21:24.056456 I | rafthttp: started streaming with peer a98ac3d227cfaf71 (writer)
2017-09-14 21:21:24.056530 I | rafthttp: started streaming with peer a98ac3d227cfaf71 (stream MsgApp v2 reader)
2017-09-14 21:21:24.057439 I | rafthttp: started streaming with peer a98ac3d227cfaf71 (stream Message reader)
2017-09-14 21:21:25.101385 I | etcdmain: rejected connection from "10.38.202.214:47898" (tls: "10.38.202.214" does not match any of DNSNames ["*.etcd-dev-cluster.etcd.svc"])
2017-09-14 21:21:25.698104 W | raft: b6200abd26c48f5a stepped down to follower since quorum is not active
2017-09-14 21:21:25.698210 I | raft: b6200abd26c48f5a became follower at term 2
2017-09-14 21:21:25.698236 I | raft: raft.node: b6200abd26c48f5a lost leader b6200abd26c48f5a at term 2
2017-09-14 21:21:27.601481 I | raft: b6200abd26c48f5a is starting a new election at term 2
2017-09-14 21:21:27.601512 I | raft: b6200abd26c48f5a became candidate at term 3
2017-09-14 21:21:27.601523 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 3
2017-09-14 21:21:27.601533 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 3
2017-09-14 21:21:28.998101 I | raft: b6200abd26c48f5a is starting a new election at term 3
2017-09-14 21:21:28.998209 I | raft: b6200abd26c48f5a became candidate at term 4
2017-09-14 21:21:28.998260 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 4
2017-09-14 21:21:28.998290 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 4
2017-09-14 21:21:29.057735 W | rafthttp: health check for peer a98ac3d227cfaf71 could not connect: dial tcp: lookup etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc on 100.64.0.10:53: no such host
2017-09-14 21:21:30.598119 I | raft: b6200abd26c48f5a is starting a new election at term 4
2017-09-14 21:21:30.598230 I | raft: b6200abd26c48f5a became candidate at term 5
2017-09-14 21:21:30.598358 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 5
2017-09-14 21:21:30.598385 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 5
2017-09-14 21:21:32.398112 I | raft: b6200abd26c48f5a is starting a new election at term 5
2017-09-14 21:21:32.398221 I | raft: b6200abd26c48f5a became candidate at term 6
2017-09-14 21:21:32.398271 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 6
2017-09-14 21:21:32.398302 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 6
2017-09-14 21:21:33.998116 I | raft: b6200abd26c48f5a is starting a new election at term 6
2017-09-14 21:21:33.998273 I | raft: b6200abd26c48f5a became candidate at term 7
2017-09-14 21:21:33.998303 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 7
2017-09-14 21:21:33.998334 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 7
2017-09-14 21:21:34.058008 W | rafthttp: health check for peer a98ac3d227cfaf71 could not connect: dial tcp: lookup etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc on 100.64.0.10:53: no such host
2017-09-14 21:21:35.198116 I | raft: b6200abd26c48f5a is starting a new election at term 7
2017-09-14 21:21:35.198209 I | raft: b6200abd26c48f5a became candidate at term 8
2017-09-14 21:21:35.198262 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 8
2017-09-14 21:21:35.198295 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 8
2017-09-14 21:21:36.198119 I | raft: b6200abd26c48f5a is starting a new election at term 8
2017-09-14 21:21:36.198207 I | raft: b6200abd26c48f5a became candidate at term 9
2017-09-14 21:21:36.198258 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 9
2017-09-14 21:21:36.198295 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 9
2017-09-14 21:21:37.598981 I | raft: b6200abd26c48f5a is starting a new election at term 9
2017-09-14 21:21:37.599014 I | raft: b6200abd26c48f5a became candidate at term 10
2017-09-14 21:21:37.599025 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 10
2017-09-14 21:21:37.599034 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 10
2017-09-14 21:21:39.058272 W | rafthttp: health check for peer a98ac3d227cfaf71 could not connect: dial tcp: lookup etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc on 100.64.0.10:53: no such host
2017-09-14 21:21:39.498113 I | raft: b6200abd26c48f5a is starting a new election at term 10
2017-09-14 21:21:39.498219 I | raft: b6200abd26c48f5a became candidate at term 11
2017-09-14 21:21:39.498328 I | raft: b6200abd26c48f5a received MsgVoteResp from b6200abd26c48f5a at term 11
2017-09-14 21:21:39.498356 I | raft: b6200abd26c48f5a [logterm: 2, index: 5] sent MsgVote request to a98ac3d227cfaf71 at term 11

bacup sidecar logs

2017-09-14 21:21:03.291482 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
time="2017-09-14T21:21:03Z" level=info msg="listening on 0.0.0.0:19999" 
time="2017-09-14T21:21:10Z" level=info msg="received a backup request" 
time="2017-09-14T21:21:10Z" level=info msg="getMaxRev: member etcd-dev-cluster-0000 revision (1)" 
2017-09-14 21:21:10.722806 I | saving backup for cluster (etcd-dev-cluster)
time="2017-09-14T21:21:10Z" level=info msg="saved snapshot /var/etcd-backup/v1/etcd-dev-cluster/3.2.7_0000000000000001_etcd.backup (size: 24608) successfully" 
time="2017-09-14T21:21:37Z" level=info msg="received a backup request" 
time="2017-09-14T21:21:37Z" level=info msg="getMaxRev: member etcd-dev-cluster-0002 revision (1)" 
time="2017-09-14T21:21:37Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:22:07Z" level=warning msg="no running etcd pods found" 
time="2017-09-14T21:22:07Z" level=error msg="failed to save snapshot: no running etcd pods found" 
time="2017-09-14T21:22:28Z" level=info msg="received a backup request" 
time="2017-09-14T21:22:28Z" level=info msg="getMaxRev: member etcd-dev-cluster-0004 revision (1)" 
time="2017-09-14T21:22:28Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:22:54Z" level=info msg="received a backup request" 
time="2017-09-14T21:22:54Z" level=info msg="getMaxRev: member etcd-dev-cluster-0006 revision (1)" 
time="2017-09-14T21:22:54Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:23:21Z" level=info msg="received a backup request" 
time="2017-09-14T21:23:21Z" level=info msg="getMaxRev: member etcd-dev-cluster-0008 revision (1)" 
time="2017-09-14T21:23:21Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:23:47Z" level=info msg="received a backup request" 
time="2017-09-14T21:23:47Z" level=info msg="getMaxRev: member etcd-dev-cluster-0010 revision (1)" 
time="2017-09-14T21:23:47Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:24:13Z" level=info msg="received a backup request" 
time="2017-09-14T21:24:13Z" level=info msg="getMaxRev: member etcd-dev-cluster-0012 revision (1)" 
time="2017-09-14T21:24:13Z" level=info msg="skipped creating new backup: no change since last time" 
time="2017-09-14T21:24:40Z" level=info msg="received a backup request" 

@raoofm
Copy link
Contributor Author

raoofm commented Sep 14, 2017

@hongchaodeng @heyitsanthony @xiang90 please note that I removed .cluster.local suffix entry from the host and used the newly generated certs and tls seems to fail

@heyitsanthony
Copy link
Contributor

2017-09-14 21:21:29.057735 W | rafthttp: health check for peer a98ac3d227cfaf71 could not connect: dial tcp: lookup etcd-dev-cluster-0003.etcd-dev-cluster.etcd.svc on 100.64.0.10:53: no such host

DNS is still misconfigured.

@raoofm
Copy link
Contributor Author

raoofm commented Sep 14, 2017

DNS is still misconfigured.

this is being done by etcd-operator @hongchaodeng ?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 14, 2017

image

image

@raoofm
Copy link
Contributor Author

raoofm commented Sep 15, 2017

@heyitsanthony @hongchaodeng the exact same certs work with 3.1.10

@heyitsanthony
Copy link
Contributor

@raoofm sure, 3.2 added stricter SAN checking. This is an operator bug.

@raoofm
Copy link
Contributor Author

raoofm commented Sep 15, 2017

@hongchaodeng any timeline to test etcd-operator against etcd 3.2.*?

@raoofm
Copy link
Contributor Author

raoofm commented Sep 18, 2017

sure, 3.2 added stricter SAN checking. This is an operator bug.

@hongchaodeng do you agree with this, so that I i can stop the noise on etcd and follow up coreos/etcd-operator#1384 (comment)

@heyitsanthony
Copy link
Contributor

@raoofm, @hongchaodeng specifically requested the stricter SAN checking and I implemented it. Please follow up on etcd-operator

@hongchaodeng
Copy link
Contributor

hongchaodeng commented Oct 31, 2017

Sorry that I dismissed this issue. Now let me explain the background and the root cause better.

Originally there are some problems with unauthorized endpoints joining quorums. Which has led to #7687 that adds stricter SAN check.

So now the check has some problems with non-FQDN names. I will follow up an issue with a more detailed analysis in code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

5 participants