deploy failure: etcd failed to start when deploying a private cluster #2242
Comments
👋 Thanks for opening your first issue here! If you're reporting a 🐞 bug, please make sure you include steps to reproduce it.
Detailed steps to reproduce:
root@k8s-master-48189569-0:/var/log/azure# systemctl status etcd
Oct 31 05:09:11 k8s-master-48189569-0 etcd[6099]: rejected connection from "10.100.0.241:36602" (error "remote error: tls: bad certificate", ServerName "")
I have tried etcd versions other than 3.3.15 (the default), but that doesn't change anything.
I have solved this issue; it was my mistake. The machine that runs the aks-engine command did not have the correct system time, so my self-signed certificates were generated with a start time in the future, which caused etcd to fail to start. Even after I upgraded etcd to the newest version I still got the same error, which made me look at the certificate files in detail and find the root cause. My suggestion is to always sync the clock via NTP before generating the certificates.
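For anyone hitting the same symptom, here is a minimal sketch of the check that would have caught this early. It assumes you have already read the certificate's `notBefore` and `notAfter` values (for example from `openssl x509 -noout -dates`) and compares them against a clock reading. The function name and the timestamps below are hypothetical and only illustrate the "start time in the future" case; they are not taken from the failing cluster.

```python
import ssl
import time


def validity_status(not_before, not_after, now=None):
    """Classify a certificate validity window against a clock reading.

    not_before / not_after use the OpenSSL text format, e.g.
    "Jan 10 00:00:00 2030 GMT". now is Unix seconds (defaults to this
    machine's clock).
    """
    if now is None:
        now = time.time()
    start = ssl.cert_time_to_seconds(not_before)
    end = ssl.cert_time_to_seconds(not_after)
    if now < start:
        return "not yet valid"  # cert was issued by a machine whose clock ran ahead
    if now > end:
        return "expired"
    return "valid"


# Hypothetical values: the issuing machine's clock was ahead of the masters' clock.
NOT_BEFORE = "Jan 10 00:00:00 2030 GMT"
NOT_AFTER = "Jan 10 00:00:00 2031 GMT"
master_clock = ssl.cert_time_to_seconds("Jan 09 12:00:00 2030 GMT")

print(validity_status(NOT_BEFORE, NOT_AFTER, now=master_clock))
```

A "not yet valid" result on the masters points at clock skew on the machine that generated the certificates rather than at etcd itself.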
Describe the bug
I'm deploying a private cluster with aks-engine using the configuration below:
{
  "apiVersion": "vlabs",
  "properties": {
    "orchestratorProfile": {
      "orchestratorType": "Kubernetes",
      "orchestratorRelease": "1.13",
      "kubernetesConfig": {
        "privateCluster": {
          "enabled": true,
          "jumpboxProfile": {
            "name": "jumpbox",
            "vmSize": "Standard_D2_v3",
            "osDiskSizeGB": 30,
            "username": "azureuser",
            "publicKey": "ssh-rsa .........."
          }
        }
      }
    },
    "masterProfile": {
      "count": 3,
      "dnsPrefix": "myakse",
      "vmSize": "Standard_D2_v3",
      "distro": "aks-ubuntu-18.04",
      "etcdVersion": "3.3.15",
      "vnetSubnetId": "/subscriptions/.........../subnets/MasterSubnet",
      "firstConsecutiveStaticIP": "10.100.0.240"
    },
    "agentPoolProfiles": [
      {
        "name": "agents",
        "count": 3,
        "availabilityProfile": "VirtualMachineScaleSets",
        "vmSize": "Standard_D2_v3",
        "distro": "aks-ubuntu-18.04",
        "vnetSubnetId": "/subscriptions/........../subnets/AgentSubnet"
      }
    ],
    "linuxProfile": {
      "adminUsername": "azureuser",
      "ssh": {
        "publicKeys": [
          {
            "keyData": "ssh-rsa .........."
          }
        ]
      }
    },
    "servicePrincipalProfile": {
      "clientId": ".......................",
      "secret": "........................"
    }
  }
}
The master VMs are created, but when the etcd service starts it reports an error, which causes the deployment to fail.
etcd.service fails to start and reports a bad certificate error during the TLS handshake with its peers;
the ServerName is empty ("").
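To see which machine the rejected connection came from, here is a small sketch that lists the expected master addresses. It assumes masters receive consecutive static IPs starting at `firstConsecutiveStaticIP`; the helper name `master_ips` is made up for illustration.

```python
import ipaddress


def master_ips(first_ip, count):
    """Return `count` consecutive addresses starting at `first_ip`."""
    first = ipaddress.ip_address(first_ip)
    return [str(first + offset) for offset in range(count)]


# Values from masterProfile in the configuration of this issue.
print(master_ips("10.100.0.240", 3))
# ['10.100.0.240', '10.100.0.241', '10.100.0.242']
```

Under that assumption, the `10.100.0.241` in the etcd log is another master, so the failing handshake is peer-to-peer traffic between etcd members.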
Steps To Reproduce
Expected behavior
Installation succeeds and the etcd service on the master VMs starts.
AKS Engine version
0.4.15
Kubernetes version
1.13.2
Additional context
aks-engine uses a self-signed certificate that doesn't contain the ServerName, only IP addresses;
in that case etcd fails to validate the certificate and fails to start.
etcd release 3.3.17 adds a new DAEMON_ARGS option to skip verifying the SAN in the certificate; it looks like we need to include this change when deploying:
https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.3.md#v3317-2019-10-11
etcd-io/etcd#11196