[GCP:Cluster] I got an error with timeout when adding a nodegroup, BUT a nodegroup is provisioned. #1334

Open
sykim-etri opened this issue Sep 20, 2024 · 1 comment
Labels
bug Something isn't working CloudDriver

Comments

@sykim-etri
Member

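When adding a node group, the timeout is too short to retrieve the node group information, so an error is returned. However, a little later the node group turns out to have been created successfully.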

This happens almost every time in the me-central1, me-central2, and me-west1 regions. I confirmed that changing the timeout to 60 seconds reduces the failure rate.

The way the result of node group creation is checked appears to need improvement.

The CB-SP log at the time of the error is as follows.

[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterRest.go:356, github.com/cloud-barista/cb-spider/api-runtime/rest-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterManager.go:831, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - call AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 DriverInfoManager.go:117, github.com/cloud-barista/cb-spider/cloud-info-manager/driver-info-manager.GetCloudDriver() - call GetCloudDriver()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CloudDriverHandler_static.go:49, github.com/cloud-barista/cb-spider/cloud-control-manager.getCloudDriver() - CloudDriverHandler: called getStaticCloudDriver() - gcp-driver-v1.0.so
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CredentialInfoManager.go:221, github.com/cloud-barista/cb-spider/cloud-info-manager/credential-info-manager.GetCredentialDecrypt() - call GetCredential()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 RegionInfoManager.go:114, github.com/cloud-barista/cb-spider/cloud-info-manager/region-info-manager.GetRegion() - call GetRegion()
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:83, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:84, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getVMClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:85, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getVMClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:91, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:92, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getContainerClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:93, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getContainerClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:99, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:100, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getBillingCatalogClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:101, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getBillingCatalogClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:107, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:108, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - getCostEstimationClient
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:04 GCPDriver.go:109, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp.(*GCPDriver).ConnectCloud() - ################## getCostEstimationClient ##################
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 GCP_CloudConnection.go:140, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/connect.(*GCPCloudConnection).CreateClusterHandler() - GCP Cloud Driver: called CreateClusterHandler()!
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ConnectionConfigInfoManager.go:102, github.com/cloud-barista/cb-spider/cloud-info-manager/connection-config-info-manager.GetConnectionConfig() - call GetConnectionConfig()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:427, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - GCP Cloud Driver: called AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 ClusterHandler.go:338, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).GetCluster() - GCP Cloud Driver: called GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:04 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP GetCluster()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:883, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - metaSecurityGroupTags : [crmgpbkcpuqas6a93apg-crmgpbkcpuq064vnth20]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:998, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.mappingClusterInfo() - Cluster status : RUNNINGActive
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1216, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertCluster() - nodeGroupList [{{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1233, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - convertNodeGroup {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:07 ClusterHandler.go:1239, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - HasPrefix
[CB-SPIDER].[INFO]: 2024-09-20 15:12:10 ClusterHandler.go:1246, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - instanceList [gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12]
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1258, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - nodeList [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}]
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:12 ClusterHandler.go:1261, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - {{ng11-crmgqbscpuq064vnth3g ng11-crmgqbscpuq064vnth3g} {COS_CONTAINERD COS_CONTAINERD} e2-standard-2 pd-balanced 100 {NameId SystemId} true 1 1 1 Active [{gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-gh12}] [{InstanceGroup_0 gke-crmgqbscpuqas6a9-ng11-crmgqbscpuq-88c5ba5e-grp} {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}]}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:1237, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.convertNodeGroup() - keyValue {keypair crmgpr4cp-crmgpr4cpuq064vnth2g}
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP AddNodeGroup()
[CB-SPIDER].[INFO]: 2024-09-20 15:12:12 ClusterHandler.go:503, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - parent : projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:28 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:28 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:34 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:34 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:39 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:39 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:45 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:44 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:50 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:50 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[DEBUG]: 2024-09-20 15:12:57 CommonHandler.go:538, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - &{[]   <nil>  operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af [] CREATE_NODE_POOL <nil> https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/operations/operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af 2024-09-20T06:12:26.277822996Z PENDING  https://container.googleapis.com/v1/projects/291929448520/zones/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g me-central1-a {200 map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Fri, 20 Sep 2024 06:12:57 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} [] []}
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:560, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.WaitContainerOperationFail() - Forcing termination of Wait because the status of resource [operation-1726812746277-1ec65591-0ed6-4f43-ad78-03050b76f6af] has not failed within [30] seconds.
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 ClusterHandler.go:1190, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - GCP Cloud Driver: called getNodePools() projects/sykim-etri-prj/locations/me-central1-a/clusters/crmgqbscpuqas6a93aq-crmgqbscpuq064vnth30/nodePools/ng111-crmh4d4cpuq064vnth4g
[CB-SPIDER].[INFO]: 2024-09-20 15:13:02 CommonHandler.go:87, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.GetCallLogScheme() - Call GCP getNodePools()
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:1198, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.getNodePools() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterHandler.go:528, github.com/cloud-barista/cb-spider/cloud-control-manager/cloud-driver/drivers/gcp/resources.(*GCPClusterHandler).AddNodeGroup() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
[CB-SPIDER].[ERROR]: 2024-09-20 15:13:03 ClusterManager.go:943, github.com/cloud-barista/cb-spider/api-runtime/common-runtime.AddNodeGroup() - Failed to getNodePools :  googleapi: Error 404: Not found: node pool "ng111-crmh4d4cpuq064vnth4g" not found.
Details:
[
  {
    "@type": "type.googleapis.com/google.rpc.RequestInfo",
    "requestId": "0x39c340060cab03d4"
  }
]
, notFound
@sykim-etri sykim-etri added bug Something isn't working CloudDriver labels Sep 20, 2024
@powerkimhub
Member

@hippo-an (@sykim-etri)
