This repository has been archived by the owner on Jul 25, 2022. It is now read-only.

SSH code optimization #231

Merged
merged 1 commit into from
Jul 23, 2020

Conversation

tedteng
Contributor

@tedteng tedteng commented Jul 23, 2020

What this PR does / why we need it:
Increases the stability of the existing SSH command and optimizes the code to fix the unstable SSH connection issue.

Detect that the bastion's SSH port is open before SSH is executed.
Even when the SSH port is already open, the AWS bastion is sometimes still not ready to accept connections, so retry SSH 3 times to handle this case.
Add a prompt message for verbosity during SSH, so that the SSH error message shows whether the failure came from the bastion or from the attempt to connect to the node.


Which issue(s) this PR fixes:
Fixes #229

Special notes for your reviewer:
@danielfoehrKn

Release note:
Increase the stability of the existing SSH command and optimize the code.

@gardener-robot-ci-3 gardener-robot-ci-3 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 23, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) and removed reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) labels Jul 23, 2020
@gardener-robot-ci-1 gardener-robot-ci-1 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 23, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 23, 2020
@gardener-robot-ci-2 gardener-robot-ci-2 added the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 23, 2020
@gardener-robot-ci-1 gardener-robot-ci-1 removed the reviewed/ok-to-test Has approval for testing (check PR in detail before setting this label because PR is run on CI/CD) label Jul 23, 2020
Contributor

@DockToFuture DockToFuture left a comment

/lgtm

@gardener-robot gardener-robot added the reviewed/lgtm Has approval for merging label Jul 23, 2020
@DockToFuture DockToFuture reopened this Jul 23, 2020
@DockToFuture DockToFuture merged commit 17069f5 into gardener-attic:master Jul 23, 2020
@@ -72,13 +72,27 @@ func sshToAWSNode(nodeName, path, user, pathSSKeypair string, sshPublicKey []byt
fmt.Println("(3/4) Creating bastion host")
a.createBastionHostInstance()

fmt.Println("Waiting SSH 22 port open")

Maybe in the future it makes sense to have a logging framework with configurable verbosity levels.
For instance to also see verbose output of the SSH command (good for understanding what went wrong).
Just to keep in mind.

Contributor Author

@tedteng tedteng Jul 23, 2020

Sure, np. At the moment, after the fix, the SSH output looks like the example below when SSH fails. The brief message "Connection closed by 3.121.114.80 port 22" lets us identify whether the issue is between the local machine and the bastion instance or between the bastion instance and the node instance.

SSH gardener@3.121.114.80 => gardener@ip-10-250-23-169.eu-central-1.compute.internal
Connection closed by 3.121.114.80 port 22
kex_exchange_identification: Connection closed by remote host
exit status 255

After investigating this time, I have some new findings. I think the root cause lies in AWS EC2. In testing, the SSH service of an EC2 instance may not be fully running immediately after the instance status changes from pending/initializing to running, which can cause the first SSH command to hang until it times out and exits. So this time I detect that port 22 is open and retry the SSH command.

-Ted

@danielfoehrKn

Thanks @neo-liang-sap !

@tedteng tedteng deleted the new branch July 24, 2020 01:48
Labels
needs/ok-to-test Needs approval for testing (check PR in detail before setting this label because PR is run on CI/CD) reviewed/lgtm Has approval for merging
Development

Successfully merging this pull request may close these issues.

increase stability of the existing ssh command
7 participants