Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[action-ssh-to-runner] Run Tmate in foreground mode #89

Merged
merged 8 commits into from
Aug 24, 2023

Conversation

dvasilas
Copy link
Contributor

@dvasilas dvasilas commented Aug 7, 2023

Issue: OS-663

Context: What is the problem

Currently, if one disconnects from the SSH session there is no way to reconnect. The only option is to relaunch the GitHub Actions workflow from scratch and connect again.

Solution

Launch tmate in foreground mode.
From tmate.io:

When tmate is used for remote access only (as opposed to pair programming), it is useful to launch tmate in foreground mode with tmate -F. This does two things:
[...]
- It ensure the session never dies, by respawning a shell when it exits.

Changes

  • Remove usage of mxschmitt/action-tmate@v3 and manually install and configure tmate
  • Launch tmate in foreground mode
  • Add scripts for "continue" (unpause workflow) and disconnect (disconnect without the sessions being respawned).

How to test

See instruction here: https://scality.atlassian.net/browse/OS-663?focusedCommentId=355578

@dvasilas dvasilas requested a review from a team as a code owner August 7, 2023 13:47
action-ssh-to-runner/action.yaml Show resolved Hide resolved
action-ssh-to-runner/action.yaml Outdated Show resolved Hide resolved
action-ssh-to-runner/action.yaml Outdated Show resolved Hide resolved
action-ssh-to-runner/action.yaml Show resolved Hide resolved
Comment on lines 59 to 86
- name: "SSH to runner: Pause workflow and print connection instructions"
shell: bash
run: |
TMATE_SSH=$(${{ env.TMATE_BIN }} -f ${{ env.TMATE_CONF }} -S ${{ env.TMATE_SOCK }} display -p '#{tmate_ssh}')
MESSAGE=$(cat << EOF

--------------------------------------------------
You can connect to this runner using:
${TMATE_SSH}
To connect you need:
- To connected to the Scality VPN.
- The ssh key associated with your GitHub account.
The workflow has been paused.
To resume, run the "continue_workflow" command.
To disconnect, run the "disconnect" command.
--------------------------------------------------
EOF
)
CONTINUE_FILE=${GITHUB_WORKSPACE}/continue
while [[ -S ${{ env.TMATE_SOCK }} ]]; do
echo "::notice title=SSH::${MESSAGE}"
sleep 5

if [[ -e ${CONTINUE_FILE} ]]; then
echo -e "Continue file created, continuing to the next step."
exit 0
fi
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You had the right idea using notice but I would suggest using job summaries instead as it supports markdown flavored messages:

I think this will greatly improve the user experience, so I tried modifying the existing code to support it. Might need some tweak in case some characters are not properly interpreted.

Suggested change
- name: "SSH to runner: Pause workflow and print connection instructions"
shell: bash
run: |
TMATE_SSH=$(${{ env.TMATE_BIN }} -f ${{ env.TMATE_CONF }} -S ${{ env.TMATE_SOCK }} display -p '#{tmate_ssh}')
MESSAGE=$(cat << EOF
--------------------------------------------------
You can connect to this runner using:
${TMATE_SSH}
To connect you need:
- To connected to the Scality VPN.
- The ssh key associated with your GitHub account.
The workflow has been paused.
To resume, run the "continue_workflow" command.
To disconnect, run the "disconnect" command.
--------------------------------------------------
EOF
)
CONTINUE_FILE=${GITHUB_WORKSPACE}/continue
while [[ -S ${{ env.TMATE_SOCK }} ]]; do
echo "::notice title=SSH::${MESSAGE}"
sleep 5
if [[ -e ${CONTINUE_FILE} ]]; then
echo -e "Continue file created, continuing to the next step."
exit 0
fi
done
- name: "SSH to runner: Pause workflow and print connection instructions"
shell: bash
run: |
TMATE_SSH=$(${{ env.TMATE_BIN }} -f ${{ env.TMATE_CONF }} -S ${{ env.TMATE_SOCK }} display -p '#{tmate_ssh}')
MESSAGE=$(cat << EOF
## SSH to runner
Use the following command to connect:
```shell
${TMATE_SSH}
```
### Requirements
Please note that the following must be set for the command
to be successful:
- An active connection to Scality's VPN.
- The private ssh key associated with your GitHub account
is available and configured locally.
### Commands
The workflow is waiting for user input before executing
the remaining steps:
- To resume, run the `continue_workflow` command.
- To disconnect, run the `disconnect` command.
EOF
)
CONTINUE_FILE=${GITHUB_WORKSPACE}/continue
while [[ -S ${{ env.TMATE_SOCK }} ]]; do
echo "${MESSAGE}" >> ${GITHUB_STEP_SUMMARY}
sleep 5
if [[ -e ${CONTINUE_FILE} ]]; then
echo -e "Continue file created, continuing to the next step."
exit 0
fi
done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try that but the detail is that it only prints the summary after the job (and not) step has finished.
In this action we block the step and job waiting for the resume command.

We could split the composite action in two, but the users would have to put them in two different jobs when integrating it in their projects.

What do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? That's a shame, I thought it be updated as I remember the capability of updating it through the API. But yeah best not to split the action in two, let's leave it as is then. It's already good and better than before.

If I have the time I'll then have a look if I can find another way capable of updating it during runtime.

action-ssh-to-runner/continue_workflow Outdated Show resolved Hide resolved
Copy link
Contributor

@tcarmet tcarmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I got you through a bunch of misinformation, this inconsistency between how node vs composite actions behave is quite annoying.

action-ssh-to-runner/action.yaml Show resolved Hide resolved
action-ssh-to-runner/action.yaml Show resolved Hide resolved
Comment on lines 59 to 86
- name: "SSH to runner: Pause workflow and print connection instructions"
shell: bash
run: |
TMATE_SSH=$(${{ env.TMATE_BIN }} -f ${{ env.TMATE_CONF }} -S ${{ env.TMATE_SOCK }} display -p '#{tmate_ssh}')
MESSAGE=$(cat << EOF

--------------------------------------------------
You can connect to this runner using:
${TMATE_SSH}
To connect you need:
- To connected to the Scality VPN.
- The ssh key associated with your GitHub account.
The workflow has been paused.
To resume, run the "continue_workflow" command.
To disconnect, run the "disconnect" command.
--------------------------------------------------
EOF
)
CONTINUE_FILE=${GITHUB_WORKSPACE}/continue
while [[ -S ${{ env.TMATE_SOCK }} ]]; do
echo "::notice title=SSH::${MESSAGE}"
sleep 5

if [[ -e ${CONTINUE_FILE} ]]; then
echo -e "Continue file created, continuing to the next step."
exit 0
fi
done
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really? That's a shame, I thought it be updated as I remember the capability of updating it through the API. But yeah best not to split the action in two, let's leave it as is then. It's already good and better than before.

If I have the time I'll then have a look if I can find another way capable of updating it during runtime.

Co-authored-by: Thomas Carmet <8408330+tcarmet@users.noreply.github.com>
Copy link
Contributor

@oliviergaraud oliviergaraud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that super helpful!


set -x -e

kill -SIGKILL $(cat /tmp/tmate.pid)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be renamed to stop_debugging to avoid confusion (I first thought that disconnect has the same effect as Ctrl-D)

If ever we stop debugging session we should proceed with the end of the workflow.

Suggested change
kill -SIGKILL $(cat /tmp/tmate.pid)
resume
kill -SIGKILL $(cat /tmp/tmate.pid)

We should also prevent the action-delay-job-completion from running (one option is to create a file do_not_delay_completion and check its existence in action-delay-job-completion

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should also prevent the action-delay-job-completion from running

Good point. We should also prevent it in case the "Connect to SSH" step timeouts because noone connected.

My thought was to check for the existence of /tmp/tmate.pid, but I intent to do this in a separate ticket/PR.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Checking if tmate is running is indeed better.
Will you use the same ticket for the other PR ? If not please create a new one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

action-ssh-to-runner/action.yaml Outdated Show resolved Hide resolved
action-ssh-to-runner/action.yaml Outdated Show resolved Hide resolved
action-ssh-to-runner/action.yaml Show resolved Hide resolved
@dvasilas dvasilas merged commit 2f51aa5 into main Aug 24, 2023
3 checks passed
@dvasilas dvasilas deleted the improvement/tmate-foreground-mode branch September 8, 2023 17:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants