
catch errors in shell commands #28

Open · charlesfrye opened this issue Jul 24, 2022 · 3 comments

charlesfrye (Collaborator) commented Jul 24, 2022

Problem

we use the IPython ! magic to execute terminal commands from notebooks, but that causes shell scripts to fail silently -- at most, their errors are printed as strings to the console, as below. notice that control flow has continued to the next cell.
[screenshot: the error from the shell command is printed as plain text and execution continues to the next cell]

that makes testing the notebooks less useful. some of the most interesting and potentially fragile components of our labs are run via the shell, e.g. !python training/run_experiment.py.
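
a minimal repro of the silent failure (the failing command here is just an illustration, not one of ours):

# in one notebook cell: the command exits nonzero, but no exception is raised
!python -c "import sys; sys.exit(1)"

# in the next cell: control flow has continued as if nothing went wrong
print("still running")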

Candidate Solutions

%sx / %sc

these are effectively the same as !, apart from slight differences that don't help us here. they also capture the output rather than streaming it, so there's no live stdout updating -- and live output is critical for long-running scripts like training.
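
a quick sketch of that capture behavior (the --help flag is just for illustration): nothing prints while the command runs, and a nonzero exit still doesn't raise.

# %sx captures stdout into a list of lines instead of streaming it live
out = %sx python training/run_experiment.py --help
print(out[:3])  # the first few captured lines, only available after the command finishes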

%%bash

a cell-level magic that runs the cell contents as a shell script (alternatively, %%sh). this does raise a CalledProcessError when the script fails
[screenshot: a failing command inside %%bash raising CalledProcessError]
but the contents are treated literally as a shell script, so we can't interpolate Python-level variables f-string-style inside {}, like the always handy --gpus {torch.cuda.is_available()}
[screenshot: the {} expression passed through to bash verbatim instead of being interpolated]
and it doesn't behave the same on Colab (see below). this is not a dealbreaker, because Colab is interactive, which means 1) automated testing happens locally / in the cloud, not on Colab, and 2) Colab testing will be manual, and a human is likely to notice these issues
[screenshot: %%bash behaving differently on Colab]
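
putting the first two points together in one sketch (run as a single notebook cell; the script path is the one from our labs):

%%bash
# no f-string-style interpolation: the braces below are passed to bash verbatim,
# so this prints the literal text "--gpus {torch.cuda.is_available()}"
echo --gpus {torch.cuda.is_available()}
# but if the script exits nonzero (e.g. this command fails), the cell raises
# subprocess.CalledProcessError back in Python, so the failure is loud
python training/run_experiment.py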

%run

this line magic only works for running Python scripts, but it supports f-string-style {} interpolation of Python variables and re-raises any errors from the script.
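
a sketch of that pattern (relying on the {} interpolation described above; the variable name is just for illustration):

# %run executes the script in the current namespace; any uncaught exception in the
# script propagates, so a failing run actually stops the notebook
import torch

gpus = int(torch.cuda.is_available())
%run training/run_experiment.py --gpus {gpus}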

Conclusions

it's understandable that there's no clean solution here: shell errors on Unix/Windows/whatever OS are not Python errors, so it's not necessarily sensible to raise them. or rather, it probably would have been sensible to make raising part of the defined behavior and force folks to always try/except when using subprocesses, but that would have had to be done from the beginning of the design, and these are old, possibly Dutch, bits of Python here.

for now, we can use %run everywhere we had !python (a greppable replacement, AFAICT). but, as the issue title says, we should find a way to catch errors in shell commands that use Python-level variables, like the following friend who triggered the discovery:

list_all_log_files = "find training/logs/lightning_logs/"
filter_to_ckpts = r"grep \.ckpt$"
sort_recent_to_old = "sort -r"
take_first = "head -n 1"

latest_ckpt, = ! {list_all_log_files} | {filter_to_ckpts} | {sort_recent_to_old} | {take_first}
latest_ckpt
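
one possible direction (not adopted here, just a sketch): build the same pipeline from the Python-level variables but run it through subprocess with check=True and pipefail, so a failure in any stage raises.

import subprocess

pipeline = f"{list_all_log_files} | {filter_to_ckpts} | {sort_recent_to_old} | {take_first}"
result = subprocess.run(
    ["bash", "-o", "pipefail", "-c", pipeline],  # pipefail: a failure anywhere in the pipe fails the whole command
    check=True,           # nonzero exit -> subprocess.CalledProcessError
    capture_output=True,  # same no-live-stdout tradeoff as %sx, so better for short pipelines than training runs
    text=True,
)
latest_ckpt = result.stdout.strip()
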
charlesfrye (Collaborator, Author) commented

fyi @sergeyk

charlesfrye (Collaborator, Author) commented

we could use shell error handling like ! command && if_succ || if_fail, but we'd need to capture the result in Python in order to raise something, which means extra code -- in particular if we want the raised error to be helpful and not just assert not if_fail.

that's complexity we don't want to show students, so we don't want this in every ! command in the labs -- perhaps only in a few critical commands, and in places where it can be added with minimal seams.

charlesfrye (Collaborator, Author) commented Jul 25, 2022

for example, the "testing Colab" shouldn't require babysitting -- failures should be loud.

here's roughly what error handling looks like, for reference:

out = ! ./tasks/unit_test.sh && echo "success" || echo "failure"
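# out is the list of captured output lines; the last line is the success/failure sentinel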

if out[-1] == "failure":
    raise RuntimeError("\n".join(out[:-1]))
