-
Notifications
You must be signed in to change notification settings - Fork 49
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make train test traceable when error occurs #243
Conversation
alf/bin/train_test.py
Outdated
@@ -34,17 +36,22 @@ def run_and_stream(cmd, cwd): | |||
cwd (str): working directory for the process | |||
""" | |||
logging.info("Running %s", " ".join(cmd)) | |||
|
|||
logger = logging.ABSLLogger('') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of changing the code here, can we simply redirect stdout and stderr to stderr at build.sh?
python3 -m unittest discover -p "*_test.py" -v 1>&2
If not, we need some comment in the code to explain the reason of doing all these stuff so that future readers can understand it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, it can be done withpython3 -m unittest discover -p "*_test.py" -v 1>&2
,
but there is a potential problem Log length exceeded 4 MB if we log for all stdout (now the log file is about 3.2MB)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. Then please add some comments here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
added
other tests are not in sub process like train test (python -m unittest ... -> python .._test_py-> python -m alf.bin.train ... ) |
alf/bin/train_test.py
Outdated
process = subprocess.Popen( | ||
cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, cwd=cwd) | ||
|
||
while process.poll() is None: | ||
with io.TextIOWrapper(process.stdout, encoding="utf-8") as text_io: | ||
for line in text_io: | ||
logging.info(line.strip()) | ||
logger.info(line.strip()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now the original stdout goes to stderr, and the original stderr goes to stdout (because stderr=subprocess.STDOUT at line 50). So I am still confused. And why not simply use stdout=subprocess.STDERR at line 50?
I am still confused. If the stdout of train_test is suppressed, aren't the stdout of other tests also suppressed? |
train test cost much time, may encounter such a problem build-times-out-because-no-output-was-received when no stdout output . solve this with |
alf/bin/train_test.py
Outdated
|
||
import logging as sys_logging |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
logging training stdout & stderr output to stderr (stdout is suppressed at ci) to make it traceable when error occurs.
and this pr can solve the issue: build-times-out-because-no-output-was-received (
travis_wait
not support for a docker run command )