Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make train test traceable when error occurs #243

Merged
merged 4 commits into from
Oct 28, 2019
Merged

Conversation

witwolf
Copy link
Contributor

@witwolf witwolf commented Oct 23, 2019

logging training stdout & stderr output to stderr (stdout is suppressed at ci) to make it traceable when error occurs.

and this pr can solve the issue: build-times-out-because-no-output-was-received (travis_wait not support for a docker run command )

@@ -34,17 +36,22 @@ def run_and_stream(cmd, cwd):
cwd (str): working directory for the process
"""
logging.info("Running %s", " ".join(cmd))

logger = logging.ABSLLogger('')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

instead of changing the code here, can we simply redirect stdout and stderr to stderr at build.sh?

python3 -m unittest discover -p "*_test.py" -v 1>&2

If not, we need some comment in the code to explain the reason of doing all these stuff so that future readers can understand it.

Copy link
Contributor Author

@witwolf witwolf Oct 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, it can be done withpython3 -m unittest discover -p "*_test.py" -v 1>&2 ,
but there is a potential problem Log length exceeded 4 MB if we log for all stdout (now the log file is about 3.2MB)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Then please add some comments here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

@witwolf
Copy link
Contributor Author

witwolf commented Oct 25, 2019

how about other tests, don’t they have the same issue?

other tests are not in sub process like train test (python -m unittest ... -> python .._test_py-> python -m alf.bin.train ... )

process = subprocess.Popen(
cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, cwd=cwd)

while process.poll() is None:
with io.TextIOWrapper(process.stdout, encoding="utf-8") as text_io:
for line in text_io:
logging.info(line.strip())
logger.info(line.strip())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now the original stdout goes to stderr, and the original stderr goes to stdout (because stderr=subprocess.STDOUT at line 50). So I am still confused. And why not simply use stdout=subprocess.STDERR at line 50?

@emailweixu
Copy link
Contributor

other tests are not in sub process like train test (python -m unittest ... -> python .._test_py-> python -m alf.bin.train ... )

I am still confused. If the stdout of train_test is suppressed, aren't the stdout of other tests also suppressed?

@witwolf
Copy link
Contributor Author

witwolf commented Oct 26, 2019

other tests are not in sub process like train test (python -m unittest ... -> python .._test_py-> python -m alf.bin.train ... )

I am still confused. If the stdout of train_test is suppressed, aren't the stdout of other tests also suppressed?

train test cost much time, may encounter such a problem build-times-out-because-no-output-was-received when no stdout output .

solve this with travis_wait in the new commit (travis-ci/travis-ci#6934 )


import logging as sys_logging
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed

@emailweixu emailweixu merged commit 80a6386 into master Oct 28, 2019
@witwolf witwolf deleted the PR_train_test1 branch October 29, 2019 03:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants