Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed pernicious bug in autobatching #1127

Merged
merged 3 commits into from
Dec 8, 2017
Merged

Fixed pernicious bug in autobatching #1127

merged 3 commits into from
Dec 8, 2017

Conversation

neubig
Copy link
Contributor

@neubig neubig commented Dec 8, 2017

OK, it took me literally all day to find a bug that can be fixed in 10 characters of code.

Basically, parallel memcpy (used in autobatching) requires memory to be transferred to the GPU, but this was being done in an asynchronous fashion, and as a result the memory wasn't necessarily arriving at the GPU before the memcpy kernel started executing.

This probably fixes #1057

@redpony
Copy link
Member

redpony commented Dec 8, 2017 via email

@neubig
Copy link
Contributor Author

neubig commented Dec 8, 2017

I'm not actually sure whether the copy and the kernel are being executed on the same stream (probably not). Probably the best thing to do is enforce that the copy uses the same stream as the kernel, but I wasn't immediately sure how to do this.

And yes, I'm aware that the forced sync might be expensive, I haven't merged because I haven't had time to figure out the speed ramifications yet.

@neubig
Copy link
Contributor Author

neubig commented Dec 8, 2017

OK, so CudaMemcpy caused a performance degradation, but specifying a stream for CudaMemcpyAsync seems to be working for now. I'm not super-confident in this fix though, so after merging I'll continue to monitor for reported problems.

@neubig neubig merged commit bc0bf05 into master Dec 8, 2017
@neubig neubig mentioned this pull request Dec 8, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CUBLAS related error
2 participants