-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reorganize CUDAScopedContext #355
Reorganize CUDAScopedContext #355
Conversation
Motivation is that the acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier.
Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics).
Validation summaryReference release CMSSW_10_6_0 at b45186e
|
No changes in physics performance, as expected, |
No changes in performance observed on the T4:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only thing which is slightly confusing is the use of CUDAScopedContextProduce
in an EDAnalyzer::analyze()
method.
But I don't have a better name to suggest, so I'm fine with the changes.
I agree that it's a bit confusing. I could add |
Yes, I think that would make things even clear, thank you. |
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
* Split CUDAScopedContext to *Acquire and *Produce The motivation is that acquire() and produce() need a different functionality, and are constructed differently (e.g. acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution, e.g. towards chains of TBB tasks alternating in CPU and GPU work, easier. * Rename CUDAContextToken to CUDAContextState, and change semantics Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics). * Document the constructors.
PR description:
This PR reorganizes
CUDAScopedContext
in two waysCUDAScopedContext
toCUDAScopedContextAcquire
andCUDAScopedContextProduce
CUDAScopedContext
has a mixture of possible operations that is based on how it was constructed. This is rather error prone, e.g. I remember forgetting several times to passedm::WaitingTaskWithArenaHolder
to it leading to weird crashes. NowCUDAScopedContextAcquire
always requires the holder.CUDAContextToken
toCUDAContextState
, and instead of explicitly saving the state ofCUDAScopedContextAcquire
into it, passCUDAContextState
to the constructor ofCUDAScopedContextAcquire
so that the latter can save the state on its destructor.std::mutex
(as member variable) and locks (local variable)PR validation:
Profiling workflow runs.