Reorganize CUDAScopedContext #355

makortel · 2019-06-14T19:19:47Z

PR description:

This PR reorganizes CUDAScopedContext in two ways:

- Split CUDAScopedContext into CUDAScopedContextAcquire and CUDAScopedContextProduce.
  - Currently CUDAScopedContext offers a mixture of operations whose availability depends on how it was constructed. This is rather error-prone, e.g. I remember forgetting several times to pass edm::WaitingTaskWithArenaHolder to it, leading to weird crashes. Now CUDAScopedContextAcquire always requires the holder.
- Rename CUDAContextToken to CUDAContextState, and instead of explicitly saving the state of CUDAScopedContextAcquire into it, pass CUDAContextState to the constructor of CUDAScopedContextAcquire so that the latter can save the state in its destructor.
  - This is much safer. The pattern is somewhat similar to std::mutex (as a member variable) and locks (as local variables).
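
The member-state plus local-scoped-object pattern described above can be sketched roughly as follows. This is a minimal illustration, not the actual CMSSW code: `ContextState`, `Holder`, `ScopedContextAcquire`, `ScopedContextProduce`, and `ExampleModule` are hypothetical stand-ins, the device and stream are plain integers, and the holder is notified synchronously rather than after queued GPU work completes.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <utility>

// Hypothetical stand-in for CUDAContextState: owned by the module as a member
// variable, it carries the (device, stream) choice from acquire() to produce().
class ContextState {
public:
  bool hasValue() const { return data_.has_value(); }
  int device() const { return data_.value().first; }
  int stream() const { return data_.value().second; }

private:
  friend class ScopedContextAcquire;          // only the acquire context writes the state
  std::optional<std::pair<int, int>> data_;   // placeholder (device, stream) ids
};

// Hypothetical stand-in for edm::WaitingTaskWithArenaHolder.
using Holder = std::function<void()>;

// Local object for acquire(): the holder and the state are mandatory constructor
// arguments, so neither can be forgotten.
class ScopedContextAcquire {
public:
  ScopedContextAcquire(int device, int stream, Holder holder, ContextState& state)
      : device_(device), stream_(stream), holder_(std::move(holder)), state_(state) {}

  // RAII: the state is saved when the local object goes out of scope.
  ~ScopedContextAcquire() {
    state_.data_ = std::make_pair(device_, stream_);
    // Simplification: notify immediately instead of after asynchronous work.
    if (holder_) {
      holder_();
    }
  }

  ScopedContextAcquire(const ScopedContextAcquire&) = delete;
  ScopedContextAcquire& operator=(const ScopedContextAcquire&) = delete;

  int stream() const { return stream_; }

private:
  int device_;
  int stream_;
  Holder holder_;
  ContextState& state_;
};

// Local object for produce(): rebuilt from the saved state, no holder involved.
class ScopedContextProduce {
public:
  explicit ScopedContextProduce(const ContextState& state) {
    assert(state.hasValue() && "acquire() must have run before produce()");
    device_ = state.device();
    stream_ = state.stream();
  }

  int device() const { return device_; }
  int stream() const { return stream_; }

private:
  int device_ = -1;
  int stream_ = -1;
};

// Hypothetical module: the state is a member, the contexts are locals,
// much like a std::mutex member guarded by local lock objects.
struct ExampleModule {
  ContextState state_;

  void acquire(Holder holder) {
    ScopedContextAcquire ctx{0, 7, std::move(holder), state_};
    // ... asynchronous work would be queued on ctx.stream() here ...
  }  // ctx is destroyed here, which saves the state

  int produce() {
    ScopedContextProduce ctx{state_};
    return ctx.stream();
  }
};
```

In this sketch, calling `acquire()` on an `ExampleModule` and then `produce()` returns the stream id chosen in `acquire()`, without any explicit "save the state" call in user code.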

PR validation:

Profiling workflow runs.

The motivation is that acquire() and produce() need different functionality and are constructed differently (e.g. the acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution easier, e.g. towards chains of TBB tasks alternating between CPU and GPU work.

Now CUDAScopedContextAcquire takes the CUDAContextState as a constructor parameter and stores the state in its destructor (yielding RAII semantics).

fwyzard · 2019-06-18T11:27:49Z

Validation summary

Reference release CMSSW_10_6_0 at b45186e
Development branch CMSSW_10_6_X_Patatrack at 828f536
Testing PRs:

Reorganize CUDAScopedContext #355 at 91ec257

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

tracking validation plots and summary for workflow 10824.5
tracking validation plots and summary for workflow 10824.51
tracking validation plots and summary for workflow 10824.52

logs and `nvprof`/`nvvp` profiles

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors

/RelValZMM_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_realistic_v4-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors

/RelValTTbar_13/CMSSW_10_6_0-PU25ns_106X_upgrade2018_design_v3-v1/GEN-SIM-DIGI-RAW

reference release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
development release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
testing release, workflow 10824.5
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.51
- ✔️ step3.py: log, visual profile and summary
testing release, workflow 10824.52
- ✔️ step3.py: log, visual profile and summary
- ✔️ profile.py: log, visual profile and summary
- ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
- ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
- ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/4110ca453e58dc6903ff55dd22d58424d8352f4e/log .

fwyzard · 2019-06-20T14:02:39Z

No changes in physics performance, as expected.

fwyzard · 2019-06-20T14:10:21Z

No changes in performance observed on the T4:

reference: 981.2 ± 6.1 ev/s
changed: 982.8 ± 6.8 ev/s

fwyzard

The only thing which is slightly confusing is the use of CUDAScopedContextProduce in an EDAnalyzer::analyze() method.

But I don't have a better name to suggest, so I'm fine with the changes.

makortel · 2019-06-20T14:22:40Z

> The only thing which is slightly confusing is the use of CUDAScopedContextProduce in an EDAnalyzer::analyze() method.
>
> But I don't have a better name to suggest, so I'm fine with the changes.

I agree that it's a bit confusing. I could add CUDAScopedContextAnalyze, which would differ from CUDAScopedContextProduce by having neither wrap() nor emplace(). If you think that would be useful, I can add it in a follow-up PR that introduces a pattern for a chain of CPU tasks and GPU "tasks" within a module.
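
One possible shape for such a split is sketched below, purely as an illustration. The names `Product`, `ContextBase`, `ProduceContext`, `AnalyzeContext`, and `HasWrap` are hypothetical, the stream is a plain integer, and only a wrap()-like member is modeled (emplace() is left out for brevity). The shared read-side functionality sits in a base class and only the produce variant can create products.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical wrapper tying a payload to the stream it was produced on.
template <typename T>
struct Product {
  int stream;
  T data;
};

// Functionality shared by the produce-like and analyze-like contexts:
// knowing the stream and reading existing products.
class ContextBase {
public:
  explicit ContextBase(int stream) : stream_(stream) {
    assert(stream >= 0 && "stream ids in this sketch are non-negative");
  }

  int stream() const { return stream_; }

  template <typename T>
  const T& get(const Product<T>& product) const {
    return product.data;
  }

private:
  int stream_;
};

// Produce-like context: additionally able to wrap new data into a product.
class ProduceContext : public ContextBase {
public:
  using ContextBase::ContextBase;

  template <typename T>
  Product<T> wrap(T data) const {
    return Product<T>{stream(), std::move(data)};
  }
};

// Analyze-like context: read-only, deliberately without wrap().
class AnalyzeContext : public ContextBase {
public:
  using ContextBase::ContextBase;
};

// Compile-time check that wrap() exists only where intended.
template <typename Ctx, typename = void>
struct HasWrap : std::false_type {};

template <typename Ctx>
struct HasWrap<Ctx, std::void_t<decltype(std::declval<const Ctx&>().wrap(0))>>
    : std::true_type {};

static_assert(HasWrap<ProduceContext>::value, "produce context can create products");
static_assert(!HasWrap<AnalyzeContext>::value, "analyze context cannot create products");
```

With a layout like this, an attempt to call wrap() inside an analyze()-style method would fail at compile time instead of relying on naming conventions.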

fwyzard · 2019-06-20T14:25:22Z

Yes, I think that would make things even clearer, thank you.

* Split CUDAScopedContext to *Acquire and *Produce

  The motivation is that acquire() and produce() need different functionality and are constructed differently (e.g. the acquire version always needs the edm::WaitingTaskWithArenaHolder). This split should make it more difficult to make mistakes. It should also make future evolution easier, e.g. towards chains of TBB tasks alternating between CPU and GPU work.
* Rename CUDAContextToken to CUDAContextState, and change semantics

  Now CUDAScopedContextAcquire takes it as a constructor parameter and stores the state in its destructor (yielding RAII semantics).
* Document the constructors.

makortel added 3 commits June 14, 2019 21:00

Rename CUDAContextToken to CUDAContextState, and change semantics

48819a2

Now CUDAScopedContextAcquire takes it as a parameter to constructor, and stores the state in its destructor (yielding RAII semantics).

Document constructors

91ec257

makortel mentioned this pull request Jun 19, 2019

Running code-format for added packages #357

Closed

fwyzard approved these changes Jun 20, 2019

fwyzard merged commit 957e184 into cms-patatrack:CMSSW_10_6_X_Patatrack Jun 20, 2019

makortel mentioned this pull request Jun 20, 2019

Running code-format for added files in existing packages #358

Closed

makortel mentioned this pull request Jul 1, 2019

Prototype for module-internal chain of tasks #363

Merged

fwyzard mentioned this pull request Aug 13, 2020

Patatrack integration - GPU beamspot data format and transfer (4/N) cms-sw/cmssw#31130

Merged

fwyzard mentioned this pull request Oct 8, 2020

Patatrack integration - Pixel local reconstruction (9/N) cms-sw/cmssw#31721

Merged
