Prototype for EventSetup data on GPUs #77

makortel · 2018-06-05T11:52:58Z

This PR adds a prototype for dealing with EventSetup data on GPUs. Resolves #65. The prototype is applied to the ES data used by Raw2Cluster (cabling map etc, gains) and RecHits (CPE).

About the system

As outlined in the issue, the ESProduct now owns the GPU memory. Currently each of the affected ESProducts has a method getGPUProductAsync(cuda::stream_t<>&) (suggestions for better names are welcome), which allocates the memory on the current GPU device and transfers the data there asynchronously if the data is not there yet. The bookkeeping of which devices already have the data, and the necessary synchronization between multiple threads (only one thread may do the transfer per device), are abstracted into a helper template in HeterogeneousCore/CUDACore/interface/CUDAESProduct.h.
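To make the per-device bookkeeping concrete, here is a minimal host-only sketch of the idea. It is not the actual CUDAESProduct.h interface: the class name `PerDeviceProduct`, the method `dataForDevice`, and the blocking mutex are assumptions made for illustration, and the real helper works with CUDA streams and asynchronous copies, which this sketch leaves out.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical helper (illustration only, not the CUDAESProduct API):
// keeps one copy of the data per device and runs the transfer at most once
// per device, even if several threads ask for the same device.
template <typename T>
class PerDeviceProduct {
public:
  explicit PerDeviceProduct(std::size_t nDevices) : slots_(nDevices) {}

  // 'transfer' is any callable taking T&; it stands in for the
  // "allocate on the device and copy" step.
  template <typename F>
  const T& dataForDevice(std::size_t device, F&& transfer) {
    assert(device < slots_.size());
    Slot& slot = slots_[device];
    // Only one thread per device may perform the transfer.
    std::lock_guard<std::mutex> lock(slot.mutex);
    if (!slot.filled) {
      transfer(slot.data);
      slot.filled = true;
    }
    return slot.data;
  }

private:
  struct Slot {
    std::mutex mutex;
    bool filled = false;
    T data{};
  };
  std::vector<Slot> slots_;
};
```

A caller would pass the current device index and a lambda that fills the per-device data; repeated calls for the same device return the same object without running the lambda again.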

CPE

Adding support to PixelCPEFast was easy, as it was already produced by an ESProducer and dealt with copying the data to GPU.

Cabling map

Cabling map etc required a bit more work, as

- the SiPixelFedCablingMap comes from the DB (so better not modify it)
- we actually ship more data than just the cabling map (from geometry and SiPixelQuality)
- we also ship event-based data for modules to unpack, whose filling has the same loop structure as the other data

I created a new ESProduct SiPixelFedCablingMapGPUWrapper (for the time being on the CkfComponentsRecord record; maybe we could create a "smaller" record in some more "local reco" package than RecoTracker/Record) to gather all the necessary ES data. The modules-to-unpack is implemented as a nested struct there to benefit from the constants and loop logic. In case the regionality is not used, the modules-to-unpack are transferred only once.

Gains

Gains also required a bit more work, as

- the SiPixelGainCalibrationForHLT comes from the DB (so better not modify it)
- there is a rather complex SiPixelGainCalibrationForHLTService class structure to access the gain values on the CPU

Fortunately, for the GPU we already just transfer the internal data of SiPixelGainCalibrationForHLT, so it is enough to create a new ESProduct (SiPixelGainCalibrationForHLTGPU, on a newly created SiPixelGainCalibrationForHLTGPURcd record which also depends on TrackerDigiGeometryRecord) and let it transfer the necessary data to the GPU "as usual".

Other stuff

Following @VinInn's suggestion #65 (comment), the Raw2Cluster is moved to RecoLocalTracker/SiPixelClusterizer. This also allows reverting 766e967 of #62 (but apparently this fact cannot be used later to clean the history, as I see the PRs are squashed to single commits at the moment).

@fwyzard @felicepantaleo @VinInn @rovere

cmsbot · 2018-06-05T11:53:17Z

A new Pull Request was created by @makortel (Matti Kortelainen) for CMSSW_10_2_X_Patatrack.

It involves the following packages:

CalibTracker/Records
CalibTracker/SiPixelESProducers
Configuration/StandardSequences
EventFilter/SiPixelRawToDigi
HeterogeneousCore/CUDACore
HeterogeneousCore/CUDAServices
RecoLocalTracker/Configuration
RecoLocalTracker/SiPixelClusterizer
RecoLocalTracker/SiPixelRecHits

The following packages do not have a category, yet:

HeterogeneousCore/CUDACore
Please create a PR for https://github.com/cms-sw/cms-bot/blob/master/categories_map.py to assign category

@cmsbot, @fwyzard can you please review it and eventually sign? Thanks.

cms-bot commands are listed here

fwyzard · 2018-06-05T12:51:50Z

RecoLocalTracker/SiPixelClusterizer/plugins/PixelThresholdClusterizer.h

@@ -56,7 +56,7 @@
 #include <vector>


-class PixelThresholdClusterizer final : public PixelClusterizerBase {
+class dso_hidden PixelThresholdClusterizer final : public PixelClusterizerBase {


why dso_hidden ?

It's there in CMSSW master
https://github.com/cms-sw/cmssw/blob/148a0029c3fb3c8a92cc74b32cc5ae884bde87c5/RecoLocalTracker/SiPixelClusterizer/plugins/PixelThresholdClusterizer.h#L59

See also the commit of #62 which (temporarily) removed it
766e967

fwyzard · 2018-06-05T14:22:49Z

Validation summary

Reference release CMSSW_10_2_0_pre4 at 926a81b
Development branch CMSSW_10_2_X_Patatrack at b1e6d1c
Testing PRs:

Prototype for EventSetup data on GPUs #77 at 76e5f9b

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8 are missing
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

logs and `nvprof/nvvp` profiles

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/e621e735feea29d20cc7017ea6a4b82cd8ead8b2/log .

fwyzard · 2018-06-05T15:04:54Z

@makortel thanks for the PR, I'm sorry I couldn't comment on #65 earlier.

Am I correct that this approach mirrors how the event setup works on the CPU ?
That is, conditions are loaded and made available only on demand.

On the other hand, how does this approach deal with an IOV change ?

makortel · 2018-06-05T15:29:22Z

@fwyzard

> Am I correct that this approach mirrors how the event setup works on the CPU ?
> That is, conditions are loaded and made available only on demand.

That is correct.

> On the other hand, how does this approach deal with an IOV change ?

My understanding is that on IOV transition the ESProduct of the old IOV is deleted, and the ESProduct of the new IOV is constructed (and in case of multiple lumis in flight these two may coexist for some time). What happens with the approach of this PR is that for the old IOV the GPU memory is deallocated, and for the new IOV the GPU memory is allocated (again) and the CPU->GPU transfer is made (so it continues to mirror how the system works on the CPU).

One could of course argue that the deallocation+allocation cycle is unnecessary. In a sense that is true, because even in case of multiple lumis in flight one could just multiply the buffers by the maximum number of concurrent lumis and add bookkeeping logic. Alternatively we could use a custom allocator to avoid calling cudaMalloc/cudaFree every time. I'm sure we can spend lots of time discussing how to improve :)
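As a rough illustration of the custom-allocator alternative, the sketch below caches released buffers by size so that a new IOV can reuse the buffer of the previous one. Everything here is an assumption for illustration: the class `CachingPool` and its methods are hypothetical, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so that the example runs without a GPU.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical size-keyed buffer cache (illustration only).
// std::malloc/std::free are stand-ins for cudaMalloc/cudaFree.
class CachingPool {
public:
  CachingPool() = default;
  CachingPool(const CachingPool&) = delete;
  CachingPool& operator=(const CachingPool&) = delete;

  ~CachingPool() {
    // Really free whatever is still cached.
    for (auto& entry : cache_)
      for (void* p : entry.second)
        std::free(p);
  }

  void* allocate(std::size_t bytes) {
    auto it = cache_.find(bytes);
    if (it != cache_.end() && !it->second.empty()) {
      // Reuse a buffer released earlier, e.g. by the previous IOV.
      void* p = it->second.back();
      it->second.pop_back();
      return p;
    }
    ++rawAllocations_;
    return std::malloc(bytes);
  }

  // Return a buffer to the cache instead of freeing it.
  void release(void* p, std::size_t bytes) {
    assert(p != nullptr);
    cache_[bytes].push_back(p);
  }

  int rawAllocations() const { return rawAllocations_; }

private:
  std::map<std::size_t, std::vector<void*>> cache_;
  int rawAllocations_ = 0;
};
```

If the old and the new product coexist for a while, the new one simply gets a fresh allocation, and the old buffer becomes available to the product after that.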

fwyzard · 2018-06-05T15:32:34Z

Actually, AFAIK the framework does support concurrent runs/lumis, but does not support concurrent IOVs.

I assume the framework team would like to implement support for that as well in the future, so I am fine with an approach that does not prevent it.

makortel · 2018-06-05T15:40:40Z

> Actually, AFAIK the framework does support concurrent runs/lumis, but does not support concurrent IOVs.

Right, I mixed things up a bit. My understanding is as well that concurrent IOVs are somewhere in the future plans, so I aimed for a solution that would automatically work with that.

makortel · 2018-06-12T12:49:47Z

Could we try to review+merge this one soonish?

fwyzard · 2018-06-12T15:48:31Z

CalibTracker/SiPixelESProducers/src/SiPixelGainCalibrationForHLTGPU.cc

+}
+
+SiPixelGainCalibrationForHLTGPU::GPUData::~GPUData() {
+  if(gainForHLTonGPU != nullptr) {


You can avoid the check, as cudaFree will not do anything if the argument is a null pointer
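The CUDA runtime documents `cudaFree` as performing no operation when given a null pointer, which is the same contract `std::free` has. A host-only sketch of the simplified destructor, with `std::free` as a stand-in for `cudaFree` and a hypothetical struct `GPUDataSketch` (not the actual `GPUData` from the PR), could look like this:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for the GPUData struct (illustration only).
// std::malloc/std::free replace cudaMalloc/cudaFree here.
struct GPUDataSketch {
  float* gains = nullptr;

  GPUDataSketch() = default;
  GPUDataSketch(const GPUDataSketch&) = delete;
  GPUDataSketch& operator=(const GPUDataSketch&) = delete;

  void allocate(std::size_t n) {
    std::free(gains);  // no-op on the first call, when gains is null
    gains = static_cast<float*>(std::malloc(n * sizeof(float)));
    assert(gains != nullptr);
  }

  // No null check needed: freeing a null pointer does nothing.
  ~GPUDataSketch() { std::free(gains); }
};
```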

fwyzard · 2018-06-12T15:56:34Z

I don't have other comments, if nobody else has anything to add I will merge this tomorrow morning.

makortel · 2018-06-13T06:47:24Z

@fwyzard Let me address your comment #77 (comment) first (doing it right now).

fwyzard · 2018-06-14T13:51:19Z

Validation summary

Reference release CMSSW_10_2_0_pre5 at 30c7b03
Development branch CMSSW_10_2_X_Patatrack at 655e4ed
Testing PRs:

Prototype for EventSetup data on GPUs #77 at ad5349a

`makeTrackValidationPlots.py` plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

tracking validation plots for workflow 10824.5
tracking validation plots for workflow 10824.8
tracking validation plots for workflow 10824.7
tracking validation plots for workflow 10824.9 are missing

DQM GUI plots

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference DQM plots for reference release, workflow 10824.5
DQM plots for development release, workflow 10824.5
DQM plots for development release, workflow 10824.8
DQM plots for development release, workflow 10824.7
DQM plots for development release, workflow 10824.9 are missing
DQM plots for testing release, workflow 10824.5
DQM plots for testing release, workflow 10824.8
DQM plots for testing release, workflow 10824.7
DQM plots for testing release, workflow 10824.9 are missing
DQM comparison for reference workflow 10824.5
DQM comparison for workflow 10824.8
DQM comparison for workflow 10824.7
DQM comparison for workflow 10824.9

logs and `nvprof/nvvp` profiles

/RelValTTbar_13/CMSSW_10_2_0_pre3-PU25ns_101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

/RelValZMM_13/CMSSW_10_2_0_pre3-101X_upgrade2018_realistic_v7-v1/GEN-SIM-DIGI-RAW

reference log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.5
development log, profile and summary for workflow 10824.8
development log, profile and summary for workflow 10824.7
development log, profile and summary for workflow 10824.9 are missing
testing log, profile and summary for workflow 10824.5
testing log, profile and summary for workflow 10824.8
testing log, profile and summary for workflow 10824.7
testing log, profile and summary for workflow 10824.9 are missing

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/ca74ef1417f5be3d0316db57ecb0a2d4cf18d9a6/log .

Adds a prototype for dealing with EventSetup data on GPUs. The prototype is applied to the ES data used by Raw2Cluster (cabling map etc, gains) and RecHits (CPE).

Now it is the `ESProduct` that owns the GPU memory. Currently each of the affected `ESProducts` has a method `getGPUProductAsync(cuda::stream_t<>&)` that will allocate the memory on the current GPU device and transfer the data there asynchronously, if the data is not there yet. The bookkeeping of which devices already have the data, and the necessary synchronization between multiple threads (only one thread may do the transfer per device), are abstracted into a helper template in `HeterogeneousCore/CUDACore/interface/CUDAESProduct.h`.

Technical changes:
- `EventSetup`-based implementation for GPU cabling map, gains, etc
- add support for multiple devices to `PixelCPEFast`
- abstract the `EventSetup` GPU transfer
- move `malloc` and transfer to the lambda
- move `cudaFree` outside of the `nullptr` check
- move files (back) to the plugins directory
- rename `siPixelDigisHeterogeneous` to `siPixelClustersHeterogeneous`

makortel added 8 commits June 5, 2018 11:23

Add support for multiple devices to PixelCPEFast

4bcd03c

Abstract the ES GPU transfer

ae0c3b7

Move malloc and transfer to the lambda

140ab7b

EventSetup-based implementation for GPU cabling map etc

88f36c5

EventSetup-based implementation for GPU gains

9973b52

Move SiPixelRawToCluster to RecoLocalTracker/SiPixelClusterizer

1572e39

Revert "Move PixelThresholdClusterizer (back?) to interface+src in or…

f749867

…der to use it outside of RecoLocalTracker/SiPixelClusterizer" This reverts commit 766e967.

Rename siPixelDigisHeterogeneous to siPixelClustersHeterogeneous

76e5f9b

makortel mentioned this pull request Jun 5, 2018

Improve GPU treatment of conditions #65

Closed

cmsbot added alca-pending labels Jun 5, 2018

fwyzard reviewed Jun 5, 2018

View reviewed changes

makortel mentioned this pull request Jun 5, 2018

Make the exclusive_scan to stream-synchronize in PixelRecHits #74

Merged

makortel mentioned this pull request Jun 12, 2018

GPU: better hits #81

Merged

fwyzard reviewed Jun 12, 2018

View reviewed changes

fwyzard merged commit 955d9df into cms-patatrack:CMSSW_10_2_X_Patatrack Jun 14, 2018

fwyzard removed comparison-pending labels Jul 5, 2018

fwyzard added this to the CMSSW_10_2_0_pre6_Patatrack milestone Jul 5, 2018

bkilian15 pushed a commit to bkilian15/cmssw that referenced this pull request May 15, 2019

Merge pull request cms-patatrack#77 from steggema/CMSSW_10_1_X_TauRec…

4285c06

…oMiniAODReview Cmssw 10 1 x tau reco mini aod review

fwyzard mentioned this pull request Oct 8, 2020

Patatrack integration - Pixel local reconstruction (9/N) cms-sw/cmssw#31721

Merged

fwyzard mentioned this pull request Dec 25, 2020

Patatrack integration - ECAL local reconstruction (7/N) cms-sw/cmssw#31719

Merged
