
Use 10 time samples for the HBHE digis in Run 3 #531

Merged
merged 12 commits into cms-patatrack:CMSSW_11_1_X_Patatrack
Aug 8, 2020

Conversation


@vkhristenko vkhristenko commented Aug 5, 2020

PR description:

Fixing issue #520.
I attach a cfg that I used to validate. In particular, confs should be added for the gpu modifier (apologies, not sure where to add this):
cfg.txt

process.hcalRawToDigiGPU.nsamplesF01HE = 10
process.hcalRawToDigiGPU.nsamplesF5HB = 10
process.hcalRawToDigiGPU.nsamplesF3HB = 10

PR validation:

cuda-memcheck no longer reports invalid accesses (memcheck/initcheck).


fwyzard commented Aug 5, 2020

@mariadalfonso @vkhristenko in which cases should we use 8 or 10 time samples ?

  • Run 2 data ?
  • Run 2 MC ? Up to / from which release ?
  • Run 3 MC ? Up to / from which release ?

@fwyzard fwyzard added bug-fix HCAL HCAL-related developments labels Aug 5, 2020

vkhristenko commented Aug 5, 2020 via email

@mariadalfonso

  • Run 3 MC - 10 samples
  • Data - 8 samples, for Run 3 as well
  • Run 2 MC - 8 samples
  • Releases not sure...

The above is correct.

Note that Mahi (RECO) always runs on 8 samples with 8 pulses, but in Run 3, in view of the 2024 ageing, we need to simulate electronic effects earlier (the real detector has all the effects from -inf).

Maria


fwyzard commented Aug 5, 2020

Thanks for all the details.

Maria, I think the Run 3 MC samples from CMSSW 11.1.0(-pre8) already have the simulation with 10 samples, correct ?

@mariadalfonso

Yes.
The two relevant PRs, 28948 and 29035, entered
https://github.com/cms-sw/cmssw/releases/tag/CMSSW_11_1_0_pre4


fwyzard commented Aug 6, 2020

Thanks !


fwyzard commented Aug 6, 2020

I tested that adding

process.hcalDigisGPU.nsamplesF01HE = 10
process.hcalDigisGPU.nsamplesF3HB = 10
process.hcalDigisGPU.nsamplesF5HB = 10

to the Run 3 MC workflow solves all the cuda-memcheck reports.

@mariadalfonso do you know how this is handled for data vs MC in the cpu workflow ?

@vkhristenko

Overall, they do not differentiate based on MC vs data (in Mahi's producer); it is based on what sits in the data (how many time samples, and where the sample of interest is). However, the RawToDigi producer does have this conf!

On the CPU it is much easier, as you do not need to deal with offsets (for a particular thread...).

Technically speaking, this conf could be lifted further or removed, I believe; it would then be taken from the data... I decided to leave it like that because, at the very least, it shows what the expectation is for the GPU code...


fwyzard commented Aug 6, 2020

However, the Raw to Digi producer does have this conf!

You mean the original CPU RawToDigi?

@vkhristenko

yep yep, the cpu unpacker has this in its configuration...


fwyzard commented Aug 6, 2020

In CMSSW 11_2_0_pre3, I get

process.hcalDigis = cms.EDProducer("HcalRawToDigi",
    ComplainEmptyData = cms.untracked.bool(False),
    ElectronicsMap = cms.string(''),
    ExpectedOrbitMessageTime = cms.untracked.int32(-1),
    FEDs = cms.untracked.vint32(),
    FilterDataQuality = cms.bool(True),
    HcalFirstFED = cms.untracked.int32(700),
    InputLabel = cms.InputTag("rawDataCollector"),
    UnpackCalib = cms.untracked.bool(True),
    UnpackTTP = cms.untracked.bool(True),
    UnpackUMNio = cms.untracked.bool(True),
    UnpackZDC = cms.untracked.bool(True),
    UnpackerMode = cms.untracked.int32(0),
    firstSample = cms.int32(0),
    lastSample = cms.int32(9),
    mightGet = cms.optional.untracked.vstring,
    saveQIE10DataNSamples = cms.untracked.vint32(),
    saveQIE10DataTags = cms.untracked.vstring(),
    saveQIE11DataNSamples = cms.untracked.vint32(),
    saveQIE11DataTags = cms.untracked.vstring(),
    silent = cms.untracked.bool(True)
)

for

  • Run 2 MC (10824.0_TTbar_13)
  • Run 3 MC (11634.0_TTbar)
  • Run2 data (136.885_RunHLTPhy2018D)

If the relevant part is

    firstSample = cms.int32(0),
    lastSample = cms.int32(9),

does it mean that we should use 10 samples for all data and MC, also for Run 2 ?

@vkhristenko

this must be configured somewhere else... what matters above is actually this:

    saveQIE10DataNSamples = cms.untracked.vint32(),
    saveQIE10DataTags = cms.untracked.vstring(),
    saveQIE11DataNSamples = cms.untracked.vint32(),
    saveQIE11DataTags = cms.untracked.vstring(),

@mariadalfonso

Let's forget about anything before 2018.

--> The MC extended the presamples only for Run 3
https://github.com/cms-sw/cmssw/pull/29035/files#diff-dc0f44be94ecf8e8b308650de30eeeb9R167
because of physics reasons.

--> DATA stays at 8, likely because of the bandwidth out of P5.

All (SOI, nFrame) should be transparent in RECO, because the Frame contains the SOI/NSamples (no matter whether it is MC or DATA).
Only the parameters of the templates should be input parameters.
I will have to go through the code for a major cleanup of all the things duplicated and/or hardcoded.


fwyzard commented Aug 6, 2020

this must be configured somewhere else...

those are already the fully expanded dumps for the cmsDriver.py configurations 🤔

I will have to go through the code for a major cleanup of all the things duplicated and/or hardcoded.

OK, for the time being I'll check if this works:

# Only for Run 3 MC; Run 2 MC and Run 2/3 data use 8 time samples
from Configuration.Eras.Modifier_run3_HB_cff import run3_HB
run3_HB.toModify(hcalDigisGPU,
    nsamplesF01HE = 10,
    nsamplesF3HB = 10,
    nsamplesF5HB = 10
)

(I think run3_HB is not the correct modifier, but it's the closest one I could find...)

@vkhristenko
Author

https://github.com/cms-sw/cmssw/blob/master/EventFilter/HcalRawToDigi/plugins/HcalRawToDigi.cc#L117
https://github.com/cms-sw/cmssw/blob/master/EventFilter/HcalRawToDigi/src/HcalUnpacker.cc#L780

This is used here and then during decoding... I'm not sure about the confs... but when debugging all that stuff, this is what I found... in other words, you have to match what sits in the data with your expectations...


fwyzard commented Aug 6, 2020

@vkhristenko in your GPU code, do you use the number of time samples only to allocate enough memory ?
would it work to set 10 samples also for running over data, which has only 8 ?

@vkhristenko

Nope, this actually sets the stride for the access, and it should match what is expected. As I said above, in principle this can be lifted... but the easiest for me was to add these configurable things...
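
As an aside, the stride logic can be illustrated with a small sketch. This is not the CMSSW code: the helper names and the flavor 0/1 header size are my assumptions; only the flavor-5 layout (one header word, two 7-bit ADC samples per 16-bit word) is taken from the diff further down in this thread.

```cpp
#include <cstdint>

// Illustrative layout constants; kF5HeaderWords matches the
// static_assert in the diff below, kF01HeaderWords is a guess.
constexpr uint32_t kF5HeaderWords = 1;
constexpr uint32_t kF01HeaderWords = 2;

// Flavor 5 packs two ADC samples into each 16-bit data word,
// so the per-channel stride is header + ceil(nsamples / 2).
constexpr uint32_t strideFlavor5(uint32_t nsamples) {
  return kF5HeaderWords + (nsamples + 1) / 2;
}

// A flavor with one 16-bit word per sample after the header.
constexpr uint32_t strideFlavor01(uint32_t nsamples) {
  return kF01HeaderWords + nsamples;
}
```

With this layout, going from 8 to 10 samples changes the flavor-5 stride from 5 to 6 words; a kernel that indexes `channel * stride` with a stride that does not match the data then reads or writes out of bounds, which is exactly what cuda-memcheck flags.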


@fwyzard fwyzard changed the title fixing issue #520 Use 10 time samples for the HBHE digis in Run 3 Aug 6, 2020

fwyzard commented Aug 6, 2020

The test should be fine, since it should use Run 2 settings for running on data and Run 3 settings for MC... still, I'll ask HCAL how this is handled in the current code to figure out the best way to set the parameters.

@vkhristenko of course if (later on) you can make the code detect the correct number from the data itself, even better :-)


mariadalfonso commented Aug 6, 2020

This is what should be supported at DIGI:
// maxTS = 10, SOI = 5 (new, for better modeling of SiPM noise in MC): Run 3 MC
// maxTS = 8, SOI = 3 (typical 8-TS situation in data): Run 3 data
// maxTS = 8, SOI = 3: Run 2018 data/MC

In the above cases, for RECO we do the fit on a redefined 8-TS window, and the SOI is always at TS3.
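
As a sketch of that window re-definition (my own illustration, not CMSSW code; `windowStart` and the constants are hypothetical names): given a frame that carries its own number of samples and SOI, the reconstruction window can be chosen so the SOI always lands at TS3.

```cpp
constexpr int kWindowSize = 8;  // the fit always uses 8 TS
constexpr int kTargetSOI = 3;   // with the SOI at TS3 of the window

// First sample of the 8-TS window, clamped to the frame boundaries.
constexpr int windowStart(int nsamples, int soi) {
  int start = soi - kTargetSOI;
  if (start < 0) start = 0;
  if (start + kWindowSize > nsamples) start = nsamples - kWindowSize;
  return start;
}
```

A 10-TS Run 3 MC frame with SOI = 5 then yields a window starting at TS2 (samples 2..9), while an 8-TS frame with SOI = 3 uses the whole frame, matching the cases listed above.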


vkhristenko commented Aug 6, 2020 via email


fwyzard commented Aug 7, 2020

@vkhristenko this PR as it is fixes the issue on Run 3 MC (with 10 time samples), and looks fine on data (which still uses 8 time samples).
I would merge it, and let you work in parallel on a more automatic implementation, if that's ok for you ?

@vkhristenko

Ok, I never thought that would make such a huge difference... let me fix that now, apologies...


makortel commented Aug 7, 2020

IIRC cudaFreeHost() introduces a global synchronization point, so when performance depends on concurrent processing, a significant impact is not that surprising.


fwyzard commented Aug 7, 2020

diff --git a/EventFilter/HcalRawToDigi/plugins/HcalDigisProducerGPU.cc b/EventFilter/HcalRawToDigi/plugins/HcalDigisProducerGPU.cc
index 2f79bbd7867e..f64200a40495 100644
--- a/EventFilter/HcalRawToDigi/plugins/HcalDigisProducerGPU.cc
+++ b/EventFilter/HcalRawToDigi/plugins/HcalDigisProducerGPU.cc
@@ -15,7 +15,7 @@
 class HcalDigisProducerGPU : public edm::stream::EDProducer<edm::ExternalWork> {
 public:
   explicit HcalDigisProducerGPU(edm::ParameterSet const& ps);
-  ~HcalDigisProducerGPU() override;
+  ~HcalDigisProducerGPU() override = default;
   static void fillDescriptions(edm::ConfigurationDescriptions&);
 
 private:
@@ -53,6 +53,11 @@ private:
   };
   ConfigParameters config_;
 
+  // temporary host buffers
+  HostCollectionf01 hf01_;
+  HostCollectionf5 hf5_;
+  HostCollectionf3 hf3_;
+
   // device products: product owns memory (i.e. not the module)
   DeviceCollectionf01 df01_;
   DeviceCollectionf5 df5_;
@@ -84,9 +89,12 @@ HcalDigisProducerGPU::HcalDigisProducerGPU(const edm::ParameterSet& ps)
   config_.maxChannelsF01HE = ps.getParameter<uint32_t>("maxChannelsF01HE");
   config_.maxChannelsF5HB = ps.getParameter<uint32_t>("maxChannelsF5HB");
   config_.maxChannelsF3HB = ps.getParameter<uint32_t>("maxChannelsF3HB");
-}
 
-HcalDigisProducerGPU::~HcalDigisProducerGPU() {}
+  // preallocate the host buffers
+  hf01_.reserve(config_.maxChannelsF01HE);
+  hf5_.reserve(config_.maxChannelsF5HB);
+  hf3_.reserve(config_.maxChannelsF3HB);
+}
 
 void HcalDigisProducerGPU::acquire(edm::Event const& event,
                                    edm::EventSetup const& setup,
@@ -94,6 +102,10 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
   // raii
   cms::cuda::ScopedContextAcquire ctx{event.streamID(), std::move(holder), cudaState_};
 
+  // clear the host buffers
+  hf01_.clear();
+  hf5_.clear();
+  hf3_.clear();
 
   // event data
   edm::Handle<HBHEDigiCollection> hbheDigis;
@@ -101,17 +113,11 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
   event.getByToken(hbheDigiToken_, hbheDigis);
   event.getByToken(qie11DigiToken_, qie11Digis);
   
-  // tmp host data
-  HostCollectionf01 hf01;
-  HostCollectionf5 hf5;
-  HostCollectionf3 hf3;
-
   // init f5 collection
   if (hbheDigis->size() > 0) {
     auto const nsamples = (*hbheDigis)[0].size();
     auto const stride = hcal::compute_stride<hcal::Flavor5>(nsamples);
-    hf5.stride = stride;
-    hf5.reserve(config_.maxChannelsF5HB);
+    hf5_.stride = stride;
   
     // flavor5 get device blobs
     df5_.stride = stride;
@@ -126,10 +132,8 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
     auto const stride01 = hcal::compute_stride<hcal::Flavor01>(nsamples);
     auto const stride3 = hcal::compute_stride<hcal::Flavor3>(nsamples);
 
-    hf01.stride = stride01;
-    hf3.stride = stride3;
-    hf01.reserve(config_.maxChannelsF01HE);
-    hf3.reserve(config_.maxChannelsF3HB);
+    hf01_.stride = stride01;
+    hf3_.stride = stride3;
   
     // flavor 0/1 get devie blobs
     df01_.stride = stride01;
@@ -147,19 +151,19 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
   for (auto const& hbhe : *hbheDigis) {
     auto const id = hbhe.id().rawId();
     auto const presamples = hbhe.presamples();
-    hf5.ids.push_back(id);
-    hf5.npresamples.push_back(presamples);
+    hf5_.ids.push_back(id);
+    hf5_.npresamples.push_back(presamples);
     auto const stride = hcal::compute_stride<hcal::Flavor5>(hbhe.size());
-    assert(stride == hf5.stride && "strides must be the same for every single digi of the collection");
+    assert(stride == hf5_.stride && "strides must be the same for every single digi of the collection");
     // simple for now...
     static_assert(hcal::Flavor5::HEADER_WORDS == 1);
     uint16_t header_word = (1 << 15) | (0x5 << 12) | (0 << 10) | ((hbhe.sample(0).capid() & 0x3) << 8);
-    hf5.data.push_back(header_word);
+    hf5_.data.push_back(header_word);
     for (unsigned int i = 0; i < stride - hcal::Flavor5::HEADER_WORDS; i++) {
       uint16_t s0 = (0 << 7) | (static_cast<uint8_t>(hbhe.sample(2 * i).adc()) & 0x7f);
       uint16_t s1 = (0 << 7) | (static_cast<uint8_t>(hbhe.sample(2 * i + 1).adc()) & 0x7f);
       uint16_t sample = (s1 << 8) | s0;
-      hf5.data.push_back(sample);
+      hf5_.data.push_back(sample);
     }
   }
 
@@ -170,21 +174,21 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
       if (digi.detid().subdetId() != HcalEndcap)
         continue;
       auto const id = digi.detid().rawId();
-      hf01.ids.push_back(id);
+      hf01_.ids.push_back(id);
       for (int hw = 0; hw < hcal::Flavor01::HEADER_WORDS; hw++)
-        hf01.data.push_back((*qie11Digis)[i][hw]);
+        hf01_.data.push_back((*qie11Digis)[i][hw]);
       for (int sample = 0; sample < digi.samples(); sample++) {
-        hf01.data.push_back((*qie11Digis)[i][hcal::Flavor01::HEADER_WORDS + sample]);
+        hf01_.data.push_back((*qie11Digis)[i][hcal::Flavor01::HEADER_WORDS + sample]);
       }
     } else if (digi.flavor() == 3) {
       if (digi.detid().subdetId() != HcalBarrel)
         continue;
       auto const id = digi.detid().rawId();
-      hf3.ids.push_back(id);
+      hf3_.ids.push_back(id);
       for (int hw = 0; hw < hcal::Flavor3::HEADER_WORDS; hw++)
-        hf3.data.push_back((*qie11Digis)[i][hw]);
+        hf3_.data.push_back((*qie11Digis)[i][hw]);
       for (int sample = 0; sample < digi.samples(); sample++) {
-        hf3.data.push_back((*qie11Digis)[i][hcal::Flavor3::HEADER_WORDS + sample]);
+        hf3_.data.push_back((*qie11Digis)[i][hcal::Flavor3::HEADER_WORDS + sample]);
       }
     }
   }
@@ -198,19 +202,19 @@ void HcalDigisProducerGPU::acquire(edm::Event const& event,
     cudaCheck(cudaMemcpyAsync(dest, src.data(), src.size() * sizeof(type), cudaMemcpyHostToDevice, ctx.stream()));
   };
 
-  lambdaToTransfer(df01_.data.get(), hf01.data);
-  lambdaToTransfer(df01_.ids.get(), hf01.ids);
+  lambdaToTransfer(df01_.data.get(), hf01_.data);
+  lambdaToTransfer(df01_.ids.get(), hf01_.ids);
 
-  lambdaToTransfer(df5_.data.get(), hf5.data);
-  lambdaToTransfer(df5_.ids.get(), hf5.ids);
-  lambdaToTransfer(df5_.npresamples.get(), hf5.npresamples);
+  lambdaToTransfer(df5_.data.get(), hf5_.data);
+  lambdaToTransfer(df5_.ids.get(), hf5_.ids);
+  lambdaToTransfer(df5_.npresamples.get(), hf5_.npresamples);
 
-  lambdaToTransfer(df3_.data.get(), hf3.data);
-  lambdaToTransfer(df3_.ids.get(), hf3.ids);
+  lambdaToTransfer(df3_.data.get(), hf3_.data);
+  lambdaToTransfer(df3_.ids.get(), hf3_.ids);
 
-  df01_.size = hf01.ids.size();
-  df5_.size = hf5.ids.size();
-  df3_.size = hf3.ids.size();
+  df01_.size = hf01_.ids.size();
+  df5_.size = hf5_.ids.size();
+  df3_.size = hf3_.ids.size();
 }
 
 void HcalDigisProducerGPU::produce(edm::Event& event, edm::EventSetup const& setup) {

@vkhristenko

Added; the only main difference wrt @fwyzard's diff is that the stride is set before the reserve call... for the statically known maximum... let me know!
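
To make the flavor-5 packing in the diff above concrete, here is a hedged round-trip sketch (the pack/unpack helpers are my own names, not CMSSW functions): each 16-bit data word carries two 7-bit ADC values, with bit 7 of each byte left as a flag bit (0 in the diff).

```cpp
#include <cstdint>
#include <utility>

// Pack two 7-bit ADC values into one 16-bit flavor-5 data word,
// mirroring the (s1 << 8) | s0 expression in the diff above.
constexpr uint16_t packFlavor5(uint8_t adc0, uint8_t adc1) {
  const uint16_t s0 = adc0 & 0x7f;
  const uint16_t s1 = adc1 & 0x7f;
  return static_cast<uint16_t>((s1 << 8) | s0);
}

// Recover the two ADC values from a packed word.
constexpr std::pair<uint8_t, uint8_t> unpackFlavor5(uint16_t word) {
  return {static_cast<uint8_t>(word & 0x7f),
          static_cast<uint8_t>((word >> 8) & 0x7f)};
}
```

Round-tripping a pair of values through these two helpers returns the original ADCs, which is a quick way to sanity-check the bit layout when changing the number of samples per word.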


fwyzard commented Aug 7, 2020

Validation summary

Reference release CMSSW_11_1_0 at b7ad279
Development branch cms-patatrack/CMSSW_11_1_X_Patatrack at fcdcb51
Testing PRs:

Validation plots

/RelValTTbar_14TeV/CMSSW_11_1_0_pre8-PU_111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

/RelValZMM_14/CMSSW_11_1_0_pre8-111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

/RelValZEE_14/CMSSW_11_1_0_pre8-111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

Throughput plots

/EphemeralHLTPhysics1/Run2018D-v1/RAW run=323775 lumi=53

scan-136.885522.png
zoom-136.885522.png

logs and nvprof/nvvp profiles

/RelValTTbar_14TeV/CMSSW_11_1_0_pre8-PU_111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • cuda-memcheck --tool initcheck (report, log) found 982096 errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885522
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • cuda-memcheck --tool synccheck (report, log) found no CUDA-MEMCHECK results
  • testing release, workflow 136.885522

/RelValZMM_14/CMSSW_11_1_0_pre8-111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • cuda-memcheck --tool initcheck (report, log) found 48696 errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885522
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 136.885522

/RelValZEE_14/CMSSW_11_1_0_pre8-111X_mcRun3_2021_realistic_v4-v1/GEN-SIM-DIGI-RAW

  • reference release, workflow 11634.5
  • development release, workflow 11634.521
  • development release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • cuda-memcheck --tool initcheck (report, log) found 26616 errors
    • ✔️ cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) did not find any errors
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • development release, workflow 136.885522
  • testing release, workflow 11634.521
  • testing release, workflow 11634.522
    • ✔️ step3.py: log
    • ✔️ profile.py: log
    • ✔️ cuda-memcheck --tool initcheck (report, log) did not find any errors
    • cuda-memcheck --tool memcheck --leak-check full --report-api-errors all (report, log) found no CUDA-MEMCHECK results
    • ✔️ cuda-memcheck --tool synccheck (report, log) did not find any errors
  • testing release, workflow 136.885522

Logs

The full log is available at https://patatrack.web.cern.ch/patatrack/validation/pulls/a0c322c7d08669abebd0c760126f04136d0cd2cd/log .


fwyzard commented Aug 8, 2020

CUDA is still (again ?) not happy :-(


vkhristenko commented Aug 8, 2020 via email


fwyzard commented Aug 8, 2020

The last round of validation spotted problems with cuda-memcheck here and here.

Just check the ❌ under "testing release".

@vkhristenko

I'm not seeing these issues (--tool initcheck); I'm running on TTbar...


fwyzard commented Aug 8, 2020

Running step3.py on TTbar, I can reproduce the crash with the synccheck tool:

cuda-memcheck --tool synccheck --log-file synccheck.log -- cmsRun step3.py
...
08-Aug-2020 17:10:38 CEST  Initiating request to open file file:/gpu_data/store/relval/CMSSW_11_1_0_pre8/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_111X_mcRun3_2021_realistic_v4-v1/20000/6767846A-04AA-AD40-BDAB-407450210E53.root
08-Aug-2020 17:10:39 CEST  Successfully opened file file:/gpu_data/store/relval/CMSSW_11_1_0_pre8/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_111X_mcRun3_2021_realistic_v4-v1/20000/6767846A-04AA-AD40-BDAB-407450210E53.root
Begin processing the 1st record. Run 1, Event 6107, LumiSection 62 on stream 6 at 08-Aug-2020 17:10:53.325 CEST
Begin processing the 2nd record. Run 1, Event 6101, LumiSection 62 on stream 1 at 08-Aug-2020 17:10:53.338 CEST
Begin processing the 3rd record. Run 1, Event 6106, LumiSection 62 on stream 0 at 08-Aug-2020 17:10:53.338 CEST
Begin processing the 4th record. Run 1, Event 6104, LumiSection 62 on stream 4 at 08-Aug-2020 17:10:53.340 CEST
Begin processing the 5th record. Run 1, Event 6102, LumiSection 62 on stream 3 at 08-Aug-2020 17:10:53.341 CEST
Begin processing the 6th record. Run 1, Event 6108, LumiSection 62 on stream 7 at 08-Aug-2020 17:10:53.342 CEST
Begin processing the 7th record. Run 1, Event 6103, LumiSection 62 on stream 5 at 08-Aug-2020 17:10:53.342 CEST
Begin processing the 8th record. Run 1, Event 6105, LumiSection 62 on stream 2 at 08-Aug-2020 17:10:53.343 CEST
Begin processing the 9th record. Run 1, Event 6115, LumiSection 62 on stream 7 at 08-Aug-2020 17:10:53.980 CEST
terminate called after throwing an instance of 'cms::cuda::bad_alloc'
  what():  an illegal memory access was encountered


A fatal system signal has occurred: abort signal
The following is the call stack containing the origin of the signal.

However, this might be a bug in the tool rather than a problem in our code.
Running with the more recent Compute Sanitizer does not reveal any problems:

$ /usr/local/cuda-11.0/Sanitizer/compute-sanitizer --tool synccheck --log-file synccheck.log cmsRun step3.py
...
08-Aug-2020 17:14:53 CEST  Initiating request to open file file:/gpu_data/store/relval/CMSSW_11_1_0_pre8/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_111X_mcRun3_2021_realistic_v4-v1/20000/6767846A-04AA-AD40-BDAB-407450210E53.root
08-Aug-2020 17:14:54 CEST  Successfully opened file file:/gpu_data/store/relval/CMSSW_11_1_0_pre8/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_111X_mcRun3_2021_realistic_v4-v1/20000/6767846A-04AA-AD40-BDAB-407450210E53.root
Begin processing the 1st record. Run 1, Event 6107, LumiSection 62 on stream 5 at 08-Aug-2020 17:15:08.419 CEST
Begin processing the 2nd record. Run 1, Event 6101, LumiSection 62 on stream 0 at 08-Aug-2020 17:15:08.432 CEST
Begin processing the 3rd record. Run 1, Event 6106, LumiSection 62 on stream 2 at 08-Aug-2020 17:15:08.433 CEST
Begin processing the 4th record. Run 1, Event 6104, LumiSection 62 on stream 3 at 08-Aug-2020 17:15:08.435 CEST
Begin processing the 5th record. Run 1, Event 6102, LumiSection 62 on stream 6 at 08-Aug-2020 17:15:08.435 CEST
Begin processing the 6th record. Run 1, Event 6108, LumiSection 62 on stream 4 at 08-Aug-2020 17:15:08.436 CEST
Begin processing the 7th record. Run 1, Event 6103, LumiSection 62 on stream 1 at 08-Aug-2020 17:15:08.437 CEST
Begin processing the 8th record. Run 1, Event 6105, LumiSection 62 on stream 7 at 08-Aug-2020 17:15:08.438 CEST
Begin processing the 9th record. Run 1, Event 6115, LumiSection 62 on stream 6 at 08-Aug-2020 17:15:09.016 CEST
Begin processing the 10th record. Run 1, Event 6116, LumiSection 62 on stream 0 at 08-Aug-2020 17:15:09.016 CEST
...
Begin processing the 99th record. Run 1, Event 6197, LumiSection 62 on stream 0 at 08-Aug-2020 17:15:21.639 CEST
Begin processing the 100th record. Run 1, Event 6199, LumiSection 62 on stream 6 at 08-Aug-2020 17:15:21.639 CEST
08-Aug-2020 17:15:22 CEST  Closed file file:/gpu_data/store/relval/CMSSW_11_1_0_pre8/RelValTTbar_14TeV/GEN-SIM-DIGI-RAW/PU_111X_mcRun3_2021_realistic_v4-v1/20000/6767846A-04AA-AD40-BDAB-407450210E53.root

and

$ cat synccheck.log
========= COMPUTE-SANITIZER
========= ERROR SUMMARY: 0 errors

Unfortunately the version bundled with CUDA in CMSSW is incomplete; I'll have to fix that before we can switch to it in the validation tests.


fwyzard commented Aug 8, 2020

As for the memcheck and leak check on the Zee sample, the error is not 100% reproducible; running interactively, I got the error only once.

Otherwise, the report is

========= CUDA-MEMCHECK
========= LEAK SUMMARY: 0 bytes leaked in 0 allocations
========= ERROR SUMMARY: 0 errors

@fwyzard fwyzard merged commit 66fe2d8 into cms-patatrack:CMSSW_11_1_X_Patatrack Aug 8, 2020

fwyzard commented Aug 8, 2020

Merged in CMSSW_11_1_X and CMSSW_11_2_X .
