Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix broken LCTs reported in HLT_L1SingleMu25 [11_1_X] #30914

Merged
merged 14 commits into from
Jul 31, 2020
Merged

Fix broken LCTs reported in HLT_L1SingleMu25 [11_1_X] #30914

merged 14 commits into from
Jul 31, 2020

Conversation

dildick
Copy link
Contributor

@dildick dildick commented Jul 25, 2020

PR description:

An inefficiency was reported in HLT_L1SingleMu25 in CMSSW_11_2_0_pre2, which was not present in CMSSW_11_2_0_pre1 (https://its.cern.ch/jira/browse/CMSLITDPG-903). Investigation showed only ME1/2 LCTs (oddly enough) with invalid pattern numbers. This was fixed with 9354dff.

Further investigation revealed that the CLCT processor readout function was not checking if CLCTs were valid, and that GEM-CSC motherboards do not check if incoming pads are valid. The code was also missing quality control. I added a number of checkValid functions that check the properties for ALCT/CLCT/LCT at various stages of the algorithm. LogErrors are printed in case values go out of bounds.

PR validation:

Tested on 10k events of

  • /RelValSingleMuFlatPt2To100/CMSSW_11_0_0-PU25ns_110X_mcRun4_realistic_v3_2026D49PU200-v1/GEN-SIM-DIGI-RAW
  • /RelValSingleMuFlatPt2To100/CMSSW_11_0_0-110X_mcRun4_realistic_v2_2026D49noPU-v1/GEN-SIM-DIGI-RAW
    I did not see LogErrors of invalid stubs.

if this PR is a backport please specify the original PR and why you need to backport that PR:

Backport of #30909.

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 25, 2020

A new Pull Request was created by @dildick (Sven Dildick) for CMSSW_11_1_X.

It involves the following packages:

DataFormats/CSCDigi
L1Trigger/CSCCommonTrigger
L1Trigger/CSCTriggerPrimitives

@cmsbuild, @rekovic, @benkrikler, @civanch, @mdhildreth can you please review it and eventually sign? Thanks.
@Martin-Grunewald, @ptcox, @valuev, @rovere this is something you requested to watch as well.
@silviodonato, @dpiparo, @qliphy you are the release manager for this.

cms-bot commands are listed here

@dildick dildick changed the title Fix broken ME1/2 LCTs reported in HLT_L1SingleMu25 Fix broken ME1/2 LCTs reported in HLT_L1SingleMu25 [11_1_X] Jul 25, 2020
@civanch
Copy link
Contributor

civanch commented Jul 26, 2020

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 26, 2020

The tests are being triggered in jenkins.

@cmsbuild
Copy link
Contributor

+1
Tested at: 5253476
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b170c8/8295/summary.html
CMSSW: CMSSW_11_1_X_2020-07-25-1100
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b170c8/8295/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 3 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2780792
  • DQMHistoTests: Total failures: 2
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2780740
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
  • Checked 152 log files, 16 edm output root files, 36 DQM output files

@dildick
Copy link
Contributor Author

dildick commented Jul 26, 2020

Interesting. I just tested this branch in CMSSW_11_1_X_2020-07-25-1100 on a 11_1_0_pre8 relval sample (in particular root://cmsxrootd-site.fnal.gov//store/relval/CMSSW_11_1_0_pre8/RelValZMM_14/GEN-SIM-DIGI-RAW/111X_mcRun3_2021_realistic_v4-v1/20000/E1B44039-321A-284E-91FB-97CD10F43A48.root).

The number of wiregroups the pretrigger loops on is constrained to [0,47]. Somehow the wiregroup number starts inflating past the bounds of the for-loop, ultimately causing a segfault in the anode processor pretrigger function.

...
%MSG-e CSCAnodeLCTProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  26-Jul-2020 15:58:09 CDT Run: 1 Event: 754
CSCALCTDigi with invalid wire-group: 799; allowed [0, 48]
%MSG
...
%MSG-e CSCAnodeLCTProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  26-Jul-2020 15:58:09 CDT Run: 1 Event: 754
CSCALCTDigi with invalid wire-group: 1242; allowed [0, 48]
%MSG
...
%MSG-e CSCAnodeLCTProcessor:   CSCTriggerPrimitivesProducer:simCscTriggerPrimitiveDigis  26-Jul-2020 15:58:09 CDT Run: 1 Event: 754
CSCALCTDigi with invalid wire-group: 1244; allowed [0, 48]
%MSG
...

Not sure what is causing this. The anode pretrigger function did not change recently. I also did not see such weird behavior in the corresponding 11_2_X branch (#30909). In fact, I don't recall ever seeing this.

It seems that - at the very least - I need to add another explicit check that the wiregroup in the for-loop is valid.

@dildick
Copy link
Contributor Author

dildick commented Jul 26, 2020

I noticed that if (lct.getKeyWG() > max_wire) in checkValid should be tightened to if (lct.getKeyWG() >= max_wire). E.g. max_wire returns 48 for ME1/1, but wiregroup numbering starts from 0.

@silviodonato
Copy link
Contributor

@rekovic ?

@ptcox
Copy link
Contributor

ptcox commented Jul 27, 2020 via email

@rekovic
Copy link
Contributor

rekovic commented Jul 27, 2020

hi @dildick
Regarding your comment #30914 (comment).
Do you see the same on the HLT TDR samples, and does this fix the inefficiency we have reported in [1] ?

Why would you not see this in 11_2_X is surprising to me. Is there anything in the geometry and counting that is different b/w 11_1_X and 11_2_X (@civanch @ptcox) ?

[1]
https://hypernews.cern.ch/HyperNews/CMS/get/L1TriggerUpgrades/385/2.html

@dildick
Copy link
Contributor Author

dildick commented Jul 27, 2020

@rekovic I'll check it now on an HLT TDR sample.

@cmsbuild
Copy link
Contributor

Pull request #30914 was updated. @cmsbuild, @rekovic, @benkrikler, @civanch, @mdhildreth can you please check and sign again.

@dildick
Copy link
Contributor Author

dildick commented Jul 31, 2020

A few LCT efficiency plots (not for publication) from /RelValSingleMuFlatPt2To100/CMSSW_11_0_0-PU25ns_110X_mcRun4_realistic_v3_2026D49PU200-v1/GEN-SIM-DIGI-RAW

Eff_CSCLCT_ME11
Eff_CSCLCT_ME12
Eff_CSCLCT_ME13
Eff_CSCLCT_ME21
Eff_CSCLCT_ME22
Eff_CSCLCT_ME31
Eff_CSCLCT_ME32
Eff_CSCLCT_ME41
Eff_CSCLCT_ME42

@dildick
Copy link
Contributor Author

dildick commented Jul 31, 2020

Efficiencies largely above 90%. Outliers are (1) ME1/2, which does not have upgraded processors or motherboard algorithms, (2) ME2/1 near |eta|~1.7. Despite loosening matching windows in GE2/1, I'm not yet able to recover this inefficiency. Requires more investigation.

@dildick
Copy link
Contributor Author

dildick commented Jul 31, 2020

@tahuang1991 Any idea why the GE2/1-ME2/1 algorithm would loose 20% efficiency near |eta|~1.7?

@rekovic
Copy link
Contributor

rekovic commented Jul 31, 2020

please test

@cmsbuild
Copy link
Contributor

cmsbuild commented Jul 31, 2020

The tests are being triggered in jenkins.

@fwyzard
Copy link
Contributor

fwyzard commented Jul 31, 2020

unhold

@cmsbuild cmsbuild removed the hold label Jul 31, 2020
@fwyzard
Copy link
Contributor

fwyzard commented Jul 31, 2020

@dildick thank you very much for the plots

I admit I am not familiar with them... do you have them before the fix ?

@dildick
Copy link
Contributor Author

dildick commented Jul 31, 2020

Previously, the ME3/1 and ME4/1 LCT efficiencies (on a PU0 SingleMu relval sample) looked like this:

Screen Shot 2020-07-31 at 2 24 28 AM

Screen Shot 2020-07-31 at 2 24 33 AM

@rekovic
Copy link
Contributor

rekovic commented Jul 31, 2020

@dildick
We had reports of TkMu efficiency of about 20% in the EndCap. Your plots show about 20% impovement in TP efficiency, and that is from 70+ to 90+.

Do you have any plots of change in CSC eta-phi coordinates before your work on the updating the algorithms and now ?
Those should not be affected, right ?

@cmsbuild
Copy link
Contributor

+1
Tested at: 6011f7f
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b170c8/8457/summary.html
CMSSW: CMSSW_11_1_X_2020-07-30-2300
SCRAM_ARCH: slc7_amd64_gcc820

@cmsbuild
Copy link
Contributor

Comparison job queued.

@silviodonato
Copy link
Contributor

urgent

@cmsbuild
Copy link
Contributor

Comparison is ready
https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-b170c8/8457/summary.html

Comparison Summary:

  • No significant changes to the logs found
  • Reco comparison results: 0 differences found in the comparisons
  • DQMHistoTests: Total files compared: 36
  • DQMHistoTests: Total histograms compared: 2780792
  • DQMHistoTests: Total failures: 5
  • DQMHistoTests: Total nulls: 0
  • DQMHistoTests: Total successes: 2780737
  • DQMHistoTests: Total skipped: 50
  • DQMHistoTests: Total Missing objects: 0
  • DQMHistoSizes: Histogram memory added: 0.0 KiB( 35 files compared)
  • Checked 152 log files, 16 edm output root files, 36 DQM output files

@silviodonato
Copy link
Contributor

merge

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants