-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
GBRForest implementation without TMVA #24432
Conversation
The code-checks are being triggered in jenkins. |
-code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-24432/6260 Code check has found code style and quality issues which could be resolved by applying a patch in https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-24432/6260/git-diff.patch You can run |
The code-checks are being triggered in jenkins. |
+code-checks Logs: https://cmssdt.cern.ch/SDT/code-checks/cms-sw-PR-24432/6263 |
A new Pull Request was created by @guitargeek (Jonas Rembser) for master. It involves the following packages: CommonTools/Utils @perrotta, @smuzaffar, @Dr15Jones, @arunhep, @tocheng, @monttj, @cmsbuild, @gpetruc, @franzoni, @slava77, @ggovi, @pohsun, @arizzi, @lpernie can you please review it and eventually sign? Thanks. cms-bot commands are listed here |
@cmsbuild please test |
The tests are being triggered in jenkins. |
-1 Tested at: 0430eeb You can see the results of the tests here: I found follow errors while testing this PR Failed tests: Build HeaderConsistency
I found an error when building: >> Compiling /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/src/TrajSeedMatcher.cc >> Compiling /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/src/SiStripElectronAlgo.cc In file included from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/ElectronIdentification/interface/ElectronMVAEstimator.h:5:0, from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/interface/GsfElectronAlgoHeavyObjectCache.h:5, from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/src/GsfElectronAlgoHeavyObjectCache.cc:1: /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/poison/RecoEgamma/EgammaTools/interface/GBRForestTools.h:1:2: error: #error THIS FILE HAS BEEN REMOVED FROM THE PACKAGE. #error THIS FILE HAS BEEN REMOVED FROM THE PACKAGE. ^~~~~ In file included from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/ElectronIdentification/interface/SoftElectronMVAEstimator.h:9:0, from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/interface/GsfElectronAlgoHeavyObjectCache.h:6, from /build/cmsbld/jenkins/workspace/ib-any-integration/CMSSW_10_3_X_2018-08-31-1100/src/RecoEgamma/EgammaElectronAlgos/src/GsfElectronAlgoHeavyObjectCache.cc:1: |
Comparison not run due to Build errors (RelVals and Igprof tests were also skipped) |
+1 |
@@ -5,7 +5,7 @@ | |||
<use name="FWCore/Utilities"/> | |||
<use name="DataFormats/Candidate"/> | |||
<use name="DataFormats/PatCandidates"/> | |||
<use name="CommonTools/Utils"/> | |||
<use name="CommonTools/MVAUtils"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 |
@smuzaffar I see an error report in https://cmssdt.cern.ch/SDT/jenkins-artifacts/pull-request-integration/PR-24432/31245/headers_chks.log is it an internal failure of the check? I think we already saw something of this kind... |
+upgrade |
+1 |
Final Summary
GBRForest
data class itself and moved it toGBRForestTools
where the parser was completely rewritten. The XML file is now parsed directly instead of translating a TMVA::Reader. This reduces startup time and CPU usage since the tree structure is only iterated over once.GBRForestTools
which set up a GBRForest from (gzipped) XML files by parsing it with tinyxml2GBRForestWriter
to read GBRForests from DB (previously in CommonTools/Utils)TMVAEvaluator
andTMVAZipReader
(previously in CommonTools/Utils)PFEgammaAlgo
GBRForestTools
can use themOriginal Message
Hi all,
recently I re-implemented the GBRForest/GBRTree for a few reasons:
Reason 1. triggered my work here, because my Bayesian optimization studies for the electron ID resulted in > 20 MB weight files which I could not readily use with the current implementation.
I don't know if you want to merge it to master, but at least it would be nice to get it tested once at least so I feel confident in the results of my private studies where I use this implementation. But the bigger weight files will come sooner or later also to CMSSW :-)
Notes:
readGzipFile
function in CommonTools/Utils [3], because adding that package to the BuildFile of CondFormats/EgammaObjects would introduce a cyclic dependency. It's maybe not optimal that CommonTools/Utils houses both really low-level functionality like the reading of gzip files and higher level business that requires GBRForests itself.Cheers,
Jonas
[1] #24395
[2] https://github.com/leethomason/tinyxml2
[3] https://github.com/cms-sw/cmssw/blob/master/CommonTools/Utils/src/TMVAZipReader.cc#L21
[4] https://github.com/cms-data/RecoParticleFlow-PFProducer
[5] https://gist.github.com/guitargeek/a00c86ffd697c8c8aac5e6797a352a3c