
large start-of-job memory spikes #28780

Closed
davidlange6 opened this issue Jan 23, 2020 · 14 comments · Fixed by #31220

Comments

@davidlange6
Contributor

Hi,
running a random reco workflow, I notice there is a (new to me) memory spike:

[plot: memory usage over the job, showing a large spike at the start]

Ignore the two lines; the test I was doing was unrelated.

Anyway, I also happened to catch this in IgProf. Is there some way to limit the memory this XML unpacking can take? I presume neither the input file nor the output data structure is anywhere near this big.

69.3 3'279'594'277 661'891 createGBRForest(edm::FileInPath const&) [19]
59.9 2'836'863'761 384'000 edm::WorkerMaker::makeModule(edm::ParameterSet const&) const [20]
59.9 2'836'863'432 383'995 lowptgsfeleseed::HeavyObjectCache::HeavyObjectCache(edm::ParameterSet const&) [21]
52.7 2'494'229'198 387'472 tinyxml2::XMLDocument::Parse(char const*, unsigned long) [22]
34.6 1'639'647'264 398'718 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*) [23]
34.6 1'639'647'264 398'718 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'2 [24]
34.6 1'639'647'264 398'718 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'3 [25]
34.6 1'638'268'224 398'380 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'4 [26]
34.5 1'632'935'248 397'099 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'5 [27]
34.3 1'622'567'168 394'608 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'6 [28]
34.0 1'607'661'296 391'057 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'7 [29]
33.6 1'591'944'368 387'253 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'8 [30]
33.2 1'573'121'936 382'727 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'9 [31]
32.7 1'547'707'040 376'534 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'10 [32]
31.9 1'509'950'288 367'307 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'11 [33]
30.7 1'455'860'976 354'097 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'12 [34]
30.1 1'425'838'160 346'875 tinyxml2::XMLElement::ParseDeep(char*, tinyxml2::StrPair*, int*) [35]
30.1 1'425'838'160 346'875 tinyxml2::XMLElement::ParseAttributes(char*, int*) [36]
30.1 1'425'838'160 346'875 tinyxml2::XMLElement::CreateAttribute() [37]
29.0 1'371'421'120 333'412 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'13 [38]
26.4 1'249'606'000 303'721 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'14 [39]
22.8 1'081'623'120 262'619 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'15 [40]
19.6 927'220'461 82 reco::details::readGzipFile(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) [41]
18.2 864'046'000 209'733 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'16 [42]
12.7 599'349'392 146'039 tinyxml2::XMLNode::ParseDeep(char*, tinyxml2::StrPair*, int*)'17 [43]

@cmsbuild
Contributor

A new Issue was created by @davidlange6 David Lange.

@Dr15Jones, @smuzaffar, @silviodonato, @makortel, @davidlange6, @fabiocos can you please review it and eventually sign/assign? Thanks.

cms-bot commands are listed here

@slava77
Contributor

slava77 commented Jan 23, 2020

@davidlange6
was the memory profile from a single-threaded job running some production-like reco?
I'm trying to understand whether this jump above the baseline at the start of the job is a concern for production.

See
#24432 (comment)
for CPU- and memory-related notes on the tinyxml2 parts. The additional allocations needed inside tinyxml2 compared to the TMVAReader were observed during integration and seemed acceptable given the other gains.

@davidlange6
Contributor Author

davidlange6 commented Jan 23, 2020 via email

@makortel
Contributor

assign reconstruction, analysis

@cmsbuild
Contributor

New categories assigned: analysis,reconstruction

@slava77,@santocch,@perrotta you have been requested to review this Pull request/Issue and eventually sign? Thanks

@makortel
Contributor

@davidlange6 Is the y-axis of your plot RSS in the units of bytes?

@davidlange6
Contributor Author

davidlange6 commented Jan 23, 2020 via email

@makortel
Contributor

@Dr15Jones and I took a look based on the IgProf stack trace. The lowptgsfeleseed::HeavyObjectCache is used as a GlobalCache here

class LowPtGsfElectronSeedProducer final
    : public edm::stream::EDProducer<edm::GlobalCache<lowptgsfeleseed::HeavyObjectCache> > {

Its constructor uses createGBRForest() to read the models:

for (auto& weights : conf.getParameter<std::vector<std::string> >("ModelWeights")) {
  models_.push_back(createGBRForest(edm::FileInPath(weights)));
}

which itself calls the init() function:

if (weightsFile[0] == '/') {
  gbrForest = init(weightsFile, varNames);
} else {
  edm::FileInPath weightsFileEdm(weightsFile);
  gbrForest = init(weightsFileEdm.fullPath(), varNames);
}

which then parses the XML document with DOM:

tinyxml2::XMLDocument xmlDoc;
using namespace reco::details;
if (hasEnding(weightsFileFullPath, ".xml")) {
  xmlDoc.LoadFile(weightsFileFullPath.c_str());
} else if (hasEnding(weightsFileFullPath, ".gz") || hasEnding(weightsFileFullPath, ".gzip")) {
  char* buffer = readGzipFile(weightsFileFullPath);
  xmlDoc.Parse(buffer);
  free(buffer);
}

Based on the cfi file

ModelWeights = cms.vstring([
    'RecoEgamma/ElectronIdentification/data/LowPtElectrons/RunII_Autumn18_LowPtElectrons_unbiased.xml.gz',
    'RecoEgamma/ElectronIdentification/data/LowPtElectrons/RunII_Autumn18_LowPtElectrons_displaced_pt_eta_biased.xml.gz',
]),

I guess the XML files in question are
https://github.com/cms-data/RecoEgamma-ElectronIdentification/blob/master/LowPtElectrons/RunII_Autumn18_LowPtElectrons_unbiased.xml.gz
https://github.com/cms-data/RecoEgamma-ElectronIdentification/blob/master/LowPtElectrons/RunII_Autumn18_LowPtElectrons_displaced_pt_eta_biased.xml.gz
whose uncompressed sizes are 298 MB each. It is not really surprising that reading such large documents with DOM leads to large memory use, even if only temporarily.

Looking at the body of the init() function, it seems that the code reads

  • MethodSetup.GeneralInfo, which appears to be small
  • MethodSetup.Options, which appears to be small
  • all BinaryTree elements of MethodSetup.Weights in a loop; these are large
    • the first file has 490 BinaryTree elements, and the second one 783

We wonder whether a SAX parser could be used here to avoid holding the entire XML document in memory as a DOM. From a quick look at the code, SAX does not look impossible, but it would likely require some amount of work; a rough sketch of the idea follows.
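
For illustration only (this is not the existing code): a minimal sketch of a streaming, SAX-style pass over the weights XML using Expat, which keeps memory flat by reacting to elements as they appear instead of building a DOM. The "BinaryTree" element name is taken from the discussion above; the chunk size and reading from stdin are arbitrary choices for the sketch.

#include <expat.h>
#include <cstdio>
#include <cstring>
#include <iostream>

// Minimal streaming pass: count BinaryTree elements without building a DOM.
// A real implementation would build one GBRForest tree per element here.
struct ParseState {
  int binaryTrees = 0;
};

static void XMLCALL startElement(void* userData, const XML_Char* name, const XML_Char** /*attrs*/) {
  auto* state = static_cast<ParseState*>(userData);
  if (std::strcmp(name, "BinaryTree") == 0) {
    ++state->binaryTrees;
  }
}

static void XMLCALL endElement(void* /*userData*/, const XML_Char* /*name*/) {}

int main() {
  ParseState state;
  XML_Parser parser = XML_ParserCreate(nullptr);
  XML_SetUserData(parser, &state);
  XML_SetElementHandler(parser, startElement, endElement);

  // Feed the (already decompressed) XML in fixed-size chunks; only the chunk
  // plus the parser's own bookkeeping lives in memory at any one time.
  char buffer[64 * 1024];
  size_t len;
  while ((len = std::fread(buffer, 1, sizeof(buffer), stdin)) > 0) {
    if (XML_Parse(parser, buffer, static_cast<int>(len), XML_FALSE) == XML_STATUS_ERROR) {
      std::cerr << "XML parse error\n";
      break;
    }
  }
  XML_Parse(parser, nullptr, 0, XML_TRUE);  // signal end of document
  XML_ParserFree(parser);
  std::cout << "BinaryTree elements: " << state.binaryTrees << '\n';
  return 0;
}

The trade-off is that the forest-building logic would have to be restructured around callbacks instead of navigating a document tree, which is the "some amount of work" mentioned above.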

@slava77
Contributor

slava77 commented Jan 23, 2020

@guitargeek
in case you are interested/available, please take a look.
Thank you.

@slava77
Contributor

slava77 commented Jan 23, 2020

@Dr15Jones and I took a look based on the IgProf stack trace. The lowptgsfeleseed::HeavyObjectCache is used as a GlobalCache here

So this is then independent of the number of threads.
In a normal 8-thread job with roughly a 10 GB per-job allocation, I would expect that this bump of 2.x GB is not a likely cause of trouble.
Still, an almost 3 GB bump in memory use seems excessive.

@slava77
Contributor

slava77 commented Jan 23, 2020

@mverzett
please take note, especially if some future updates may need a more complicated training.

@bendavid
Contributor

Can't this be solved just by pre-running the conversion to GBRForest and reading the ROOT file/conditions object instead of the XML file?
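
As a rough sketch of what that pre-conversion could look like, assuming GBRForest can be written to and read back from a TFile via its ROOT dictionary; the header paths, file names, and object key below are illustrative, not an actual implementation:

#include "CommonTools/MVAUtils/interface/GBRForestTools.h"  // createGBRForest
#include "CondFormats/GBRForest/interface/GBRForest.h"       // header location may differ by release
#include "FWCore/ParameterSet/interface/FileInPath.h"
#include "TFile.h"

#include <memory>
#include <string>

// One-time conversion: pay the expensive TMVA-XML parse once, offline,
// and persist the compact GBRForest object.
void convertXmlToRoot(const std::string& xmlPath, const std::string& rootPath) {
  std::unique_ptr<const GBRForest> forest = createGBRForest(edm::FileInPath(xmlPath));
  TFile out(rootPath.c_str(), "RECREATE");
  out.WriteObject(forest.get(), "gbrForest");
  out.Close();
}

// At job start-up: read the forest back with no XML parsing at all.
std::unique_ptr<const GBRForest> loadForestFromRoot(const std::string& rootPath) {
  TFile in(rootPath.c_str(), "READ");
  GBRForest* forest = nullptr;
  in.GetObject("gbrForest", forest);  // caller owns the returned object
  return std::unique_ptr<const GBRForest>(forest);
}

Storing the converted forest in the conditions DB instead of a plain ROOT file would follow the same idea, with the object delivered through the EventSetup rather than read from a local file.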

@guitargeek
Contributor

guitargeek commented Jan 23, 2020

I don't think improving the XML parsing is worth it, because this XML is already a workaround itself.

All these new egamma MVAs are trained with xgboost, which can save the model as a text file. Then, we manually converted these text files into the TMVA XML format because that's what cmssw could deal with.

I think the simpler solution here would be either to just serialize the GBRForests to ROOT files so that they can be loaded much faster, or to load the model from the original xgboost txt file (if @mverzett still has it) with a library that is very fast at parsing it, like my own one here:
https://github.com/guitargeek/XGBoost-FastForest

This FastForest is basically an even more optimized spin-off of the GBRForest, but it parses xgboost txt instead of TMVA xml. We could just copy-paste that code into cmssw.
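
For illustration, a rough usage sketch along the lines of the XGBoost-FastForest README; the exact API (fastforest::load_txt, the feature-name vector, calling the forest on a float array) should be checked against the repository, and the model path and feature names here are placeholders:

#include "fastforest.h"

#include <cmath>
#include <iostream>
#include <string>
#include <vector>

int main() {
  // Feature names as used in the xgboost training (placeholders here).
  std::vector<std::string> features{"f0", "f1", "f2", "f3", "f4"};

  // Parse the xgboost text dump once; this is far lighter than a DOM parse of TMVA XML.
  const auto forest = fastforest::load_txt("model.txt", features);

  // Inputs must be given in the same order as the feature names above.
  std::vector<float> input{0.1f, 0.2f, 0.3f, 0.4f, 0.5f};
  const float raw = forest(input.data());
  const float prob = 1.f / (1.f + std::exp(-raw));  // sigmoid, if a probability is wanted
  std::cout << "raw = " << raw << ", prob = " << prob << '\n';
  return 0;
}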

@slava77
Contributor

slava77 commented Jan 23, 2020

Can't this be solved just by pre-running the conversion to GBRForest and reading the ROOT file/conditions object instead of the XML file?

I like this proposal/option.
Perhaps the DB/conditions is a better option.
