GSoC 2022: Multiweight integration #125

Draft
wants to merge 57 commits into
base: substructure

kfan326
Contributor

@kfan326 kfan326 commented Sep 1, 2022

Context:
GSoC 2022 - Multiweight integration

Description of the Change:

This update extends the Cutflow and Histogram classes to hold data for n weights and writes the multidimensional histogram and cutflow data in SQLite3 format.

This draft is fully functional for both cutflows and histograms. The interface for the SQLite3 output format is still in progress.

Benefits:

This implementation writes Cutflow and Histogram data for all weights to a single SQLite3 file, which is stable in the long term (software updates will not change accessibility) and easily transferable. SQLite3 supports the core query commands of the SQL standard, so users can process the result data however they wish. The database file can be accessed independently of MadAnalysis5 and plotted in Excel or any plotting package of the user's choosing.
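As a rough sketch of what independent access could look like, the snippet below queries a cutflow with Python's standard sqlite3 module. The table and column names (`cutflow`, `region`, `cut`, `weight_id`, `sum_weights`) and the sample numbers are hypothetical and only for illustration; the schema actually written by this PR may differ. An in-memory database stands in for the output file.

```python
import sqlite3

# Hypothetical schema and data: the real tables written by this PR may differ.
conn = sqlite3.connect(":memory:")  # stand-in for the SQLite3 output file
conn.execute(
    "CREATE TABLE cutflow ("
    "region TEXT, cut TEXT, weight_id INTEGER, sum_weights REAL)"
)
rows = [
    ("SR1", "met > 100", 0, 120.0),
    ("SR1", "met > 100", 1, 118.5),
    ("SR1", "njets >= 2", 0, 80.0),
    ("SR1", "njets >= 2", 1, 79.25),
]
conn.executemany("INSERT INTO cutflow VALUES (?, ?, ?, ?)", rows)


def cutflow_for_weight(connection, weight_id):
    """Return (cut, sum_weights) pairs for one weight, in insertion order."""
    cursor = connection.execute(
        "SELECT cut, sum_weights FROM cutflow "
        "WHERE weight_id = ? ORDER BY rowid",
        (weight_id,),
    )
    return cursor.fetchall()


nominal = cutflow_for_weight(conn, 0)    # cutflow for weight index 0
variation = cutflow_for_weight(conn, 1)  # cutflow for weight index 1
```

With a real output file, `sqlite3.connect` would take the file path instead of `":memory:"`, and the query would use whatever table names the file actually contains.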

Possible Drawbacks:

An interface that detects whether SQLite3 is available on the system and, if not, offers to install it is still in progress. macOS includes SQLite3 by default, so this issue only applies to Linux users.

Existing analysis files need an update to SampleAnalyzer.cpp:

In the Execute function, the user needs to pass in the weight map as shown in the example below. The WeightCollection object supports the *, /, +, and - operators.

Example: to multiply all weights by a double, say 1.2345, the user can write the following.

EvMultiweight*=1.2345;

MAdouble64 EvWeight;
WeightCollection EvMultiweight;  // create a WeightCollection object to hold multi weights
if (Configuration().IsNoEventWeight()) {
    EvWeight = 1.;
    EvMultiweight = 1.;
} else if (event.mc()->weight() != 0.) {
    EvWeight = event.mc()->weight();
    EvMultiweight = event.mc()->multiweights();
} else {
    return false;
}

// Pass in the weight map object in addition to the single weight.
Manager()->InitializeForNewEvent(EvWeight, EvMultiweight.GetWeights());

The current implementation runs the single-weight and multi-weight paths simultaneously and writes their output independently (the existing SAF format is preserved and untouched; the multi-weight path writes Cutflow and Histogram SQLite3 files in their respective directories). The plan is to deprecate the single-weight implementation once numerical validation is complete. The current implementation is a little cumbersome, but it is fully functional and easier to debug should there be discrepancies between the existing and new implementations.

Related GitHub Issues:

kfan326 and others added 25 commits June 30, 2022 18:12
destructor now deletes allocated region selection objects when going out of scope.
…nager.h

Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
@jackaraz jackaraz changed the title from "Multi weight/multi thread" to "GSoC: Multiweight integration" Sep 17, 2022
@kfan326 kfan326 marked this pull request as ready for review September 23, 2022 03:18
kfan326 and others added 9 commits October 19, 2022 21:58
Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
Co-authored-by: Jack Y. Araz <jackaraz@gmail.com>
…nalyzer now passes in necessary data to output manager, further implementation only need to be added to the output manager
@jackaraz jackaraz added the ⚙️enhancement New feature or request label Jan 1, 2023
@jackaraz jackaraz changed the title from "GSoC: Multiweight integration" to "GSoC 2022: Multiweight integration" Jan 17, 2023
Member

@jackaraz jackaraz left a comment

Here are some comments that need resolving.

Member

This file should not be here!

Member

Database Manager is not a good name here; it suggests an extremely general structure. Please create two dedicated classes, one for cutflows and one for histogramming. The current structure does not look extensible, i.e. we might want to use SQLite for something completely different than writing histograms and cutflows, so dedicated code structures would be ideal. Additionally, this piece of code needs a bit of documentation, so please add a README.md file here explaining how to read the cutflow and histogram files generated by this code through Python.


def AutoDetection(self):
    # Which
    result = ShellCommand.Which('sqlite3',all=False,mute=True)
Member

Please also add
if importlib.util.find_spec("sqlite3") is None: ...
since a system might have an odd setup, e.g. with the C++ headers defined but the Python files missing.
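A minimal sketch of such a combined check is given below, assuming the goal is to report the executable and the Python module separately. The function name `detect_sqlite3` is hypothetical, and `shutil.which` is used only as a standard-library stand-in for the `ShellCommand.Which` call in the diff.

```python
import importlib.util
import shutil


def detect_sqlite3():
    """Report whether the sqlite3 executable and the Python module are available."""
    # Stand-in for ShellCommand.Which('sqlite3', ...): look up the binary on PATH.
    has_cli = shutil.which("sqlite3") is not None
    # The Python bindings can be missing even when the binary and headers exist.
    has_module = importlib.util.find_spec("sqlite3") is not None
    return {"cli": has_cli, "python_module": has_module}


status = detect_sqlite3()
```

Keeping the two results separate lets the caller decide which of them is actually required, for example the module for reading results from Python and the library for the C++ writer.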

}

// Add a cut to the CutFlow
void AddCut(std::string const &CutName)
{ cutflow_.InitCut(CutName); }

/// Getting ready for a new event
void InitializeForNewEvent(const MAfloat64 &weight)
void InitializeForNewEvent(const MAfloat64 &weight,const std::map<MAuint32, MAfloat64> &multiweight)
Member

This is not backwards compatible; it breaks all the code developed in the past! Please split this into two as follows:

/// Initializing the new event with a single weight
void InitializeForNewEvent(const MAfloat64 &weight);

/// Initializing the new event with multi-weight
void InitializeForNewEvent(const std::map<MAuint32, MAfloat64> &multiweight);

This way, we won't need to modify old implementations individually. You can test old implementations with the install PADForSFS command, which downloads a set of implementation codes into tools/PADForSFS/Build/SampleAnalyzer/User/Analyzer, so you can see which functions cannot be changed. All of those codes have to work.

@jackaraz jackaraz marked this pull request as draft March 21, 2023 12:26
Labels
⚙️enhancement New feature or request
Projects
Status: In Progress
2 participants