Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

make gpu-cpu cluster matching deterministic #294

Merged

Conversation

VinInn
Copy link

@VinInn VinInn commented Mar 21, 2019

This PRs makes gpu-cpu cluster matching deterministic by intrusively marking CPU clusters with the cpu cluster index.
The size of SiPixelCluster does not increase neither in memory nor on disk (there was a 16 byte padding space that came handy...)

Obsolete code (comments and config param) in Raw2Cluster removed.
Took the opportunity to properly write the loop in the kernel

Recompiles a bit more than the whole universe: not something you want to carry on in your development area too long

minor regression expected (not in the GPU results, at most in TP matching).
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison).
can be downgraded to debug

@VinInn VinInn changed the title make cpu-cpu cluster matching deterministic make gpu-cpu cluster matching deterministic Mar 21, 2019
@@ -196,7 +197,10 @@ class SiPixelCluster {
float getSplitClusterErrorX() const { return err_x; }
float getSplitClusterErrorY() const { return err_y; }


// the original id (they get sorted)
auto oriId() const { return theClusId;}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call these originalId() and setOriginalId() ?

@@ -207,6 +211,8 @@ class SiPixelCluster {
uint16_t theMinPixelCol=MAXPOS; // Minimum pixel index in the y direction (left edge).
uint8_t thePixelRowSpan=0; // Span pixel index in the x direction (low edge).
uint8_t thePixelColSpan=0; // Span pixel index in the y direction (left edge).

uint16_t theClusId=std::numeric_limits<uint16_t>::max();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we call it theClusterId ?

@fwyzard
Copy link

fwyzard commented Mar 21, 2019

Thanks, I will rebuild at flatiron and run the usual checks... then it sounds like a good reason to cut the next release.

@VinInn
Copy link
Author

VinInn commented Mar 21, 2019

ok, I will edit here, my area is still compiling and this changes will trigger full recompilation again...
maybe Not, too many files to edit

@VinInn
Copy link
Author

VinInn commented Mar 21, 2019

modified, recompiled, tested, committed, pushed

@fwyzard
Copy link

fwyzard commented Mar 21, 2019

No impact on the throughput (as expected), running on a P100:

before: 1275.0 ± 2.8 ev/s
after:  1275.1 ± 2.7 ev/s

@fwyzard
Copy link

fwyzard commented Mar 21, 2019

Validation summary

Reference release CMSSW_10_5_0_pre2 at 4fd4ff5
Development branch CMSSW_10_5_X_Patatrack at 07607a4
Testing PRs:

makeTrackValidationPlots.py plots

/RelValTTbar_13/CMSSW_10_5_0_pre2-PU25ns_105X_upgrade2018_realistic_v2-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 10824.5
  • tracking validation plots and summary for workflow 10824.52
  • tracking validation plots and summary for workflow 10824.51
  • tracking validation plots and summary for workflow 10824.53

/RelValZMM_13/CMSSW_10_5_0_pre2-105X_upgrade2018_realistic_v2-v1/GEN-SIM-DIGI-RAW

  • tracking validation plots and summary for workflow 10824.5
  • tracking validation plots and summary for workflow 10824.52
  • tracking validation plots and summary for workflow 10824.51
  • tracking validation plots and summary for workflow 10824.53

logs and nvprof/nvvp profiles

/RelValTTbar_13/CMSSW_10_5_0_pre2-PU25ns_105X_upgrade2018_realistic_v2-v1/GEN-SIM-DIGI-RAW

/RelValZMM_13/CMSSW_10_5_0_pre2-105X_upgrade2018_realistic_v2-v1/GEN-SIM-DIGI-RAW

Logs

The full log is available at https://fwyzard.web.cern.ch/fwyzard/patatrack/pulls/64caa13b38f2fcd1bd7d48c26ed4ccf10ba58112/log .

@fwyzard fwyzard added the Pixels Pixels-related developments label Mar 21, 2019
@fwyzard
Copy link

fwyzard commented Mar 21, 2019

No changes in physics performance observed (as expected).

@fwyzard fwyzard merged commit fcf1dd0 into cms-patatrack:CMSSW_10_5_X_Patatrack Mar 21, 2019
@fwyzard fwyzard added this to the CMSSW_10_5_X_Patatrack milestone Mar 26, 2019
fwyzard pushed a commit that referenced this pull request Oct 8, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Oct 19, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Oct 20, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Oct 23, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Nov 6, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Nov 16, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard added a commit that referenced this pull request Nov 27, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Dec 25, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
fwyzard pushed a commit that referenced this pull request Dec 29, 2020
Makes the GPU-CPU cluster matching deterministic by intrusively marking CPU clusters with the cluster index.
Reuse existing padding space to store the extra transient field, so that the size of SiPixelCluster does not increase.
There is still a warning in case of mismatch of the content of the cluster (based on charge comparison), that can eventually be downgraded to a debug message.

Properly rewrite the loop in the RawToDigi_kernel .

Remove obsolete code (comments and configuration parameters) in SiPixelRawToClusterCUDA and SiPixelRawToClusterGPUKernel.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Pixels Pixels-related developments
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants