Patatrack integration - Pixel track reconstruction (10/N) #31722

Merged

Commits on Jan 15, 2021

  1. Use the gpu modifier to read the pixel clusters from the unpacker (#31)
    
    When running the GPU algorithms, the pixel unpacker is responsible
    for providing both the digis and the clusters.
    These changes make use of the unpacker label to access the clusters,
    conditionally on the presence of the `gpu` process modifier.
    fwyzard committed Jan 15, 2021
    ec0cc29
  2. Implement Riemann fit for pixel tracks (#34)

    Matrix operations are based on Eigen.
    
    A first GPU version, running Eigen together with CUDA, is available in the test directory but currently disabled.
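
As a rough illustration of the kind of problem a circle fit solves, here is a minimal sketch of a plain algebraic circle fit in pure Python. It is not the Eigen-based Riemann fit from this commit: it only shares the idea of working with `x^2 + y^2` as an extra coordinate, and it ignores hit errors and multiple scattering. The names `solve3` and `fit_circle` are made up for this example.

```python
import math

def solve3(m, v):
    """Solve the 3x3 linear system m * x = v with Cramer's rule."""
    def det(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
                - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
                + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))
    d = det(m)
    solution = []
    for col in range(3):
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = v[r]
        solution.append(det(mc) / d)
    return solution

def fit_circle(points):
    """Least-squares fit of x^2 + y^2 = a*x + b*y + c; returns (cx, cy, r)."""
    z = [x * x + y * y for x, y in points]
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    syy = sum(y * y for _, y in points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxz = sum(p[0] * zi for p, zi in zip(points, z))
    syz = sum(p[1] * zi for p, zi in zip(points, z))
    # Normal equations for the unknowns (a, b, c).
    a, b, c = solve3([[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(len(points))]],
                     [sxz, syz, sum(z)])
    cx, cy = a / 2.0, b / 2.0
    return cx, cy, math.sqrt(c + cx * cx + cy * cy)

# Four hypothetical hits lying on a circle of radius 3 centred at (1, 2).
hits = [(1 + 3 * math.cos(t), 2 + 3 * math.sin(t)) for t in (0.1, 0.8, 1.7, 2.9)]
cx, cy, r = fit_circle(hits)
```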
    rovere authored and fwyzard committed Jan 15, 2021
    a10a72d
  3. Implement a Heterogeneous version of Raw2Cluster and RecHit (#62)

      - reorganize `SiPixelRawToDigi` as `SiPixelRawToDigiHeterogeneous` using `HeterogeneousEDProducer`
          - output a `HeterogeneousEvent`
          - use `PixelThresholdClusterizer`
          - add `SiPixelDigiHeterogeneousConverter`
          - make cabling and gain transfers asynchronous
      - reorganize `SiPixelRecHits` as `SiPixelRecHitHeterogeneous`
      - move `PixelThresholdClusterizer` (back?) to interface+src in order to use it outside of RecoLocalTracker/SiPixelClusterizer
      - replace __host__ __device__ with constexpr to avoid weird compilation failures
      - split clusters to their own converter
    makortel authored and fwyzard committed Jan 15, 2021
    7dc6682
  4. Heterogeneous Cellular Automaton for pixel tracks

    Port the Cellular Automaton (back) to GPUs and CUDA, using the
    `HeterogeneousEDProducer` approach:
      - do memory allocations in the framework begin stream
      - run the memory copies and kernels asynchronously, in a dedicated
        CUDA stream per framework stream
    
    Use the new GPU::VecArray for holding repeated data structures.
    
    By default, run on the GPU in all gpu-enabled workflows (e.g. 10824.8).
    felicepantaleo authored and fwyzard committed Jan 15, 2021
    b117d17
  5. Clean up CAHitNtupletHeterogeneousEDProducer (#83)

    Apply some clean up to the code and formatting of `CAHitNtupletHeterogeneousEDProducer` and `CAHitQuadrupletGeneratorGPU`, as suggested by @makortel during the review of #48:
      - clean up the `BuildFile.xml`
      - remove unused data members and arguments from function calls;
      - percolate the CUDA stream instead of storing it as a data member.
    
    Also:
      - add `cudaCheck` calls around memory allocations and copies;
      - reduce the number of memory allocations used to set up the GPU state.
    fwyzard committed Jan 15, 2021
    ba46ad5
  6. Port the Riemann fit to CUDA (#60)

      - the CPU Riemann fit works using all combinations between the 2 booleans: `useErrors` and `useMultipleScattering`;
      - the standalone version of the GPU Riemann fit has been updated in order to explore all possibilities among the 2 booleans above: all of them work and produce identical results up to 1e-5 precision (the default one, 1e-6 fails when enabling multiScattering, most likely due to matrix inversions);
      - the GPU version of the Riemann fit within CMSSW works, with 1 fit assigned to each thread, with 32 threads/warps, all dynamically computed.
    
    Things that need a "hack":
    
      - limit the "dynamic" size of Eigen matrices to at most 4x4, which is just fine for quadruplets. Using anything wider will cause errors which I *believe* are related to the stack size of threads on the GPU;
      - cast matrices to be inverted to 4x4 (was done before the previous point: will revert it back and see if that's still needed or not, but I believe it is); this was done in order to "specialize" the `invert()` call to something that is "natively" supported by Eigen on GPU (that brought in also few `__host__` `__device__` here and there in Eigen);
      - fix the alignment of the `struct` holding the results of the fit, since its size was different on GPU and CPU, causing an annoying off-by-one effect.
    rovere authored and fwyzard committed Jan 15, 2021
    b263509
  7. a2e681d

Commits on Apr 1, 2021

  1. Customize function to provide a minimal configuration for profiling (#106)
    
    Can be included with the following snippet in the configuration:
    
        from RecoPixelVertexing.Configuration.customizePixelTracksForProfiling import customizePixelTracksForProfiling
        process = customizePixelTracksForProfiling(process)
    
    Removes validation, DQM, and output modules.
    As suggested in #70 (comment), an `AsciiOutputModule` is used to require the `pixelTracks`.
    makortel authored and fwyzard committed Apr 1, 2021
    bb60075
  2. Heterogeneous ClusterTPAssociation (#105)

    Implement a heterogeneous Cluster-to-TrackingParticle associator running on the GPU.
    VinInn authored and fwyzard committed Apr 1, 2021
    a7c22e4
  3. 1a43506
  4. b6b2fff
  5. Pixel doublets on GPU (#118)

    Pixel doublets (actually CACells) are created on GPU and fed to CA.
    The whole workflow up to quadruplets candidates is now fully on GPU.
    VinInn authored and fwyzard committed Apr 1, 2021
    ecd1465
  6. Cleanup defines, includes, file names, and debug messages (#122)

    Do not #ifdef on __NVCC__: to protect CUDA-aware code sections, check if the __CUDACC__ symbol is defined. The symbol __NVCC__ is defined when building with nvcc, but not when building CUDA code with clang.
    
    Move header files referenced from outside their directory to the interface/ directory, and update the include guards accordingly.
    
    Include <cuda_runtime.h> instead of <cuda.h> to handle the CUDA attributes in non-CUDA compilations.
    
    Rename PixelTrackReconstructionGPU_impl.cu to PixelTrackReconstructionGPU.cu.
    
    Other cleanup: #defines, debug messages, change __inline__ to inline, fix include guards, whitespaces, etc.
    fwyzard committed Apr 1, 2021
    e539007
  7. Move all CUDA code to the plugins/ directory (#123)

    Keep RiemannFit.h in the interface, as it is include-only.
    fwyzard committed Apr 1, 2021
    979dbdb
  8. f5e6831
  9. Cleanup after merging with CMSSW 10.2.2 (#134)

    Clean up unnecessary changes, whitespaces, defines and include directives.
    fwyzard committed Apr 1, 2021
    4f361d1
  10. Add optional flags to disable SOA->legacy conversion and GPU->CPU transfer (#132)
    
    Always produce the CPU cluster and rechit collections, since they are needed anyway.
    Add transfer and conversion flags to clusterizer, rechits and CA.
    Add a skeleton for the future pixel track producer.
    Add customize functions to disable conversions to legacy formats, and to disable unnecessary GPU->CPU transfers.
    makortel authored and fwyzard committed Apr 1, 2021
    65ac243
  11. Reformat the Riemann fit code (#143)

    Apply clang-format reformatting to RiemannFit.h
    felicepantaleo authored and fwyzard committed Apr 1, 2021
    3ef1e8a
  12. Clean up and bugfixes for the Riemann fit (#148)

    Fix for uninitialised variables.
    Always assume multiple scattering treatment and remove unused parameters.
    Remove test that has diverged from the actual implementation.
    fwyzard committed Apr 1, 2021
    14a9169
  13. Reduce CA memory need (#159)

    makortel authored and fwyzard committed Apr 1, 2021
    c82fc4d
  14. Add MTV instance for pixel tracks from PV (#156)

    Add separate plots for tracks associated to the primary vertex.
    makortel authored and fwyzard committed Apr 1, 2021
    2396633
  15. Tune and speed up doublet algo (#158)

    Tune and speed up the pixel doublet algorithm, and take advantage of GPU read-only memory for a further speedup.
    
    Includes a python notebook to tune the cuts for doublets and triplets.
    VinInn authored and fwyzard committed Apr 1, 2021
    f76d1a9
  16. Optimise gpuPixelDoublets::doubletsFromHisto() kernel (#167)

    Pre-compute a few constants that could not be declared constexpr.
    Reduce temporary buffer size.
    Reduce the block size of the calls to gpuPixelDoublets::getDoubletsFromHisto() from 256 to 64, to make better usage of the GPU processors.
    fwyzard committed Apr 1, 2021
    9b538fc
  17. Add Riemann fit to the CA (#169)

    Also, add back the stand-alone GPU fit test.
    rovere authored and fwyzard committed Apr 1, 2021
    0d80e17
  18. 6d9630c
  19. Clean up Riemann fit in CA (#178)

    Reduce the number of blocks used to launch the Riemann fit kernels within the CA.
    Rename the kernels to avoid the ambiguity with the standalone Riemann fit.
    Work around spurious warnings in the Eigen test.
    fwyzard committed Apr 1, 2021
    79fcd95
  20. Riemann fit multiple scattering (#174)

    Implement the multiple scattering treatments in the Riemann Fit. In particular:
      - modify the previous implementation of the multiple scattering in the circle fit to correctly cover both the barrel and the forward case;
      - implement the multiple scattering in the line fit in the S-Z plane both for the barrel and the forward case.
    
    The effective radiation length is still an approximate value since the phi angle is not taken into account (it is not known on a layer-by-layer basis). An ad-hoc correction based on the inverse of the pt has been added, with a cut-off of 1 GeV.
    
    The pulls are ok-ish, the material could be further tuned.
    The Chi2 is flat over the whole eta range.
    rovere authored and fwyzard committed Apr 1, 2021
    f932567
  21. e1c1a7e
  22. Riemann fit rework (#190)

    The Riemann Fit has been reworked so that both barrel
    and forward cases are naturally supported without branching.
    The underlying assumption is the uniform material distribution
    within the Pixel Tracker.
    The line fit has been reworked and is now using an ordinary
    least square fit in the S-Z plane.
    See the motivations and explanations inside the comments
    in the code.
    
    Additional changes:
      - code clean up
      - remove unused functions
      - fix standalone test of RiemannFit on GPU
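
The ordinary least squares line fit mentioned above can be sketched in a few lines of Python. This is only an unweighted textbook version, assuming `s` is the path length along the track and `z` the longitudinal coordinate; the fit in the commit lives in the C++ code and its error treatment is not reproduced here. The function name `fit_line` and the sample values are made up for the example.

```python
def fit_line(s, z):
    """Ordinary least squares fit of z = slope * s + intercept."""
    n = len(s)
    mean_s = sum(s) / n
    mean_z = sum(z) / n
    # Spread of s and co-variation of s and z around their means.
    var_s = sum((si - mean_s) ** 2 for si in s)
    cov_sz = sum((si - mean_s) * (zi - mean_z) for si, zi in zip(s, z))
    slope = cov_sz / var_s
    return slope, mean_z - slope * mean_s

# Four hypothetical hits lying exactly on z = 2 * s + 1.
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```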
    fwyzard committed Apr 1, 2021
    e14fd83
  23. eef1ec8
  24. Migrate tracker local reconstruction and pixel tracking to Tasks (backport cms-sw#25163) (#202)
    
    Backport "Migrate tracker local reconstruction and pixel tracking to Tasks" (cms-sw#25163) to the Patatrack branch:
      - migrate RecoLocalTracker_cff to Tasks;
      - migrate RecoPixelVertexing_cff to Tasks;
      - keeping sequences to avoid massive migration (for now).
    fwyzard committed Apr 1, 2021
    ffa2d95
  25. Fix MTV validation of initialStepPreSplitting tracks and add B-hadron MTV variation to pixel track validation sequence (#199)
    
      - add B-hadron MTV variation to pixel track validation sequence
      - fix MTV validation of initialStepPreSplitting tracks
    makortel authored and fwyzard committed Apr 1, 2021
    adea719
  26. Address code style and quality issues (#203)

    Cleaned up by clang-tidy 7.0.0.
    Enabled checks:
      - boost-use-to-string
      - misc-uniqueptr-reset-release
      - modernize-deprecated-headers
      - modernize-make-shared
      - modernize-use-bool-literals
      - modernize-use-equals-delete
      - modernize-use-nullptr
      - modernize-use-override
      - performance-unnecessary-copy-initialization
      - readability-container-size-empty
      - readability-redundant-string-cstr
      - readability-static-definition-in-anonymous-namespace
      - readability-uniqueptr-delete-release
    
    See http://releases.llvm.org/7.0.0/tools/clang/tools/extra/docs/clang-tidy/index.html for details.
    fwyzard committed Apr 1, 2021
    15e668c
  27. 9e6f88d
  28. Full workflow from raw data to pixel tracks and vertices on GPUs (#216)

    Port and optimise the full workflow from pixel raw data to pixel tracks and vertices to GPUs.
    Clean the pixel n-tuplets with the "fishbone" algorithm (only on GPUs).
    
    Other changes:
      - recover the Riemann fit updates lost during the merge with CMSSW 10.4.x;
      - speed up clustering and track fitting;
      - minor bug fix to avoid trivial regression with the optimized fit.
    VinInn authored and fwyzard committed Apr 1, 2021
    917c412
  29. Remove unnecessary pragmas (#249)

    `#pragma unroll` is not supported by GCC, leading to compilation
    warnings in host code.
    GCC 8 supports `#pragma GCC unroll N` which could be used instead.
    
    However, benchmarking on a V100 with and without the `#pragma unroll`
    there is no observable difference, so it is simpler to remove them.
    fwyzard committed Apr 1, 2021
    3deb206
  30. Skip CUDA-related tests if no GPU is present (#252)

    Make unit tests that require a CUDA device skip the test and exit
    successfully if the CUDA runtime is not available, or no CUDA devices
    are available.
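
The "skip and exit successfully" logic can be illustrated with a small Python sketch. The actual unit tests are C++ programs that query the CUDA runtime; here `device_count` is a hypothetical callable standing in for that query, and `RuntimeError` stands in for "the runtime is not available".

```python
import sys

def run_or_skip(device_count, test_body):
    """Run test_body only when device_count() reports at least one device.

    Returns the process exit code: 0 both when the test runs to completion
    and when it is skipped, so a machine without a GPU reports no failure.
    """
    try:
        n_devices = device_count()
    except RuntimeError:
        # Stand-in for "the CUDA runtime is not available".
        n_devices = 0
    if n_devices < 1:
        print("No CUDA device available, skipping the test", file=sys.stderr)
        return 0
    test_body()
    return 0

# Usage with stand-in callables: no device, so the body is never called.
calls = []
exit_code = run_or_skip(lambda: 0, lambda: calls.append("ran"))
```

A failing test would still surface as an exception raised from `test_body`; only the absence of a device is turned into a success.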
    fwyzard committed Apr 1, 2021
    0f2c2e0
  31. Speed up the doublet finder (#260)

    Introduce the inner loop parallelization in the doublet finder using the
    stride pattern already used in the "fishbone", and make use of a 2D grid
    instead of a hand-made stride.
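
The stride pattern on a 2D grid can be emulated on the CPU to check that every (outer, inner) pair is visited exactly once. This is a sketch of the general technique only: the function names and the grid sizes are made up, and the real kernel is CUDA code with its own indexing.

```python
def strided_indices(n_items, thread_id, n_threads):
    """Indices handled by one thread in a stride loop over n_items."""
    return list(range(thread_id, n_items, n_threads))

def pairs_2d(n_outer, n_inner, grid_x, grid_y):
    """Emulate a 2D launch: x strides over outer items, y over inner items."""
    seen = []
    for tx in range(grid_x):
        for ty in range(grid_y):
            for i in strided_indices(n_outer, tx, grid_x):
                for j in strided_indices(n_inner, ty, grid_y):
                    seen.append((i, j))
    return seen

# 5 outer items and 7 inner items covered by a hypothetical 2 x 3 grid.
pairs = pairs_2d(5, 7, 2, 3)
```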
    VinInn authored and fwyzard committed Apr 1, 2021
    64b28b4
  32. 5a77d60
  33. Next prototype of the framework integration (#100)

    Provide a mechanism for a chain of modules to share a resource, that can be e.g. CUDA device memory or a CUDA stream.
    Minimize data movements between the CPU and the device, and support multiple devices.
    Allow the same job configuration to be used on all hardware combinations.
    
    See HeterogeneousCore/CUDACore/README.md for a more detailed description and examples.
    makortel authored and fwyzard committed Apr 1, 2021
    96b2f73
  34. Various updates to pixel track/vertex DQM and MTV (#285)

    * Add DQM for pixel vertices
    
    * Add pT>0.9GeV pixel track collections to MTV
    
    * Add dzPV0p1, Pt0to1, Pt1 variants of pixel track DQM
    makortel authored and fwyzard committed Apr 1, 2021
    4ed9088
  35. 3e828dd
  36. Implementation of the broken line fit (#340)

    Create modifiers for enabling the broken line fit on the cpu and on the gpu.
    
    Use dynamically-sized matrices (the advantage over statically-sized ones
    is that the code would also work with n>4); the switch can be easily done at
    the start of the file.
    
    Update Eigen tests with the features used by the broken line fit.
    fwyzard committed Apr 1, 2021
    a74051b
  37. Rework the Riemann fit and broken line fit (#338)

    Merge the Riemann and broken line fits into a single configurable pixel
    n-tuplet fitter, and extend it to work with up to 5 hits.
    
    Make the broken line fit the default algorithm.
    
    Try both triplets and quadruplets in the pixel "hole".
    
    Limit pT used to compute the multiple scattering.
    
    Use the inline Cholesky decomposition.
    
    Generic clean up and improvements.
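
For readers unfamiliar with the Cholesky decomposition mentioned above, here is a minimal generic sketch in Python. It is a plain loop-based version for a small symmetric positive-definite matrix, not the inline implementation used by the fitter; the function name `cholesky` and the sample matrix are made up for the example.

```python
import math

def cholesky(a):
    """Return lower-triangular L with L * L^T == a (a symmetric positive-definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            # Subtract the contribution of the columns already computed.
            acc = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(acc)
            else:
                low[i][j] = acc / low[j][j]
    return low

# A small hypothetical covariance-like matrix.
low = cholesky([[4.0, 2.0], [2.0, 3.0]])
```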
    VinInn authored and fwyzard committed Apr 1, 2021
    874102c
  38. Improve pixel doublets and CA, and extend debugging functionality (#338)

    Improve pixel doublets and CA:
      - add pixel cluster size and shape cuts in doublets;
      - add triplet cleaner;
      - improve cluster size studies;
      - implement layer-dependent cuts in the CA.
    
    Add counters in GPU code and possibility to test full doublet combinatorics.
    
    Update python notebook and include z0 resolution.
    VinInn authored and fwyzard committed Apr 1, 2021
    f07cca0
  39. Migrate the pixel rechits producer and CA to the new heterogeneous framework (#338)
    
    Use cleaned hits.
    Use pixel layer and ladders geometry, and use pixel triplets in the gaps.
    
    Optimise GPU memory usage:
      - reduce the number of memory allocations
      - fix the size of the cub workspace
      - allocate memory per event via the caching allocator
      - use constant memory for geometry and parameters
      - use shared memory where the content is the same for every thread
    
    Optimise kernel launches, and add a protection for empty events and overflows.
    VinInn authored and fwyzard committed Apr 1, 2021
    260c0b2
  40. 548f5cf
  41. 47dc7f7
  42. e5372ab
  43. daae2fa
  44. 2b385c3
  45. Implement triplets in the pixel ntuplet producer (#382)

    Enable pixel triplets with:
    
        process.pixelTracksHitQuadruplets.minHitsPerNtuplet = 3
        process.pixelTracksHitQuadruplets.includeJumpingForwardDoublets = True
    
    Changes:
      - adjust for the average pixel geometry and the beam spot position;
      - allow "jumping doublets" in the forward region (FPIX1-FPIX3) for triplets.
    VinInn authored and fwyzard committed Apr 1, 2021
    9e6ca10
  46. 0806471
  47. Port the whole pixel workflow to new heterogeneous framework (#384)

      - port the whole pixel workflow to new heterogeneous framework
      - implement a legacy cluster to SoA converter for the pixel RecHits
      - update the vertex producer to run on CPU as well as GPU
    VinInn authored and fwyzard committed Apr 1, 2021
    8df4bc8
  48. 3ae0df1
  49. Move event and stream caches, and caching allocators out from CUDAService (#364)
    
    To reduce dependencies on edm::Service, and to make CUDAService less
    of a collection of everything, split off from it:
      - the CUDAEventCache
      - the CUDAStreamCache
      - the caching allocators
    
    Other changes:
      - clean up unnecessary use of CUDAService
      - fix maxCachedFraction, add debug printouts
      - add make_*_unique_uninitialized that avoid the static_assert
    makortel authored and fwyzard committed Apr 1, 2021
    a68be03
  50. 2d6d811
  51. 468e9ac
  52. Fix clang warnings (#387)

    makortel authored and fwyzard committed Apr 1, 2021
    604a797
  53. Replace use of API wrapper stream and event with plain CUDA, part 1 (#389)
    
    Replace cuda::stream_t<> with cudaStream_t in client code
    Replace cuda::event_t with cudaEvent_t in the client code
    Clean up BuildFiles
    makortel authored and fwyzard committed Apr 1, 2021
    80ec6eb
  54. 4181007
  55. Synchronize event in the CUDAProductBase destructor (#391)

    Otherwise there are possibilities for weird races, e.g. combination of
    non-ExternalWork producers, consumed-but-not-read CUDAProducts, CUDA
    streams executing work later than expected (= on the next event).
    makortel authored and fwyzard committed Apr 1, 2021
    3756786
  56. 5a135cb
  57. Optimize doublet reconstruction and cuts (#411)

    Reorder cuts and factorize some code to speed up doublets.
    Increase various buffers size not to overflow in case of very relaxed cuts.
    Rename some parameters to better reflect their actual action in code.
    VinInn authored and fwyzard committed Apr 1, 2021
    f532901
  58. Migrate cluster track associator (#409)

    Migrate ClusterTPAssociationHeterogeneous, which uses the deprecated HeterogeneousEDProducer, to
    ClusterTPAssociationProducerCUDA, and implement a simple analyzer to consume its product.
    
    To test it, add a dummy analyzer to an MC workflow:
    
        process.load("SimTracker.TrackerHitAssociation.clusterTPCUDAdump_cfi")
        process.validation_step = cms.EndPath(process.globalValidationPixelTrackingOnly + process.clusterTPCUDAdump)
        process.tpClusterProducerCUDAPreSplitting.dumpCSV = True
    VinInn authored and fwyzard committed Apr 1, 2021
    2b7e4cb
  59. ae764c5
  60. 7df797f
  61. b43a1ed
  62. f77f909
  63. 9ecedf7
  64. Implement changes from the CUDA framework review (#429)

    Rename the cudautils namespace to cms::cuda or cms::cudatest, and drop the CUDA prefix from the symbols defined there.
    
    Always record and query the CUDA event, to minimize need for error checking in CUDAScopedContextProduce destructor.
    
    Add comments to highlight the pieces in CachingDeviceAllocator that have been changed wrt. cub.
    
    Various other updates and clean up:
      - enable CUDA for compute capability 3.5.
      - clean up CUDAService, CUDA tests and plugins.
      - add CUDA existence protections to BuildFiles.
      - mark thread-safe static variables with CMS_THREAD_SAFE.
    makortel authored and fwyzard committed Apr 1, 2021
    78dc66e
  65. Synchronise with CMSSW_11_1_0_pre2

    Major changes:
      - restructure the RecoPixelVertexing/PixelVertexFinding package;
      - update the interface of PixelCPEFast.
    fwyzard committed Apr 1, 2021
    0491687
  66. Apply feedback from upstream PR (#441)

    Fix include guard in CUDADataFormats/Track/src/classes.h .
    Remove unused variables in DataFormats/Math/test/CholeskyInvert_t.cpp .
    fwyzard committed Apr 1, 2021
    712dc8b
  67. 3ecdd54
  68. Integrate the comments from the upstream PRs (#442)

    Clean up the Patatrack code base following the comments received during the integration into the upstream release.
    
    Currently tracks the changes introduced due to
       - cms-sw#29109: Patatrack integration - trivial changes (1/N)
       - cms-sw#29110: Patatrack integration - common tools (2/N)
    
    List of changes:
     * Remove unused files
     * Fix compilation warnings
     * Fix AtomicPairCounter unit test
     * Rename the cudaCompat namespace to cms::cudacompat
     * Remove extra semicolon
     * Move SimpleVector and VecArray to the cms::cuda namespace
     * Add missing dependency
     * Move HistoContainer, AtomicPairCounter, prefixScan and radixSort to the cms::cuda namespace
     * Remove rule exception for HeterogeneousCore
     * Fix code rule violations:
        - replace using namespace cms::cuda in test/OneToManyAssoc_t.h .
        - add an exception for cudaCompat.h:
          cudaCompat relies on defining equivalent symbols to the CUDA
          intrinsics in the cms::cudacompat namespace, and pulling them in the
          global namespace when compiling device code without CUDA.
     * Protect the headers to compile only with a CUDA compiler
    fwyzard committed Apr 1, 2021
    23bc909
  69. cd2faf1
  70. 668deea
  71. Fix use of namespaces (#446)

    Clean up instances of using namespace ... from header files,
    following the comments from the upstream integration.
    fwyzard committed Apr 1, 2021
    71563e8
  72. 2bfeeb9
  73. Use std::isnan (#456)

    fwyzard committed Apr 1, 2021
    14de161
  74. Replace cub prefix scan with home-brewed one (#447)

    Replace the use of the prefix scan from CUB with a home-brewed implementation,
    using dynamic instead of static shared memory.
    
    No changes to physics or timing performance.
    VinInn authored and fwyzard committed Apr 1, 2021
    b20bd1b
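For readers unfamiliar with the pattern behind #447, the following is a minimal host-side sketch of a two-level (blocked) inclusive prefix scan: each block is scanned independently, the block totals are scanned, and the resulting offsets are added back. It is a serial C++ emulation under assumed names (`blockedInclusiveScan`, `blockSize`), not the CUDA kernel from the commit. On the GPU, the per-block scratch space would live in shared memory, and with dynamic shared memory its size is chosen at kernel launch rather than at compile time.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Inclusive prefix sum computed block by block (serial emulation).
// blockSize is a run-time parameter, mirroring a scratch buffer sized at launch.
std::vector<uint32_t> blockedInclusiveScan(const std::vector<uint32_t>& in, std::size_t blockSize) {
  std::vector<uint32_t> out(in.size());
  if (in.empty())
    return out;
  blockSize = std::max<std::size_t>(blockSize, 1);  // guard against a zero block size

  const std::size_t nBlocks = (in.size() + blockSize - 1) / blockSize;
  std::vector<uint32_t> blockOffsets(nBlocks, 0);

  // Step 1: scan each block independently and record its total.
  for (std::size_t b = 0; b < nBlocks; ++b) {
    const std::size_t first = b * blockSize;
    const std::size_t last = std::min(first + blockSize, in.size());
    uint32_t running = 0;
    for (std::size_t i = first; i < last; ++i) {
      running += in[i];
      out[i] = running;
    }
    blockOffsets[b] = running;
  }

  // Step 2: exclusive scan of the block totals gives the offset of each block.
  uint32_t offset = 0;
  for (std::size_t b = 0; b < nBlocks; ++b) {
    const uint32_t total = blockOffsets[b];
    blockOffsets[b] = offset;
    offset += total;
  }

  // Step 3: add each block's offset to its partial results.
  for (std::size_t i = 0; i < in.size(); ++i)
    out[i] += blockOffsets[i / blockSize];

  return out;
}
```

Calling `blockedInclusiveScan({1, 2, 3, 4, 5, 6, 7}, 3)` yields `{1, 3, 6, 10, 15, 21, 28}`, the same result as a single serial pass.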
  75. ca80b70
  76. ddeaccb
  77. Reduce GPU memory usage (#509)

    Adjust the growth factor in the caching allocators to use more granular bins, reducing the memory wasted by the allocations.
    
    Use a dynamic buffer for CA cells components.
    
    Fix a possible data race in the prefix scan.
    VinInn authored and fwyzard committed Apr 1, 2021
    17accf2
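The effect of the growth factor mentioned in #509 can be sketched as follows, assuming a caching allocator that rounds every request up to the next bin of size `minBin * growth^k`. The function names and the numeric values are hypothetical and chosen only for illustration; they are not the settings used in the commit.

```cpp
#include <cstddef>

// Smallest bin of the form minBin * growth^k that can hold `request` bytes.
// Assumes minBin > 0 and growth >= 2 (otherwise the loop would not terminate).
inline std::size_t binSize(std::size_t request, std::size_t minBin, std::size_t growth) {
  std::size_t size = minBin;
  while (size < request)
    size *= growth;
  return size;
}

// Bytes allocated but not requested, i.e. the memory wasted by the rounding.
inline std::size_t wastedBytes(std::size_t request, std::size_t minBin, std::size_t growth) {
  return binSize(request, minBin, growth) - request;
}
```

With these assumptions, a 5000-byte request and 8-byte minimum bin land in a 32768-byte bin for a growth factor of 8 (27768 bytes wasted), but in an 8192-byte bin for a growth factor of 2 (3192 bytes wasted). A smaller growth factor gives more granular bins and less waste per allocation.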
  78. ceb4e96
  79. 2004cbd
  80. Add customisations for profiling the Pixel-only workflow (#553)

    customizePixelOnlyForProfilingGPUOnly:
      Customise the Pixel-only reconstruction to run on GPU
      Run the unpacker, clustering, ntuplets, track fit and vertex reconstruction on GPU.
    
    customizePixelOnlyForProfilingGPUWithHostCopy:
      Customise the Pixel-only reconstruction to run on GPU, and copy the data to the host
      Run the unpacker, clustering, ntuplets, track fit and vertex reconstruction on GPU,
      and copy all the products to the host in SoA format.
      The same customisation can also be used on the SoA CPU workflow, running up to the
      tracks and vertices on the CPU in SoA format, without conversion to legacy format.
    
    customizePixelOnlyForProfiling:
      Customise the Pixel-only reconstruction to run on GPU, copy the data to the host,
      and convert to legacy format
      Run the unpacker, clustering, ntuplets, track fit and vertex reconstruction on GPU;
      copy all the products to the host in SoA format; and convert them to legacy format.
      The same customisation can also be used on the CPU workflow, running up to the
      tracks and vertices on the CPU.
    fwyzard committed Apr 1, 2021
    e641460
  81. 28d292a
  82. Further clean up after merging CMSSW_11_2_0_pre7 (#556)

    Minor bug fixes:
      - fix a typo in EventFilter/EcalRawToDigi/plugins/BuildFile.xml.
    
    Clean up:
      - remove obsolete ArrayShadow class;
      - remove obsolete profiling functions.
    fwyzard committed Apr 1, 2021
    a817147
  83. Update the RelVal workflows and the CPU customisation (#549)

    Update the RelVal workflows and the CPU customisation:
      - change the .501 workflow to run the full Patatrack pixel track reconstruction on CPU
      - add a customisation to run the Patatrack reconstruction with triplets, on CPU and GPU
      - add the .505 and .506 workflows to reconstruct triplets, on CPU and GPU
    
    Co-authored-by: Andrea Bocci <andrea.bocci@cern.ch>
    AdrianoDee and fwyzard committed Apr 1, 2021
    c31b20f
  84. 2636448
  85. c6ac806
  86. bb61e1c
  87. Move hit indexes to 32 bits (#583)

    Add a counter for forlorn doublets.
    VinInn authored and fwyzard committed Apr 1, 2021
    7473687
  88. Clean up the pixel local reconstruction code (#593)

    Address the pixel local reconstruction review comments.
    
    General clean up of the pixel local reconstruction code:
      - remove commented out and obsolete code and data members
      - use named constants more consistently
      - update variable names to follow the coding rules and for better consistency
      - use member initializer lists in the constructors
      - allow `if constexpr` in CUDA code
      - use `std::size` instead of hardcoding the array size
      - convert iterator-based loops to range-based loops
      - replace `cout` and `printf` with `LogDebug` or `LogWarning`
      - use put tokens
      - reorganise the auto-generated cfi files and use them more consistently
      - adjust code after rearranging an `#ifdef GPU_DEBUG` block
      - apply code formatting
      - other minor changes
    
    Improve comments:
      - improve comments and remove obsolete ones
      - clarify comments and types regarding `HostProduct`
      - update comments about `GPU_SMALL_EVENTS` being kept for testing purposes
      - add notes about the original cpu code
    
    Reuse some more common code:
      - move common pixel cluster code to `PixelClusterizerBase`
      - extend the `SiPixelCluster` constructor
    
    Rename classes and modules for better consistency:
      - remove the `TrackingRecHit2DCUDA.h` and `gpuClusteringConstants.h` forwarding headers
      - rename `PixelRecHits` to `PixelRecHitGPUKernel`
      - rename `SiPixelRecHitFromSOA` to `SiPixelRecHitFromCUDA`
      - rename `siPixelClustersCUDAPreSplitting` to `siPixelClustersPreSplittingCUDA`
      - rename `siPixelRecHitsCUDAPreSplitting` to `siPixelRecHitsPreSplittingCUDA`
      - rename `siPixelRecHitsLegacyPreSplitting` to `siPixelRecHitsPreSplittingLegacy`
      - rename `siPixelRecHitHostSoA` to `siPixelRecHitSoAFromLegacy`
    
    Re-apply changes from cms-sw#29805 that were lost in the Patatrack branch.
    fwyzard committed Apr 1, 2021
    50f697e
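A few of the idioms listed in #593 (member initializer lists, `std::size` instead of a hardcoded array size, range-based loops) can be shown together in a short sketch. The `kLayerStart` table and the `ModuleCounter` class are hypothetical and exist only for this illustration; `std::size` requires C++17.

```cpp
#include <cstddef>
#include <iterator>  // std::size
#include <vector>

// Hypothetical table: first module index of each layer.
constexpr int kLayerStart[] = {0, 96, 320, 672};

// Number of entries deduced by std::size instead of a hardcoded 4.
constexpr std::size_t kNumLayers = std::size(kLayerStart);

class ModuleCounter {
public:
  // Member initializer list instead of assignments in the constructor body.
  ModuleCounter() : counts_(kNumLayers, 0) {}

  void fill(const std::vector<int>& moduleIds) {
    // Range-based loop instead of an iterator-based one.
    for (int id : moduleIds) {
      // Find the last layer whose first module index does not exceed id.
      std::size_t layer = 0;
      for (std::size_t l = 0; l < kNumLayers; ++l) {
        if (id >= kLayerStart[l])
          layer = l;
      }
      ++counts_[layer];
    }
  }

  const std::vector<int>& counts() const { return counts_; }

private:
  std::vector<int> counts_;
};
```

If an entry is later added to `kLayerStart`, `kNumLayers` and the size of `counts_` follow automatically, which is the point of avoiding the hardcoded size.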
  89. e2c52fb
  90. Clean up the pixel local reconstruction code (#602)

    Address the pixel local reconstruction review comments:
      - remove obsolete comments;
      - consistently use named constants;
      - rename data members and methods to be more descriptive;
      - rename local variables according to the coding rules and for
        consistency with cms-sw#32591;
      - update transient dictionaries to match data types.
    fwyzard committed Apr 1, 2021
    2455463
  91. 2f0c8d4
  92. Clean up the pixel track reconstruction code (#606)

    Update EDM access:
      - switch to consumes() scheme for event setup;
      - simplify some event data access.
    
    Style fixes:
      - make class members private and fix a problematic cast;
      - format comments for clang-tidy;
      - change to an enum class to avoid creating a namespace (usage becomes: pixelTrack::Quality::loose);
      - add article reference in comment (it was already further down in the file);
      - fix the capitalization of member functions and classes;
      - fix one-letter or upper-case variable names in formulas (trying to keep the naming from the reference article).
    
    Avoid some code repetitions.
    ericcano authored and fwyzard committed Apr 1, 2021
    35c6817
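The enum class change in #606 can be sketched as below. A scoped enumeration provides its own scope, so the enumerators are reached as `pixelTrack::Quality::loose` without wrapping a plain enum in an extra namespace. Only `loose` is named in the commit message; the other enumerators, the underlying type and the two helper functions are assumptions made for this example.

```cpp
#include <cstdint>

namespace pixelTrack {
  // Scoped enumeration: the enumerators do not leak into pixelTrack.
  // The enumerator list is illustrative, only `loose` comes from the commit message.
  enum class Quality : uint8_t { bad = 0, loose, strict, tight };

  // Values of the same scoped enumeration can be compared directly.
  constexpr bool atLeast(Quality q, Quality minimum) { return q >= minimum; }

  // There is no implicit conversion to an integer, so indexing needs an explicit cast.
  constexpr uint8_t toIndex(Quality q) { return static_cast<uint8_t>(q); }
}  // namespace pixelTrack

static_assert(pixelTrack::atLeast(pixelTrack::Quality::tight, pixelTrack::Quality::loose),
              "tight ranks above loose in this illustrative ordering");
```

The explicit cast in `toIndex` is the cost of the stricter type: code that previously used the enumerators as plain integers has to state the conversion.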
  93. 8d19af6
  94. Minor fixes and clean up for the pixel track reconstruction code (#611)

    Fix RecoPixelVertexing/PixelTrackFitting/test/BuildFile.xml following file renames.
    
    Remove unnecessary customisation from
    RecoPixelVertexing/Configuration/python/customizePixelTracksSoAonCPU.py.
    fwyzard committed Apr 1, 2021
    5bde441
  95. a75ae0b
  96. 16e7fdb