Releases: GridTools/gridtools

GridTools version 2.3.5

30 Sep 09:43

changes since v2.3.4

GridTools v2.3.5 requires CMake 3.21.0 or later to properly support HIP.

Performance improvements

  • Introduce ldg_ptr to Enable __ldg in Data Stores and simple_ptr_holder #1802
  • Use ‘const’ in Neighbor Table Value Types #1796

Bug fixes

  • Fixes for HIP detection for recent ROCm and CMake #1804
  • Fix GT_ASSUME for NVCC and Enable GT_ASSUME on Recent GCC Versions #1789
  • Fix include #1793
  • Improve tests #1794, #1799, #1798

GridTools version 2.3.4

26 Jun 16:08

changes since v2.3.2 (v2.3.3 was removed because it introduced a breaking change in the nanobind adapter)

Performance improvements

Bug fixes

Tests

  • Add Missing fn_unstructured_nabla_fused_tuple_of_fields to Regression & Performance Tests by @fthaler in #1783
  • Improved Data Layout for Neighbor Tables by @fthaler in #1782

CI / Deployment

GridTools version 2.3.2

31 Jan 15:05
1d76954

Bug fixes

  • Apply workaround to CUDA 12.3, see #1766 (#1768)

GridTools version 2.3.1

16 Aug 14:12

Python bindings

  • Python SID adapter: Add support for HIP/ROCm buffers (#1759)
  • Python bindings: Nanobind SID adapter (#1762)

Bug fixes

  • Partial workarounds for CUDA 12.1 and 12.2, see #1766 (#1764)
  • Fix GCC 13: add missing include (#1761)

GridTools version 2.3.0

20 Apr 15:07
53fc455

Support for NVHPC (#1747)

GridTools now supports NVHPC starting from release 23.3!

Parallel fn::backend::naive (#1746)

Naive OpenMP parallelization of the naive backend: a plain parallel for, without blocking or other optimizations.
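
As an illustration of what "just parallel for" means, here is a minimal sketch in plain C++ with OpenMP. It is not the fn::backend::naive implementation; the function name laplacian_naive and the 1D stencil are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 1D Laplacian, NOT GridTools API: each interior point is
// computed independently, so a plain "parallel for" is sufficient.
// There is no blocking, tiling, or other optimization.
inline void laplacian_naive(const std::vector<double>& in, std::vector<double>& out) {
    assert(in.size() == out.size());
    const long n = static_cast<long>(in.size());
    // Without OpenMP enabled the pragma is ignored and the loop runs serially.
#pragma omp parallel for
    for (long i = 1; i < n - 1; ++i) {
        const std::size_t k = static_cast<std::size_t>(i);
        out[k] = in[k - 1] - 2.0 * in[k] + in[k + 1];
    }
}
```

Boundary points are left untouched by this sketch. Compiling with an OpenMP flag (for example -fopenmp on GCC or Clang) distributes the loop iterations across threads.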

SID util to transform a dimension to a tuple_like element type (#1750)

Translates a SID with dimension D and element type T into a SID with D removed and a tuple<T>-like element type of tuple_size N, via sid::dimension_to_tuple_like<D, N>(sid).
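
The idea of folding a dimension into a tuple-like element can be sketched in plain C++ without any GridTools headers. The types and names below (strided_2d, gather_dim) are hypothetical and only mimic the concept. The sketch copies the values into a std::array, which may differ from what the actual utility does.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical strided view, NOT a GridTools SID:
// element (i, d) lives at data[i * stride_i + d * stride_d].
template <class T>
struct strided_2d {
    const T* data;
    std::ptrdiff_t stride_i;
    std::ptrdiff_t stride_d;
};

// Fold dimension d (extent N) into one tuple-like element per index i.
// std::array<T, N> is tuple-like: std::tuple_size and std::get work on it.
template <std::size_t N, class T>
std::array<T, N> gather_dim(const strided_2d<T>& v, std::ptrdiff_t i) {
    assert(v.data != nullptr);
    std::array<T, N> out{};
    for (std::size_t d = 0; d < N; ++d) {
        out[d] = v.data[i * v.stride_i + static_cast<std::ptrdiff_t>(d) * v.stride_d];
    }
    return out;
}
```

The same buffer can be read with different strides: with stride_i = 2 and stride_d = 1 the components of one point are adjacent, while with stride_i = 1 and stride_d = 3 each component is stored as a separate contiguous block of three points.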

Bug fixes and smaller features

  • fn: allow execution of stencils with 0d domain (#1728)
  • Make pybind11::buffer sid copyable (#1755)

Build fixes

  • Support for Clang 16 (#1751)

and other changes already included in v2.2.3

fn: SID neighbor table wrapper (#1730)

Adds a simple class that wraps a SID and implements the neighbor table concept. (Picked for convenience into 2.2.2.)

Support for Python packaging (#1720)

Starting with this release we will publish GridTools C++ on pypi.org to make it easier to consume GridTools C++ from GT4Py.

Bug fixes

  • Fix CUDA 12.0 compilation (#1741)
  • Improvements to Python packaging (#1742, #1743, #1744)
  • Fix get_keys of empty hymap (#1728)
  • fn: CUDA early exit on empty grid - an empty domain skips execution instead of erroring (#1729)
  • fn: prefer qualified names over ADL for fn builtins (they are not customization points for the user) (#1731, #1732)
  • Enable workarounds for CUDA 11.8 (#1734)
  • Enable workarounds for Clang 15 (#1735)
  • Update pybind11 version to fix wrong C++ standard (#1723)
  • Fix perfect forwarding in sid::composite::make_values (#1722)
  • Workaround for NVCC bug in gcl (present in 11.6, 11.7 and most likely in 11.8) (#1726)

Performance fixes

  • Alternative skip value check in fn, which improves CUDA performance (#1721)

Build fixes

  • Fix perftests CMake target when no tests are added (#1724)

Cleanup

  • Replace boost::variant by std::variant (#1718)

CI

Contributions

This release contains contributions from
@DropD, @egparedes, @fthaler, @havogt, @petiaccja

GridTools version 2.0.1

18 Apr 10:43
b817808

Bug fixes

  • Fix: storage_gpu for HIPCC-AMDGPU (#1540)
  • Performance fix for C++17 (#1618)
  • Enable several CUDA workarounds for recent compilers (#1681 and others)
  • Change some declarations to definitions
  • Workaround gtest incompatibilities in recent compilers

CI

  • Compile (and run) all tests on GitHub actions (Jenkins doesn't run on v2.0.x anymore)

GridTools version 2.2.3

15 Feb 08:55

Bug fixes

CI

GridTools version 2.2.2

12 Dec 10:30

fn: SID neighbor table wrapper (#1730)

Adds a simple class that wraps a SID and implements the neighbor table concept. (Picked for convenience into 2.2.2.)

Support for Python packaging (#1720)

Starting with this release we will publish GridTools C++ on pypi.org to make it easier to consume GridTools C++ from GT4Py.

Bug fixes

  • Fix get_keys of empty hymap (#1728)
  • fn: CUDA early exit on empty grid - an empty domain skips execution instead of erroring (#1729)
  • fn: prefer qualified names over ADL for fn builtins (they are not customization points for the user) (#1731, #1732)
  • Enable workarounds for CUDA 11.8 (#1734)
  • Enable workarounds for Clang 15 (#1735)

Build fixes

  • Fix perftests CMake target when no tests are added (#1724)

GridTools version 2.2.1

04 Aug 08:57

Bug fixes

  • Update pybind11 version to fix wrong C++ standard (#1723)
  • Fix perfect forwarding in sid::composite::make_values (#1722)
  • Workaround for NVCC bug in gcl (present in 11.6, 11.7 and most likely in 11.8) (#1726)

Performance fixes

  • Alternative skip value check in fn, which improves CUDA performance (#1721)

Cleanup

  • Replace boost::variant by std::variant (#1718)

GridTools version 2.2.0

06 Jul 10:36
240e8b0

C++ standard upgraded to C++17

Starting with this version of GridTools, we require the C++17 standard (#1680) and have improved the code base using C++17 features (#1693, #1716, #1697):

  • Get rid of tuple_util::make
  • GT_CONSTEXPR and GT_CONSTEXPR_TARGET go away
  • wstd stuff goes away
  • is_trivially_copy_constructible is now used consistently instead of is_trivially_copyable wherever data is passed across the host/device boundary, because it is exactly what is needed.
  • make_[smth] pattern is replaced by template argument deduction in several places; the old pattern is deprecated
  • composite is rewritten using C++17
  • overload is rewritten using C++17
  • std::[smth]_v<...> are used instead of std::[smth]<...>::value
  • static_assert(<cond>) used instead of static_assert(<cond>, "")
  • CTAD for simple_ptr_holder (#1701, #1708)
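
Several of the items above are plain C++17 idioms. The sketch below uses only the standard library to show the difference between the two copy traits, together with the _v trait shorthand and the message-less static_assert. The type handle is hypothetical and not part of GridTools.

```cpp
#include <type_traits>

// Hypothetical type: its copy constructor is trivial, but the user-provided
// copy assignment makes the type as a whole not trivially copyable.
struct handle {
    int* ptr;
    handle() = default;
    handle(const handle&) = default;
    handle& operator=(const handle& other) {
        ptr = other.ptr;
        return *this;
    }
};

// C++17: std::[smth]_v<...> instead of std::[smth]<...>::value,
// and static_assert without a message string.
static_assert(std::is_trivially_copy_constructible_v<handle>);
static_assert(!std::is_trivially_copyable_v<handle>);
```

A type like this can still be copy-constructed bitwise, which is why the weaker trait accepts it while the stricter one rejects it.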

If you were using functionality from the internal library common you might have to update your code (all of common is considered internal API, see Release process). The most common change is using CTAD instead of makers where possible. If not possible due to compiler bugs, the maker pattern was updated to be independent of tuple_util::make. E.g. replace

  • tuple_util::make<tuple>(...) by tuple(...)
  • tuple_util::make<array>(...) by array(...)
  • sid::composite::make<...>(...) by sid::composite::keys<...>::make_values(...)
  • tuple_util::make<hymap::keys<...>::values>(...) by hymap::keys<...>::make_values(...)
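
The same migration from a maker function to CTAD can be sketched with the standard library. Here std::tuple and std::array stand in for the GridTools tuple and array types, and make_my_tuple is a hypothetical pre-C++17 maker, not a GridTools function.

```cpp
#include <array>
#include <tuple>
#include <type_traits>

// Hypothetical pre-C++17 maker: the function deduces the element types
// because class templates could not deduce them from constructor arguments.
template <class... Ts>
constexpr std::tuple<Ts...> make_my_tuple(Ts... xs) {
    return std::tuple<Ts...>(xs...);
}

constexpr auto a = make_my_tuple(1, 2.5);  // old maker pattern
constexpr std::tuple b(1, 2.5);            // C++17 CTAD, no maker needed
constexpr std::array c{1, 2, 3};           // deduces std::array<int, 3>

static_assert(std::is_same_v<decltype(a), decltype(b)>);
static_assert(std::is_same_v<decltype(c), const std::array<int, 3>>);
```

Both spellings yield the same type, so call sites can usually be migrated one at a time.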

New library fn: functional model backend

The fn library provides functionality for the Declarative GT4Py to implement a backend for the functional model. It supports (naive, no-blocking) CPU and (efficient) GPU (CUDA) execution for structured (Cartesian) and unstructured grids. See examples in tests/regression/fn/.
The library provides a high-level, human-readable frontend, but is mainly meant as a target for code generators.

  • Introduce functional model backend (#1648, #1666, #1679)
  • Implements fn::extents (#1683)
  • Column Stage (#1685)
  • New Backend Backends (#1695)
  • Fn Frontend (#1698)
  • Performance References for FN Backends (#1711)
  • Add fn::tuple_get and fn::make_tuple (#1713)
  • Allow setting CUDA stream (#1712)

Minor new features

  • int_vector library (#1672)
  • add conversion assign to hymap (#1702)

Minor improvements

  • Extensions to meta and hymap (#1663)
  • Soften sid value type requirements from std::trivially_copyable to std::trivially_copy_constructible (#1663)
  • is_tuple_like (#1676) and is_hymap (#1677)

Bug fixes

  • Propagate CXX_STANDARD to all tests (#1664)
  • Compilation fixes for nvcc 11.5 and clang 12 with std=c++20 (#1665)
  • Workaround for nvcc bug https://godbolt.org/z/orrev1xnM (#1681)
  • c_bindings example: fix typo and split cpu and gpu fortran sources (#1684)
  • Fix unused param warnings (#1706)
  • Fix Compilation with CUDA 11.6 (#1710)
  • Support for Clang 14 (#1707)

Testing

  • Add C++20 with Cray Clang on Piz Daint to Jenkins CI (#1675)
  • Perftest Updates (#1690)
  • CI dom: Downgrade to gcc 10.3 for CUDA toolkit support (#1699)

Contributions

This release contains contributions from
@anstaf, @fthaler, @havogt.