25 Oct 13:09

havogt

a8039cb

GridTools version 2.1.0

New features

Dump backend: outputs a json representation of the stencil specification (#1456)
Reduction library with naive, CPU and GPU backends (#1590, #1594, #1619)
SID: Python cuda array interface support (#1596)

Extended features

Support for compile time length in data stores (#1545)
Several SID improvements (#1548)
Structured bindings support for gridtools tuple-like (#1556)
Improvements for Hugepage Allocation (#1562)
Add protection against misuse of device namespace (#1581)
fortran_array_view: allow disabling OpenACC (#1603)
Introduce sid::unknown_kind (#1605)
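
How a tuple-like type can opt into structured bindings may be sketched with the standard C++17 protocol (std::tuple_size, std::tuple_element and a get function). The snippet below is not GridTools code: demo::point, its get member and sum are hypothetical names chosen for illustration, and the actual change in #1556 may be implemented differently.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

namespace demo {
    // Hypothetical tuple-like type with private members, so it is not
    // decomposable as a plain aggregate.
    class point {
        int x_;
        int y_;

      public:
        constexpr point(int x, int y) : x_(x), y_(y) {}

        // Member get<I>() is one of the two lookup forms used by structured bindings.
        template <std::size_t I>
        constexpr int get() const {
            static_assert(I < 2, "index out of range");
            return I == 0 ? x_ : y_;
        }
    };
} // namespace demo

// Opt in to the tuple protocol for the program-defined type.
namespace std {
    template <>
    struct tuple_size<demo::point> : integral_constant<size_t, 2> {};

    template <size_t I>
    struct tuple_element<I, demo::point> {
        using type = int;
    };
} // namespace std

// With the protocol in place, the type can be decomposed directly.
constexpr int sum(demo::point p) {
    auto [x, y] = p;
    return x + y;
}
```

With these specializations, sum(demo::point{1, 2}) evaluates to 3, also at compile time.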

Non-functional changes

Hold the sids within sid::composite as tuple (#1564)
Various cleanups and C++17-related changes (#1579)
C++17 versions of meta::fold (#1549)
Sid as a proper C++20 concept (#1580, #1582)

Performance

More Inlining in cpu_kfirst Backend (#1634)
Support for Compile-Time Unit Stride Dimension for Python SID Adapter (#1635, #1651)

Bug fixes

K-cache fixes (#1530)
CMake: Fix storage_gpu for HIPCC-AMDGPU (#1540)
Remove a hugepage_alloc warning about a problem that only affects testing code (#1560)
Improve HIP + OpenMP Compilation (#1578)
Fix empty composite and add composite::make helper (#1583)
Fix as_const to work with any SID and be compatible with std::as_const (#1601, #1611)
SID composite: add static_assert against incorrect kinds (#1604)
Workaround a CUDA problem: tuple_util::concat remove constexpr var (#1606)
Improve Compliance with Parallel Model: Limit fusion of k-parallel execution with k-offsets (#1612)
GCC 9.x: Optimize multishift (#1630)
Python SID adapter: fix integer format check (#1632)
GCC 11.x: Compilation fixes (#1641, #1646)
Fixes for CUDA 11.4 (#1644)

Testing

Update to GTest v1.11 and minor changes to adapt for changed gtest interface (#1655)

Documentation

Clarifications to the execution model (#1541)

Contributions

This release contains contributions from
@anstaf, @fthaler, @havogt, @lukasm91.

04 Oct 08:42

havogt

v1.1.4

4de50ec

GridTools version 1.1.4

Bug fixes

Speed up compile time (#1608)
Support for GPU backend with custom block sizes in boundary conditions (#1438)
Fix sid shift origin (#1517)

Compatibility with new compilers

Added support for GCC 11.x (#1652, #1654)
Fix for CUDA 11 (#1520)

31 Jul 09:28

havogt

v2.0.0

8101c64

GridTools version 2.0.0

GridTools v2.0.0

GridTools v2.0.0 comes with an improved API for stencil composition and storage construction.
These changes and a few others (see below) are breaking changes.

Changes since v1.1.0

New API: Stencil Composition

The make_computation API for composing stencils is replaced by a new stencil specification API, e.g.

auto horizontal_diffusion_spec = [](auto coeff, auto in, auto out) {
    GT_DECLARE_TMP(double, lap, flx, fly);
    return st::execute_parallel()
        .ij_cached(lap, flx, fly)
        .stage(lap_function(), lap, in)
        .stage(flx_function(), flx, in, lap)
        .stage(fly_function(), fly, in, lap)
        .stage(out_function(), out, in, flx, fly, coeff);
};

st::run(horizontal_diffusion_spec, stencil_backend_t(), grid, coeff, in, out);

instead of

auto horizontal_diffusion = gt::make_computation<backend_t>(grid,
    p_coeff{} = coeff,
    gt::make_multistage(gt::enumtype::execute<gt::enumtype::parallel, 20>{},
        define_caches(gt::cache<gt::IJ, gt::cache_io_policy::local>(p_lap{}, p_flx{}, p_fly{})),
        gt::make_stage<lap_function>(p_lap{}, p_in{}),
        gt::make_independent(gt::make_stage<flx_function>(p_flx{}, p_in{}, p_lap{}),
            gt::make_stage<fly_function>(p_fly{}, p_in{}, p_lap{})),
        gt::make_stage<out_function>(p_out{}, p_in{}, p_flx{}, p_fly{}, p_coeff{})));

horizontal_diffusion.run(p_in{} = in, p_out{} = out);

See the documentation and examples for details about the new API.

Related PRs: #1388

New API: Storage Builder

Datastores are now created using a builder API, e.g.

auto storage_builder = gt::storage::builder<storage_traits_t>.dimensions(d1, d2, d3).halos(halo, halo, 0);

auto in = storage_builder.type<double const>().value(42).build();
auto coeff = storage_builder.type<double const>().value(42).build();
auto out = storage_builder.type<double>().build();

The type returned by the builder is a shared_ptr to a data_store (previously the shared_ptr was inside the data_store).

Other storage related changes:

Memory alignment is applied in bytes (instead of in elements).
Host/device buffers are automatically synchronized on creation of views or on access of the underlying pointer (the sync method is removed).
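
The difference between element-based and byte-based alignment can be illustrated with a small worked calculation. This is only a sketch under the assumption that the alignment value is used to pad the length of a row; the helper names below are hypothetical and are not GridTools functions.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; they are not part of GridTools.

// Round `n` up to the next multiple of `multiple` (which must be non-zero).
constexpr std::size_t round_up(std::size_t n, std::size_t multiple) {
    return (n + multiple - 1) / multiple * multiple;
}

// Alignment given in elements: the row length itself becomes a multiple
// of the alignment value.
constexpr std::size_t padded_length_element_aligned(
    std::size_t n_elements, std::size_t alignment_elements) {
    return round_up(n_elements, alignment_elements);
}

// Alignment given in bytes: the row size in bytes becomes a multiple of the
// alignment value. Assumes alignment_bytes is a multiple of element_size.
constexpr std::size_t padded_length_byte_aligned(
    std::size_t n_elements, std::size_t element_size, std::size_t alignment_bytes) {
    return round_up(n_elements * element_size, alignment_bytes) / element_size;
}

// A row of 10 elements of 8 bytes each, with an alignment value of 64:
static_assert(padded_length_element_aligned(10, 64) == 64,
    "element-based: padded to 64 elements (512 bytes)");
static_assert(padded_length_byte_aligned(10, 8, 64) == 16,
    "byte-based: padded to 128 bytes (16 elements)");
```

Under this assumption, the same numeric alignment value leads to much less padding when it is interpreted in bytes, so alignment values carried over from v1.x code may need to be revisited.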

See the documentation and examples for details about the new API.

Related PRs: #1388, #1534

API break: New Backend names

Our previous backend names (cuda, mc, x86) were a source of confusion: users often had a fixed (but wrong) idea of, for example, when to use x86.

The new names are (#1490):

gpu instead of cuda as the same backend works for HIP.
cpu_kfirst instead of x86, the innermost dimension is k, suitable for vertical stencils and architectures that emphasize caches over vector instructions.
cpu_ifirst instead of mc, the innermost dimension is i, suitable for modern CPUs where vector instructions are key for performance.

Additionally we introduced a new backend gpu_horizontal (#1445) which works only for pure horizontal (parallel) stencils.
Performance of gpu_horizontal is better than that of gpu for most stencils; however, we recommend benchmarking both backends.
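
What "innermost dimension" means for the two CPU backends can be sketched with plain stride arithmetic. The code below is an illustration only, not the layout code used by GridTools: strides3 and the helper functions are hypothetical, the ordering of the two outer dimensions is an assumption, and real storages may add padding.

```cpp
#include <cstddef>

// Hypothetical stride triple for a 3D field (illustration only).
struct strides3 {
    std::size_t i, j, k;
};

// k innermost (as described for cpu_kfirst): neighbours in k are adjacent in memory.
// Assumed ordering of the outer dimensions: i outermost, then j.
constexpr strides3 strides_k_innermost(std::size_t /*ni*/, std::size_t nj, std::size_t nk) {
    return {nj * nk, nk, 1};
}

// i innermost (as described for cpu_ifirst): neighbours in i are adjacent in memory.
// Assumed ordering of the outer dimensions: k outermost, then j.
constexpr strides3 strides_i_innermost(std::size_t ni, std::size_t nj, std::size_t /*nk*/) {
    return {1, ni, ni * nj};
}

// Linear offset of element (i, j, k) for the given strides.
constexpr std::size_t offset(strides3 s, std::size_t i, std::size_t j, std::size_t k) {
    return i * s.i + j * s.j + k * s.k;
}

// For a 4 x 3 x 2 field, one step in k is a unit step only in the k-innermost layout,
// and one step in i is a unit step only in the i-innermost layout.
static_assert(offset(strides_k_innermost(4, 3, 2), 0, 0, 1) == 1, "unit stride in k");
static_assert(offset(strides_i_innermost(4, 3, 2), 1, 0, 0) == 1, "unit stride in i");
```

A loop that walks along the unit-stride dimension touches consecutive memory, which is why a vertical (k-direction) stencil fits the k-innermost layout, while vectorizing over i fits the i-innermost layout.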

Other API breaking changes

Backend declarations (traits) are removed from common/defs.hpp and are now provided in component specific headers for stencil, timer, gcl and storage (#1388).
We improved the code structure by introducing finer-grained namespaces (#1388)
The storage repository was removed (#1456)

New functionality

New sid::rename_dimensions (#1533)
New regression test illustrating c-arrays as SIDs (#1525)
A Python SID adapter including regression test for calling computations from Python (#1523)
Introduced the threadpool concept (#1484, #1498, #1504) and added an HPX threadpool (#1437)
Added an example for calling CUDA GridTools computations from Fortran with OpenACC (#1454)

Improved functionality

GCL is now header-only (so all of GridTools is now header-only)
The CMake build scripts are rewritten, see the documentation and examples for how to use GridTools CMake targets (#1421, #1441, #1442, #1450, #1509)

Bug Fixes / Cleanup

Fixes to SID concept helpers (#1524, #1527, #1531)
Fixes for CUDA 11 (#1529), thanks @lukasm91
Fixes for HIP compilation (#1488)
Better error diagnostics at the frontend (#1495)
Performance tests are now included in a single binary (#1453)
Layout transformations are refactored (#1388)
and many other small fixes

Infrastructure/Development

Environments are renamed to describe more precisely what they are (#1507)
Added testing on the new MeteoSwiss machine Tsa to Jenkins (#1452)
Moved tests from Travis to GitHub Actions (#1446), added tests for different CMake setups (#1443)
Added a Gitpod configuration (#1423)
Added testing with Clang-based Cray compiler on Daint (#1382)

Contributions

This release contains contributions from
@anstaf, @fthaler, @havogt, @jdahm, @lukasm91, @mbianco, @tehrengruber, @wdeconinck.

29 Jul 07:22

havogt

v2.0.0rc2

50c5e50

GridTools version 2.0.0rc2 Pre-release

see final release

15 Jun 12:09

havogt

v2.0.0rc1

53910ee

GridTools version 2.0.0rc1 Pre-release

see final release

20 Jan 14:52

havogt

v1.1.3

d33fa6f

GridTools version 1.1.3

Performance fixes

Revert a #pragma unroll to be optimal for the COSMO dycore on V100 (#1400)

Other

CMake: Add a missing policy workaround_mpi.cmake (#1398)

12 Dec 08:12

havogt

v1.0.4

f45026d

GridTools version 1.0.4

Fixes

CMake: support for superbuilds (nesting gridtools with add_subdirectory/FetchContent) #1383

12 Dec 08:16

havogt

v1.1.2

6858804

GridTools version 1.1.2

Support for new targets

Support for clang-CUDA and HIP (#1361)

Fixes

Support custom block size in storage traits (#1392)
Add GT_FUNCTION to storage_info
CMake: export compilation type (#1387)

Infrastructure

Update testing environment after Piz Daint upgrade (squash of #1369, #1371, #1373, #1382)

06 Dec 11:48

havogt

v1.1.1

7cdf89a

GridTools version 1.1.1

Fixes

Make computation API thread compatible by making the allocator thread_local (#1380).
CMake: fix to make GridTools work as nested project in a "superbuild" setup.

07 Oct 12:02

havogt

v1.1.0

12ee091

GridTools version 1.1.0

GridTools

In GridTools v1.1.0 we set the default C++ standard to C++14 and drop compatibility with C++11. This requires at least CUDA 9.0.

Changes since v1.0.0

Full introduction of the SID concept

The backend is completely restructured based on the SID (stencil iterable data) concept. There should be no user-facing changes as long as user code only used the documented public API (*). The changes separate the backend implementation from the core library, which allows non-intrusive extension of the library with new backends. Additionally, the maintainability of the GridTools infrastructure is significantly improved.
Performance should improve in general, but might be worse for specific computations. We did not observe a common pattern in the improvements or degradations.

(*) There is one change which might trigger different behavior (though the old behavior was not documented): temporary fields are now implicitly three-dimensional. Prior to this version, a user could have abused a 2D temporary field to accumulate values between k-levels.

New

New example illustrating the type-erasure pattern for computations. #1318

Deprecation (support will be removed in GridTools v2.0.0)

Using gridtools::c_bindings is deprecated. Switch to the standalone https://github.com/GridTools/cpp_bindgen.
global_accessor is deprecated, use in_accessor (without extents) instead.
make_global_parameter with backend as template parameter is deprecated. The backend is not needed anymore.

Fixes / Cleanup

Fix performance for CUDA 9.2 / 10.0 #1281 #1327 #1339
Use C++14 features. #1307
Use multiple threads in storage initialization. #1300
Remove dependency on boost::mpl and boost::fusion
Fixes required to compile GridTools with HIP-Clang. Full support for AMD GPUs via HIP-Clang will come in a future release. #1363
Fix a bug in communication #1355.
The global_parameter no longer requires pre-allocated storage (in the case of CUDA it is now passed via constant memory). It is therefore a lightweight wrapper around the value type, which can be created without overhead, e.g. when passing it to computation.run().

Infrastructure/Development

The bash build script is replaced by a Python-driven build process, see the wiki for how to get the environment. #1273 #1298 #1341
Improved Jenkins performance plots. #1301 #1338
Googletest is now pulled in with CMake's FetchContent instead of being part of the repository. #1310

Releases: GridTools/gridtools
