Releases: bsc-pm/ompss-2-releases
OmpSs-2 2024.05
Version 2024.05, Thu May 16, 2024
The OmpSs-2 2024.05 release includes the Directory/Cache (D/C) for Host and CUDA devices in Nanos6, several new features for the nOS-V tasking library, and performance improvements and bug fixes. The libompv runtime in LLVM/OpenMP includes the implementation of OpenMP free-agents and instrumentation through ovni. This release removes the support for the Mercurium compiler.
Nanos6
- Add directory/cache (D/C) for Host and CUDA devices
- Add device memory allocation API for D/C-managed memory
- Improvements to the ovni instrumentation
nOS-V
- New batch submission API, which can accumulate tasks to submit them in batch once a certain threshold is reached
- Add nosv_mutex_t and nosv_barrier_t as nOS-V aware alternatives to their pthread counterparts
- Add instrumentation points for the nosv_attach and nosv_detach calls
- Add instrumentation for parallel tasks
- Activate the turbo.enabled configuration option by default, enabling flush-to-zero in x86-64 and aarch64
- Perform safety checks when the turbo.enabled configuration option is set to verify FPU flags are not modified by external libraries
- Split instrumentation events for the scheduler to allow them to be more granularly controlled
- Allow nOS-V programs to call fork() without leaving the forked process in an incoherent state
- Other bugfixes and improvements
NODES
- Improve the error-handling of nOS-V return codes
- Improve descriptiveness of ovni instrumentation
- Various improvements related to API integrations (nOS-V, ALPI, ovni)
LLVM/OpenMP (libompv)
- Implement the OpenMP free-agents feature, enabled by setting OMP_ENABLE_FREE_AGENTS=1 and OMP_WAIT_POLICY=passive
- Instrument through ovni by setting OMP_OVNI=1 and enabling ovni instrumentation in nOS-V
LLVM/Clang
- Add OPENMP_RUNTIME environment variable to choose the runtime library to link against
- Other bugfixes and improvements
Ovni
- New ovni_thread_require function to enable emulation models
- Streams are marked as finished when calling ovni_thread_free
- Support per-thread metadata
- Add manual page for ovnidump
- Add support for nosv_attach and nosv_detach events
- Add support for nosv_mutex_lock, nosv_mutex_trylock, and nosv_mutex_unlock events
- Add support for nosv_barrier events
- Add OpenMP model to instrument the libompv implementation
- Add new body model to support parallel tasks in nOS-V (taskfor directive)
- Fix Paraver cfgs for Mac OS
- Other bugfixes and improvements
OmpSs-2 2023.11
Version 2023.11, Wed Nov 22, 2023
The OmpSs-2 2023.11 release includes performance improvements and bug fixes for the runtime systems, several new features for the nOS-V tasking library, and performance improvements in the taskiter construct implementation. It also implements ALPI (version 1.0) in the runtime systems, which provides support for task-aware libraries. LLVM/OpenMP includes a new OpenMP runtime called OpenMP-V (libompv) that works on top of the nOS-V tasking library. A new instrumentation library called Sonar is provided to instrument MPI function calls through ovni.
General
- The OmpSs-2 runtime systems expose the ALPI generic low-level tasking interface
Nanos6
- Implement the ALPI interface (version 1.0)
- Allow embedding jemalloc allocator
- Embed hwloc and jemalloc by default
- Add devices.cuda.prefetch config option to control CUDA prefetching of data dependencies (enabled by default)
- Install the nanos6.toml config file in $prefix/share
- Remove obsolete instrument.h public interface
- Remove obsolete stats and graph instrumentations
- Remove software dependency with libunwind and elfutils
- Fix execution when enabling extrae instrumentation
- Remove memory leaks
- Various bugfixes and corrections
nOS-V
- Implement the ALPI interface (version 1.0)
- Add misc.stack_size config option to change the stack size of nOS-V threads
- Add ovni.level config option for fine-grained instrumentation control
- Change nosv_attach API to not require an explicit task type and support multiple attaches
- Implement parallel tasks which can be executed on multiple CPUs at once
- Allow calling nosv_init and nosv_shutdown multiple times
- Change error handling to return custom nOS-V error codes
- Allow early wake of deadline tasks with nosv_submit passing the NOSV_SUBMIT_DEADLINE_WAKE flag
- Add compatibility layer for calls to sched_get/setaffinity and pthread_get/setaffinity
- Add instrumentation points for the nosv_create and nosv_destroy APIs
- Various bugfixes and corrections
NODES
- Improve performance of the taskiter construct
- Fix several bugs of the taskiter implementation
- Ensure nOS-V library is at the first level of dependencies
- Use the updated attach/detach from nOS-V 2.0
- Drop support for nOS-V versions older than 2.0
LLVM/OpenMP
- Provide OpenMP runtime named OpenMP-V (libompv) working over the nOS-V tasking library (-fopenmp=libompv)
- Make OpenMP-V runtime compatible with task-aware libraries
- Drop support for task-aware libraries in vanilla OpenMP runtime libomp
LLVM/Clang
- Fix task data dependencies' calculation for long double types
Ovni
- Add OVNI_TRACEDIR envar to change the trace directory (default is ovni)
- Add the ovniver program to report the libovni version and commit
- Add ovni_version_get() function
- Add nOS-V API subsystem events for nosv_create() and nosv_destroy()
- Add TAMPI model with T code, subsystem events and cfgs
- Add MPI model with M code, function events and cfgs
- Don't hardcode destination directory names like lib; use the ones in the destination host (like lib64)
Sonar
- Introduce the Sonar library that uses ovni for instrumenting MPI functions
Task-Aware Libraries
- Leverage the ALPI interface instead of the Nanos6-specific interface
- Drop support for OmpSs-2 versions older than 2023.11
- See other features and fixes in each task-aware library's CHANGELOG
OmpSs-2 2023.05.1
OmpSs-2 2023.05.1, Mon Jul 24, 2023
The OmpSs-2 2023.05.1 release includes several bug fixes and improvements with respect to the OmpSs-2 2023.05 release. These bug fixes are listed at the end of these release notes.
The OmpSs-2 2023.05 releases include new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for the task for clause
- Mercurium is the OmpSs-2 legacy compiler, not supported anymore, and will not provide new features for OmpSs-2. Use the LLVM/Clang compiler instead
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline
Changes with respect to the 2023.05 release
The OmpSs-2 2023.05.1 includes the following bug fixes and improvements with respect to the 2023.05 version:
Nanos6 Runtime
- Fix CUDA kernel launch configuration and improve performance of OmpSs-2@CUDA support
- Allow failures at CUDA prefetching without aborting the execution
- Fix linking with jemalloc when --as-needed linking flag is used
- Improve testing infrastructure and programs
- Update documentation regarding OmpSs-2@CUDA support
- Improve general documentation
LLVM/OpenMP Runtime
- Fix OpenMP potential use-after-free in polling tasks' mechanism
LLVM/Clang Compiler
- Fix unconditional break inside a for-loop which is encapsulated in a task
- Fix device tasks call order when capturing more information in other clauses
- Add support for the shmem clause in device tasks
OmpSs-2 2023.05
OmpSs-2 2023.05, Wed May 24, 2023
The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.
General
- Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
- Add performance and usability improvements in Nanos6
- Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
- Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
- Drop support for the task for clause
- Mercurium is the OmpSs-2 legacy compiler, not supported anymore, and will not provide new features for OmpSs-2. Use the LLVM/Clang compiler instead
NODES Runtime and nOS-V Tasking Library
- Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
- Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
- Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively
Task-Aware Libraries
- Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
- Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
- Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications
ovni Instrumentation
- Add new graph-based design in ovni to support complex models like the new breakdown timeline