Releases: bsc-pm/ompss-2-releases

OmpSs-2 2024.05

16 May 12:12

Version 2024.05, Thu May 16, 2024

The OmpSs-2 2024.05 release includes the Directory/Cache (D/C) for Host and CUDA devices in Nanos6, several new features for the nOS-V tasking library, and performance improvements and bug fixes. The libompv runtime in LLVM/OpenMP adds an implementation of OpenMP free-agents and instrumentation through ovni. This release removes support for the Mercurium compiler.

Nanos6

  • Add directory/cache (D/C) for Host and CUDA devices
  • Add device memory allocation API for D/C-managed memory
  • Improvements to the ovni instrumentation

nOS-V

  • New batch submission API, which can accumulate tasks to submit them in batch once a certain threshold is reached
  • Add nosv_mutex_t and nosv_barrier_t as nOS-V aware alternatives to their pthread counterparts
  • Add instrumentation points for the nosv_attach and nosv_detach calls
  • Add instrumentation for parallel tasks
  • Activate the turbo.enabled configuration option by default, enabling flush-to-zero in x86-64 and aarch64
  • Perform safety checks when the turbo.enabled configuration option is set to verify FPU flags are not modified by external libraries
  • Split instrumentation events for the scheduler to allow them to be more granularly controlled
  • Allow nOS-V programs to call fork() without leaving the forked process in an incoherent state
  • Other bugfixes and improvements
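
The batch submission behavior described above (accumulate tasks, submit them together once a threshold is reached) can be illustrated with a minimal, self-contained sketch. This is not the nOS-V API: `batch_t`, `batch_add`, `batch_flush`, `BATCH_CAPACITY` and the `count_flush` sink are all hypothetical names chosen for illustration, and tasks are represented by plain integer ids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical threshold: number of tasks accumulated before a flush. */
#define BATCH_CAPACITY 4

/* Callback that receives a whole batch of task ids at once. */
typedef void (*flush_fn)(const int *tasks, size_t count, void *ctx);

typedef struct {
    int pending[BATCH_CAPACITY]; /* task ids waiting to be submitted */
    size_t count;                /* number of valid entries in pending */
    flush_fn flush;              /* sink that performs the real submission */
    void *ctx;                   /* opaque argument passed to the sink */
} batch_t;

static void batch_init(batch_t *b, flush_fn flush, void *ctx)
{
    b->count = 0;
    b->flush = flush;
    b->ctx = ctx;
}

/* Submit whatever is pending, even if the threshold was not reached. */
static void batch_flush(batch_t *b)
{
    if (b->count > 0) {
        b->flush(b->pending, b->count, b->ctx);
        b->count = 0;
    }
}

/* Accumulate one task; flush automatically when the batch is full. */
static void batch_add(batch_t *b, int task)
{
    assert(b->count < BATCH_CAPACITY);
    b->pending[b->count++] = task;
    if (b->count == BATCH_CAPACITY)
        batch_flush(b);
}

/* Example sink: only counts how many flushes and tasks it has seen. */
typedef struct {
    size_t flushes;
    size_t tasks;
} stats_t;

static void count_flush(const int *tasks, size_t count, void *ctx)
{
    stats_t *s = ctx;
    (void)tasks;
    s->flushes++;
    s->tasks += count;
}
```

With a capacity of 4, adding ten tasks triggers two automatic flushes, and a final explicit `batch_flush` call submits the remaining two. The point of the design is that the cost of one submission is amortized over several tasks.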

NODES

  • Improve the error-handling of nOS-V return codes
  • Improve descriptiveness of ovni instrumentation
  • Various improvements related to API integrations (nOS-V, ALPI, ovni)

LLVM/OpenMP (libompv)

  • Implement the OpenMP free-agents feature, enabled by setting OMP_ENABLE_FREE_AGENTS=1 and OMP_WAIT_POLICY=passive
  • Add instrumentation through ovni, enabled by setting OMP_OVNI=1 and enabling ovni instrumentation in nOS-V

LLVM/Clang

  • Add OPENMP_RUNTIME environment variable to choose the runtime library to link against
  • Other bugfixes and improvements

Ovni

  • New ovni_thread_require function to enable emulation models
  • Streams are marked as finished when calling ovni_thread_free
  • Support per-thread metadata
  • Add manual page for ovnidump
  • Add support for nosv_attach and nosv_detach events
  • Add support for nosv_mutex_lock, nosv_mutex_trylock, and nosv_mutex_unlock events
  • Add support for nosv_barrier events
  • Add OpenMP model to instrument the libompv implementation
  • Add new body model to support parallel tasks in nOS-V (taskfor directive)
  • Fix Paraver cfgs for macOS
  • Other bugfixes and improvements

OmpSs-2 2023.11

22 Nov 16:08

Version 2023.11, Wed Nov 22, 2023

The OmpSs-2 2023.11 release includes performance improvements and bug fixes for the runtime systems, several new features for the nOS-V tasking library, and performance improvements in the taskiter construct implementation. It also implements ALPI (version 1.0) in the runtime systems, which provides support for task-aware libraries. LLVM/OpenMP includes a new OpenMP runtime called OpenMP-V (libompv) that works on top of the nOS-V tasking library. A new instrumentation library called Sonar is provided to instrument MPI function calls through ovni.

General

  • The OmpSs-2 runtime systems expose the ALPI generic low-level tasking interface

Nanos6

  • Implement the ALPI interface (version 1.0)
  • Allow embedding jemalloc allocator
  • Embed hwloc and jemalloc by default
  • Add devices.cuda.prefetch config option to control CUDA prefetching of data dependencies (enabled by default)
  • Install the nanos6.toml config file in $prefix/share
  • Remove obsolete instrument.h public interface
  • Remove obsolete stats and graph instrumentations
  • Remove software dependency on libunwind and elfutils
  • Fix execution when enabling extrae instrumentation
  • Remove memory leaks
  • Various bugfixes and corrections

nOS-V

  • Implement the ALPI interface (version 1.0)
  • Add misc.stack_size config option to change the stack size of nOS-V threads
  • Add ovni.level config option for fine-grained instrumentation control
  • Change nosv_attach API to not require an explicit task type and support multiple attaches
  • Implement parallel tasks which can be executed on multiple CPUs at once
  • Allow calling nosv_init and nosv_shutdown multiple times
  • Change error handling to return custom nOS-V error codes
  • Allow early wake of deadline tasks with nosv_submit passing the NOSV_SUBMIT_DEADLINE_WAKE flag
  • Add compatibility layer for calls to sched_get/setaffinity and pthread_get/setaffinity
  • Add instrumentation points for the nosv_create and nosv_destroy APIs
  • Various bugfixes and corrections

NODES

  • Improve performance of the taskiter construct
  • Fix several bugs of the taskiter implementation
  • Ensure nOS-V library is at the first level of dependencies
  • Use the updated attach/detach from nOS-V 2.0
  • Drop support for nOS-V versions older than 2.0

LLVM/OpenMP

  • Provide OpenMP runtime named OpenMP-V (libompv) working over the nOS-V tasking library (-fopenmp=libompv)
  • Make OpenMP-V runtime compatible with task-aware libraries
  • Drop support for task-aware libraries in vanilla OpenMP runtime libomp

LLVM/Clang

  • Fix task data dependencies' calculation for long double types

Ovni

  • Add OVNI_TRACEDIR envar to change the trace directory (default is ovni)
  • Add the ovniver program to report the libovni version and commit
  • Add ovni_version_get() function
  • Add nOS-V API subsystem events for nosv_create() and nosv_destroy()
  • Add TAMPI model with T code, subsystem events and cfgs
  • Add MPI model with M code, function events and cfgs
  • Don't hardcode destination directory names like lib, so that the ones used on the destination host (like lib64) apply
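
The OVNI_TRACEDIR entry above follows the common pattern of an environment variable overriding a built-in default. A minimal sketch of that lookup is shown below. The helper `trace_dir` is hypothetical and is not part of libovni. Treating an empty value the same as an unset variable is also an assumption made here, not something the release notes state.

```c
#include <stdlib.h>

/* Hypothetical helper (not libovni code): return the trace directory,
 * honoring OVNI_TRACEDIR and falling back to the documented default
 * "ovni". Assumption: an empty value is treated like an unset one. */
static const char *trace_dir(void)
{
    const char *dir = getenv("OVNI_TRACEDIR");

    if (dir == NULL || dir[0] == '\0')
        return "ovni";

    return dir;
}
```

In a shell, this corresponds to running the traced program with the variable set in its environment, for example pointing it at a per-job scratch directory.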

Sonar

  • Introduce the Sonar library that uses ovni for instrumenting MPI functions

Task-Aware Libraries

  • Leverage the ALPI interface instead of the Nanos6-specific interface
  • Drop support for OmpSs-2 versions older than 2023.11
  • See other features and fixes in each task-aware library's CHANGELOG

OmpSs-2 2023.05.1

24 Jul 15:34

OmpSs-2 2023.05.1, Mon Jul 24, 2023

The OmpSs-2 2023.05.1 release includes several bug fixes and improvements with respect to the OmpSs-2 2023.05 release. These bug fixes are listed at the end of these release notes.

The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.

General

  • Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
  • Add performance and usability improvements in Nanos6
  • Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
  • Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
  • Drop support for the task for clause
  • Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead

NODES Runtime and nOS-V Tasking Library

  • Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
  • Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
  • Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively
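
As a rough illustration of the taskiter construct and the runtime selection described above, the sketch below wraps an iterative loop whose two tasks depend on each other across iterations. The function `damped_sum` and its arithmetic are hypothetical. The pragma spelling follows OmpSs-2 syntax as we understand it, so check it against the OmpSs-2 specification before relying on it. A compiler without OmpSs-2 support ignores the pragmas and runs the loop sequentially with the same result.

```c
#include <assert.h>

/* Hypothetical iterative kernel. Each iteration creates two tasks whose
 * dependencies (a -> b -> a) repeat identically on every iteration, which
 * is the kind of pattern the taskiter construct is meant to exploit.
 *
 * Assumed build line for the NODES runtime, using the option values named
 * in these release notes:
 *     clang -fompss-2=libnodes taskiter.c
 */
static double damped_sum(int iterations)
{
    double a = 1.0;
    double b = 0.0;

    assert(iterations >= 0);

    #pragma oss taskiter
    for (int it = 0; it < iterations; it++) {
        /* Reads a, produces b. */
        #pragma oss task in(a) out(b)
        b = 0.5 * a;

        /* Reads b, produces the a consumed by the next iteration. */
        #pragma oss task in(b) out(a)
        a = b + 1.0;
    }
    /* Wait for all tasks before reading the result. */
    #pragma oss taskwait

    return a;
}
```

Passing -fompss-2=libnanos6 (the default value) instead would select the Nanos6 runtime, which this release does not describe as implementing taskiter.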

Task-Aware Libraries

  • Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
  • Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
  • Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications

ovni Instrumentation

  • Add new graph-based design in ovni to support complex models like the new breakdown timeline

Changes with respect to the 2023.05 release

The OmpSs-2 2023.05.1 includes the following bug fixes and improvements with respect to the 2023.05 version:

Nanos6 Runtime

  • Fix CUDA kernel launch configuration and improve performance of OmpSs-2@CUDA support
  • Allow failures at CUDA prefetching without aborting the execution
  • Fix linking with jemalloc when --as-needed linking flag is used
  • Improve testing infrastructure and programs
  • Update documentation regarding OmpSs-2@CUDA support
  • Improve general documentation

LLVM/OpenMP Runtime

  • Fix potential use-after-free in the OpenMP polling tasks mechanism

LLVM/Clang Compiler

  • Fix unconditional break inside a for-loop which is encapsulated in a task
  • Fix device tasks call order when capturing more information in other clauses
  • Add support for the shmem clause in device tasks

OmpSs-2 2023.05

24 May 11:01

OmpSs-2 2023.05, Wed May 24, 2023

The OmpSs-2 2023.05 release includes new software projects and several performance and usability improvements for the OmpSs-2 programming model. In the context of OmpSs-2, this release introduces the new NODES runtime system supporting OmpSs-2, a novel and efficient tasking library named nOS-V, new Task-Aware libraries for interoperability with GPU offloading models, and new features in the ovni instrumentation library.

General

  • Improve support for ovni instrumentation in the Nanos6 runtime and support for the idle CPUs view
  • Add performance and usability improvements in Nanos6
  • Allow embedding hwloc library into Nanos6 to avoid conflicts with other third-party software that use different hwloc versions
  • Add support for atomic and critical OmpSs-2 directives in the LLVM/Clang compiler
  • Drop support for the task for clause
  • Mercurium, the legacy OmpSs-2 compiler, is no longer supported and will not receive new OmpSs-2 features. Use the LLVM/Clang compiler instead

NODES Runtime and nOS-V Tasking Library

  • Introduce the new low-level nOS-V threading and tasking library, enabling co-execution of applications
  • Introduce the new NODES runtime system, built on top of nOS-V, that supports the OmpSs-2 model. This runtime implements the taskiter construct and leverages directed cyclic task graphs (DCTG) to optimize the execution of iterative applications
  • Extend -fompss-2 option from LLVM/Clang to choose between Nanos6 and NODES runtimes by accepting the option values libnanos6 (default) and libnodes, respectively

Task-Aware Libraries

  • Introduce the new Task-Aware CUDA (TACUDA), Task-Aware HIP (TAHIP) and Task-Aware SYCL (TASYCL) libraries. These task-aware libraries seamlessly integrate the CUDA, HIP and SYCL APIs for GPU offloading with the OmpSs-2 and OpenMP tasking models
  • Add performance improvements and bug fixes in the Task-Aware MPI (TAMPI) and Task-Aware GASPI (TAGASPI) communication libraries
  • Extend Task-Aware MPI (TAMPI) to support ovni instrumentation and allow tracing of multi-node hybrid MPI+OmpSs-2 applications

ovni Instrumentation

  • Add new graph-based design in ovni to support complex models like the new breakdown timeline