Releases: LLNL/RAJAPerf
v0.5.2 Release
This release contains updates to use RAJA v0.10.0 as well as some bug fixes.
Please download the RAJAPerf-v0.5.2.tar.gz file below. The others will not work due to the way RAJAPerf uses git submodules.
v0.5.1
This release contains several documentation updates and changes to default CMake configuration so that RAJA tests and examples are not built unless explicitly turned on by passing the proper options to CMake.
Please download the RAJAPerf-v0.5.1.tar.gz file below. The others will not work due to the way RAJAPerf uses git submodules.
v0.5.0
This release contains several new kernels, plus substantial changes to many CUDA kernel variants to improve performance.
Please download the RAJAPerf-v0.5.0.tar.gz file below. The others will not work due to the way RAJAPerf uses git submodules.
Major changes include:
- Several new kernels in the polybench group.
- Update to RAJA v0.8.0 release.
- Exercise newer RAJA features in kernels, such as loop tiling, thread local memory, and GPU shared memory in CUDA variants.
- Build scripts have been updated to use newer compilers available on Livermore Computing platforms.
v0.4.0
This release contains two new kernels, plus substantial changes to the build process and existing kernels.
Please download the RAJAPerf-0.4.0.tar.gz file below. The others will not work due to the way RAJAPerf uses git submodules.
Major changes include:
- Two new kernels: DAXPY and ADI.
- Update to a newer RAJA development version (SHA hash a59e7c4a...) to exercise newer RAJA features.
- All kernels with nested loops have been converted to the latest RAJA::kernel API, including the OpenMP target variants.
- Some kernels that use RAJA Views now explicitly specify which indexing dimension is stride-1 to take advantage of new internal RAJA optimizations.
- When building with OpenMP target enabled, all other kernel variants are disabled except for Base_Seq. This is a (hopefully) temporary change to avoid misinterpretation of kernel timings due to issues with some compilers that require disabling inlining to generate correct results. The executable will have "omptarget" in its name.
- A variety of newer build scripts have been added for Livermore Computing platforms.
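To illustrate what "stride-1" means for the View change listed above, here is a minimal sketch. The `View2D` struct is a hypothetical stand-in written for this example and is not RAJA's View type; it assumes a row-major layout, in which the last index is the stride-1 dimension.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical 2D view over a flat array (illustration only, not RAJA's View).
// Assumes row-major storage: element (i, j) lives at offset i * n_cols + j.
struct View2D {
  double* data;
  std::size_t n_rows;
  std::size_t n_cols;

  double& operator()(std::size_t i, std::size_t j) const {
    assert(i < n_rows && j < n_cols);
    return data[i * n_cols + j];
  }
};

// Distance in elements between neighbors along the column index j.
// For the row-major layout above this is 1, so j is the stride-1 dimension.
std::ptrdiff_t stride_in_j(const View2D& v) {
  return &v(0, 1) - &v(0, 0);
}

// Distance in elements between neighbors along the row index i.
// For the row-major layout above this equals n_cols.
std::ptrdiff_t stride_in_i(const View2D& v) {
  return &v(1, 0) - &v(0, 0);
}
```

When the stride-1 dimension is known, an inner loop over that index touches contiguous memory, which is the general motivation for stating it explicitly.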
v0.3.0
This release contains a few minor fixes to the 'help' documentation and fixes a significant timer issue in which asynchronous CUDA kernels were not being timed properly.
Please use the named release gzipped tarfile above. The others are generated by GitHub and will not work due to the way the Suite uses git submodules.
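The timer issue mentioned in this release is an instance of a general pitfall: if work is launched asynchronously and the timer is stopped before the work completes, only the launch overhead is measured. The sketch below illustrates the idea on the host with `std::async` standing in for an asynchronous kernel launch. It is not the Suite's timer code, and all function names are hypothetical. In CUDA, the wait would typically be a synchronization call such as `cudaDeviceSynchronize()`.

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an asynchronous kernel launch:
// returns right away while the "kernel" keeps running on another thread.
std::future<void> launch_async_work(std::chrono::milliseconds duration) {
  return std::async(std::launch::async,
                    [duration] { std::this_thread::sleep_for(duration); });
}

// Problematic pattern: the timer is stopped while the work is still running,
// so the result reflects launch overhead rather than execution time.
double timed_without_sync_ms(std::chrono::milliseconds duration) {
  auto start = Clock::now();
  auto work = launch_async_work(duration);
  auto stop = Clock::now();  // stopped too early
  work.wait();               // cleanup only; not included in the timing
  return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Corrected pattern: wait for the work to finish, then stop the timer.
double timed_with_sync_ms(std::chrono::milliseconds duration) {
  auto start = Clock::now();
  auto work = launch_async_work(duration);
  work.wait();               // analogous to a device synchronization
  auto stop = Clock::now();
  return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

On most systems the unsynchronized version reports a small fraction of the true duration, which is how such a bug can make asynchronous kernels look much faster than they are.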
v0.2.3
This release contains minor fixes to output text files based on initial feedback from some vendors who are working with the Suite.
Please use the named release gzipped tarfile above. The others are generated by GitHub and will not work due to the way the Suite uses git submodules.
v0.2.2
A few minor bug fixes and improvements. This version tracks the RAJA v0.5.2 release.
Please use the named release gzipped tarfile above. The others will not work due to the way the Suite uses git submodules.
v0.2.1
Updates to git submodules and fixed release tarball.
Please use the named release gzipped tarfile.
First formal release
The RAJA Performance Suite currently contains 29 kernels from a variety of sources, including other benchmarks and real applications. Each kernel includes multiple variants: sequential, OpenMP multithreading (CPU), CUDA (GPU), and OpenMP target offload (GPU, if available). Each of these appears in a baseline variant (C-style, programmed explicitly to the programming model) and a RAJA variant, where the selected programming model implementation is accessed via the RAJA portability abstraction layer.
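The difference between a baseline variant and an abstraction-layer variant can be sketched as follows, using a DAXPY-style loop (y += a * x). This is only an illustration of the pattern: `seq_policy` and `forall` below are hypothetical stand-ins defined in the example itself, not RAJA's API, and the function names are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a sequential execution policy (not RAJA's API).
struct seq_policy {};

// Hypothetical policy-dispatched loop: runs body(i) for i in [begin, end).
template <typename Body>
void forall(seq_policy, std::size_t begin, std::size_t end, Body&& body) {
  for (std::size_t i = begin; i < end; ++i) {
    body(i);
  }
}

// Baseline variant: a C-style loop written directly against the
// programming model (here, plain sequential execution).
void daxpy_base_seq(double a, const std::vector<double>& x,
                    std::vector<double>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < y.size(); ++i) {
    y[i] += a * x[i];
  }
}

// Abstraction-layer variant: the same loop body, passed as a lambda to a
// policy-dispatched loop. Changing the policy type would change the backend
// without touching the loop body.
void daxpy_layer_seq(double a, const std::vector<double>& x,
                     std::vector<double>& y) {
  assert(x.size() == y.size());
  const double* xp = x.data();
  double* yp = y.data();
  forall(seq_policy{}, 0, y.size(),
         [=](std::size_t i) { yp[i] += a * xp[i]; });
}
```

Both functions compute the same result; comparing their run times is the kind of baseline-versus-abstraction measurement the Suite is designed to make.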
Complete instructions for configuring and building the code, as well as the available options for running performance experiments, are described in the README file.
OpenMP Target variants should be considered a work-in-progress, especially kernels with nested loops and kernels with reductions.
RAJA variants of kernels with nested loops will change in the near future due to API changes and improvements in RAJA.