- Fix rotate_left and rotate_right behavior (it was swapped!)
- Fix compress implementation on RISC-V
- Improve RISC-V CI
- Fix clang-17 compilation on RISC-V
- Validate cmake integration
- Provide xsimd::transpose on 64 and 32 bits on most platforms
- Improve documentation
- Provide xsimd::batch_bool::count
- Fix interaction between xsimd::make_sized_batch_t and xsimd::batch<std::complex, ...>
- Fix vbmi, sve and rvv detection through xsimd::available_architectures
- Fix compilation on MS targets where small can be defined.
- Change default install directory for installed headers.
- Support mixed-complex implementations of xsimd::pow()
- Improve xsimd::pow implementation for complex numbers
- Fix uninitialized read in lgamma implementation
- Most xsimd functions are flagged as always_inline
- Fix some xsimd scalar versions (abs, bitofsign, signbit, bitwise_cast, exp10)
- Move from batch_constant<batch<T, A>, Csts...> to batch_constant<T, A, Csts...>
- Move from batch_bool_constant<batch<T, A>, Csts...> to batch_bool_constant<T, A, Csts...>
- Provide an as_batch() (resp. as_batch_bool) method for batch_constant (resp. batch_bool_constant)
- New architecture emulated<N> for batches of N bits emulated using scalar operations.
- Remove the version method from all architectures
- Support xsimd::avg and xsimd::avgr vector operation
- Model i8mm arm extension
- Fix dispatching mechanism
- Update readme with a section on adoption, and a section on the history of the project
- Fix avx512vnni implementation
- Fix regression on XSIMD_NO_SUPPORTED_ARCHITECTURE
- Fix various problems with architecture version handling
- Specialize xsimd::compress for riscv
- Provide stubs for various avx512xx architectures
- Fix sincos implementation to cope with Emscripten
- Upgraded minimal version of cmake to remove deprecation warning
- Fixed constants::signmask for GCC when using ffast-math
- Add RISC-V Vector support
- Generic, simple implementation for xsimd::compress
- Disable batch of bools, and suggest using batch_bool instead
- Add an option to skip installation
- Provide shuffle operations of floating point batches
- Provide a generic implementation of xsimd::swizzle with dynamic indices
- Implement rotl, rotr, rotate_left and rotate_right
- Let CMake figure out pkgconfig directories
- Add missing boolean operators in xsimd_api.hpp
- Initial Implementation for the new WASM based instruction set
- Provide a generic version for float to uint32_t conversion
- Introduce XSIMD_DEFAULT_ARCH to force default architecture (if any)
- Remove C++ requirement on xsimd::exp10 scalar implementation
- Improve and test documentation
- Provide a generic reducer
- Fix find_package(xsimd) for xtl enabled xsimd, reloaded
- Cleanup benchmark code
- Provide avx512f implementation of FMA and variant
- Hexadecimal floating points are not a C++11 feature
- Back to slow implementation of exp10 on Windows
- Changed bitwise_cast API
- Provide generic signed/unsigned type conversion
- Fixed sde location
- Feature/incr decr
- Cleanup documentation
- Fix potential ABI issue in SVE support
- Disable fast exp10 on OSX
- Assert on unaligned memory when calling aligned load/store
- Fix warning about uninitialized storage
- Always forward arch parameter
- Do not specialize the behavior of simd_return_type for char
- Support broadcasting of complex batches
- Make xsimd compatible with -fno-exceptions
- Provide and test comparison operators overloads that accept scalars
- Fix potential ABI issue in SVE support, making xsimd::sve a type alias to size-dependent type.
- Support fixed size SVE
- Fix a bug in SSSE3 xsimd::swizzle implementation for int8 and int16
- Rename xsimd::hadd into xsimd::reduce_add, provide xsimd::reduce_min and xsimd::reduce_max
- Properly report unsupported double for neon on arm32
- Fill holes in xsimd scalar api
- Fix find_package(xsimd) for xtl enabled xsimd
- Replace xsimd::bool_cast by xsimd::batch_bool_cast
- Native xsimd::hadd for float on arm64
- Properly static_assert when trying to instantiate an xsimd::batch of xtl complex
- Introduce xsimd::batch_bool::mask() and batch_bool::from_mask(...)
- Flag some functions with [[nodiscard]]
- Accept both relative and absolute libdir and include dir in xsimd.pc
- Implement xsimd::nearbyint_as_int for NEON
- Add xsimd::polar
- Speedup double -> F32/I32 gathers
- Add xsimd::slide_left and xsimd::slide_right
- Support integral xsimd::swizzles on AVX
- Add xsimd::gather and xsimd::scatter
- Add xsimd::nearbyint_as_int
- Add xsimd::none
- Add xsimd::reciprocal
- Remove batch constructor from memory address, use xsimd::batch<...>::load_(un)aligned instead
- Leave to msvc users the opportunity to manually disable FMA3 on AVX
- Provide xsimd::insert to modify a single value from a vector
- Make xsimd::pow implementation resilient to FE_INVALID
- Reciprocal square root support through xsimd::rsqrt
- NEON: Improve xsimd::any and xsimd::all
- Provide type utility to explicitly require a batch of given size and type
- Implement xsimd::swizzle on x86, neon and neon64
- Avx support for xsimd::zip_lo and xsimd::zip_hi
- Only use _mm256_unpacklo_epi<N> on AVX2
- Provide neon/neon64 conversion function from uint(32|64)_t to (float|double)
- Provide SSE/AVX/AVX2 conversion function from uint32_t to float
- Provide AVX2 conversion function from (u)int64_t to double
- Provide better SSE conversion function from uint64_t to double
- Provide better SSE conversion function to double
- Support logical xor for xsimd::batch_bool
- Clarify fma support:
  - FMA3 + SSE -> xsimd::fma3<sse4_2>
  - FMA3 + AVX -> xsimd::fma3<avx>
  - FMA3 + AVX2 -> xsimd::fma3<avx2>
  - FMA4 -> xsimd::fma4
- Allow xsimd::transform to work with complex types
- Add missing scalar version of xsimd::norm and xsimd::conj
- Fix neon xsimd::hadd implementation
- Detect unsupported architectures and set XSIMD_NO_SUPPORTED_ARCHITECTURE if needs be
- Provide some conversion operators for float -> uint32
- Improve code generated for AVX2 signed integer comparisons
- Enable detection of avx512cd and avx512dq, and fix avx512bw detection
- Enable detection of AVX2+FMA
- Pick the best compatible architecture in xsimd::dispatch
- Enable support for FMA when AVX2 is detected on Windows
- Add missing includes / forward declaration
- Mark all functions inline and noexcept
- Assert when using incomplete std::initializer_list
- Improve CI & testing, no functional change
- Do not use _mm256_srai_epi32 under AVX, it's an AVX2 instruction
- Fix invalid constexpr std::make_tuple usage in neon64