the half8 "equals" and "not equals" operators don't conform to the IEEE 754 standard; Unity has not yet responded to my bug report regarding their "half" implementation
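For reference, here is a minimal sketch of what IEEE 754 requires of equality on 16-bit floats, written against raw bit patterns. The note above does not say how the operators deviate, so this is not a description of the bug. The class and method names are hypothetical and are not part of this library or of Unity's API.

```csharp
public static class HalfEqualitySketch
{
    // Hypothetical helper: a binary16 pattern is NaN when the exponent bits
    // are all ones and the mantissa is non-zero (sign bit ignored).
    public static bool IsNaN(ushort bits) => (bits & 0x7FFF) > 0x7C00;

    // IEEE 754 equality on raw binary16 bit patterns.
    public static bool Equal(ushort a, ushort b)
    {
        if (IsNaN(a) || IsNaN(b)) return false;    // NaN equals nothing, not even itself
        if (((a | b) & 0x7FFF) == 0) return true;  // +0 and -0 compare equal
        return a == b;                             // otherwise the bit patterns must match
    }

    // "not equals" is the exact negation, so NaN != NaN is true.
    public static bool NotEqual(ushort a, ushort b) => !Equal(a, b);
}
```

A plain comparison of the bit patterns gets both special cases wrong: it reports NaN as equal to itself, and +0 as different from -0.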
Fixes
fixed a Burst compilation error triggered by "Sse4_1.blend_epi16" when compiling for SSE2, caused by the fallback code not using a constant value for "imm8"
fixed incorrect CPU feature checks for quarter vector type-conversion code when compiling for SSE2
fixed scalar (single value and C# fallback) "lzcnt" implementations for (s)byte and (u)short values and (u)long4 vectors
Additions
added "ulong countbits(void* ptr, ulong bytes)", which counts the number of 1-bits in a given block of memory, using Wojciech Mula's SIMD population count algorithm
added high performance and/or SIMD "gcd" a.k.a. greatest common divisor functions for (u)int, (u)long and all integer vector types, which always return unsigned types and vectors
added high performance and/or SIMD "lcm" a.k.a. least common multiple functions for (u)int, (u)long and all integer vector types, which always return unsigned types and vectors
added high performance and/or SIMD "intsqrt" (integer square root, i.e. floor(sqrt(x))) functions for all integer and integer vector types; the functions for signed integers and vectors throw an ArgumentOutOfRangeException if a value is negative
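The semantics of these additions can be sketched with plain scalar C#. This is an illustration only, not the library's implementation. The class name and method names are hypothetical, the bit count uses a simple per-byte loop rather than Wojciech Mula's SIMD algorithm, and the behavior for zero inputs to "gcd" and "lcm" is an assumption. The unsigned return types and the ArgumentOutOfRangeException for negative input are taken from the notes above.

```csharp
using System;

public static class IntegerMathSketch
{
    // Magnitude of a signed value as an unsigned one; int.MinValue maps to 2147483648.
    private static uint Magnitude(int x) => unchecked((uint)(x < 0 ? -x : x));

    // Greatest common divisor via the Euclidean algorithm (assumed: Gcd(0, 0) == 0).
    public static uint Gcd(uint a, uint b)
    {
        while (b != 0) { uint t = a % b; a = b; b = t; }
        return a;
    }

    // Signed inputs still produce an unsigned result.
    public static uint Gcd(int a, int b) => Gcd(Magnitude(a), Magnitude(b));

    // Least common multiple; divides first to delay overflow, which is not detected here.
    public static uint Lcm(uint a, uint b)
    {
        if (a == 0 || b == 0) return 0; // assumed convention for zero inputs
        return a / Gcd(a, b) * b;
    }

    // floor(sqrt(x)) using the digit-by-digit (bitwise) method, no floating point.
    public static uint IntSqrt(uint x)
    {
        uint result = 0;
        uint bit = 1u << 30;              // highest power of four that fits in 32 bits
        while (bit > x) bit >>= 2;
        while (bit != 0)
        {
            if (x >= result + bit)
            {
                x -= result + bit;
                result = (result >> 1) + bit;
            }
            else
            {
                result >>= 1;
            }
            bit >>= 2;
        }
        return result;
    }

    // Signed variant: negative input throws, as the release notes describe.
    public static int IntSqrt(int x)
    {
        if (x < 0) throw new ArgumentOutOfRangeException(nameof(x), "value must be non-negative");
        return (int)IntSqrt((uint)x);
    }

    // Number of 1-bits in a block of memory, one byte at a time.
    public static ulong CountBits(byte[] data)
    {
        ulong count = 0;
        foreach (byte b in data)
        {
            int v = b;
            while (v != 0) { v &= v - 1; count++; } // clear the lowest set bit
        }
        return count;
    }
}
```

A scalar version like this can serve as a reference when testing vectorized code against known results.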
Improvements
improved performance of "avg" functions for signed integer vectors
added SIMD implementations of the "transpose" functions for all matrix types
added SSE4 and SSE2 fallback code for variable bitshifts ("shl", "shrl" and "shra")
added SSE2 fallback code for (s)byte vector-by-vector division and modulo operations
added SSE2 fallback code for "all_dif" for (s)byte16, (u)short8 and (u)int8 vectors
added SSE2 fallback code for typecasting, propagating through the entire library
added SSE2 fallback code for "addsub" and "subadd" functions
bitmask32 and bitmask64 now allow for masks to be up to 32 and 64 bits wide, respectively
Changes
renamed "BurstCompilerException" to "CPUFeatureCheckException"
"shl", "shrl" and "shra" now have undefined behavior when the shift amount lies outside of the interval [0, 8 * sizeof(integer_type) - 1], for performance reasons and because of differences between SSE, AVX and managed C#
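The following scalar sketch shows why out-of-range shift amounts cannot be given one consistent meaning. Managed C# masks the shift count of a 32-bit operand to its low 5 bits, whereas x86 SIMD logical shifts produce zero once the count reaches the element width. The class and method names are hypothetical; "SimdStyleShl" only emulates the SIMD result in scalar code and is not how the library implements "shl".

```csharp
public static class ShiftRangeSketch
{
    // True if the amount is inside [0, 8 * sizeInBytes - 1], the range with defined behavior.
    public static bool IsValidShift(int amount, int sizeInBytes) =>
        unchecked((uint)amount) < (uint)(8 * sizeInBytes);

    // Managed C# semantics: for a 32-bit operand the count is masked with 31,
    // so shifting by 32 behaves like shifting by 0.
    public static uint ManagedShl(uint x, int n) => x << n;

    // Scalar emulation of an x86 SIMD logical left shift on 32-bit elements:
    // any count of 32 or more yields 0.
    public static uint SimdStyleShl(uint x, int n) =>
        unchecked((uint)n) < 32u ? x << n : 0u;
}
```

Callers that cannot guarantee the range can check the amount with a helper like "IsValidShift" before shifting.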
Fixed Oversights
added "shl", "shrl" and "shra" (varying per element) functions for (s)byte and (u)short vectors
added "ror" and "rol" (varying per element) functions for (s)byte and (u)short vectors
added "compareto" functions for all vector types except half- and quarter vectors
added "all_dif" functions for (s)byte32 vectors
added vshr/l and vror/l functions for (s)byte32 and (u)short16 vectors