half8 == and != operators don't conform to the IEEE 754 standard - Unity has not yet responded to my bug report regarding their "half" implementation
(s)byte, (u)short vector and (U)Int128 multiplication, division and modulo operations by compile time constants are not optimal. For (U)Int128, this requires a new Burst feature à la T Constant.ForceCompileTimeEvaluation<T, U>(Func<U, T> code) (proposed). Work is currently being done on (s)byte and (u)short vectors in this regard, which will beat any compiler. The current (tested) state of all possible optimizations is included in this version.
pow functions with compile time constant exponents currently do not handle many decimal numbers. math.rsqrt would often be used in those cases for optimal performance, but it is actually slower when the Unity.Burst.FloatMode is set to anything but FloatMode.Fast. To guarantee optimal performance, compile time access to the current FloatMode would be needed (proposed)
double (r)cbrt functions are currently not optimized
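For reference, IEEE 754 requires any equality comparison involving NaN to be false (so != is true) and treats +0 and -0 as equal. A minimal sketch using C#'s built-in float, which follows these rules; the class and method names are hypothetical and not part of MaxMath or Unity.Mathematics:

```csharp
using System;

// Hypothetical helper for illustration only; not a MaxMath or Unity API.
public static class Ieee754EqualitySketch
{
    // C#'s float == follows IEEE 754: false whenever either operand is NaN.
    public static bool AreEqual(float a, float b) => a == b;

    // C#'s float != follows IEEE 754: true whenever either operand is NaN.
    public static bool AreNotEqual(float a, float b) => a != b;
}
```

With this sketch, AreEqual(float.NaN, float.NaN) is false, AreNotEqual(float.NaN, float.NaN) is true, and AreEqual(0f, -0f) is true. A conforming element-wise vector comparison would be expected to produce the same result per lane.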
Fixes
linked float8 rcp and rsqrt functions to Burst's FloatMode and FloatPrecision
short.MinValue / -1 now correctly overflows to short.MinValue when dividing a short16 vector by another short16 vector when compiling for AVX or higher
fixed scalar quarter to double conversion for when the quarter value is negative
fixed scalar half to quarter conversion for when the half value is negative
fixed vector quarter to ulong conversion for when a quarter value is negative
fixed (u)short8 to quarter8 conversion
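The short.MinValue / -1 case mentioned in the fixes can be reproduced with scalar C#. The following is only a sketch of the wrapping behaviour, assuming two's complement 16 bit truncation; the class and method names are hypothetical and this is not the library's vector code:

```csharp
using System;

// Hypothetical helper for illustration only; not a MaxMath API.
public static class ShortDivisionSketch
{
    // Divides two shorts and wraps the quotient to 16 bits.
    // Throws DivideByZeroException if b is 0, like any integer division in C#.
    public static short DivideWrapping(short a, short b)
    {
        // C# promotes both operands to int, so -32768 / -1 yields 32768 here...
        int wide = a / b;
        // ...and the unchecked narrowing cast wraps it back to -32768.
        return unchecked((short)wide);
    }
}
```

DivideWrapping(short.MinValue, -1) returns short.MinValue, which is the result the fixed short16 division is described as producing.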
Additions
Added saturation arithmetic to the library for all scalar and vector types. Saturation arithmetic clamps the result of an operation to type.MinValue or type.MaxValue if underflow or overflow occurs, respectively. It has single-instruction hardware support for (s)bytes and (u)shorts. The included functions are:
addsaturated
subsaturated
mulsaturated
divsaturated (only clamps division of floating point types and signed division of, for instance, sbyte.MinValue (= -128) / -1 to sbyte.MaxValue (= 127), which would cause a hardware exception for ints and longs)
castsaturated (all types to all other types with a smaller range)
csumsaturated
cprodsaturated
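To make the clamping behaviour concrete, here is a minimal scalar sketch for sbyte and byte. It computes in a wider type and clamps afterwards, which is an assumption made for readability; the library's functions are described as mapping to single hardware instructions where available. All names below are hypothetical and are not the MaxMath signatures:

```csharp
using System;

// Hypothetical helpers for illustration only; not the MaxMath API.
public static class SaturationSketch
{
    // Clamps a wide intermediate result to the sbyte range [-128, 127].
    private static sbyte ClampToSByte(int wide) =>
        (sbyte)Math.Max(sbyte.MinValue, Math.Min(sbyte.MaxValue, wide));

    // The sums, differences and products of two sbytes always fit in an int.
    public static sbyte AddSaturated(sbyte a, sbyte b) => ClampToSByte(a + b);
    public static sbyte SubSaturated(sbyte a, sbyte b) => ClampToSByte(a - b);
    public static sbyte MulSaturated(sbyte a, sbyte b) => ClampToSByte(a * b);

    // Only -128 / -1 (= 128) can leave the sbyte range; it clamps to 127.
    // Division by zero still throws DivideByZeroException.
    public static sbyte DivSaturated(sbyte a, sbyte b) => ClampToSByte(a / b);

    // Narrowing cast that clamps to the byte range [0, 255] instead of wrapping.
    public static byte CastSaturated(int x) =>
        (byte)Math.Max(byte.MinValue, Math.Min(byte.MaxValue, x));
}
```

For example, AddSaturated(100, 100) yields 127 rather than wrapping to -56, DivSaturated(sbyte.MinValue, -1) yields 127, and CastSaturated(300) yields 255.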
(U)Int128
added high performance (U)Int128 types with full library support, meaning that all operators and type conversions as well as all functions support these types. Most operations of both types, in Burst code, compile down to optimal machine code. Exceptions: 1) signed 64x64 bit to 128 bit multiplication 2) *, /, % and divrem functions with a scalar compile time constant argument (see 'Known Issues 2')
added Random128 XOR-Shift pseudo random number generator for generating (U)Int128s
Cube Root
added high performance & accuracy (r)cbrt - (reciprocal) cube root - functions for scalar and vector float and double types, based on a research paper from 2021. An optional bool parameter, false by default, lets the caller decide whether negative input values should be handled correctly (which is not the case with math.pow(x, 1f/3f))
added high performance intcbrt - integer cube root - functions for all scalar and vector integer types. For signed integer types, an optional bool parameter, false by default, lets the caller decide whether negative input values should be handled correctly (which is not the case with math.pow(x, 1f/3f))
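The remark about negative inputs can be illustrated with plain .NET: a power function with a fractional exponent returns NaN for a negative base, so a cube root built on it needs explicit sign handling. This sketch uses System.Math.Pow as a stand-in for math.pow and is not the algorithm used by the library; the class and method names are hypothetical:

```csharp
using System;

// Hypothetical helpers for illustration only; not the library's (r)cbrt code.
public static class CubeRootSketch
{
    // Naive approach: yields NaN for negative x, because a negative base
    // raised to a non-integer exponent has no real result in Math.Pow.
    public static double PowCbrt(double x) => Math.Pow(x, 1.0 / 3.0);

    // Sign-aware variant: take the root of |x|, then restore the sign.
    public static double SignedCbrt(double x)
    {
        double root = Math.Pow(Math.Abs(x), 1.0 / 3.0);
        return x < 0.0 ? -root : root;
    }
}
```

PowCbrt(-8.0) evaluates to NaN, while SignedCbrt(-8.0) is approximately -2. The extra sign handling costs a little, which is presumably why the library makes it opt-in.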
Other Additions
added a log function to all scalar and vector float and double types with a second parameter b, which is the logarithm's base
added reversebytes functions for all scalar and vector types, which convert between big endian and little endian byte order. All of them (scalar, vector) compile down to single hardware instructions
added pow functions with scalar exponents for float and double scalars and vectors, with optimizations for selected constant exponents (not necessarily whole exponents)
added function overloads to all functions for scalar (s)bytes and (u)shorts in order to resolve function call resolution ambiguity which was already present in Unity.Mathematics, which may also improve performance in some cases
added a static readonly New property to RandomX XOR-Shift pseudo random generators. It calls Environment.TickCount internally (and is thus seeded somewhat randomly), makes sure the seed is non-zero and can be called from Burst native code
added fastrcp functions for float scalars and vectors, faster (and substantially less accurate) than FloatPrecision.Low, FloatMode.Fast Burst implementations
added fastrsqrt functions for float scalars and vectors, faster (and substantially less accurate) than FloatPrecision.Low, FloatMode.Fast Burst implementations
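As an illustration of what the reversebytes functions listed above compute, here is a scalar sketch written with shifts and masks. It only shows the byte-order semantics; the names are hypothetical and the library's own functions are described as compiling to single instructions instead:

```csharp
using System;

// Hypothetical helpers for illustration only; not the MaxMath API.
public static class ReverseBytesSketch
{
    // Swaps the two bytes of a 16 bit value, e.g. 0x1234 -> 0x3412.
    public static ushort ReverseBytes16(ushort x) =>
        unchecked((ushort)((x >> 8) | (x << 8)));

    // Reverses the four bytes of a 32 bit value, e.g. 0x12345678 -> 0x78563412.
    public static uint ReverseBytes32(uint x) =>
        (x >> 24)
        | ((x >> 8) & 0x0000FF00u)
        | ((x << 8) & 0x00FF0000u)
        | (x << 24);
}
```

Applying either function twice returns the original value, which is why the same function converts in both directions between big endian and little endian.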
Improvements
added AVX and AVX2 code for float8 sin, cos, tan, sincos, asin, acos, atan, atan2, sinh, cosh, tanh, pow, exp, exp2, exp10, log, log2, log10 and fmod (and the % operator)
optimized many /, %, * and divrem operations with a scalar compile time constant argument for (s)byte vectors (see 'Known Issues 2'), which were previously not optimized (optimally or at all) by Burst
added SSE2 fallback code for converting AVX vector types to SSE vector types and vice versa (for example: short16 (256 bit) to byte16 (128 bit))
scalar (s)byte and (u)short rol and ror functions now compile down to single hardware instructions
improved performance and/or reduced code size of nearly all vector comparison operations (==, > etc.)
improved performance of, and added SSE2 fallback code for, bitfield to boolean vector conversion (toboolX and thus also select(vector a, vector b, bitmask c))
improved performance of intpow functions in general and for when the exponent is a compile time constant
improved performance and reduced code size of compareto vector functions (especially for unsigned types)
added more optimizations to isdivisible
improved performance of intsqrt functions for (u)long and (s)byte scalar and vector types considerably
reduced code size of ispow2 vector functions
reduced code size of (s)byte vector-by-vector division
improved performance of Random64's (u)long4 generation if compiling for AVX2
improved performance of (s)byte matrix multiplication
reduced code size of (u)short and up to (s)byte8 vector-by-vector division and divrem functions (and improved performance if compiling for SSE2 only)
reduced code size and improved performance of isinrange functions for (u)long vector types
reduced code size of ushort vector >= and <= operators for SSE2 fallback code by ~75%
improved performance and reduced code size of SSE2 down-casting fallback code
Changes
API BREAKING CHANGE: The various boolean to integer/floating point conversion functions (touint8/tof32 etc.) are now renamed to contain C# types in their names (tobyte/tofloat etc.)
API BREAKING CHANGE: If you use this library as intended, meaning you import it and Unity.Mathematics.math statically (using static MaxMath.maxmath;), and you use the pow functions with scalar bases and scalar exponents in those scripts, you will encounter the first ever function call resolution ambiguity. It is strongly recommended to always use the maxmath.pow function, because it optimizes any pow call enormously if the exponent is a compile time constant. This does NOT necessarily mean that such a call must declare the exponent as a literal value - the exponent may also become a compile time constant due to constant propagation
quarter is now a readonly struct
quarter to sbyte, short, int and long conversions are now required to be declared explicitly
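The pow ambiguity described in the changes above can be reproduced without Unity. In this sketch, LibraryA and LibraryB are hypothetical stand-ins for Unity.Mathematics.math and MaxMath.maxmath; both expose a pow(float, float) overload and both are imported statically, which is the situation the breaking change describes:

```csharp
using System;
using static LibraryA;
using static LibraryB;

// Hypothetical stand-in for Unity.Mathematics.math.
public static class LibraryA
{
    public static float pow(float x, float y) => (float)Math.Pow(x, y);
}

// Hypothetical stand-in for MaxMath.maxmath.
public static class LibraryB
{
    public static float pow(float x, float y) => (float)Math.Pow(x, y);
}

public static class AmbiguitySketch
{
    public static float Cube(float x)
    {
        // return pow(x, 3f);       // error CS0121: the call is ambiguous
        return LibraryB.pow(x, 3f); // qualifying the call resolves it
    }
}
```

In an actual project the qualified call would be maxmath.pow(x, 3f), as recommended above.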