-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Numerical ops #3
Comments
Notes. The ring u32 has 4 operations: The problem with this data type is it is almost entirely useless. It has two key uses:
Generally, the programmer desires two types they cannot have:
A sufficiently large ring can provide a small number of operations on a small number of values with results conforming to the corresponding abstract mathematical calculations. For natural numbers the supported operations are The representation of integers, on the other hand, is fraught with difficulties. The basic idea is that wrapping unsigned ring operations can model integers for a small number of small valued operations. Unfortunately, if the result of a computation is not representable, it is NOT possible to check with existing MVM instructions. For example, if two medium sized values are added and the result goes past the half way mark, the result would be interpreted as a negative value. However the checked operations would still allow this because the checks for integers and natural numbers are distinct. Conversely, the addition of two small negative numbers using u32 wrapping operations would give the correct value, but u32 checks would fail the addition as an overflow. Hence, to avoid gratuitous failures, the programmer would need to use wrapping u32 operations without checking to represent small integers. The core difficulty here is defining a type system which would be able to statically check MVM programs, which is precursor to the real problem, which is to develop a type system which can statically account for LLVM IR programs, which in turn is the precursor to defining a type based mapping from LLVM operations to MVM operations. In particular programmers of high level languages fed into LLVM regularly use signed integer types, even when they have no need for negative values, and often use them in lieu of a more correct type like an option type. Even worse, programs using high level type systems get munged by translators into encodings which safely abuse type systems to reduce the code to a lower level. In turn this means it is VERY HARD IF NOT IMPOSSIBLE to map arithmetic operations from LLVM to MVM correctly. In part this is because LLVM simply specifies that out of bounds values are poison and leaves it up to the back end (that's us!) to figure out what to do about it. The problem here is that one mans poison is another mans honey. "I don't need to check my array index calculations because my arrays cannot be big enough for the calculations to overflow". So I can use wrapping or unchecked operations and the results will always be the same and always correct. |
So I need to think about how to deal with LLVM and WASM numerical ops. The mapping needs to be mapped out with rigorous math. The core issue is that LLVM and WASM has signed division but midenVM does not. There are also other issue related to handling other popular sizes including 8 and 16 bits.
The mathematical core of this problem is that a subrange of integers 0..n-1 is a ring and can be equiped with the operations add, mul, udiv, umod, neg, sub where sub a - b = a + -b, and -b = b + n mod n, and the other operations are their natural values modulo n. udiv and mul are unrelated. We also have the usual comparisons.
Then we have signed integers which are some subrange of integers such as -n .. +n roughly. The operations on these are defined by the operations on integers, and fail if the bounds are exceeded. In the special case where the range has a power of 2 elements, the range -2^k .. 2^k-1 fits in 2k bits and with the equality of zero representations, we get a representation called two's complement which LLVM and Wasm mandate.
With the two's complement representation, add, sub, neg, and mul coincide with a ring mod 2^2k. However, for larger values where the integer result is unrepresentable, the modular operations give the wrong result. Miden has checked versions of these operations to use when the element type is an integer approximation, rather than a ring.
However division and modulus do not work at all and two extra operations are required for signed division and modulus. LLVM uses generic operator with typed operands for the size ONLY, and extra codes
nsw
andnuw
meaningno signed wrap
andno unsigned wrap
which hint at the correct interpretation somehow. Translation rules need to be determined.There's also the issue of integers of other types. i1 is just bool and is no problem. However 8 bit and 16 bit integers as well as 64 bit will be generated from LLVM front ends like Rust or C.
So the final outcome of this issue must be a rigidly defined translation specification for LLVM (I think Wasm is secondary). That is: when you see THIS operand with THESE types, generate THIS miden assembly.
It is NOT immediately clear this is possible: that is, the compiler may need to do some "type inference" to figure out the upstream programmers actual intent.
The text was updated successfully, but these errors were encountered: