Numerical ops #3

skaller · 2022-12-21T02:01:57Z

So I need to think about how to deal with LLVM and WASM numerical ops. The mapping needs to be mapped out with rigorous math. The core issue is that LLVM and WASM has signed division but midenVM does not. There are also other issue related to handling other popular sizes including 8 and 16 bits.

The mathematical core of this problem is that a subrange of integers 0..n-1 is a ring and can be equiped with the operations add, mul, udiv, umod, neg, sub where sub a - b = a + -b, and -b = b + n mod n, and the other operations are their natural values modulo n. udiv and mul are unrelated. We also have the usual comparisons.

Then we have signed integers which are some subrange of integers such as -n .. +n roughly. The operations on these are defined by the operations on integers, and fail if the bounds are exceeded. In the special case where the range has a power of 2 elements, the range -2^k .. 2^k-1 fits in 2k bits and with the equality of zero representations, we get a representation called two's complement which LLVM and Wasm mandate.

With the two's complement representation, add, sub, neg, and mul coincide with a ring mod 2^2k. However, for larger values where the integer result is unrepresentable, the modular operations give the wrong result. Miden has checked versions of these operations to use when the element type is an integer approximation, rather than a ring.

However division and modulus do not work at all and two extra operations are required for signed division and modulus. LLVM uses generic operator with typed operands for the size ONLY, and extra codes nsw and nuw meaning no signed wrap and no unsigned wrap which hint at the correct interpretation somehow. Translation rules need to be determined.

There's also the issue of integers of other types. i1 is just bool and is no problem. However 8 bit and 16 bit integers as well as 64 bit will be generated from LLVM front ends like Rust or C.

So the final outcome of this issue must be a rigidly defined translation specification for LLVM (I think Wasm is secondary). That is: when you see THIS operand with THESE types, generate THIS miden assembly.

It is NOT immediately clear this is possible: that is, the compiler may need to do some "type inference" to figure out the upstream programmers actual intent.

The text was updated successfully, but these errors were encountered:

skaller · 2022-12-29T07:07:44Z

Notes.

The ring u32 has 4 operations: add, neg, mul, and divmod denoted in MVM by wrapping operations. The operations cannot fail and always give the correct answer except that divmod fails with a divisor of 0. [Comparisons and other ordering operations are available too]

The problem with this data type is it is almost entirely useless. It has two key uses:

In systems programming
For implementing rings of useful sizes, for example for array indexing

Generally, the programmer desires two types they cannot have:

Natural numbers
Integers

A sufficiently large ring can provide a small number of operations on a small number of values with results conforming to the corresponding abstract mathematical calculations. For natural numbers the supported operations are add, mul and divmod. Subtraction and negation are not readily available for natural numbers. Whilst divmod cannot fail except for a divisor of 0, add and mul can have an unrepresentable natural result. MVM provides three varieties of operations to handle this: checked operations detect overflows and report errors, whilst unchecked operations are more efficient, but may lead to undefined behaviour on overflow. Unchecked operations can be used safely in conjunction with a proof of correctness. There are also overflowing operations which may be useful for other purposes.

The representation of integers, on the other hand, is fraught with difficulties. The basic idea is that wrapping unsigned ring operations can model integers for a small number of small valued operations. Unfortunately, if the result of a computation is not representable, it is NOT possible to check with existing MVM instructions.

For example, if two medium sized values are added and the result goes past the half way mark, the result would be interpreted as a negative value. However the checked operations would still allow this because the checks for integers and natural numbers are distinct.

Conversely, the addition of two small negative numbers using u32 wrapping operations would give the correct value, but u32 checks would fail the addition as an overflow. Hence, to avoid gratuitous failures, the programmer would need to use wrapping u32 operations without checking to represent small integers.

The core difficulty here is defining a type system which would be able to statically check MVM programs, which is precursor to the real problem, which is to develop a type system which can statically account for LLVM IR programs, which in turn is the precursor to defining a type based mapping from LLVM operations to MVM operations.

In particular programmers of high level languages fed into LLVM regularly use signed integer types, even when they have no need for negative values, and often use them in lieu of a more correct type like an option type. Even worse, programs using high level type systems get munged by translators into encodings which safely abuse type systems to reduce the code to a lower level. In turn this means it is

VERY HARD IF NOT IMPOSSIBLE

to map arithmetic operations from LLVM to MVM correctly. In part this is because LLVM simply specifies that out of bounds values are poison and leaves it up to the back end (that's us!) to figure out what to do about it. The problem here is that one mans poison is another mans honey.

"I don't need to check my array index calculations because my arrays cannot be big enough for the calculations to overflow". So I can use wrapping or unchecked operations and the results will always be the same and always correct.

starkoracles assigned grjte Dec 21, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Numerical ops #3

Numerical ops #3

skaller commented Dec 21, 2022 •

edited

Loading

skaller commented Dec 29, 2022 •

edited

Loading

Numerical ops #3

Numerical ops #3

Comments

skaller commented Dec 21, 2022 • edited Loading

skaller commented Dec 29, 2022 • edited Loading

skaller commented Dec 21, 2022 •

edited

Loading

skaller commented Dec 29, 2022 •

edited

Loading