Refactor to use an array backend Part 1 #104
Merged
Relevant criticism of the division implementation in the wild: https://skanthak.homepage.t-online.de/division.html
…M ... during compilation)
…l files [skip-ci]
…ntestable in CI, a pain to maintain and an intermediate serialization step instead of casting is cheap
jangko force-pushed the complete-refactor branch from 59dd896 to 63a3212 (June 12, 2023 13:26)
jangko changed the title from "[WIP] Refactor to use an array backend" to "Refactor to use an array backend Part 1" (Jun 12, 2023)
This is a refactoring to use an array backend. The actual bit representation will be 100% bit-compatible with the previous one (i.e. casting is possible). And as Stint integers behave as if uint64 were extended to uintXXX, they should be compatible with the newly exposed iXXX from LLVM/Clang _ExtInt: http://blog.llvm.org/2020/04/the-new-clang-extint-feature-provides.html
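To make the "casting is possible" claim concrete, here is a minimal sketch in Python (used only as a neutral modelling language, not the Nim code of this PR). It assumes the array backend stores 64-bit words least-significant first on a little-endian machine; `to_limbs` and `limb_array_bytes` are hypothetical helper names. Under that assumption the bytes of the word array are identical to the bytes of one wide little-endian integer.

```python
import struct

MASK64 = (1 << 64) - 1

def to_limbs(n, w):
    """Split a non-negative integer into w 64-bit words, least significant first."""
    return [(n >> (64 * i)) & MASK64 for i in range(w)]

def limb_array_bytes(n, w):
    """Memory image of an array of w little-endian uint64 words (assumed layout)."""
    return struct.pack("<%dQ" % w, *to_limbs(n, w))

n = 0x0123456789ABCDEF_FEDCBA9876543210_0F1E2D3C4B5A6978_8796A5B4C3D2E1F0
# The word array and a single 256-bit little-endian integer share the same bytes.
same_layout = limb_array_bytes(n, 4) == n.to_bytes(32, "little")
```

This only models the array side; the layout of the previous recursive backend is taken from the statement above, not verified here.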
Status
The status takes into account passing a whole test suite file.
Very light changes may be needed if the test suite was using internal procs or representations such as swapBytes or orUintImpl.
The test suite is run at both compile-time and run-time.
Tests were only done on little-endian architectures.
Unsigned integers
when using an array backend than a recursive backend
Signed integers
Overview
This refactor will bring significant speed and code size improvements that were tested in Constantine.
It also uses the `{.push raises: [].}` pragma to enforce error handling.
Addition / Subtraction
The codegen should be optimal in terms of codesize with a single ADD + ADC + ADC + ADC for uint256 on x86
This uses the intrinsics addcarry_u64 and subborrow_u64 for GCC/Clang/MSVC, with a uint128 fallback for ARM/WASM. (TODO: use the Clang-only intrinsics __builtin_addc and __builtin_subc.)
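The carry chain described above can be modelled as follows. This is a minimal Python sketch, not the Nim code of this PR: the two functions only imitate the semantics of 64-bit add-with-carry and subtract-with-borrow, and `to_limbs`, `from_limbs`, `add_limbs` and `sub_limbs` are hypothetical helper names. For a uint256 the loop runs four times, which is what maps to one ADD followed by three ADC on x86.

```python
MASK64 = (1 << 64) - 1

def addcarry_u64(carry_in, a, b):
    """Model of a 64-bit add-with-carry: returns (carry_out, sum mod 2**64)."""
    total = a + b + carry_in
    return total >> 64, total & MASK64

def subborrow_u64(borrow_in, a, b):
    """Model of a 64-bit subtract-with-borrow: returns (borrow_out, diff mod 2**64)."""
    diff = a - b - borrow_in
    return (1 if diff < 0 else 0), diff & MASK64

def to_limbs(n, w=4):
    """Split an integer into w 64-bit words, least significant first."""
    return [(n >> (64 * i)) & MASK64 for i in range(w)]

def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))

def add_limbs(a, b):
    """Word-by-word addition; the carry flows from one word to the next."""
    carry, out = 0, []
    for x, y in zip(a, b):
        carry, s = addcarry_u64(carry, x, y)
        out.append(s)
    return carry, out

def sub_limbs(a, b):
    """Word-by-word subtraction; the borrow flows from one word to the next."""
    borrow, out = 0, []
    for x, y in zip(a, b):
        borrow, d = subborrow_u64(borrow, x, y)
        out.append(d)
    return borrow, out
```

For example, `add_limbs(to_limbs(2**256 - 1), to_limbs(1))` wraps around to zero and reports a final carry of 1.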
In particular this solves #87.
Note that it is strongly recommended to use anything but GCC (i.e. Clang, MSVC, ICC) as GCC is the absolute worst compiler for multiprecision arithmetic, see https://gcc.godbolt.org/z/2h768y
(Codegen comparison: GCC vs. Clang.)
Multiplication
Multiplication uses the Comba multiplication technique (a.k.a. product scanning), which reorders the operations to significantly reduce the number of carries, at the price of a temporary buffer of 3 words and convoluted loop scheduling:
nim-stint/stint/private/uint_mul.nim
Lines 28 to 40 in 01ad29e
The benefit of Comba multiplication is a 30% speed improvement due to better register usage.
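As an illustration of product scanning, here is a minimal Python sketch. It is not the code in uint_mul.nim; `comba_mul`, `to_limbs` and `from_limbs` are hypothetical names. Each output word (column) is finished before moving to the next, and all partial products of a column are summed into a three-word accumulator `(t, u, v)`. The optional `out_words` argument models multiplication with high-bit truncation by simply stopping after the requested number of columns.

```python
MASK64 = (1 << 64) - 1

def to_limbs(n, w=4):
    """Split an integer into w 64-bit words, least significant first."""
    return [(n >> (64 * i)) & MASK64 for i in range(w)]

def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (64 * i) for i, limb in enumerate(limbs))

def comba_mul(a, b, out_words=None):
    """Product-scanning multiplication of two little-endian word arrays.

    Returns out_words result words (default: len(a) + len(b), the full product).
    """
    if out_words is None:
        out_words = len(a) + len(b)
    t = u = v = 0  # three-word accumulator, v is the least significant word
    result = []
    for k in range(out_words):
        # All pairs (i, k - i) contribute to column k.
        first = max(0, k - len(b) + 1)
        last = min(k, len(a) - 1)
        for i in range(first, last + 1):
            acc = ((t << 128) | (u << 64) | v) + a[i] * b[k - i]
            v = acc & MASK64
            u = (acc >> 64) & MASK64
            t = acc >> 128
        result.append(v)     # column k is complete
        v, u, t = u, t, 0    # shift the accumulator down by one word
    return result
```

With `out_words=4` two uint256 operands give the low 256 bits of the product; with the default they give the full 512-bit result.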
The code also allows multiplication with high-bit truncation, extended-precision multiplication (mul(256, 256) -> 512), and multiplication that keeps only the higher bits, so that modular arithmetic can use a more efficient Barrett reduction. This should significantly speed up the EVM. A specific squaring operation will be added to accelerate exponentiation (~20%).
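To show why a multiplication that keeps only the high bits helps, here is a minimal sketch of a textbook Barrett reduction in Python. It is an illustration under stated assumptions, not the reduction this PR implements: `barrett_setup` and `barrett_reduce` are hypothetical names, the modulus must satisfy `m < 2**k`, and the input must satisfy `0 <= x < 2**(2*k)`. Division by `m` is replaced by a multiplication with a precomputed constant followed by a shift, so only the upper part of that product is ever used.

```python
def barrett_setup(m, k):
    """Precompute mu = floor(2**(2k) / m) for a modulus m < 2**k."""
    return (1 << (2 * k)) // m

def barrett_reduce(x, m, mu, k):
    """Return x mod m for 0 <= x < 2**(2k), without dividing by m."""
    # Quotient estimate: only the high bits of x * mu are kept.
    q = (x * mu) >> (2 * k)
    r = x - q * m
    # The estimate never exceeds the true quotient, so r >= 0;
    # a few subtractions correct any underestimate.
    while r >= m:
        r -= m
    return r
```

A typical use is reducing the 512-bit product of two values already reduced modulo a 256-bit `m`, with `k = 256`.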
Modular arithmetic
IO
Serialization from hex will be made significantly faster.
Notation: "w" the number of words in the big int, b the number of bytes to read in the hex strings
BigInt multiplication is O(w²), and for each hex byte read we do a multiplication, for a total of O(b * w²). Instead we can:
Solution 2 will be done (but solution 3 can be kept in mind if 2 passes are a bottleneck, which is very unlikely compared to bigint operations)
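As a generic illustration of parsing hex without any bigint multiplication, here is a minimal Python sketch. It assumes a two-pass scheme (hex digits to bytes, then bytes to 64-bit words) and is not claimed to match any of the numbered solutions exactly; `hex_to_limbs` is a hypothetical helper name. Because each hex digit is exactly 4 bits, the digits can be placed directly into words, so the cost is linear in the input length.

```python
def hex_to_limbs(hex_str, w):
    """Parse a big-endian hex string into w 64-bit words, least significant first."""
    digits = hex_str[2:] if hex_str[:2].lower() == "0x" else hex_str
    if len(digits) % 2:
        digits = "0" + digits                # bytes.fromhex needs an even length
    # Pass 1: hex digits -> bytes (most significant byte first).
    raw = bytes.fromhex(digits).lstrip(b"\x00")
    if len(raw) > 8 * w:
        raise ValueError("value does not fit in %d words" % w)
    raw = raw.rjust(8 * w, b"\x00")          # left-pad to the full width
    # Pass 2: bytes -> words, walking from the least significant end.
    limbs = []
    for i in range(w):
        end = len(raw) - 8 * i
        limbs.append(int.from_bytes(raw[end - 8:end], "big"))
    return limbs
```

For example, `hex_to_limbs("0x1", 4)` yields `[1, 0, 0, 0]`.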
Future
The refactoring incidentally removes all casts, which should make the JS target easy: Allow JS compilation #16
Also Clang/LLVM ExtInt (wide-integer) will be very interesting for optimized WASM code http://blog.llvm.org/2020/04/the-new-clang-extint-feature-provides.html
Non-power of 2 sizes should be easy to support
As we compile with stackTraces / lineTraces, we might want to deactivate them within Stint, as any check is likely to pollute the carry flag and significantly worsen the codegen, in particular in the EVM.