Notes on Field Agnosticity #390

Open
DavePearce opened this issue Nov 26, 2024 · 5 comments

DavePearce commented Nov 26, 2024

Overview

The goal here is to allow support for different underlying prime fields. Currently, the prime field is assumed to be BLS12-377. That requires a minimum of 252 bits of storage, and can comfortably hold u128 values for addition without overflow. Thus, representing a u256 value (as found in the EVM) requires a vector of two field elements. The challenge with supporting different fields is that many of interest are much smaller:

  • (Goldilocks) Elements of the Goldilocks field fit into a single u64, and the prime p is defined as 2^64 - 2^32 + 1 = 18446744069414584321. This means it can hold the result of adding three u62 values without overflow (four can already exceed p, since 4 * (2^62 - 1) = 2^64 - 4 > p). Thus, in order to represent a u256 value, we would need a vector of at least five field elements.

  • (Baby Bear) Elements of the Baby Bear field fit into a single u32, and the prime p is defined as 15*2^27 + 1 = 2013265921. This means it can hold the result of adding three u29 values without overflow (two u30 values can already exceed p, since 2 * (2^30 - 1) = 2^31 - 2 > p). Thus, in order to represent a u256 value, we would need a vector of at least 9 elements.
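
The limb counts above follow from simple arithmetic on the prime. As a rough sanity check, here is a minimal Python sketch; the helper names `max_summands` and `limbs_needed` are illustrative only and not part of any existing tool:

```python
# Illustrative helpers (hypothetical names) for sizing limbs in a prime field.
GOLDILOCKS = 2**64 - 2**32 + 1
BABY_BEAR = 15 * 2**27 + 1

def max_summands(p: int, bits: int) -> int:
    """How many `bits`-wide unsigned values can be summed without wrapping modulo p."""
    return (p - 1) // (2**bits - 1)

def limbs_needed(total_bits: int, bits: int) -> int:
    """Number of `bits`-wide limbs required to cover `total_bits` bits."""
    return -(-total_bits // bits)  # ceiling division

for name, p, bits in [("Goldilocks", GOLDILOCKS, 62), ("Baby Bear", BABY_BEAR, 29)]:
    print(name, p, max_summands(p, bits), limbs_needed(256, bits))
```

Trying other values of `bits` shows how quickly the number of safe additions drops as the limb width approaches the size of the prime.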

The goal here is to give some example implementations of common operations, such as addition, multiplication, normalisation, comparison, etc. This is loosely based on an original set of notes.

DavePearce self-assigned this Nov 26, 2024
DavePearce changed the title from "Support Field Agnosticity" to "Notes on Field Agnosticity" Nov 26, 2024
DavePearce reopened this Nov 27, 2024

DavePearce commented Nov 28, 2024

Constraints

The existing constraints split u256 values into two u128 limbs, and are hardcoded with an assumption that the underlying field is large enough. For example, a common operation is a byte decomposition, which aims to show that a given value fits into a u256. Let us assume that our value is split into the RES_HI and RES_LO columns. Then, the byte decomposition will require an additional four columns (BYTE_HI, BYTE_LO, ACC_HI, ACC_LO) and, for now, assume it will take 16 rows (i.e. because a u128 has 16 bytes). Then, an example decomposition of RES_LO looks like this:

Row RES_LO BYTE_LO ACC_LO
0 0xFFEEDDCCBBAA99887766554433221100 0xFF 0xFF
1 0xFFEEDDCCBBAA99887766554433221100 0xEE 0xFFEE
2 0xFFEEDDCCBBAA99887766554433221100 0xDD 0xFFEEDD
3 0xFFEEDDCCBBAA99887766554433221100 0xCC 0xFFEEDDCC
...
14 0xFFEEDDCCBBAA99887766554433221100 0x11 0xFFEEDDCCBBAA998877665544332211
15 0xFFEEDDCCBBAA99887766554433221100 0x00 0xFFEEDDCCBBAA99887766554433221100

What we see is that ACC_LO accumulates the values of BYTE_LO until it matches the original value of RES_LO. Since this accumulation must be done within 16 rows, it proves that RES_LO does indeed fit within a u128. A similar thing is done for RES_HI. Also, we can optimise this to reduce the number of rows required for small values. For example, if RES_LO actually only holds, say, 0x4c22 then we can do the decomposition in two rows.
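
The accumulation can be modelled outside the constraint system. The following Python sketch assumes the recurrence suggested by the table (each row shifts the accumulator left by one byte and adds the next byte); `byte_decomposition` is a hypothetical helper, not an existing function:

```python
def byte_decomposition(value: int, rows: int = 16):
    """Return (BYTE, ACC) pairs, most significant byte first.

    Raises OverflowError if `value` does not fit into `rows` bytes, which
    mirrors the constraint being unsatisfiable for such a value.
    """
    trace, acc = [], 0
    for byte in value.to_bytes(rows, "big"):
        acc = 256 * acc + byte  # assumed recurrence: ACC[i] = 256 * ACC[i-1] + BYTE[i]
        trace.append((byte, acc))
    assert acc == value  # the final accumulator must equal the original value
    return trace

trace = byte_decomposition(0xFFEEDDCCBBAA99887766554433221100)
for row, (byte, acc) in enumerate(trace[:4]):
    print(row, hex(byte), hex(acc))
```

Calling `byte_decomposition(0x4c22, rows=2)` gives the two-row variant mentioned above.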

DavePearce commented Nov 28, 2024

Word Size

Given a field 𝔽 which can hold at least n bits of information, the question is what word size should be chosen, that is, how large each limb making up a u256 should be. For the BLS12-377 curve, the word size was u128 and thus two limbs are required for each u256 value. Different word sizes could be chosen, however, and these have different trade-offs. For example, a word size could be chosen to ensure the underlying field element is big enough to hold the result of multiplying two limbs (i.e. u128 values) without causing overflow. That is quite a strong assumption, and probably not optimal in most cases.

(Configurations) Generally speaking, we want to choose a word size such that the underlying field can hold the result of summing at least m words without overflow. For the Goldilocks field, for example, we could represent a u256 using a vector of five u52 limbs. This would allow the underlying field to hold the result of summing up to 4095 words without overflow (4096 words falls just short, since 4096 * (2^52 - 1) = 2^64 - 4096 > p). This ensures certain optimisations (see below) are possible. In contrast, suppose our prime was 5 and our word size u4. Then we could not even hold the result of summing two words without overflow, meaning this configuration is not usable.
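
This trade-off can be explored numerically. A minimal Python sketch, using a hypothetical helper `max_word_bits`, that finds the widest word size for which m words can still be summed without wrapping around the prime:

```python
def max_word_bits(p: int, m: int) -> int:
    """Largest word size w (in bits) such that the sum of m w-bit words stays below p."""
    if m < 1:
        raise ValueError("m must be at least 1")
    w = 0
    # Grow w while m maximal (w+1)-bit words still sum to at most p - 1.
    while m * (2**(w + 1) - 1) <= p - 1:
        w += 1
    return w

GOLDILOCKS = 2**64 - 2**32 + 1
print(max_word_bits(GOLDILOCKS, 2))     # widest limb that still allows a single addition
print(max_word_bits(GOLDILOCKS, 4095))  # widest limb that allows summing 4095 words
print(max_word_bits(5, 2))              # a tiny prime leaves almost no room
```

A result that is too small to be useful (as with the prime 5) indicates that the configuration is not usable for the chosen m.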

DavePearce commented Nov 28, 2024

Addition

When adding two u52 values together, the result always fits within a u53 (where the most significant bit is also called the overflow bit). In electronic circuits, a common strategy for chaining adders together is the so-called ripple-carry adder. At a high level it looks like this:

[Figure: ripple-carry adder]

Here, we are adding two values X+Y=Z where each argument is split across four limbs. The carry bit cN for each adder is the overflow bit from the addition. We can then implement our adder as follows:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (Y3 :i16@prove) (Y2 :i16@prove) (Y1 :i16@prove) (Y0 :i16@prove)
  (C3 :i1@prove)  (C2 :i1@prove)  (C1 :i1@prove)  (C0 :i1@prove)
  (Z3 :i16)       (Z2 :i16)       (Z1 :i16)       (Z0 :i16))

(defconst OVERFLOW 65536)

(defpurefun (ADDER cin arg1 arg2 out cout)
  (eq! (+ out (* cout OVERFLOW)) (+ arg1 arg2 cin)))

(defconstraint X_Y_Z ()
  (begin
   (ADDER 0 X0 Y0 Z0 C0)
   (ADDER C0 X1 Y1 Z1 C1)
   (ADDER C1 X2 Y2 Z2 C2)
   (ADDER C2 X3 Y3 Z3 C3)))

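Here, columns for X and Y are marked as i16@prove to ensure these values fit within a u16. However, if this were part of a larger circuit where it was already known that X and Y were valid u16 values then this could be avoided. Note that the carry bits are only determined uniquely if the Z columns are also known to fit within a u16, whether that is enforced here or elsewhere in the circuit.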
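
To see the constraints in action, the following Python sketch models the same four-limb adder on concrete values. It is only an illustration of the relation checked by ADDER, with limbs listed least significant first; `ripple_add` is a hypothetical name:

```python
OVERFLOW = 65536  # 2^16, matching the defconst above

def ripple_add(xs, ys):
    """Add two numbers given as little-endian lists of u16 limbs.

    Returns (zs, cs): the result limbs and the carry out of each position.
    """
    zs, cs, cin = [], [], 0
    for x, y in zip(xs, ys):
        total = x + y + cin
        out, cout = total % OVERFLOW, total // OVERFLOW
        # Same relation as the ADDER constraint: out + cout*OVERFLOW = arg1 + arg2 + cin
        assert out + cout * OVERFLOW == x + y + cin
        zs.append(out)
        cs.append(cout)
        cin = cout  # the carry ripples into the next limb
    return zs, cs

# 0x1FFFF + 0x1, with limbs listed as [X0, X1, X2, X3]
zs, cs = ripple_add([0xFFFF, 0x0001, 0, 0], [0x0001, 0, 0, 0])
print([hex(z) for z in zs], cs)
```

The final carry (C3 in the constraints) signals that the full sum did not fit into the four limbs.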

DavePearce commented Nov 28, 2024

Zero Test

The zero test checks whether a given expression is zero or not. When a given u256 value is split up into n limbs, the test must check that each limb is zero. When the underlying field is large enough to hold the sum of all the limbs without overflow, this can be optimised to a single sum. The following illustrates:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (RES :binary@prove)
)

(defpurefun ((IS-ZERO :binary@loob :force) w3 w2 w1 w0) (+ w3 w2 w1 w0))

(defconstraint c1 ()
  (if (IS-ZERO X3 X2 X1 X0)
      (vanishes! RES) (eq! RES 1)))

In this example, we are assuming that the sum (+ w3 w2 w1 w0) cannot overflow the underlying field element. In that case, the only way for that sum to be 0 is if all the words w3 .. w0 are themselves 0.
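
The no-overflow assumption matters. The Python sketch below models the summed zero test over a prime field and shows how it can give a false positive when the prime is too small for the limb width; the small prime 65537 is chosen purely for illustration:

```python
def is_zero_by_sum(limbs, p):
    """Zero test in the field: true when the limb sum is 0 modulo p."""
    return sum(limbs) % p == 0

BABY_BEAR = 15 * 2**27 + 1

# Four u16 limbs sum to at most 4 * 65535 < BABY_BEAR, so the test is exact.
assert is_zero_by_sum([0, 0, 0, 0], BABY_BEAR)
assert not is_zero_by_sum([0xFFFF, 0xFFFF, 0xFFFF, 0xFFFF], BABY_BEAR)

# With a (hypothetical) prime that is too small, the sum can wrap around to zero.
TINY_PRIME = 65537
print(is_zero_by_sum([0xFFFF, 2, 0, 0], TINY_PRIME))  # non-zero limbs, yet the sum is 0 mod p
```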

DavePearce commented Nov 28, 2024

Normalisation

The normalisation operator projects a given value X into a binary value Y, such that Y==0 iff X==0 and Y==1 otherwise. Thus, Y tells us whether X was zero or non-zero. As above, when the underlying field is large enough to hold the sum of all the limbs without overflow, this can be optimised to the normalisation of a single sum. The following illustrates:

(defcolumns
  (X3 :i16@prove) (X2 :i16@prove) (X1 :i16@prove) (X0 :i16@prove)
  (RES :binary@prove)
)

(defpurefun ((NORM :binary) w3 w2 w1 w0) (~ (+ w3 w2 w1 w0)))

(defconstraint c1 ()
  (eq! RES (NORM X3 X2 X1 X0)))
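
For intuition, normalisation in a prime field can be computed as x multiplied by its inverse, where the inverse of 0 is taken to be 0. The Python sketch below uses that formulation. It is an assumption about one way the ~ operator could be realised, not a description of the tool's internals, and `norm` and `norm_limbs` are hypothetical names:

```python
def norm(x: int, p: int) -> int:
    """Return 0 if x is 0 modulo p and 1 otherwise, computed as x * x^(p-2) mod p."""
    x %= p
    return (x * pow(x, p - 2, p)) % p  # x^(p-2) is the inverse of x when x != 0

def norm_limbs(limbs, p):
    """Normalise a multi-limb value via the sum of its limbs (the sum must stay below p)."""
    assert sum(limbs) < p, "limb sum would wrap around the field"
    return norm(sum(limbs), p)

BABY_BEAR = 15 * 2**27 + 1
print(norm_limbs([0, 0, 0, 0], BABY_BEAR), norm_limbs([0, 7, 0, 0xFFFF], BABY_BEAR))
```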
