MVlookup.tex

%\documentclass[10pt,a4paper,oneside]{article}
% !TEX TS-program = pdflatex
% !TEX encoding = UTF-8 Unicode

\documentclass[11pt]{article}

%\usepackage[utf8]{inputenc}
%\usepackage[T1]{fontenc}
\usepackage{lmodern}

\usepackage{amsfonts}
\usepackage{amssymb}
\usepackage{amsmath}
\usepackage[total={7in,9in}]{geometry}
\usepackage{graphicx}
\usepackage{lmodern}
\usepackage[bookmarks, colorlinks=false, pdftitle={Lookup Arguments based on Logarithmic Derivatives}, pdfauthor={Ulrich Haboeck}]{hyperref}
\usepackage[colorinlistoftodos]{todonotes}
\usepackage{tikz}
\usepackage{titlesec}
\usepackage{float}
\usetikzlibrary{shapes, fit}
\setcounter{tocdepth}{4}
\setcounter{secnumdepth}{4}
\setlength{\marginparwidth}{3cm}


\usepackage{url}
\usepackage{amsthm}
\usepackage{mathrsfs}
\usepackage{nicefrac}

\usepackage[n,advantage,operators,sets,adversary,landau,probability,notions,logic,ff,mm,primitives,events, complexity,asymptotics,keys]{cryptocode}

\usepackage{listings}
\usepackage{footnote}

\definecolor{dkgreen}{rgb}{0,0.6,0}
\definecolor{gray}{rgb}{0.5,0.5,0.5}
\definecolor{mauve}{rgb}{0.58,0,0.82}

\lstset{%frame=tb,https://www.overleaf.com/project/608bc77c801b16bbadb2210a
  language=sh,
  aboveskip=3mm,
  belowskip=3mm,
  showstringspaces=false,
  columns=flexible,
  basicstyle={\small\ttfamily},
  numbers=none,
  numberstyle=\tiny\color{gray},
  keywordstyle=\color{blue},
  commentstyle=\color{dkgreen},
  stringstyle=\color{mauve},
  breaklines=true,
  breakatwhitespace=true,
  tabsize=3
}

\RequirePackage{etex}

% Theorem environments %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newtheorem{thm}{Theorem}[]
\newtheorem*{thm*}{Theorem}
\newtheorem{cor}{Corollary}[]
\newtheorem{lem}[]{Lemma}
\newtheorem{prop}[]{Proposition}
\newtheorem{conj}[]{Conjecture}
\newtheorem{protocol}[]{Protocol}

\theoremstyle{definition}
\newtheorem{defn}[thm]{Definition}
\newtheorem*{defn*}{Definition}

\theoremstyle{remark}
\newtheorem{rem}[thm]{Remark}
\newtheorem{rems}[thm]{Remarks}
\newtheorem{rem*}[]{Remark}

% MATH %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\newcommand{\Q}{\mathbb{Q}}
\newcommand{\R}{\mathbb{R}}
\newcommand{\C}{\mathbb{C}}
\newcommand{\Z}{\mathbb{Z}}
\DeclareMathOperator{\N}{\mathbb{N}}
\renewcommand{\PP}{\mathbf{P}}
\newcommand{\OO}{\mathcal{O}}


\DeclareMathOperator{\param}{\mathsf{Par}}
\DeclareMathOperator{\gen}{\mathsf{Gen}}
\DeclareMathOperator{\setup}{\mathsf{Setup}}
\DeclareMathOperator{\indexer}{\mathsf{Index}}
\DeclareMathOperator{\comm}{\mathsf{Com}}
\DeclareMathOperator{\open}{\mathsf{Open}}
\DeclareMathOperator{\prove}{\mathsf{Prove}}
\DeclareMathOperator{\extract}{\mathsf{Extract}}
\DeclareMathOperator{\simulate}{\mathsf{Sim}}
\DeclareMathOperator{\RS}{\mathsf{RS}}
\DeclareMathOperator{\FFT}{\mathsf{FFT}}
\DeclareMathOperator{\Quotient}{\mathsf{Quotient}}
\DeclareMathOperator{\agree}{\mathsf{agree}}

\renewcommand{\adv}{\mathsf{Adv}}


\author{%
Ulrich Hab{\"o}ck
\\\\
Orbis Labs
\\
\texttt{team@orbislabs.com}
}

\begin{document}
%\frontmatter
\title{%
Multivariate lookups based on
logarithmic derivatives
}
\date{%
\today\footnote{%
This updated version includes a comparison of the univarate variant of our approach to the closely related flookup proof of radical  (Section 5 in \cite{flookup}) as well as plookup \cite{Plookup}.
}
}
\maketitle


\begin{abstract}
Logarithmic derivatives translate products of linear factors into sums of their reciprocals, turning zeroes into simple poles of same multiplicity.
Based on this simple fact, we construct an interactive oracle proof for batch-column lookups over the boolean hypercube, which makes use of a single multiplicity function instead of working with a rearranged union of table and witnesses.
For single-column lookups the performance is comparable to the well-known \cite{Plookup} strategy used by Hyperplonk+ \cite{Hyperplonk}.
However, the real power of our argument unfolds in the case of batch lookups when multiple columns are subject to a single-table lookup:  
While the number of field operations is comparable to the Hyperplonk+ lookup (extended to multiple columns), the oracles provided by our prover are much less expensive. 
For example, for columns of length $2^{12}$, paper-pencil operation counts indicate that the logarithmic derivative lookup is between $1.5$ and $4$ times faster, depending on the number of columns.
\end{abstract}

%Keywords: SNARKs, recursive proofs, aggregation scheme

%\begin{KeepFromToc}
 \tableofcontents
%\end{KeepFromToc}

%\mainmatter
\section{Introduction}

Lookup arguments prove a sequence of values being member of an, often prediscribed, table. 
They are an essential tool for improving the efficiency of SNARKs for statements which are otherwise expensive to arithmetize. 
Main applications are lookups for relations of high algebraic complexity, and interval ranges which are extensively used by zero-knowledge virtual machines enforcing execution trace elements being valid machine words.
Although closely related to permutation arguments \cite{shuffle, RAMs}, a first explicit occurence of lookups dates back to \cite{Arya}.
The break-through was achieved by Plookup \cite{Plookup}, a permutation argument based argument which improved over the one from \cite{Arya} and provided the first solution for arbitrary tables. 
Since then Plookup (and slight variants of it) is \textit{the} general purpose lookup argument used in many practical applications, for example \cite{Aztek, Halo2, Arkworks, Cairo, Miden}. 
%That strategy, which we call the Plookup strategy in the sequel, uses a rearranged concatenation of witness and table sequence

In this paper we describe a lookup argument based which is based on logarithmic derivatives instead of permutation arguments. 
As in classical calculus, formal logarithmic derivatives turn products $\prod_{i=1}^N (X - z_i)$ into sums of their reciprocals, 
\[
\sum_{i=1}^N \frac{1}{X - z_i},
\]
having poles with the same multiplicity as the zeros of the product.
Working with poles instead of zeros is extremly useful for lookup arguments.
%This fractional decomposition of the logarithmic derivative is extremly useful for lookup arguments:
While strategies for arguments about radicals of products are far from obvious, they turn trivial using logarithmic derivatives.
Concretely, given a sequence of field elements $(a_i)_{i=1}^N$ and another sequence $(t_j)_{j=1}^M$, then $\{a_i: i =1,\ldots, N\}\subseteq \{t_j : j=1,\ldots, M\}$ as sets, if and only if there exists a sequence of field elements $(m_j)_{j=1}^M$ (the multiplicities) such that
\begin{equation*}
\label{e:intro:lookup:eq}
\sum_{i=1}^N \frac{1}{X - a_i}  = \sum_{j=1}^M \frac{m_j}{X - t_j}.
\end{equation*}
 (This holds under quite mild conditions on the field, see Lemma \ref{lem:batchsetmembership} for details.)
%In particular for batch-column lookups, where several columns are subject to the same table, 
Based on this fractional identity we construct lookup protocols which are more  efficient than the Plookup approach, which argues via a sorted union of witness and table sequence.
This is particularly true in the case of \textit{batch-column} lookups, where several sequences (``columns'') are subject to the same table lookup.
%Independent on the number of columns to be looked up, one still works with a single multiplicity function. 
In our lookup the oracle costs, measuring the number and sizes of the oracles the prover needs to provide, are significantly lower than for a lookup based on the Plookup strategy.
For large numbers of columns it is the half, while the arithmetic costs of the interactive oracle prover remain comparable.
%Although our implementation is not ready yet, we compare the two strategies by detailed operation counts, using a benchmark-backed measure for the cost of multi-scalar multiplications in terms of field multiplications.  
%For columns of length $2^{12}$, and considering an elliptic curve over a 256 bit large prime field, our counts indicate a speedup by a factor between $1.5$ and $2.5$, depending on $M$ the number of columns. 
%(For a large numbers of plumns this factor is independent of $M$ and about $1.8$.) 
%Although we describe our lookups in the multivariate setting, we point out  

We stress the fact that we are not the only ones who exploit fractional decompositions for lookups. 
Concurrently, improving the work of \cite{Caulk} and \cite{CaulkPlus}, Gabizon and Khovratovich \cite{flookup} describe a bilinear argument for ``large-table'' lookups, the proving-time of which is independent of the table size. 
Its main contribution is a univariate oracle proof for the radical of a witness sequence (i.e. the sequence with multiple occurences removed) which is almost identical to our approach\footnote{%
Almost simultaneously, G. Roh, W. Dai and A. He published the same idea in a blog post \cite{DaiFlookupBlog}.
To underpin that our work was not influenced by concurrent efforts we refer to the verified commit on our Github repository, 
\url{https://github.com/Orbis-Tertius/MVlookups/commit/4a8816a04e8107e05d4bb0897ee52e72210c84b8} 
}.
Instead of using logarithmic derivatives, their prover explicitly provides the polynomial $R_T(X) = \sum_{j=1}^M m_j \cdot \frac{v_T(X)}{X - t_j}$, where $v_T(X)= \prod_{j=1}^M (X - t_j)$ is the precomputed table polynomial, and shows the identity 
\[
\sum_{i=1}^N \frac{v_T(X)}{X - a_i}  = R_T(X).
\]
While the oracle costs, measured by the number and sizes of the polynomials, are less than in our argument, that advantage is traded for the computation of $R_T(X)$ in the ring of polynomials, which in general consumes $O(M\cdot\log^2 M)$ field operations.
We give comparison with our approach (taken over to the univariate setting) in the appendix. 

This document focuses on multivariate lookups with asymptotic linear prover costs.  
In particular, we consider batch-column lookups with respect to a single, in practice medium-sized table, a use case that is extensively used in execution trace proofs.
%The document s organized as follows. 
In Section \ref{s:preliminaries}, we gather the preliminaries used in the sequel: 
The Lagrange kernel over the boolean hypercube,  basic facts on the formal logarithmic derivative, and a summary of the multivariate sumcheck argument. 
Besides that, we introduce Lagrange interactive oracle proofs, an oracle model we consider the best fit for arguments which are based on the Lagrange representation of polynomials rather than their coefficients.   
In Section \ref{s:lookups} we describe our batch-column lookup based on the logarithmic derivative. 
The protocol comes in two variants, one for a ``small'' number of columns, and another one which performs better for large numbers of columns (and which is asymptotically linear in the instance size). 
For comparison reasons, we add an extra section (Section \ref{s:hyperplonk}) in which we sketch batch-column lookups using the Plookup strategy, adapted to the boolean hypercube. 
These rely on the time shift from Hyperplonk \cite{Hyperplonk}, and we consider them state-of-the-art in the multivariate setting.
%Eventually, in Appendix \ref{s:appendix} we recap the univariate Plookup argument, sketch inner product arguments for Lagrange queries,  and show how to turn the multivariate KZG \cite{Kate} commitment scheme into a scheme for Lagrange queries, which does not access the coefficients at any point in time.

We finally point out, that although our protocols are written for the multilinear setting,  their translation into univariate proofs is straight-forward.
A comparison of the univariate protocol with plookup and the proof of radical from \cite{flookup} is given in Appendix \ref{s:appendix}.


\section{Preliminaries}
\label{s:preliminaries}

\subsection{The Lagrange kernel of the boolean hypercube}

Let $F$ denote a finite field, and $F^*$ its multiplicative group.
Throughout the document we regard the boolean hypercube $H= \{\pm 1\}^n$  as a multiplicative subgroup of  $(F^*)^n$.
For a multivariate function $f(X_1,\ldots, X_n)$, we will often use the vector notation $\vec X = (X_1,\ldots, X_n)$ for its arguments, writing $f(\vec X) := f(X_1,\ldots, X_n)$.
 

%Given a function $f: H\rightarrow F$, its \textit{Langrange interpolation} is the unique multilinear polynomial $p(\vec X)$ in $\vec X = (X_1,\ldots, X_n)$ such that $p(\vec x) = f(\vec x)$ for every $\vec x\in H$.
%As $H$ has an increasing sequence of subgroups $\{1\} = H_0\subset H_1 \subset \ldots \subset H_n = H$, each $H_i$ having order $|H_i| = 2^i$,  Lagrange interpolation can be done in only $2^n$ field multiplications\footnotemark and $n\cdot 2^n$ additions and substractions, compared to $n\cdot 2^{n-1}$ multiplications and additions as for univariate interpolation from order $2^n$ subgroups of $F^*$.
%\footnotetext{%
%The field multiplications are due to the normalization of $f$ by the factor $\frac{1}{2^n}$, and can be entirely omitted an IOP. 
%One ``commits'' to the non-normalized Lagrange interpolant and corrects the queried evaluations.
%}%
%See Appendix \ref{s:appendix} for details. 
The \textit{Lagrange kernel} of $H$ is the multilinear polynomial
\begin{equation}
\label{e:LagrangeKernel}
L_H(\vec X, \vec Y)  = \frac{1}{2^n}\cdot \prod_{j=1}^n (1 + X_j\cdot Y_j).
\end{equation}
Notice that $L_H(\vec X, \vec Y)$ is symmetric in $\vec X$ and $\vec Y$, i.e. $L_H(\vec X, \vec Y)=L_H(\vec Y, \vec X)$, and that \eqref{e:LagrangeKernel} is evaluated within only $\bigO{\log|H|}$ field operations.
Whenever $\vec y \in H$ we have that $L_H(\vec X, \vec y)$ is the Lagrange polynomial on $H$, which is the unique multilinear polynomial which satisfies $L_H(\vec x, \vec y) = 1$ at $\vec x = \vec y$, and zero elsewhere on $H$.
In particular for a function $f: H\rightarrow F$  the inner product evaluation formula 
\[
\left\langle f ,L_H(\:.\:, \vec y)\right\rangle_H := \sum_{\vec x\in H} f(\vec x) \cdot L_H(\vec x, \vec y) = f(\vec y).
\]
is valid for every $\vec y\in H$.
This property extends beyond $H$, as the following Lemma shows.
\begin{lem}
\label{lem:Lagrange}
Let $p(\vec X)$ be the unique multilinear extension of $f: H\rightarrow F$. 
Then for every $\vec y\in F^n$,
\begin{equation}
\label{e:LagrangeScalarProduct}
\left\langle f ,L_H(\:.\:, \vec y)\right\rangle_H = \sum_{x\in H} f(\vec x) \cdot L_H(\vec x, \vec y) = p(\vec y).
\end{equation}
\end{lem}
\begin{proof}
%This is straight forward from the Lagrange representation $p(X)=\sum_{\vec z\in H} f(\vec z) \cdot L_H(\:.\:, \vec z)$.
Since $p(\vec y) = \sum_{\vec x\in H} f(\vec z)\cdot L_H(\vec X,\vec z)$, it suffices to show the claim for $p(X) = L_H(\vec X,\vec z)$, with $\vec z\in H$.
By the property of $L_H(\vec X,\vec z)$, we have $\big\langle L_H(\:.\:, \vec z), L_H(\:.\:,\vec y) \big\rangle_H =L_H(\vec y,\vec z)$, which by symmetry is equal to $L_H(\vec X,\vec y)$ at $\vec X=\vec z$.
This completes the proof of the Lemma.
\end{proof}


%This is tantamount to the tensor-query to point-query paradigm from \cite{TensorIOP} used in a line of work, e.g. \cite{TensorCodes, TensorR1CS, TensorRothblum, TensorR1CSarbitraryF}.
Note that for any $\vec y\in F^n$, the domain evaluation of $L_H(\vec X, \vec y)$ over $H$ can be computed in 
$\bigO{|H|}$ field operations, by recursively computing the domain evaluation of the partial products 
$
p_k(X_1,\ldots, X_k, y_1,\ldots, y_k)= \frac{1}{2^n}\cdot \prod_{j=1}^k (1 + X_j\cdot y_j)
$ 
over $H_k =\{\pm 1\}^k$ from the domain evaluation of $p_{k-1}$, where one starts with $f_0 = \frac{1}{2^n}$ over the single-point domain $H_0$.
Each recursion step costs $|H_{k-1}|$ field multiplications, denoted by $\mathsf M$, and the same number of additions, denoted by $\mathsf A$,  yielding overall
\begin{equation}
\label{e:lagrange:cost}
\sum_{k=1}^{n} |H_{k-1}| \cdot (\textsf M + \textsf A) < |H| \cdot  (\textsf M + \textsf A).
\end{equation}


\subsection{The formal derivate}

Given a univariate polynomial $p(X) =\sum_{k=0}^{d} c_k\cdot X^k$ over a general (possibly infinite) field $F$, its \textit{derivative} is defined as 
\begin{equation}
\label{e:DerivativePoly}
p'(X) := \sum_{k=1}^{d} k \cdot c_k \cdot X^{k-1}.
\end{equation}
As in calculus, the derivative is linear, i.e. for every two polynomials $p_1(X), p_1(X)\in F[X]$, and coefficients $\lambda_1,\lambda_2\in F$,
\begin{equation*}
%\label{e:DerivativeLinear}
(\lambda_1 \cdot p_1(X) + \lambda_2 \cdot p_1(X))' = \lambda_1\cdot p_1'(X) + \lambda_2\cdot p_2'(X)
\end{equation*}
 and we have the product rule
\begin{equation*}
%\label{e:ProductRule}
(p_1(X)\cdot p_2(X))' = p_1'(X)\cdot p_2(X) + p_1(X)\cdot p_2'(X).
\end{equation*}
For a function $\frac{p(X)}{q(X)}$ from the rational function field $F(X)$, the derivative is defined as the rational function
\begin{equation}
\label{e:DerivativeQuotient}
\left(\frac{p(X)}{q(X)}\right)' := \frac{p'(X)\cdot q(X) - p(X)\cdot q'(X)}{q(X)^2}.
\end{equation}
By the product rule for polynomials, the definition does not depend on the representation of $\frac{p(X)}{q(X)}$.
Both linearity as well as the product rule extend to rational functions. 
%For a proof of these facts, as well as alternative definitions for the formal derivative, we refer to standard literature. 

For any polynomial $p(X)\in F[X]$, if $p'(X)=0$ then $p(X)= g(X^p)$ where $p$ is the characteristic of the field $F$.
In particular, if $\deg p(X) < p$, then the polynomial must be constant.
As the analogous fact for fractions is not as commonly known, we give a proof of the following lemma.

\begin{lem}
\label{lem:DerivativeFraction}
Let  $F$ be a field of characteristic $p\neq 0$, and $\frac{p(X)}{q(X)}$ a rational function over $F$ with both  $\deg p(X) < p$ and $\deg q(X) < p$.
If the formal derivative $\left(\frac{p(X)}{q(X)}\right)' = 0$, then $\frac{p(X)}{q(X)} = c$ for some constant $c\in F$.
\end{lem}

\begin{proof}
If $q(X)$ is a constant, then the assertion of the Lemma follows from the corresponding statement for polynomials.
Hence we assume that $\deg q(X)>0$.
Use polynomial division to obtain the representation
\[
\frac{p(X)}{q(X)} = m(X) + \frac{r(X)}{q(X)},
\]
with $m(X), r(X) \in F[X]$, $\deg m(X) \leq \deg p(X)$, and $\deg r(X) < \deg q(X)$ whenever $r(X)\neq 0$.
By linearity of the derivative, we have
$
0 =  \left(\frac{p(X)}{q(X)}\right)' = m'(X) + \left(\frac{r(X)}{q(X)}\right)',
$
and therefore
%\[
%\frac{r'(X)\cdot q(X) - r(X)\cdot q'(X)}{q(X)^2} = - m'(X)
%\]
\begin{equation}
\label{e:der}
r'(X)\cdot q(X) - r(X)\cdot q'(X) = - m'(X)\cdot q(X)^2.
\end{equation}
Comparing the degrees of left and right hand side in \eqref{e:der}, we conclude that  $m'(X) = 0$.
Since $\deg m(X) \leq  \deg p(X) < p$ we have $m(X)= c$ for some constant\footnotemark $c\in F$. 
\footnotetext{%
For general degrees of $p(X)$ we would only be able to conclude that $m(X) = g(X^p)$ for some polynomial $g(X)$. 
}% 
Furthermore, if we had $r(X)\neq 0$ then the leading term of the left hand side in \eqref{e:der} would be
\[
%m\cdot d_m \cdot X^{m-1}\cdot c_n \cdot X^n + d_m \cdot X^{m-1}\cdot n\cdot  c_n \cdot X^{n-1} = 
(k - n) \cdot c_n\cdot d_{k} \cdot X^{n + k - 1},
\]
with $c_n \cdot X^n$, $n>0$, being the leading term of $q(X)$, and  $d_k \cdot X^k$, $0\leq k < n$, the leading term of $r(X)$.
As  $0 < n - k < p$, and both $c_n\neq 0$ and $c_m\neq 0$, the leading term of the left hand side of \eqref{e:der} would not vanish.
Therefore it must hold that  $r(X) = 0$ and the proof of the lemma is complete.
\end{proof}


\subsection{The  logarithmic derivative}

The \textit{logarithmic derivate} of a polynomial $p(X)$ over a (general) field $F$ is the rational function
\begin{equation*}
\frac{p'(X)}{p(X)}.
\end{equation*}
Note that the logarithmic derivative of the product $p_1(X)\cdot p_2(X)$ of two polynomials $p_1(X), p_2(X)$ equals the sum of their logarithmic derivatives, since by the product rule we have 
\[
\frac{(p_1(X)\cdot p_2(X))'}{p_1(X)\cdot p_2(X)} = \frac{p_1'(X)\cdot p_2(X) + p_1(X)\cdot p_2'(X)}{p_1(X)\cdot p_2(X)} 
= \frac{p_1'(X)}{p_1(X)} + \frac{p_2'(X)}{p_2(X)}.
\]
In particular the logarithmic derivative of a product $p(X) = \prod_{i=1}^n (X + z_i)$, with each $z_i\in F$, is equal to the sum
\begin{equation}
\label{e:LogDerivativeProduct}
\frac{p'(X)}{p(X)} %= \sum_{i=1}^n \frac{\prod_{j\in \{1,\ldots, n\}\setminus \{i\}} (X - z_j) }{p(X)} 
= \sum_{i=1}^n \frac{1}{X + z_i}.
\end{equation}

The following lemma is a simple consequence of Lemma \ref{lem:DerivativeFraction} and essentially states that, under quite mild conditions on the field $F$, if two normalized polynomials have the same logarithmic derivative then they are equal. 
We state this fact for our use case of product representations.
\begin{lem}
\label{lem:LogarithmicDerivative}
Let $(a_i)_{i=1}^n$ and $(b_i)_{i=1}^n$ be sequences  over a field $F$ with characteristic $p > n$. 
Then
$
\prod_{i=1}^n \left(X + a_i \right) =\prod_{i=1}^n \left(X + b_i \right)
$
in $F[X]$ if and only if  
\begin{equation*}
%\label{e:LogDerivativeSum}
\sum_{i=1}^n \frac{1}{X + a_i} =\sum_{i=1} ^n\frac{1}{X + b_i}
\end{equation*}
in the rational function field $F(X)$.
\end{lem}

\begin{proof}
If  $p_a(X) = \prod_{i=1}^n \left(X + a_i\right)$ and $p_b(X) = \prod_{i=1}^n \left(X + b_i\right)$
coincide, so do their logarithmic derivatives.
To show the other direction, assume that 
$
\frac{p_a'(X)}{p_a(X)}  = \frac{p_b'(X)}{p_b(X)}.
$
Then 
\[
\left(\frac{p_a(X)}{p_b(X)}\right)'  = \frac{p_a'(X)\cdot p_b(X) - p_a(X)\cdot  p_b'(X)} {p_b^2(X)} = 0.
\]
Hence by Lemma \ref{lem:DerivativeFraction} we have $\frac{p_a(X)}{p_b(X)} = c$ for some constant  $c \in F$.
As both $p_a(X)$ and $p_b(X)$ have leading coefficient equal to $1$, we conclude that $c =1$, and the proof of the Lemma is complete.
\end{proof}
\begin{rem}
\label{rem:LogarithmicDerivativeFunctionField}
We stress the fact that Lemma \ref{lem:LogarithmicDerivative} also applies to the case where $F$  is the function field $F_p(Y_1,\ldots, Y_k)$ over a finite field $F_p$ of characteristic $p$.
This observation will be useful when generalizing the permutation argument to the case where $a_i$ and $b_i$ are multilinear polynomials in $Y_1, \ldots, Y_n$.
\end{rem}

Given a product $p(X)=\prod_{i=1}^N (X + a_i)$ we can gather the poles of its logarithmic derivative obtaining the fractional decomposition  
\begin{align*}
\frac{p'(X)}{p(X)} = \sum_{a\in F} \frac{m(a)}{X + a},
\end{align*}
where $m(a)\in \{1,\ldots, N\}$ is the multiplicity of the value $a$ in $(a_i)_{i=1}^N$.
Fractional decompositions are unique, as shown by the following lemma.
\begin{lem}
\label{lem:UniqueFractionalRep}
Let $F$ be an arbitrary field and $m_1, m_2: F\rightarrow F$ any functions.
Then
$
\sum_{z\in F} \frac{m_1(z)}{X - z} = \sum_{z\in F} \frac{m_2(z)}{X - z}
$
in the rational function field $F(X)$, if and only if $m_1(z)=m_2(z)$ for every $z\in F$.
\end{lem}
\begin{proof}
Suppose that the fractional decompositions are equal.  
Then $\sum_{z\in F} \frac{m_1(z)-m_2(z)}{X - z} = 0$, and therefore
\[
p(X) = \prod_{w\in F} (X - w)\cdot\sum_{z\in F} \frac{m_1(z)-m_2(z)}{X - z} = \sum_{z\in F} (m_1(z)-m_2(z))\cdot \prod_{w\in F\setminus\{z\}} (X - w) = 0.
\]
In particular,
$
p(z) = (m_1(z) - m_2(z)) \cdot  \prod_{w\in F\setminus\{z\}} (z - w)= 0
$
for every $z\in F$.
Since   $\prod_{w\in F\setminus\{z\}} (z - w) \neq 0$ we must have $m_1(z)  = m_2(z)$ for every $z\in F$. 
The other direction is obvious.
\end{proof}

This leads to the following algebraic criterion for set membership, which is the key tool for our lookup arguments.
\begin{lem}[Set inclusion]
\label{lem:batchsetmembership}
Let $F$ be a field of characteristic $p>N$, and suppose that $(a_i)_{i=1}^N$, $(b_i)_{i=1}^N$ are arbitrary sequences of field elements.
Then $\{a_i \}\subseteq \{b_i\}$ as sets (with multiples of values removed), if and only if there exists a sequence $(m_i)_{i=1}^N$ of field elements from $F_q\subseteq F$ such that
\begin{equation}
\label{e:fracs}
\sum_{i=1}^N \frac{1}{X + a_i} = \sum_{i=1}^N \frac{m_i}{X + b_i}  
\end{equation}
in the function field $F(X)$.
Moreover, we have equality of the sets $\{a_i\} = \{b_i\}$, if and only if $m_i\neq 0$, for every $i=1,\ldots, N$.
\end{lem}

\begin{proof}%[Proof of Lemma \ref{lem:batchsetmembership}]
Let us denote by $m_a(z)$ the multiplicity of a field element $z$ in the sequence $(a_i)_{i=1}^N$.
Likewise, we do for $(b_i)_{i=1}^N$.
Note that since $N < p$, the multiplicities can be regarded as non-zero elements from $F_p$ as a subset of $F$.
Suppose that $\{a_i\}\subseteq \{b_i\}$ as sets. 
Set $(m_i)$ as the normalized multiplicities
$
m_i = \frac{m_a(b_i)}{m_b(b_i)}.
$
This choice of $(m_i)$ obviously satisfies \eqref{e:fracs}.

Conversely, suppose that \eqref{e:fracs} holds.
Collecting fractions with the same denominator we obtain fractional representations for both sides of the equation \eqref{e:fracs},  
\begin{align*}
\sum_{i=1}^N \frac{1}{X + a_i} &= \sum_{z\in F} \frac{m_a(z)}{X + z},
\\
\sum_{i=1}^N \frac{m_i}{X + b_i} & = \sum_{z\in F} \frac{\mu (z)}{X + z}.
\end{align*}
Note that since $N < p$, we know that for each $z\in \{a_i\}$ we have $m_a(z)\neq 0$. 
By the uniqueness of fractional representations,  Lemma \ref{lem:UniqueFractionalRep}, $m_a(z) = \mu(z)$ for every $z\in \{a_i\}$, and therefore each $z\in \{a_i\}$ must occur also in $\{b_i\}$. 
\end{proof}


\subsection{Lagrange interactive oracle proofs}

The oracle proofs of many general purpose SNARKs such as Plonk \cite{Plonk} or algebraic intermediate representations \cite{Starks} rely on witnesses that are given in Lagrange representation, i.e. by their values over a domain $H$.
Their multivariate variants may completely avoid the usage of fast Fourier transforms whenever the polynomial commitment scheme can be turned into one that does not need to know the coefficients, neither when computing a commitment nor in an opening proof.
Exactly this property is captured by \textit{Lagrange oracle proofs}, rather than polynomial ones \cite{Dark}.

A \textit{Lagrange interactive oracle proof} (\textit{Lagrange IOP}) over the boolean hypercube $H=\{\pm 1\}^n$ is an interactive protocol between two parties, the ``prover'' and the ``verifier''.
In each round, the verifier sends a message (typically a random challenge) and the prover computes one or several functions over the boolean hypercube, and gives the verifier oracle access to them.
From the moment on it is given access, the verifier is allowed to query the oracles for their inner products with the Lagrange kernel $L_H(\:.\:, \vec y)$, associated with an arbitrary vector $\vec y\in F^n$. 

The security notions for Lagrange IOPs, such as completeness, (knowledge) soundness, and zero-knowledge, are exactly the same as for other interactive oracle proofs.
We assume that the reader is familiear with these, and refer to \cite{IOPs} or \cite{Dark} for their formal definitions.

Lagrange IOPs are turned into arguments by instantiating the Lagrange oracles by a  \textit{Lagrange commitment scheme}.
A Lagrange commitment scheme is a commitment scheme for functions over $H$ that comes with an evaluation proof for Lagrange queries.
For example, inner product arguments \cite{BootleGroth} can be directly used to construct Lagrange commitment schemes,
% (see Appendix \ref{s:appendix: IPA}), 
but also the multilinear variant \cite{MVKZG} of the \cite{Kate} commitment scheme is easily modified to completely avoid dealing with coefficients. 
We suppose that this is well-known, and therefore we omit an explicit elaboration in this paper.


\subsection{The sumcheck protocol}

We give a concise summary on the multivariate sumcheck protocol \cite{sumcheck}.
Given a multivariate polynomial $p(X_1,\ldots, X_n)\in F[X_1,\ldots, X_n]$, a prover wants to convince a verifier upon that
\begin{equation*}
s = \sum_{(x_1,\ldots, x_n) \in \{\pm 1\}^n} p(x_1, \ldots, x_n).
\end{equation*}
This is done by a random folding procedure which, starting with $H_0=\{\pm 1\}^n$, which stepwise reduces a claim on the sum over $H_i = \{\pm 1\}^{n-i}$, $i=0,\ldots, n-1$, to one over the hypercube $H_{i+1}$ of half the size. 
Eventually, one ends up with a claim over a single-point sum, which is paraphrased as the value of $p(X_1,\ldots, X_n)$ at a random point $(r_1,\ldots, r_n)\in F^n$ sampled in the course of the reduction steps.

%The reduction principle is best explained in the case of multilinear polynomials $p(X_1,\ldots, X_n)$.
%In the first round the prover provides the values of the partial sums
%\[
%s_1(x_1) = \sum_{(x_2,\ldots, x_n) \in \{\pm 1\}^{n-1}} p(x_1, x_2, \ldots, x_n),
%\]
%for $x_1\in\{\pm 1\}$, which the verifier checks to sum up to the claimed value, i.e. $v= s_1(-1) + s_1(+1)$. 
%If so,  the verifier samples a random $r_1\sample F$ and both prover and verifier continue on the protocol on the linear combination
%\begin{align*}
%r_1 \cdot s_1(+1) + (1 - r_1) \cdot s_1(-1) &= r_1\cdot\hspace*{-1cm}\sum_{(x_2,\ldots, x_n) \in \{\pm 1\}^{n-1}}  p( +1 , x_2, \ldots, x_n) +  (1-r_1)\cdot\hspace*{-1cm}\sum_{(x_2,\ldots, x_n) \in \{\pm 1\}^{n-1}}  p( -1 , x_2, \ldots, x_n)
%\\
%&= \sum_{(x_2,\ldots, x_n) \in \{\pm 1\}^{n-1}} p(r_1 , x_2, \ldots, x_n),
%\end{align*}
%where the latter equality holds as $p(X_1,\ldots, X_n)$ is a linear polynomial in $X_1$.
%After $n$ reduction steps of this kind, the initial claim is eventually reduced to the evaluation claim for $p(X_1,\ldots, X_n)$ at 
%$(X_1,\ldots, X_n) = (r_1, \ldots, r_n)$.
 
%\begin{protocol}[Sumcheck protocol, \cite{sumcheck}]
%Let $p(X_1,\ldots, X_n)$ be a multivariate polynomial over a finite field $F$. %with individual degrees $d_i=\deg_{X_i} p(X_1,\ldots, X_n)$.
%The sumcheck protocol, in which a prover wants to convince the verifier upon the sum $s = \sum_{(x_1,\ldots, x_n) \in \{\pm 1\}^n} p(x_1, \ldots, x_n)$, is as follows.
%\begin{itemize}
%\item 
%In the first round $i=1$, the prover sends (the coefficients of) the univariate polynomial 
%\[
%s_1(X) = \sum_{(x_{2},\ldots, x_n) \in \{\pm 1\}^{n-1}} p(X ,x_{2}, \ldots, x_n)
%\]
%of degree $d_1\leq \deg_{X_1} p(X_1,\ldots, X_n)$, to the verifier.
%%(This polynomial is computed in linear time from its values over a set $D_1\supseteq \{\pm 1\}$ of size $|D_1| = d_1 + 1$.) 
%The verifier checks if 
%\[
%v = s_1(-1) + s_1(+1),
%\] 
%and if so it responds with a random challenge $r_1$ sampled uniformly from $F$.
%
%\item
%In each of the further rounds $i=2,\ldots, n$, the prover sends the univariate polynomial of degree $d_i \leq \deg_{X_i} p(X_1,\ldots, X_n)$ given by
%\[
%s_i(X) = \sum_{(x_{i+1},\ldots, x_n) \in \{\pm 1\}^{n-i}} p(r_1,\ldots, r_{i-1},X ,x_{i+1}, \ldots, x_n),
%\]
%where $r_1, \ldots, r_{i-1}$ are the randomnesses received in the previous rounds.
%%(Again, the computation is done by interpolation from the values over a set $D_i\supseteq \{\pm 1\}$ of size $|D_i| = d_i + 1$.)
%The prover sends the coefficients of $s_{i}(X)$ to the verifier, which checks whether 
%\[
%s_{i-1}(r_{i-1}) = s_{i}(+1) + s_{i}(-1).
%\] 
%If so, the verifier sends another random challenge $r_i\sample F$ to the prover.
%\end{itemize}
%After these rounds the verifier checks if $s_n(r_n) = p(r_1,\ldots, r_n)$. 
%If so, the verifier accepts (otherwise it rejects).  
%\end{protocol}

\begin{protocol}[Sumcheck protocol, \cite{sumcheck}]
\label{p:Sumcheck}
Let $p(X_1,\ldots, X_n)$ be a multivariate polynomial over a finite field $F$. %with individual degrees $d_i=\deg_{X_i} p(X_1,\ldots, X_n)$.
The sumcheck protocol, in which a prover wants to convince the verifier upon the sum $s = \sum_{(x_1,\ldots, x_n) \in \{\pm 1\}^n} p(x_1, \ldots, x_n)$, is as follows.
We write $s_0(X)$ for the constant polynomial $s_0 =s$.
\begin{itemize}
\item
In each round $i=1,\ldots, n$, the prover sends the coefficients of the univariate polynomial 
\[
s_i(X) = \sum_{(x_{i+1},\ldots, x_n) \in \{\pm 1\}^{n-i}} p(r_1,\ldots, r_{i-1},X ,x_{i+1}, \ldots, x_n),
\]
of degree $d_i \leq \deg_{X_i} p(X_1,\ldots, X_n)$, where $r_1, \ldots, r_{i-1}$ are the randomnesses received in the previous rounds. (In the first round $i=1$ there are no previous randomnesses, and $p(r_1,\ldots, r_{i-1},X ,x_{i+1}, \ldots, x_n)$ is meant to denote $p(X,x_2,\ldots, x_n)$.)
%(Again, the computation is done by interpolation from the values over a set $D_i\supseteq \{\pm 1\}$ of size $|D_i| = d_i + 1$.)
The prover sends the coefficients of $s_{i}(X)$ to the verifier, which checks whether the received polynomial $s_i(X)$ is in fact of the expected degree and that
\[
s_{i-1}(r_{i-1}) = s_{i}(+1) + s_{i}(-1).
\] 
(Again, in the first round $i=1$ there is no $r_0$, and the verifier checks wheather $s_0 = s_1(+1) + s_1(-1)$.) 
If so, the verifier samples random challenge $r_i\sample F$ uniformly from $F$ and sends it to the prover.
\end{itemize}
After these rounds the verifier checks that $s_n(r_n) = p(r_1,\ldots, r_n)$. 
If so, the verifier accepts (otherwise it rejects).  
\end{protocol}


Soundness of the sumcheck protocol is proven by a repeated application of the Schwartz-Zippel lemma. 
We omit a proof, and refer to \cite{sumcheck} or \cite{SumcheckThaler}. 
\begin{thm}[\cite{sumcheck}]
The sumcheck protocol (Protocol \ref{p:Sumcheck}) has soundness error
\begin{equation}
\label{e:SumcheckSoundness}
\varepsilon_{sumcheck} \leq \frac{1}{|F|}\cdot \sum_{i=1}^n \deg_{X_i} p(X_1,\ldots, X_n).
\end{equation}
\end{thm}

The sumcheck protocol is easily extended to the sumcheck for a batch of polynomials $p_i(X_1,\ldots, X_n)$, $i=0, \ldots, L$, by letting the verifier sample a random vector $(\lambda_1,\ldots, \lambda_L)\sample F^L$, and a subsequent sumcheck protocol for the random linear combination
\[
\bar p (X_1, \ldots, X_n) = p_0(X_1,\ldots, X_n) + \sum_{i=1}^{L} \lambda_i \cdot p_i(X_1,\ldots, X_n).
\]
The soundness error bound increases only slightly,
\begin{equation}
\label{e:BatchSumcheckSoundness}
\varepsilon_{sumcheck} \leq \frac{1}{|F|}\cdot \left(1 + \sum_{i=1}^n \deg_{X_i} p(X_1,\ldots, X_n)\right).
\end{equation}

%\begin{rem}
\subsubsection*{Computational cost}

Let us discuss the prover cost of the sumcheck protocol for the case that  $p(\vec X) = p(X_1,\ldots, X_n)$ is of the form
\[
p(\vec X) = Q(w_1(\vec X), \ldots, w_m(\vec X)),
\]
with each $w_i(\vec X)\in F[X_1,\ldots, X_n]$ being multilinear, and 
\[
Q(Y_1,\ldots, Y_m) = \sum_{(i_1,\ldots, i_m)\in \{0,1\}^m} c_{i_1,\ldots, i_m} \cdot Y_1^{i_1}\cdots Y_m^{i_m}
\]
 is a multivariate polynomial having (a typically low) absolute degree $d$.
We denote the arithmetic complexity, i.e. the number of field multiplications $\mathsf M$, substractions $\mathsf S$ and additions $\mathsf A$ to evaluate $Q$ by $|Q|_\textsf M$, $|Q_\textsf S|$ and $|Q|_\textsf A$, respectively.
Each of the univariate polynomials $s_i(X)$, $i=1,\ldots, n$, is of degree at most $d$ the absolute degree of $Q$, and is computed from its values over a set $D\supseteq \{\pm 1\}$ of  size $|D| = d + 1$.
In each step $i=1,\ldots, n$, the values of $s_i(z)$ for $z\in D$ are obtained by linear interpolation of the domain evaluations of each
\[
w_j (r_1,\ldots, r_{i-1}, \pm 1, X_{i+1}, \ldots, X_n)
\]
over $H_{i}=\{\pm 1\}^{n-i}$ as given from the previous step, to the domain evaluation
\[
w_j (r_1,\ldots, r_{i-1}, z, X_{i+1}, \ldots, X_n), 
\]
the values of which are used for computing $s_i(z) = \sum_{(x_{i+1},\ldots, x_n)\in H_{i}} Q(r_1,\ldots, r_{i-1}, z, x_{i+1}, \ldots, x_n)$.
Given the random challenge $r_i$ from the verifier, the domain evaluation of each   
\[
w_j(r_1,\ldots, r_{i-1}, r_i, X_{i+1},\ldots, X_n)
\]
is computed by another linear interpolation.
Linear interpolation costs $|H_i|$ multiplications and the same number of additions/substractions for each multilinear polynomial, the values of $Q$ are obtained within $|Q|_\textsf M \cdot \textsf{M} +   |Q|_\textsf S \cdot \textsf S + |Q|_\textsf A \cdot \textsf A$.  
In terms of field multiplications $\mathsf M$, substractions $\mathsf S$ and additions $\mathsf A$, step $i$ consumes 
% Interpolation of m multilinear polynomials for |D_i| - 2 + 1 many points:
%     m*  |H_i| S for the differences
%     m * (|D_i| - 2) * |H_i| (M + A) for the domain evals for every z in D_i \setminus\{\pm 1\} 
%     m * |H_i| (M + A) for the domain eval at r_i
% Domain evaluation for Q at each point in D_i
%   |D_i| * |H_i| * (Q_M * M + Q_S * S + Q_A * A)
% Sum over |H_i| for Q at each point z in D_i
%  |D_i| * |H_i| * A
\begin{align*}
m\cdot |H_i|\cdot \textsf S + m \cdot (|D| - 1)\cdot |H_{i}| \cdot (\textsf M + \textsf A)
+  |D|\cdot |H_{i}| \cdot ( |Q|_\textsf M \cdot \textsf{M} +   |Q|_\textsf S \cdot \textsf S + |Q|_\textsf A \cdot \textsf A) 
+ |D|\cdot |H_{i}| \cdot \textsf A,
%\\
%< |D|  \cdot |H_{i}| \cdot \big((m + |Q|_M)\cdot\textsf M +  (m+ |Q|_\textsf S) \cdot\textsf S + (m + |Q|_\textsf A + 1)\cdot \textsf A\big).
\end{align*}
where the last term is for the domain sums.
Since $\sum_{i=1}^{n} |H_{i}| = |H| - 1$, the overall cost for the prover is bounded by 
\begin{equation}
\label{e:sumcheck:cost:precise}
|H|\cdot \left(1-\frac{1}{|H|}\right)\cdot \big( (d\cdot m + (d+1)\cdot |Q|_\textsf M)\cdot\textsf M +
 (m + (d + 1)\cdot |Q|_\textsf S) \cdot\textsf S +
(d\cdot m + (d+1)\cdot (|Q|_\textsf A + 1))\cdot\textsf A
\big).
\end{equation}
We shall use this formula for the operation counts of our lookup protocol.


\section{Lookups based on the logarithmic derivative}
\label{s:lookups}

Assume that $F$ is a finite field, and that $f_1, \ldots, f_M$ and  $t: H\rightarrow F$ are functions over the Boolean hypercube $H=\{\pm 1\}^n$. 
By Lemma \ref{lem:batchsetmembership}, it holds that $\bigcup_{i=1}^M \{f_i(\vec x)\}_{\vec x\in H}\subseteq \{t(\vec x)\}_{\vec x\in H}$ as sets, if and only if there exists a function $m: H\rightarrow F$ such that
\begin{equation}
\label{e:lookup:fractional:identity}
\sum_{\vec x\in H} \sum_{i=1}^M \frac{1}{X + f_i(\vec x)} = \sum_{\vec x\in H} \frac{m(\vec x)}{X + t(\vec x)},
\end{equation}
assuming that the characteristic of $F$ is larger than $M$ times the size of the hypercube.
If $t$ is injective (which is typically the case for lookup tables) then $m$ is the multiplicity function, counting the number of occurences for each value $t(\vec x)$ in $f_1,\ldots, f_M$ altogether, i.e.
$m(\vec x) = m_f(t(\vec x)) = \sum_{i=1}^M|\{\vec y \in H: f_i(\vec y) = t(\vec x))|$.
If $t$ is not one-to-one, we set $m$ as the \textit{normalized} multiplicity function 
\begin{equation}
\label{e:lookup:m}
m(\vec x) = 
\frac{m_f(t(\vec x))}{m_t(t(\vec x))} = \frac{ \sum_{i=1}^M |\{\vec y \in H: f_i(\vec y) = t(\vec x))|}{ |\{\vec y \in H: t(\vec y) = t(\vec x))|}.
\end{equation}
The plot for proving that $\bigcup_{i=1}^M \{f_i(\vec x)\}_{\vec x\in H}\subseteq \{t(\vec x)\}_{\vec x\in H}$ is as follows.
Given a random challenge $x\sample F$, the prover shows that the rational identity \eqref{e:lookup:fractional:identity} holds at $X= x$, whenever evaluation is possible. 
However, in order to make \eqref{e:lookup:fractional:identity} applicable to the sumcheck argument, the prover needs to provide multilinear helper functions for the rational expressions. 
We shall discuss two different different approaches for doing that.
In  the first one, explained in Section \ref{s:lookup:small}, we use a single multilinear function for the entire fractional expression in \eqref{e:lookup:fractional:identity}, which is subject to a domain identity over $H$ which has $\bigO{M}$ variables and absolute degree $\bigO{M}$.
This will lead to a protocol with a $\bigO{M^2}$ prover. 
However, if $M$ is not too large this approach will be more performant than the second one, discussed in Section \ref{s:lookup:large}, in which we essentially use helper functions for each of the reciprocals $\frac{1}{x + f_i(\vec x)}$, and $\frac{1}{x + t(\vec x)}$. 
This second variant has a sumcheck polynomial in $\bigO{M}$ many variables, but the absolute degree is bounded, henceforth having a $\bigO M$ prover.


\subsection{An argument for not too many columns}
\label{s:lookup:small}

In this variant we provide a single helper function
\begin{equation}
\label{e:lookup:h}
h(\vec x) = \sum_{i=1}^M \frac{1}{x + f_i(\vec x)} - \frac{m(\vec x)}{x + t(\vec x)},
\end{equation}
subject $\sum_{\vec x\in H} h(\vec x) = 0$.
Correctness of $h$ is ensured by the domain identity
\begin{equation}
\label{e:lookup:h:identity}
\big(h(\vec x) \cdot  (x + t(\vec x)) + m(\vec x)\big) \cdot \prod_{i=1}^M (x + f_i(\vec x)) = (x + t(\vec x))\cdot \sum_{i=1}^M \prod_{j\neq i} (x + f_j(\vec x))
\end{equation}
over $H$, and we apply the Lagrange kernel $L_H(\:.\:, \vec z)$ at a randomly chosen $\vec z\sample F^n$ to reduce the domain identity to another sumcheck over $H$. 
Both sumchecks, the one for $h$ and the one for the domain identity, are then combined into a single one, using another randomness $\lambda\sample F$.

\begin{protocol}[Multi-column lookup over $H=\{\pm 1\}^n$]
\label{prot:lookup}
Let $M\geq 1$ be an integer, and $F$ a finite field with characteristic $p > M\cdot 2^n$. 
Given any functions $f_1, \ldots, f_M, t :H\rightarrow F$ on the boolean hypercube $H=\{\pm 1\}^n$, the Lagrange IOP for that $\bigcup_{i=1}^M\{f_i(\vec x) : \vec x\in H\}\subseteq \{t(\vec x) : \vec x\in H\}$ as sets is as follows.
\begin{enumerate} 
\item
The prover determines the (normalized) multiplicity function $m:H\rightarrow F$ as defined in \eqref{e:lookup:m},
and sends the oracle for $m$ to the verifier.
The verifier answers with a random sample $x\sample F\setminus \{- t(\vec x) : \vec x\in H\}$. 

\item
\label{i:lookup:step1}
Given the challenge $x$ from the verifier, the prover computes the randomized functions $\varphi_i(\vec x) = x + f_i(\vec x)$, $i=1,\ldots, M$, and $\tau(\vec x) =  x + t(\vec x)$.
It determines the values for
\begin{equation}
\label{e:lookup:h:phi}
h(\vec x) = \sum_{i=1}^M \frac{1}{\varphi_i(\vec x)} - \frac{m(\vec x)}{\tau(\vec x)},
\end{equation}
over $H$, and sends the oracle for $h$ to the verifier.

\item
\label{i:lookup:step2}
The verifier responds with a random vector $\vec z \sample F^n$ and a batching randomness $\lambda\sample F$.
Now, both prover and verifier engage in the sumcheck protocol (Protocol \ref{p:Sumcheck}) for 
\begin{align*} 
%\label{e:sumcheckh}
\sum_{\vec x \in H} Q(L_H(\vec x, \vec z),  h(\vec x), m(\vec x),  \varphi_1(\vec x), \ldots, \varphi_M(\vec x),  \tau(\vec x))&= 0,
\end{align*}
where %$Q$ is the degree $M+2$ polynomial
\begin{equation}
\label{e:lookup:Q}
Q(L,h,m, \varphi_1,\ldots, \varphi_M,  \tau) =   
L \cdot \left((h \cdot \tau + m) \cdot \prod_{i=1}^M \varphi_i - \tau\cdot \sum_{i=1}^M \prod_{j\neq i} \varphi_j\right)
+  \lambda \cdot h.
\end{equation}
The sumcheck protocol outputs the expected value $v$ for the multivariate polynomial 
\begin{equation}
\label{e:lookup:QinX}
\begin{aligned}
Q(L_H(\vec X, \vec z), h(\vec X), m(\vec X), \varphi_1(\vec X),\ldots, \varphi_M(\vec X),  \tau(\vec X))
\end{aligned}
\end{equation}
at $\vec X=\vec r$ sampled by the verifier in the course of the protocol.

\item
The verifier queries $[f_1], \ldots, [f_M], [t], [m], [h]$ for their inner product with $L_H(\:.\:,\vec r)$, and uses the answers 
to check whether \eqref{e:lookup:QinX} equals the expected value $v$ at $\vec X = \vec r$. 
(The value $L_H(\vec r, \vec z)$ is computed by the verifier.)
\end{enumerate}
\end{protocol}

\begin{rem}
\label{rem:lookup:completeness}
We imposed the condition $x\notin \{- t(\vec x)\}_{\vec x \in H}$ merely for completeness. 
However in some applications it may be not be desirable, or even not possible, to sample $x$ from outside the range of $t$.
There are several ways to handle this.
One can simply omit the constraint on $x$, letting the verifier sample $x\sample F$ and the prover set $h$ arbitrary whenever \eqref{e:lookup:h:phi} is not defined.
This comes at no extra cost, but the obtained protocol is only overwhelmingly complete. 
That is, with a probability of at most $\frac{|H|}{|F|}$ in the verifier randomness $x$, the honest prover does not succeed.
In practice this is often considered acceptable, and many lookup implementations have a non-zero completeness error. 
Whenever this is not acceptable, one may modify the domain identity \eqref{e:lookup:h:identity} to 
\begin{equation}
\label{e:lookup:h:identity:complete}
\left((h \cdot \tau + m) \cdot \prod_{i=1}^M \varphi_i - \tau\cdot \sum_{i=1}^M \prod_{j\neq i} \varphi_j\right)\cdot \tau\cdot  \prod_{i=1}^M \varphi_i  = 0
\end{equation}
over $H$, which imposes no condition on $h(\vec x)$ whenever $\tau(\vec x)= 0$. 
However, this approach comes at the cost of almost doubling the absolute degree of $Q$.
\end{rem}

Let us point out two variations of Protocol \ref{prot:lookup}.
In the single-column case $M=1$ the lookup argument can be turned into a multiset check for the ranges of $f_1$ and $t$, by setting $m$ as the constant function $m(\vec x) = 1$.
In this case only $h$ needs to be provided by the prover.
More interestingly, Protocol \ref{prot:lookup} is easily extended to a proof of range equality, showing that $\bigcup_{i=1}^M \{f_i(\vec x)\}_{\vec x\in H} = \{\tau (\vec x)\}_{\vec x\in H}$ as sets.
For this the prover additionally shows that $m \neq 0$ over $H$, which is done by providing another auxiliary function $h_m: H\rightarrow F$ subject to $h_m\cdot m = 1$ over $H$.
However, we are not aware of any application of this fact.


\subsection{A variant for a large number of columns}
\label{s:lookup:large}

%The polynomial $Q$ from \eqref{e:lookup:Q} has $\bigO{M}$ variables, and absolute degree $\bigO{M}$.
%This leads to a cost of the overall sumcheck which is \textit{quadratic} in $M$, and it will depend on the size of $M$ whether Protocol \ref{prot:lookup} performs better than a strategy which is linear in $M$, but which uses a linear number of oracles.
%Instead of doing an $M$-fold application of Protocol \ref{prot:lookup} for a single column, we will compare with following optimized strategy.
Assume that $M=2^m$, so that we can index the columns to be looked up by $f_{\vec z}$, where $\vec z\in \{\pm 1\}^m$.
We patch these columns into a single function $f$ over the extended hypercube $\bar H = \{\pm 1\}^m \times H$ by
\[
f(\vec y, \vec x) = \sum_{\vec z\in \{\pm 1\}^m} L_{m}(\vec y, \vec z)\cdot f_{\vec z}(\vec x),
\]
where $L_m(\vec y, \vec z)$ is the Lagrange polynomial for $\{\pm 1\}^m$. 
Given the random challange $x\sample F\setminus \{-t(\vec x) : \vec x\in H\}$ from the verifier, the prover supplies an oracle for the values of
\begin{equation}
\label{e:lookup:large:h}
h(\vec y, \vec x) = \frac{1}{x + f(\vec y, \vec x)} - \frac{\tilde m(\vec x)}{x + t(\vec x)}
\end{equation}
over the extended hypercube, where $\tilde m(\vec x)= \frac{1}{2^m}\cdot m(\vec x)$. 
The supplementary function $h$ is subject to the domain identity
%\[
%h(\vec y, \vec x) \cdot (x + f(\vec y, \vec x))\cdot (x + t(\vec x)) = x + t(\vec x)  - \tilde m(\vec x)\cdot (x + f(\vec y, \vec x))
%\]
\[
\big(h(\vec y, \vec x) \cdot (x + t(\vec x)) + \tilde m(\vec x) \big) \cdot (x + f(\vec y, \vec x)) - (x + t(\vec x)) = 0
\]
over $\bar H$, and 
\[
\sum_{\vec y \in \{\pm 1\}^m}\sum_{\vec x\in H} h(\vec y, \vec x) = 0.
\]
Again, the domain identity is turned into a sumcheck over $\bar H$ by applying the Lagrange kernel $L_{\bar H}(\:.\:, \vec z)$, where $\vec z$ is now sampled from $F^{m + n}$.
Combining the two sumchecks using a random $\lambda\sample F$ leads to the overall sumcheck  
\begin{align*} 
%\label{e:sumcheckh}
\sum_{\vec y \in\{\pm 1\}^m}\sum_{\vec x \in H} Q(L_{\bar H}((\vec y,\vec x), \vec z),  h(\vec y, \vec x), \tilde m(\vec x),  \varphi(\vec y, \vec x), \tau(\vec x))&= 0,
\end{align*}
over $\bar H=\{\pm 1\}^m \times H$, with $\varphi(\vec y, \vec x) = x + f(\vec y, \vec x)$ and $\tau(\vec x)= x + t(\vec x)$, and where 
$Q$ is 
\begin{equation}
\label{e:lookup:Q:linear}
Q(L,h, \tilde m, \varphi, \tau) =   
L \cdot \left((h \cdot \tau +\tilde  m) \cdot \varphi - \tau \right)
+  \lambda \cdot h.
\end{equation}
In this variant, providing $h$ amounts to $M$ oracles over $H$, yielding an overall equivalent of $M+1$ oracles of size $|H|$. 
However, $Q$ has only $\nu=5$ variables and its absolute degree is  independent of the number of columns.
%\end{rem}


\subsection{Soundness}
\label{s:lookup:soundness}

The soundness analysis of Protocol \ref{prot:lookup} is a straight-forward application of the Schwartz-Zippel lemma and the Lagrange-query to point-query correspondence stated by Lemma \ref{lem:Lagrange}.
We merely sketch it.
The univariate rational lookup identity \eqref{e:lookup:fractional:identity} is turned into a polynomial identity of degree at most $|H|\cdot (M+1) - 1$ by multiplying it with the common denominator  
\begin{equation}
\label{e:lookup:common:denominator}
p(X) = \prod_{\vec x\in H} (X + t(\vec x)) \cdot \prod_{i=1}^M (X + f_i(\vec x)).
\end{equation}
Since we sample $x$ from a set of size at least $|F|-|H|$, the soundness error of Step \ref{i:lookup:step1} of the protocol is at most 
\begin{equation}
\label{e:lookup:epsilon1}
\varepsilon_1 \leq \frac{(M+1)\cdot |H| - 1 }{|F|-|H|}.
\end{equation}
The soundness error due to the reduction of the domain identity  \eqref{e:lookup:h:identity}  to the Lagrange kernel based sumcheck is 
\[
\varepsilon_2 \leq \frac{1}{|F|},
\]
as scalar products with the Lagrange kernel translate to point evaluation of the multilinear extension.
This yields the following theorem. 

\begin{thm}
\label{thm:lookup:soundness}
 The interactive oracle proof described Protocol \ref{prot:lookup} has soundness error
\[
\varepsilon < \frac{(M+1)\cdot |H| - 1 }{|F|-|H|} + \varepsilon_{sumcheck},
\]
where $\varepsilon_{sumcheck}$ is the soundness error of the sumcheck argument \eqref{e:BatchSumcheckSoundness} over $H$ for a multivariate polynomial in $M+4$ variables with maximum individual degree $M+3$.
\end{thm}
\begin{rem}
The $\bigO M$-variant described in Section \ref{prot:lookup} has same soundness error, with $\varepsilon_{sumcheck}$ being the soundness error of the sumcheck argument over the extended hypercube of size $M\cdot |H|$ for a multivariate polynomial in $\nu=5$ variables and maximum individual degree $4$. 
\end{rem}

\begin{rem}
\label{s:pa:Generalizations}
Protocol \ref{prot:lookup} and its variant from Section \ref{s:lookup:large} are easily generalized to functions with multilinear values, 
\begin{align*}
t(\vec x) &= \sum_{(j_1,\ldots, j_k)\in\{0,1\}^k} t_{j_1,\ldots, j_k}(\vec x)\cdot Y_1^{i_1}\cdots Y_k^{j_k},
\\
f_i(\vec x) &= \sum_{(j_1,\ldots, j_k)\in\{0,1\}^k} f_{i, j_1,\ldots, j_k}(\vec x)\cdot Y_1^{i_1}\cdots Y_k^{j_k},
\end{align*}
$i=1,\ldots, M$, without changing the soundness error bound from Theorem \ref{thm:lookup:soundness}.
As $F[X, Y_1,\ldots, Y_k]$ is a unique factorization domain, and polynomials of the form $X -  \sum_{(i_1,\ldots, i_k)\in\{0,1\}^k} c_{i_1,\ldots, i_k}\cdot Y_1^{i_1}\cdots Y_k^{i_k}$ are irreducible, we may apply Lemma \ref{lem:batchsetmembership} to see that $\bigcup_{i=1}^M \{f_i(\vec x)\}_{\vec x\in H}\subseteq \{t(\vec x)\}_{\vec x\in H}$ as sets in the rational function field $F(X, Y_1,\ldots, Y_k)$, if and only if there exists a function $m: H\rightarrow F$ such that
\begin{equation}
\label{e:lookup:fractional:identity:general}
\sum_{\vec x\in H} \sum_{i=1}^M \frac{1}{X + f_i(\vec x)(\vec Y)} = \sum_{\vec x\in H} \frac{m(\vec x)}{X + t(\vec x)(\vec Y)}.
\end{equation}
The only change to Protocol \ref{prot:lookup} is that the verifier now samples $x$ from $F$ and $\vec y = (y_1,\ldots, y_k)$ from $F^k$, and continues with $x - f_i(\vec x)$ and $x - t(\vec x)$ replaced by $x - f_i(\vec x)(\vec y)$ and $x - t(\vec x)(\vec y)$.
%As the individual degrees with respect to $Y_1$, \ldots, $Y_k$ are again bounded by $|H|$, the soundness error does not change.
%Alternatively, at the cost of a slight increase of the soundness error one can choose $f, g: H\longrightarrow F[X]$ with values being polynomials of degree at most $k-1$ . 
\end{rem}

\subsection{Computational cost}
\label{s:lookup:cost}

The polynomial $Q$ from \eqref{e:lookup:Q} has $\nu= M + 4$ variables, and absolute degree $d = M + 3$.
Let us discuss an domain evaluation strategy for the values of $Q$, which makes use of batch inversion. 
This strategy allows us to evaluate $Q$ much more efficiently than using \eqref{e:lookup:Q}, but demands a modification of the sumcheck operation count formula \eqref{e:sumcheck:cost:precise}.
%Let us elaborate an evaluation strategy for $Q$, which makes use of 
% Arithmetic complexities, using the inverses of phi_1, ..., phi_{M-1}
% 	M  - 1 multiplications for p = phi_1 * ... * phi_{M-1} * phi_M
% 	M - 1 multiplications to obtain the partial products p_i = \prod_{j\neq i} \phi_j for i=1,..,M-1. (the product p_M is already known)
%	2 muls for m * p + tau * (h * p + \sum_{i} p_i)
%	another 2 muls for the product with L and lambda.
Assume that the inverses of $\varphi_1, \ldots, \varphi_{M-1}$ are given.
Then we may evaluate $Q$ by the fractional representation
\begin{align*}
%Q &= L\cdot  \prod_{i=1}^M\varphi_i\cdot \left(m + \tau\cdot \left(h  - \sum_{i=1}^M \frac{1}{\varphi_i} \right) \right) + \lambda\cdot h,
%\\
Q &= L\cdot  \prod_{i=1}^{M-1}\varphi_i\cdot \left(\varphi_M\cdot m + \tau\cdot \left(\varphi_M\cdot h - \left(\sum_{i=1}^{M-1} \frac{1}{\varphi_i} +1 \right)\right) \right) + \lambda\cdot h.
\end{align*}
This costs  $M + 4$ multiplications, one substraction, and $M+1$ additions, and hence
the arithmetic complexities are $|Q_\mathsf M| =  M +  4$, $|Q_\mathsf S| = 1$, $|Q_\mathsf A| = M + 1$.
Now, to attribute the inverses in formula \eqref{e:sumcheck:cost:precise}, we increase the multiplicative complexity by $3\cdot (M - 1)$, which represents the fractional cost of the batch inversion\footnotemark of $\varphi_1, \ldots, \varphi_{M-1}$. 
\footnotetext{%
Batch, or Montgomery inversion, of a sequence $(a_i)_{i=1}^N$ computes the inverses of $a_i^{-1}$ by recursively computing the cumulative products $p_i = a_1\cdot\ldots \cdot a_n$, $i=0,\ldots,n$, then calculating their inverses $q_i = \frac{1}{p_i}$ in a reverse manner   
starting with $q_n = \frac{1}{p_n}$, and putting $q_{i-1} = q_i \cdot a_i$, where $i$ goes from $n$ down to $1$. 
The inverses are then derived via $a_i^{-1} = p_{i-1}\cdot q_{i}$, where $p_0:=1$.
The overall cost of the batch inversion is $3\cdot (N-1)$ multiplications and a single inversion.
}%
This yields the following equivalent complexities
 \[
|Q_\mathsf M| = 4\cdot M + 1 , |Q_\mathsf S| = 1, |Q_\mathsf A| = M + 1,
\]
which we may plug into formula \eqref{e:sumcheck:cost:precise}.

Therefore the prover cost of Protocol \ref{prot:lookup} is as follows:
Given the values of $f_1, \ldots, f_M$ and $t$ over $H$, computing $\varphi_1 = x + f_1, \ldots, \phi_M = x + f_M$,  $\tau = x + t$ costs $|H|\cdot (M+ 1) \cdot\mathsf A$, and their reciprocals $\frac{1}{\varphi_1},  \ldots, \frac{1}{\varphi_M}$, $\frac{1}{\tau}$ are obtained within $3\cdot |H| \cdot (M+1) \cdot\mathsf M$, using batch inversion. 
With these reciprocals we obtain the values for
\[
h = \sum_{i=1}^M \frac{1}{\varphi_i} - \frac{m}{\tau} 
\]
by $|H|\cdot (1\cdot \mathsf S +(M-1)\cdot\mathsf A)$. 
By the remark following Lemma \ref{lem:Lagrange}, the values for $L_H(\vec X, \vec y)$ over $H$ are obtained within $|H|\cdot (\mathsf M + \mathsf A)$ operations. 
Hence the total cost of the preparation phase is
\[
|H|\cdot ((3\cdot M + 4)\cdot\mathsf M + 1\cdot\mathsf S + (2\cdot M + 1) \cdot\mathsf A.
\]
According to \eqref{e:sumcheck:cost:precise} the sumcheck costs  
\begin{equation*}
|H| \cdot \left(1-\frac{1}{2^{n}}\right)\cdot \big((5\cdot M^2 + 24 \cdot M + 16) \cdot\textsf M + (2\cdot M + 8)\cdot\textsf S + (2\cdot M^2 + 13\cdot M + 20) \cdot \textsf A\big).
\end{equation*}
However, as we may reuse the reciprocals of $\varphi_1,  \ldots, \varphi_{M-1}$ in the first step of the sumcheck, we correct the sumcheck cost by substracting $|H|\cdot (3\cdot (M-1))$. 
Neglecting the $\nicefrac{1}{2^{n}}$-term, the overall cost of the prover is 
\begin{equation}
\label{e:lookup:cost}
|H| \cdot  \big((5\cdot M^2 + 24 \cdot M + 23) \cdot\textsf M 
+ (2\cdot M + 9)\cdot\textsf S 
+ (2\cdot M^2 + 15\cdot M + 21) \cdot \textsf A\big),
\end{equation}
whereas it provides two $H$-sized oracles. 
The cost is $\bigO{|H|}$ but depends quadratically in $M$ the number of columns to be looked up. 
This quadratic occurence is due to the fact that both, the number of function as well as the degree of $Q$ grow linearly in $M$. 

The cost for the $\bigO{M}$ strategy from Section \ref{s:lookup:large} is as follows.
There, the prover provides $h$ over the extended hypercube of size $|\bar H|=M\times |H|$, and $Q$ from \eqref{e:lookup:Q:linear} has $\nu = 5$ variables, absolute degree $d=4$, and arithmetic complexities  $|Q_\mathsf M|= 4$, $|Q_\mathsf S|= 1$, $|Q_\mathsf A|=2$.
Computing the values of $h$ over $\bar H$ using batch inversion costs 
\[
M\cdot |H| \cdot(3 \cdot \mathsf M + \mathsf A),
\]
and the values for $L_{\bar H}(\:,\:, \vec z)$ over $\bar H$ are determined in $M\cdot |H|\cdot (\mathsf M + \mathsf A)$.
The overall sumcheck costs
\[
M\cdot |H| \cdot \left(1-\frac{1}{M\cdot |H|}\right)\cdot \big(40 \cdot \textsf M + 10\cdot\textsf S + 35 \cdot \textsf A\big).
\]
Neglecting the $\frac{1}{M\cdot |H|}$-term, the overall cost for the prover is
\begin{equation}
\label{e:lookup:large:cost}
M\cdot |H|\cdot (44\cdot\mathsf M + 10\cdot\mathsf S + 37\cdot\mathsf A),
\end{equation}
but the prover needs to provide the oracles for one function over  a domain of size $M\cdot|H|$, and one over $H$. 


Let us estimate the range for $M$ where Protocol \ref{prot:lookup} is more efficient than the $\bigO M$ variant. 
For this we use the benchmarks from Table \ref{tab:pippenger} which measure the equivalent number of field multiplication for a multi-scalar multiplication in an elliptic curve over $256$ bit large field.
Based on this equivalent, and our operation counts for the oracle prover, we obtain the following break even points.
\begin{table}[h!]
\caption{%
The estimated number of columns $M$ where the $\bigO{M}$ strategy starts to perform better than Protocol \ref{prot:lookup}. 
The numbers are based on the operation counts \eqref{e:lookup:cost} and \eqref{e:lookup:large:cost} for the oracle prover, and the 
benchmarks for a multi-scalar multiplication over the Pallas curve, see Table \ref{tab:pippenger}.
}
\vspace*{0.5cm}
\centering
\begin{tabular} {|c|c|c|c|c|}
\hline
$\log|H|$ & 12 & 14 & 16 & 18
\\\hline
$M$ & 114 & 95 & 81 & 73
\\\hline
\end{tabular}
\end{table}


%%
%% Poseidon hash comparison
%%
%As a lower bound for elliptic curve based schemes, we take the number of field operations for $x^5$-Poseidon with rate $r=2$, capacity $c=1$, $R_F=8$ full rounds and $R_p=57$ partial rounds. (These are the parameters from \cite{Poseidon} for a security level of $128$ bits over a $255$ bit large base field.) 
%Each permutation, which processes $r=2$ many elements costs
%$
%604\cdot\mathsf M + 591\cdot\mathsf A, 
%$ 
%using the optimized evaluation strategy from Appendix B in \cite{Poseidon}.
%Hence computing the hash of two functions over $H$ costs
%\[
%|H|\cdot (604\cdot\mathsf M + 591\cdot\mathsf A). 
%\]
%Therefore a $M$ single-column lookups cost 
%\begin{equation}
%\label{lookup:cost:Mcolumn:naive}
%|H|\cdot ((604\cdot M + 42)\cdot\mathsf M + 10\cdot\mathsf S + (591\cdot M + 30)\cdot\mathsf A),
%\end{equation}
%and the batch-column lookup for $M$ columns
%\begin{equation}
%\label{lookup:cost:Mcolumn:batch}
%|H| \cdot  \big((5\cdot M^2 + 19 \cdot M + 1226) \cdot\textsf M 
%+ (2\cdot M + 8)\cdot\textsf S 
%+ (2\cdot M^2 + 13\cdot M + 621) \cdot \textsf A\big).
%\end{equation}
%Comparing \eqref{lookup:cost:Mcolumn:naive} and \eqref{lookup:cost:Mcolumn:batch} yields  a maximum speedup by a factor of $5$ for $M=11$, more than double as fast for $3\leq M\leq 58$, and a break-even point at about $M=124$ columns.
%


%Compared to the cost of the univariate lookup PIOP (see Appendix \ref{s:uv:lookup}), the multivariate lookup has its break-even point at about $\log |H| \approx 2.3$.
%However, this is without taking into account that in the univariate case a total of $3$ polynomials of size $|H|$ need to be provided, which is one more than in our protocol (neglecting the overall quotient polynomial). 
%Thus it is safe to assume that Protocol \ref{prot:pa} is throughout faster than the univariate case, and we expected it to be more than three times as performant at $\log|H| \approx 12$.

\section{Lookups based on the Hyperplonk shift}
\label{s:hyperplonk}

In this section we informally discuss batch-column lookups based on the Plookup strategy and the time shift from Hyperplonk+ \cite{Hyperplonk}. 
We  give detailed operation counts and compare with our lookup arguments from Section \ref{s:lookups}.

On of the main contributions of \cite{Hyperplonk} is the introduction of an (almost) transitive time shift $T: H\rightarrow H$ which translates to multilinear extensions in a tame manner.  
The shift is derived from the multiplication by a primitive root in $GF(2^n)$, 
\[
T(x_1, \ldots, x_n) = \frac{1 + x_n}{2} \cdot (1, x_1,\ldots, x_{n-1}) + \frac{1 - x_n }{2} \cdot (- 1,(-1)^{1 - c_1}\cdot x_1,\ldots, (-1)^{1 - c_{n - 1}}\cdot x_{n-1}),
\]  
where the $c_i\in\{0,1\}$ are the coefficients of a primitive polynomial $1 + \sum_{i=1}^{n-1} c_i\cdot X^i + X^n$ over $GF(2)$.
The time shift acts transitively on the punctuated hypercube $H' = H\setminus\{\vec 1\}$ (as a group automorphism it has $\vec 1$ as a fixed point), and more importantly, evaluations of a shifted function $f(T(\vec x))$ can be simulated from two evaluations of $f$ by
\begin{equation}
\label{e:hyperplonk:shifted:function}
f(T(x_1, \ldots, x_n)) =  \frac{1 + x_n}{2} \cdot f(1, x_1,\ldots, x_{n-1}) + \frac{1 - x_n }{2} \cdot f(- 1,(-1)^{1 - c_1}\cdot x_1,\ldots, (-1)^{1 - c_{n - 1}}\cdot x_{n-1}).
\end{equation}

%Although the Hyperplonk lookup uses the , 
Using the time shift $T$ allows for the same strategy for the univariate Plookup argument. (See Appendix \ref{s:uv:lookup} for a recap.)
The argument is based on the fact that, given two sequences of field elements $(a_i)_{i=0}^{N-1}$ and $(t_i)_{i=0}^{N-1}$, then $\{a_i : j= 0,\ldots, N-1\}\subseteq \{t_i : i = 0, \ldots, N-1\}$ as sets, if and only if there exists a sequence $(s_i)_{i=0}^{2N-1}$ of double the size, which satisfies the lookup identity
\begin{equation}
\label{e:lookup:Gabizon}
\prod_{i=0}^{N-1} (X + s_{i}+ s_{i+1\bmod N}\cdot Y)
= \prod_{i=0}^{N-1}  (X + a_i + a_i\cdot Y)\cdot (X + t_i + t_{i+1\bmod N}\cdot Y).
\end{equation}
(The sequence $(s_i)_{i=0}^{2N-1}$ is  the concatenation of the $(a_i)$ and $(t_i)$, ordered by value in the same way as given by $t$.)
We again discuss two approaches\footnotemark for dealing with the grand product obtained from \eqref{e:lookup:Gabizon}, when sampling random $(\alpha, \beta)\sample F^2$ for $(X,Y)$.
\footnotetext{%
We point out that the presented strategies slightly differ from the one in \cite{Hyperplonk}, which uses the more expensive grand product argument from \cite{Quarks}.
}
The first one applies a batched grand product argument over $H$, independent of $M$ the number of columns, leading to a $\bigO{M^2}$ prover for similar reasons as Protocol \ref{prot:lookup} does.
The second one proves the grand product argument over an extended hypercube $\bar H$ of size $M\cdot |H|$, which on the one hand leads to larger functions to be provided, but on the other hand has a $\bigO M$ prover which outperforms the first approach at a high number of columns.

 
%In terms of performance both are equivalent.)

\subsection{The argument for not too many columns}
\label{s:hyperplonk:small}

Let $t: H'\rightarrow F$ be the lookup table, and $f_i:  H'\rightarrow F$, $i=1,\ldots,M$, the functions subject to the lookup.
Although the functions are defined over the punctuated hypercube $H'$, we assume arbitrary values at $\vec 1$.
(These will be ignored by the lookup argument.)
The prover provides the ordered merge of the $f_i$  together with $t$ via additional functions $s_i: H'\rightarrow F$, $i=1,\ldots, M+1$, subject to the generalized lookup identity
\begin{multline*}
\prod_{\vec x\in H'} \prod_{i=1}^{M} (X + s_{i}(\vec x) + s_{i+1}(\vec x)\cdot Y)\cdot (X + s_{M+1}(\vec x) + s_1(T(\vec x))\cdot Y)
\\
= \prod_{\vec x\in H'} \prod_{i=1}^{M} (X + f_i(\vec x) + f_i(\vec x)\cdot Y)\cdot (X + t(\vec x) + t(T(\vec x))\cdot Y).
\end{multline*}
The identity is reduced to a grand product over $H'$ using random samples $\alpha, \beta\sample F$ for $X$ and $Y$, yielding 
\[
\prod_{\vec x\in H'} h(\vec x) = 1,
\]
where 
\[
h(\vec x) = \frac{\alpha + s_{M+1}(\vec x) + s_1(T(\vec x))\cdot \beta}{\alpha + t(\vec x) + t(T(\vec x))\cdot \beta}\cdot 
\prod_{i=1}^M \frac{(\alpha + s_i(\vec x) + s_{i+1}(\vec x)\cdot \beta)}{
(\alpha + f_i(\vec x) + f_i(\vec x)\cdot \beta)}.
\]
%\[
%\prod_{\vec x\in H'}\frac{\alpha + s_{M+1}(\vec x) + s_1(T(\vec x))\cdot \beta}{\alpha + t(\vec x) + t(T(\vec x))\cdot \beta}\cdot 
%\prod_{i=1}^M \frac{(\alpha + s_i(\vec x) + s_{i+1}(\vec x)\cdot \beta)}{
%(\alpha + f_i(\vec x) + f_i(\vec x)\cdot \beta)} = 1.
%\]

%The lookup IOP is as follows. 
%(The IOP in \cite{Hyperplonk} uses the grand product argument from \cite{Quarks}, but the protocol is easily improved by taking over the univariate strategy)
The prover computes the cumulative products of the values $h(\vec x)$ along the orbit of the time shift $T$ on $H'$, starting with $\phi(-\vec 1) = 1$, and setting
\[
\phi\big(T^{k}(-\vec 1)\big) = \phi\big(T^{k-1}(-\vec 1)\big) \cdot h\big(T^{k-1}(-\vec 1)\big),
\]
for $k=1, \ldots, |H|-1$.
At the remaining point $\vec x = \vec 1$ outside $H'$, the prover sets $\phi(\vec x)$ to zero.
Correctness of the grand product is proven by the constraint on its initial value $\phi(-\vec 1) = 1$, and the domain identity
\begin{equation}
\phi(T( \vec x)) \cdot  \tau(\vec x)\cdot \prod_{i=1}^M  \varphi_{i}(\vec x) 
- \phi(\vec x)\cdot \sigma_{M+1}(\vec x)\cdot \prod_{i=1}^M \sigma_{i}(\vec x)
= 0, 
\end{equation}
for all $\vec x \in H$, where $\tau(\vec x) = \alpha + t(\vec x) + \beta\cdot t(T( \vec x))$ and
\begin{align*}
\varphi_{i} (\vec x) &=\alpha +  (1+\beta)\cdot f_i(\vec x),
\\
\sigma_{i}(\vec x) &= \alpha + s_i(\vec x) + \beta\cdot s_{i+1}(\vec x), 
\end{align*}
 for $i=1,\ldots, M$, except $\sigma_{M+1}(\vec x) =  \alpha + s_{M+1}(\vec x) + \beta\cdot s_1(T( \vec x))$.
As in Protocol \ref{prot:lookup}, both constraints on $\phi$, the initial value condition and the domain identity, are reduced to sumchecks over $H$ by help of the Lagrange polynomials $L_H(\:.\:, -\vec 1)$ and $L_H(\:.\:, \vec y)$, where $\vec y\sample F^n$, and then combined into a single one by a batching randomness $\lambda\sample F$.
The resulting overall sumcheck is
\[
\sum_{\vec x\in H} Q(L_H(\vec x, \vec y), L_H(\vec x, -\vec 1),  \tau(\vec x), \varphi_{1}(\vec x),\ldots, \varphi_{M}(\vec x), \sigma_{1}(\vec x),\ldots, \sigma_{M+1}(\vec x),  \phi(\vec x), \phi(T(\vec x))) = 0,
\]
where
\begin{equation}
\label{e:lookup:hyperplonk:Q}
Q(L_H, L,   \tau, \varphi_1,\ldots, \varphi_M, \sigma_1,\ldots, \sigma_{M+1}, \phi, \phi_T) 
= L_H \cdot \Big(\phi_T \cdot \tau \cdot\prod_{i=1}^M \varphi_i
- \phi \cdot \prod_{i=1}^{M+1} \sigma_i \Big)
+ \lambda\cdot L \cdot (\phi - 1). 
\end{equation}
Note that $Q$ has  absolute degree $d = M+3$, which is the same as in Protocol \ref{prot:lookup}, but  about the double of variables, $\nu= 2\cdot M+ 6$.
Its arithmetic complexities are $|Q_\mathsf M|= 2\cdot(M+1) + 3$, $|Q_\mathsf S|= 2$, $|Q_\mathsf A|= 1$. 


\subsubsection*{Prover costs}

The prover cost for the multi-colum Plookup is as follows.
Computing the values for $\tau$ and all $\varphi_i,  \sigma_i$ over $H$ consumes overall
\[
|H|\cdot (2\cdot(M+1)\cdot \mathsf M + (3 M+ 4)\cdot \mathsf A),
\]
the quotient $h(\vec x)$ is obtained within $|H|\cdot (2\cdot M + 4)\cdot\mathsf M$, using batch inversion.
From these values $\phi(x)$ over $H$ is derived by another  $|H|\cdot \mathsf M$.
The domain evaluation for $L_H(\vec X, \vec y)$ is obtained within $|H|\cdot (\mathsf M + \mathsf A)$ operations, and the sumcheck costs
\begin{equation*}
|H| \cdot \left(1-\frac{1}{|H|}\right)\cdot \big((4\cdot M^2 + 25\cdot M + 38)\cdot \mathsf M +  (6\cdot M + 20)\cdot\textsf S + (2\cdot M^2 + 14\cdot M + 26) \cdot \textsf A\big).
\end{equation*}
Neglecting the $\nicefrac{1}{|H|}$-term, the overall cost of the prover is 
\begin{equation}
\label{e:hyperplonk:cost}
|H| \cdot  \big((4\cdot M^2 + 29\cdot M + 46)\cdot \mathsf M +  (6\cdot M + 20)\cdot\textsf S + (2\cdot M^2 + 17\cdot M + 31) \cdot \textsf A\big).
\end{equation}
This is comparable with \eqref{e:lookup:cost}, but does not take into account that the prover needs to supply a total of $M+1$ functions over $H$ instead of two. 


\subsection{A variant for a large number of columns}
\label{s:hyperplonk:large}

Suppose that we have a table function $t:H'\rightarrow F$ defined over the punctuated hypercube $H' = H\setminus\{\vec 1\}$, and $M$ column functions $f_1,\ldots, f_M: H\rightarrow F$ defined on the \textit{entire} hypercube.
%Again, $t$ takes an arbitrary value at $\vec 1$, which is ignored by the lookup 
We assume that  $M + 1=2^m $ for an integer $m\geq 1$, so that we can index the column functions as $f_{\vec z}$, with $\vec z$ from $\{\pm 1\}^m \setminus \{\vec 1\}$.
The ordered concatenation of the column functions and the table function has now $2^m \cdot |H| - 1$ entries, which we arrange in a single function $s$ over $\bar H\setminus \{\vec 1\}$ along the orbit of the time shift $T_{\bar H}$ on $\bar H$, where  $\bar H = \{\pm 1\}^m\times H$.
The value of $s$ at $\vec 1$ can be set arbitrary (it will be ignored by the lookup argument).
Using the patched  function
\[
f(\vec y, \vec x) = \sum_{\vec z \in \{\pm 1\}^m\setminus\{\vec 1\}} L_m(\vec y, \vec z)\cdot f_{\vec z}(\vec x),
\]
over $\bar H$, where $L_m$ is the Lagrange kernel for $\{\pm 1\}^m$, the lookup identity at random $(\alpha,\beta)\sample F^2$ reads now as
\[
\prod_{(\vec y, \vec x)\in \bar H'}  (\alpha + s(\vec y, \vec x) + \beta\cdot s(T_{\bar H} (\vec y, \vec x))) = \prod_{(\vec y, \vec x)\in \bar H'}  (\alpha + (1 + \beta)\cdot f(\vec y, \vec x) + L_{m}(\vec y, \vec 1)\cdot (t(\vec x) + \beta \cdot t(T_H(\vec x))),
\]
where $\bar H' = \bar H\setminus\{\vec 1\}$.
(Notice that the $\alpha$-term for $t$ on the right hand side is combined with the $\alpha$-term for $f$.)
In other words the prover needs to show that
\[
\prod_{(\vec y, \vec x)\in \bar H'}\frac{\varphi(\vec y, \vec x)}{\sigma(\vec y,\vec x)} = 1,
\]
where 
\begin{align*}
\varphi(\vec y, \vec x) &= \alpha + (1+\beta)\cdot f(\vec y, \vec x) + L_m(\vec y, \vec 1)\cdot (t(\vec x) + \beta \cdot t(T_H(\vec x))), 
\\
\sigma(\vec y, \vec x) &= \alpha + s(\vec y, \vec x) + \beta\cdot s(T_{\bar H} (\vec y, \vec x)).
\end{align*} 

As before, the prover provides a function $\phi$ over $\bar H'$ for the cumulative products of the values of $h(\vec x) = \frac{\varphi(\vec y, \vec x)}{\sigma(\vec y,\vec x)}$ along the orbit of $T_{\bar H}$, starting with $\phi(-\vec 1)= 1$, and setting
\[
\phi(T_{\bar H}^k (-\vec 1)) = \phi(T_{\bar H}^{k-1} (-\vec 1)) \cdot h(T_{\bar H}^{k-1} (-\vec 1)), 
\] 
for $k=1, \ldots, |\bar H| - 2$. 
At the remaining point $\vec 1$ we set $\phi(\vec 1) = 0$. %, so that the above recursive conditions holds even over the orbit of $\vec 1$, regardless of the values  $t(\vec 1)$ and $s(\vec 1, \vec 1)$.
Correctness of $\phi$ is ensured by $\phi(-\vec 1) = 1$, and the domain identity
\begin{equation*}
\phi(T_{\bar H}(\vec y, \vec x)) \cdot \varphi(\vec y, \vec x)
- \phi(\vec y, \vec x) \cdot \sigma(\vec y, \vec x) = 0
\end{equation*}
over $\bar H$.
Applying the Lagrange polynomials $L_{\bar H}(\:.\:, \vec z)$, $\vec z \sample F^{m+n}$, and $L_{\bar H}(\:.\:, -\vec 1)$, the constraints on $\phi$ are reduced to sumchecks over $\bar H$, which are then combined into a single one using a random $\lambda\sample F$.
The overall sumcheck is
\[
\sum_{(\vec y, \vec x)\in \bar H} Q(L_{\bar H}((\vec y, \vec x) , \vec z), L_{\bar H}((\vec y, \vec x), -\vec 1), \varphi(\vec y, \vec x), \sigma(\vec y, \vec x), \phi(\vec y, \vec x),  \phi(T_{\bar H}(\vec y, \vec x))) = 0,
\]
where 
\begin{equation}
Q(L_H, L, \varphi, \sigma,\phi, \phi_T) = 
L_H\cdot (\phi_T\cdot \varphi - \phi\cdot \sigma) + \lambda\cdot L\cdot (\phi - 1).
\end{equation}
The polynomial $Q$ has $\nu=6$ variables, absolute degree $d=3$, and its arithmetic complexities are $|Q_\mathsf M|= 5$, $|Q_\mathsf S| = 2$, $|Q_\mathsf A|= 1$.


\subsubsection*{Prover cost}

Computing the values of the patched function $\varphi$  over $\bar H$ takes $|\bar H |\cdot (1\cdot \mathsf M + 2\cdot \mathsf A)$,
and the same number of operations are needed for $\sigma$.
The quotient $h$ over $\bar H$ is obtained by batch inversion, consuming
\[
|\bar H|\cdot (3 + 1) \cdot \mathsf M,
\]
and the values of $\phi$ are computed within another $|\bar H|$ multiplications.
Last but not least, computing $L_{\bar H}(\:.\:, \vec z)$ over $\bar H$ costs $|\bar H|\cdot (\mathsf M + \mathsf A)$, and the sumcheck 
\[
|\bar H| \cdot \left(1 + \frac{1}{|\bar H|}\right) \cdot (38 \cdot \mathsf M + 20 \cdot \mathsf S + 26\cdot\mathsf A).
\] 
Neglecting the $\frac{1}{|\bar H|}$-term, the overall cost of the prover is
\begin{equation}
\label{e:hyperplonk:large:cost}
(M+1)\cdot |H| \cdot (45 \cdot \mathsf M + 20 \cdot \mathsf S + 30\cdot\mathsf A),
\end{equation}
whereas it needs to provide the functions $s$ and $\phi$ over $\bar H$, which amounts the equivalent of $2\cdot (M+1)$ functions over $|H|$.
Based on our benchmark-backed equivalent number of field operations for a multi-scalar multiplications (Table \ref{tab:pippenger}), we obtain the following break even points.
\begin{table}[h!]
\caption{%
The estimated number of columns $M$ where the $\bigO{M}$ strategy starts to perform better than the protocol from Section \ref{s:hyperplonk:small}. 
The numbers are based on the operation counts \eqref{e:hyperplonk:cost} and \eqref{e:hyperplonk:large:cost} for the oracle prover, and the benchmarks for a multi-scalar multiplication over the Pallas curve, see Section \ref{s:appendix:benches}.
}
\vspace*{0.5cm}
\centering
\begin{tabular} {|c|c|c|c|c|}
\hline
$\log|H|$ & 12 & 14 & 16 & 18
\\\hline
$M$ & 143 & 120 & 103 & 92
\\\hline
\end{tabular}
\end{table}

\subsection{Comparison with logarithmic derivative lookups}

The main advantage of logarithmic derivative lookups over the ones from the current section is not the arithmetic cost  as reflected by the operation counts  \eqref{e:lookup:cost}, \eqref{e:lookup:large:cost} and \eqref{e:hyperplonk:cost}, \eqref{e:hyperplonk:large:cost}, but the number and sizes of oracle functions the prover needs to provide.
That oracle costs depend on the used commitment scheme.
To give an estimate for these costs in the case of an elliptic curve based Lagrange commitment scheme (such as the IPA), we 
rely on a benchmark-backed equivalent of field multiplications for the multi-scalar multiplication in the Pallas curve.
These are found in Table \ref{tab:pippenger}.

\begin{table}[h!]
\caption{%
Benchmark of Halo2's Pippenger multi-scalar multiplication in the Pallas curve, varying the number $N$ of scalars.
The benchmarks where done on a AMD Ryzen 7 PRO 4750U, 32GB RAM DDR4,  restricting to a single core.
}
\label{tab:pippenger}
\begin{center}
\begin{tabular}{|c|c|c|c|}
\hline
$\log N$ &  Pippenger of size $N$ & $2^8\cdot N$ field mult. & equivalent field mult.
\\\hline
12 & $46.010$ ms &  $21.11$ ms & $N\cdot 557\cdot \mathsf M$
\\
14 & $153.67$ ms & $84.78$ ms & $N\cdot 464 \cdot \mathsf M$
\\
16 & $522.13$ ms & $169.70$ ms & $N\cdot 394\cdot\mathsf M$
\\
18 &       $1.869$  ms  & $679.27$ ms & $N\cdot 351\cdot\mathsf M$
\\\hline
\end{tabular}
\end{center}
\end{table}

With these equivalence measure and the operation counts from  \eqref{e:lookup:cost}, \eqref{e:lookup:large:cost} and \eqref{e:hyperplonk:cost}, \eqref{e:hyperplonk:large:cost}, we obtain the following ratios of field multiplications, see Table \ref{tab:comparison}.

\begin{table}[h!]
\caption{%
The estimated performance advantage of the logarithmic derivative lookups over the ones from this section, as the ratio of their number of field multiplications $r =\nicefrac{\mathsf M({Plookup})}{\mathsf M({logD})}$.
The numbers are based on the equivalent number of field multiplication from Table \ref{tab:pippenger}.
For hypercube sizes $|H|$ ranging from $2^{12}$--$2^{18}$ we describe the maximum ratio $r_{max}$ over the number of columns $M$, as well as the ranges for $M$ over which $r$ is larger than $2$ and $3$, respectively.
The minimum ratio is throughout $r= 1.5$ and obtained in the single-column setting $M=1$. 
}
\label{tab:comparison}
\vspace*{0.5cm}
\centering
\begin{tabular} {|c|c|c|c|c|}
\hline
$\log|H|$ & $r\geq 3$ & $r\geq 2$ & $r_{max}$ 
\\\hline
12 & $M\in [5, 41]$ & $M\in [3, 87]$ & $4.1$ (at $M =15$)
\\
14 & $M\in [6, 32]$ & $M\in [3, 71]$ & $3.8$ (at $M= 13$)
\\
16 & $M\in [6, 26]$  & $M\in [3, 59]$ & $3.5$ (at $M= 12$)
\\
18 & $M\in [6, 21]$  & $M\in [3, 52]$ & $3.2$ (at $M= 12$)
\\\hline
\end{tabular}
\end{table}

\section{Acknowledgements}

The author would like to thank Rayan Matovu and Morgan Thomas for giving me the space and time to dwell on batch-column lookups. 
Special thanks to Marcin Bugaj for helping out with the Pippenger benchmarks.
Furthermore, I would like to thank Ariel Gabizon for his feedback and useful discussions.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\bibliographystyle{alpha}
\bibliography{bibfileSNARKs}

\appendix
%\newpage
\section{Appendix: the univariate case}
\label{s:appendix}

In this section we compare the univariate variant of Protocol \ref{prot:lookup} with the lookups from \cite{Plookup} and \cite{flookup}, generalized to batches of columns.
As the ``large number of columns'' batching strategy from Section \ref{s:lookup:large}  again yields a break-even point beyond practical interest\footnotemark, we skip a separate analysis of it.
\footnotetext{%
For example, at $2^{12}$ constraints we obtain a break-even point at about $M=40$ columns.
}%
The protocols are formulate as Lagrange IOPs, even though the advantage over  polynomial oracles is modest:
For most polynomials the prover needs to compute their coefficients anyway.
%\footnotetext{%
%However, we are aware that these protocols can be transformed into FFT-less ones by using the technique from \cite{UnivariateHypercubeSumcheck}, which translates the hypercube sumcheck to univariate domains.
%}%

We throughout assume that $F$ is an FFT-friendly finite field, having a multiplicative subgroup $H = \{x\in F: x^n = 1\}$ of order $n$, and denote by $g$ a generator of it.  
The Lagrange kernel of $H$ is the symmetric bivariate polynomial
\[
L_H(X, Y) =\frac{1}{n}\cdot\left(1 + \sum_{i=1}^{n-1} X^i\cdot Y^{n-i}\right) = \frac{1}{n}\cdot \frac{Y\cdot v_H(X) - X\cdot v_H(Y)}{X - Y},
\]
where $v_H(X)= X^n - 1$ is the vanishing polynomial of $H$.
Again, for $y\in H$ the polynomial $L_H(X, y)$ is the Lagrange polynomial which is equal to $1$ at  $x=y$, and zero elsewhere on $H$.
In all the protocols, the prover is given $M\geq 1$ functions $f_i:H\rightarrow F$, $i=0, \ldots, M-1$, and wants to prove that their ranges are contained in that of a prediscribed table $t: H\rightarrow F$, i.e. 
\[
\bigcup_{i=0}^{M-1} \{f_i(x)\}_{x\in H}\subseteq \{t(x)\}_{x\in H}.
\]
For notational convenience, will use the same notation for the function $f: H\rightarrow F$ and its sequence of values $(f(g^k))_{k=0}^{n-1}$. 

Our cost analysis is based on the following approximate formula:
Assume that  we have Lagrange representation of $\nu\geq 1$ polynomials $p_1(X)$, \ldots, $p_\nu(X)$ over $H$, where all $\deg p_i(X) < |H|$, and an $\nu$-variate polynomial $Q$ over $F$ with absolute degree $d$ and multplicative complexity $Q_\mathsf M$.
Then the number of field multiplications for computing the Lagrange representation of the quotient polynomial $q(X)$ in
\[
Q(p_1(X), \ldots, p_\nu(X)) = 0 \mod v_H(X),
\]
over the ``multiplication domain'' $H'$ of size $d\cdot |H|$ (except the trivial coset $H$) is 
\begin{equation}
\label{e:uv:quotientpoly:cost}
\begin{aligned}
\nu  \cdot  \left(\frac{|H|}{2} \cdot\log|H| + \frac{d\cdot |H|}{2} \cdot\log(d\cdot|H|) \right)+ d\cdot |H|\cdot ( Q_M + 1) 
%\\
%= d \cdot |H| \cdot\left(\frac{\nu}{2}\cdot\log|H| + Q_\mathsf M + 1 + \frac{\nu}{2} \cdot\log d  \right).
\end{aligned}
\end{equation}
The first term corresponds to the computation of the coefficients of $p_1(X)$, \ldots, $p_\nu(X)$, plus their values over the multiplication domain, and the second term is for the pointwise evaluation of $Q$ over that domain, plus the multiplication with the precomputed inverses of the vanishing polynomial for obtaining the values of $q(X)$.
Theses values over the $d-1$ (non-trivial) cosets of $H$, enumerated as $a^i\cdot H$, $i=1,\ldots, d-1$,  where $a$ is any element of $H'\setminus H$, define the Lagrange representations of ``component polynomials'' $q_0(X),\ldots, q_M(X)$ of degree $\deg p_i(X) <|H|$ in the decomposition
\begin{equation}
\label{e:q:decomposition}
q(X) = \frac{1}{d}\cdot \sum_{i=1}^{d-1}\frac{v_{H'}(a^{-i}\cdot X)}{v_H(a^{-i}\cdot X)} \cdot q_i(a^{-i}\cdot X).
\end{equation}
We point out that when embedding lookups in other IOPs (for example Plonk), so that their overall identities share the same quotient polynomial, then the fractional cost for the lookup is less. 


\subsection{The logarithmic derivate approach}
\label{s:uv:lookup}

The lookup condition is equivalent to that the rational function
\begin{equation}
\label{e:lookup:h}
h(x) = \sum_{i=0}^{M-1} \frac{1}{\alpha + f_i(x)} - \frac{m(x)}{\alpha + t(x)},
\end{equation}
sums up to zero over $H$, where $\alpha\sample F$ is the first verifier challenge.
For this the prover provides the Lagrange representation of $\phi(X)$ subject to the domain identity
\[
\phi(g\cdot x) - \phi(x) = \sum_{i=0}^{M-1} \frac{1}{\alpha + f_i(x)} - \frac{m(x)}{x + t(x)},
\]
or 
\begin{equation}
\label{e:uv:lookup:overall:identity}
 \tau(X) \cdot
\prod_{i=0}^{M-1} \varphi_i(X) \cdot (\phi(g\cdot X) - \phi(X))= 
 \tau(X) \cdot \prod_{i=0}^{M-1} \varphi_i(X)\cdot \left(\sum_{i=0}^{M-1} \frac{1}{\varphi_i(X)} - \frac{m(X)}{\tau(X)}\right) \mod v_H(X),
\end{equation}
where the right hand side is a polynomial, and 
\begin{align*}
\varphi_i(X) &= \alpha + f_i(X), \quad i=0,\ldots, M-1,
\\
 \tau(X) &= \alpha + t(X).
\end{align*}
The overall identity \eqref{e:uv:lookup:overall:identity} is of the form
\[
Q(\tau(X), \varphi_0(X), \ldots, \varphi_{M-1}(X), \phi(X), \phi(g\cdot X)) = 0 \mod v_H(X),
\]
where $Q$ has $\nu = M + 3$ variables and absolute degree $d= M + 2$. 
The prover computes the Lagrange representation of the overall quotient polynomial over the multiplication domain $H'$ of size $(M + 1)\cdot |H|$, except $H$,
and sends the Lagrange representations of the component polynomials $q_1(X),\ldots, q_{M}(X)$ as defined by \eqref{e:q:decomposition} to the verifier.
The verifier checks the overall identity at a random $\beta\sample F$ by querying the oracles at the needed points.

\subsubsection*{Cost analysis}

As in Section \ref{s:lookups}, we use the fractional representation from \eqref{e:uv:lookup:overall:identity} for computing domain evaluations of $Q$. 
Using batch inversion, the equivalent multiplicative complexity for computing $Q$ is 
\[
Q_\mathsf M = M + 1 + 3\cdot (M+1) + 2 = 4\cdot M + 6,
\] 
taking into account the proportional cost of batch inversion as $3$ multiplications.  

The prover cost, simplified to the number of field multiplications, is as follows.
Computing the values of $\varphi_0, \ldots, \varphi_{M-1}$ and $\tau$ over $H$ does not cost a single field multiplication, their inverses are obtained within
\[
(M+1)\cdot 3\cdot |H| 
\]
multiplications, from which $h(x)$ over $H$ and the values of the sumcheck polynomial $\phi$ are obtained by another overll $2\cdot |H|$ multiplications.
%The interpolants of $\varphi_0,\ldots, \varphi_{M-1}$, $\tau$ and $\phi$ cost
%\[
%(M + 2) \cdot \mathsf{FFT}(|H|) = |H|\cdot \frac{M + 2}{2} \cdot \log|H|
%\] 
%field multiplications.
Since the values of the shifted function $\phi(g\cdot X)$ are directly obtained from those of $\phi$, we may apply 
\eqref{e:uv:quotientpoly:cost} for $\nu = M + 2$, and computing the Lagrange representation of the overall quotient takes
\[
(M + 2) \cdot \left(\frac{|H|}{2}\cdot \log|H| + \frac{(M+2)\cdot |H|}{2} \cdot\log((M+2)\cdot|H|) \right) + (M+2)\cdot |H|\cdot (Q_\mathsf M + 1)
%= d \cdot |H| \cdot\left(\frac{M + 2}{2}\cdot\log|H| + Q_\mathsf M +1 + \frac{M + 2}{2} \cdot\log d  \right)
\]
multiplications.
Overall, the oracle prover demands
%\begin{equation}
% |H| \cdot\left((d+1) \cdot \frac{M + 2}{2}\cdot \log|H| +  \frac{M + 2}{2} \cdot d\cdot \log d + d\cdot Q_M + 3\cdot M + 5   \right)
%\end{equation}
\begin{equation}
 |H| \cdot\left(\frac{(M + 2)^2}{2}\cdot (\log|H| + \log (M+2)) +  (M + 2)\cdot(4\cdot M+6) + \frac{M+2}{2} \cdot \log |H| + 4\cdot M + 7   \right)
\end{equation}
field multiplications, and it provides the equivalent of $M + 3$ functions of size $|H|$, i.e. the values of $m$, $\phi$  and the component polynomials $q_1, \ldots, q_{M+1}$ over $H$.
%
% Large batching strategy,
%
% the $f_i$ together with $t$ are represented by a single function $ft$ over the extended domain $\bar H$ of size $(M+1)*|H|$
% the sumcheck and its $\phi$ is over $\bar H$, the overall identity
%  phi(g X) - phi(X) = m(X) * "L(H, x)" * 1/ ft(X) mod \bar H,
% is of degree $d= 2$, and the quotient poly is of size $|\bar H| =  (M+1) *|H|$. (The "L(H,x)" is equal to v_{\bar H}(X)/ v_{H}(X). 
% The overall oracle cost is the equivalent of $2* (M + 1) + 1$ polys of size $|H|$. 
% $Q_M = 1$ and therefor 
% 

\subsection{Plookup}
\label{s:uv:plookup}

We describe the \cite{LookupsBlog} variant of plookup \cite{Plookup}, generalized to batch-lookups.
Consider the concatenation of the sequences $(f_i(g^k))_{k=0}^{n-1}$, $i=0,\ldots, M-1$, and $(t_k)$, ordered as they occur in $(t_k)$:
\[
\bar s = (\bar s_i)_{i=0}^{(M + 1)\cdot n - 1} = (\underbrace{t_0, \ldots, t_0}_{1 + m_0 \text{ times}}, \underbrace{t_1, \ldots, t_1}_{1 + m_1 \text{ times}}, \ldots, \underbrace{t_{n-1}, \ldots, t_{n-1}}_{1 + m_{n-1} \text{ times}}),
\]
and split it into $(M+1)$ sequences of length $n$,  
\[
s_i = (\bar s_{i + k\cdot (M+1)})_{k = 0}^{n-1},
\] 
with $i= 0, \ldots, M$.
We regard these sequences again as functions on $H$.
Then
\begin{equation}
\label{e:uv:lookup:multiset}
\bigcup_{i=0}^{M-1} \{s_i(x), s_{i+1}(x) \}_{x\in H} \cup \{(s_M(x), s_0(g\cdot x))\}_{x\in H} = \bigcup_{i=0}^{M-1}\{ (f_i(x), f_i(x))\}_{x\in H} \cup \{(t(x), t(g\cdot x))\}_{x\in H}
\end{equation}
as multisets.
Moreover, this multiset equality is equivalent to $\bigcup_{i=0}^{M-1}\{f_i(x)\}_{x\in H}\subseteq \{t(x)\}_{x\in H}$.
That is, the set inclusion holds if and only if there exists functions $s_0,\ldots, s_M: H\rightarrow F$ satisfying \eqref{e:uv:lookup:multiset}, which is rephrased by the lookup identity
\begin{equation}
\label{e:uv:plookup:identity}
\begin{aligned}
\prod_{x\in H} \prod_{i=0}^{M-1} (X + s_i(x) + s_{i+1}(x)\cdot Y)\cdot (&X + s_M(x) + s_0(g\cdot x)\cdot Y) 
\\
&= \prod_{x\in H} \prod_{i=0}^{M-1} (X + f_i(x) + f_i(x)\cdot Y)\cdot (X + t(x) + t(g\cdot x)\cdot Y).
\end{aligned}
\end{equation}

The lookup protocol is as follows.
The prover sends $s_0$, \ldots, $s_{M}$ subject to \eqref{e:uv:plookup:identity} to the verifier, which returns random samples $\alpha, \beta\sample F$ for $X$ and $Y$, reducing the lookup identity to the grand product
\begin{equation}
\label{e:uv:plookup:grandproduct}
\prod_{x\in H}  \frac{\sigma_{M}(x)}{\tau(x)}\cdot \prod_{i=0}^{M-1} \frac{\sigma_{i}(x)}{\varphi_{i}(x)}   = 1,
\end{equation}
where
\begin{align*}
\tau(x) &= \alpha + t(x) + \beta\cdot t(g\cdot x), 
\\
\varphi_i (x) &=\alpha +  (1+\beta)\cdot f_i(x),\quad  i=0,\ldots, M-1,
\\
\sigma_i(x) &= \alpha + s_i(x) + \beta\cdot s_{i+1}(x),\quad  i= 0, \ldots, M-1,
\\
\sigma_{M}(x) &=  \alpha + s_M(x) + \beta\cdot s_0(g\cdot x).
\end{align*}
To prove \eqref{e:uv:plookup:grandproduct}, the prover provides the cumulative products polynomial $\phi$ for the function $h(x) =  \frac{\sigma_{M}(x)}{\tau(x)}\cdot \prod_{i=0}^{M-1} \frac{\sigma_{i}(x)}{\varphi_{i}(x)} $, which is characterized by the  domain identities
\begin{equation*}
\phi(g\cdot x) \cdot \tau(x) \cdot \prod_{i=0}^{M-1} \varphi_i (x)
- \phi(x)\cdot \prod_{i=0}^M \sigma_i (x)
= 0,
\end{equation*}
and
\begin{equation*}
L_H(x, 1)\cdot (\phi(x) - 1) = 0,
\end{equation*}
for all $x\in H$.
The identities are combined into a single one using a random $\lambda \sample F$ from the verifier, yielding the overall identity
\begin{align}
\label{e:plookup:overall:identity}
Q(L_H(X,1), \varphi_0(X), \ldots, \varphi_{M-1}(X), \sigma_0(X),\ldots,\sigma_M(X), \tau(X), \phi(X), \phi(g\cdot X)) = 0 \mod v_H(X),
\end{align}
where
\begin{equation}
\label{e:plookup:Q}
Q(L, \varphi_0, \ldots, \varphi_{M-1}, \sigma_0,\ldots,\sigma_M, \tau, \phi, \phi_g) =
\phi_g \cdot \tau \cdot \prod_{i=0}^{M-1} \varphi_i
- \phi \cdot \prod_{i=0}^M \sigma_i
+  \lambda \cdot L \cdot (\phi - 1).
\end{equation}
The prover provides (the Lagrange representation) of the  quotient components $q_1(X), \ldots, q_{M+1}(X)$, and the verifier checks the overall identity  \eqref{e:plookup:overall:identity} at a random $\gamma\sample F$, by querying the involved polynomials at the needed points.

\subsubsection*{Cost analysis}

The polynomial $Q$ in the overall identity \ref{e:plookup:overall:identity} has $\nu = 2\cdot (M + 1) + 3$ variables, absolute degree $d = M+2$, and multiplicative complexity $Q_M = 2\cdot (M +1) + 2$.
Note that the values of $\phi(g\cdot X)$ over $H$ or any larger domain can be obtained by shifting, hence we can reduce the number of variables by one when using the cost formula \eqref{e:uv:quotientpoly:cost}.
The values over $H$ for all $\varphi_i$, $\sigma_i$, and $\tau$ (overall $2\cdot (M+1)$ functions) consume
\[
|H|\cdot 2\cdot (M+1),
\]
field multiplications, the rational function $h(x)$ over $H$ is derived within
\[
6\cdot |H|,
\] 
using batch inversion, and computing the cumulative product function $\phi$ over $H$ costs another $|H|\cdot \mathsf M$.
Assuming that the values of $L_H(x, 1)$ over the multiplication domain are precomputed, and that the values of $\phi(g\cdot X)$ over that domain are obtained by shifting, the cost of computing the overall quotient polynomial is 
\[
(2 \cdot M + 3) \cdot \left(\frac{|H|}{2}\cdot\log |H| +  \frac{d\cdot |H|}{2} \cdot\log(d\cdot|H|)\right) + d\cdot |H|\cdot (Q_M  + 1).
\]
The overall cost of the oracle prover is  
\begin{equation}
%|H| \cdot\left( \left((d + 1)\cdot M  + \frac{3}{2} + 2\cdot d\right)\cdot\log|H| + 2M + 9 + 1 + (M+2)\cdot d\cdot \log d  \right)
%\\
|H| \cdot\left( \frac{(2\cdot M + 3)\cdot (M + 3)}{2}\cdot\log|H| +  (M+2)\cdot (2\cdot M + 4) + \frac{(2\cdot M+3)\cdot (M+2)}{2}\cdot \log d + 2\cdot M + 9 \right),
\end{equation}
% M+ 1 + 1 + d - 1 = 2(M+1) 
while the oracle costs are equvialent to $2\cdot (M + 1) + 1$ domain evaluations of size $|H|$, corresponding to $s_0, \ldots, s_M$, $\phi$, and the quotient components $q_1, \ldots, q_{M+1}$.


\subsection{The flookup strategy}
\label{s:uv:flookup}

Section 5 of \cite{flookup} describes a polynomial IOP for lookups, which is almost identical to the logarithmic derivative approach. 
We present its generalization to batch-lookups. 
For showing that the ranges of witness function $f_i: H\rightarrow F$, $i=0, \ldots, M-1$, are contained in the range of a table $t: H\rightarrow F$, the fractional logaritmic derivative identity
%\[
%\sum_{x\in H} \frac{1}{X + f(x)} = \sum_{x\in H} \frac{m(x)}{X + t(x)}
%\]
is turned into the polynomial identity
\[
\sum_{x\in H} \sum_{i=0}^{M-1} \frac{v_T(X)}{X + f_i(x)} = \sum_{x\in H} m(x) \cdot \frac{v_T(X)}{X + t(x)},
\]
by multiplying with the precomputed table polynomial $v_T(X) = \prod_{x\in H} (X + t(x))$. 
Instead of the multiplicity function $m$, the prover explicitly provides the polynomial\footnote{%
In a variant of the protocol, communicated by A. Gabizon, the prover provides the Lagrange representation of $R_T(X)$ with respect to the Lagrange basis of the table set $\{t(x) : x\in H\}$. 
As Lagrange IOPs with respect to several Lagrange bases have different impacts on the Lagrange commitment scheme (e.g., it leads to a separate commitment in a KZG-like scheme, or a separate inner product argument for an IPA-like scheme), we do not compare with this variant.
}% 
\[
R_T(X) = \sum_{x\in H} m(x) \cdot \frac{v_T(X)}{X - t(x)},
\]
and engages with the verifier in an IOP for showing that
\begin{equation}
\sum_{x\in H}  \sum_{i=0}^{M-1} \frac{v_T(X)}{X + f_i(x)} = R_T(X).
\end{equation}
The verifier queries $v_T(X)$ and $R_T(X)$ at a random challenge $\alpha\sample F$, and both prover and verifier run a sumcheck argument for 
\[
\sum_{x\in H}  \sum_{i=0}^{M-1} \frac{1}{\varphi_i(x)} = \frac{R_T(\alpha)}{v_T(\alpha)},
\]
where $\varphi_i(x) = \alpha + f_i(x)$.
For this the prover provides the Lagrange representation of the sumcheck polynomial $\phi(X)$  subject to the domain identity
\[
\phi(g\cdot x) - \phi(x) + \frac{R_T(\alpha)}{|H|\cdot v_T(\alpha)} = \sum_{i=0}^{M-1} \frac{1}{\varphi_i(x)}
\]
for all $x\in H$, or as the polynomial identity
\begin{equation}
\label{e:flookup:overall:identity}
\left(\phi(g\cdot X) - \phi(X) + \frac{R_T(\alpha)}{|H|\cdot v_T(\alpha)}\right) \cdot \prod_{i=0}^{M-1} \varphi_i(X) =  \prod_{i=0}^{M-1} \varphi_i(X)\cdot  \sum_{i=0}^{M-1} \frac{1}{\varphi_i(x)}  \mod v_H(X).
\end{equation}
The overall identity is of the form 
\[
Q(\phi_0(X), \ldots, \phi_{M-1}(X), \phi(X), \phi(g\cdot X)) = 0 \mod v_H(X),
\]
with $Q$ having $\nu = M + 2$ variables and absolute degree $d = M + 1$.
The prover provides the Lagrange representation of the component polynomials $q_1(X), \ldots, q_M(X)$ for the overall quotient.
The verifier samples $\beta\sample F$ and checks \eqref{e:flookup:overall:identity} at $X=\beta$ by quering the involved polynomials at the needed points. 

\subsubsection*{Cost analysis}
%The main advantage of the protocol is that \eqref{e:flookup:overall:identity}  is quadratic over $H$, wheras in the univariate variant 
%of Protocol \ref{prot:lookup} the overall identity is cubic.
%%\[
%%\left(\phi(g\cdot X) - \phi(X)\right)\cdot (\alpha + t(X))\cdot (\alpha + f(X)) =   \alpha + t(X) +  (\alpha + f(X)) \cdot m(X) + q(X) \cdot v_H(X) 
%%\]
%The quadratic degree leads to lower oracles costs, with a difference of $|H|$ in favor of the flookup protocol.
%On the other hand, the computational costs for $R_T(X)$ are $\bigO{|H|\cdot \log^2|H|}$, by interpolating the multiplicities $m$ as function on the table range $T=\{t(x) : x\in H\}$. 
%Concretely, using pre-computed intermediate products of table factors $X + t(x)$ (both the coefficients and their values used for obtaining them), the  interpolation algorithm uses
%\[
%\frac{|H|}{8} \left(3\cdot \log^2|H| + 11\cdot\log|H| + 3 \right)
%\] 
%field multiplications. 

Let us count the number of field operations of the oracle prover.
Computing the coefficients of $R_T(X)$ by interpolating the multiplicities $m$ as a function on the table range $T=\{t(x) : x\in H\}$ costs 
\[
\frac{|H|}{8} \left(3\cdot \log^2|H| + 11\cdot\log|H| + 3 \right)
\] 
field multiplications, using pre-computed intermediate products of the table factors $X + t(x)$ (both the coefficients and their values used for obtaining them). 
See \cite{flookup} and the references therein for details on Lagrange interpolation over general sets.
The Lagrange representation of $R_T(X)$ over $H$ is then obtained by another 
\[
\frac{|H|}{2} \cdot \log |H|
\]
multiplications.
The values of the reciprocals $h(x) = \frac{1}{\alpha + f(x)}$ over $H$ cost $3\cdot |H|$ multiplications (using batch inversion), the computation of $\phi$ does not involve a single field multiplication.
Using the fractional representation \eqref{e:flookup:overall:identity} for evaluating $Q$ using batch inversion yields the equivalent multiplicative complexity
\[
Q_\mathsf M = M + 1 +3\cdot M = 4\cdot M + 1,
\]
and by \eqref{e:quotientpoly:cost} computing the components $q_1, \ldots, q_M$ of the overall quotient costs 
\[
%\nu  \cdot  \left(\frac{|H|}{2} \cdot\log|H| + \frac{d\cdot |H|}{2} \cdot\log(d\cdot|H|) \right)+ d\cdot |H|\cdot ( Q_M + 1) 
|H|\cdot (M+1) \left(\frac{1}{2} \cdot\log|H| + \frac{(M+1)}{2} \cdot\log((M+1)\cdot|H|) \right)+ (M+1)\cdot |H|\cdot (4\cdot M + 1). 
\]
The overall cost of the oracle prover is 
\begin{equation}
\label{e:uv:flookup:cost}
\begin{aligned}
|H|\cdot \Bigg((M+1) \left(\frac{M+2}{2} \cdot\log|H| + \frac{(M+1)}{2} \cdot\log(M+1) \right) + &(M+1) \cdot (4\cdot M + 1)
\\
&+ \frac{3}{8}\cdot \log^2|H| + \frac{15}{8}\cdot\log|H| + \frac{3}{8} + 5
\Bigg),
\end{aligned}
\end{equation}
whereas the oracle costs are equivalent to $M+2$ domain evaluations of size $|H|$, corresponding to $R_T$, $\phi$, and $q_1, \ldots, q_M$.

%
%Due to the quite small $\bigO{\log^2 |H|}$-term and the lower degree of the overall identity, our operations counts indicate that the flookup strategy performs better than the univariate variant of Protocol \ref{prot:lookup}, with a break-even point beyond $|H| = 2^{20}$.
%(However, considering batch-column lookups, the performance gap between the two IOPs shrinks with larger numbers of columns.)

%The same holds for a variant\footnotemark of the flookup IOP, which has a $\bigO{|H|\cdot\log|H|}$ prover.
%\footnotetext{%
%This variant was reported to us by A. Gabizon.
%}%
%In this variant the prover provides the polynomial $R_T(X)$ with respect to the non-normalized Lagrange basis $\frac{v_T(X)}{X - z}$, $z\in T$ of the table set, whereas the other polynomials $\phi$ and $h$ are provided with respect to the Lagrange basis of $H$.
%This approach removes the interpolation cost for $R_T(X)$, but comes at the cost of a polynomial commitment scheme with an opening proof for two different Lagrange representations.
%For the \cite{Kate} commitment scheme such a hybrid Lagrange evaluation proof increases the number of multi-scalar multiplications by $|H|$, whereas for inner product arguments that increase is higher.
%We shall elaborate on the details in another document.

%The computational cost of the prover is as follows.
%According to \cite{ModernAlgebra}, the polynomial $R_T(X)$ can be obtained from the multiplicity function $\tilde m: T\rightarrow F$ over  $T= \{t(x): x\in H\}$ by optimized Lagrange interpolation.
%Using precomputations this can be done within
%\[
%\big( 3 \cdot \textsf{FFT}(2\cdot |H|) + 2\cdot |H|\big) \cdot  \log |H| 
%\approx 6\cdot |H|\cdot \log^2 |H| + 8 \cdot |H|\cdot\log|H|
%\]
%field multiplications.
%
%
%Computing the value $v_T(\alpha)$ costs $1\cdot |H|\cdot (\textsf M + \textsf A)$, hence  
%given the values of $f(x)$ over $H$, the domain evaluation of $\frac{v_T(\alpha)}{\alpha + f(x)}$ is obtained within
%\[
%(1\cdot |H| + 3\cdot |H|)\cdot\mathsf M + |H|\cdot\mathsf A,
%\]
%using batch inversion.

\subsection{Comparison}

Based  on the equivalent number of field multiplications from Table \ref{tab:pippenger}, we obtain the following estimation of 
the logarithmic derivative lookups over plookup, as the ratio of their number of field multiplications $r =\nicefrac{\mathsf M({Plookup})}{\mathsf M({logD})}$:
For all domain sizes $\log|H| = 12, 14, 16, 18$ we get a maximum advantage of about $r_{max} \approx 1.8$, whereas 
\[
r =\nicefrac{\mathsf M({plookup})}{\mathsf M({logD})} \geq 1.75
\]
whenever $M\geq 12$.
Although having worse asymptotic complexity, the flookup strategy outperforms the logarithmic derivative plookup at these domain sizes, see Table \ref{tab:uv:comparison}.
Its maximum advantage is about $1.25$ -- $1.3$ is throughout at $M=1$, wheras that advantage tends to $1$ at high number of columns (for $M\geq 33$ it is less than $5\%$). 

%\begin{table}[h!]
%\caption{%
%The estimated performance advantage of the logarithmic derivative lookups over plookup, as the ratio of their number of field multiplications $r =\nicefrac{\mathsf M({Plookup})}{\mathsf M({logD})}$.
%The numbers are based on the equivalent number of field multiplication from Table \ref{tab:pippenger}.
%For hypercube sizes $|H|$ ranging from $2^{12}$--$2^{18}$ we describe the maximum ratio $r_{max}$ over the number of columns $M$, as well as the ranges for $M$ over which $r$ is larger than $2$ and $3$, respectively.
%The minimum ratio is throughout $r= 1.5$ and obtained in the single-column setting $M=1$. 
%}
%\label{tab:uv:comparison}
%\vspace*{0.5cm}
%\centering
%\begin{tabular} {|c|c|c|c|c|}
%\hline
%$\log|H|$ & $r\geq 1.75$ & $r\geq 1.5$ & $r_{max}$ 
%\\\hline
%12 & $M\geq 12$ & $M\geq 3$ & $1.8$ 
%\\
%14 & $M\geq 12$ & $M\geq 3$ & $1.8$ 
%\\
%16 &$M\geq 12$ & $M\geq 3$ & $1.8$ 
%\\
%18 & $M\geq 11$  & $M\geq 3$ & $1.8$ 
%\\\hline
%\end{tabular}
%\end{table}
%
\begin{table}[h!]
\caption{%
The estimated performance advantage of the flookup strategy over the logarithmic derivate approach, as the ratio of their number of field multiplications $r =\nicefrac{\mathsf M({flookup})}{\mathsf M({logD})}$.
The numbers are based on the equivalent number of field multiplication from Table \ref{tab:pippenger}.
For hypercube sizes $|H|$ ranging from $2^{12}$--$2^{18}$ we describe the maximum ratio $r_{max}$ over the number of columns $M$, as well as the ranges for $M$ over which $r$ is smaller than $1.10$ and $1.05$, respectively.
}
\label{tab:uv:comparison}
\vspace*{0.5cm}
\centering
\begin{tabular} {|c|c|c|c|c|}
\hline
$\log|H|$ & $r\leq 1.10$ & $r\leq 1.05$ & $r_{max}$ 
\\\hline
12 & $M\geq 12$ & $M\geq 32$ & $1.3$ (at $M = 1$) 
\\
14 & $M\geq 12$ & $M\geq 33$ & $1.3$ (at $M = 1$)  
\\
16 &$M\geq 13$ & $M\geq 34$ & $1.27$ (at $M = 1$)  
\\
18 &$M\geq 13$ & $M\geq 34$ & $1.25$ (at $M = 1$)  
\\\hline
\end{tabular}
\end{table}

\end{document}