Skip to content

Latest commit

 

History

History
1202 lines (1016 loc) · 41.6 KB

DOC.md

File metadata and controls

1202 lines (1016 loc) · 41.6 KB

Getting started with SSProve

This document shall serve as a non-exhaustive guide to SSProve.

This document assumes that you have Coq and SSProve installed and have already some knowledge of Coq.

🚧 This document tries to be as exhaustive as possible, but can still be improved. If you find something missing or not clear enough, feel free to open an issue. 🚧

Overview

  1. Writing packages
    1. Raw code
    2. Specialised types
    3. Distributions
    4. Valid code
    5. Packages
  2. High-level SSP proofs
    1. Package algebra
    2. Adversarial advantage
  3. Probabilistic relational program logic
    1. Proving perfect indistinguishability
    2. Massaging relational judgments
    3. Proving relational judgments
    4. Crafting invariants

Writing packages

SSProve defines a language of code which can feature probabilistic sampling, assertions, memory storing and accesses, but also external procedure import. It is a shallow embedding meaning that one can inject any Coq/Gallina expression into it by using the ret (standing for return) operation which we will expose below.

Raw code

The main notion of code is defined as the type raw_code A which represents a program returning a value of type A. This type A is typically—but not limited to—of type choice_type.

Before detailing how to construct them, here is a first example with no particular meaning.

#import {sig #[0] : 'nat → 'bool × 'nat } as f ;;
#import {sig #[1] : 'bool → 'unit } as g ;;
'(b,k) ← f 0 ;;
if b then (
  g false ;;
  m ← sample uniform 2 ;;
  ret 0
)
else (
  o ← get ℓ ;;
  #assert (isSome o) as oSome ;;
  let n := getSome o oSome in
  put n := Some (2 + n) ;;
  ret n
)

where is defined as

Definition ℓ : Location := ('option 'nat ; 0).

It first imports two procedures with respective identifiers 0 and 1 and types 'nat → 'bool × 'bool and 'bool → 'unit, calling them f and g. We take the result of f (the external procedure) applied to 0 as a pair (b,k) and then do a case-analysis on b. In the else branch, we read memory location , assert that it contains a Some, reusing this fact (called oSome) to get the value itself. We then increment this value twice and place it back in memory before returning the original value.

Return constructor ret

ret : ∀ {A}, A → raw_code A

Injects any pure value into raw_code.

Memory access

A Location is a pair of a type in choice_type and a natural number representing an identifier, for instance ('nat ; 12) : Location. One can read memory as follows:

x ← get ℓ ;; k x
(* Or with pattern matching *)
'p ← get ℓ ;; k p

where k is a continuation, i.e. raw code which can depend on x. One can write to a memory location as follows:

put ℓ := v ;; k

where v is a value of the right type and k a continuation, which this time doesn't expect any value from the writing.

Probabilistic sampling

x ← sample op ;; k x
(* Or alternatively *)
x <$ op ;; k x
(* Or with pattern matching *)
'p ← sample op ;; k p

Here op : Op is a (sub-)distribution. See Distributions.

Failure

fail : ∀ {A : choice_type}, raw_code A

Represents a failure in a program. It is obtained by sampling on the null sub-distribution.

Assertion

#assert b as e ;; k e
(* Alternatively, if the continuation doesn't need the equality *)
#assert b ;; k

Assert that the boolean b is true and store an equality e : b = true to be reused by the continuation. #assert true as e ;; k e simplifies to k erefl while #assert false as e ;; k e simplifies to fail.

Import

x ← op sig ⋅ arg ;; k x

Represents application of imported/assumed procedure of signature sig : opsig to argument arg, calling the result x before passing it to continuation k. See Specialised types for how to define sig. A preferred alternative to writing imports it to use the following notation

#import sig as f ;; k' f

where f bound in the continuation is a function which can be applied via bind. For instance if sig is {sig #[n] : 'nat → 'bool × 'bool } then f has type nat → raw_code (bool * bool) and can be used as

x ← f arg ;; k x

Bind

raw_code is a monad and as such it supports a bind operator. With value reuse it can be written as follows:

x ← v ;; k x
(* Or with pattern matching *)
'p ← v ;; k p

and without, as

v ;; k

This operation is not a primitive/constructor and will reduce to the above constructions when v is concrete.

Specialised types

We have a special type called choice_type which contains codes for specific types that we use in our packages. These are the types used in Location and in opsig or even in Op.

To differentiate them from actual types while retaining some familiarity we usually style them with a quotation mark in front of the type they represent. This is the case for instance of 'nat, 'bool, 'unit and 'option which are self-explanatory.

We also provide 'fin n which is the inhabited finite type of size n. Under the hood, Coq will attempt to prove that n is non-zero. In case it fails, the user should provide instances or hints for the Positive type-class.

We also have the product type chProd x y which is interpreted to Coq's product prod. For instance chProd 'nat 'bool corresponds to nat * bool.

Finally we have the type of finite maps chMap x y where x is the type of keys, and y the type of values.

Further notation in specific settings

When defining signatures (opsig), interfaces (see Valid code), or packages (see Packages), one can further use handy notations for chProd and chMap, as exemplified below:

'nat × 'bool
{map 'nat → 'option 'nat }

Signatures

A signature (opsig) is given by an identifier (a natural number), an argument type and a return type (both in choice_type). Once can for instance write (37, ('nat, chProd 'unit 'unit)).

We provide the following nicer notation:

{sig #[37] : 'nat → 'unit × 'unit }

Distributions

The user can sample using pretty much any distribution that can be expressed in mathcomp-analysis provided that its support is in choice_type. Writing them by hand might not be very convenient.

For the time being we provide uniform n which represents a uniform distribution landing in 'fin n. As such n must be positive (Coq will look for an instance of Positive n).

Valid code

Raw code as described above is well-typed but does not have any guarantees with respect to what it imports and which location it uses. We therefore define a notion a validity with respect to an import interface and a set of locations.

Set of locations

The set of locations is expected as an {fset Location } using the finite sets of the extructures library. For our purposes, it is advisable to write them directly as list which of locations which is then cast to an fset using the fset operation, as below:

fset [:: ℓ₀ ; ℓ₁ ; ℓ₂ ]

This is the best way to leverage the automation that we introduced. Nevertheless, in some cases it might be more convenient to use the union (:|:) operator of extructures.

Interfaces

An interface is a set of signatures (opsig) corresponding to the procedures that a piece of code can import and use. Rather than writing them as fset directly, we provide special convenient notations, as well the type Interface.

Interfaces are wrapped in the [interface] container which behaves like lists. They are of the form

[interface d₀ ; d₁ ; … ; dₙ ]

where the dᵢ are signatures, given using a special syntax:

val #[ id ] : src → tgt

where id is a natural number / identifier, and src and tgt are codes of types in choice_type given using the special syntax (see Specialised types).

Here are examples of interfaces:

[interface]
[interface
  val #[ 0 ] : 'nat → 'nat ;
  val #[ 1 ] : 'option 'bool → 'unit × {map 'nat → 'bool }
]

Validity of code

Validity of code c with respect to set of locations L and import interface I is denoted by the class ValidCode L I c. We derive from it the type code L I A of valid code.

Raw code c can be cast to code by using the notation

{code c }

For instance, in the following, we declare a simple code by just giving the raw_code and using the {code} notation:

Definition foo : code fset0 [interface] bool :=
  {code ret true }.

The fact that this is a class means that in practice, the validity proof should automatically be inferred by Coq. In case where automation doesn't work, it is still possible to leverage it to find which sub-goal it did not solve for you by using the ssprove_valid tactic.

Here is an example using Equations that allows us to use the proof mode to fill in the validity proof.

Obligation Tactic := idtac.

Definition ℓ : Location := ('nat ; 0).

Equations? foo : code fset0 [interface] 'nat :=
  foo := {code
    n ← get ℓ ;;
    ret n
  }.
Proof.
  ssprove_valid.
  (* We have to prove ℓ \in fset0 which does not hold. *)
Abort.

We can then see where the mistake was and change the empty interface to something containing like fset [:: ℓ ].

Note that ssprove_valid and the inference for ValidCode can be extended with hints. The former using the ssprove_valid_db database, the latter with the typeclass_instances database.

Note: There is an implicit coercion from code to raw_code.

Packages

Package construction

We have a notion of raw_package which is a collection of procedures of the form src → raw_code tgt distinguished by their signatures. This notion of raw_package will prove the most efficient when proving results about packages, such as advantages. However, we provide a syntax to define valid packages by construction, i.e. of type package L I E where each procedure must be ValidCode L I tgt and the lot of them must implement export interface E.

The syntax for valid packages is similar to that of interfaces. Better explained on an example:

Definition test :
  package
    fset0
    [interface
      val #[0] : 'nat → 'bool ;
      val #[1] : 'bool → 'unit
    ]
    [interface
      val #[2] : 'nat → 'nat ;
      val #[3] : 'bool × 'bool → 'bool
    ]
  :=
  [package
    def #[2] (n : 'nat) : 'nat {
      #import {sig #[0] : 'nat → 'bool } as f ;;
      #import {sig #[1] : 'bool → 'unit } as g ;;
      b ← f n ;;
      if b then
        g false ;;
        ret 0
      else ret n
    } ;
    def #[3] ('(b₀,b₁) : 'bool × 'bool) : 'bool {
      ret b₀
    }
  ].

Packages are wrapped in the [package] container which behaves like lists. They are of the form

[package d₀ ; d₁ ; … ; dₙ ]

where the dᵢ are declarations, given using a special syntax:

def #[ id ] (x : src) : tgt { e }

where id is a natural number / identifier, and src and tgt are codes of types in choice_type given using the special syntax (see Specialised types), while e is a regular Coq expression corresponding to the body of the function, with x bound inside it. As seen in the example, x can be matched against in the declaration by using the 'p notation where p is a pattern.

Similarly to ValidCode, there is an underlying ValidPackage class and we can call its best effort version with ssprove_valid, for instance using Equations (see Valid code).

In the example above we also explicitly gave an export interface while the information is already present in the declaration. As such in can be omitted as on the simpler example below:

Definition test' : package fset0 [interface] _ :=
  [package
    def #[ 0 ] (n : 'nat) : 'nat {
      ret (n + n)%N
    } ;
    def #[ 1 ] (b : 'bool) : 'nat {
      if b then ret 0 else ret 13
    }
  ].

The locations and import interface should however be given explicitly since they are what the programs can use, not what they exactly use.

Composition of packages

One of the key points of SSP is its package algebra with sequential and parallel composition as well as the identity package. All these operations are defined on raw_packages directly but extend to package with the {package} and {locpackage} notations.

Sequential composition is called link in SSProve and can be written p₀ ∘ p₁. It represents p₀ where all imports are replaced by the inlined procedures of p₁. It is valid when the export interface of p₁ matches the import interface of p₀.

Parallel composition of (raw) packages p₀ and p₁ is written par p₀ p₁. It is valid if we have Parable p₀ p₁ (which is a class). The resulting package must have the union of locations of its components, as such automation can be lacking on that front, so it might be a good idea to rely on Equations again:

Equations? pkg : package L I E :=
  pkg := {package (par p₀ p₁) ∘ p₂ }.
Proof.
  ssprove_valid.
  (* Now deal with the goals *)

Finally, the identity package is defined as ID I where I is an interface. It both imports and exports I by simply forwarding the calls. It is valid as long as I does not include two signatures sharing the same identifier, as overloading is not possible in our packages. This property is written flat I and can be inferred automatically by ssprove_ valid.

As illustrated above, {package p } casts a raw package to some package L I E, trying to infer the proof. We also have {locpackage p } which will cast to loc_package I E which is essentially the same as package but where the set of locations is internalised.

Note: loc_package and package both have implicit coercions to raw_package. This means that, for instance, if p₀ and p₁ are both package then, {package p₀ ∘ p₁ } is a valid expression, and will be complete if the interfaces match.

Note: When using Equations, please Set Equations Transparent such that code simplification in proofs via ssprove_code_simpl resolves properly.

High-level SSP proofs

To reason at the high-level of state-separating proofs, we have two main options. The first one is the package algebra which involves laws on sequential and parallel composition as well as on the identity package. The second is when talking about advantage and corresponds mainly to the triangle inequality and the reduction lemma.

Most of those apply to raw_package directly, but some will still have some extra conditions which might be validity of some bits.

Package algebra

The algebraic laws on packages are expressed as equalities (using Coq's equality type =) on raw_package.

Associativity of sequential composition / linking

Lemma link_assoc :
  ∀ p₁ p₂ p₃,
    p₁ ∘ (p₂ ∘ p₃) = (p₁ ∘ p₂) ∘ p₃.

Commutativity of parallel composition

Lemma par_commut :
  ∀ p1 p2,
    Parable p1 p2 →
    par p1 p2 = par p2 p1.

Associativity of parallel composition

Lemma par_assoc :
  ∀ p1 p2 p3,
    par p1 (par p2 p3) = par (par p1 p2) p3.

Identity law

Lemma link_id :
  ∀ L I E p,
    ValidPackage L I E p →
    flat I →
    trimmed E p →
    link p (ID I) = p.
Lemma id_link :
  ∀ L I E p,
    ValidPackage L I E p →
    trimmed E p →
    link (ID E) p = p.

These laws require the package p to be valid but also to be trimmed which means that it doesn't implement more than it exports. For packages constructed as in [Packages], this is always the case.

Interchange between sequential and parallel composition

Lemma interchange :
  ∀ A B C D E F L₁ L₂ L₃ L₄ p₁ p₂ p₃ p₄,
    ValidPackage L₁ B A p₁ →
    ValidPackage L₂ E D p₂ →
    ValidPackage L₃ C B p₃ →
    ValidPackage L₄ F E p₄ →
    trimmed A p₁ →
    trimmed D p₂ →
    Parable p₃ p₄ →
    par (p₁ ∘ p₃) (p₂ ∘ p₄) = (par p₁ p₂) ∘ (par p₃ p₄).

The last line can be read as

(p₁ ∘ p₃) || (p₂ ∘ p₄) = (p₁ || p₂) ∘ (p₃ || p₄)

Adversarial advantage

Security theorems in SSP will often conclude on an inequality of advantages. We offer several ways to reason about them, but first we will show how to even state such theorems.

Advantage and games

The simplest notion of advantage we have is AdvantageE of the following type

AdvantageE (G₀ G₁ A : raw_package) : R

G₀ and G₁ are the packages compared by the distinguisher/adversary A. The result is a real number, of type R.

We also have an alternative version simply style Advantage which takes in a GamePair:

Definition GamePair :=
  bool → raw_package.
Advantage (G : GamePair) (A : raw_package) : R

The two definitions are equivalent, as stated by the following. AdvantageE should be preferred as it is slightly less constrained.

Lemma Advantage_E :
  ∀ (G : GamePair) A,
    Advantage G A = AdvantageE (G false) (G true) A.

We have several useful lemmata on advantage. We will list the important ones below.

Lemma Advantage_link :
  ∀ G₀ G₁ A P,
    AdvantageE G₀ G₁ (A ∘ P) =
    AdvantageE (P ∘ G₀) (P ∘ G₁) A.

This one corresponds to the reduction lemma and is very useful.

Lemma Advantage_sym :
  ∀ P Q A,
    AdvantageE P Q A = AdvantageE Q P A.
Lemma Advantage_triangle :
  ∀ P Q R A,
    AdvantageE P Q A <= AdvantageE P R A + AdvantageE R Q A.

The triangle inequality is also very useful when reasoning about advantage. As such we provide the user with an n-ary version of it which allows the user to simulate game-hopping, in the form of a convenient tactic.

ssprove triangle p₀ [:: p₁ ; p₂ ; p₃ ] p₄ A as ineq.

will produce an inequality

ineq :
  AdvantageE p₀ p₄ A <= AdvantageE p₀ p₁ A +
                        AdvantageE p₁ p₂ A +
                        AdvantageE p₂ p₃ A +
                        AdvantageE p₃ p₄ A

Perfect indistinguishability

When the advantage of an adversary A (with disjoint state) against a game pair (G₀, G₁) is 0, we say that G₀ and G₁ are perfectly indistinguishable and we write G₀ ≈₀ G₁. Because this definition needs to talk about state, it can only apply to valid packages. This notation indeed implicitly asks for the following:

ValidPackage L₀ Game_import E G₀
ValidPackage L₁ Game_import E G₁

for some L₀, L₁ and E. It is equivalent to the following:

∀ LA A,
  ValidPackage LA E A_export A →
  fdisjoint LA L₀ →
  fdisjoint LA L₁ →
  AdvantageE G₀ G₁ A = 0.

So one can use G₀ ≈₀ G₁ to rewrite an advantage to 0, typically after using the triangle inequality, to eliminate some terms. Herein A_export is the export interface of an adversary, it contains a single procedure RUN of type 'unit → 'bool.

Probabilistic relational program logic

To prove perfect indistinguishability of two packages, we propose a low-level probabilistic relational Hoare logic. We first show how to prove a statement of the form P ≈₀ Q and then how to reason in this program logic.

Proving perfect indistinguishability

The lemma of interest here is the following:

Lemma eq_rel_perf_ind :
  ∀ {L₀ L₁ E} (p₀ p₁ : raw_package) (inv : precond)
    `{ValidPackage L₀ Game_import E p₀}
    `{ValidPackage L₁ Game_import E p₁},
    Invariant L₀ L₁ inv →
    eq_up_to_inv E inv p₀ p₁ →
    p₀ ≈₀ p₁.

Most conditions are for p₀ ≈₀ p₁ to even make sense. The important part is that to prove p₀ ≈₀ p₁ it suffices to prove that their procedures are related in our program logic, while preserving an invariant inv. An invariant relates the two heaps (state) used by p₀ and p₁ respectively. The simplest example of invariant simply state equality of the two:

λ '(s₀, s₁), s₀ = s₁

To use it, one case use the following special case.

Corollary eq_rel_perf_ind_eq :
  ∀ {L₀ L₁ E} (p₀ p₁ : raw_package)
    `{ValidPackage L₀ Game_import E p₀}
    `{ValidPackage L₁ Game_import E p₁},
    eq_up_to_inv E (λ '(h₀, h₁), h₀ = h₁) p₀ p₁ →
    p₀ ≈₀ p₁.

We will say more about invariants later in [Crafting invariants].

Once this lemma is applied, we need to simplify the eq_up_to_inv expression. We have a set of tactics that help us achieve that automatically.

eapply eq_rel_perf_ind_eq.
simplify_eq_rel x. (* x is a name *)
all: ssprove_code_simpl.

simplify_eq_rel x will turn eq_upto_inv into one goal for each procedure, x being the name for the argument in each case. The goals it returns can be quite massive, with typically linking that is not reduced (not inlined). For each sub-goal (hence the goal selector all:), we apply the ssprove_code_simpl tactic which we will describe in the next section.

Massaging relational judgments

A relational goal obtained after simplify_eq_rel will be of the form

⊢ ⦃ pre ⦄ c₀ ≈ c₁ ⦃ post ⦄

where pre is a precondition, post a postcondition, and c₀ and c₁ are both raw code. As stated above, the expressions c₀ and c₁ may not be in their best shape. For instance, linking might be stuck because of a match expression.

ssprove_code_simpl will simplify such a goal by traversing both c₀ and c₁ and performing simplifications such as commutation of linking with match, or associativity of bind. In some rare cases, two applications of this tactic might be necessary. While it performs most possible simplifications, it only works syntactically.

ssprove_code_simpl_more on the other hand will operate semantically, exploiting the fact that we are proving a relational judgment. One of the main points of it is that it can deal with associativity of #assert.

ssprove_code_simpl is actually extensible by adding hints to the ssprove_code_simpl database. Consider for instance the following extension:

Hint Extern 50 (_ = code_link _ _) =>
  rewrite code_link_scheme
  : ssprove_code_simpl.

The hints must be able to solve goals where the left-hand side is an evar and the right-hand side is the expression to simplify. Here we state that whenever the expression to simplify is code_link we can rewrite using code_link_scheme before continuing the simplification.

The lemma in question is

Lemma code_link_scheme :
  ∀ A c p,
    @ValidCode fset0 [interface] A c →
    code_link c p = c.

stating that code which does not import anything (here we add the unnecessary requirement that it must be state-less as well) remains unchanged after linking.

Proving relational judgments

Now that our goal is pretty enough to read, we can try and prove it. For this a lot of rules are introduced in the pkg_rhl module (which you import simply by importing Package). Not all of them are useful for the user experience, some are simply used to prove other ones.

We will present the most important ones as well as tactics that can help apply certain rules.

Synchronous rules

The most basic tactic deals with judgment with the same first instruction on both sides, for instance

⊢ ⦃ pre ⦄ x ← sample D ;; k₀ x ≈ x ← sample D ;; k₁ x ⦃ post ⦄

Applying the ssprove_sync tactic will reduce the goal to

∀ x, ⊢ ⦃ pre ⦄ k₀ x ≈ k₁ x ⦃ post ⦄

In the case of sampling there is no extra requirement, but when the first operation is a get or a put, there is a priori no guarantee that the precondition pre will be preserved. ssprove_sync will try to solve this extra condition on its own but will ask the user for it if it doesn't manage. To solve it, it calls the extensible ssprove_invariant tactic that we will see in [Crafting invariants].

There is also a more specialised tactic ssprove_sync_eq which solves the same problem when the precondition is equality: λ '(s₀, s₁), s₀ = s₁. This one never generates extra conditions.

Bind

Sometimes, the head is not a command stem, but a bind. In this case we have the r_bind rule. It is also more general than the above in that the two head expression can differ, provided they are still related.

Lemma r_bind :
  ∀ {A₀ A₁ B₀ B₁ : ord_choiceType} m₀ m₁ f₀ f₁
    (pre : precond) (mid : postcond A₀ A₁) (post : postcond B₀ B₁),
    ⊢ ⦃ pre ⦄ m₀ ≈ m₁ ⦃ mid ⦄ →
    (∀ a₀ a₁, ⊢ ⦃ λ '(s₀, s₁), mid (a₀, s₀) (a₁, s₁) ⦄ f₀ a₀ ≈ f₁ a₁ ⦃ post ⦄) →
    ⊢ ⦃ pre ⦄ bind m₀ f₀ ≈ bind m₁ f₁ ⦃ post ⦄.

Return

The tactics above only apply when the code has a head command, this excludes programs like ret t. To deal with them, we have the specialised r_ret rule.

Lemma r_ret :
  ∀ {A₀ A₁ : ord_choiceType} u₀ u₁ (pre : precond) (post : postcond A₀ A₁),
    (∀ s₀ s₁, pre (s₀, s₁) → post (u₀, s₀) (u₁, s₁)) →
    ⊢ ⦃ pre ⦄ ret u₀ ≈ ret u₁ ⦃ post ⦄.

Swapping

Sometimes, two expressions are similar but don't have the same head symbol. In this case, it might prove useful to swap commands. Say we have the following goal:

⊢ ⦃ pre ⦄ x ← sample D ;; y ← get ℓ ;; r₀ x y ≈ c₁ ⦃ post ⦄

then applying ssprove_swap_lhs 0%N will leave us to prove

⊢ ⦃ pre ⦄ y ← get ℓ ;; x ← sample D ;; r₀ x y ≈ c₁ ⦃ post ⦄

instead.

However, not any two commands are swappable. The tactic will try to infer the swappability condition automatically, this is the case for sampling which can always be swapped (if dependencies permit), or for get/put when they talk about distinct locations. If automation proves insufficient, the user will have to provide the proof themselves. There is also the option of enriching the ssprove_swap database, as in the example below.

Hint Extern 40 (⊢ ⦃ _ ⦄ x ← ?s ;; y ← cmd _ ;; _ ≈ _ ⦃ _ ⦄) =>
  eapply r_swap_scheme_cmd ; ssprove_valid
  : ssprove_swap.

Here we provide a hint to swap a bind with a regular command (get, sample or put), in the case where the program that we bind is a scheme, i.e. a stateless, import-less program:

Lemma r_swap_scheme_cmd :
  ∀ {A B : choiceType} (s : raw_code A) (c : command B),
    ValidCode fset0 [interface] s →
    ⊢ ⦃ λ '(s₀, s₁), s₀ = s₁ ⦄
      x ← s ;; y ← cmd c ;; ret (x,y) ≈
      y ← cmd c ;; x ← s ;; ret (x,y)
    ⦃ eq ⦄.

The tactics we provide are

  • ssprove_swap_lhs n for swapping in the left-hand side at depth n;
  • ssprove_swap_rhs n for swapping in the right-hand side;
  • ssprove_swap_seq_lhs s for swapping in the lhs using a sequence s of swapping instructions given as a list of natural numbers;
  • ssprove_swap_seq_rhs s for the same in the rhs.

Reflexivity rule

We have a very simple reflexivity rule in the case where the invariant is equality:

Lemma rreflexivity_rule :
  ∀ {A : ord_choiceType} (c : raw_code A),
    ⊢ ⦃ λ '(s₀, s₁), s₀ = s₁ ⦄ c ≈ c ⦃ eq ⦄.

In case the invariant is not equality, there is still a reflexivity rule, but it is more constrained:

Lemma r_reflexivity_alt :
  ∀ {A : choiceType} {L} pre (c : raw_code A),
    ValidCode L [interface] c →
    (∀ ℓ, ℓ \in L → get_pre_cond ℓ pre) →
    (∀ ℓ v, ℓ \in L → put_pre_cond ℓ v pre) →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄
      c ≈ c
    ⦃ λ '(b₀, s₀) '(b₁, s₁), b₀ = b₁ ∧ pre (s₀, s₁) ⦄.

Validity can be handled with ssprove_valid and the other get_pre_cond and put_pre_cond are goals dealt with ssprove_invariant.

Dealing with memory

There are ways to deal with memory is an asynchronous way. We tried to make it as idiomatic as possible.

Reading or writing twice

When faced with the goal

⊢ ⦃ pre ⦄ x ← get ℓ ;; r₀ x ≈ x ← get ℓ ;; r₁ x ⦃ post ⦄

one can use ssprove_sync to introduce the x and continue proving equivalence between r₀ and r₁. The information that x comes from location however is lost.

The first solution to this problem comes from contraction rules, or rather tactics.

ssprove_contract_get_lhs will take a goal of the form

⊢ ⦃ pre ⦄ x ← get ℓ ;; y ← get ℓ ;; r₀ x y ≈ c₁ ⦃ post ⦄

and turn it into

⊢ ⦃ pre ⦄ x ← get ℓ ;; r₀ x x ≈ c₁ ⦃ post ⦄

ssprove_contract_put_lhs will take a goal of the form

⊢ ⦃ pre ⦄ put ℓ := v ;; put ℓ := v' ;; c₀ ≈ c₁ ⦃ post ⦄

and turn it into

⊢ ⦃ pre ⦄ put ℓ := v' ;; c₀ ≈ c₁ ⦃ post ⦄

ssprove_contract_put_get_lhs will take a goal of the form

⊢ ⦃ pre ⦄ put ℓ := v ;; x ← get ℓ ;; r₀ x ≈ c₁ ⦃ post ⦄

and turn it into

⊢ ⦃ pre ⦄ put ℓ := v ;; r₀ v ≈ c₁ ⦃ post ⦄

ssprove_contract_get_rhs, ssprove_contract_put_rhs and ssprove_contract_put_get_rhs are their right-hand side counterparts.

Remember after reading

Sometimes, swapping and contracting is not possible, even when the code makes two reads to the same location. It can happen for instance if the value read is branched upon before being read again.

For this we have several rules that will remember which location was read.

Lemma r_get_remember_lhs :
  ∀ {A B : choiceType} ℓ r₀ r₁ (pre : precond) (post : postcond A B),
    (∀ x,
      ⊢ ⦃ λ '(s₀, s₁), (pre ⋊ rem_lhs ℓ x) (s₀, s₁) ⦄ r₀ x ≈ r₁ ⦃ post ⦄
    ) →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ x ← get ℓ ;; r₀ x ≈ r₁ ⦃ post ⦄.

It behaves like you would expect an asynchronous rule for get except that the precondition gets extended with rem_lhs ℓ x stating that the location contained value x on the left-hand side. In this fashion we have r_get_remember_rhs which will add rem_rhs ℓ x instead, but also synchronous rules that will also remember, for instance

Lemma r_get_vs_get_remember_lhs :
  ∀ {A B : choiceType} ℓ r₀ r₁ (pre : precond) (post : postcond A B),
    Syncs ℓ pre →
    (∀ x,
      ⊢ ⦃ λ '(s₀, s₁), (pre ⋊ rem_lhs ℓ x) (s₀, s₁) ⦄
        r₀ x ≈ r₁ x
      ⦃ post ⦄
    ) →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄
      x ← get ℓ ;; r₀ x ≈
      x ← get ℓ ;; r₁ x
    ⦃ post ⦄.

Here we have an additional Syncs ℓ pre condition which states that should point to the same value on both sides (or more precisely that this should be ensured by the precondition pre). exact _ (type-class inference) or ssprove_invariant should deal with it.

We also have the right-hand side variant r_get_vs_get_remember_rhs and the do-all rule r_get_vs_get_remember which remembers on both sides.

Now, that we have stored information in the precondition, we have several ways of using it, or discarding it. Indeed, this precondition will not always be preserved by rules, in particular writing memory (put) does not necessarily preserve rem_lhs/rem_rhs. As such, one can call ssprove_forget to discard the most recent remember, and ssprove_forget_all will discard all of them.

More importantly, one can make use of remembered values with, for instance, the following rule

Lemma r_get_remind_lhs :
  ∀ {A B : choiceType} ℓ v r₀ r₁ (pre : precond) (post : postcond A B),
    Remembers_lhs ℓ v pre →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ r₀ v ≈ r₁ ⦃ post ⦄ →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ x ← get ℓ ;; r₀ x ≈ r₁ ⦃ post ⦄.

Here Remembers_lhs is also a class that can be inferred using ssprove_invariant. It basically checks that pre contains some rem_lhs ℓ v. The right-hand side counterpart is r_get_remind_rhs. In some cases, one has remembered the value of something on the left, and needs it on the right, in which case the following lemma is useful:

Lemma Remembers_rhs_from_tracked_lhs :
  ∀ ℓ v pre,
    Remembers_lhs ℓ v pre →
    Syncs ℓ pre →
    Remembers_rhs ℓ v pre.

and similarly Remembers_lhs_from_tracked_rhs.

We will see later, in [Crafting invariants], how we can also leverage these remembered values with invariants.

Invariant debts after writing

Dually to how we remember read values, we propose a way to write to a memory location, even when it might temporarily break the invariant. As we will se in [Crafting invariants], a lot of invariants will involve several locations at once, meaning the most of the time, writing a value will break them. Thus, our machinery to write to the memory freely and then, at the user's command, to restore the invariant.

These debts to the precondition are incurred by using one of the following rules.

Lemma r_put_lhs :
  ∀ {A B : choiceType} ℓ v r₀ r₁ (pre : precond) (post : postcond A B),
    ⊢ ⦃ λ '(s₀, s₁), (set_lhs ℓ v pre) (s₀, s₁) ⦄ r₀ ≈ r₁ ⦃ post ⦄ →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ put ℓ := v ;; r₀ ≈ r₁ ⦃ post ⦄.

or its rhs counterpart r_put_rhs. We can deal with a put on both sides with r_put_vs_put.

Now we either have set_lhs or set_rhs around our invariant. This means that temporarily we cannot remember read values or use the invariant as it might no longer hold. Once we believe that the invariant has been restored, we can use one of the two tactics ssprove_restore_pre and ssprove_restore_mem.

ssprove_restore_pre is the simplest and typically applies to a goal where the precondition has been hidden by several set_lhs/set_rhs. Under the hood it applies the rule

Lemma r_restore_pre :
  ∀ {A B : choiceType} l r₀ r₁ (pre : precond) (post : postcond A B),
    preserve_update_pre l pre →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ r₀ ≈ r₁ ⦃ post ⦄ →
    ⊢ ⦃ λ '(s₀, s₁), (update_pre l pre) (s₀, s₁) ⦄ r₀ ≈ r₁ ⦃ post ⦄.

So it will restore pre as a precondition, assuming that the predicate preserve_update_pre holds. Automation in this case is also performed with ssprove_invariant. It might not solve all goals, but should generally give you to prove the specific invariants that you used.

Note that if your precondition contains some rem_lhs/rem_rhs, you will have to prove that those are preserved too. This will not be the case if you have written to those very memory locations. In which case, it is recommended to use the following tactic instead.

ssprove_restore_mem is similar but will also take into account the remembered read values. Under the hood it applies the rule

Lemma r_restore_mem :
  ∀ {A B : choiceType} l m r₀ r₁ (pre : precond) (post : postcond A B),
    preserve_update_mem l m pre →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ r₀ ≈ r₁ ⦃ post ⦄ →
    ⊢ ⦃ λ '(s₀, s₁), (update_pre l (remember_pre m pre)) (s₀, s₁) ⦄
      r₀ ≈ r₁
    ⦃ post ⦄.

The predicate preserve_update_mem generalises the one above (in fact preserve_update_pre is defined using the empty list for the remembered values, this means that automation is shared between the two). This can be very useful because it essentially means that you can use the assumptions you have on initial memory to restore the invariant.

The predicate is defined as

Definition preserve_update_mem l m (pre : precond) :=
  ∀ s₀ s₁, (remember_pre m pre) (s₀, s₁) → pre (update_heaps l s₀ s₁).

Note that restoring the invariant with this method will forget all assumptions you had on memory, only the proper invariant will remain. Feel free to open an issue if you would need something stronger.

Crafting invariants

Proper invariants

As already stated, the easiest to use invariant is equality of heaps. The fact that it is an invariant is summarised in the lemma below:

Lemma Invariant_eq :
  ∀ L₀ L₁,
    Invariant L₀ L₁ (λ '(s₀, s₁), s₀ = s₁).

where L₀ and L₁ represent the sets of memory locations of both programs. While it can be enough for a lot of examples (our own examples mostly use equality as an invariant), it is not always sufficient.

Another invariant we propose is called heap_ignore and is defined as

Definition heap_ignore (L : {fset Location}) :=
  λ '(h₀, h₁),
    ∀ (ℓ : Location), ℓ \notin L → get_heap h₀ ℓ = get_heap h₁ ℓ.

It only states equality of heaps on locations that are not in L, the set of ignored locations. It is a valid invariant as long as the ignored locations are contained in the program location (in other words, one cannot ignore the locations of the adversary).

Lemma Invariant_heap_ignore :
  ∀ L L₀ L₁,
    fsubset L (L₀ :|: L₁) →
    Invariant L₀ L₁ (heap_ignore L).

The two above are considered proper invariants, or support of an invariant, because they conclude on the whole heaps. However, invariants will often need to conclude about specific locations, like saying that two locations store values that are related in some way. For this purpose, we define a notion of semi-invariant (short of a better name) which do not conclude about the heaps globally but only about parts of it. Proper invariants (like heap_ignore) can then be combined with semi-invariants to produce new proper invariants; we do this with the notation inv ⋊ sinv which is a sort of fancy notation for conjunction.

Lemma Invariant_inv_conj :
  ∀ L₀ L₁ inv sinv,
    Invariant L₀ L₁ inv →
    SemiInvariant L₀ L₁ sinv →
    Invariant L₀ L₁ (inv ⋊ sinv).

A semi-invariant isn't as restrictive as an invariant. Note that we already used this notion in [Proving relational judgments] when talking about rem_lhs/rem_rhs which are semi-invariants.

Note: Using ssprove_invariant is the recommended way for deriving automatically these properties. This tactic can be extended by adding hints to the ssprove_invariant database.

We will now review the already defined semi-invariants.

Location coupling

One semi-invariant that we provide is couple_lhs:

Definition couple_lhs ℓ ℓ' R : precond :=
  λ '(s₀, s₁), R (get_heap s₀ ℓ) (get_heap s₀ ℓ').

couple_lhs ℓ ℓ' R states that the values stored in locations and ℓ' of the lhs are related by relation R. Alternatively, we also provide couple_rhs. It is a semi-invariant provided that the locations belong to the programs:

Lemma SemiInvariant_couple_lhs :
  ∀ L₀ L₁ ℓ ℓ' (R : _ → _ → Prop),
    ℓ \in L₀ :|: L₁ →
    ℓ' \in L₀ :|: L₁ →
    R (get_heap empty_heap ℓ) (get_heap empty_heap ℓ') →
    SemiInvariant L₀ L₁ (couple_lhs ℓ ℓ' h).

Now, to make use of this invariant, one can call the following rule:

Lemma r_rem_couple_lhs :
  ∀ {A B : choiceType} ℓ ℓ' v v' R (pre : precond) c₀ c₁ (post : postcond A B),
    Couples_lhs ℓ ℓ' R pre →
    Remembers_lhs ℓ v pre →
    Remembers_lhs ℓ' v' pre →
    (R v v' → ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ c₀ ≈ c₁ ⦃ post ⦄) →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ c₀ ≈ c₁ ⦃ post ⦄.

It basically gives you the same goal you add, with the extra hypothesis that the relation holds for the values. When you typically do eapply (r_rem_couple_lhs ℓ ℓ'), you can try exact _ or ssprove_invariant to infer Couples_lhs and Remembers_lhs that will check in the precondition that the two locations are indeed coupled on the left and that we have read them.

Alternatively, one can use r_rem_couple_rhs.

We also provide useful relations like

Definition sameSome {A B} (x : option A) (y : option B) :=
  isSome x = isSome y.

Relating three locations

Similarly to couple_rhs we also provide triple_rhs which works in essentially the same way.

Lemma SemiInvariant_triple_rhs :
  ∀ L₀ L₁ ℓ₁ ℓ₂ ℓ₃ (R : _ → _ → _ → Prop),
    ℓ₁ \in L₀ :|: L₁ →
    ℓ₂ \in L₀ :|: L₁ →
    ℓ₃ \in L₀ :|: L₁ →
    R (get_heap empty_heap ℓ₁) (get_heap empty_heap ℓ₂) (get_heap empty_heap ℓ₃) →
    SemiInvariant L₀ L₁ (triple_rhs ℓ₁ ℓ₂ ℓ₃ R).
Lemma r_rem_triple_rhs :
  ∀ {A B : choiceType} ℓ₁ ℓ₂ ℓ₃ v₁ v₂ v₃ R
    (pre : precond) c₀ c₁ (post : postcond A B),
    Triple_rhs ℓ₁ ℓ₂ ℓ₃ R pre →
    Remembers_rhs ℓ₁ v₁ pre →
    Remembers_rhs ℓ₂ v₂ pre →
    Remembers_rhs ℓ₃ v₃ pre →
    (R v₁ v₂ v₃ → ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ c₀ ≈ c₁ ⦃ post ⦄) →
    ⊢ ⦃ λ '(s₀, s₁), pre (s₀, s₁) ⦄ c₀ ≈ c₁ ⦃ post ⦄.

🚧 For the moment we only deal with the right-hand side, as we might want to develop something more general. 🚧