Skip to content

Function contracts

Mark Tuttle edited this page Sep 18, 2021 · 6 revisions

A function contract is a function specification. It consists of

  • a precondition that must be true when the function is called,
  • a postcondition that will be true when the function returns, and
  • a write set giving the set of all memory locations that the function might change (nothing outside the write set will be changed).

CBMC can check that a function satisfies its function contract by proving that for all memory states and for all function arguments, if the memory and arguments satisfy the precondition when the function is invoked, then when the function returns, only memory locations in the write set will have been changed, and the memory and return arguments will satisfy the postcondition.

CBMC can replace a function with its function contract by checking that the memory and function arguments satisfy the precondition when the function is invoked, and then replacing the values of memory locations in the write set with arbitrary values satisfying the postcondition.

The value of function contracts is proof abstraction:

  • API specifications: If we can write function contracts for every function in an interface, then we can ignore the details of the interface implementation when we validate code using the interface. The proof becomes valid for any implementation of the API, and what might otherwise be a large and complicated problem for CBMC to solve may become much smaller and more tractible.
  • Proof scaling: Applying CBMC directly to a code base can be intractible. A function with a complicated call graph simply may not fit within the capacity of CBMC on any computer of reasonable size. But suppose we think of the function call graph as a computation tree, and suppose we can find a small set of functions such that if we prune the computation tree at the invocations of these functions, then we get a collection of subtrees of reasonable size. If we can write function contracts for these functions, and if we can prove that these functions satisfy their contracts, then we can use these function contracts in the proofs of other function contracts for other functions that depend on them. We can essentially replace a subtree with the function contract for the function at the root of the subtree. In this way, we can use function contracts to scale CBMC proof to code bases larger than CBMC could otherwise validate.

Writing function contracts

We write a function contract by annotating the function definition with

  • requires clauses that describe the precondition,
  • ensures clauses that describe the postcondition, and
  • assigns clauses that describe the write set.

Recall our string buffer example strbuf2.c from the preceding section with function

typedef struct { size_t length; char* buffer; } strbuf;

char cache;
void cache_fifth_char(strbuf* str) { cache = str->buffer[4]; }

and proof harness

main() {
  size_t len;
  strbuf* str = strbuf_allocate(len);

  __CPROVER_assume(strbuf_is_valid(str));
  __CPROVER_assume(str->length > 4);

  cache_fifth_char(str);

  __CPROVER_assert(strbuf_is_valid(str), "String buffer remains valid");
  __CPROVER_assert(str->length > 4, "String buffer remains length >4");
}

This proof harness is --- as explicitly as possible --- describing the preconditions and postconditions for the function. The assume clauses describe the preconditions and the assert clauses describe the postconditions. One thing not described here is the fact that the function modifies the value of cache.

Now let's write the function contract for this function. All we have to do is annotate the function definition with requires, ensures, and assigns clauses.

void cache_fifth_char(strbuf* str)
  __CPROVER_requires(strbuf_is_valid(str))
  __CPROVER_requires(str->length > 4)
  __CPROVER_ensures(strbuf_is_valid(str))
  __CPROVER_ensures(str->length > 4)
  __CPROVER_assigns(cache)
{
  cache = str->buffer[4];
}

Having done this, we can now simplify the proof harness to doing nothing but allocating data

main() {
  size_t len;
  strbuf* str = strbuf_allocate(len);

  cache_fifth_char(str);
}

You can see the full example in contract1.c. Notice by the way, the strength of the function contract. In particular, the assigns clause says that cache may be modified, but the assigns clause also says that only cache is modified. In particular, left unchanged is

  • the pointer str,
  • the length str->length,
  • the pointer str->buffer, and
  • the contents of the buffer pointed to by str->buffer.

This means that our ensures clauses are actually unnecessary, because they follow immediately from the assigns clause. Let's leave them alone for now and treat them as valuable documentation.

But let's modify this function just a bit to illustrate things we can say with function contracts that are harder to say (or at least clumsier) with a proof harness alone. Let's modify the function cache_fifth_char to increment a global counter and return the character written to the cache.

char cache;
int counter;

char cache_fifth_char(strbuf* str)
{
  counter = counter + 1;
  cache = str->buffer[4];
  return str->buffer[4];
}

Now we can add to the function contract the lines

  __CPROVER_ensures(cache == __CPROVER_return_value)
  __CPROVER_ensures(counter == __CPROVER_old(counter) + 1)
  __CPROVER_assigns(counter)

The first line illustrates how the symbol __CPROVER_return_value can be used to refer to the function return value in the postcondition of a function contract. It says that the return value and the cached value are the same character.

The second line illustrates how the symbol __CPROVER_old(v) applied to a variable v can be used to refer to the value of v before function invocation. It says that the value of counter after function return is one greater than the value of counter before function invocation.

The third line says that the function now increments counter in additon to cache.

You can see the full example in contract2.c

Checking function contracts

To prove that cache_fifth_char satisfies its function contract, we run CBMC as before, but first we have to instrument the program to do the contract checking. We use the goto-cc compiler to produce the goto program as usual, but now we use goto-instrument --enforce-contract cache_fifth_char to instrument the program with contract checking.

goto-cc -o contract2.goto contract2.c
goto-instrument --enforce-contract cache_fifth_char contract2.goto proof2.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof2.goto

We have proved that cache_fifth_char satisfies its function contract.

Using function contracts

Now let's use our function contract to verify another function that calls it. (Drum roll, please.)

Let's forget about old proof harnesses and let's rewrite main to use cache_fifth_char in some way.

#define LEN 5
main() {
  strbuf* str = strbuf_allocate(LEN+1);
  if (!str || !str->buffer) return;

  strcpy(str->buffer, "abcde");
  counter = 0;
  cache = 'a';

  cache_fifth_char(str);

  assert(counter == 1);
  assert(cache == 'e');
}

The function fills the buffer with a string whose fifth character is 'e' and sets counter and cache to initial values. Then it invokes cache_fifth_char and checks that counter has been incremented and cache has been set to the fifth character 'e'.

We use function contracts just as we checked function contracts, except that this time we call goto-instrument with --replace-call-with-contract to instrument the program to use the function contact in place of the function call. We run the following, and...

goto-cc -o contract3.goto contract3.c
goto-instrument --replace-call-with-contract cache_fifth_char contract3.goto proof3.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof3.goto

It fails! What? The assertion cache == 'e' is failing! How is this possible? The function contract ensures that cache == __CPROVER_return_value, and it is obvious from the code that the return value is the fifth character which is 'e'.

But the function contract doesn't actually say that the return value is the fifth character. And, in fact, if we run CBMC with --trace to produce the error trace demonstrating the assertion failure, we see that cache is being set to 'd' and the return value is 'd', and all of that is perfectly consistent with the function contract that we actually wrote.

Let's fix this by adding the missing postcondition

  __CPROVER_ensures(cache == str->buffer[4])

to produce contract3a.c. Now running

goto-cc -o contract3a.goto contract3a.c
goto-instrument --replace-call-with-contract cache_fifth_char contract3a.goto proof3a.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof3a.goto

successfully validates the code, and we are done.

Except...

Warning: Check before use!

We have just made a horrible mistake in our conclusion that we have fully validated our code: We never went back to check that function satisfied the new function contract.

We first proved that cache_fifth_char satisfied the original function without the missing postcondition cache == str->buffer[4] using the proof harness contract2.c. We then tried to validate our code using our original function contract contract3.c and failed, so we created a new function contract by adding the missing postcondition. We then successfully validated our code using the new function contract contract3a.c. But we never proved that that cache_fifth_char satisfies the new function contract.

If we go back to our origintal proof harness for the function contract contract2.c and add the missing postcondition contract2a.c, we can prove that cache_fifth_char satisfies the new function contract with

goto-cc -o contract2a.goto contract2a.c
goto-instrument --enforce-contract cache_fifth_char contract2a.goto proof2a.goto
cbmc --pointer-check --bounds-check --malloc-may-fail --malloc-fail-null proof2a.goto

Now we really are done, and we have learned an important lesson:

  • It is essential to check a function contract before using it in another proof.

It is fine to work out the correct function contract by using it in different contexts to learn what is required of a function contract, but in the end, the final version of the function contract needs to be checked before it is used. To forget to do this is a serious mistake.