Type checking and preservation proof for LTL #206

gergo- · 2017-10-06T12:04:33Z

The main reason for the patches in this branch is to strengthen the Locmap.gss_reg lemma, which used to claim that any value can be written to and recovered from any register:

Lemma gss_reg: forall r v m, (set (R r) v m) (R r) = v.

This statement is not true if v is a 64-bit value and r is a 32-bit register. (This does not seem to allow miscompilations in practice, as there are plenty of type checks both above and below the LTL level.) The new version enforces correct typing, as for stack slots:

Lemma gss_reg: forall r v m, Val.has_type v (mreg_type r) -> (set (R r) v m)

The BR_splitlong constructor used to be recursive, meaning that a long result could in theory be split into an arbitrary tree of atomic parts. But we only ever split longs into exactly two ints, so this generality was not needed. This simplification will help with the LTL typing proof.

If LTLtyping finds that the program after register allocation is well-typed, then execution preserves well-typedness of the state. In particular, this typing property ensures that Locmap accesses are well-typed: All register writes are of values with a type compatible with the register's type.

Locmap.set now uniformly uses `Val.load_result` to model stores to registers and to stack slots equivalently.

xavierleroy · 2017-10-09T07:40:30Z

Some general comments. (We'll go over the code when you're back.)

I support the new "typed" semantics for registers in LTL. It is more realistic and unifies the treatment of registers with that of stack locations.
At first I thought the Allocation checker could be strengthened to guarantee the well-typedness of the generated LTL, without a need for an additional LTL type-checking pass. Maybe it's not true, and probably it's simpler to have the additional LTL type-checking pass. But then, are there opportunities to simplify the Allocation checker? E.g. the well_typed_move business.
I'm on the edge concerning the non-recursive BR_splitlong. This makes it impossible to describe an int64 builtin result that would be split between a 32-bit register and a 32-bit stack location. Apparently we never need this, otherwise this PR would break. On the other hand, on the side of builtin arguments we do need this kind of description (for debugging info and for annotation builtins) and so BA_splitlong must be recursive. So we're losing some symmetry between builtin results and builtin arguments. Probably it doesn't matter.

gergo- · 2017-10-11T12:26:26Z

Thanks for your comments.

At first I thought the Allocation checker could be strengthened to guarantee the well-typedness of the generated LTL, without a need for an additional LTL type-checking pass.

That may be possible, I didn't try to approach it that way. The type-checking pass is convenient for reasoning of the form "if f is well-typed, then any instruction in f is well-typed". But there might be enough information in the Allocation checker to replace this.

I do use the Allocation checker where the information from the type checker is not precise enough, i.e., in the case we discussed of spills from a general register (e.g., Tany32) to a stack slot of a concrete type (e.g., Tsingle). Maybe we could generalize this and do away with the type checker completely.

I'm on the edge concerning the non-recursive BR_splitlong. This makes it impossible to describe an int64 builtin result that would be split between a 32-bit register and a 32-bit stack location.

I'm not sure this is true. BR_splitlong takes a type argument A, so a value of type BR_splitlong Locations.loc should be able to hold a register in one part and a stack slot in the other. On the other hand, on a quick skim I didn't find any occurrences of this type in the diffs, so maybe this was never actually supported? Most occurrences use mreg or preg for A, only allowing pairs of registers, excluding pairs of a register and a stack slot.

In addressing modes for load and store instructions, the offset must be a multiple of the memory size being accessed. When accessing global variables, this may not be the case if the alignment of the variable is less than its size. Errors occur at link time. This PR extends the check for a representable offset for the addressing of global variables to also check whether the variable is correctly aligned. Only if both conditions are met can we generate the short sequence Padrp / ADadr. Otherwise we go through the generic loadsymbol sequence.

gergo- added 5 commits October 6, 2017 11:14

Add LTLtyping and enforce well-typed LTL in Allocation

993491f

Adapt LTL typing proof for all architectures

bfddee5

Unify register and stack cases in Locmap.set

2bef5ed

Locmap.set now uniformly uses `Val.load_result` to model stores to registers and to stack slots equivalently.

xavierleroy force-pushed the master branch from aa95f89 to 0aeff47 Compare November 7, 2020 14:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Type checking and preservation proof for LTL #206

Type checking and preservation proof for LTL #206

gergo- commented Oct 6, 2017

xavierleroy commented Oct 9, 2017

gergo- commented Oct 11, 2017

Type checking and preservation proof for LTL #206

Are you sure you want to change the base?

Type checking and preservation proof for LTL #206

Conversation

gergo- commented Oct 6, 2017

xavierleroy commented Oct 9, 2017

gergo- commented Oct 11, 2017