Skip to content

Commit

Permalink
Record pattern compilation (#1816)
Browse files Browse the repository at this point in the history
* Minor rewording of module documentation

* Basics of pattern compilation

This PR implements core functions to compile patterns and match
expressions to simpler Nickel code that doesn't use patterns at all.
This is a step toward allowing general pattern in match expressions.

For now, this new implementation isn't actually used yet as it lacks a
couple new primitive operations to work properly. The implementation
also ignores contracts annotation in the pattern.

* Add %field_is_defined%

To safely check if a pattern matches a value (safely meaning without
throwing an uncatchable error), we need a primitive operation to check
if we can safely access the field of a record. To do so, we need the
field to exists _and_ to have a definition. This is what the added
`field_is_defined` checks.

* Update pattern compilation to use field_is_defined

* Use new with_env primop for pattern compilation

* Add non exhaustive match error

As match expressions are going to handle more general patterns, the
existing NonExhaustiveMatch error - which was in fact specialized for
enums - is renamed to NonExhaustiveMatchEnum, and the previous name is
reused for a more general errors which doesn't assume that the pattern
or the matched value are enums.

* Fix bugs in pattern compil. and associated primops

* Correctly handle record pattern tail

The partial implementation of record pattern compilation didn't handle
the record pattern tail yet: for example, a closed pattern `{foo, bar}`
would match a larger record `{foo=1, bar=2, baz=3}`.

This commit additionally keeps the current rest of the record being
matched, that is the original record minus all the matched fields, to
either check that it is empty at the end (closed pattern), or to bind
this rest to an identifier if the pattern captures the rest with `{foo,
bar, ..rest}`.

* Fix record pattern compilation

After doing some experimentation, fix a number of bugs observed in the
compilation of record patterns: wrong order of arguments, wrong
identifiers used, clash of the special rest field between patterns and
sub-patterns, and so on.

* with_env -> pattern_branch

Rename the `with_env` primop to `pattern_branch` in order to avoid
confusion about its scope and usage: this shouldn't be used as a general
environment augmentation operation, although it kinda is, because of
existing footguns.

* pattern.rs -> pattern/mod.rs

* split compile mod from pattern mod in its own file

* Fix clippy and rustdoc warnings, remove unneeded code

* Implement reviewer suggestions

Co-authored-by: jneem <[email protected]>

---------

Co-authored-by: jneem <[email protected]>
  • Loading branch information
yannham and jneem authored Feb 15, 2024
1 parent 0768f05 commit f7b50d3
Show file tree
Hide file tree
Showing 9 changed files with 626 additions and 6 deletions.
30 changes: 27 additions & 3 deletions core/src/error/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -147,15 +147,23 @@ pub enum EvalError {
},
/// A non-equatable term was compared for equality.
EqError { eq_pos: TermPos, term: RichTerm },
/// A value didn't match any branch of a `match` expression at runtime.
NonExhaustiveMatch {
/// A value didn't match any branch of a `match` expression at runtime. This is a specialized
/// version of [Self::NonExhaustiveMatch] when all branches are enum patterns. In this case,
/// the error message is more informative than the generic one.
NonExhaustiveEnumMatch {
/// The list of expected patterns. Currently, those are just enum tags.
expected: Vec<LocIdent>,
/// The original term matched
found: RichTerm,
/// The position of the `match` expression
pos: TermPos,
},
NonExhaustiveMatch {
/// The original term matched.
value: RichTerm,
/// The position of the `match` expression
pos: TermPos,
},
/// Tried to query a field of something that wasn't a record.
QueryNonRecord {
/// Position of the original unevaluated expression.
Expand Down Expand Up @@ -1270,7 +1278,7 @@ impl IntoDiagnostics<FileId> for EvalError {
.with_message("cannot compare values for equality")
.with_labels(labels)]
}
EvalError::NonExhaustiveMatch {
EvalError::NonExhaustiveEnumMatch {
expected,
found,
pos,
Expand Down Expand Up @@ -1303,6 +1311,22 @@ impl IntoDiagnostics<FileId> for EvalError {
"But it has been applied to an argument which doesn't match any of those patterns".to_owned(),
])]
}
EvalError::NonExhaustiveMatch { value, pos } => {
let mut labels = Vec::new();

if let Some(span) = pos.into_opt() {
labels.push(primary(&span).with_message("in this match expression"));
}

labels.push(
secondary_term(&value, files)
.with_message("this value doesn't match any branch"),
);

vec![Diagnostic::error()
.with_message("unmatched pattern")
.with_labels(labels)]
}
EvalError::IllegalPolymorphicTailAccess {
action,
label: contract_label,
Expand Down
59 changes: 58 additions & 1 deletion core/src/eval/operation.rs
Original file line number Diff line number Diff line change
Expand Up @@ -352,7 +352,7 @@ impl<R: ImportResolver, C: Cache> VirtualMachine<R, C> {
env: cases_env,
})
.or(default)
.ok_or_else(|| EvalError::NonExhaustiveMatch {
.ok_or_else(|| EvalError::NonExhaustiveEnumMatch {
expected: cases.keys().copied().collect(),
found: RichTerm::new(Term::Enum(*en), pos),
pos: pos_op_inh,
Expand Down Expand Up @@ -1159,6 +1159,47 @@ impl<R: ImportResolver, C: Cache> VirtualMachine<R, C> {
pos_op_inh,
)))
}
UnaryOp::PatternBranch() => {
// The continuation, that we must evaluate in the augmented environment.
let (mut cont, _) = self
.stack
.pop_arg(&self.cache)
.ok_or_else(|| EvalError::NotEnoughArgs(2, String::from("with_env"), pos_op))?;

match_sharedterm!(match (t) {
Term::Record(data) => {
for (id, field) in data.fields {
if let Some(value) = field.value {
match_sharedterm!(match (value.term) {
Term::Closure(idx) => {
cont.env.insert(id.ident(), idx);
}
_ => {
cont.env.insert(
id.ident(),
self.cache.add(
Closure {
body: value,
env: env.clone(),
},
BindingType::Normal,
),
);
}
});
} else {
// This should not really happen, as `with_env` is intended to be
// used with very simple records: no metadata, no recursive fields,
// no field without definition, etc.
debug_assert!(false);
}
}

Ok(cont)
}
_ => Err(mk_type_error!("with_env", "Record")),
})
}
}
}

Expand Down Expand Up @@ -1755,6 +1796,22 @@ impl<R: ImportResolver, C: Cache> VirtualMachine<R, C> {
}
_ => Err(mk_type_error!("has_field", "String", 1, t1, pos1)),
}),
BinaryOp::FieldIsDefined(op_kind) => match_sharedterm!(match (t1) {
Term::Str(id) => {
if let Term::Record(record) = &*t2 {
Ok(Closure::atomic_closure(RichTerm::new(
Term::Bool(matches!(
record.fields.get(&LocIdent::from(id.into_inner())),
Some(field @ Field { value: Some(_), ..}) if matches!(op_kind, RecordOpKind::ConsiderAllFields) || !field.is_empty_optional()
)),
pos_op_inh,
)))
} else {
Err(mk_type_error!("field_is_defined", "Record", 2, t2, pos2))
}
}
_ => Err(mk_type_error!("field_is_defined", "String", 1, t1, pos1)),
}),
BinaryOp::ArrayConcat() => match_sharedterm!(match (t1) {
Term::Array(ts1, attrs1) => match_sharedterm!(match (t2) {
Term::Array(ts2, attrs2) => {
Expand Down
5 changes: 5 additions & 0 deletions core/src/parser/grammar.lalrpop
Original file line number Diff line number Diff line change
Expand Up @@ -1000,6 +1000,8 @@ BOpPre: BinaryOp = {
"go_field" => BinaryOp::GoField(),
"has_field" => BinaryOp::HasField(RecordOpKind::IgnoreEmptyOpt),
"has_field_all" => BinaryOp::HasField(RecordOpKind::ConsiderAllFields),
"field_is_defined" => BinaryOp::FieldIsDefined(RecordOpKind::IgnoreEmptyOpt),
"field_is_defined_all" => BinaryOp::FieldIsDefined(RecordOpKind::ConsiderAllFields),
"elem_at" => BinaryOp::ArrayElemAt(),
"hash" => BinaryOp::Hash(),
"serialize" => BinaryOp::Serialize(),
Expand Down Expand Up @@ -1239,6 +1241,8 @@ extern {

"has_field" => Token::Normal(NormalToken::HasField),
"has_field_all" => Token::Normal(NormalToken::HasFieldAll),
"field_is_defined" => Token::Normal(NormalToken::FieldIsDefined),
"field_is_defined_all" => Token::Normal(NormalToken::FieldIsDefinedAll),
"map" => Token::Normal(NormalToken::Map),
"generate" => Token::Normal(NormalToken::ArrayGen),
"elem_at" => Token::Normal(NormalToken::ElemAt),
Expand Down Expand Up @@ -1278,6 +1282,7 @@ extern {
"enum_unwrap_variant" => Token::Normal(NormalToken::EnumUnwrapVariant),
"enum_is_variant" => Token::Normal(NormalToken::EnumIsVariant),
"enum_get_tag" => Token::Normal(NormalToken::EnumGetTag),
"pattern_branch" => Token::Normal(NormalToken::PatternBranch),

"{" => Token::Normal(NormalToken::LBrace),
"}" => Token::Normal(NormalToken::RBrace),
Expand Down
7 changes: 7 additions & 0 deletions core/src/parser/lexer.rs
Original file line number Diff line number Diff line change
Expand Up @@ -270,6 +270,10 @@ pub enum NormalToken<'input> {
RecForceOp,
#[token("%rec_default%")]
RecDefaultOp,
#[token("%field_is_defined%")]
FieldIsDefined,
#[token("%field_is_defined_all%")]
FieldIsDefinedAll,

#[token("merge")]
Merge,
Expand Down Expand Up @@ -342,6 +346,9 @@ pub enum NormalToken<'input> {
#[token("%eval_nix%")]
EvalNix,

#[token("%pattern_branch%")]
PatternBranch,

#[token("{")]
LBrace,
#[token("}")]
Expand Down
31 changes: 30 additions & 1 deletion core/src/term/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -618,6 +618,15 @@ impl fmt::Display for MergePriority {
}
}

/// Content of a match expression.
#[derive(Debug, PartialEq, Clone)]
pub struct MatchData {
/// Branches of the match expression, where the first component is the pattern on the left hand
/// side of `=>` and the second component is the body of the branch.
pub branches: Vec<(Pattern, RichTerm)>,
pub default: Option<RichTerm>,
}

/// A type or a contract together with its corresponding label.
#[derive(Debug, PartialEq, Clone)]
pub struct LabeledType {
Expand Down Expand Up @@ -1370,6 +1379,20 @@ pub enum UnaryOp {
EnumIsVariant(),
/// Extract the tag from an enum tag or an enum variant.
EnumGetTag(),

/// Take a record representing bindings to be added to the local environment and proceed to
/// evaluate a pattern branch given as a second argument (which isn't a proper primop argument
/// but is stored on the stack) in its environment augmented with the bindings.
///
/// [Self::PatternBranch] isn't specific to pattern branches: what it does is to take a set of
/// extra bindings and a term, and run the term in the augmented environment. While it could
/// useful to implement other operations, it would be fragile as a generic `with_env` operator,
/// because the term to be run must not be burried into a closure, or the environment
/// augmentation would be shallow and have no effect on the actual content of the term (we have
/// the same kind of constraints when updating record fields with the recursive environment of
/// a record, for example). This is why the name tries to make it clear that it shouldn't be
/// used blindly for something else.
PatternBranch(),
}

impl fmt::Display for UnaryOp {
Expand Down Expand Up @@ -1425,6 +1448,8 @@ impl fmt::Display for UnaryOp {
EnumUnwrapVariant() => write!(f, "enum_unwrap_variant"),
EnumIsVariant() => write!(f, "enum_is_variant"),
EnumGetTag() => write!(f, "enum_get_tag"),

PatternBranch() => write!(f, "with_env"),
}
}
}
Expand Down Expand Up @@ -1581,6 +1606,9 @@ pub enum BinaryOp {
/// Test if a record has a specific field.
HasField(RecordOpKind),

/// Test if the field of a record exists and has a definition.
FieldIsDefined(RecordOpKind),

/// Concatenate two arrays.
ArrayConcat(),

Expand Down Expand Up @@ -1675,6 +1703,8 @@ impl fmt::Display for BinaryOp {
DynAccess() => write!(f, "dyn_access"),
HasField(RecordOpKind::IgnoreEmptyOpt) => write!(f, "has_field"),
HasField(RecordOpKind::ConsiderAllFields) => write!(f, "has_field_all"),
FieldIsDefined(RecordOpKind::IgnoreEmptyOpt) => write!(f, "field_is_defined"),
FieldIsDefined(RecordOpKind::ConsiderAllFields) => write!(f, "field_is_defined_all"),
ArrayConcat() => write!(f, "array_concat"),
ArrayElemAt() => write!(f, "elem_at"),
Merge(_) => write!(f, "merge"),
Expand Down Expand Up @@ -2499,7 +2529,6 @@ pub mod make {
Term::LetPattern(pat.into(), t1.into(), t2.into()).into()
}

#[cfg(test)]
pub fn if_then_else<T1, T2, T3>(cond: T1, t1: T2, t2: T3) -> RichTerm
where
T1: Into<RichTerm>,
Expand Down
Loading

0 comments on commit f7b50d3

Please sign in to comment.