Skip to content

Commit

Permalink
Uniformize destruct and pattern matching logic (tweag#1907)
Browse files Browse the repository at this point in the history
* Uniformize destruct and pattern matching logic

When patterns were first introduced, only let-destructuring existed, and
match expressions weren't a thing. Rather than to manually generate code
that would check that a pattern does match the destructured value, we
piggy-backed on contracts instead, which was easier and provided better
error messages. Up to now, this was still how destructuring worked: a
contract is elaborated from the pattern and applied to the matched
value, and should report any error if the value doesn't match. Then,
destructuring is desugared to a series of let-bindings that assume that
the destucture value has the right shape. The generated code is thus
quite simple, as it doesn't have any error handling to do.

When pattern matching landed, we kept the old destructuring
infrastructure, because it was there and worked well. However, there is
duplication: the compilation of match expressions does a work quite
similar to destructuring.

What's more, we're at risk of having a diverging semantics. Some
questions aren't trivial to answer for destructuring, in particular with
respect to lazyness: should `let _ = x in null` force `x`? What about
`let 'Foo _ = x in null`, or `let 'Foo 5 = x in null`?

After some discussion, we decided to give destructuring a simple
semantics in term of pattern matching: `let <pat> = <matched> in <body>`
should be strictly equivalent to `<matched> |> match { <pat> => <body>
}`. Indeed, `match` has a clear semantics around forcing values in lazy
languages. This also agrees with intuition, or at least gives a
reasonable semantics for most cases.

This commit implements this change in practice, by getting rid of the
ad-hoc desugaring of destructuring and rewriting let-destructuring to
actual pattern matching instead.

* Update core/src/term/pattern/compile.rs

Co-authored-by: jneem <[email protected]>

---------

Co-authored-by: jneem <[email protected]>
  • Loading branch information
yannham and jneem authored May 6, 2024
1 parent d1b42df commit 07cd719
Show file tree
Hide file tree
Showing 22 changed files with 97 additions and 376 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -2,17 +2,11 @@
source: cli/tests/snapshot/main.rs
expression: err
---
error: contract broken by a value
extra field `b`
┌─ [INPUTS_PATH]/errors/destructuring_closed_fail.ncl:3:11
error: unmatched pattern
┌─ [INPUTS_PATH]/errors/destructuring_closed_fail.ncl:3:5
3let {a} = {a=1, b=2}
--- ^^^^^^^^^^ applied to this expression
│ │
expected type
= Have you misspelled a field?
= The record contract might also be too strict. By default, record contracts exclude any field which is not listed.
Append `, ..` at the end of the record contract, as in `{some_field | SomeContract, ..}`, to make it accept extra fields.


^^^^^^^^^^^^^^^^
│ │ │
│ │ this value doesn't match any branch
in this match expression
7 changes: 4 additions & 3 deletions core/src/parser/grammar.lalrpop
Original file line number Diff line number Diff line change
Expand Up @@ -404,10 +404,11 @@ RecordField: FieldDef = {
<l: @L> <path: FieldPath> <ann: FieldAnnot<Type>?> <r: @R> <t: ("=" <Term>)?> => {
let annot_span = mk_span(src_id, l, r);
let span = if let Some(ref value) = t {
// value.pos.unwrap(): the position of `t` is necessarily set by the <Term> rule
// value.pos.unwrap(): the position of `t` is necessarily set by
// the <Term> rule
// result unwrap(): the term and the annotation have spans
// coming from the same file (the current one)
RawSpan::fuse(annot_span, value.pos.unwrap()).unwrap()
// coming from the same file (the current one)
annot_span.fuse(value.pos.unwrap()).unwrap()
} else {
annot_span
};
Expand Down
9 changes: 3 additions & 6 deletions core/src/parser/uniterm.rs
Original file line number Diff line number Diff line change
Expand Up @@ -374,12 +374,9 @@ impl UniRecord {
tail: Box::new(tail),
})),
_ => {
// Position of identifiers must always be set at this stage
// (parsing)
let span_id = id.pos.unwrap();
let term_pos = field_def.pos.into_opt().unwrap_or(span_id);
Err(InvalidRecordTypeError::InvalidField(
RawSpan::fuse(span_id, term_pos).unwrap(),
// Position of identifiers must always be set at this stage (parsing)
id.pos.fuse(field_def.pos).unwrap(),
))
}
}
Expand Down Expand Up @@ -418,7 +415,7 @@ impl UniRecord {
FieldPathElem::Ident(id) => id.pos.unwrap(),
FieldPathElem::Expr(rt) => rt.pos.unwrap(),
})
.reduce(|acc, span| RawSpan::fuse(acc, span).unwrap_or(acc))
.reduce(|acc, span| acc.fuse(span).unwrap_or(acc))
// We already checked that the path is non-empty.
.unwrap();

Expand Down
2 changes: 1 addition & 1 deletion core/src/parser/utils.rs
Original file line number Diff line number Diff line change
Expand Up @@ -150,7 +150,7 @@ impl FieldDef {

// `RawSpan::fuse` only returns `None` when the two spans are in different files.
// A record field and its value *must* be in the same file, so this is safe.
let pos = TermPos::Original(RawSpan::fuse(id_span, acc_span).unwrap());
let pos = TermPos::Original(id_span.fuse(acc_span).unwrap());

match path_elem {
FieldPathElem::Ident(id) => {
Expand Down
40 changes: 34 additions & 6 deletions core/src/position.rs
Original file line number Diff line number Diff line change
Expand Up @@ -31,13 +31,13 @@ pub struct RawSpan {

impl RawSpan {
/// Fuse two spans if they are from the same source file. The resulting span is the smallest
/// span that contain both `span1` and `span2`.
pub fn fuse(span1: RawSpan, span2: RawSpan) -> Option<RawSpan> {
if span1.src_id == span2.src_id {
/// span that contain both `self` and `other`.
pub fn fuse(self, other: RawSpan) -> Option<RawSpan> {
if self.src_id == other.src_id {
Some(RawSpan {
src_id: span1.src_id,
start: min(span1.start, span2.start),
end: max(span1.end, span2.end),
src_id: self.src_id,
start: min(self.start, other.start),
end: max(self.end, other.end),
})
} else {
None
Expand Down Expand Up @@ -140,6 +140,34 @@ impl TermPos {
pub fn contains(&self, pos: RawPos) -> bool {
self.as_opt_ref().map_or(false, |sp| sp.contains(pos))
}

/// Fuse two positions if they are from the same source file.
///
/// - If both positions are defined and from the same file, the resulting position is the
/// smallest span that contain both.
/// - If both positions are defined but aren't from the same file, this returns `TermPos::None`
/// - If at most one position is defined, the other is returned (whether defined or not).
pub fn fuse(self, other: Self) -> Self {
match (self, other) {
(TermPos::Original(sp1), TermPos::Original(sp2)) => {
if let Some(span) = sp1.fuse(sp2) {
TermPos::Original(span)
} else {
TermPos::None
}
}
(TermPos::Inherited(sp1), TermPos::Inherited(sp2))
| (TermPos::Original(sp1), TermPos::Inherited(sp2))
| (TermPos::Inherited(sp1), TermPos::Original(sp2)) => {
if let Some(span) = sp1.fuse(sp2) {
TermPos::Inherited(span)
} else {
TermPos::None
}
}
(TermPos::None, maybe_def) | (maybe_def, TermPos::None) => maybe_def,
}
}
}

/// A natural ordering for positions: `p1` is smaller than `p2` if they are located in the same
Expand Down
8 changes: 4 additions & 4 deletions core/src/term/pattern/compile.rs
Original file line number Diff line number Diff line change
Expand Up @@ -17,9 +17,9 @@
//! to handle before that). We might revisit this in the future if pattern matching turns out to be
//! a bottleneck.
//!
//! Most build blocks are generated programmatically rather than written out as e.g. members of the
//! [internals] stdlib module. While clunkier, this lets more easily change the compilation
//! strategy in the future and is already a more efficient in the current setting (combining
//! Most building blocks are generated programmatically rather than written out as e.g. members of
//! the [crate::stdlib::internals] module. While clunkier, this makes it easier to change
//! the compilation strategy in the future and is more efficient in the current setting (combining
//! building blocks from the standard library would require much more function applications, while
//! we can generate inlined versions on-the-fly here).
use super::*;
Expand Down Expand Up @@ -184,7 +184,7 @@ fn update_with_merge(record_id: LocIdent, id: LocIdent, field: Field) -> RichTer
// We fuse all the definite spans together.
// unwrap(): all span should come from the same file
// unwrap(): we hope that at least one position is defined
.reduce(|span1, span2| crate::position::RawSpan::fuse(span1, span2).unwrap())
.reduce(|span1, span2| span1.fuse(span2).unwrap())
.unwrap();

let merge_label = MergeLabel {
Expand Down
130 changes: 3 additions & 127 deletions core/src/term/pattern/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -2,18 +2,13 @@
use std::collections::{hash_map::Entry, HashMap};

use super::{
record::{Field, RecordAttrs, RecordData},
LabeledType, NickelString, Number, RichTerm, Term, TypeAnnotation,
record::{Field, RecordData},
NickelString, Number, RichTerm, TypeAnnotation,
};

use crate::{
error::EvalError,
identifier::LocIdent,
impl_display_from_pretty, mk_app,
parser::error::ParseError,
error::EvalError, identifier::LocIdent, impl_display_from_pretty, parser::error::ParseError,
position::TermPos,
stdlib::internals,
typ::{Type, TypeF},
};

pub mod compile;
Expand Down Expand Up @@ -188,125 +183,6 @@ impl RecordPattern {
}
}

impl FieldPattern {
/// Convert this field pattern to a record field binding with metadata. Used to generate the
/// record contract associated to a record pattern.
pub fn as_record_binding(&self) -> (LocIdent, Field) {
let mut annotation = self.annotation.clone();
// If the inner pattern gives rise to a contract, add it the to the field decoration.
annotation
.contracts
.extend(self.pattern.elaborate_contract());

(self.matched_id, Field::from(annotation))
}
}

// We don't implement elaborate_contract for `FieldPattern`, which is of a slightly different
// nature (it's a part of a record pattern). Instead, we call to `FieldPattern::as_record_binding`,
// which takes care of elaborating a field pattern to an appropriate record field.
pub trait ElaborateContract {
/// Elaborate a contract from this pattern. The contract will check both the structure of the
/// matched value (e.g. the presence of fields in a record) and incoporate user-provided
/// contracts and annotations, as well as default values.
///
/// Some patterns don't give rise to any contract (e.g. `Any`), in which case this function
/// returns `None`.
fn elaborate_contract(&self) -> Option<LabeledType>;
}

impl ElaborateContract for PatternData {
fn elaborate_contract(&self) -> Option<LabeledType> {
match self {
PatternData::Wildcard | PatternData::Any(_) => None,
PatternData::Record(pat) => pat.elaborate_contract(),
PatternData::Enum(pat) => pat.elaborate_contract(),
PatternData::Constant(pat) => pat.elaborate_contract(),
}
}
}

impl ElaborateContract for Pattern {
fn elaborate_contract(&self) -> Option<LabeledType> {
self.data.elaborate_contract()
}
}

// Generate the contract `std.contract.Equal <value>` as a labeled type.
fn contract_eq(value: Term, span: crate::position::RawSpan) -> LabeledType {
let contract = mk_app!(internals::stdlib_contract_equal(), value);

let typ = Type {
typ: TypeF::Flat(contract),
pos: span.into(),
};

LabeledType::new(typ, span)
}

impl ElaborateContract for ConstantPattern {
fn elaborate_contract(&self) -> Option<LabeledType> {
// See [^unwrap-span].
let span = self.pos.unwrap();

Some(match &self.data {
ConstantPatternData::Bool(b) => contract_eq(Term::Bool(*b), span),
ConstantPatternData::String(s) => contract_eq(Term::Str(s.clone()), span),
ConstantPatternData::Number(n) => contract_eq(Term::Num(n.clone()), span),
ConstantPatternData::Null => contract_eq(Term::Null, span),
})
}
}

impl ElaborateContract for EnumPattern {
fn elaborate_contract(&self) -> Option<LabeledType> {
// TODO[adts]: it would be better to simply build a type like `[| 'tag arg |]` or `[| 'tag
// |]` and to rely on its derived contract. However, for the time being, the contract
// derived from enum variants isn't implemented yet.
let contract = if self.pattern.is_some() {
mk_app!(internals::enum_variant(), Term::Enum(self.tag))
} else {
mk_app!(internals::stdlib_contract_equal(), Term::Enum(self.tag))
};

let typ = Type {
typ: TypeF::Flat(contract),
pos: self.pos,
};

// [^unwrap-span]: We need the position to be defined here. Hopefully,
// contract-generating pattern are pattern used in destructuring, and destructuring
// patterns aren't currently generated by the Nickel interpreter. So we should only
// encounter user-written patterns here, which should have a position.
Some(LabeledType::new(typ, self.pos.unwrap()))
}
}

impl ElaborateContract for RecordPattern {
fn elaborate_contract(&self) -> Option<LabeledType> {
let typ = Type {
typ: TypeF::Flat(
Term::Record(RecordData::new(
self.patterns
.iter()
.map(FieldPattern::as_record_binding)
.collect(),
RecordAttrs {
open: self.is_open(),
..Default::default()
},
None,
))
.into(),
),
pos: self.pos,
};

// unwrap(): see [^unwrap-span]
Some(LabeledType::new(typ, self.pos.unwrap()))
}
}

impl_display_from_pretty!(PatternData);
impl_display_from_pretty!(Pattern);
impl_display_from_pretty!(ConstantPatternData);
Expand Down
Loading

0 comments on commit 07cd719

Please sign in to comment.