Mutable Value Semantics
ABSTRACT Mutable value semantics is a programming discipline that upholds the independence of values to support local
reasoning. In the discipline’s strictest form, references become second-class citizens: they are only created implicitly, at function
boundaries, and cannot be stored in variables or object fields. Hence, variables can never share mutable state. Unlike pure
functional programming, however, mutable value semantics allows part-wise in-place mutation, thereby eliminating the memory
traffic usually associated with functional updates of immutable data.
This paper presents implementation strategies for compiling programs with mutable value semantics into efficient native code.
We study Swift, a programming language based on that discipline, through the lens of a core language that strips some of
Swift’s features to focus on the semantics of its value types. The strategies that we introduce leverage the inherent properties of
mutable value semantics to unlock aggressive optimizations. Fixed-size values are allocated on the stack, thereby enabling
numerous off-the-shelf compiler optimizations, while dynamically sized containers use copy-on-write to mitigate copying costs.
KEYWORDS Mutable value semantics, local reasoning, memory safety, borrowing, copy-on-write, compilation, optimizations.
An AITO publication
freedom to write efficient and type-safe programs. These ideas, however, invariably complicate type systems with mechanisms like named lifetimes, which significantly raise the barrier to entry for inexperienced developers (Turner 2017).

    fn longer_of(x: String, y: String) -> String {
        if x.len() > y.len() { x } else { y }
    }

    fn report_longest(x: String, y: String) {
        let z = longer_of(x, y);
        println!("longest of {:?} and {:?} is {:?}",
                 x, y, z); // <- error
    }

Consider the Rust (Matsakis & Klock 2014) example above. A simple function longer_of returns the longer of two character strings. Its caller, report_longest, also simple, is ill-typed. The compiler complains that x and y have been moved (into the call to longer_of), and can no longer be used. Ending variable lifetimes early is part of Rust’s strategy for ensuring memory safety without creating expensive copies at function call boundaries. These goals are difficult to reconcile, so it’s perhaps not surprising that simple-looking code exposes language complexity.

Mutable value semantics (MVS) sits at a third point in the design space where both goals are satisfied and mutation is supported, without the complexity inherent to flow-sensitive type systems. The key to this balance is simple: MVS does not surface references as a first-class concept in the programming model. As such, they can neither be assigned to a variable nor stored in object fields, and all values form disjoint topological trees rooted in the program’s variables.

The reader may justifiably wonder whether a discipline with these restrictions is expressive enough to write efficient, nontrivial programs. We note that a large body of software projects across multiple languages already answers this question empirically, such as the Boost Graph Library (Siek et al. 2002), a collection of generic components (Stepanov & Rose 2014) for computations on graphs in C++, and Swift for TensorFlow (Saeta et al. 2021), a high-performance platform for machine learning. Further, we observe that well-established programming languages have adopted MVS at the core of their semantics, such as R (R Core Team 2020) and Swift (Apple Inc. 2021), for safety and/or efficiency.

In more detail, Swift is a modern general-purpose programming language, used in a broad spectrum of applications. Its standard library offers a rich collection of reusable components based on MVS, while striving to show competitive performance against comparable libraries in programming languages such as C, C++ and Rust. The language is translated to native code using an LLVM (Lattner & Adve 2004) backend.

After a brief introduction of the core tenets of MVS (Section 2), we explore some of these implementation strategies and make the following contributions:
– We propose a core language called Swiftlet, a subset of Swift focused exclusively on MVS. We introduce Swiftlet through a series of examples (Section 3) and formalize its semantics (Section 4).
– We discuss a compiler for Swiftlet that supports the creation of zero-cost abstractions (Section 5).
– We present a handful of compiler optimizations relying on local reasoning and leveraging runtime knowledge to elide unnecessary copies (Section 6).
– We report performance measurements on handwritten and randomly generated programs with varying numbers of mutating operations, comparing results between Swift, Swiftlet, Scala, and C++ to demonstrate the benefits of MVS over functional updates (Section 7).

2. Mutable value semantics
Before we delve deeper into the details of a language implementation, we ought to define precisely what MVS is and how it differs from the more widespread reference semantics.

2.1. Primitive and compound types
Popular object-oriented programming languages have converged on a common mutation model that distinguishes between so-called “primitive” or “built-in” types (typically numeric types) and “compound” types (typically arrays and classes). Variables of primitive types are independent: the value assigned to a variable of a primitive type cannot change due to an operation on another variable in the program. In contrast, variables of compound type may share state with other variables. Hence, the value assigned to a variable of a compound type can change due to an operation on another variable.

    1  class Vec2 { int x, y; }           // Primitive fields
    2  class Rect { Vec2 pos, dim; }      // Compound fields
    3
    4  int i1 = 2;                        // Same pattern with
    5  Vec2 v1 = new Vec2(i1, i1);        // - int: primitive
    6  Rect r1 = new Rect(v1, v1);        // - Vec2: compound
    7  Rect r2 = r1;
    8
    9  r2.dim.x += 4;                     // Mutates r2
    10 System.out.println(r1.pos.x);      // Now 6: r1 changed
    11 System.out.println(r1.pos.y);      // 2: no change

Listing 1 Compound types in Java have reference semantics

Consider the Java program above, which illustrates the distinction in full detail. In lines 1 and 2, it declares compound types Vec2 and Rect, representing 2D vectors and rectangles, respectively. In line 5, both primitive-type fields of v1 are initialized to the value of the same variable. In line 6, both compound-type fields of r1 are initialized with v1, causing r1.pos to share state with r1.dim. In line 7, assignment causes r1 to share state with r2. After line 7, the contents of the program’s memory can be depicted as in Figure 1a.

Line 9 performs a mutation, growing the x dimension of r2 by 4. Line 10 shows that the mutation has had a non-local effect, changing r1.pos, a distinct variable of compound type. Line 11 shows that, by contrast, the mutation has not changed the value of the field r1.pos.y, a distinct variable of primitive type, initialized in the same way.

This difference in behavior demonstrates that Java has two different kinds of mutation semantics: one for “primitive” types and another for “compound” types.
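The sharing described for Listing 1 is not specific to Java. As a minimal sketch, the same behavior can be reproduced in Swift by declaring the types as classes, which have reference semantics. The names Vec2Ref and RectRef are hypothetical and chosen only to avoid confusion with the value types used elsewhere in the paper.

```swift
// Hypothetical names: Vec2Ref and RectRef do not appear in the paper.
final class Vec2Ref {
    var x: Int
    var y: Int
    init(x: Int, y: Int) { self.x = x; self.y = y }
}

final class RectRef {
    var pos: Vec2Ref
    var dim: Vec2Ref
    init(pos: Vec2Ref, dim: Vec2Ref) { self.pos = pos; self.dim = dim }
}

let i1 = 2
let v1 = Vec2Ref(x: i1, y: i1)       // both Int fields hold independent copies of 2
let r1 = RectRef(pos: v1, dim: v1)   // pos and dim refer to the same object
let r2 = r1                          // r2 refers to the same RectRef as r1

r2.dim.x += 4                        // mutates the single shared Vec2Ref
print(r1.pos.x)                      // 6: the change is visible through r1.pos
print(r1.pos.y)                      // 2: the Int field y was not touched
print(r1 === r2, r1.pos === r1.dim)  // true true: identity confirms the aliasing
```

The identity operator === is only available on class instances, which is itself a sign that aliasing is a property of reference types rather than of compound types in general.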
2 Racordon et al.
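[Figure 1: (a) reference semantics: r1 and r2 refer to a single Rect whose pos and dim both refer to one Vec2 with x = 2 and y = 2; (b) value semantics: r1 and r2 are separate Rect boxes, each nesting its own pos and dim of type Vec2, with every x and y of type Int equal to 2.]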
Figure 1 Contents of the memory of a program involving compound types. Arrows represent references and boxes represent
whole/part relationships.
2.2. Value and reference semantics
We can decouple these two mutation semantics from the question of whether a type is “primitive” or “compound”. In fact, one could argue that it makes more sense for a notional vector value like Vec2 to behave just like a scalar int. A more general distinction separates types with reference semantics, which behave like Java’s compound types, from those with value semantics, which behave like Java’s primitive types. Conceptually, two variables of a reference type can share mutable state, but two variables of a value type cannot.

Because the difference in behavior always involves mutation, an immutable type can be said to have value semantics trivially. Mutation by assigning a whole new value to a variable can be rewritten as binding a new variable, with no mutation, so is trivially equivalent. Therefore, to distinguish the nontrivial cases of interest, we say that a value type has mutable value semantics when its parts can be mutated in-place, without reassigning a variable of the type.

    1  struct Vec2 { var x: Int, y: Int }
    2  struct Rect { var pos: Vec2, dim: Vec2 }
    3
    4  var i1 = 2
    5  var v1 = Vec2(x: i1, y: i1)
    6  var r1 = Rect(pos: v1, dim: v1)
    7  var r2 = r1
    8
    9  r2.dim.x += 4        // Mutates r2
    10 print(r1.pos.x)      // Prints 2: r1 unchanged
    11 print(r1.pos.y)      // Prints 2

Listing 2 Swift has compound types with value semantics

Consider the Swift program above, a direct transliteration of the Java code from Listing 1, but using only types with mutable value semantics. In Swift, structs are value types, so after line 6, r1.pos and r1.dim do not share state.

Figure 1b depicts the contents of the program’s memory after line 7. Note that we use nesting rather than arrows to represent relationships between values and their parts, because values never share parts. Unlike in Java, the dimensions of r2 are independent from those of r1. Hence, the mutation of r2 at line 9 does not propagate to r1, as shown by the print statement at line 10.

2.3. Spooky action at a distance
Our Java example demonstrates how programming with reference types implicitly introduces aliasing (a condition where two or more live variables refer to the same memory location) every time a variable is passed to a function or assigned to another variable. Unfortunately, leaving alias creation implicit in the language creates a collection of problems (Noble et al. 1998) that we informally dub spooky action at a distance.1 In short, neither humans nor machines (e.g., optimizing compilers) can reason locally about mutation semantics in the presence of aliases.

Consider the so-called “signing flaw”, a security vulnerability discovered in a previous version of the Java platform that allowed untrusted applets to escalate access into the virtual machine (Vitek & Bokowski 2001). The vulnerability was caused by a reference leak, giving the attacker the means to mutate the system’s internal list of signatures. The following snippet is a simplified excerpt of the flawed implementation:

    public class Class {
        public Identity[] getSigners() {
            return this.signers;
        }
        private final Identity[] signers;
    }

The field signers is exposed via the method getSigners. An attacker could thus obtain an alias to the list of trusted signers and alter it as they see fit. Although the field is private final and is thus neither accessible to clients nor reassignable, nothing prevents a method from accidentally leaking a reference to the object it holds, and through that reference, the list can be mutated (Potanin et al. 2013).

The standard prescription for accidental aliasing in Java is the manual insertion of defensive copies, but that is hardly an adequate cure: a missed defensive copy is a possible security vulnerability, an extra copy a source of inefficiency. Alias prevention mechanisms like ownership types (Clarke et al. 2013) and confined types (Vitek & Bokowski 2001) are safer, but require complex annotations in code. Even when applied correctly, defensive copies must be made conservatively, without dynamic knowledge of how the objects will ultimately be used (for example, whether they are eventually mutated, or even inspected) and thus impose a heavy performance tax. Below the level of the programming model, the mere possibility of mutation through a shared reference creates additional costs (Shaikhha et al. 2017).

Optimizing compilers such as GCC and LLVM go to significant lengths to “model the heap” to prove references and pointers do not alias. In cases where uniqueness cannot be

1 With apologies to Einstein.
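To make the contrast with the signing flaw concrete, here is a minimal Swift sketch of the same accessor written against an array with value semantics. The names Identity and ClassInfo are hypothetical stand-ins, not the Java platform's types, and the example only illustrates the discipline discussed in this section.

```swift
// Hypothetical stand-ins: Identity and ClassInfo are illustrative names only.
struct Identity: Equatable {
    var name: String
}

final class ClassInfo {
    private let signers: [Identity]

    init(signers: [Identity]) { self.signers = signers }

    // Returning the array hands the caller an independent value, not an alias.
    func getSigners() -> [Identity] {
        return signers
    }
}

let info = ClassInfo(signers: [Identity(name: "trusted")])

var leaked = info.getSigners()
leaked.append(Identity(name: "attacker"))   // mutates only the caller's value
leaked[0].name = "forged"

print(info.getSigners().count)     // 1
print(info.getSigners()[0].name)   // trusted
```

No defensive copy is written by hand here: because Swift arrays are value types, the accessor cannot leak a mutable alias to the stored list.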
    3    static func += (a: inout Self, b: Self) {
    4      a.x += b.x
    5      a.y += b.y
    6    }
    7  }
    8  var v1 = Vec2(x: 2, y: 2)
    9  var v2 = v1
    10 v2 += Vec2(x: 1, y: 0)
    11 print(v1) // Vec2(x: 2, y: 2)
    12 print(v2) // Vec2(x: 3, y: 2)

Semantic uniformity is a prerequisite for generic programming, the discipline of realizing algorithms and data structures so they work in the most general setting possible, without loss of efficiency (Stepanov & Rose 2014).4 Indeed, it becomes difficult to even describe the semantics of an algorithm if any part of it can have non-local effects.

3. Swiftlet
Swiftlet is a subset of Swift, focusing on value types and discarding all features that are not essential to their description. Our language only features structs (i.e., compounds of heterogeneous types), arrays (i.e., dynamically sized collections of homogeneous data), functions, and numeric (integer and floating-point) values. The result is a language whose complete operational semantics fits a single page (Section 4).

A program is described by a sequence of struct declarations, followed by a single expression denoting an entry point (i.e., the contents of the main file in a regular Swift program).

A variable is declared with the keyword var followed by a name, an optional type annotation, an initial value, and the expression in which it is bound. A constant is declared similarly, with the keyword let. Naturally, a variable can be mutated or reassigned whereas a constant cannot.

    var foo: Int = 4;
    let bar = foo;
    print(bar) // Prints 4

A struct is a compound type composed of zero or more fields. Each field is typed explicitly with an annotation and associated with a mutability qualifier (let or var) specifying whether it is constant or mutable. Fields can be of any type, but, for simplicity only, type definitions cannot be mutually recursive. Hence, all values have a finite representation.

    struct Vec2 { ... };
    var v = Vec2(x: 4, y: 2);
    print(v.y) // Prints 2

As all types have value semantics, values form disjoint topological trees rooted at variables or constants. Conceptually, an assignment is always a copy of the right operand5 and does not create an alias. In the program below, u is assigned a copy of v’s value, so the update of u’s second component in line 4 does not modify v.

    1 struct Vec2 { ... };
    2 var v = Vec2(x: 4, y: 2);
    3 var u = v;
    4 u.y = 8 // v = Vec2(x: 4, y: 2)
    5         // u = Vec2(x: 4, y: 8)

Immutability applies transitively. All fields of a struct bound to a constant are also treated as immutable by the type system, regardless of their declaration. For example, the program below is ill-typed because v.y denotes a constant, notwithstanding that field having been declared with var.

    struct Vec2 { ... };
    let v = Vec2(x: 4, y: 2);
    v.y = 8 // <- type error

Likewise, all elements of an array are constant if the array itself is bound to a constant.

    struct Vec2 { ... };
    let a = [Vec2(x: 4, y: 2), Vec2(x: 5, y: 3)];
    a[0].y = 8 // <- type error

Functions are declared with the keyword func followed by a name, a list of typed parameters, a codomain, and a body. Arguments are evaluated eagerly and passed by value. Functions are allowed to be mutually recursive.

    func fact(n: Int) -> Int {
      n > 1 ? n * fact(n: n - 1) : 1
    };
    fact(n: 6) // Evaluates to 720

To implement in-place part-wise mutation across function boundaries, a parameter’s type is marked inout, which makes the parameter mutable in the callee. Conceptually, an inout argument is copied when a function is called and copied back when that function returns.6

    1 struct Vec2 { ... };
    2 func translateX(v: inout Vec2, d: Int) -> Void {
    3   v.x = v.x + d
    4 };
    5 var v = Vec2(x: 4, y: 2);
    6 _ = translateX(v: &v, d: 6);
    7 print(v.x) // Prints 10

In the program above, the function translateX accepts an inout parameter of type Vec2, which it is allowed to mutate. The return type, Void, is Swiftlet’s unit type. The function is called at line 6, effectively mutating the value of the vector v across function boundaries. Note that the ampersand featured in the call expression is not the “address-of” operator from C/C++. Instead, it signals in code that the argument is to be mutated, conceptually “copied out” of the callee upon return.

Of course, inout extends to multiple arguments, with one important restriction: to prevent any writeback from being discarded, overlapping mutations are prohibited. In other words, inout arguments must have independent values. This Law of Exclusivity (McCall 2017) creates a crucial optimization opportunity: it is safe to sidestep the conceptual copies by allowing the callee to write the argument’s memory in the caller’s context. In other words, inout argument passing can be implemented as pass-by-reference without surfacing reference semantics in the programming model.

Note that inout parameters are reminiscent of (if not identical to) borrowing (Naden et al. 2012), as found in languages

4 Generic programming as described by its originators depends on the concept of regularity (Stepanov & McJones 2009), a refinement of value semantics.
5 We discuss how the language implementation eliminates unnecessary eager copies in Section 6.
6 The Fortran enthusiast may think of the so-called “call-by-value/return” policy.
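The conceptual "copy in, copy back" reading of inout can be sketched in full Swift (not Swiftlet) as follows. The function translatedX and its argument labels are hypothetical additions for illustration: it spells out the copies that translateX performs conceptually, and both variants are expected to produce the same result.

```swift
struct Vec2 {
    var x: Int
    var y: Int
}

// Mutates its argument in place across the call boundary.
func translateX(_ v: inout Vec2, by d: Int) {
    v.x = v.x + d
}

// Hypothetical value-returning variant: spells out the conceptual
// "copy in, copy back" model without using inout.
func translatedX(_ v: Vec2, by d: Int) -> Vec2 {
    var copy = v
    copy.x = copy.x + d
    return copy
}

var v = Vec2(x: 4, y: 2)
translateX(&v, by: 6)
print(v.x)                        // 10

var w = Vec2(x: 4, y: 2)
w = translatedX(w, by: 6)         // explicit copy out and write back
print(w.x)                        // 10

// Two inout arguments are fine when they name independent values.
var a = Vec2(x: 1, y: 1)
var b = Vec2(x: 2, y: 2)
swap(&a, &b)
print(a.x, b.x)                   // 2 1
```

Passing the same variable to both parameters of swap would be rejected by the Swift compiler as an overlapping access, which is the restriction the Law of Exclusivity describes.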
cause the pair to refer to itself, avoiding an infinite recursion in line 9. Instead, the value of p has been copied into p._2.

4. Formal definition
This section introduces Swiftlet formally. We start with its syntax and present a first description of its operational semantics in the form of big-step inference rules (a.k.a. natural semantics). This semantics is intended to describe the high-level user model and provide a formal framework for discussing optimization strategies.

Then, we present Swiftlet’s static semantics, and show how its type system guarantees uniqueness of inout arguments at function boundaries.

Although natural semantics is convenient for describing observable behaviors (Leroy & Grall 2009), its inability to distinguish failure from non-termination makes it less well-suited to the study of soundness properties. Hence, to demonstrate the guarantees provided by our static semantics, we finally present a second operational semantics in the form of small-step inference rules.

4.1. Notations
We use horizontal bar notation to denote sequences of terms. For instance, $\overline{x}$ expands to $x_1, \dots, x_k$ for some $k$. We write $\epsilon$ for the empty sequence and, for the sake of syntactic regularity, we assume that $x_1, \dots, x_k$ is an empty sequence if $k = 0$. We write $|\overline{x}|$ for the length of the sequence $\overline{x}$. We write $\overline{x : y}$ meaning $x_1 : y_1, \dots, x_k : y_k$.

For a function $f : A \to B$, $\mathrm{dom}(f)$ denotes its domain. If $f$ is a partial function, then $\mathrm{dom}(f)$ is the subset $A' \subseteq A$ for which $f$ is defined. We write $f = [\bot]_{A \to B}$ to represent a partial function $f : A \to B$ with $\mathrm{dom}(f) = \emptyset$. We write $f = [a \mapsto b]_{A \to B}$ to represent a partial function $f$ such that $f(a) = b$ with $\mathrm{dom}(f) = \{a\}$. We write $f = [a \mapsto g(a) \mid p(a)]_{A \to B}$ for the function that returns $g(a)$ for all $a \in A$ that satisfy a predicate $p$. For example, $[i \mapsto -i \mid i \in \mathbb{Z} \wedge i < 0]_{\mathbb{Z} \to \mathbb{Z}}$ denotes a function that maps each negative integer to its absolute value. We omit the subscript when the function’s domain and codomain are obvious from the context. We write $f[a \mapsto b]$ for the function that returns $b$ for $a$ and $f(x)$ for any other argument. For instance, if $f(0) = 1$ and $f(1) = 2$, then $(f[0 \mapsto 3])(0) = 3$ and $(f[0 \mapsto 3])(1) = 2$. We write $f[a \mapsto \bot]$ for the function that is not defined for $a$ and returns $f(x)$ for any other argument. Given $f : A \to B$ and $g : A \to B$, we write $f[a \mapsto_? g(a)]$ for the function $f[a \mapsto g(a)]$ if $a \in \mathrm{dom}(g)$, or $f$ otherwise.

Let $e$ be a term and $\sigma$ a set of substitutions represented as a partial function from variables to terms. We write $e[/\sigma]$ for the term obtained by applying the substitutions $\sigma$ to $e$, renaming free variables as necessary. For instance, if $e = \lambda a.\, a\, b$ and $\sigma = [b \mapsto c]$, then $e[/\sigma] = \lambda a.\, a\, c$.

4.2. Syntax
Figure 3 presents the formal syntax of Swiftlet. A program $g$ is a sequence of structure declarations followed by a single functional term acting as its entry point.10 A structure is described by a unique global name and a sequence of property declarations. A property is declared by a binding $m\ x : \tau$ where $m$ denotes its mutability, $x$ identifies its name, and $\tau$ specifies its type.

    integer  $c$
    name     $x, s$
    context  $\mu, \phi^s : X \to M \times V$
    prog.    $g ::= \overline{d}\ e$
    struct   $d ::= \mathsf{struct}\ s\ \{\ \overline{b}\ \};$
    qual.    $m ::= \mathsf{let} \mid \mathsf{var}$
    bind.    $b ::= m\ x : \tau$
    arg.     $a ::= \&r \mid e$
    expr.    $e ::= e; e \mid b = e\ \mathsf{in}\ e \mid r = e \mid [\overline{e}] \mid r \mid v \mid s(\overline{e}) \mid e(\overline{a}) \mid e\ ?\ e : e \mid e\ \mathsf{as}\ \tau \mid \mathsf{func}\ x(\overline{x : p}) \to \tau\ \{[\overline{x}]\ \mathsf{in}\ e\}\ \mathsf{in}\ e$
    param.   $p ::= \mathsf{inout}\ \tau \mid \tau$
    type     $\tau ::= (\overline{p}) \to \tau \mid [\tau] \mid s \mid \mathbb{Z} \mid \mathsf{Any} \mid ()$
    path     $r ::= e.x \mid e[e] \mid w$
    value    $v ::= \lambda(\overline{x : p}, e) \mid \phi^s \mid [\overline{v}] \mid \mathrm{box}(v) \mid c$
    lvalue   $w ::= w.x \mid w[c] \mid x$

Figure 3 Formal syntax of Swiftlet

Other types include integer (written $\mathbb{Z}$)11, homogeneous arrays (written $[\tau]$ where $\tau$ is the element type), function types (written $(\overline{p}) \to \tau$, where $\tau$ is the return type and each $p$ is a parameter type potentially qualified by inout), the existential container type (written Any), and the unit type (written $()$). Functions can be recursive (although not hoisted), but we proscribe mutually recursive type declarations. For the sake of simplicity, Swiftlet requires all named declarations (i.e., structures, properties, parameters, and local bindings) to have a unique name. This simplification does not restrict the expressiveness of our language, as name conflicts can always be eliminated via α-conversion. Further, function declarations always feature a capture list, even when it is empty.

Expressions are composed out of array literals, structure instantiations, function calls, conditionals, function declarations, binding declarations, assignments, sequences, casts, values, and paths. The latter are at the heart of mutable value semantics. In broad strokes, a path denotes access to a value or part thereof. It can be the name of a binding or any expression suffixed by either a dotted accessor (e.g., $e.n$) or a bracketed index identifying a specific element in an array (e.g., $e_1[e_2]$).

Borrowing from C parlance, path expressions starting with a name are called lvalues, as they may appear on the left hand side of an assignment. Only mutable lvalues can serve as ar-

10 Named functions are declared in the body of the entry point
11 We exclude floating-point values from the formal definition.
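As an informal illustration of the function-update notation from Section 4.1, a finite partial function can be modeled in Swift as a dictionary, with absent keys standing for "undefined". This is only a sketch under that modeling assumption; the helper updateIfDefined is a hypothetical name for the conditional update written with the question-marked arrow.

```swift
// A finite partial function from Int to Int, modeled as a dictionary.
// Absent keys play the role of "undefined".
let f: [Int: Int] = [0: 1, 1: 2]

// f[0 ↦ 3]: copy f, then override one entry. f itself is unchanged.
var updated = f
updated[0] = 3
print(updated[0]!, updated[1]!)   // 3 2
print(f[0]!)                      // 1

// f[0 ↦ ⊥]: the same function with 0 removed from its domain.
var restricted = f
restricted[0] = nil
print(restricted.keys.sorted())   // [1]

// Conditional update: override f at a only when g is defined at a.
// updateIfDefined is a hypothetical helper name.
func updateIfDefined(_ f: [Int: Int], at a: Int, from g: [Int: Int]) -> [Int: Int] {
    guard let b = g[a] else { return f }
    var result = f
    result[a] = b
    return result
}

let g: [Int: Int] = [1: 7]
print(updateIfDefined(f, at: 1, from: g)[1]!)    // 7
print(updateIfDefined(f, at: 0, from: g) == f)   // true
```

Because dictionaries are value types, each update produces an independent function and leaves f untouched, which matches the mathematical reading of the notation.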
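Judgment form: $\Delta, \mu \vdash e \Downarrow_R \mu', v$

E-NAME:
$$\dfrac{\mu(x) = m\ v}{\Delta, \mu \vdash x \Downarrow_R \mu, v}$$

E-PROP:
$$\dfrac{\Delta, \mu \vdash e \Downarrow_R \mu', \phi^s \qquad \phi^s(x) = m\ v}{\Delta, \mu \vdash e.x \Downarrow_R \mu', v}$$

E-ELEM:
$$\dfrac{\Delta, \mu \vdash e_1 \Downarrow_R \mu', [v_1, \dots, v_k] \qquad \Delta, \mu' \vdash e_2 \Downarrow_R \mu'', c \qquad 0 \le c < k}{\Delta, \mu \vdash e_1[e_2] \Downarrow_R \mu'', v_{c+1}}$$

E-INOUT:
$$\dfrac{\Delta, \mu \vdash r \Downarrow_L \mu', \mathsf{var}\ w}{\Delta, \mu \vdash \&r \Downarrow_R \mu', w}$$

E-STRUCTLIT:
$$\dfrac{\overbrace{\Delta, \mu_{i-1} \vdash e_i \Downarrow_R \mu_i, v_i}^{1 \le i \le k} \qquad \mathsf{struct}\ s\ \{m_1\ x_1 : \tau_1, \dots, m_k\ x_k : \tau_k\} \in \Delta}{\Delta, \mu_0 \vdash s(e_1, \dots, e_k) \Downarrow_R \mu_k, [x_i \mapsto m_i\ \mathrm{copy}(v_i) \mid 1 \le i \le k]^s}$$

E-ARRAYLIT:
$$\dfrac{\overbrace{\Delta, \mu_{i-1} \vdash e_i \Downarrow_R \mu_i, v_i}^{1 \le i \le k}}{\Delta, \mu_0 \vdash [e_1, \dots, e_k] \Downarrow_R \mu_k, [\mathrm{copy}(v_1), \dots, \mathrm{copy}(v_k)]}$$

E-FUNC:
$$\dfrac{\begin{array}{c} e_1' = e_1[/\sigma] \qquad \Delta, \mu[x_0 \mapsto \mathsf{let}\ \lambda(x_1 : p_1, \dots, x_k : p_k, e_1'[/\sigma'])] \vdash e_2 \Downarrow_R \mu', v \\ \sigma = [y_j \mapsto \mathrm{copy}(v_j) \mid (1 \le j \le h) \wedge \mu(y_j) = m_j\ v_j] \\ \sigma' = [x_0 \mapsto \mathsf{func}\ x_0(x_1 : p_1, \dots, x_k : p_k) \to \tau\ \{[\,]\ \mathsf{in}\ e_1'\}\ \mathsf{in}\ x_0] \end{array}}{\Delta, \mu \vdash \mathsf{func}\ x_0(x_1 : p_1, \dots, x_k : p_k) \to \tau\ \{[y_1, \dots, y_h]\ \mathsf{in}\ e_1\}\ \mathsf{in}\ e_2 \Downarrow_R \mu'[x_0 \mapsto_? \mu(x_0)], v}$$

E-CALL:
$$\dfrac{\begin{array}{c} \Delta, \mu \vdash e_0 \Downarrow_R \mu_0, \lambda(x_1 : p_1, \dots, x_k : p_k, e_b) \qquad \overbrace{\Delta, \mu_{i-1} \vdash a_i \Downarrow_R \mu_i, v_i}^{1 \le i \le k} \\ \sigma = [x_i \mapsto \mathrm{copy}(v_i) \mid 1 \le i \le k] \qquad \Delta, \mu_k \vdash e_b[/\sigma] \Downarrow_R \mu', v \end{array}}{\Delta, \mu \vdash e_0(a_1, \dots, a_k) \Downarrow_R \mu', \mathrm{copy}(v)}$$

E-SEQ:
$$\dfrac{\Delta, \mu \vdash e_1 \Downarrow_R \mu', v_1 \qquad \Delta, \mu' \vdash e_2 \Downarrow_R \mu'', v_2}{\Delta, \mu \vdash e_1; e_2 \Downarrow_R \mu'', v_2}$$

Judgment form: $\Delta, \mu \vdash r \Downarrow_L \mu', m\ w$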
Finally, the mutation is performed by calling a helper function set:

$$\mathrm{set}(\mu, x, v) = \mu[x \mapsto v]$$
$$\mathrm{set}(\mu, w.x, v) = \mathrm{set}(\mu, w, \mathrm{get}(\mu, w)[x \mapsto v])$$
$$\mathrm{set}(\mu, w[c], v) = \mathrm{set}(\mu, w, [u_1, \dots, u_{c-1}, v, u_{c+1}, \dots, u_k]) \quad \text{where } \mathrm{get}(\mu, w) = [u_1, \dots, u_k]$$

The resemblance of set to a functional update suggests that there is a simple mapping from a program using MVS to one that is purely functional. Unlike its rendition after such a purely-functional transformation, our set has predictable performance: it can always be implemented as an in-place update without any optimizer heroics, since paths are known to always identify unique and independent values.

Closures Function declarations are handled by E-FUNC. First, it creates a substitution σ as a table mapping captured bindings to their value in the context µ. These bindings are identified explicitly using the capture list.

As mentioned in Section 3, recall that captures are immutable. Hence, they can be substituted for their values directly, thus
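The recursive shape of the set helper can be illustrated with a small Swift sketch written as a purely functional update. The encodings Value, LValue, and Context are hypothetical and exist only for this example; array indices are 0-based here, and malformed paths simply trap. This models the specification-level reading of set, not the in-place implementation the paper argues for.

```swift
// Hypothetical encoding of Swiftlet values, lvalues, and contexts for illustration.
enum Value: Equatable {
    case int(Int)
    case array([Value])
    case record([String: Value])
}

indirect enum LValue {
    case name(String)            // x
    case prop(LValue, String)    // w.x
    case elem(LValue, Int)       // w[c], with a 0-based index c
}

typealias Context = [String: Value]

// get(µ, w): reads the value denoted by an lvalue.
func get(_ mu: Context, _ w: LValue) -> Value {
    switch w {
    case .name(let x):
        return mu[x]!
    case .prop(let base, let x):
        guard case .record(let fields) = get(mu, base) else { fatalError("not a struct") }
        return fields[x]!
    case .elem(let base, let c):
        guard case .array(let elements) = get(mu, base) else { fatalError("not an array") }
        return elements[c]
    }
}

// set(µ, w, v): rebuilds the enclosing value with one part replaced,
// then recurses toward the root binding.
func set(_ mu: Context, _ w: LValue, _ v: Value) -> Context {
    switch w {
    case .name(let x):
        var result = mu
        result[x] = v
        return result
    case .prop(let base, let x):
        guard case .record(let oldFields) = get(mu, base) else { fatalError("not a struct") }
        var fields = oldFields
        fields[x] = v
        return set(mu, base, .record(fields))
    case .elem(let base, let c):
        guard case .array(let oldElements) = get(mu, base) else { fatalError("not an array") }
        var elements = oldElements
        elements[c] = v
        return set(mu, base, .array(elements))
    }
}

let mu: Context = ["a": .array([.record(["x": .int(1)]), .record(["x": .int(2)])])]
let w: LValue = .prop(.elem(.name("a"), 1), "x")   // the path a[1].x
let mu2 = set(mu, w, .int(42))

print(get(mu2, w) == .int(42))   // true
print(get(mu, w) == .int(2))     // true: the original context is unchanged
```

Each level of the path rebuilds one enclosing value, which is exactly the cost that an in-place update avoids when paths are known to identify unique values.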
Next, consider the expression a in f(2), which declares the function f and immediately applies it to an integer argument. That expression is evaluated by E-FUNC, which creates a substitution mapping f to an expression a in f. This substitution is applied to the body of the function, resulting in a closure $\lambda(n : \mathbb{Z}, (n > 1\ ?\ n * (a\ \mathsf{in}\ f)(n - 1) : 1))$.

In a call, if n is greater than one, the reduction of the first branch of the conditional will trigger E-FUNC to evaluate a in f, effectively unfolding the recursive declaration one more time. Otherwise, the second branch of the conditional will reduce immediately as the value 1, ending recursion.

Function calls The rule E-CALL describes function calls. The callee is evaluated first and must reduce to a closure of the form $\lambda(\overline{x : p}, e_b)$. Arguments are evaluated next, from left to right, just as in E-STRUCTLIT and E-ARRAYLIT. The value of each argument is then substituted for the corresponding parameter name in the function’s body.

The call’s inout arguments are handled by inlining the lvalue to which they reduce in the function’s body. Indeed, notice that E-INOUT evaluates the path following the ampersand as an lvalue rather than a value, using $\Downarrow_L$ rather than $\Downarrow_R$.

Example 4.3. Let $f(\&a[a[0]].b)$ be a function call evaluated by E-CALL, in a context $\mu = [f \mapsto \mathsf{let}\ \lambda(x : \mathsf{inout}\ \mathbb{Z}, x = 42), a \mapsto \mathsf{var}\ [0, 1]]$. The rule starts by reducing the callee, by direct application of the E-NAME. Then, the inout argument is handled by E-INOUT, triggering the application of P-ELEM that eventually produces a mutable lvalue $\mathsf{var}\ a[1]$. The latter is inlined in the closure’s body, resulting in an expression $a[1] = 42$ that is evaluated in $\mu$. The call finally concludes with an updated context $\mu' = [f \mapsto \mathsf{let}\ \lambda(x : \mathsf{inout}\ \mathbb{Z}, x = 42), a \mapsto \mathsf{var}\ [0, 42]]$.

Function declarations Function declarations are typed in two steps. The first creates a new context for type checking the body expression. The second consists of type checking the expression delimiting the scope of the declaration.

The rule T-FUNC starts by mapping each capture to its corresponding type, treating them as immutable regardless of their mutability in the surrounding context. Then, each parameter is mapped onto its type and mutability, resulting in a typing context $\Gamma''$. Parameters are immutable unless qualified by inout. This translation is expressed by a small helper function:

$$\mathrm{type}(x : p) = \begin{cases} \mathsf{let}\ \tau & \text{if } p = \tau \\ \mathsf{var}\ \tau & \text{if } p = \mathsf{inout}\ \tau \end{cases}$$

The context $\Gamma''$ is finally extended by mapping the function’s name onto its own type before type checking the body $e_1$ in order to handle recursive calls. The context $\Gamma$ is extended similarly to type check the expression $e_2$ in which the newly declared function is visible. In both instances, the binding representing the function is considered immutable.

Function calls In function calls, the type system upholds uniqueness of inout parameters by guaranteeing that the same lvalue cannot be dereferenced from two different paths. This test is defined via a relation $\subseteq$ on argument expressions. Intuitively, $a_i \subseteq a_j$ holds if both expressions are inout arguments (i.e., paths prefixed by &) whose paths are either identical, or $a_i$’s is a subpath of $a_j$’s. Formally, $\subseteq$ is the minimal reflexive and transitive relation that satisfies the following rules:15

14 Alternatively, one could define a closure as a term $\lambda_\mu(\overline{x : p}.\, e)$ where $\mu$ would represent the environment. Such a strategy would let us represent mutable (yet copied) captures.
15 The intuition of the operator relates to the size of the path rather than the set of locations that it represents.
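The intuition behind the overlap test on inout arguments can be sketched in Swift as a prefix check on paths. This is not the paper's formal relation: the types Accessor and Path and the functions mayMatch and mayOverlap are hypothetical, "subpath" is assumed to mean "prefix", and an array index that is not statically known is assumed to match any index, as a conservative choice.

```swift
// Hypothetical model of an inout argument's path: a root name plus accessors.
enum Accessor: Equatable {
    case prop(String)     // .x
    case index(Int?)      // [c]; nil when the index is not statically known
}

struct Path {
    var root: String
    var accessors: [Accessor]
}

// Two accessors may denote the same part if they are equal, or if both are
// indices and at least one of them is unknown (treated conservatively).
func mayMatch(_ a: Accessor, _ b: Accessor) -> Bool {
    switch (a, b) {
    case (.index(let i), .index(let j)):
        guard let i = i, let j = j else { return true }
        return i == j
    default:
        return a == b
    }
}

// True when one path may be a prefix of the other, i.e. the two arguments
// could overlap and should be rejected when both are passed inout.
func mayOverlap(_ p: Path, _ q: Path) -> Bool {
    guard p.root == q.root else { return false }
    return zip(p.accessors, q.accessors).allSatisfy { pair in
        mayMatch(pair.0, pair.1)
    }
}

let ax  = Path(root: "a", accessors: [.prop("x")])
let axy = Path(root: "a", accessors: [.prop("x"), .prop("y")])
let ay  = Path(root: "a", accessors: [.prop("y")])
let b0  = Path(root: "b", accessors: [.index(0)])
let b1  = Path(root: "b", accessors: [.index(1)])
let bi  = Path(root: "b", accessors: [.index(nil)])

print(mayOverlap(ax, axy))  // true: a.x is a prefix of a.x.y
print(mayOverlap(ax, ay))   // false: distinct fields of a
print(mayOverlap(b0, b1))   // false: distinct constant indices
print(mayOverlap(b0, bi))   // true: the unknown index might be 0
```

The check is symmetric, whereas the relation in the text is directional; since a call must compare every ordered pair of arguments, testing both directions at once gives the same verdict in this sketch.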
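Judgment forms: $\vdash g : \tau$, $\quad \Delta; \Gamma \vdash_{arg} a : p$, $\quad \Delta; \Gamma \vdash e : \tau$

T-STRUCTLIT:
$$\dfrac{\overbrace{\Delta; \Gamma \vdash e_i : \tau_i}^{1 \le i \le k} \qquad \mathsf{struct}\ s\ \{m_1\ x_1 : \tau_1, \dots, m_k\ x_k : \tau_k\} \in \Delta}{\Delta; \Gamma \vdash s(e_1, \dots, e_k) : s}$$

T-ARRAYLIT:
$$\dfrac{\overbrace{\Delta; \Gamma \vdash e_i : \tau}^{1 \le i \le k}}{\Delta; \Gamma \vdash [e_1, \dots, e_k] : [\tau]}$$

T-COND:
$$\dfrac{\Delta; \Gamma \vdash e : \mathbb{Z} \qquad \Delta; \Gamma \vdash e_t : \tau \qquad \Delta; \Gamma \vdash e_e : \tau}{\Delta; \Gamma \vdash e\ ?\ e_t : e_e : \tau}$$

T-CAST:
$$\dfrac{\Delta; \Gamma \vdash e : \tau'}{\Delta; \Gamma \vdash e\ \mathsf{as}\ \tau : \tau}$$

T-FUNC:
$$\dfrac{\begin{array}{c} \Gamma' = [y_j \mapsto \mathsf{let}\ \tau_j \mid 1 \le j \le h \wedge \Gamma(y_j) = m_j\ \tau_j] \qquad \Gamma'' = \Gamma'[x_i \mapsto \mathrm{type}(p_i) \mid 1 \le i \le k] \\ \Delta; \Gamma''[x \mapsto \mathsf{let}\ (p_1, \dots, p_k) \to \tau] \vdash e_1 : \tau_\lambda \qquad \Delta; \Gamma[x \mapsto \mathsf{let}\ (p_1, \dots, p_k) \to \tau] \vdash e_2 : \tau \end{array}}{\Delta; \Gamma \vdash \mathsf{func}\ x(x_1 : p_1, \dots, x_k : p_k) \to \tau_\lambda\ \{[y_1, \dots, y_h]\ \mathsf{in}\ e_1\}\ \mathsf{in}\ e_2 : \tau}$$

T-CALL:
$$\dfrac{\Delta; \Gamma \vdash e : (p_1, \dots, p_k) \to \tau \qquad \overbrace{\Delta; \Gamma \vdash_{arg} a_i : p_i}^{1 \le i \le k} \qquad \forall_{1 \le i \le k}, \forall_{1 \le j \le k},\ i \ne j \implies a_i \not\subseteq a_j}{\Delta; \Gamma \vdash e(a_1, \dots, a_k) : \tau}$$

Judgment form: $\Delta; \Gamma \vdash_{path} r : m\ \tau$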
The last rule applies when the value of an array index is not statically computable. It represents the restriction that fends off cases where two arbitrary expressions would evaluate to the same value, effectively producing two identical paths. For

Casts Notice that T-CAST does not perform any test to guarantee that the value e is indeed of type τ. Indeed, casts in Swiftlet are completely dynamic and, therefore, errors are handled at runtime. In other words, the correctness of a cast expression
∆ ` π; η; e −→ π 0 ; η 0 ; e0
ESS-B INDING
ESS-C ONTEXT l 6∈ dom(π ) ESS-S EQ
∆ ` π; η; e −→ π 0 ; η 0 ; e0 π 0 = π [l 7→ m v] η 0 = η1 [ x 7 → l ] π 0 = drop(π, v)
∆ ` π; η; Ehei −→ π 0 ; η 0 ; Ehe0 i ∆ ` π; η; m x : τ = v in e −→ π 0 ; η 0 , η; e; pop l ∆ ` π; η; v; e −→ π 0 ; η; e
ESS-F UNC
l 6∈ dom(π 0 ) η 0 = η1 [ x 0 7 → l ] ESS-C OND -T
0
π , ηλ = mkenv(π, y1 : η1 (yi ), . . . , yh : η1 (yh )) π 00 = π 0 [l 7→ let λ( x : p, ηλ , e1 )] c 6= 0
∆ ` π; η; func x0 ( x : p) → τλ {[y1 , . . . , yh ] in e1 } in e2 −→ π 00 ; η 0 , η; e2 ; pop l ∆ ` π; η; c ? et : ee −→ π 0 ; η; et
ESS-C ALL
Icpy = {i | 1 ≤ i ≤ k ∧ pi = τi } Iref = {i | 1 ≤ i ≤ k ∧ pi = inout τi }
{li | i ∈ Icpy } ∩ dom(π ) = ∅ ∀i, j ∈ Iref , i 6= j =⇒ acc(π, vi ) ∩ acc(π, v j ) = ∅
π 0 = π [li 7→ let vi | i ∈ Icpy ] η 0 = ηλ [ xi 7→ li | i ∈ Icpy ][ xi 7→ vi | i ∈ Iref ] ESS-C OND -F
ESS-P OP
1≤ i ≤ k 1≤ i ≤ k
z }| { z }| {
π i −1 ( li ) = m i v i πi = drop(πi−1 , vi ) ESS-I NOUT-PATH
ESS-I NOUT
π 0 = π k [ li 7 → ⊥ | 1 ≤ i ≤ k ] ∆ ` π; η; &r −→lv π 0 , η 0 ; r 0
∆ ` π0 ; η, η; v; pop l1 , . . . , lk −→ π 0 ; η; v ∆ ` π; η; &r −→ π 0 , η 0 ; r 0 ∆ ` π; η; &l var −→ π, η; l
∆ ` π; η; r −→lv π 0 ; η 0 ; r 0
PSS-P ROP
PSS-NAME PSS-S TRUCT m0 = min(m, mi ) π ( l ) = m [ l1 , . . . , l k ] s
0 0 0
η1 ( x ) = l π (l ) = m v ∆ ` π; η; r −→lv π ; η ; r struct s {m1 x1 : τ1 , . . . , mk xk : τk } ∈ ∆
∆ ` π; η; x −→lv π; η; l m ∆ ` π; η; r.x −→lv π 0 ; η 0 ; r 0 .x
0
∆ ` π; η; l m .xi −→lv π; η; lim
1≤ i ≤ k 1≤ i ≤ k
z }| { z }| { π ( l0 ) = m [ l1 , . . . , l k ] s
π0 ( li ) = m i v i πi , vi0 = copy(πi−1 , π0 (li )) [
acc(π, l0 ) = acc(π, li ) ∪ {l0 , l1 , . . . , lk }
l10 , . . . , lk0 6∈ dom(πk ) π 0 = πk [li0 7→ mi vi0 | 1 ≤ i ≤ k]
1≤ i ≤ k
π 0 , [l10 , . . . , lk0 ]s = copy(π0 , [l1 , . . . , lk ]s )
Garbage collection. Rules ESS-BINDING, ESS-FUNC, and ESS-CALL relate to expressions that delimit
a scope. The three of them append an expression of the form pop l in their conclusions, where the
sequence l represents the memory locations at which scoped values were allocated.
Pop expressions delimit the end of a scope. They are evaluated by ESS-POP, which removes the last
frame and destroys the values stored at the locations l, reclaiming the memory of the values that
are no longer accessible.
Memory collection is carried out by a helper function drop that destroys a value by freeing the
locations that are part of its representation. Just like copying, this process is implemented as
a recursive traversal of the value’s representation. For instance, destroying structures is
defined as follows:

4.6. Soundness
We now discuss how Swiftlet’s operational semantics relates to its typing semantics. Specifically,
our typing rules guarantee that well-typed programs can either be reduced to a value, or never
terminate, or fail because of a runtime error, such as an invalid cast or an out-of-bounds array
access. Hence, crucially, they cannot violate immutability restrictions and local reasoning.
We first establish these guarantees on the small-step semantics from Section 4.5.

Definition 4.1 (Well-formed memory state). A memory state $\pi; \eta$ is well-formed if
$|\eta| \ge 1$ and for any pair of bindings $x, x' \in \mathrm{dom}(\eta_1)$ such that
$x \neq x'$, $l = \eta_1(x)$, and $l' = \eta_1(x')$, we have
$\mathrm{acc}(\pi, l) \cap \mathrm{acc}(\pi, l') = \emptyset$.
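The well-formedness condition of Definition 4.1 can be illustrated with a small executable model. The sketch below is not part of Swiftlet's implementation: the names Location, StoredValue, acc and isWellFormed are hypothetical, memory is modeled as a dictionary, scalars are assumed to make only their own location accessible, and the last element of the frame list is assumed to play the role of the top frame η1.

```swift
/// A memory location, modeled as an integer address (an assumption of this sketch).
typealias Location = Int

/// A simplified stored value: a scalar, or an aggregate whose parts live at other locations.
enum StoredValue {
  case scalar(Int)
  case aggregate([Location])
}

/// Returns the locations accessible from `l`: `l` itself plus, for aggregates, everything
/// accessible from its parts. Scalars contributing only `{l}` is an assumption.
func acc(_ memory: [Location: StoredValue], _ l: Location) -> Set<Location> {
  var result: Set<Location> = [l]
  if case .aggregate(let parts)? = memory[l] {
    for part in parts { result.formUnion(acc(memory, part)) }
  }
  return result
}

/// Checks Definition 4.1, taking the last element of `frames` as the top frame.
func isWellFormed(memory: [Location: StoredValue], frames: [[String: Location]]) -> Bool {
  guard let top = frames.last else { return false }  // requires at least one frame
  let reachable = top.values.map { acc(memory, $0) }
  for i in reachable.indices {
    for j in reachable.indices where j > i {
      if !reachable[i].isDisjoint(with: reachable[j]) { return false }
    }
  }
  return true
}

// Location 0 holds a two-field structure stored at locations 1 and 2; location 3 holds a scalar.
let exampleMemory: [Location: StoredValue] = [
  0: .aggregate([1, 2]), 1: .scalar(2), 2: .scalar(5), 3: .scalar(7),
]
let independent = isWellFormed(memory: exampleMemory, frames: [["v": 0, "n": 3]])
let overlapping = isWellFormed(memory: exampleMemory, frames: [["v": 0, "alias": 1]])
```

Here independent is true because the two bindings reach disjoint sets of locations, whereas overlapping is false because the binding alias refers to a part of the structure bound to v.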
We state the soundness theorem in the classical syntactic style of Wright & Felleisen (1994).
Full proofs appear in Appendix A.

Lemma 4.1 (Progress). Given $\pi; \eta$ such that $\Delta; \Gamma; \pi \vdash e : \tau$ and
$\Delta \vdash \pi; \eta : \Gamma$, either $e$ is a value, or there exist $\pi', \eta', e'$ such
that $\Delta \vdash \pi; \eta; e \longrightarrow \pi'; \eta'; e'$, or the program is stuck due to
a runtime error.

Given a well-typed memory state, Lemma 4.1 states that the evaluation of an expression e either
steps or is stuck due to an invalid array subscript (e.g., a[2] where a is an empty array) or an
invalid cast operation. Lemma 4.2 states that the evaluation of a step preserves well-formedness
and well-typedness. Type soundness follows trivially.

Theorem 4.1 (Type soundness). If $\Delta; \emptyset; [\bot] \vdash e : \tau$ and
$\Delta \vdash [\bot]; [\bot]; e \longrightarrow^{*} \pi'; \eta'; e'$, then either $e'$ is a value
or the program is stuck due to a runtime error.

To convince ourselves that the natural semantics from Section 4.3 is an appropriate abstraction of
the small-step semantics, we can establish an equivalence relation between the two. Equivalence
between memory representations is determined by means of an operator $[\![\cdot]\!]^{\pi}_{\Delta}$,
defined in Figure 8. The operator translates a value from the small-step semantics into its
representation in the natural semantics, given a set of structure declarations ∆ and a pointer
map π.

\[
\begin{aligned}
[\![ c ]\!]^{\pi}_{\Delta} &= c \\
[\![ \mathrm{box}(l) ]\!]^{\pi}_{\Delta} &= \mathrm{box}([\![ v ]\!]^{\pi}_{\Delta})
  && \text{where } \pi(l) = m\ v \\
[\![ [l_1, \ldots, l_k] ]\!]^{\pi}_{\Delta} &= [ [\![ l_1 ]\!]^{\pi}_{\Delta}, \ldots, [\![ l_k ]\!]^{\pi}_{\Delta} ] \\
[\![ [l_1, \ldots, l_k]_s ]\!]^{\pi}_{\Delta} &= [ x_i \mapsto m_i\ [\![ l_i ]\!]^{\pi}_{\Delta} \mid 1 \le i \le k ]_s
  && \text{where } \mathbf{struct}\ s\ \{ m_1\ x_1 : \tau_1, \ldots, m_k\ x_k : \tau_k \} \in \Delta \\
[\![ \lambda(x : p, \eta_\lambda, e) ]\!]^{\pi}_{\Delta} &= \lambda(x\ p, e[/\sigma])
  && \text{where } \sigma = [ x_i \mapsto [\![ v_i ]\!]^{\pi}_{\Delta} \mid x_i \in \mathrm{dom}(\eta_\lambda)
     \wedge \eta_\lambda(x_i) = l_i \wedge \pi(l_i) = m_i\ v_i ]
\end{aligned}
\]
Figure 8 Correspondence between value representations

5. Generating native code with LLVM
This section describes the implementation of a compiler for Swiftlet. That compiler is written in
Swift, in the style of MVS, and distributed as an open-source project hosted on GitHub:
https://github.com/kyouko-taiga/mvs-calculus.

[Figure 9: compiler pipeline. Swiftlet source → parse → sema → codegen → LLVM bitcode; parse
reports syntax errors and sema reports type errors.]

Figure 9 gives an overview of the compiler’s architecture. The “parse” module implements a
recursive-descent parser using combinators (Hutton & Meijer 1998) that transforms textual sources
to an abstract syntax tree (AST). That AST is passed to the “sema” module (for semantic analysis),
which is essentially a type checker. It verifies that expressions are well-typed (e.g., variables
of type Int are only assigned to integer values), that mutability constraints are satisfied (e.g.,
constants are never mutated), and guarantees path uniqueness for all inout arguments. Finally, the
“codegen” module translates the AST to LLVM’s intermediate representation, optionally applying a
handful of optimizations. Note that code generation always succeeds, as ASTs that passed semantic
analysis are guaranteed to be well-formed.

LLVM (Lattner & Adve 2004) is a popular middleware in numerous compilers, including Clang (C/C++
and Objective-C), rustc (Rust) and even swiftc (Swift). The framework is centered around an
SSA-style (Cytron et al. 1991) intermediate representation, called LLVM IR, that serves as a
front-end agnostic language to apply code optimizations and generate machine code. Hence, LLVM IR
dramatically reduces the engineering effort required to build a compiler. However, translating
language features into this common representation, a process often referred to as lowering, comes
with its own challenges.
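The three-stage architecture described above can be sketched on a toy language. The code below is purely illustrative and does not reflect the API of the mvs-calculus repository: CompileError, Expr, parse, check, codegen and compile are hypothetical names, the language only has literals and addition, and the "generated code" is a placeholder string rather than LLVM IR. It shows the division of labor in which parsing and semantic analysis may fail while code generation cannot.

```swift
/// Errors reported by the two stages that can fail.
enum CompileError: Error {
  case syntax(String)
  case type(String)
}

/// A toy AST with integer literals, Boolean literals, and addition.
indirect enum Expr {
  case int(Int)
  case bool(Bool)
  case add(Expr, Expr)
}

/// "parse": reads space-separated operands joined by `+`, or reports a syntax error.
func parse(_ source: String) throws -> Expr {
  let tokens = source.split(separator: " ").map(String.init)
  guard tokens.count % 2 == 1 else {
    throw CompileError.syntax("expected operands separated by '+'")
  }
  var result: Expr? = nil
  for (index, token) in tokens.enumerated() {
    if index % 2 == 1 {
      guard token == "+" else { throw CompileError.syntax("expected '+', found '\(token)'") }
      continue
    }
    let operand: Expr
    if let n = Int(token) {
      operand = .int(n)
    } else if token == "true" || token == "false" {
      operand = .bool(token == "true")
    } else {
      throw CompileError.syntax("unexpected token '\(token)'")
    }
    if let left = result { result = .add(left, operand) } else { result = operand }
  }
  return result!  // safe: the guard above ensures at least one operand
}

/// "sema": rejects additions whose operands are not integers.
func check(_ expr: Expr) throws {
  guard case .add(let left, let right) = expr else { return }
  for operand in [left, right] {
    if case .bool = operand { throw CompileError.type("'+' expects integer operands") }
    try check(operand)
  }
}

/// "codegen": cannot fail on a checked AST. The output is a placeholder string, not LLVM IR.
func codegen(_ expr: Expr) -> String {
  switch expr {
  case .int(let n): return "\(n)"
  case .bool(let b): return b ? "1" : "0"
  case .add(let left, let right): return "add(\(codegen(left)), \(codegen(right)))"
  }
}

/// Runs the three stages in sequence.
func compile(_ source: String) throws -> String {
  let ast = try parse(source)
  try check(ast)
  return codegen(ast)
}
```

With these definitions, compiling "1 + 2 + 3" succeeds, "1 +" is rejected by the parser, and "1 + true" is rejected by the type checker, so codegen never sees an ill-formed tree.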
represented as a passive data structure (PDS), where each field is laid out contiguously with
possible padding for alignment.

5.1.2. Arrays. Arrays require dynamic allocation, as the compiler is in general incapable of
determining their size statically. An array is represented by a pointer φ to a contiguous block of
heap-allocated memory. That block is structured as a tuple ⟨r, n, k, e⟩ where r is a reference
counter, n denotes the number of elements in the array, k denotes the capacity of the array’s
payload (i.e., the size of its actual contents in bytes) and e is a payload of k bytes, containing
n elements. The counter r serves to implement copy-on-write (see Section 6.3).

Figure 10 depicts the in-memory representation of an array assigned to a local variable. The
square on the left of the dashed line represents the single memory cell allocated on the stack,
containing the pointer φ. The squares on the right represent cells allocated in the heap. Each
cell e_i is a single independent element that may itself contain pointers to other heap-allocated
memory blocks (e.g., for an array of arrays).

[Figure 10: stack cell φ pointing to the heap block r, n, k, e_1, e_2, ..., e_n, with
k = n × stride(T).]
Figure 10 In-memory representation of an array of T

The capacity k of an array typically differs from the number of its elements n. The former depends
on the size of an element in memory, or more precisely, its stride. The stride of a type denotes
the number of bytes between two consecutive instances stored in contiguous memory, which depends
on the size and memory alignment of a type. Both pieces of information depend on the target ABI
and are left for LLVM to figure out.

Example 5.1. An array of two 16-bit integers [42, 1337] on a 64-bit little-endian machine is
represented by the byte sequence ⟨1, 0, 0, 0, 2, 0, 0, 0, 4, 0, 0, 0, 42, 0, 57, 5⟩. It contains
two elements, thus n = 2, yet its capacity k = 4 since each element has a stride of two bytes.

5.1.3. Closures. Just like arrays, closures require dynamic allocation because the size of their
environment cannot be determined statically. A closure is represented as a triple ⟨φ, e, ν⟩ where
φ is a pointer to a function implementing the closure, e is a pointer to the closure’s environment
(potentially null if the closure has no captures), and ν is a pointer to the value witness of the
closure (see Section 5.2).

Figure 11 depicts the in-memory representation of a closure graphically. The cells e represent the
contents of the closure’s environment, laid out contiguously. Just like in the case of arrays,
each cell is an independent value.

[Figure 11: triple φ, e, ν, where φ points to program code and e points to the environment cells
e_1, e_2, ..., e_n.]
Figure 11 In-memory representation of a closure

The function pointed to by φ is obtained by defunctionalization (Reynolds 1998a). This process
transforms a closure into a global function in which all captured identifiers are lifted into an
additional parameter for the closure’s environment.

5.1.4. Existential containers. Containers of type Any are implemented via value boxing (Henglein &
Jørgensen 1994). Boxing consists of storing a value inside of a heap-allocated area, so that it
can be represented by a fixed-size pointer to that area. Unfortunately, this approach suffers a
performance penalty incurred by heap allocation and collection, which may prove particularly
expensive in a programming language with pervasive copying.

We leverage a technique called small-object optimization to mitigate that cost. Instead of
systematically representing a container as a pointer to a heap-allocated area, we use a small
buffer that is large enough to fit small objects inline. Values are boxed in the heap only when
they are too large to fit inside of the buffer. In this case, the latter is used to store a
pointer to out-of-line storage. Otherwise, we can avoid the cost of allocating and freeing memory
in the heap and eliminate the indirection overhead typically caused by value boxing.

A container is represented as a tuple ⟨s, ν⟩ where s is a small inline buffer and ν is a pointer
to the value witness of the wrapped value (see Section 5.2). Choosing the size of s is a trade-off
between minimizing heap allocation and minimizing the space that is wasted when the wrapped value
is smaller than the buffer, or when it must be allocated out-of-line nonetheless. In our
implementation, that space is large enough to fit three 64-bit integer values, which is sufficient
to store numeric values, arrays, and closures.

[Figure 12: container on the stack; ν points to program code; the buffer holds 2 and 5 (Vec2
stored inline) followed by unused space.]
Figure 12 In-memory representation of an existential container using inline storage to hold a
vector Vec2(x: 2, y: 5)

Figure 12 shows an example of the in-memory representation of a container that stores a
2-dimensional vector (i.e., the result of an expression Vec2(x: 2, y: 5) as Any). Here, the
vector’s value is small enough to be stored directly inside of the container’s buffer, leaving
some unused space.
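The small-object optimization of Section 5.1.4 can be sketched in Swift as follows. This is a simplified model under stated assumptions, not the compiler's actual representation: Trivial, Vec2, Mat2, Box and Container are hypothetical names, only byte-wise copyable values are supported (the Trivial conformance is a promise that nothing checks), and an ObjectIdentifier of the wrapped type stands in for the witness pointer ν.

```swift
/// Marker for byte-wise copyable ("trivial") types. Conformance is a promise made by the
/// programmer in this sketch; nothing verifies it.
protocol Trivial {}

struct Vec2: Trivial { var x: Int; var y: Int }
struct Mat2: Trivial { var a: Int64, b: Int64, c: Int64, d: Int64 }  // 32 bytes

/// Heap box used when a value does not fit in the inline buffer.
final class Box<T> {
  let value: T
  init(_ value: T) { self.value = value }
}

/// A hypothetical container with a three-word buffer and a type tag standing for the witness.
struct Container {
  typealias Buffer = (UInt64, UInt64, UInt64)

  enum Storage {
    case inline(Buffer)
    case outOfLine(AnyObject)
  }

  let storage: Storage
  let witness: ObjectIdentifier

  init<T: Trivial>(_ value: T) {
    witness = ObjectIdentifier(T.self)
    let fits = MemoryLayout<T>.size <= MemoryLayout<Buffer>.size
      && MemoryLayout<T>.alignment <= MemoryLayout<Buffer>.alignment
    if fits {
      // Copy the bytes of the value into the inline buffer; no heap allocation occurs.
      var buffer: Buffer = (0, 0, 0)
      withUnsafeMutableBytes(of: &buffer) { destination in
        withUnsafeBytes(of: value) { source in destination.copyMemory(from: source) }
      }
      storage = .inline(buffer)
    } else {
      // Too large: box the value in the heap and keep only a reference to the box.
      storage = .outOfLine(Box(value))
    }
  }

  var isInline: Bool {
    if case .inline = storage { return true }
    return false
  }

  /// Returns the wrapped value if it has type `T`, and `nil` otherwise.
  func unwrap<T: Trivial>(as type: T.Type) -> T? {
    guard witness == ObjectIdentifier(T.self) else { return nil }
    switch storage {
    case .inline(let buffer):
      return withUnsafeBytes(of: buffer) { $0.load(as: T.self) }
    case .outOfLine(let box):
      return (box as? Box<T>)?.value
    }
  }
}

let inlineContainer = Container(Vec2(x: 2, y: 5))             // fits in three words
let boxedContainer = Container(Mat2(a: 1, b: 2, c: 3, d: 4))  // needs out-of-line storage
```

The 24-byte buffer mirrors the three 64-bit words mentioned in the text; a larger buffer would avoid more allocations at the price of more unused space per container.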
5.2. Value witnesses
A value’s lifetime corresponds to the span of time from its initialization to its destruction. In
the absence of first-class references, that information can be determined statically.
Initialization occurs when a value is assigned to a variable while two events can trigger
destruction: reassignment and exit from the variable’s scope. Following this observation, memory
management can be automated during code generation.

Swift allows the declaration of a variable to be separated from its initialization. It then relies
on definite assignment analysis (Fruja 2004) for guaranteeing initialization before use, possibly
inserting dynamic checks in situations where that property cannot be determined statically (e.g.,
when variables are initialized conditionally). In contrast, Swiftlet requires that all local
bindings be initialized at the point of their declaration. This restriction conveniently implies
that the compiler can always distinguish between initialization and assignment.

We say that a type is trivial if it denotes a numeric value or a composition of trivial types in a
structure (e.g., a pair of Ints). Conversely, types requiring dynamic allocations are non-trivial.
That includes array types, function types, existential containers, and structures containing at
least one non-trivial property. In other words, the notional value of a trivial type is
represented exactly by the contents of its inline storage, whereas the notional value of a
non-trivial type may include out-of-line storage.

Since trivial types do not involve any out-of-line storage, copying or deinitializing a value does
not necessitate any particular operation. Hence, assigning a variable of a trivial type boils down
to a byte-wise copy of the right operand.

The situation is a bit more delicate for non-trivial types. For arrays, a first issue is that the
size of the heap-allocated storage cannot be determined statically. Instead, it depends on the
value of k in the tuple representing the array. A second issue is that copying may involve
additional operations if the elements contained in the array are dynamically sized as well. In
this case, a byte-wise copy of the array’s payload would create unintended aliases, breaking value
independence. Instead, each non-trivial element should be copied individually. One solution is to
synthesize a function for each data type that is applied whenever a copy should occur.

5.2.1. Synthesizing copy and destruction. If the type is trivial (i.e., it does not involve any
dynamic allocation), its copy function is equivalent to a byte-wise copy. Otherwise, it implements
the appropriate logic, calling the copy function of each contained element. Similarly, the logic
implementing the destruction of a value can be synthesized into a destructor. If the type is
trivial, then this destructor is a no-op. Otherwise, it recursively calls the destructor of each
contained element and frees the memory allocated for all values being destroyed.

We synthesize a copy function and a destructor for every type used in a program. Together, these
functions form the value witness of a type. Just as the witness type specifies the hidden
representation of an existential package in type theory (Pierce 2002, Chapter 24), a value witness
specifies the hidden implementation of a type’s value semantics, providing the compiler with a
uniform programming interface to interact with values. A call to the copy function is issued every
time a value is assigned or crosses function boundaries, while a call to the destructor is issued
before lifetime ending events, effectively implementing compile-time garbage collection.

All values of a particular type share the same value witness, except closures. Indeed, the
environment captured by a closure of type (T) -> U might differ from that of another closure with
the same type. Hence, a different witness must be synthesized for each function declaration.
Incidentally, that explains why closure tuples contain pointers to their copy function and
destructor.

5.2.2. Synthesizing equality. As observed in Section 2.4, the ability of MVS to express whole/part
relationships allows us to synthesize operations based on notional values, such as hashing and
equality.

Swiftlet synthesizes an equality function for all types used in the program.16 For integer and
floating-point values, equality corresponds directly to LLVM’s icmp and fcmp instructions,
respectively. For other types, the compiler builds a function that recursively checks for equality
on each part of the value.

5.3. Inout arguments
At function boundaries, structures are exploded into scalar arguments and passed through
registers, provided the machine has enough of them. If the structure is too large, it is passed as
a pointer to a stack cell in the caller’s context, in which a copy of the argument is stored
before the call.

inout arguments are passed as (possibly interior) pointers. If the argument refers to a local
variable or one of its fields, then it is passed as a pointer to the stack. If it refers to the
element of an array, then it is passed as a pointer to the array’s storage, offset by the
element’s index.

[Figure 13: stack frame of swap: 0x300 b: 0x1b8, 0x308 a: 0x318. Stack frame of main: 0x310
vc.y: 5, 0x318 vc.x: 2, 0x320 ar: 0x190. Heap block: 1, 2, 16, 42, 13.]
Figure 13 In-memory representation of inout arguments

Example 5.2. Consider the following program:
    struct Vec2 { var x: Int; var y: Int; };

16 In Swift, synthesizing equality (and hashing) is provided as an opt-in mechanism by conforming
to the protocol Equatable (and Hashable).
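The value witnesses of Section 5.2 can be modeled as a record of functions that is built compositionally from the witnesses of a type's parts. The following sketch is an illustration under assumptions rather than the code emitted by the compiler: HeapBlock, ValueWitness, intWitness and arrayWitness are hypothetical names, a class instance stands in for a heap-allocated array payload, and setting the isFreed flag stands in for releasing memory.

```swift
/// Stand-in for a heap-allocated array payload; a class so that plain assignment aliases it.
final class HeapBlock<Element> {
  var elements: [Element]
  var isFreed = false
  init(_ elements: [Element]) { self.elements = elements }
}

/// The operations synthesized for one type: copy, destructor, and equality.
struct ValueWitness<T> {
  let copy: (T) -> T
  let destroy: (T) -> Void
  let equals: (T, T) -> Bool
}

/// Witness of a trivial type: plain copy, no-op destructor, primitive comparison.
let intWitness = ValueWitness<Int>(copy: { $0 }, destroy: { _ in }, equals: { $0 == $1 })

/// Builds the witness of an array type from the witness of its element type.
func arrayWitness<E>(of element: ValueWitness<E>) -> ValueWitness<HeapBlock<E>> {
  ValueWitness<HeapBlock<E>>(
    // Copy each element with its own witness so that no storage is shared.
    copy: { source in HeapBlock<E>(source.elements.map(element.copy)) },
    // Destroy each element, then release the block itself.
    destroy: { block in
      block.elements.forEach(element.destroy)
      block.elements.removeAll()
      block.isFreed = true
    },
    // Two arrays are equal when they have the same length and pairwise equal elements.
    equals: { lhs, rhs in
      lhs.elements.count == rhs.elements.count
        && zip(lhs.elements, rhs.elements).allSatisfy { element.equals($0, $1) }
    })
}

// An array of arrays: copying it must copy the inner arrays too.
let matrixWitness = arrayWitness(of: arrayWitness(of: intWitness))
let original = HeapBlock([HeapBlock([1, 2]), HeapBlock([3, 4])])
let duplicate = matrixWitness.copy(original)
duplicate.elements[0].elements[0] = 42  // does not affect `original`
```

Because the copy recurses through the element witness, mutating duplicate leaves original untouched, which is the value independence that a byte-wise copy of the outer payload would break.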
even be prohibited. At a concrete operation level, it means that destructors are not called on
borrowed parameters.

A second part of the contract stipulates that borrowed parameters may not escape. That clause is
guaranteed by the application of copy in the conclusion of the rule, which produces a new value
whose lifetime is independent from that of any borrowed parameter.

Following the same rationale, initialization of immutable bindings from immutable values can be
substituted by aliases as well. Consider the following expression: let x = [[1, 2], [3, 4]];
let y = x[0]; f(y). The constant y is initialized from another constant value. Moreover, the
lifetime of y is lexically shorter than x’s. Therefore, y can simply alias x’s first element
rather than copying it. Formally, this optimization can be described by another variant of
E-BINDING:

\[
\textsc{E-Binding-Alias}\quad
\frac{\Delta, \mu \vdash r \Downarrow_L \mu', \mathbf{let}\ w \qquad
      \sigma = [x \mapsto w] \qquad
      \Delta, \mu' \vdash e[/\sigma] \Downarrow_R \mu'', v}
     {\Delta, \mu \vdash \mathbf{let}\ x : \tau = r; e \Downarrow_R \mu''[x \mapsto? \mu'(x)], v}
\]

The rule only applies to binding declarations of the form let x : τ = r; e, where the binding is
declared constant and initialized by a path expression. Notice that the path expression is
evaluated with ⇓L rather than ⇓R, producing an lvalue rather than a value. The rule additionally
checks that this lvalue is immutable before substituting it for the declared binding in the
expression e.

6.3. Copy-on-write
The optimization strategies we discussed in Section 6.2 are not applicable in the presence of
mutation. Any assignment involving a mutable binding, on the left- or right-hand side, typically
requires a copy, because the value might be mutated later. Similarly, assigning a mutable value to
a mutable binding also requires a copy.

Nonetheless, it is possible that neither the original nor the copy ends up being actually mutated,
perhaps because the mutation depends on a condition that is evaluated at runtime. In this case,
unfortunately, the compiler must conservatively assume that a mutation will occur and perform a
copy to preserve value independence.

One simple mechanism can be used to work around this apparent shortcoming: copy-on-write.
Copy-on-write leverages runtime knowledge to delay copies until they are actually needed.
Heap-allocated storage is associated with a counter that keeps track of the number of references
to that storage. Every time a value is copied, an alias is created and the counter is incremented.
The value of this counter is checked before mutation actually occurs, at runtime, to determine
uniqueness. If the storage is shared, the counter is decremented, the storage is duplicated and
the mutation is performed on a copy. Otherwise, the mutation is performed on the original.

Example 6.1. Consider the following program:
    1 func sort(array: inout [Int]) -> Void { ... };
    2 var a0 = [1, 2];
    3 a0[1] = 3;
    4 var a1 = a0;
    5 sort(array: &a1)
    6 // a0 and a1 are the same array

We assume the existence of a function that sorts an array in-place. Then, we declare an array a0,
which is mutated at line 3. At that point, the value of a0’s internal reference counter is 1, so
the mutation is performed on its storage directly.

Line 4 declares another array a1, initialized to a0. With copy-on-write, the value of a0 is not
copied right away. Instead, the reference counter of its internal storage is incremented, meaning
that a1 is actually an alias by the time it is passed as an argument to sort, at line 5. Should
elements not be in order, the first mutation that sort will attempt will trigger a copy. In the
present case, however, no copy will occur and a0 and a1 will continue to share state after line 5.

Of course, copy-on-write prevents purely static garbage collection. Indeed, because of potential
sharing, the lifetime of heap-allocated storage can no longer be determined at compile-time.
Nonetheless, garbage collection can still be automated with predictable performance. The reference
counter is decreased whenever the destructor of a value referring to the associated storage is
called. If it reaches zero, then the contents of the storage are destroyed and deallocated.

Swiftlet applies copy-on-write on arrays only, as structures are allocated inline, enabling a
different set of optimizations to eliminate unnecessary copies. One limitation of our approach,
though, stems from its interaction with the implementation of inout arguments (Section 5). Recall
that an inout argument is passed as a (possibly interior) pointer. Hence, the callee has no way to
determine whether or not that pointer refers to a value inside of a shared buffer. As a result,
the caller is compelled to copy non-unique storage defensively.

Example 6.2. Consider the following program:
    1 func sort(array: inout [Int]) -> Void { ... };
    2 var a0 = [[1, 2], [3, 4]];
    3 let a1 = a0;
    4 sort(array: &a0[1])

The variable a0 is declared as an array of arrays of Ints. With copy-on-write, it shares state
with the variable a1 by the time sort is called at line 4. The caller is compelled to copy the
outer array a0 because there is no way for the callee to determine that the inner array a0[1] is
stored inside of shared storage.

Nonetheless, note that a0’s copy will not trigger the copy of its inner arrays, applying
copy-on-write instead. Hence, a copy of the inner array a0[1] will occur if and only if sort must
perform a mutation. Meanwhile, a0[0] will share state with a1[0] after the call at line 4.

6.4. Leveraging local reasoning
We cited O’Hearn et al. (2001) in the introduction to emphasize the importance of local reasoning
for human developers and compilers alike. In particular, one can easily identify and discard
mutations whose results cannot be observed elsewhere.
    1 struct Vec2 { ... };
    2 // ...
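The copy-on-write protocol of Section 6.3 can be reproduced in plain Swift with the standard library function isKnownUniquelyReferenced, which reports whether a class instance has a single strong reference. The sketch below is only an illustration of the uniqueness check, not Swiftlet's runtime: IntStorage, CowArray and sharesStorage are hypothetical names, a Swift array is used as the payload for brevity, and the reference counter is the one Swift already maintains for class instances. The sort function writes only when two elements are out of order, as assumed in Example 6.1.

```swift
/// Heap storage shared between copies; Swift tracks its reference count for us.
final class IntStorage {
  var elements: [Int]
  init(_ elements: [Int]) { self.elements = elements }
}

/// A hypothetical copy-on-write array of integers.
struct CowArray {
  private var storage: IntStorage

  init(_ elements: [Int]) { storage = IntStorage(elements) }

  var count: Int { storage.elements.count }

  subscript(index: Int) -> Int {
    get { storage.elements[index] }
    set {
      // Duplicate the storage only if another value still refers to it.
      if !isKnownUniquelyReferenced(&storage) {
        storage = IntStorage(storage.elements)
      }
      storage.elements[index] = newValue
    }
  }

  /// True when both values currently alias the same storage.
  func sharesStorage(with other: CowArray) -> Bool { storage === other.storage }
}

/// Bubble sort that writes to the array only when two elements are out of order.
func sort(array: inout CowArray) {
  for i in 0..<array.count {
    for j in 0..<(array.count - 1 - i) where array[j] > array[j + 1] {
      let smaller = array[j + 1]
      array[j + 1] = array[j]
      array[j] = smaller
    }
  }
}

// Mirrors Example 6.1.
var a0 = CowArray([1, 2])
a0[1] = 3          // unique storage: mutated in place
var a1 = a0        // no copy yet: both values alias the same storage
sort(array: &a1)   // already sorted: no write, hence no copy
let stillShared = a1.sharesStorage(with: a0)
```

After this program runs, stillShared is true. A later write such as a1[0] = 0 would find the storage shared, duplicate it, and leave a0 unchanged.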
instructions. We use that heuristic to exclude benchmarks exceeding a threshold and keep program
inputs that can terminate in a reasonable time. Finally, the number of mutating instructions can
be adjusted by modifying the weights used by the fuzzer to select the constructs that it
generates.

Well-known micro-benchmarks. We also port five well-known benchmarks by Marr et al. (2016):
Bounce, Mandelbrot, NBody, Permute and Queens. This choice is dictated by the limited set of
features in Swiftlet. The benchmarks we picked are implemented using imperative language
constructs such as arrays in Swift/Swiftlet/Scala and std::vector in C++.

7.3. Results of synthetic benchmarks
We report results for a data set measuring the execution time of 1344 randomly generated programs
across 20 different independent runs. We normalize execution times by the fastest implementation
per benchmark and report aggregated scores (Figure 14). We also report the 50th-percentile
normalized time across all benchmarks, classified by the number of mutations as a fraction of all
memory accesses (Figure 15).

[Figure 14: aggregated normalized scores for C++, Scala, Swiftlet and Swift on a logarithmic axis
ranging from 10^0 to 10^2.]

[Figure 15: line plot for Scala, Swift, Swiftlet and C++; logarithmic y-axis from 10^0 to 10^4,
x-axis from 0.0 to 0.8.]
Figure 15 Normalized running times (y-axis) relative to ratio of writes as fraction of all memory
accesses (x-axis) across all synthetic benchmarks.

[Figure 16: bar chart for Swift, C++, Scala and Swiftlet on bounce, mandelbrot, nbody, permute and
queens; logarithmic y-axis, with bar labels ranging from 0.48x to 11.06x.]
Figure 16 Normalized running times (y-axis) on micro-benchmarks.
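The normalization used in Section 7.3 amounts to dividing each running time by the fastest time measured for the same benchmark, and the 50th percentile is the median of the normalized times. The snippet below is a minimal sketch of that arithmetic; the function names and the sample timings are made up for illustration and are not measurements from the paper.

```swift
/// Divides each implementation's time by the fastest time measured for the same benchmark.
func normalize(_ times: [String: Double]) -> [String: Double] {
  guard let fastest = times.values.min(), fastest > 0 else { return [:] }
  return times.mapValues { $0 / fastest }
}

/// Median (50th percentile) of a non-empty sample.
func median(_ sample: [Double]) -> Double {
  precondition(!sample.isEmpty, "median of an empty sample is undefined")
  let sorted = sample.sorted()
  let middle = sorted.count / 2
  return sorted.count % 2 == 1 ? sorted[middle] : (sorted[middle - 1] + sorted[middle]) / 2
}

// Hypothetical timings (in seconds) of three implementations on one benchmark.
let normalized = normalize(["implA": 2.0, "implB": 4.0, "implC": 3.0])
```

With these inputs, the fastest implementation gets a score of 1.0 and the others get 2.0 and 1.5, so a score reads directly as a slowdown factor.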
interfaces, the latter being tied to reference semantics, sometimes leading to counter-intuitive
situations (Steimann 2021). Swift addresses that issue with value witnesses, implementing
different copy behaviors for reference types and value types.

Significant effort has been poured into techniques that optimize functional updates. One
well-established approach is fusion (Johann 2003), a process aimed at eliminating intermediate
data structures from expressions written as compositions of functions. Fusion, however, cannot
eliminate all intermediate structures, in particular when they are accessed by multiple consumers.
In that case, allocating and reclaiming temporary space may incur a significant overhead. Shaikhha
et al. (2017) propose to address this shortcoming by rewriting programs in a destination passing
style to guarantee efficient downstream stack-like allocation and compile-time garbage collection.
In Swiftlet, intermediate structures can be removed altogether using inout to perform in-place
part-wise mutation, or by relying on optimizations to substitute aliases for copies (Section 6).

Reinking et al. (2021) advocate for the use of reference counting as an automatic garbage
collection mechanism to allow efficient in-place updates of unique data structures, using borrowed
references to reduce reference counter updates (Ullrich & de Moura 2019). Unlike our naive
implementation of copy-on-write, their framework is able to generate faster code paths when
reference counts can be tracked statically.

9. Future work
This paper focuses on a single-threaded execution model, yet concurrent and parallel applications
have become ubiquitous. Fortunately, MVS offers promising prospects in that area. Specifically,
MVS is immune to data races (a condition in which two or more threads access the same memory
location concurrently) and provides a simple yet powerful framework to reason locally about
concurrent programs, akin to concurrent separation logic (Brookes & O’Hearn 2016). One future
direction is, therefore, to explore implementation strategies that leverage MVS to support
efficient parallelization.

Part of Swift’s concurrency model is based on actors (Agha 1990) with reference semantics. This
choice seems appropriate in light of the vast literature on actor-based concurrency, while
languages such as Pony (Clebsch et al. 2015) or Encore (Brandauer et al. 2015) already present
compelling arguments in favor of type-based aliasing restrictions for memory safety. Nonetheless,
revisiting concurrency without compromising on the constraints of MVS with respect to first-class
references is an exciting challenge.

For the sake of conciseness, Swiftlet leaves out protocols, the construct that Swift uses to
define constraints on generic types (Racordon & Buchs 2020). Protocols, however, present a number
of interesting issues to generate efficient code. One challenge, in particular, is to choose
between monomorphisation and type erasure (Griesemer et al. 2020). The former approach involves
generating multiple variants of the same generic code, specialized for different concrete types.
The latter involves settling for a common representation, typically by introducing indirections
(e.g., boxing). The best solution is likely a clever combination of both approaches. Hence,
extending Swiftlet to study these aspects is another interesting direction for future work.

10. Conclusion
We discuss implementation strategies to compile programming languages featuring mutable value
semantics, a paradigm that supports local reasoning by upholding the notion of value and excluding
references from the user’s programming model. These strategies are inspired by the Swift
programming language, which leverages the benefits of MVS for safety, correctness and efficiency.
To illustrate the details of our implementation, we introduce Swiftlet, a subset of Swift that
focuses on features essential for MVS, through a series of informal examples as well as a formal
operational semantics. Swiftlet supports compounds of heterogeneous data, dynamically sized lists,
type-erased containers, and closures.

We discuss a handful of simple yet efficient static and dynamic optimization techniques to
eliminate unnecessary copies. Finally, we evaluate the performance of MVS on a large set of
randomly generated programs with varying numbers of mutating operations, comparing Swift,
Swiftlet, Scala and C++. Our results provide empirical evidence in favor of MVS. Specifically,
they show that copy-on-write offers compelling performance gains and they highlight the benefits
of in-place, part-wise mutation over functional updates in programs with large numbers of writes.

Acknowledgments
The authors would like to thank the reviewers for their helpful comments and suggestions.

References
Agha, G. A. (1990). Actors - A model of concurrent computation in distributed systems. MIT Press.
Apple Inc. (2021). The swift programming language. https://docs.swift.org/swift-book/. (Retrieved
September 20, 2021)
Baker, H. G. (1992). Lively linear lisp: "look ma, no garbage!". ACM SIGPLAN Notices, 27(8),
89–98. Retrieved from https://doi.org/10.1145/142137.142162 doi: 10.1145/142137.142162
Baker, H. G. (1994). Linear logic and permutation stacks - the forth shall be first. SIGARCH
Comput. Archit. News, 22(1), 34–43. Retrieved from https://doi.org/10.1145/181993.181999
doi: 10.1145/181993.181999
Bierema, N. (2022). Immutable.js. https://github.com/immutable-js/immutable-js. (Retrieved
January 10, 2022)
Brandauer, S., Castegren, E., Clarke, D., Fernandez-Reyes, K., Johnsen, E. B., Pun, K. I., . . .
Yang, A. M. (2015). Parallel objects for multicores: A glimpse at the parallel language encore.
In M. Bernardo & E. B. Johnsen (Eds.), Formal methods for multicore programming (Vol. 9104,
pp. 1–56). New York, NY: Springer. Retrieved from https://doi.org/10.1007/978-3-319-18941-3_1
doi: 10.1007/978-3-319-18941-3_1
bourg (Ed.), Computer science logic (Vol. 2142, pp. 1–19). New York, NY: Springer. Retrieved from
https://doi.org/10.1007/3-540-44802-0_1 doi: 10.1007/3-540-44802-0_1
O’Neill, M. E. (2009). The genuine sieve of eratosthenes. Journal of Functional Programming,
19(1), 95–106. Retrieved from https://doi.org/10.1017/S0956796808007004
doi: 10.1017/S0956796808007004
Pierce, B. C. (2002). Types and programming languages (1st ed.). The MIT Press.
Potanin, A., Östlund, J., Zibin, Y., & Ernst, M. D. (2013). Immutability. In D. Clarke, J. Noble,
& T. Wrigstad (Eds.), Aliasing in object-oriented programming. types, analysis and verification
(Vol. 7850, pp. 233–269). Berlin: Springer. Retrieved from
https://doi.org/10.1007/978-3-642-36946-9_9 doi: 10.1007/978-3-642-36946-9_9
R Core Team. (2020). R: A language and environment for statistical computing [Computer software
manual]. Vienna, Austria. Retrieved from https://www.R-project.org/
Racordon, D., & Buchs, D. (2020). Featherweight swift: a core calculus for swift’s type system.
In R. Lämmel, L. Tratt, & J. de Lara (Eds.), Proceedings of the 13th ACM SIGPLAN international
conference on software language engineering, SLE 2020, virtual event, usa, november 16-17, 2020
(pp. 140–154). ACM. Retrieved from https://doi.org/10.1145/3426425.3426939
doi: 10.1145/3426425.3426939
Reinking, A., Xie, N., de Moura, L., & Leijen, D. (2021). Perceus: garbage free reference counting
with reuse. In S. N. Freund & E. Yahav (Eds.), PLDI ’21: 42nd ACM SIGPLAN international
conference on programming language design and implementation, virtual event, canada, june 20-25,
2021 (pp. 96–111). ACM. Retrieved from https://doi.org/10.1145/3453483.3454032
doi: 10.1145/3453483.3454032
Reynolds, J. C. (1998a). Definitional interpreters for higher-order programming languages.
Higher-Order and Symbolic Computation, 11(4), 363–397. doi: 10.1023/A:1010027404223
Reynolds, J. C. (1998b). Theories of programming languages. Cambridge University Press.
Reynolds, J. C. (2002). Separation logic: A logic for shared mutable data structures. In 17th IEEE
symposium on logic in computer science (LICS 2002), 22-25 july 2002, copenhagen, denmark,
proceedings (pp. 55–74). IEEE Computer Society. Retrieved from
https://doi.org/10.1109/LICS.2002.1029817 doi: 10.1109/LICS.2002.1029817
ings of the 6th ACM SIGPLAN international workshop on functional high-performance computing,
fhpc@icfp 2017, oxford, uk, september 7, 2017 (pp. 12–23). ACM. Retrieved from
https://doi.org/10.1145/3122948.3122949 doi: 10.1145/3122948.3122949
Siek, J., Lee, L.-Q., & Lumsdaine, A. (2002). The boost graph library: User guide and reference
manual. USA: Addison-Wesley Longman Publishing Co., Inc.
Simms, D. (2019). Valhalla. https://wiki.openjdk.java.net/display/valhalla. (Retrieved
September 20, 2021)
Smith, F., Walker, D., & Morrisett, G. (2000). Alias types. In G. Smolka (Ed.), Programming
languages and systems, 9th european symposium on programming, ESOP 2000, held as part of the
european joint conferences on the theory and practice of software, ETAPS 2000, berlin, germany,
march 25 - april 2, 2000, proceedings (Vol. 1782, pp. 366–381). Berlin: Springer. Retrieved from
https://doi.org/10.1007/3-540-46425-5_24 doi: 10.1007/3-540-46425-5_24
Steimann, F. (2021). The kingdoms of objects and values. In Onward! ACM.
Stepanov, A., & McJones, P. (2009). Elements of programming (1st ed.). Boston, MA: Addison-Wesley
Professional.
Stepanov, A., & Rose, D. E. (2014). From mathematics to generic programming (1st ed.). Boston, MA:
Addison-Wesley Professional.
Strachey, C. S. (2000). Fundamental concepts in programming languages. High. Order Symb. Comput.,
13(1/2), 11–49. Retrieved from https://doi.org/10.1023/A:1010000313106
doi: 10.1023/A:1010000313106
Stucki, N., Rompf, T., Ureche, V., & Bagwell, P. (2015). RRB vector: a practical general purpose
immutable sequence. In K. Fisher & J. H. Reppy (Eds.), Proceedings of the 20th ACM SIGPLAN
international conference on functional programming, ICFP 2015, vancouver, bc, canada,
september 1-3, 2015 (pp. 342–354). ACM. Retrieved from https://doi.org/10.1145/2784731.2784739
doi: 10.1145/2784731.2784739
Tofte, M., Birkedal, L., Elsman, M., & Hallenberg, N. (2004). A retrospective on region-based
memory management. Higher-Order and Symbolic Computation, 17(3), 245–265. Retrieved from
https://doi.org/10.1023/B:LISP.0000029446.78563.a4 doi: 10.1023/B:LISP.0000029446.78563.a4
Tov, J. A., & Pucella, R. (2011). Practical affine types. In T. Ball & M. Sagiv (Eds.),
Proceedings of the 38th ACM SIGPLAN-SIGACT symposium on principles of programming languages,
POPL 2011, austin, tx, usa, january 26-28, 2011 (pp. 447–458). ACM. Retrieved from
https://doi.org/10.1145/1926385.1926436 doi: 10.1145/1926385.1926436
Rytz, L., Amin, N., & Odersky, M. (2013). A flow-insensitive, modular effect system for purity.
In W. Dietl (Ed.), Formal techniques for java-like programs (pp. 4:1–4:7). New York,
NY: ACM. Retrieved from https://doi.org/10.1145/2489804 Turner, J. (2017). Rust 2017 survey results. https://blog.rust
.2489808 doi: 10.1145/2489804.2489808 -lang.org/2017/09/05/Rust-2017-Survey-Results.html. (Re-
Saeta, B., Shabalin, D., Rasi, M., Larson, B., Wu, X., Schuh, P., trieved April 8, 2021)
. . . Wei, R. (2021). Swift for tensorflow: A portable, flexible Ullrich, S., & de Moura, L. (2019). Counting immutable
platform for deep learning. beans: reference counting optimized for purely functional
Shabalin, D. (2020). Just-in-time performance without warm-up programming. In J. Stutterheim & W. Chin (Eds.), IFL ’19:
(Tech. Rep.). Lausanne, Swizerland: EPFL. Implementation and application of functional languages, sin-
Shaikhha, A., Fitzgibbon, A. W., Jones, S. P., & Vytiniotis, gapore, september 25-27, 2019 (pp. 3:1–3:12). ACM. Re-
D. (2017). Destination-passing style for efficient memory trieved from https://doi.org/10.1145/3412932.3412935 doi:
management. In P. Trinder & C. E. Oancea (Eds.), Proceed- 10.1145/3412932.3412935
1. If e1 and e2 are values, then e1 is an array instance of the form [l1, . . . , lk]. If e2 is a number c such that 0 ≤ c < k, then by E-ELEM we have ∆ ⊢ π; η; [l][c] −→ π′; η; v.
2. If e1 and e2 are values, but e2 is not a number, or is not in the range 0 ≤ c < k, then the evaluation is stuck at an invalid array subscript.
3. If e1 is a value but e2 is not, then by ESS-CONTEXT we have ∆ ⊢ π; η; v[e2] −→ π′; η′; v[e2′].
4. If e1 is not a value, then by ESS-CONTEXT we have ∆ ⊢ π; η; e1[e2] −→ π′; η′; e1′[e2].

sub-case e.x: The typing derivation is governed by T-LETPROPREF or T-VARPROPREF. By the fact that ∆ ⊢ π; η : Γ, we know that e : s and x is a field of s.

1. If e is a value, then it is a structure instance of the form [l]s and by ESS-PROP we have ∆ ⊢ π; η; [l]s.x −→ π′; η′; v.
2. If e is not a value, then by ESS-CONTEXT we have ∆ ⊢ π; η; e.x −→ π′; η′; e′.x.

case [e]: The typing derivation is governed by T-ARRAYLIT.

1. If all elements in e are values, then by ESS-ARRAYLIT we have ∆ ⊢ π; η; [v] −→ π′; η′; [l].
2. If there exists ei that is not a value, then we know that ej for all 1 ≤ j < i are values and by ESS-CONTEXT we have ∆ ⊢ π; η; [v1, . . . , vi−1, ei, ei+1 . . . ek] −→ π′; η′; [v1, . . . , vi−1, ei′, ei+1 . . . ek].

case s(e1, . . . , ek): The typing derivation is governed by T-STRUCTLIT. By the fact that ∆; Γ; π ⊢ s(e) : τ, we know that struct s{m1 x1 : τ1, . . . , mk xk : τk} ∈ ∆.

1. If all elements e1, . . . , ek are values, then by ESS-STRUCTLIT we have ∆ ⊢ π; η; s(v) −→ π′; η′; [l]s.
2. If there exists ei that is not a value, then we know that ej for all 1 ≤ j < i are values, and by ESS-CONTEXT we have ∆ ⊢ π; η; s(v1, . . . , vi−1, ei, ei+1 . . . ek) −→ π′; η′; s(v1, . . . , vi−1, ei′, ei+1 . . . ek).

case func x0(x : p) → τλ {[y1, . . . , yh] in e1} in e2: The typing derivation is governed by T-FUNC. By the fact that ∆ ⊢ π; η : Γ, we know that {y1, . . . , yh} ⊆ dom(η). Therefore mkenv(π, y1 : η(y1), . . . , yh : η(yh)) is defined. Then by ESS-FUNCLIT we have ∆ ⊢ π; η; func x0(x : p) → τλ {[y1, . . . , yh] in e1} in e2 −→ π″; µ″, η; e2; pop l.

case e(a1, . . . , ak): The typing derivation is governed by T-CALL. We know that e : (p1, . . . , pk) → τ. Further, by Lemma A.1 and the fact that ∆; Γ; π ⊢ e(a1, . . . , ak) : τ, we know that inout arguments do not denote overlapping memory locations.

1. If e is a value λ(x : p, ηλ, eλ), then by induction hypothesis, for all elements ai, either ai is a value, or the program can take a step.
   (a) If all arguments a1, . . . , ak are values, then by ESS-CALL we have ∆ ⊢ π; η; v0(v) −→ π′; η′, η′; eλ; pop {li | i ∈ Icpy}.
   (b) If there exists ai that is not a value, then we know that aj for all 1 ≤ j < i are values and the type derivation for ai must have either of the following forms, depending on whether ai is a regular argument e or an inout argument &r.

          ∆; Γ ⊢ e : τ            ∆; Γ ⊢path r : var τ
          ------------------      ------------------------
          ∆; Γ ⊢arg e : τ         ∆; Γ ⊢arg &r : inout τ

       i. If ai ≡ e, then by induction hypothesis e takes a step.
       ii. If ai ≡ &r, and r is a location, then ESS-INOUT applies.
       iii. If ai ≡ &r, and r is not a location, then r takes a step via −→lv.
       In all cases, by ESS-CONTEXT we have ∆ ⊢ π; η; v0(v1, . . . , vi−1, ei, ei+1 . . . ek) −→ π′; η′; v0(v1, . . . , vi−1, ei′, ei+1 . . . ek).
2. If e is not a value, then by ESS-CONTEXT we have ∆ ⊢ π; η; e(a) −→ π′; η′; e′(a).

case b = ex in e: The typing derivation is governed by T-BINDING.

1. If ex is a value, then by ESS-BINDING we have ∆ ⊢ π; η; b = v in e −→ π′; η′; e; pop l.
2. If ex is not a value, then by ESS-CONTEXT we have ∆ ⊢ π; η; b = ex in e −→ π′; η′; b = ex′ in e.

case r = er: The typing derivation is governed by T-ASSIGN and must have the form

          ∆; Γ ⊢ er : τ     ∆; Γ ⊢path r : var τ
          ----------------------------------------
                   ∆; Γ ⊢ r = er : ()

There are three situations to consider:

1. If er is a value and r is a location, then by ESS-ASSIGN we have ∆ ⊢ π; η; var l = v −→ π′; η′; ().
2. If er is not a value and r is a location, then by ESS-CONTEXT we have ∆ ⊢ π; η; var l = er −→ π′; η′; var l = er′.
3. If r is not a location, then we have ∆ ⊢ π; η; r = er −→lv π′; η′; r′ = er.

case e1 ? e2 : e3: The typing derivation is governed by T-COND. There are three cases to consider.

1. If e1 is a value different from 0, then by ESS-COND-T we have ∆ ⊢ π; η; v ? e2 : e3 −→ π; η; e2.
2. If e1 is the value 0, then by ESS-COND-F we have ∆ ⊢ π; η; 0 ? e2 : e3 −→ π; η; e3.
3. If e1 is not a value, then by ESS-CONTEXT we have ∆ ⊢ π; η; e1 ? e2 : e3 −→ π′; η′; e1′ ? e2 : e3.
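The three situations of the conditional case can be made concrete with a small executable sketch. The Swift code below is only an illustration under simplifying assumptions: the `Expr` type, its `add` case (a stand-in for any non-value expression), and the `step` function are hypothetical names that do not appear in the core language, and the stores π and η are omitted because the conditional rules leave them unchanged except through the context rule.

```swift
// Hypothetical miniature expression language (not the paper's core language).
indirect enum Expr: Equatable {
    case num(Int)                  // the only kind of value in this sketch
    case add(Expr, Expr)           // stand-in for an arbitrary non-value expression
    case cond(Expr, Expr, Expr)    // e1 ? e2 : e3
}

/// Performs one evaluation step, or returns nil when `e` is already a value.
func step(_ e: Expr) -> Expr? {
    switch e {
    case .num:
        return nil
    case let .add(.num(a), .num(b)):
        return .num(a + b)
    case let .add(.num(a), rhs):
        // Context rule: the left operand is a value, so the right one steps.
        return step(rhs).map { Expr.add(.num(a), $0) }
    case let .add(lhs, rhs):
        // Context rule: the left operand steps first.
        return step(lhs).map { Expr.add($0, rhs) }
    case let .cond(.num(c), e2, e3):
        // Analogue of ESS-COND-T (c differs from 0) and ESS-COND-F (c is 0).
        return c != 0 ? e2 : e3
    case let .cond(e1, e2, e3):
        // Analogue of ESS-CONTEXT: the condition is not a value yet.
        return step(e1).map { Expr.cond($0, e2, e3) }
    }
}

let program = Expr.cond(.add(.num(1), .num(-1)), .num(10), .num(20))
let afterContextStep = step(program)               // the condition reduces to 0
let afterCondStep = afterContextStep.flatMap(step) // the false branch is selected
```

Every conditional in this sketch is either handled by one of the two value cases or takes a step through its condition, which mirrors the case analysis of the proof.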
case e(a): The typing derivation is governed by T-CALL. Since e(a) is well-typed by assumption, e has type (p) → τ and each argument ai has type pi.

1. If e is a value, then e is a function object of the form λ(x : p, ηλ, eλ). Then the evaluation step depends on the arguments:
   (a) If all elements in a are values, then we know that e(a) steps with ESS-CALL.
       – We pick Γ′ such that each parameter name xi is mapped onto its type pi, either mutably if pi = inout τi for some τi, or immutably otherwise, and each name in dom(ηλ) is mapped onto its type and mutability in π.
       – Since e is well-typed by assumption, we know that ∆ ⊢ π; η; e(a) −→ π′; η′, η; eλ; pop l. We assume ∆; Γ′; η′ ⊢ eλ; pop l holds by induction.
       – Finally we must show that ∆ ⊢ π′; ηλ, η : Γ. Since e is well-typed, the free variables in eλ are arguments in x, inserted in Γ′ using the argument list. Furthermore, since ESS-CALL checks for uniqueness of inout arguments, we know that each parameter name is either mapped onto a new location in π′ or its location cannot overlap with another location in π′. Hence, we have ∆ ⊢ π′; η′, η : Γ′.
   (b) If there exists ai that is not a value, then we know that aj for all 1 ≤ j < i are values and the type derivation for ai must have either of the following forms, depending on whether ai is a regular argument e or an inout argument &r.

          ∆; Γ ⊢ e : τ            ∆; Γ ⊢path r : var τ
          ------------------      ------------------------
          ∆; Γ ⊢arg e : τ         ∆; Γ ⊢arg &r : inout τ

       i. If ai ≡ e, then by induction hypothesis e takes a step and its type is preserved.
       ii. If ai ≡ &r, and r is a location, then ESS-INOUT applies and r’s type is preserved.
       iii. If ai ≡ &r, and r is not a location, then r takes a step via −→lv and r’s type is preserved.
       In all cases, e(a) takes a step and its type is preserved.
2. If e is not a value, then by induction hypothesis e takes a step and its type is preserved.

case b = e1 in e2: The typing derivation is governed by T-BINDING.

1. If e1 is a value, then we know that m x : τ = e1 in e2 steps with ESS-BINDING.
   – We pick Γ′ = Γ[x ↦ m τ].
   – We assume ∆; Γ′; π′ ⊢ e2 : τ by induction.
   – Finally we must show that ∆ ⊢ π′; η′, η : Γ′. We know that dom(π′) \ dom(π) = {l}, and π′(l) = m v1. We know that ∆ ⊢ π; η, η : Γ by assumption, therefore we can show that ∆ ⊢ π′; η′, η : Γ′ holds.
2. If e1 is not a value, then by induction hypothesis e1 takes a step and its type is preserved.

case r = er: The typing derivation is governed by T-ASSIGN and must have the form

          ∆; Γ ⊢ er : τ     ∆; Γ ⊢path r : var τ
          ----------------------------------------
                   ∆; Γ ⊢ r = er : ()

There are three situations to consider:

1. If er is a value and r is a location, then we know that r = er steps with ESS-ASSIGN. We know that r has type var τ and er has type m τ by assumption, therefore the typing context does not change. We pick Γ′ = Γ and we have ∆; Γ; π ⊢ r = er : (), as requested.
2. If er is not a value and r is a location, then by induction hypothesis er takes a step and its type is preserved.
3. If r is not a location, then r takes a step via −→lv and its type is preserved.

case e1 ? e2 : e3: The typing derivation is governed by T-COND.

1. If e1 is a value different from 0, then e1 ? e2 : e3 takes a step with ESS-COND-T. We pick Γ′ = Γ and we have ∆; Γ; π ⊢ e2 : τ, as requested.
2. If e1 is the value 0, then e1 ? e2 : e3 takes a step with ESS-COND-F. We pick Γ′ = Γ and we have ∆; Γ; π ⊢ e3 : τ, as requested.
3. If e1 is not a value, then by induction hypothesis e1 takes a step and its type is preserved.

case e; pop l: The typing derivation is governed by T-POP. There are two situations to consider:

1. If e is a value, then e; pop l takes a step with ESS-POP. We pick Γ′ = Γ and we have ∆; Γ′; π ⊢ v : τ, as requested.
2. If e is not a value, then by induction hypothesis e takes a step and its type is preserved.

case e1; e2: The typing derivation is governed by T-SEQ. There are two situations to consider:

1. If e1 is a value, then v; e2 takes a step with ESS-SEQ. We pick Γ′ = Γ and we have ∆; Γ′; π ⊢ e2 : τ, as requested.
2. If e1 is not a value, then by induction hypothesis e1 takes a step and its type is preserved.
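The call case relies on inout arguments denoting non-overlapping locations, which ESS-CALL checks before binding the parameters. As an illustration only, the Swift sketch below shows the surface-level counterpart of that requirement. The function `transfer` and the variable names are hypothetical, and relating the core rule to Swift's exclusivity diagnostics is our reading for the purpose of this example rather than a statement made by the proof.

```swift
// Hypothetical helper with two inout parameters.
func transfer(_ amount: Int, from source: inout Int, to target: inout Int) {
    source -= amount
    target += amount
}

var checking = 100
var savings = 20

// The two inout arguments denote distinct locations, so the call is accepted
// and each parameter is bound to its own location for the duration of the call.
transfer(30, from: &checking, to: &savings)

// The call below is left commented out because the Swift compiler rejects it:
// both inout arguments would access `checking` at the same time.
// transfer(30, from: &checking, to: &checking)
```

Because overlapping inout arguments are ruled out before the body runs, the body of `transfer` can be reasoned about as if `source` and `target` were independent local variables, which is the property the preservation argument uses when it extends the typing context.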