Operational Semantics for A Declarative Sequential Kernel Language
Table of Contents
1 Semantics - Synopsis
should be defined in a simple mathematical model helps us reason abt correctness, execution time, memory usage etc.
4 widely used approaches.
Model | Description | Suitable for | Example |
---|---|---|---|
Operational | how to execute | Imperative | |
a statement in terms of an | Declarative | OZ | |
abstract machine | Logical | ||
Axiomatic | Statement = relation | statement | Hoare |
b/w input and output states | sequence | Logic | |
relation is a logical | stateful | ||
models | |||
Denotational | statement = function over | Declarative | |
an abstract domain | |||
Logical | statement = model for | ||
logical theorems | Declarative | ||
Relational |
Two step approach: (1) Translate every program into an equivalent one in a simple core language called the kernel language. (2) Define operational semantics for the kernel language.
Other approaches: Translator Approach (1) Foundational Calculus approach: translate into a mathematical calculus. Example, the lambda calculus, first-order logic, or pi-calculus. (2) Translation directly into instructions on an abstract machine, or a virtual machine.
Interpreter Approach o Language Semantics is defined by an interpreter. o Linguistic Abstraction == extending the interpreter. o Interpreter is a program in L1 that accepts programs in L2 and executes them. o Metacircular Evaluator : L1 = L2
- Self-contained implementation of linguistic abstractions
- does not preserve execution-time complexity
- basic concepts interact inside the interpreter.
————————————————————————
2 Single-Assignment Store
set of variables that are initially uninitialized. can be assigned to one value They are called declarative variables (dataflow variables).
A declarative variable is indistinguishable from its value.
Value Store: persistent mapping of variables to values.
- We can compute with partial values. e.g. Bind an unbound argument as the output of a procedure
2.1 Value Creation (Binding)
x = 314
(1) Construct value in the store (2) Bind x to this value.
If x is already bound, '=' tests whether the operands are compatible. Incompatible operands cause an exception.
2.2 Variable Identifier
textual name that refers to a store entity from outside the store.
Environment {<X> -> x}
A variable identifier can refer to a value or another bound variable.
Following the links of bound variables is called dereferencing. This is invisible to the programmer.
2.3 Partial Values
Data structure that may contain unbound variables.
2.4 Variable-to-variable binding
X = Y => forms an equivalence class in the store X = [1 2 3] {x1, x2}
We can even have circular chain of references:
X = [1 2 X]
2.5 Use before binding
Ways of handling:
- No error in execution: Garbage Output. e.g. C/C++
- No error in execution, initialized to default value.
- Exception in Runtime : Java
- Compiler Error
- Wait until possibly another path binds the variable.
(3) and (4) are good strategies for sequential programs. (5) is a good strategy for concurrent programs. Could it be bad for sequential programs? top
3 Kernel Language
3.1 Backus Naur Form (Above Kernel Language)
3.2 Language
Let <x> and <y> denote any variable identifier, <s> any statement, <v> any value, and <pattern> some pattern.
Statement | Description |
---|---|
s::= | |
skip | empty statement |
<s>1 <s>2 | Statement sequence |
local <x> in <s> end | Variable creation |
<x>1 = <x>2 | Variable-Variable binding |
<x> = <v> | Value creation |
if <x> | Conditional |
then <s>1 | |
else <s>2 end | |
case <x> | Pattern matching |
of <pattern> then <s>1 | |
else <s>2 end | |
{<x> <y>1 … <y>m} | Procedure application |
The following are the value expressions in the language (the <v> above.)
Nonterminal | Expression | Comment |
---|---|---|
<v> | <number> OR | Note that procedures |
<record> OR | are values | |
<procedure> | ||
<number> | <int> OR | Integers are |
<float> | arbitrarily large | |
Floats are of | ||
arbitrary precision | ||
<record>, <pattern> | <literal> OR | nil, 'fine day' |
<literal>(<feature>1: <x>1 | are patterns | |
… | ||
<feature>n: <x>n) | ||
<procedure> | proc{$ <x>1 … <x>m} | Procedure values are |
<s> end | called closures | |
<literal> | <atom> OR <bool> | |
<feature> | <atom> OR <bool> OR <int> | integers are default |
features if labels are | ||
omitted. | ||
<bool> | true OR false |
A type is a value together with operations defined on it.
The system has a well-defined set of types called the basic types. For example, the following is a fragment of the hierarchy of the basic types. This hierarchy is ordered by set inclusion (for example, every list is a tuple)
Value | +-------------------------------------+... | | | Number Record Procedure | | +-------+ Tuple | | | Int Float +--------------------+... | | | Char Literal List | | +-----------+... String | | Bool Atom
Abstract Data Types: User-defined types which are not basic.
Oz is dynamically typed. Type errors in the (basic) declarative model cause immediate termination. If we extend the model with exceptions, type errors can be handled within the system.
Notes on basic types: Numbers: Unary minus is written with a tilde prefix: e.g. ~10 Strings: A string "ex" is a list of character codes [101 121] Procedures: <x> = proc {$ <y>1 … <y>m} <s> end has the syntactic shortcut proc {<x> <y>1 … <y>m} <s> end
This has two operations which are distinct in the elaborate version: (1) creating the procedure value (2) Binding it to the identifier <x>
Operations (on basic types)
Operation | Description | Argument Type |
---|---|---|
A==B | Value | |
A\=B | Not Equals | " |
{IsProcedure B} | " | |
A=<B | !!!! | Number or Atom |
A<B | " | |
A>=B | !!!! | " |
A>B | " | |
A+B | Number | |
A-B | " | |
A*B | " | |
A div B | Integer Division | Int |
A mod B | Modulo | Int |
A/B | Integer Division | Float |
{Arity R} | Arity | Record |
{Label R} | " | |
R.F | Field selection | " |
4 Rationale
Can the kernel language be further reduced while keeping the expressivity unchanged?
Why are some of the above features considered basic?
4.1 Records
basic way to structure data. easy to compose, and decompose. decompose using patterns several utility methods: to find arity, to select a field etc.
Building block for (1) Object-Oriented Programming (see PlanePoint.oz) (2) Component-based Programming
4.2 Procedures
Why not objects? or functions?
Functions in Oz return a value.
Procedures can define entities that are not necessarily like mathematical functions. top
5 Kernel Language Semantics
Consists in evaluating functions over partial values.
Operational Semantics with respect to an abstract machine.
5.1 Basic Concepts
5.1.1 Simple execution
local A B in : create 2 variables in the store, make A, B point to them A = 11 : Bind A to 11. B = A + A : Add A to itself, bind the result to B end
5.1.2 Variable identifiers and Static Scoping
local X in X = 1 local X in X = 2 {Browse X} : value of an identifier is determined by end its innermost local declaration. {Browse X} Can be determined from the *text* of end the program.
How else can this be done? What could be dynamic scoping?
5.1.3 Procedures
Parameters are passed by reference. Does not return a value. Produces effects by binding its unbound arguments.
proc {Max X Y ?Z} ? is an annotation for o/p [No effect] if X>Y then Z=X else Z=Y end : Z is unbound when input, bound inside end the procedure.
proc {LB X ?Z} if X>Y then Z=X else Z=Y end : Y is a free variable. end
Free variables take the values they have in the text, when the procedure is defined (and not the value it has when it is invoked).
local Y in Y=0 proc {LB X ?Z} if X>Y then Z=X else Z=Y end : Y is 0, and not 15 (Static scoping) end local Y = 15 Z in {LB 5 Z} end end
When could dynamic scoping be useful? e.g. locally configurable parameters deeply passed-on arguments.
e.g. fn(Z,X1) : X1's value is used here ... f2(Y,X1) f1(X,X1) : X1's value is determined here /can be simplified with dynamic scoping
When could dynamic scoping be harmful? Harder to reason about programs from the text! Does not support modular reasoning. A procedure that was correct may behave incorrectly in other contexts.
5.1.4 Procedural Abstraction
Any statement can be made into the body of a procedure. The variables used in the statement can be made either into formal parameters of the procedure, or into free variables. The free variables are scoped statically.
5.1.5 Dataflow behavior
In a single assignment store, variables can be unbound.
When a statement needs a variable which is as yet unbound, it waits till the object is bound. (This could happen in a different thread of execution.) This is called dataflow behavior.
Dataflow behavior enables concurrency
Can be implemented in the abstract machine in a simple way.
5.2 Abstract Machine
5.2.1 Definitions
A single assignment store \s: a set of variables partitioned into (1) Equal but unbound variables (e.g. after a variable-to-variable assignment) (2) Variables bound to a a value - number, record or procedure.
e.g. {x1, x2=x3, x1=a|x2}
A stored variable bound to a value is indistiguishable from that value.
An environment E: mapping from variable identifiers to entities in the single assignment store.
e.g. E is { X->a, Y-y}
A semantic statement is a pair (<s>,E) where
<s> is a statement
E is an environment
Semantic statement relates the statement to what it references (many-to-many)
e.g. (local X in X=10 end, {X->x})
An execution state is a pair (ST, \s) where ST is a semantic stack: a stack of semantic statements \s is a single-assignment store.
Semantic Stack | Single-Assignment Store |
---|---|
(local X in X=10 end, {X->x}) | x->10 |
A computation is a sequence of execution states starting from the initial.
A single transition in a computation is called a computation step. A computation step is atomic, there are no visible intermediate states.
Questions:
(A) Why is an environment associated with every statement? Why not just have a global store?
(B) Why do we separate the identifiers and the value store variables?
5.2.2 Program Execution
A program is a statement <s>.
The initial execution state s
(<s>, O) | O |
o At each step, the first element of ST is popped and execution proceeds according to the form of the statement.
o The final execution state, if the program terminates, is one in which the semantic state is empty.
A semantic stack can be in
- Runnable State
- Terminated State
- Suspended state.
5.2.3 Calculating with Environments
E(<x>): entity associated with variable <x> in the store.
Adjunction: E+{<x>->x} overrides any existing mapping of <x> e.g. in lexical scoping, not reassignment.
Restriction: defines a subdomain of an existing one. e.g. E|<x>1, …, <x>n used in defining closures.
5.3 Nonsuspendable statements
Statement at | |
---|---|
the top of the | Operational Semantic Actions |
semantic stack | |
(skip, E) | execution is complete after popping this |
statement from the semantic stack. | |
(<s1 > <s2 >, E) | (1) Push (<s2>,E) on to the stack. |
(2) Push (<s1>,E) on to the stack. | |
Q: Why doesn't the environment change? | |
(local <x> | (1) Create a new variable x in the store |
in <s> end,E) | (2) E' = E + {<x> -> x} |
(3) Push (<s>,E') | |
(<x1>=<x2>,E) | Bind E(<x1>) to E(<x2>) in the store. |
(<x>=<v>,E) | (1) Create a new variable x in the store. |
(2) Construct the value represented by <v> | |
(3) Let <x> refer to it. | |
(4) All identifiers in <v> are replaced by | |
contents as given by E. | |
(5) Bind E(<x>) and <x> in the store. |
In the last case, the values can be numbers, records or procedures.
N.B. Procedure values are also called closures.
The procedure body <s> can have free variable identifiers. formal parameters external references
The procedure value is a pair (proc {$ <y>1 … <y>m} <s> end, CE) where CE is E|<z>1 … <z>k is the contextual environment.
5.4 Suspendable statements.
If a variable <x> is unbound, the execution is suspended until activation : E(<x>) must be determined.
In the declarative sequential model, if an execution is suspended, it will never continue.
| Statement at the top of semantic | Operational Semantic action | | stack | | |----------------------------------+---------------------------------------| | (if <x> then <s>1 else <s>2 end, | If E(<x>) is determined | | E) | If E(<x>) is not a boolean | | | raise an error condition | | | If E(<x>) is true | | | push(<s>1, E) | | | Else | | | push(<s>2, E) | | | Else | | Suspend execution | |----------------------------------+---------------------------------------| | ({<x> <y>1 ... <y>n}, | If E(<x>) is determined | | E) | If E(<x>) not a procedure value | | | or does not have n arguments | | | raise an error condition | | | If E(<x>)::(proc {$ <z>1 <z>n} | | | <s> end, CE) | | | push(<s>, CE+{<z>1 -> E(<y>1), | | | ..., <z>n -> E(<y>n)}) | | | Else | | | Suspend execution | |----------------------------------+---------------------------------------| | (case <x> of <lit>(<f>1:<x>1 | If E(<x>) is determined | | ... <f>n = <x>n) | If the label of E(<x>) is <lit> and | | then <s>1 else <s>2 end, E) | its arity is [<f>1 <f>n] | | | push(<s>1, E+{<x>1 -> E(<x>).f1, | | | ..., <x>n = E(<x>).fn}) | | | Else | | | push(<s>2, E) | |----------------------------------+---------------------------------------|
6 Execution Examples
6.1 Variable Identifiers and Static Scoping
-------------------------- local X in ------+ X=1 | local X in --+ <s> X=2 | | {Browse X} <s>1 | end --+ | {Browse Y} <s>2 | end ------+ --------------------------
Sequence of Execution states.
- The initial state of the semantic stack and the Store are
(s,emptyset) emptyset
- After executing the local statement, and binding X=1
(<s>1 <s>2 , {X->x}) {x->1}
- After executing the sequential composition
(<s>1, {X->x}) {x->1} (<s>2, {X->x})
- Executing local statement in <s>1
(X=2 {Browse X}, {X->x'}) {x', x->1} (<s>2, {X->x})
- After the binding X=2
({Browse X}, {X->x'}) {x'->2, ({Browse X}, {X->x}) x->1}
6.2 Procedure Definition and Calls
First, a procedure with no external references in its body.
1. local Copy in 2. local B in 3 local A in 4. Copy = proc {$ X ?Y} 5. Y=X 6. end 7. B = 1 8. {Copy B A} 9. skip 10. end 11. end 12. end
- The initial execution state is
(1-12:: emptyset) emptyset
- After executing the three local declarations,
(4-6::{Copy->c,B->b,A->a}) {c,b,a} (7::{Copy->c, B->b, A->a}) (8::{Copy->c, B->b, A->a}) (9::{Copy->c, B->b, A->a})
- After executing the two bindings,
(8::{Copy->c, B->b, A->a}) {c->(proc...end,emptyset), (9::{Copy->c, B->b, A->a}) b->1, a}
- The procedure invocation involves the following step which binds
the formal parameters X and Y to the store variables
corresponding to the actual parameters A and B. This is the
pass-by-value mechanism in the current semantics.
(5::{X->b, Y->a}) {c->(proc...end,emptyset), (9::{Copy->c, A->a, B->b}) {b->1}, a}
- After executing the procedure application,
(9::{Copy->c, B->b, A->a}) {c->(proc...end,emptyset), {a,b}->1}
The environment is unchanged since there are no external references in the body of the procedure.
- After executing
skip
, the statement stack is empty, and a is bound to 1.
Now, we consider a procedure with an external reference.
1. local Y in 2. Y=2 3. local CopyConstant in 4. local A in 5. CopyConstant = proc {$ B ?A} 6. A=Y 7. end 8. {CopyConstant 1 A} 9. end 10. end 11. end
The procedure value in this case is
(proc {$ B A} A = Y end, {Y->y}) ======
where the store contains
y->2
Procedure application has the following change: the new environment is computed starting from the contextual environment.
(6-6::{Y->y, B->b, A->a}) {CopyConstant->proc...end, a, b->1, y->2}
After the statement is executed, the stack is empty, with a bound to 2.
7 Unification
7.1 Introduction
The algorithm behind the binding operator ('=') and the logical operators entailment ('==') and disentailment ('\=').
The unification <X>=<Y> tries to make the partial values <X> and <Y>
"equal" by adding 0 or more bindings. "Equal" means that after
successful unification, <X>==<Y> would be true
.
Informally, unification works by "adding binding information to the store". We can clarify this by first considering some specific cases.
Consider some examples.
- The binding
X=Y
adds the information thatX
andY
are equal.Suppose
X
andY
were already bound before this. Then some additional bindings may be added to the store, e.g.X = r(1 2) Y=r(A B) X=Y
would bindA
to 1 andB
to 2 in the store. - Unification is symmetric:
2=X
andX=2
would both bind 2 toX
in the store. Similarlyr(A 2) = r(1 B)
would bindA
to 1 andB
to 2 in the store. - If two partial values are already equal, unification does nothing:
e.g.
2=2
,X=X
. - If the partial values are not compatible in type, then unification
fails, producing a
failure
exception. e.g.10=12
,r(A 2) = r(1 A)
.Executing the
fail
statement causes the unification to fail. - Cyclic structures can be unified. e.g.
X = 1|X
.Unification may result in new cyclic structures being formed. e.g.
X = f(a:X b:_) Y=f(a:_ b:Y) X=Y
creates a cyclic structureX = f(a:X b:X)
with 2 cycles.
7.2 Algorithm
The store σ is partitioned into
- Sets of unbound variables which are equal. (Each such set is called
an equivalence set). e.g. If X=Y, and X,Y,Z are unbound
variables, then the store contains the two equivalence sets
{x,y}
and{z}
. - Variables bound to values.
For the purpose of the algorithm that follows, note that a set refers always to the first case of a set of equivalent unbound variables.
7.2.1 Primitive Bind
The basic operation of unification. It binds all variables in an equivalence set.
- bind (S, <v>)
- binds all variables in the equivalence set S to
the number or record <v>. This function is
typically called when an unbound variable x ∈ S
is unified with <v>. e.g.
bind ( {x, y}, r(a:x) )
removes
x
andy
from the equivalence set{x,y}
and binds each of them separately tor(a:x)
. - bind(S, T)
- Merges the equivalence sets S and T. This is typically called when an unbound variable x ∈ S is unified with some unbound variable y ∈ T.
7.2.2 Unify(x,y)
Let x
, y
be two store variables. The algorithm Unify(x
, y
)
proceeds as follows. The cases are treated separately.
- Both unbound variables
- If x ∈ S and y ∈ T, then bind(S,T). Note that T is not necessarily distinct from S.
- One variable is bound to a value
- If x ∈ S and y is bound to a value, then bind(S,y). If y ∈ T and x is bound to a value, then bind(T,x). (Note that a bound variable is indistinguishable from its value.)
- Incompatible Record types
- If
x
is bound to r(f1: x1 … fn: xn) andy
is bound to r'(f1': y1 … fm': ym), where r ≠ r' or {f1, f2, …, fn} ≠ {f1', f2', …, fm'} then raise afailure
exception. - Compatible Record types
- If
x
is bound to r(f1: x1 … fn: xn) andy
is bound to r(f1: y1 … fn: yn), then- Do for
i
from 1 to n - Unify(xi, yi)
- Do for