Operational Semantics for A Declarative Sequential Kernel Language

1 Semantics - Synopsis
2 Single-Assignment Store
3 Kernel Language
- 3.1 Backus Naur Form (Above Kernel Language)
- 3.2 Language
4 Rationale
- 4.1 Records
- 4.2 Procedures
5 Kernel Language Semantics
6 Execution Examples
- 6.1 Variable Identifiers and Static Scoping
- 6.2 Procedure Definition and Calls
7 Unification
- 7.1 Introduction
- 7.2 Algorithm
  - 7.2.1 Primitive Bind
  - 7.2.2 Unify(x,y)

1 Semantics - Synopsis

should be defined in a simple mathematical model helps us reason abt correctness, execution time, memory usage etc.

4 widely used approaches.


Model	Description	Suitable for	Example
Operational	how to execute	Imperative
	a statement in terms of an	Declarative	OZ
	abstract machine	Logical
Axiomatic	Statement = relation	statement	Hoare
	b/w input and output states	sequence	Logic
	relation is a logical	stateful
		models
Denotational	statement = function over	Declarative
	an abstract domain
Logical	statement = model for
	logical theorems	Declarative
		Relational

Two step approach: (1) Translate every program into an equivalent one in a simple core language called the kernel language. (2) Define operational semantics for the kernel language.

Other approaches: Translator Approach (1) Foundational Calculus approach: translate into a mathematical calculus. Example, the lambda calculus, first-order logic, or pi-calculus. (2) Translation directly into instructions on an abstract machine, or a virtual machine.

Interpreter Approach o Language Semantics is defined by an interpreter. o Linguistic Abstraction == extending the interpreter. o Interpreter is a program in L1 that accepts programs in L2 and executes them. o Metacircular Evaluator : L1 = L2

Self-contained implementation of linguistic abstractions
does not preserve execution-time complexity
basic concepts interact inside the interpreter.

————————————————————————

2 Single-Assignment Store

set of variables that are initially uninitialized. can be assigned to one value They are called declarative variables (dataflow variables).

A declarative variable is indistinguishable from its value.

Value Store: persistent mapping of variables to values.

We can compute with partial values. e.g. Bind an unbound argument as the output of a procedure

2.1 Value Creation (Binding)

x = 314

(1) Construct value in the store (2) Bind x to this value.

If x is already bound, '=' tests whether the operands are compatible. Incompatible operands cause an exception.

2.2 Variable Identifier

textual name that refers to a store entity from outside the store.

Environment {<X> -> x}

A variable identifier can refer to a value or another bound variable.

Following the links of bound variables is called dereferencing. This is invisible to the programmer.

2.3 Partial Values

Data structure that may contain unbound variables.

2.4 Variable-to-variable binding

X = Y              =>  forms an equivalence class in the store
X = [1 2 3]            {x1, x2}

We can even have circular chain of references:

X = [1 2 X]

2.5 Use before binding

Ways of handling:

No error in execution: Garbage Output. e.g. C/C++
No error in execution, initialized to default value.
Exception in Runtime : Java
Compiler Error
Wait until possibly another path binds the variable.

(3) and (4) are good strategies for sequential programs. (5) is a good strategy for concurrent programs. Could it be bad for sequential programs? top

3 Kernel Language

3.1 Backus Naur Form (Above Kernel Language)

3.2 Language

Let <x> and <y> denote any variable identifier, <s> any statement, <v> any value, and <pattern> some pattern.


Statement	Description
s::=
skip	empty statement
<s>1 <s>2	Statement sequence
local <x> in <s> end	Variable creation
<x>1 = <x>2	Variable-Variable binding
<x> = <v>	Value creation
if <x>	Conditional
then <s>1
else <s>2 end
case <x>	Pattern matching
of <pattern> then <s>1
else <s>2 end
{<x> <y>1 … <y>m}	Procedure application

The following are the value expressions in the language (the <v> above.)


Nonterminal	Expression	Comment
<v>	<number> OR	Note that procedures
	<record> OR	are values
	<procedure>
<number>	<int> OR	Integers are
	<float>	arbitrarily large
		Floats are of
		arbitrary precision
<record>, <pattern>	<literal> OR	nil, 'fine day'
	<literal>(<feature>1: <x>1	are patterns
	…
	<feature>n: <x>n)
<procedure>	proc{$ <x>1 … <x>m}	Procedure values are
	<s> end	called closures
<literal>	<atom> OR <bool>
<feature>	<atom> OR <bool> OR <int>	integers are default
		features if labels are
		omitted.
<bool>	true OR false

A type is a value together with operations defined on it.

The system has a well-defined set of types called the basic types. For example, the following is a fragment of the hierarchy of the basic types. This hierarchy is ordered by set inclusion (for example, every list is a tuple)

                   Value
                    |
      +-------------------------------------+...
      |             |                       |
     Number        Record                 Procedure
      |             |
 +-------+         Tuple
 |       |          |
Int     Float       +--------------------+...
 |                  |                    |
Char                Literal            List 
                    |                    |
              +-----------+...         String
              |           |
             Bool        Atom

Abstract Data Types: User-defined types which are not basic.

Oz is dynamically typed. Type errors in the (basic) declarative model cause immediate termination. If we extend the model with exceptions, type errors can be handled within the system.

Notes on basic types: Numbers: Unary minus is written with a tilde prefix: e.g. ~10 Strings: A string "ex" is a list of character codes [101 121] Procedures: <x> = proc {$ <y>1 … <y>m} <s> end has the syntactic shortcut proc {<x> <y>1 … <y>m} <s> end

This has two operations which are distinct in the elaborate version: (1) creating the procedure value (2) Binding it to the identifier <x>

Operations (on basic types)


Operation	Description	Argument Type
A==B		Value
A\=B	Not Equals	"
{IsProcedure B}		"
A=<B	!!!!	Number or Atom
A<B		"
A>=B	!!!!	"
A>B		"
A+B		Number
A-B		"
A*B		"
A div B	Integer Division	Int
A mod B	Modulo	Int
A/B	Integer Division	Float
{Arity R}	Arity	Record
{Label R}		"
R.F	Field selection	"

top

4 Rationale

Can the kernel language be further reduced while keeping the expressivity unchanged?

Why are some of the above features considered basic?

4.1 Records

basic way to structure data. easy to compose, and decompose. decompose using patterns several utility methods: to find arity, to select a field etc.

Building block for (1) Object-Oriented Programming (see PlanePoint.oz) (2) Component-based Programming

4.2 Procedures

Why not objects? or functions?

Functions in Oz return a value.

Procedures can define entities that are not necessarily like mathematical functions. top

5 Kernel Language Semantics

Consists in evaluating functions over partial values.

Operational Semantics with respect to an abstract machine.

5.1 Basic Concepts

5.1.1 Simple execution

local A B in                 :  create 2 variables in the store, make
                                A, B point to them
  A = 11                     :  Bind A to 11. 
  B = A + A                  :  Add A to itself, bind the result to B
end

5.1.2 Variable identifiers and Static Scoping

local X in
  X = 1
  local X in
    X = 2
    {Browse X}               : value of an identifier is determined by 
  end                          its innermost local declaration.
  {Browse X}                   Can be determined from the *text* of
end                            the program.

How else can this be done? What could be dynamic scoping?

5.1.3 Procedures

Parameters are passed by reference. Does not return a value. Produces effects by binding its unbound arguments.

proc {Max X Y ?Z}                 ? is an annotation for o/p [No effect] 
  if X>Y then Z=X else Z=Y end  : Z is unbound when input, bound inside
end                               the procedure.

proc {LB X ?Z}
  if X>Y then Z=X else Z=Y end  : Y is a free variable.
end

Free variables take the values they have in the text, when the procedure is defined (and not the value it has when it is invoked).

local Y in
  Y=0
  proc {LB X ?Z}
    if X>Y then Z=X else Z=Y end  : Y is 0, and not 15 (Static scoping)
  end
  local Y = 15 Z in
    {LB 5 Z}
  end
end

When could dynamic scoping be useful? e.g. locally configurable parameters deeply passed-on arguments.

 e.g.     fn(Z,X1) : X1's value is used here
          ...
          f2(Y,X1)
          f1(X,X1) : X1's value is determined here
/can be simplified with dynamic scoping

When could dynamic scoping be harmful? Harder to reason about programs from the text! Does not support modular reasoning. A procedure that was correct may behave incorrectly in other contexts.

5.1.4 Procedural Abstraction

Any statement can be made into the body of a procedure. The variables used in the statement can be made either into formal parameters of the procedure, or into free variables. The free variables are scoped statically.

5.1.5 Dataflow behavior

In a single assignment store, variables can be unbound.

When a statement needs a variable which is as yet unbound, it waits till the object is bound. (This could happen in a different thread of execution.) This is called dataflow behavior.

Dataflow behavior enables concurrency

Can be implemented in the abstract machine in a simple way.

5.2 Abstract Machine

5.2.1 Definitions

A single assignment store \s: a set of variables partitioned into (1) Equal but unbound variables (e.g. after a variable-to-variable assignment) (2) Variables bound to a a value - number, record or procedure.

e.g. {x1, x2=x3, x1=a|x2}

A stored variable bound to a value is indistiguishable from that value.

An environment E: mapping from variable identifiers to entities in the single assignment store.

e.g. E is { X->a, Y-y}

A semantic statement is a pair (<s>,E) where

<s> is a statement

E is an environment

Semantic statement relates the statement to what it references (many-to-many)

e.g. (local X in X=10 end, {X->x})

An execution state is a pair (ST, \s) where ST is a semantic stack: a stack of semantic statements \s is a single-assignment store.


Semantic Stack	Single-Assignment Store
(local X in X=10 end, {X->x})	x->10

A computation is a sequence of execution states starting from the initial.

A single transition in a computation is called a computation step. A computation step is atomic, there are no visible intermediate states.

Questions:

(A) Why is an environment associated with every statement? Why not just have a global store?

(B) Why do we separate the identifiers and the value store variables?

5.2.2 Program Execution

A program is a statement <s>.

The initial execution state s


(<s>, O)	O

o At each step, the first element of ST is popped and execution proceeds according to the form of the statement.

o The final execution state, if the program terminates, is one in which the semantic state is empty.

A semantic stack can be in

Runnable State
Terminated State
Suspended state.

5.2.3 Calculating with Environments

E(<x>): entity associated with variable <x> in the store.

Adjunction: E+{<x>->x} overrides any existing mapping of <x> e.g. in lexical scoping, not reassignment.

Restriction: defines a subdomain of an existing one. e.g. E|_{<x>1, …, <x>n} used in defining closures.

5.3 Nonsuspendable statements


Statement at
the top of the	Operational Semantic Actions
semantic stack
(skip, E)	execution is complete after popping this
	statement from the semantic stack.
(<s₁ > <s₂ >, E)	(1) Push (<s_2>;,E) on to the stack.
	(2) Push (<s_1>;,E) on to the stack.
	Q: Why doesn't the environment change?
(local <x>	(1) Create a new variable x in the store
in <s> end,E)	(2) E' = E + {<x> -> x}
	(3) Push (<s>,E')
(<x_1>;=<x_2>;,E)	Bind E(<x_1>;) to E(<x_2>;) in the store.
(<x>=<v>,E)	(1) Create a new variable x in the store.
	(2) Construct the value represented by <v>
	(3) Let <x> refer to it.
	(4) All identifiers in <v> are replaced by
	contents as given by E.
	(5) Bind E(<x>) and <x> in the store.

In the last case, the values can be numbers, records or procedures.

N.B. Procedure values are also called closures.

The procedure body <s> can have free variable identifiers. formal parameters external references

The procedure value is a pair (proc {$ <y>1 … <y>m} <s> end, CE) where CE is E|_{<z>1 … <z>k} is the contextual environment.

5.4 Suspendable statements.

If a variable <x> is unbound, the execution is suspended until activation : E(<x>) must be determined.

In the declarative sequential model, if an execution is suspended, it will never continue.

| Statement at the top of semantic | Operational Semantic action           |
| stack                            |                                       |
|----------------------------------+---------------------------------------|
| (if <x> then <s>1 else <s>2 end, |  If E(<x>) is determined              |
| E)                               |   If E(<x>) is not a boolean          |
|                                  |      raise an error condition         |
|                                  |   If E(<x>) is true                   |
|                                  |      push(<s>1, E)                    |
|                                  |   Else                                |
|                                  |      push(<s>2, E)                    |
|                                  |  Else
|                                  |     Suspend execution                 |
|----------------------------------+---------------------------------------|
| ({<x> <y>1 ... <y>n},            | If E(<x>) is determined               |
| E)                               | If E(<x>) not a procedure value       |
|                                  |  or does not have n arguments         |
|                                  |  raise an error condition             |
|                                  | If E(<x>)::(proc {$ <z>1   <z>n}      |
|                                  |             <s> end, CE)              |
|                                  |  push(<s>, CE+{<z>1 -> E(<y>1),       |
|                                  |             ..., <z>n -> E(<y>n)})    |
|                                  | Else                                  |
|                                  |  Suspend execution                    |
|----------------------------------+---------------------------------------|
| (case <x> of <lit>(<f>1:<x>1     | If E(<x>) is determined               |
| ... <f>n = <x>n)                 |  If the label of E(<x>) is <lit> and  |
| then <s>1 else <s>2 end, E)      |   its arity is [<f>1   <f>n]          |
|                                  |     push(<s>1, E+{<x>1 -> E(<x>).f1,  |
|                                  |               ..., <x>n = E(<x>).fn}) |
|                                  |  Else                                 |
|                                  |    push(<s>2, E)                      |
|----------------------------------+---------------------------------------|

top

6 Execution Examples

6.1 Variable Identifiers and Static Scoping

--------------------------
local X in          ------+ 
  X=1                     |
  local X in        --+   <s>
    X=2               |   |
    {Browse X}      <s>1  |
  end               --+   |
  {Browse Y}        <s>2  |
end                 ------+
--------------------------

Sequence of Execution states.

The initial state of the semantic stack and the Store are
```
(s,emptyset)    emptyset
```
After executing the local statement, and binding X=1
```
(<s>1 <s>2 , {X->x})         {x->1}
```

After executing the sequential composition

(<s>1, {X->x})               {x->1}
(<s>2, {X->x})

Executing local statement in <s>1

(X=2 {Browse X}, {X->x'})    {x', x->1}
(<s>2, {X->x})

After the binding X=2

({Browse X}, {X->x'})        {x'->2, 
({Browse X}, {X->x})          x->1}

6.2 Procedure Definition and Calls

First, a procedure with no external references in its body.

1. local Copy in
2.   local B in
3        local A in
4.             Copy =   proc {$ X ?Y}
5.                         Y=X
6.                      end
7.             B = 1            
8.             {Copy B A}
9.             skip
10.       end
11.   end
12. end

The initial execution state is

(1-12:: emptyset)               emptyset

After executing the three local declarations,

(4-6::{Copy->c,B->b,A->a})        {c,b,a}
(7::{Copy->c, B->b, A->a})
(8::{Copy->c, B->b, A->a})
(9::{Copy->c, B->b, A->a})

After executing the two bindings,

(8::{Copy->c, B->b, A->a})        {c->(proc...end,emptyset),
(9::{Copy->c, B->b, A->a})         b->1,
                                   a}

The procedure invocation involves the following step which binds the formal parameters X and Y to the store variables corresponding to the actual parameters A and B. This is the pass-by-value mechanism in the current semantics.
```
(5::{X->b, Y->a})                  {c->(proc...end,emptyset),
(9::{Copy->c, A->a, B->b})                  {b->1},
                                          a}
```
After executing the procedure application,
```
(9::{Copy->c, B->b, A->a})        {c->(proc...end,emptyset),
                                   {a,b}->1}
```
The environment is unchanged since there are no external references in the body of the procedure.
After executing skip, the statement stack is empty, and a is bound to 1.

top

Now, we consider a procedure with an external reference.

1. local Y in
2.      Y=2
3.      local CopyConstant in
4.                  local A in
5.                        CopyConstant = proc {$ B ?A}
6.                                            A=Y               
7.                                       end                       
8.                        {CopyConstant 1 A}
9.          end
10.       end
11. end

The procedure value in this case is

(proc {$ B A} A = Y end,  {Y->y}) 
                          ======

where the store contains

y->2

Procedure application has the following change: the new environment is computed starting from the contextual environment.

(6-6::{Y->y, B->b, A->a})             {CopyConstant->proc...end,
                                       a,
                                       b->1,
                                       y->2}

After the statement is executed, the stack is empty, with a bound to 2.

top

7 Unification

7.1 Introduction

The algorithm behind the binding operator ('=') and the logical operators entailment ('==') and disentailment ('\=').

The unification <X>=<Y> tries to make the partial values <X> and <Y> "equal" by adding 0 or more bindings. "Equal" means that after successful unification, <X>==<Y> would be true.

Informally, unification works by "adding binding information to the store". We can clarify this by first considering some specific cases.

Consider some examples.

The binding X=Y adds the information that X and Y are equal.
Suppose X and Y were already bound before this. Then some additional bindings may be added to the store, e.g. X = r(1 2) Y=r(A B) X=Y would bind A to 1 and B to 2 in the store.
Unification is symmetric: 2=X and X=2 would both bind 2 to X in the store. Similarly r(A 2) = r(1 B) would bind A to 1 and B to 2 in the store.
If two partial values are already equal, unification does nothing: e.g. 2=2, X=X.
If the partial values are not compatible in type, then unification fails, producing a failure exception. e.g. 10=12, r(A 2) = r(1 A).
Executing the fail statement causes the unification to fail.
Cyclic structures can be unified. e.g. X = 1|X.
Unification may result in new cyclic structures being formed. e.g. X = f(a:X b:_) Y=f(a:_ b:Y) X=Y creates a cyclic structure X = f(a:X b:X) with 2 cycles.

7.2 Algorithm

The store σ is partitioned into

Sets of unbound variables which are equal. (Each such set is called an equivalence set). e.g. If X=Y, and X,Y,Z are unbound variables, then the store contains the two equivalence sets {x,y} and {z}.
Variables bound to values.

For the purpose of the algorithm that follows, note that a set refers always to the first case of a set of equivalent unbound variables.

7.2.1 Primitive Bind

The basic operation of unification. It binds all variables in an equivalence set.

bind (S, <v>)

binds all variables in the equivalence set S to the number or record <v>. This function is typically called when an unbound variable x ∈ S is unified with <v>. e.g.

bind ( {x, y},  r(a:x) )

removes x and y from the equivalence set {x,y} and binds each of them separately to r(a:x).

bind(S, T)

Merges the equivalence sets S and T. This is typically called when an unbound variable x ∈ S is unified with some unbound variable y ∈ T.

7.2.2 Unify(x,y)

Let x, y be two store variables. The algorithm Unify(x, y) proceeds as follows. The cases are treated separately.

Both unbound variables

If x ∈ S and y ∈ T, then bind(S,T). Note that T is not necessarily distinct from S.

One variable is bound to a value

If x ∈ S and y is bound to a value, then bind(S,y). If y ∈ T and x is bound to a value, then bind(T,x). (Note that a bound variable is indistinguishable from its value.)

Incompatible Record types

If x is bound to r(f₁: x₁ … f_n: x_n) and y is bound to r'(f₁^': y₁ … f_m^': y_m), where r ≠ r' or {f₁, f₂, …, f_n} ≠ {f₁^', f₂^', …, f_m^'} then raise a failure exception.

Compatible Record types

If x is bound to r(f₁: x₁ … f_n: x_n) and y is bound to r(f₁: y₁ … f_n: y_n), then

Do for i from 1 to n
Unify(x_i, y_i)

top