KernelCORE - Syntax and Semantics


Contents

Syntax

A Kernel/CORE program consists of type definitions and relation definitions. Those relations are flat, which means that a relation is a set of pairs of non-functional values (we call them flat, printable, or first-order values).

Types

In Kernel/CORE types are defined using two constructs: disjoint union (square brackets) and product (curly brackets).

type_exp   ::=   [ 'label type_exp, ..., 'label type_exp ]  |  { 'label type_exp, ..., 'label type_exp }
label  ::=  identifier

Unit type corresponds to the empty product, i.e., {}; it carries only one value also denoted by {}. Empty type which carries no value corresponds to the empty union, [].

Names of union and product fields are called labels and are prefixed by a single quote. Types can be named and recursive types can be defined.

type_def   ::=   <type> type_name <is> type_exp
type_name  ::=  identifier
E.g.:
 <type> List_of_bits <is> [ 'empty {}, '0 List_of_bits, '1 List_of_bits ];

Within one union or one product definition labels cannot repeat, however globally they can be reused. Thus, for example, we can write:

 <type> two_letter_alphabet <is> ['a {}, 'b {}];
 <type> three_letter_alphabet <is> ['a {},'b {}, 'c {}];

Every basic type defines an enumerable set of flat values, which are finite, edge labeled, rooted trees. Basic types are represented as vertices of a minimal labeled and-or graph, called type graph: product types are presented by and vertices and union nodes by or vertices. The unit type is an and vertex without children. The type graph can be seen as a deterministic top down tree automaton: the set of values of the type corresponds to the set of trees accepted by the corresponding tree automaton.

Relations

A relation f (which generalizes the notion of a function) is a set of ordered pairs of flat values, Df x If , where Df and If are two sets of values of some basic types. Intuitively, Df denotes the domain of f and If the image of f . The pair of basic types (tD,tI) corresponding to Df and If is called the type of f and denoted by f:tD -> tI.

A relation f from unit type, i.e., f:{} -> tI , is called a constant. A named relation definition in Kernel looks like follows:

 (Tx -> Ty) rel_name = ... ; 

The definition of the relation, represented above by "...", is an expression E of the following form:

E  ::=  rel  |  @label  |  $var  |  'label E  |  E . label  
       |  product  |  union  |  E :: E  |  ( $var P -> E )  
product ::=  { 'label E, 'label E, ... }   
union   ::=  [ E, E, ... ] 
rel, label, var ::=  identifier

where @label is a reference to a member of a product constructor, $var is a parameter name, and P is a template (also called pattern). The intuitive meaning of the above constructs are:

Intuitively, the product value constructor represents parallel composition in which correponding field values are calculated by evaluation of sub-expressions. A series of product annotations can be added just before the closing bracket, in order to specifies how the interaction of parallel sub-processes with environment is performed. Presently two cases are defined:

  1. no annotation - in this case the subprocesses interact with the environment synchronously on all channels; and
  2. <serialize> ChName - interaction of the subprocesses with the environment via channel ChName is serialized and the order of the interaction is determined by the actual enumeration order of the fields in the product constructor.

The union operation is just the union of relations. In general, union of functions can yield a relation which is not a function. We introduce union annotations which help us control that non-determinism during the program execution (run-time). The anotations are added just before the closing square bracket of the union.

  1. no annotation - the first sub-expression which is defined (during the run-time) is considered,
  2. <longest> ChName - the sub-expression which is defined and which interacts the longest with the environment via channel ChName is considered,
  3. <shortest> ChName - the sub-expression which is defined and which interacts the shortest with the environment via channel ChName is considered,
  4. <unknown> - any sub-expression which is defined can be considered (explicit non-determinism). That modifier has to occur as the last one.

It is required that every Kernel Language expression is of first order type, i.e., its domain and image types are flat (i.e., non-functional). In particular, an expression defining the body of a functional abstraction have to be a constant.

Moreover, every product constructor is required to define a constant, and thus the sub-expressions of a product constructor have to be constants as well.

Channel declaration

There are two special Kernel relation definitions, namely <input> and <output>. They are used to define channels (also called pipes). For example:
({} -> ['0, '1]) read_chan = <input>;  -- input pipe which carries bit values 
(['0, '1] -> {}) write_chan = <output>;  -- output pipe 
In expressions, channels can be used as any other Kernel relation. Notice that an input channel is always a constant and that an output channel image type is aways {}, i.e., that the domain type of an input pipe is unit, as is the image type of an output pipe.

Sub-expressions and free references

For every Kernel sub-expression p we define the set of the free references in the usual recursive way:

  1. If p is a parameter name or a label of a product constructor, i.e., p=$x or @l, then free(p)={p}.
  2. If p is a relation name, then free(p) is the empty set.
  3. If p is a composition or a union, i.e., p=p_1::p_2::... or [p_1,p_2,...] then free(p)=free(p_1) + free(p_2) + ...
  4. If p = {f_1 p_1, f_2 p_2, ...}, then free(p) = (free(p_1)+free(p_2) + ...) - {@f_1,@f_2,...}.
  5. If p = q.l or 'l q, then free(p) = free(q).
  6. If p = ($x ... -> q), then free(p)=free(q) - {$x}.

A sub-expression p is called closed if free(p) is empty.

Let V be a set of variable names, each typed by a flat type. We call context, any partial mapping from (a finite part of) V into values of corresponding types. E.g., if V = {x,y} with x of type Tx and y of type Ty, then C:{x -> vx, y -> vy} is a context, with vx,vy being some values of types Tx and Ty, respectively. The mapping defined by a context naturally extends to Kernel expressions.

Patterns (templates)

Templates play the role of filters for values of the actual parameters. Every flat type definition defines the set of values (trees) together with a syntax for unary predicates on them, which we call patterns. Intuitively, a pattern is a initial part of the tree (from the root) which selects all values with the same initial part.

P  ::=  type_exp  | { 'label P, ..., 'label P } | 'label P 

The semantics of a pattern for a type is a non-empty subset of values of the type.

In order to simplify the notation, we introduce pattern <any> which matches everything. Also, not all elements of a product type have to be explicitly enumerated. For example, let us consider the following type:

<type> Person <is> {'gender ['man, 'woman], 'name String, 'age Number};

Instead of writing: {'gender 'man, 'name String, 'age Number} for the pattern selecting men, one can write: {'gender 'man, 'name <any>, 'age <any>}, or simply: {'gender 'man, <any>}, as long as the type of the pattern can be deduced from the context.

Semantics

A Kernel program describes functional and behavioral aspects of computing. The set of synchronization events of a Kernel program is derived from channel declarations: unbounded unidirectional FIFO queues, also called pipes. Any sequence of those events is called a trace. Intuitively, a Kernel relation defines a relationship between argument values, resulting values, and traces.

More precisely, let c1,...,cd be channels. A trace m over those channels is a d-tuple of sequences, i.e., a list of d sequences, one per channel: m = (m[c1],m[c2], ..., m[cd]). Such multidimensional sequences form a monoid, where concatenation, denoted by dot ".", is defined compose-wise. We have m = m[c1].m[c2]. ... .m[cd].

Apart form concatenation, we will need another binary operation on traces, which however is not always defined. Intuitively, m * n denotes the shortest trace which is a continuation of m and n. I.e., w = m * n if and only if there exist traces m' and n' such that w = m.m' and w = n.n'. Moreover, for all traces m'',n'',u such that m'=m''.u and n'=n''.u, the trace u is the empty trace. In case when traces are uni-dimensional, then m * n equals m or n, if it exists at all. Otherwise we have: m * n = (m[c1]*n[c1], ..., m[cd]*n[cd]).

Non recursive definitions

With every Kernel sub-expression p of type (Ta -> Tb) we associate its semantics, sem(p), i.e., a mapping from a context defined on free(p) to a set of quadruples (m,n,a,b), where m,n are traces, a is a flat value of type Ta, and b is a flat value of type Tb. Instead of writing sem(p)(C), we will write sem(p)C. When p is closed then sem(p)C does not depend on C.

Intuitively, a quadruple (m,n,a,b) in sem(q) means, that q can map a into b with trace mn by effectively "consuming" m (the n part of the trace represents the "look-ahead").

unit
sem({})C = { (1,n,{},{}) | for any trace n }, where 1 is the empty trace
identifier
sem($x)C = { (1,n,{},C($x)) | for any trace n }
variant
sem(l q)C = { (m,n,a,l b) | (m,n,a,b) in sem(q)C }
projection
sem(q.l)C = { (m,n,a,b.l) | (m,n,a,b) in sem(q)C }
product
Let q = {l1 q1, l2 q2 <serialize> c1 ... <serialize> ck} and c1,...,ck,ck+1,...,cd be all channels.
(m,n,a,b) is in sem(q)C if and only if:
union
Let q = [ l1 q1, l2 q2 ] and c1,...,ck,ck+1,...,cd be all channels.
 $\sem{\texttt{FIRST}[q_1,q_2]}_C \df{=}
 \sem{q_1}_C\cup \\
 \{(m,nn',a,b) \mid n'\in M, (m,n,a,b)\in\sem{q_2}_C,
 \forall (m_1,n_1,a,b_1)\in\sem{q_1}_C m_1n_1\not\perp mnn'\}$
 $\sem{\texttt{LONGEST}[q_1,q_2]}_C \df{=}\\
 \{(m,n,a,b)\in\sem{q_1}_C\mid 
 \forall (m',n',a,b')\in\sem{q_2}_C 
 (m'n'\perp mn \Rightarrow m\not\leq m')\}
 \cup \\
 \{(m,n,a,b)\in\sem{q_2}_C\mid 
 \forall (m',n',a,b')\in\sem{q_1}_C 
 (m'n'\perp mn \Rightarrow m'<m)\}
 $


 $\sem{\texttt{SHORTEST}[q_1,q_2]}_C \df{=}\\
 \{(m,n,a,b)\in\sem{q_1}_C\mid 
 \forall (m',n',a,b')\in\sem{q_2}_C 
 (m'n'\perp mn \Rightarrow m'\not\leq m)\}
 \cup \\
 \{(m,n,a,b)\in\sem{q_2}_C\mid 
 \forall (m',n',a,b')\in\sem{q_1}_C 
 (m'n'\perp mn \Rightarrow m<m')\} $


\item[composition:]

 $\sem{q_1 :: q_2}_C \df{=} \\\{(m_1m_2,
 m_2^{-1}(n_1\vee m_2n_2), a, b) \mid
 (m_1,n_1,a,c)\in\sem{q_1}_C,(m_2,n_2,c,b)\in\sem{q_2}_C\}$

\item[abstraction:] $\sem{(\$x\,P \mapsto q)}_C\df{=} \{(m,n,a,b)\mid

 (m,n,\{\},b)\in\sem{q}_{C.[\$x\mapsto a]}\}$, where $C.[\$x\mapsto a]$
 denotes the following context:
 \[
 C.[\$x\mapsto a](v) \df{=}\left\{
   \begin{array}{ll}
     a & \mbox{ if } v = \$x \\
     C(v) & \mbox{ otherwise }
   \end{array}
 \right.
 \]

\end{description}

The above rules describe the semantics of all non-recursive Kernel definitions.


We say that $S=\sem{q}_C$ for a subexpression $q:T_1\mapsto T_b$ is: \begin{itemize} \item \emph{functional}, whenever $(m,n,a,b),(m,n,a,b')\in S$ implies

 $b=b'$;

\item \emph{prefix-deterministic}, whenever $(m,n,a,b),(m',n',a,b')\in S$

 and $mn\perp m'n'$ implies $m=m'$ and $b=b'$;

\item \emph{finite}, whenever $\{(m,a,b)\mid (m,n,a,b)\in S\}$ is

 finite;

\item \emph{rational}, whenever $\{(m,a,b)\mid (m,n,a,b)\in S\}$ is

 rational;

\item \emph{simple}, ??? \item \emph{k-simple}, ??? \end{itemize}

\begin{prop}

 All above operations on $S$ preserve the above properties of its
 arguments. I.e., if $\sem{q_i}_C$ are \emph{functional} (resp.,
 prefix-deterministic, finite, rational), then variant, projection,
 concat, inter, first-match, longest-match, shortest-match,
 composition, and abstraction are \emph{functional} (resp.,
 prefix-deterministic, finite, rational).

\end{prop}

\begin{proof}

 No proof so far. It has to be proved!

\end{proof}