|
An informal specification of BIR.
Authors: Robby, Matthew Hoosier
In this documentation, we describe the constructs available in
the Bogor modeling language. We use grammar in the Extended Backus-Naur
Form (EBNF) to describe BIR production rules and regular expression to
denote lexical definitions of terminal symbols. For simplicity, we
often use double quotes (") to denote terminal symbols. For example,
"system" denotes the terminal symbol equivalent to the regular
expression 's' 'y' 's' 't' 'e' 'm', where the single quotes (') denote
characters. We use angle brackets to denote non-terminal symbols (e.g.,
<system>). We also liberally use angle brackets to denote
terminal symbols defined using regular expressions.
In the following sections, we describe each BIR construct in
details. For each construct, we give its concrete syntax and some
well-formed-ness rules. When describing the concrete syntax grammar, we
will choose readability of the grammar over ambiguity issues. However,
in some important cases, we will explicitly present the issues and how
we resolve them. We use paragraphs with "Namespace" as the header to
describe namespaces of BIR constructs (whenever applicable).
Identifiers that belong to the same namespace must be distinct. In some
cases, a namespace can be shadowed by another namespace; we will
explicitly describe the well-formed-ness rules in those cases.
We also give the Java abstract syntax tree (AST) class for
each construct in paragraphs with "Abstract Syntax Tree" as the header.
Note
In some cases, the UML diagrams used to present these syntax tree class
hierarchy may be out-of-date. The tool used to generate these is
presently incompatible with the J2SE 5.0 language in which Bogor is
written.
Figure 1,
“Concrete Syntax for Systems�? presents the
concrete syntax for BIR models. A BIR model begins with a declaration
of the system identifier <system-id> followed by
declarations of its members:
Namespace. System has its own
namespace and it is not shared with any other constructs.
Abstract Syntax Tree.
The abstract syntax tree class for BIR system is the System
class[1].
The top level Java AST class for system member is the SystemMember
class.
Before we describe each system member, we first describe BIR identifiers,
types,
and literals,
as they are used in various places when describing the system members.
Figure 3,
“Concrete Syntax for Identifiers�? presents the
concrete syntax and lexical definitions of BIR identifiers. They are
categorized into two: (1) normal identifiers, and (2) type variable
identifiers. There are two kinds of normal identifiers: (a) basic
identifiers, and (b) Bogor identifiers. Basic identifiers are similar
to Java identifiers. That is, they cannot have sequences of characters
that are equal to BIR keywords, boolean literals, the null literal, and
the float and double special literals (see bir-keywords).
Bogor identifiers are provided for convenience when
translating from other languages because one can encode almost anything
inside the Bogor identifiers. For example, when translating a Java
program that has system as a local variable identifier, the identifier
will clash with the BIR "system" terminal symbol (i.e., the BIR system
keyword). One can resolve this issue by always encode Java identifiers
inside one of the Bogor identifiers. For example, one can use a
convention to always encode Java local variables names using the
"[|"..."|]" identifier delimiters to avoid name clashes (e.g.,
"[|system|]").
Type variable identifiers can only be used as variables for
types (see generic types).
Notice that in the lexical definitions for identifiers in Figure 3,
“Concrete Syntax for Identifiers�?, we use
parameterized terminal symbols for simplicity (e.g.
<bogor-id-char(x)>).
Notice also that we use escape characters that are usually used in
other programming languages such as Java (e.g., '\n' and the unicode
escape characters). For simplicity, we define identifiers with -id as
their suffixes to be equivalent as <id>, unless specified
otherwise.
Examples.
this_is_a_basic_id
{|this is a bogor id|}
(|this is a bogor id|)
<|this is a bogor
id|>
[|this is a bogor id|]
/|this is a bogor id|\
\|this is a bogor id|/
+|this is a bogor id|+
.|this is a bogor id|.
'this_is_a_type_var_id
Abstract Syntax Tree. Identifier
does not have dedicated Java abstract syntax tree class, instead, we
use java.lang.String
objects to represent BIR identifier.
Figure 4,
“Concrete Syntax for Types�? presents the concrete
syntax for BIR types. Similar to other modern programming languages,
BIR basic types are categorized into two general categories: (1)
primitive types, and (2) non-primitive (reference) types. In addition
to basic types, BIR features generic types (used strictly for
extensions). BIR also features functional types to support the
functional programming sub-language of BIR.
Abstract Syntax Tree.
The top level Java AST class for BIR (syntactic) type is the ASTType
class.
We describe each type category in details in the following
subsections. We use an italized font to denote variables representing
BIR values.
BIR primitive types consist of boolean types, integral types,
real types, and enumeration types, as well as special primitive types
such as the thread ID type used to hold BIR thread descriptors. In
addition, BIR features facilities to introduce new first-class
primitive types.
BIR does not support automatic primitive type promotions such
as in Java. For example, suppose we have a Java code as follows:
double x, z;
int y;
...
z := x + y;
Notice that in the last line, x
is automatically promoted to the double type before the addition is
evaluated. In contrast, x
should be casted explicitly in BIR. This restriction (or lack of
feature) is done to simplify the semantics of BIR constructs such as
the binary expressions. That is, when implementing the evaluation
binary expressions, one does not need to worry about type promotions.
Instead, all type promotions are handled in the cast
expression construct.
The boolean type "boolean" represents a type whose values
represent logical truth quantities (i.e., true and false).
Compatibility. The boolean type is
incompatible with the other types.
Default Value. The default value
for the boolean type is false.
Basic Operators. The
"&&", "||", "=>", "==", and "!=") binary
operators (see binary expression)
and the "!" unary operator (see unary expression)
can be used for boolean type.
Abstract Syntax Tree. The Java AST
class for (syntactic) boolean type is the ASTBooleanType
class.
There are four kinds of integral types:
-
the integral type "int", whose values range from
-2147483648 to 2147483647, inclusive.
-
int range types "int" "(" x
"," y
")", whose values range from x
to y,
inclusive. The range limitters x
and y
are either integer literals (see literals) or integer constants
(see constants).
Note that x
must be equal or less than y.
The minimum value for x
is -2147483647, and the maximum value for y
is 2147483647. Values outside the x
and y
range cannot be assigned/casted unless the "wrap" modifier is specified.
If the "wrap" modifier is specified, then values outside x and y are
wrapped so it become in range. There is a degree of freedom of what
should be done if a value outside of the range of an unwrapped range
type is assigned/casted to that type (e.g., raising an exception).
An advantage of having tighter bounds for values is that
one can compute fewer number of bits necessary to store the values a
priori. This is a very useful feature when doing explicit-state model
checking.
Note that the range is not checked statically, instead, it
is checked at run-time. Furthermore, range types do not affect the
semantics of arithmetic operators. That is, they are treated as if they
are non-range types. Thus, only the semantics of assignment (see assignment
action) and type casting (see cast expression)
are affected.
-
the long type "long", whose values range from
-9223372036854775808 to 9223372036854775807, inclusive.
-
long range types "long" "(" x
"," y
")", is similar to the int range type except that the mininum value for
x
is -9223372036854775808 and the maximum value for y is
9223372036854775807.
Compatibility. Integral types are
compatible with each other and with real types.
Default Values.
-
The default value of the int type is 0.
-
The default value of a int range type is 0, if x
<= 0 <= y.
Otherwise, the default value is x.
-
The default value of the long type is 0.
-
The default value of a long range type is 0, if x
<= 0 <= y.
Otherwise, the default value is x.
Basic Operators. The "+", "-",
"*", "/", "%", "==", "!=", "<", ">=", ">",
">=", "shl", "shr", "ushr", "&", "|", and "^" binary
operators (see binary expression)
and the "+" and "-" unary operators (see unary expression)
can be used for integral types.
Examples.
- int range types:
int (0, 8),
int wrap (-1, 8)
- long range types:
long (0,
10000000000L), long wrap
(0, 10000000000L)
Abstract Syntax Tree.
-
The Java AST class for (syntactic) int type is the ASTIntType
class.
-
The Java AST class for (syntactic) int range type is the ASTIntRangeType
class.
-
The Java AST class for (syntactic) long type is the ASTLongType
class.
-
The Java AST class for (syntactic) long range type is the ASTLongRangeType
class.
The BIR "float" and "double" real types follow Java's float
and double types, respectively (i.e., it uses the standard specified in
the Java Language Specification). The
special Not-a-Number (NaN) literal for BIR float type is "NaNf", while
the NaN literal for BIR double type is "NaNd". Furthermore, the special
positive/negative infinities for BIR float type and BIR double type are
"pINFf"/"nINFf" and "pINFd"/"nINFd", respectively.[2]
Compatibility. Real types are
compatible with each other and with integral types.
Default Values. The default value
of both float and double type is 0.0.
Basic Operators. The "+", "-",
"*", "/", "%", "==", "!=", "<", "<=", ">", and
">=" binary operators (see binary expression)
and the "+" and "-" unary operators (see unary expression)
can be used for float or double types.
Abstract Syntax Tree. The Java AST
class for (syntactic) float and double types are ASTFloatType
and ASTDoubleType
, respectively.
An enumeration type <enum-id> is a type whose
values are defined by the user using symbols (see enumerations).
Compatibility. Enumeration types
are incompatible with each other, and they are incompatible with other
types.
Default Value. The default value
of an enumeration type is the first declared element of the type.
Basic Operators. The "==" and "!="
binary operators (see binary expression)
can be used for enumeration types.
Abstract Syntax Tree. The Java AST
class for an enumeration (syntactic) type is the IdType
class. That is, it is parameterized by the enumeration
type's identifier.
The thread ID type represents a type whose values are BIR
thread descriptors. The intuition of having a separate type for thread
descriptors in a strongly typed language is that one can track where
the thread descriptors flow (i.e., it can only flow to a variable whose
type is the thread ID type).
Compatibility. The thread ID type
is incompatible with other types.
Default Value. The default value
of the thread ID type is the first thread descriptor created.
Basic Operators. The "==" and "!="
binary operators (see binary expression)
can be used for the thread ID type.
Abstract Syntax Tree. The Java AST
class for the thread ID (syntactic) type is the ASTThreadIdType
class.
4.1.6. Primitive
Extension Types
BIR supports introductions of new primitive types as
first-class types (see type extensions).
We call these types primitive extension types. A primitive extension
type is refered using the extension that declares the type followed by
dot (.) and the identifier of the type (e.g., <ext-id>
"." <prim-ext-type-id>). If the primitive extension type
is declared as a generic type, then the
arguments to its type variables must be supplied (i.e.,
<type-args>). This process is called instantiation of the
generic type. The arguments must match the constraints on the type
variables of the generic types (e.g., type compatibility and matching
number of parameters and arguments).
Compatibility. Primitive extension
types are incompatible with each other, and they are incompatible with
other types.
Default Value. The default values
of primitive extension types are defined by the module that declares
the types.
Basic Operators. The "==" and "!="
binary operators (see binary expression)
can be used for primitive extension types.
Abstract Syntax Tree. The Java AST
class for primitive type extension (syntactic) type is the ASTExtType
class.
Basic non-primitive types are reference types. There are five
kinds of non-primitive types: (1) record types, (2) array types, (3) a
string type, (4) a lock type, and (5) non-primitive extension types.
Compatibility. Non-primitive types
are incompatible with each other, with the exceptions of record and
array types.
Default Values. The default value
of non-primitive types is null.
Basic Operators. The "==" and "!="
binary operators (see records) can be used
for non-primitive types.
A record type <record-id> represents a type
whose values are named-tuple values consisting its fields' values (see record
declaration).
Abstract Syntax Tree. The Java AST
class for a record (syntactic) type is the IdType
class. That is, it is parameterized by the record type's
identifier.
An array type <basic-type> "[" "]" represents a
type whose values are fixed-size indexed collections of values of type
<basic-type>. Each array value has an implicit field
length that store the length/size of the array. The array index ranges
from 0 to the array length - 1.
Compatibility. In general, array
types are not compatible with each other. However, there is a degree of
freedom in this matter. For example, it is convenient to relax array
types compatibility when modeling Java programs as follows: an array
type is compatible with another array type if their ranks are equal
(i.e., the number of array dimensions), and the array base (non-array)
types are compatible for record types and equal for other types.
However, this relaxation causes the static type checking of BIR to
accept programs with bad types (unsound). This problem is also present
in Java and it can be illustrated as follows:
class Foo { public void bar(Object[] baz) { if (baz != null && baz.length > 0) { baz[0] = new Foo(); } } }
We can now have a bad Java code that uses the Foo class as follows:
new Foo().bar(new String[1]);
The above code passed the static type checking in Java, but,
it will fail at runtime (i.e., it raises an exception of type java.lang.ArrayStoreException).
Thus, when this relaxation is used in BIR, every assignment to an array
element should be type checked again at run-time.
Abstract Syntax Tree. The Java AST
class for an array (syntactic) type is the ASTArrayType
class.
The string type "string" represents a type whose values are
immutable sequences of characters.
Abstract Syntax Tree. The Java AST
class for the string (syntactic) type is the ASTStringType
class.
The lock type "lock" represents a type whose values are used
for synchronization purposes. Although the guarded-command language
aspect of BIR provides means for synchronization, however, when
modeling programs from feature rich programming languages such as Java
programs, one often needs higher-level abstract data types. The lock
type is specifically designed to model Java monitors. A lock value
contains several information:
-
the thread descriptor of the lock's owner (if the lock is
held),
-
how many times the owner acquired the lock (if the lock is
held),
-
a wait set that contains thread descriptors where the
corresponding threads are waiting on the lock, and
-
a notification set that contains thread descriptors where
he corresponding threads have been notified after waiting on the lock.
Abstract Syntax Tree. The Java AST
class for the string (syntactic) type is the ASTLockType
class.
4.2.5. Non-Primitive
Extension Types
A non-primitive extension type is similar to a primitive
extension type, except that it is a reference type.
Abstract Syntax Tree. The Java AST
class for non-primitive type extension (syntactic) type is the ASTExtType
class.
The alias type represents the type associated with it (see type
alias declaration).
Abstract Syntax Tree. The Java AST
class for the alias type is the IdType
class.
BIR supports generic type that is strictly used for extension
types (see extensions) and
action/expression extension type arguments (see action
and expression
extensions). The genericity of the types arises because of the use of
type variables. Generic type cannot have corresponding value until it
is instantiated. Since generic type is only used for extensions, the
values of instantiated generic type depend on the extension module that
declares the generic type.
Abstract Syntax Tree. The Java AST
class for generic (syntactic) type is the ASTExtType
class.
BIR supports pure functional expression similar to what is
available in functional programming languages such as SML. A functional
type represents a type whose values are pure functional expressions.
The "unit" symbol indicates a function does not take parameters. String
values are used to represent functional types because all functional
expressions should be declared (named) globally, thus, the string
values are enough to distinguish which functional expressions are
referred.
Default Value. There is no default
value for a functional type because functional types cannot be used for
variables.
Abstract Syntax Tree. The Java AST
class for pure functional (syntactic) type is the ASTFunType
class.
Figure 6,
“Concrete Syntax for Literals�? presents the
lexical definitions for BIR literals. The literals are syntactic
representation of BIR values. BIR does not have a dedicated type for
characters, instead, the int type is used. Note that unicode escape
characters are not shown. A unicode pre-processor is used to convert
unicode escape characters to the corresponding unicode characters.
Examples.
- boolean literals:
true,
false
- char literals:
'a',
'b', 'c'
- int literals:
1,
2, 3
- long literals:
1l,
2l, 3L
- float literals:
1.0f,
2f, 3.0F, 4F
- double literals:
1.0d,
2d, 3.0D, 4D
- string literals:
"a",
"b", "c"
Abstract Syntax Tree.
The top level Java AST class for BIR literal is Literal. The
boolean, character, int, long, float, double, string, and null literals
Java AST classes are the BooleanLiteral
class, the IntLiteral
class, the IntLiteral
class, the LongLiteral
class, the FloatLiteral
class, the DoubleLiteral
class, the StringLiteral
class, and the NullLiteral
class, respectively.
Figure 8,
“Concrete Syntax for Constant Declaration�?
presents the concrete syntax for BIR constant declaration. Constants
are useful to parameterize BIR models. A constant declaration consists
of declarations of its identifier <const-id>, its
elements, and the literals assigned to the elements. The type of
constant elements depend on the type of the literals assigned to them.
Constant elements are not variables, instead, they are simply value
bindings of the literals to the constant element identifiers. The
bindings cannot be changed at run-time. The fully qualified name of a
constant element is <const-id> "."
<const-elem-id>. A constant element can only be referred
using its fully qualified name.
Namespace. The namespace for
constants is the global namespace that is shared with enumerations,
records,
extensions,
type-aliases,
global
variables, threads and
functions, virtual tables,
and functional
expressions. Each constant has its own namespace for its
elements.
Examples.
const C { N = 5; // int constant M = (int wrap (0, 255)) 5; // int wrap (0, 255) constant L1 = 5L; // long constant L2 = 5l; // long constant L3 = -5L; // long constant L4 = -5l; // long constant F1 = 1.0F; // float constant F2 = 1.0f; // float constant F3 = -1.0F; // float constant F4 = -1.0f; // float constant F5 = 1F; // float constant F6 = 1f; // float constant F7 = -1F; // float constant F8 = -1f; // float constant D1 = 1.0; // double constant D2 = 1.0D; // double constant D3 = 1.0d; // double constant D4 = -1.0; // double constant D5 = -1.0D; // double constant D6 = -1.0d; // double constant D7 = 1D; // double constant D8 = 1d; // double constant D9 = -1D; // double constant D10 = -1d; // double constant C = 'c'; // char constant B1 = true; // boolean constant B2 = false; // boolean constant S1 = null; // string constant S2 = "s"; // string constant }
Abstract Syntax Tree.
The Java AST class for BIR constant is the ConstantDefinition.
The top level Java AST class for BIR constant elements is Constant
class. The Java AST class for boolean constant element, char constant
element, int constant element, long constant element, float constant
element, double constant element, and string constant element are the BooleanConstant
class, the IntConstant
class, the IntConstant
class, the LongConstant
class, the FloatConstant
class, the DoubleConstant
class, and the StringConstant
class, respectively.
7. Enumeration Declaration
Figure 10,
“Concrete Syntax for Enumeration Declaration�?
presents the concrete syntax for BIR enumeration type declarations. An
enumeration type declaration consists of declarations of its identifier
<enum-id> and its elements. The elements represents the
valid values of the enumeration type. The fully qualified name of an
enumeration element is <enum-id> "."
<enum-elem-id>. An enumeration element can only be
referred using its fully qualified name, except in virtual tables on
enumerations (see virtual table
declaration).
Namespace. The namespace for
enumerations is the global namespace that is shared with constants,
records,
extensions,
type-aliases,
global
variables, threads and
functions, virtual tables,
and functional
expressions. Each enumeration type has its own namespace for
its elements.
Examples.
enum Day { Sunday, Monday, Tuesday, Wednesday, Thursday, Friday }
Abstract Syntax Tree.
The Java AST class for BIR enumeration is the EnumDefinition
class. There is no dedicated Java AST class for enumeration element,
instead, the java.lang.String
class is used.
Figure 12,
“Concrete Syntax for Record Declaration�? presents
the concrete syntax for record declarations. A record r' can extend
another record r (i.e., r!= r'). The record r is called the (direct)
super record of r', and r' is called a (direct) sub-record of r. Super
record and sub-record relations are transitive relations (e.g, r is a
super record of r''s sub-records). The fields of r is implicitly
present in r', however, the implicit fields share the namespace of the
fields of r'. Furthermore, constructs that can be used for r can also
be used with r', but not necessarily the other way around. A record can
also extend multiple records.
The "top" modifier is used to indicate that the record is the
top record. That is, all other records that do not explicitly extend
another record extend the top record. In a BIR model, there can be at
most one top record. The "throwable" modifier is used to indicate the
record can be thrown as an exception. All sub-records of a throwable
record are also throwable.
In BIR, we allow a degree of freedom of how the language is
implemented. The language can be implemented in a way that is biased
toward a specific domain. For example, when modeling Java, it is
convenient to have each record contains an implicit lock. Thus, the
lock operations (see lock action)
and the lock tests (see lock test expression)
can be used directly on records. Another example, it is also convenient
to have arrays that can be treated as the top record because arrays are
objects in Java (i.e., they can be casted to the Java top object java.lang.Object). Thus, if a
top record is specified in a BIR model, then all array types "extend"
the top record in a sense that it can be casted to/from. This degree of
freedom, however, should be used carefully and it should be documented
clearly when used.
Namespace. The namespace for
records is the global namespace that is shared with constants,
enumerations,
extensions,
type-aliases,
global
variables, threads and
functions, virtual tables,
and functional
expressions. Each record type has its own namespace for its
fields. Note that fields of a record are implicitly present in its
sub-records. This means that the inherited fields will also use the
namespace of each sub-record's field namespace.
Examples.
top throwable record A { int x; } throwable record B { int y; } record C extends A, B { int z; }
Abstract Syntax Tree.
The Java AST class for BIR records is the RecordDefinition
class. The Java AST class for BIR record field is the TypedId
class.
|