A version of this paper was presented at the 1995 Computer Science Conference and appears in the proceedings.
Classes defining properties of objects, but not objects themselves may be defined and "mixed in" to the concrete hierarchy to provide services in a uniform way. The property classes mixed in must all come from independently rooted hierarchies. Thus, one may define a class List, derived from Root, as a typical linked list structure and may define a property class Sortable, providing element comparison methods. Then the class Sortable_List may be defined using multiple inheritance from the concrete class List and the Property class Sortable. Since the expectation is that the features of a class defined in the concrete hierarchy will be accessed more frequently than those from the property parts, it is desirable to devise an implementation methodology that makes access to the former as fast as possible.
Objects are referenced via "reference variables," which are similar in some ways to pointers in that they imply indirect semantics and a "heap-like" implementation. The implementation is not revealed to the programmer however, and the notions of pointers and references are kept completely distinct. The semantics of a reference variable is that it references the value "as a whole", rather than a sub-object as is the case in C++ in some situations.
The language also provides an operator IS to query an object as to its actual (polymorphic) type. We may thus inquire if anObject IS List, and get a Boolean reply. We may also inquire if anObject IS Sortable. In general these tests may only be performed at runtime as a variable declared to be of a class type may hold a reference to a value of any derived type.
We may also coerce a reference variable to a derived type. This is necessary if, for example, our List class was declared to hold Root objects and we have been inserting Expression objects into a value of type List. When these objects are retrieved from the list the only type known to the compiler is Root, but the programmer knows them to be of type Expression. To regain the full protocol of Expression for these objects the programmer may coerce them to type Expression. The compiler will insert run-time checks as necessary to guard against errors.
Finally, implementations may be complicated by the fact that we may define variables to be of a property class type and objects having these properties may be referred to by these variables. We may have a Sortable_List object, myList, and a variable sortableItem of type Sortable. The assignment
sortableItem := myList;
is
legal.
The above features detail the main features complicating the implementation. On the other hand, the builder is aided by the fact that the notion of a reference variable, which is the only means of accessing a value of a class type, is completely abstract. The implementation of reference is not revealed to the programmer, unlike the case with pointer variables, which are otherwise similar. Thus the implementor is free to use whatever means are required to represent the features of objects.
In seeking a simple implementation strategy, we would like to use fixed, compile-time offsets as much as possible. This is the typical mechanism with ordinary records, though the presence of schemata in Extended Pascal complicates this to some extent. Since schemata permit structures to have a size that is known only at run-time, some implementations do not use fixed offsets for all record structures, but employ run-time computations via something like a dope vector. It is, however, possible to put the dope vectors at fixed offsets from some base, which might, in extreme cases, need to be computed at runtime.
In any case, we would like to provide a solution that does not employ searches of run-time records with associated comparisons made to guide further steps of the search. While it may be necessary to traverse run-time links, we would like to do so only when the number of steps along the chain is known to the compiler. While we must, of course, make simple comparisons to do so, we need not fetch information on which to base the comparisons.
As mentioned in Section 2, it is also desirable to optimize access to features defined in the concrete part of the hierarchy, perhaps at the expense of slowing down access to features defined in property classes. The paper [6] by Brown describes a scheme for implementation of symmetric multiple inheritance, in which the language does not guide the implementor in which of the parents of a class is more important. Similarly, [7] by Stroustrup, gives implementation details for multiple inheritance in C++. Here we seek something more specialized.
TYPE
A = CLASS(...) (* A may have a parent *)
PROCEDURE method_a...;
END;
P = PROPERTY CLASS
PROCEDURE method_p...;
END;
Also, while we have used only procedures, methods may be functions as well. Note that the class AA inherits a constructor create from Root. We also assume that it inherits a method method_a from its concrete parent A and that P declares method_p. Finally, we assume that class A has been constructed without use of any property classes anywhere in its hierarchy.
TYPE
PP = PROPERTY CLASS(P)
PROCEDURE method_q...;
END;
AA = CLASS(A, PP)
field_a:...
field_b:...
PROCEDURE method_b...;
END;
VAR
anObject: AA;
aProperty: PP
BEGIN
anObject:= AA.create;
aProperty := anObject;
...
Example 1
The solution here employs two types of run-time records. The first (object-records) represents the objects themselves. There is one or several such records for each object, one for each independent hierarchy of which its class is composed. If an object is created from a class that employs multiple inheritance, then the object's run time record is composed of several segments, linked together. The first segment represents all instance variables, but not the methods, derived from the concrete part of the hierarchy and the others represent the various property classes.
The second type of run time record (class-records) represents run-time information associated with classes themselves. These class records hold pointers to the code implementing methods (including constructors and destructors) of the class as well as a pointer to the record representing one of the parents of the class. If the class represented is a concrete class this pointer is to the unique concrete parent. If it is a property class then this may be to any of its parents. The layout of each record is fixed by the compiler and does not vary at runtime. Likewise the links between the records do not vary at runtime for any given object. The class-records logically exist for the life of a run-unit. Object-records are created when a concrete class is sent a constructor message as in
myList := List.create;
They
become undefined when the object is sent a destructor message as:
myList.destroy;
The first segment of the object-record also contains a pointer to the concrete class of which the object is a member (its own-class), and a pointer to one of the property classes which was mixed-in to form the own-class.
Each other segment of the object-record chain contains a pointer to a property class associated with that segment as well as a pointer to the next segment of the object-record.
Note that in this scheme it is not necessary to link the class-records together except for the parent links. It is also unnecessary to duplicate most class-records. Thus, in general there will be only one record for each class. This is complicated, however, by the ability to override property class methods, as will be shown below in Section 4.3.
See Figure 1 for a possible layout for Example 1. Note that the fact that class A and property class PP were mixed to from class AA is not directly represented in the class records. It is, however represented in the pointers from the object records of the object anObject. In this scheme, the inherited method_a occupies a slot at the same offset in the class records for both A and AA. This is similar for method_p in the class records.
The (concrete) reference variable anObject is represented by a pointer variable to the first object-record. Since the compiler has knowledge of the layout, it knows that field_a is at a certain offset in that first segment and that field_p, which was inherited from property class P is in some fixed offset in the second segment. Likewise the compiler knows that a pointer to code for method_a may be found by following a pointer to the class record and then applying a fixed offset and that for method_q may be found by a similar computation from the second object-record.
Note that field_a will always occupy the same slot in its object-record, as will method_q. Also, in any class derived from class AA, the segment for class PP will always occupy the second place.
Variables referencing property classe values are somewhat more complex. Here we represent them with a pair of pointers, with the first (object-pointer) pointing to the first object-record of the object as a whole. The other pointer (property-pointer) points to that object-record segment that represents the property class in question: namely the property class whose type is the declared type of the reference variable. We note that the actual type may have a more refined type than this declared type, so that if aProperty was declared to have type P, the second pointer would still point to the same location shown here. In some circumstances, in which the property class is formed from multiple inheritance of property classes, the picture is somewhat more complex as the class in question will require several segments, with the first representing the property class as a whole, and the other representing various of the properties mixed in to form it. This is discussed later.
It is important to note a few things about this layout. First, it is possible that a smart linker could decide, having seen all of the components of a run-unit, that some classes are never instantiated, and strip their class-records from the executable. It is also possible to decide in certain cases that the links to property records can be reduced to offsets and so avoid the chasing of links implied by the above. This is hard for the compiler, however, because of the possibility of cross-module references. In some cases, as when a class is not exported, the compiler itself may make such optimizations.
Example 2 and the corresponding Figure 2 show the effect of such simple inheritance, in which a new class is derived from AA and a new property is mixed in. Also shown here is an override of a method of the concrete hierarchy. Overriding property class methods is somewhat more complex. In this example, the property class R is defined starting a new property hierarchy, and AAA mixes it in with AA.
TYPE
R = PROPERTY CLASS
{ New root.}
field_r:...;
field_s:...;
PROCEDURE method_r...;
END;
AAA = CLASS(AA, R)
field_c:...
field_d:...
PROCEDURE method_a...;
override;
PROCEDURE method_c...;
END;
Example 2
As we see, the layouts implied by this have new fields in the first segment, a new method in the class-record, and a replacement (indicated by *) in the method_a slot in that class-record. Likewise, a new property segment is required in the object-record chain. This segment contains the instance variables declared in the new property class, as well as a pointer to the property class itself so that methods may be accessed. Note that since property PP is still an ancestor of this new class, the object-record representing PP is still second in the chain. Thus, a reference variable of type AA may refer to an object of this new type without modification of the access protocols established above for class AA.
As mentioned above, the situation is more complex when property class methods are overridden. In particular, a class-record representing a property class may have to be replicated to hold the pointer to the new method body. This is illustrated in Example 3 and Figure 3. Here the situation is as in Example 2, except that we also wish to override method_p which was inherited (indirectly) from PP.
TYPE
R = PROPERTY CLASS { New root.}
field_r:...;
field_s:...;
PROCEDURE method_r...;
END;
AAA = CLASS(AA, R)
field_c:...
field_d:...
PROCEDURE method_a...; override;
PROCEDURE method_c...;
PROCEDURE method_p...; override;
END;
Example 3
In figure 3 we show both a reference to an object of type AA and to one of type AAA. Note that the class-record for class PP has to be duplicated in the run-time system to be able to hold a pointer to the new method body. Again, the knowledge that this must be done resides in the compiler, and is discovered when the override of a property method appears in the class declaration of AAA. This new class-record does not represent a new property class, but only a segment of a new concrete class. However, the duplicated PP record points to the original PP record as if it were its parent. This is required for type inquiry, when the program tests if anAAAObject IS PP.
Notice importantly, that we may rely on the declared type of a reference variable to define the access to all features accessible via that reference. We need not know the run-time value to which the variable refers, relying on the fact that if it refers to a value of a type other than the declared type of the reference, it must be a subtype of it, and therefore uses the same layouts as far as defined in all directions by the class of this reference.
Next we note that although we have been describing the above in which one of the classes mixed in is a concrete class the same layouts are implied if all of the classes are property classes. The only difference is that the resulting structure does not describe any object as a whole, since property classes are not instantiable, but only describes a sequence of consecutive segments of object-records in objects of classes into which this property class structure is mixed. In either case, there is exactly one object-record per hierarchy used to form the own-class of the reference.
It should also be obvious to implementors, that there is no need to allocate the various object-records making up an object in separate operations. In fact it is desirable to do so all at once. Thus, Figure 4, describes a more typical physical layout. Note that we still need pointers to the various segments (unless an optimizer can determine that they are to the same offsets in all instantiated objects) because the same property class may be mixed in to several different concrete classes, and the several property classes mixed in to different parts may be in different orders. Therefore, it is not possible in general to associate a fixed offset in this structure with a given property class variable, except relative to its segment.
the parameter implies that the reference may not be changed to another object. If the class of the method is a concrete class, then the parameter may be represented by a pointer to the first segment of the object-record of the object that receives this message at run time. For a property class, however, the reference to self is a pair of pointers as is the case for other property class references. The property-pointer of this pair must be set by the caller of the method, to point to the segment representing the property class in which this method was originally declared rather than the declared type of the reference variable mentioned in the message statement or expression. If this is a property class variable, then an adjustment may need be made. If it is a concrete class variable, then a property-pointer will need to be generated.
If objects must be permitted to move during a computation, a slight refinement maintains a pointer to each class-record, which must be updated when the class-records move. Thus, again, we can implement type inquiry with simple pointer comparisons.
Type coercion is also quite simple for references of concrete class type, relying on the declared type of the reference variable being coerced and the type to which it is being coerced. If they both represent concrete classes, then nothing need be done except insert a run-time check that the class of the reference is indeed the class of the coercion. If one of the references is concrete and the other not, then we either throw-away the property-pointer, or generate one from knowledge of the types of the two references.
When both references are property classes the situation is a bit more complex in that the property-pointer must be adjusted. This adjustment could cause a property-pointer to move either way in its chain. When we assign a reference of a derived property class to one of one of its ancestors, we must move the property pointer down the chain a fixed number of steps, depending on the two types. Coercion in the opposite direction is only legal when one property class has been derived from others. References of the ancestor classes may be coerced to the derived class(but not to one another, as they may come from different hierarchies). If the coercion is legal the compiler knows from its defined layouts how many steps along the object-record chain must be stepped to obtain a pointer to the object-record of the coerced type. Unfortunately these steps are upward along the chain and as we have not doubly linked the chain, we need to traverse it from the origin, for which we have the object-pointer. These steps and the result type must be guarded with run-time checks, however, since a downward reference may always be made in error.
[2] Joseph Bergin, A Report on Object-Oriented Extensions to Pascal, SIGPLAN Notices, Volume 24, Number 11.
[3] Programming Language Pascal, ISO/IEC 7185:1990.
[4] Programming Language Extended Pascal, ANSI/IEEE770X3.160-1989, ISO/IEC 10206:1991.
[5] Tony Hetherington, An Introduction to Extended Pascal, ACM SIGPLAN Notices, Volume 28, Number 11, November 1993, pp. 42-51.
[6] Bob Brown, Non-Linear Type Extensions, ACM SIGPLAN Notices, Volume 29, Number 2, February 1994, pp. 39-43.
[7] Bjarne Stroustrup, The Design and Evolution of C++, Addison-Wesley, 1994.