19.3 Type constraints

In Dylan, variables, parameters, return values, and slots can all have type constraints. Dylan's dynamic nature means that type constraints can be looser than is typical of a static language, or can even be deferred altogether, in support of rapid prototyping or evolutionary development. Type constraints in a dynamic language serve three primary purposes:

1. Type constraints are required for method dispatch: the methods of a generic function are distinguished by the types of their required arguments. The generic function selects the applicable methods by comparing the arguments against the type constraints of each method's parameters, and then sorts the applicable methods by how specific those constraints are.

2. Type constraints can be used optionally to enforce program restrictions. The compiler ensures that a variable, parameter, return value, or slot will never take on a value that is incompatible with its type constraint. (If the compiler cannot prove at compile time that an incorrect type is impossible, it inserts a run-time check to enforce the type constraint.)

3. Type constraints allow the compiler to generate better code, because they are a contract between the programmer and the compiler that the variable, parameter, return value, or slot in question will never take on a value that is incompatible with its type constraint; hence, the compiler needs only to generate code for dealing with the declared type.
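The first two purposes can be illustrated with a minimal sketch. The generic function describe-value and the module variable *elapsed-seconds* are hypothetical names introduced only for this illustration; they do not appear elsewhere in the book's examples.

```dylan
// Hypothetical generic function, used only for illustration.
define generic describe-value (value :: <object>) => (description :: <string>);

// Purpose 1: the methods are distinguished by the type constraint
// on their required parameter.
define method describe-value (value :: <object>) => (description :: <string>)
  "some object"
end method describe-value;

define method describe-value (value :: <number>) => (description :: <string>)
  "a number"
end method describe-value;

define method describe-value (value :: <integer>) => (description :: <string>)
  "an integer"
end method describe-value;

describe-value(42);      // <integer> method: the most specific applicable method
describe-value(3.5);     // <number> method: the <integer> method does not apply
describe-value("text");  // <object> method: the only applicable method

// Purpose 2: the constraint on a module variable is enforced.
define variable *elapsed-seconds* :: <integer> = 0;
*elapsed-seconds* := 90;            // accepted: 90 is an <integer>
// *elapsed-seconds* := "ninety";   // rejected: at compile time if provable,
//                                  // otherwise by a run-time check
```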

Many Dylan compilers use type inferencing to determine the possible types of variables, parameters, and slots that do not have explicit type constraints. Within a library, the compiler essentially knows everything about the variables and functions that are not exported at the library interface — it can analyze all uses of variables, and all callers and callees of functions. Through this analysis, the compiler can develop a worst-case scenario of the possible types of every variable, parameter, return value, and slot. As a result, these compilers generate efficient code even if the programmer does not fully declare all types (as would be required in most static languages).

Comparison with C: Static languages such as C have little need for type inferencing, because the type of every value must be declared, and the types can be checked easily at compile time. On the other hand, when a problem domain is ill-specified, the program is evolving through development, or a value may take on one of several types, the programmer must construct union types, and must use variant records or other bookkeeping to track the actual type of the value manually.

Dylan automatically handles this bookkeeping and uses type inferencing to minimize the associated overhead. At the same time, when the type of a variable can change at run time, Dylan also automatically tracks the changing type.
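As a rough sketch of what this bookkeeping looks like from the programmer's side, the following code lets one variable hold values of two different types and asks each value for its type directly. The names *latest-reading* and reading-kind are hypothetical, chosen only for this illustration.

```dylan
// Hypothetical module variable that may hold either kind of number.
define variable *latest-reading* :: type-union(<integer>, <float>) = 0;

// Hypothetical function. No tag field or variant record is needed,
// because every Dylan object can be asked for its type at run time.
define method reading-kind (reading :: <object>) => (kind :: <string>)
  select (reading by instance?)
    <integer> => "whole seconds";
    <float>   => "fractional seconds";
    otherwise => "not a number of seconds";
  end select
end method reading-kind;

*latest-reading* := 90;
reading-kind(*latest-reading*);   // "whole seconds"
*latest-reading* := 90.5;
reading-kind(*latest-reading*);   // "fractional seconds"
```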

Some compilers have a facility for generating performance warnings, which inform you when type inferencing is not able to determine types sufficiently to generate optimal code. Some compilers have a facility for generating safety warnings, informing you when type inferencing is not able to determine types sufficiently to omit run-time type checking.

As an example, consider these definitions (which are similar to, but not exactly the same as, the definitions on which we settled in Chapter 14, Four Complete Libraries):

define abstract open class <sixty-unit> (<object>)
  slot total-seconds :: <integer> = 0, init-keyword: total-seconds:;
end class <sixty-unit>;

define method decode-total-seconds 
    (sixty-unit :: <sixty-unit>)
 => (hours :: <integer>, minutes :: <integer>, seconds :: <integer>)
  let total-seconds = abs(sixty-unit.total-seconds);
  let (total-minutes, seconds) = truncate/(total-seconds, 60);
  let (max-unit, minutes) = truncate/(total-minutes, 60);
  values (max-unit, minutes, seconds);
end method decode-total-seconds;

Because we chose to store total-seconds as an integer, and because 60 is an integer constant, the compiler can infer that both truncate/ calls divide an integer by an integer. There is no need to decide at run time whether to use floating-point or integer division.

If we were more concerned with testing out ideas, we might have left unspecified the type of the total-seconds slot (implicitly, its type would then be <object>), or, if we wanted to keep the option of having times more accurate than just seconds, we might have specified that its type was <real>, allowing for the possibility of using floating-point numbers, which can express fractional seconds.
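The two alternatives might be written as follows. This is only a sketch: the class names and slot names are hypothetical, and are kept distinct from <sixty-unit> and total-seconds so that the variants can sit beside the original definition.

```dylan
// Prototyping variant: no type constraint on the slot,
// so its type is implicitly <object>.
define abstract open class <loose-sixty-unit> (<object>)
  slot loose-total-seconds = 0, init-keyword: total-seconds:;
end class <loose-sixty-unit>;

// Fractional-seconds variant: the slot may hold any real number,
// including floating-point values such as 90.5.
define abstract open class <real-sixty-unit> (<object>)
  slot real-total-seconds :: <real> = 0, init-keyword: total-seconds:;
end class <real-sixty-unit>;
```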

If we left the type of the total-seconds slot unspecified, the compiler would need to check the arguments to truncate/, on the off chance that an argument was not numeric at all. In some compilers, you would be able to get a compile-time safety warning stating that a run-time type error is possible (which, if unhandled, will result in program failure), and that the check, and the possibility of a run-time error, could be avoided if the compiler knew that total-seconds was a <real>.

What is a safe program? Dylan is always safe in that a programming error cannot cause a corruption of the program (or of other programs). For example, an out-of-bounds array access or a call that passes an argument of an incompatible type cannot go undetected. The compiler will either prove that the erroneous action is impossible, or will insert code to verify bounds or type at run time, and will signal an error if the bounds or type is incorrect.

When we discuss safety in this section, we are referring to whether or not such errors will be visible to the user. If we have not provided for a recovery action, signaling of an error will halt the program. See Chapter 20, Exceptions, for an example of how run-time errors can be handled by the program.
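A minimal sketch of such a recovery action follows. The function seconds-or-zero is hypothetical, and the handler catches the general class <error> on the assumption that the exact condition class signaled by a failed type check may vary between implementations.

```dylan
// Hypothetical function: returns its argument if it is an integer,
// and recovers with 0 if the run-time type check fails.
define method seconds-or-zero (value :: <object>) => (seconds :: <integer>)
  block ()
    // The constraint on this local variable is checked at run time,
    // because value is known only to be an <object>.
    let seconds :: <integer> = value;
    seconds
  exception (condition :: <error>)
    // Recovery action: without this clause, the error would halt the program.
    0
  end block
end method seconds-or-zero;

seconds-or-zero(90);        // the check succeeds, and 90 is returned
seconds-or-zero("ninety");  // the check fails, and the handler returns 0
```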

Comparison with Java: Java recognizes the need for safe operations, and has eliminated many of the unsafe practices of C and C++, adding such checks as array-bounds checks and type-cast checks at run time. However, Java retains the C mathematical model that trades correctness for performance. Java integers are of a fixed size, and computations whose results cannot be represented in that size silently overflow. In contrast, Dylan requires numeric operations to complete correctly or to signal an error. Several Dylan implementations are also expected to provide libraries for infinite-precision numerical operations.

If we specified the type of the total-seconds slot as <real>, the compiler would have to dispatch on the type of total-seconds, using either floating-point or integer division as necessary. In some compilers, we would be able to get a compile-time performance warning stating that this dispatch could be omitted if the compiler knew that total-seconds was of a more restricted type.

Note that the type of the return value of decode-total-seconds can be inferred: max-unit and minutes must be <integer> (inferred from the definition of truncate/), and seconds must have the same type as total-seconds (<integer>, in our example); thus, the compiler does not have to insert any type checks on the return values of decode-total-seconds. Dylan enforces declared return types in the same way as it enforces parameter types, by eliminating the check where type inferencing can show it is not needed, and using the enforced types to make further inferences.

From this example, you can see how the compiler can get a lot of mileage from a small number of constraints, and how it can point you to the places where further clarification will produce the most performance and safety benefits. At the same time, Dylan does not require that you have all your types thought out in advance of compiling the program; the dynamic nature of the language allows Dylan to defer considering type information until the program is actually running. In good Dylan development environments, there is support for resolving and continuing from run-time type errors during program development (rather than requiring editing of the code and recompilation).

Remember that your code is more suited to reuse when it has fewer and more general type constraints. If you have a compiler that can issue safety and performance warnings, try to generalize and minimize your type constraints, being guided by your safety and performance requirements. Often, just the constraints required to specify method applicability will be sufficient for good safety and performance. Declaring the types of module variables, slots, and return values of functions is also useful and can help to document your program. Declaring types for constants and local variables can be useful for enforcing program correctness, but is unlikely to create optimization opportunities, and might actually reduce performance, because the compiler will insert type checks to enforce such constraints if they are overly restrictive.
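As a small sketch of these guidelines, the hypothetical method whole-minutes below constrains only its parameter and its return value, and leaves its local variables undeclared. It assumes the <sixty-unit> class defined earlier in this section.

```dylan
// Hypothetical method: the parameter constraint determines applicability,
// and the return constraint documents the result.
define method whole-minutes (sixty-unit :: <sixty-unit>) => (minutes :: <integer>)
  // No declarations on the locals: their types follow from the slot's
  // declared type and from the definition of truncate/.
  let seconds = abs(sixty-unit.total-seconds);
  let (minutes, leftover) = truncate/(seconds, 60);
  minutes
end method whole-minutes;
```

Writing let seconds :: <integer> = ... here would add nothing that the compiler cannot already infer from the slot declaration.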