Type Checking in Lisp

Summary: Common Lisp supports both dynamic (run-time) and static (compile-time) typing.
prev next

site map



Imagine if, in addition to requiring arrays to be homogeneous, some statically typed language required that characters had to be stored in variables of a particular type: uppercase, lowercase, alphanumeric, whitespace, non-printing, etc. This would make working with characters very awkward, and working with strings (arrays of characters) nearly impossible. The functions operating on strings would all go away entirely or be replaced with more specialized versions which only worked on characters of a particular type. This is very awkward. Instead, every programming language allows for a single character data type. When distinctions between whitespace and non-printing, or between upper and lower case are important, the data itself is examined. This typing of variables and the attendant lack of facility for dealing with heterogeneous data is exactly what C does for everything BUT character variables. Now imagine how much easier programming would be if all variables could hold any object as easily as char variables hold any character, and that variables representing non-atomic data could contain elements of any type of data, in the same way that strings can contain any character. This is what Lisp does.

In statically typed languages, such as C, variables and functions are typed: the programmer must indicate the type of data which will be stored in each variable, and the types of the arguments and return value of functions. In Lisp, the data itself is (conceptually) typed by a tag. This is sometimes called manifest typing. The variables which hold the data are not themselves required to be typed, and may hold any data (though the variables may also be typed if desired by the programmer). By examining the tag, each datum may have its type determined at run time, if necessary, by the program.

The lack of required variable declarations in Lisp does not provide any less safety than in statically typed languages. Because of the tagged types, Lisp can, in the worst case, check the types at run time for those items that cannot be automatically recast. In addition, the type system in Lisp can specify much finer grained types than are usually available in statically typed systems. Thus it is possible to use the type system to ensure, for example, that a particular integer be odd, that a character be ASCII, or that an array not be a string.

In addition, Common Lisp allows the option of declaring the type of variables. These declarations may specify the type to be a "union", "intersection" or other combination of basic types. It is not necessary to predeclare a new union type.

If the item is a union of two types of vastly different sizes, or a dynamic complex data type such as an array or hash table, Lisp will only use as much memory as is actually required at run time, whereas C will always allocate the memory associated with the larger of the two sizes.

A type declaration is code written by a programmer that tells the computer that a given variable should only hold variables of a given type, or that a given function should have a given signature. It does two things:

  1. It allows the compiler to check the programmer's code at compile time for code which is inconsistent with the declarations. If the compiler can be certain at compile-time that the types are correct, it need not insert any run-time type tests in the compiled code. Some implementations allow the programmer to specify that such run-time tests should be avoided in any case. When this is provided for, it can be done on a global basis or in only selected regions of code.

  2. It allows the compiler to generate more efficient code. When the compiler is able to be certain of the type of data held in certain variables, it can use more efficient machine instructions to manipulate that data. This results in fewer function calls and less garbage.

In Common Lisp, such declarations are completely optional. Correct code without any type declarations at all, or with only some type declarations, will still work correctly. Incorrect code will still give reasonable error messages at run-time for incorrectly typed data or incorrect arguments to functions.

Some compilers use knowledge of the types of constants and known system functions, as well as those declarations that happened to be provided, to infer the types of variables and functions when a declaration is not provided by the programmer, or to check those declarations that have been provided by the user.

Finally, Common Lisp types can be specified very precisely. Rather than using coarse specifications such as int or float, specific ranges can be specified, as well as specific objects or members of a set, intersections, unions and complements of types, and even arbitrary predicates. How more precise information is used by the compiler varies, but some implementations can use this for complex compile-time reasoning about the correctness of a program, and all implementations can use this for precise run-time type checking.