Unix System Programming with Standard ML
Appendix B. Coping with the Compiler's Error Messages
The SML language is designed so that you can write your code without having to declare each variable with its type. Ideally you would write no types anywhere and the compiler would work out the type of each expression entirely from context. The language comes very close to this ideal. Only some ambiguities with overloaded numeric literals and record patterns spoil it.
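As a small sketch of the two ambiguities, the first two functions below show overloaded arithmetic defaulting to int unless a constraint says otherwise, and the last one shows a record pattern with an ellipsis getting its full type from a constraint. The names doubleInt, doubleReal, person and nameOf are made up for this illustration.

```sml
(* With no constraint the overloaded + defaults to int. *)
fun doubleInt x = x + x              (* int -> int *)

(* One constraint on the argument selects the real version of +. *)
fun doubleReal (x: real) = x + x     (* real -> real *)

(* A record pattern with ... does not say what the other fields are.
   The constraint on the argument supplies the full record type. *)
type person = {name: string, age: int}
fun nameOf ({name, ...}: person) = name
```

Without the constraint on the argument of nameOf the compiler cannot tell which record type is meant and rejects the declaration.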
Type expressions in SML can be quite complex. If a function is polymorphic then its type will feature type variables which will be tiresome to get straight. Having the type checker figure them out for you is a big advantage.
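To give an idea of what the type checker works out for you, here are two small polymorphic functions written with no types at all. The function names are hypothetical, and the types in the comments are what I would expect the compiler to infer.

```sml
(* Inferred type: 'a * 'b -> 'b * 'a *)
fun swap (x, y) = (y, x)

(* Inferred type: ('a -> 'b) -> ('c -> 'a) -> 'c -> 'b *)
fun compose f g x = f (g x)
```

Writing the second type out by hand, with each type variable in the right place, is exactly the tiresome part.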
The disadvantage of this design is that when there is a type error the cause of the error can be very obscure. Where an identifier is used in several locations in your program the type checker compares each location to see if the use is consistent. For example if you use an identifier as the argument to a function that takes a string and also to one that takes an integer then the identifier can't be both a string and an integer. You will have to decide which one is wrong. If you use the identifier in a great many locations you may have to inspect all of them to find out which one is incorrect.
When the type checker is studying your program it reads it from top to bottom and decides on the type of an identifier from the first location it encounters that supplies it with decisive information. Every following location is checked against this type and if there is a mismatch then an error is reported. If it happens that the first location is the wrong one then the errors will be reported at all of the remaining locations, even though they are the correct ones.
The message that is generated for each type error will typically contain an abstract of the offending source code and a report of two type expressions that didn't match. Usually the code is a function call and the mismatch is between the expected type of the argument and the actual type of the argument expression. To figure out the type error you have to compare the two type expressions. They often contain internal type variables written like 'Z. A type variable will match with any type. Type variables with the same letter in the same type expression must be the same type.
Sometimes you will reach a point where the type checker insists that there is an error at some location and you are sure that it's not there but somewhere else. A good strategy is to put in an explicit type constraint to point out to the type checker what you think the type must be. The checker will then point out any other locations that don't match that type. You can put a type constraint on any expression, including literals.
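Here is a minimal sketch of where type constraints can go: on a function argument and result, on a literal, on a whole expression and on an identifier bound by val. All of the names are invented for the illustration.

```sml
(* Constraints on the argument and on the result of a function. *)
fun label (n: int) : string = "item " ^ Int.toString n

(* A constraint on a literal and one on a whole expression. *)
val scale = (2.5 : real)
val total = (real (3 + 4) * scale : real)

(* A constraint on an identifier bound inside a let. *)
fun describe n =
let
    val text : string = label n
in
    text ^ "\n"
end
```

A constraint never changes what the code does. It only adds one more location that every other use of the expression has to agree with.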
The following sections show some typical examples and what went wrong in each case.
The simplest error is an argument mismatch when the argument type is obvious.
| fun f() = print 3 | 
| type1a.sml:1.11-1.18 Error: operator and operand don't agree [literal]
  operator domain: string
  operand:         int
  in expression:
    print 3 | 
The message talks about an operator, the function, and an operand, its argument. The domain of an operator is the type that it expects. In this case it expected a string and was given an int.
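One way to repair this, assuming the intention was to print the number, is to convert the int to a string first. The showCount function is a made-up second example of the same fix.

```sml
(* print expects a string, so convert the int before printing. *)
fun f () = print (Int.toString 3)

(* The same fix for a value that is not a literal. *)
fun showCount (n: int) = print ("count = " ^ Int.toString n ^ "\n")
```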
Distinguishing the operator and operand is harder with curried functions.
| fun f() = 
let
    fun g(a, b) = a + b
in
    foldl g 0.0 [1, 2]
end | 
| type1b.sml:2.1-6.4 Error: operator and operand don't agree [literal]
  operator domain: real list
  operand:         int list
  in expression:
    ((foldl g) 0.0) (1 :: 2 :: nil) | 
Here the operator is the expression (foldl g 0.0) which must take a list of reals for the final argument. The error is that a list of integers was supplied. We can surmise that lists in square brackets are represented internally in the compiler as applications of the list constructor operator ::.
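The following sketch gives the operator from the message a name so that its type is visible, and then supplies a list of reals. It assumes that real arithmetic was intended. The name sumReals is hypothetical, and g has been moved to the top level for the illustration.

```sml
(* The constraint on a selects the real version of +. *)
fun g (a: real, b) = a + b

(* This is the operator from the message: foldl applied to g and 0.0.
   It still awaits its final argument, a list of reals. *)
val sumReals : real list -> real = foldl g 0.0

(* Real literals make the operand a real list. *)
fun f () = sumReals [1.0, 2.0]
```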
If you leave out a semicolon in a sequence expression you will usually end up with a type error. Here's a simple example.
| fun f x =
let
    val msg = "hello"
in
    print msg
    print "\n"
end | 
| type1c.sml:2.1-7.4 Error: operator is not a function [tycon mismatch]
  operator: unit
  in expression:
    (print msg) print | 
To the compiler it looks like you are passing the print function as an argument to the (print msg) expression. But this expression isn't even a function. Its type is unit.
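With the semicolon restored the two calls become a sequence expression and the function type checks.

```sml
fun f x =
let
    val msg = "hello"
in
    (* The semicolon separates the two calls to print. *)
    print msg;
    print "\n"
end
```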
Some kinds of expressions appear in the error report in a different form from the one you wrote in your code. Here's a silly example with two calls to the function g.
| fun f lst = 
(
    if length lst = 1
    then
        g print lst
    else
        g (fn s => (print s; print "\n")) (*lst*)
)
and g printer lst = app printer lst | 
| type2a.sml:2.1-8.2 Error: types of rules don't agree [circularity]
  earlier rule(s): bool -> 'Z
  this rule: bool -> 'Y list -> 'Z
  in rule:
    false => g (fn s => (<exp>; <exp>)) | 
The if expression becomes a case expression internally so that (if b then x else y) becomes (case b of true => x | false => y). The source position in the message covers the range of the lines of the if expression.
The two cases are called rules. Each rule is treated like a function from the type of the case argument (here bool) to the type of the case result (here represented as the unknown type 'Z). The type checker has a problem with the else part. The expression still needs an extra argument of type 'Y list before it can return the case's type. This is because I forgot the lst argument.
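As a sketch of what the type checker is looking at, here is the same function written by hand as a case expression, with the missing lst argument put back so that both rules have the same result type. This is my own rendering of the translation described above, not the compiler's output.

```sml
fun f lst =
(
    (* The if expression rewritten as a case over a bool. *)
    case length lst = 1 of
      true  => g print lst
    | false => g (fn s => (print s; print "\n")) lst
)
and g printer lst = app printer lst
```

Both rules now go from bool to unit, so the types of the rules agree.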
The type checker uses the term "object" for the expression after the case keyword. Here is an example of a mismatch between the object and the rules.
| fun process (cmd: string) inp =
(
    case cmd of
      [] => []
    | (last::rest) =>
        (
            print last;
            app print rest;
            inp
        )
) | 
| type2b.sml:3.5-11.3 Error: case object and rules don't agree [tycon mismatch]
  rule domain: string list
  object: string
  in expression:
    (case cmd
      of nil => nil
       | last :: rest => (print last; (app <exp>) rest; inp)) | 
The rules clearly want a list of strings but there is an erroneous type constraint that says that cmd must be just a string.
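Assuming that the rules are right and the constraint is the mistake, the repair is to change the constraint to string list. The object and the rule domain then agree.

```sml
(* cmd is now constrained to the type that the rules expect. *)
fun process (cmd: string list) inp =
(
    case cmd of
      [] => []
    | (last::rest) =>
        (
            print last;
            app print rest;
            inp
        )
)
```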
Here's an example of a type error that propagates through a couple of levels of function call.
| fun run() = print(process "stop")
and process cmd =
(
    case cmd of
      "go" => go()
    | _    => stop()
)
and go()   = (3, "done")
and stop() = (4, "stopped") | 
| type3a.sml:1.1-12.28 Error: right-hand-side of clause doesn't agree with function result type [tycon mismatch]
  expression:  int * string
  result type:  string
  in declaration:
    go = (fn () => (3,"done"))
type3a.sml:1.1-12.28 Error: right-hand-side of clause doesn't agree with function result type [tycon mismatch]
  expression:  int * string
  result type:  string
  in declaration:
    stop = (fn () => (4,"stopped")) | 
The expected result type for the go and stop functions is determined to be string from the call to print in the first line. The source position in each message covers the entire group of functions that are joined by the and keyword, which doesn't localise the error much. If the run function is declared separately, after the group, then the error is localised better.
| fun process cmd =
(
    case cmd of
      "go" => go()
    | _    => stop()
)
and go()   = (3, "done")
and stop() = (4, "stopped")
fun run() = print(process "stop") | 
| type3b.sml:11.13-11.34 Error: operator and operand don't agree [tycon mismatch]
  operator domain: string
  operand:         int * string
  in expression:
    print (process "stop") | 
Alternatively you can put a type constraint in a function declaration to break up the type propagation. This makes it clearer to the compiler and to anyone reading the code what is expected.
| fun run() = print(process "stop")
and process cmd: int * string =
(
    case cmd of
      "go" => go()
    | _    => stop()
)
and go()   = (3, "done")
and stop() = (4, "stopped") | 
| type3c.sml:1.13-1.34 Error: operator and operand don't agree [tycon mismatch]
  operator domain: string
  operand:         int * string
  in expression:
    print (process "stop") |
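With the error pinned down to the call to print, one possible repair is to print only the string part of the pair. This is a sketch that assumes the message is what was meant to be printed. The message function is a hypothetical helper, and run is declared after the group so that the type of process is already known when the #2 selector is applied.

```sml
fun process cmd : int * string =
(
    case cmd of
      "go" => go()
    | _    => stop()
)
and go()   = (3, "done")
and stop() = (4, "stopped")

(* Assumption: only the string part of the pair should be printed. *)
fun message cmd = #2 (process cmd)

fun run() = print (message "stop")
```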