Thursday, January 5, 2012

Type Errors as Warnings

Paul Snively and I are having a bit of disagreement about the value of treating type errors as warnings during development. It's hard to make anything like a cohesive point on Twitter so I thought I'd write a quickie post on what I mean and why I think it's valuable.

What I'm talking about is not revolutionary. Eclipse JDT does it and I'm sure others do as well. The idea in a nutshell is that when the compiler encounters a type problem it should always report it but, instead of stopping, it should optionally elide the offending code and replace it with code that will throw an exception if executed. As a trivial example, the Java code

int foo(String x) {
  return x*2;
}

would get an error report and be replaced by the equivalent of

int foo(String x) {
  throw new TypeError("Expected an int but got a String at line 42 of Bar.java.");
}

That kind of loose handling of type errors obviously shouldn't be enabled for production builds or even continuous integration builds; otherwise I might as well use a dynamically typed language that does it better anyway. But for development I find that kind of behavior very useful. And while my toy example was in Java, I have the same desire when working on any large program in a statically typed language.
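To make the toy example concrete, here is a minimal sketch of how the rewritten program might behave at run time. `TypeError` is not a standard Java class, so it is defined here as a hypothetical stand-in, and `bar` is an invented well-typed method included only to show that unrelated code keeps working.

```java
// Hypothetical stand-in: Java has no built-in TypeError class.
class TypeError extends Error {
    TypeError(String message) {
        super(message);
    }
}

class Bar {
    // The original body was `return x*2;`, which does not type check.
    // The compiler is imagined to have reported the error and emitted this stub.
    static int foo(String x) {
        throw new TypeError("Expected an int but got a String at line 42 of Bar.java.");
    }

    // Invented well-typed method, unaffected by the error in foo.
    static int bar(int x) {
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(bar(21)); // prints 42
        try {
            foo("21");
        } catch (TypeError e) {
            System.out.println("foo failed: " + e.getMessage());
        }
    }
}
```

The point of the sketch is only that the failure is deferred: nothing goes wrong until the offending method is actually called.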

My main use case is modifying a data structure definition that is used in several places in a large program. As one concrete example, if I modify a language AST definition I may not want to bother fixing up the optimizing code path until I've ironed out the kinks in the non-optimizing code path. Perhaps the whole idea is rubbish and any work I do on the optimizing path would be wasted. Or suppose I had a Boolean field but realized I should have used something more meaningful or with more options. That change could break code everywhere, but I want to fix and unit test parts of the program incrementally, which lets me think about more manageably sized chunks than "everything this one change breaks."

When a program gets changed it may very well pass through stages where parts of it are nonsense, or at least not provably sensible. Rather than having to fix everything up before exploring the consequences of my change, I find that it is sometimes handy to work in a more piecemeal fashion, restoring sense to some parts and exploring the consequences. Compilers that can treat type errors as warnings support that work style. Without such support I frequently end up manually peppering my code with exceptions and TODOs. Why not let the compiler do that bookkeeping for me?
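For comparison, the manual bookkeeping might look something like the sketch below. It follows the Boolean-field scenario: a `Visibility` enum replaces what used to be a `boolean` flag, one caller has been ported, and another is stubbed out by hand. All of the names (`Visibility`, `Decl`, `Describer`, `Exporter`) are invented for illustration.

```java
// Hypothetical replacement for what used to be a boolean `isPublic` field.
enum Visibility { PUBLIC, PROTECTED, PRIVATE }

class Decl {
    final String name;
    final Visibility visibility;

    Decl(String name, Visibility visibility) {
        this.name = name;
        this.visibility = visibility;
    }
}

class Describer {
    // Already ported to the new field, so it can be unit tested right away.
    static String describe(Decl d) {
        return d.visibility + " " + d.name;
    }
}

class Exporter {
    // Not ported yet. The old body read `d.isPublic`, which no longer compiles,
    // so it has been replaced by hand with an exception and a TODO.
    static String export(Decl d) {
        // TODO: restore export once Visibility handling is settled.
        throw new UnsupportedOperationException("Exporter.export not yet ported to Visibility");
    }
}
```

Every stub like `Exporter.export` is something I have to write, remember, and later remove, and the original body is gone from the source while the stub is in place. A compiler that emits the stub itself would leave the broken source untouched and keep reporting it.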

Edit: Clarifications and Rebuttals

I'm not talking about optional typing where you can turn off type checking. Nor am I talking about gradual typing where you can turn type checking on or off for various parts of your program. Optional and gradual typing might (or might not) be nice, but they're orthogonal to what I'm talking about. All I'm suggesting is that when a static type checker (optional or not, gradual or not) finds a problem, I always want the error report, but during development I don't necessarily want the errors to prevent code generation. And while there might be sophisticated ways to generate code around type errors, the most straightforward is to emit code for an exception or program termination plus some diagnostics.
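The straightforward strategy can be sketched on a toy expression language. This is not how Eclipse JDT or any real compiler implements it; every class here (`Expr`, `Fail`, `LenientChecker`, and so on) is invented for illustration, and the "type checker" knows only that multiplying a string literal is an error.

```java
import java.util.ArrayList;
import java.util.List;

// Toy expression language: integer literals, string literals, multiplication.
abstract class Expr {
    abstract Object eval();
}

class IntLit extends Expr {
    final int value;
    IntLit(int value) { this.value = value; }
    Object eval() { return value; }
}

class StrLit extends Expr {
    final String value;
    StrLit(String value) { this.value = value; }
    Object eval() { return value; }
}

class Mul extends Expr {
    final Expr left, right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
    Object eval() { return (Integer) left.eval() * (Integer) right.eval(); }
}

// Emitted in place of ill-typed code; it fails only if it is actually evaluated.
class Fail extends Expr {
    final String diagnostic;
    Fail(String diagnostic) { this.diagnostic = diagnostic; }
    Object eval() { throw new IllegalStateException(diagnostic); }
}

class LenientChecker {
    // Every problem is still reported here; nothing is silenced.
    final List<String> diagnostics = new ArrayList<>();

    // Returns a tree that is safe to generate code for.
    Expr check(Expr e) {
        if (e instanceof Mul) {
            Mul m = (Mul) e;
            Expr l = check(m.left);
            Expr r = check(m.right);
            if (l instanceof StrLit || r instanceof StrLit) {
                String msg = "Expected an int but got a String";
                diagnostics.add(msg);
                return new Fail(msg); // elide the offending code
            }
            return new Mul(l, r);
        }
        return e; // literals are always well typed
    }
}
```

Checking `new Mul(new IntLit(21), new IntLit(2))` returns a tree that evaluates normally, while checking `new Mul(new StrLit("x"), new IntLit(2))` records a diagnostic and returns a `Fail` node that throws only when evaluated.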

There are suggestions in the comments that the result will be something like the wars over turning on -Wall (warn for all known potential problems). But -Wall isn't the right comparison, since I'm not suggesting that any type checks can be turned off. What I'm proposing is more nearly the equivalent of turning off -Werror (error on warnings). The difference is that code will often work (or at least "work" with scare quotes) in the presence of warnings. The temptation to ignore warnings can be quite strong. But in the case of type errors my suggestion would produce an executable that absolutely can't work if an offending code path is executed. Thus the temptation to turn type errors into warnings on production builds should be minimal.

If a compiler writer is seriously concerned that this behavior would be abused for production builds, then the answer might be to expose it only via an API that can be used by IDEs, Emacs SLIME-style modes, etc., but which isn't available in the supplied command-line batch compiler. That sounds like overkill to me, but whatever.