Thursday, January 5, 2012

Type Errors as Warnings

Paul Snively and I are having a bit of a disagreement about the value of treating type errors as warnings during development. It's hard to make anything like a cohesive point on Twitter, so I thought I'd write a quickie post on what I mean and why I think it's valuable.

What I'm talking about is not revolutionary. Eclipse JDT does it and I'm sure others do as well. The idea in a nutshell is that when the compiler encounters a type problem it should report the problem as usual but then, instead of stopping, optionally elide the offending code and replace it with code that will throw an exception if executed. As a trivial example the Java code

int foo(String x) {
   return x*2;
}

would get an error report and be replaced by the equivalent of

int foo(String x) {
  throw new TypeError("Expected an int but got a String at line 42 of Bar.java.");
}

That kind of loose handling of type errors obviously shouldn't be enabled for production builds or even continuous integration builds - otherwise I might as well use a dynamically typed language that does it better anyway. But for development I find that kind of behavior very useful. And while my toy example was in Java I have the same desire when working on any large program in a statically typed language.
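To make the runtime behavior concrete, here is a minimal runnable sketch of what the generated code would amount to. The `TypeError` class is a hypothetical stand-in (it is not part of the Java standard library), and the class and method names are just for illustration. The point is that the well-typed method keeps working, while the ill-typed one only fails if it is actually called.

```java
class Bar {
    // Hypothetical exception type, assumed for illustration only.
    static class TypeError extends RuntimeException {
        TypeError(String message) {
            super(message);
        }
    }

    // Stand-in for the method body the compiler could not type check.
    static int foo(String x) {
        throw new TypeError("Expected an int but got a String at line 42 of Bar.java.");
    }

    // A well-typed method in the same class is unaffected.
    static int twice(int x) {
        return x * 2;
    }

    public static void main(String[] args) {
        System.out.println(twice(21)); // prints 42

        try {
            foo("21"); // the type error only surfaces when this path runs
        } catch (TypeError e) {
            System.out.println("Deferred type error: " + e.getMessage());
        }
    }
}
```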

My main use case is modifying a data structure definition that is used in several places in a large program. As one concrete example, if I modify a language AST definition I may not want to bother fixing up the optimizing code path until I've ironed out the kinks in the non-optimizing code path. Perhaps the whole idea is rubbish and any work I do on the optimizing path would be wasted. Or if I had a Boolean field but realized I should have used something more meaningful or with more options, then I could break code everywhere but want to fix and unit test parts of the program incrementally, allowing me to think about more manageably sized chunks than "everything this one change breaks."

When a program gets changed it may very well pass through stages where parts of it are nonsense, or at least not provably sensible. Rather than having to fix everything up before exploring the consequences of my change I find that it is sometimes handy to work in a more piecemeal fashion, restoring sense to some parts and exploring the consequences. Compilers that can treat type errors as warnings support that work style. Without such support I frequently end up manually peppering my code with exceptions and TODOs. Why not let the compiler do that bookkeeping for me?
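The manual bookkeeping looks something like the following sketch, using the Boolean-to-richer-type change as the example. All of the names here (`Visibility`, `render`, `export`) are hypothetical. One code path has been migrated and can be tested, and the other has been stubbed by hand just so everything compiles.

```java
class Migration {
    // Hypothetical replacement for what used to be a boolean "visible" field.
    enum Visibility { HIDDEN, VISIBLE, DIMMED }

    // Already migrated to the new type and ready to be unit tested.
    static String render(Visibility v) {
        switch (v) {
            case HIDDEN:
                return "";
            case DIMMED:
                return "(widget)";
            default:
                return "widget";
        }
    }

    // Not migrated yet: stubbed by hand so the rest of the program compiles.
    static String export(Visibility v) {
        // TODO: port the old boolean-based export logic to Visibility.
        throw new UnsupportedOperationException("export is not yet migrated to Visibility");
    }

    public static void main(String[] args) {
        System.out.println(render(Visibility.DIMMED)); // prints (widget)
    }
}
```

With compiler support the stub and the TODO would not need to be written at all: the old body of `export` would stay in the source, still flagged as an error, and the equivalent of the `throw` would exist only in the generated code.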

Edit: Clarifications and Rebuttals

I'm not talking about optional typing where you can turn off type checking. Nor am I talking about gradual typing where you can turn type checking on or off for various parts of your program. Optional and gradual typing might (or might not) be nice, but they're orthogonal to what I'm talking about. All I'm suggesting is that when a static type checker (optional or not, gradual or not) finds a problem I always want the error report but during development I don't necessarily want the errors to prevent code generation. And while there might be sophisticated ways to generate code around type errors the most straightforward is to emit code for an exception or program termination plus some diagnostics.

There are suggestions in the comments that the result will be something like the wars over turning on -Wall (warn for all known potential problems). But -Wall isn't the right comparison, since I'm not suggesting that any type checks can be turned off. What I'm proposing is more nearly the equivalent of turning off -Werror (error on warnings). The difference is that code will often work (or at least "work" with scare quotes) in the presence of warnings. The temptation to ignore warnings can be quite strong. But in the case of type errors my suggestion would produce an executable that absolutely can't work if an offending code path is executed. Thus the temptation to turn type errors into warnings on production builds should be minimal.

If a compiler writer is seriously concerned that this behavior would be abused for production builds then the answer might be to only expose it via an API that can be used by IDEs, Emacs SLIME style modes, etc., but which isn't available in the supplied command line batch compiler. That sounds like overkill to me, but whatever.

29 comments:

  1. I think the correct solution to the compiler problem is to create a simpler compiler that simply doesn't even include the optimization path.  Then you do your what-if modelling on the simple, non-optimizing version, and when you're happy there, then you switch over and fix the optimizing version.  Also, you can stub out the code yourself.

    The problem I have with optional typing is that I know my fellow programmers way too well.  If typing is optional, they will turn it off and never turn it back on again, even for the production build.  I've seen way too many instances of this.  And the next thing you know, we are back at purely dynamic typing.  The advantage of static typing is the discipline it enforces on yourself and everyone else: all code has to be more or less reasonably sane.

  2. I agree completely, I've had numerous occasions lately where this would be very useful. AST changes are a fine example. If I'm working on the front-end of a compiler, I don't care about CodeGen, and introducing type errors there will just make me quickly make them compile and add TODOs which I might forget about for a while. Especially with Scala you might end up with nothing at all compiling because of one error somewhere.

  3. Brian, the most used Java IDE (Eclipse JDT) does this exact thing -- have you ever seen or heard of Java programmers shipping production code which throws Eclipse's compilation errors?

  4. Agreed, James, I'll side with you on that one. And yes, I don't think that "what if people use this feature to ship broken code" holds much credence: if your tests don't completely fall over if you attempt to do this, you have bigger problems.

    I'm curious to hear Paul's side of the argument, though, he must have good reasons to dislike this.

  5. I'm not sure what to think about this.  On a large Java project I started writing unit tests in Groovy for convenience, but since Groovy is strongly but not statically typed, I came to find only afterward at runtime that many tests were broken, when compile errors would have brought these to my attention immediately.  Given some of these tests were integration tests run only in a lengthy CI job, problems in theory could have snuck through the unit test run and caused further downstream development issues.

  6. James,

    As you say, your idea is not new. At the University of Washington, some of my colleagues have implemented optional typing for Java in a system called DuctileJ. By using the DuctileJ compiler plugin, you can compile and run a Java program even if it does not pass the type checker. The semantics are thus: a well-typed program that has been de-typed will execute with identical semantics as a typed program. If there are type errors, then the semantics diverge at some point when ill-typed code is executed. The specifics, and more details, are at the project home page: http://code.google.com/p/ductilej/

    It works moderately well, but there are many corner cases in the implementation. Currently, only type errors can be worked around, and not syntax errors. I tried to implement a similar system in Scala (which is harder to get to compile) last year. It failed miserably, in large part because one needs most of the compiler machinery to reimplement Scala's method dispatch in a duck-typing manner.

  7. I spent ten years fighting C/C++ programmers, trying to get them to turn warnings on.  I know for a fact bugs were shipped that -Wall would have caught.  And for about every two programmers I worked with who would at least go along and humor me, I had to work with one who, by their actions, would rather die than enable warnings.  This is why I know, if the option is available, it will be used in shipping code.

    But, to specifically answer your question, a couple of minutes with Google brought me to:
    http://www.dreamincode.net/forums/topic/119157-force-compile-java-files-with-errors/

  8. Two general comments here:

    1) It is a philosophy of mine that, at least when working in the ML derived languages, when I start fighting the type system, it is I, and not the type system, who is at fault.  I am screwing up.  If you want to be able to compile your program with large sections omitted, and this is hard or impossible, then maybe a refactoring of your code is called for, not a change to the type system.
    2) We've known since at least the 60's that the earlier a bug is found and fixed, the cheaper it is to fix.  This whole conversation is predicated on the assumption that it's cheaper to fix bugs later and not now.  Fred Brooks is spinning in his grave, and he's not even dead yet.

  9. This is a good answer to the question 
    Things possible in Eclipse that aren’t possible in IntelliJ?
    http://stackoverflow.com/a/8753704/23572

  10. Absolutely agree. If this behaviour were added to the Presentation compiler only, then it wouldn't be a problem for production code, because it wouldn't be possible from the normal compiler. I think this is a good idea for *all* code, not just type code.

  11. This is a slightly different thing. We're talking about stuff which would be errors under normal circumstances, and in the JDT still show up as errors in Eclipse. But the .class file is still produced. With the javac compiler, the class files are not produced, so you won't even be able to start running the unit tests.

  12. We're not talking about optional typing here. We're talking about generating class files anyway, even when there are errors in the source file. The errors will still show up in Eclipse, the code won't pass the normal compiler, so will never get into production builds.

  13. I like it, and while I have no doubt people WILL ship code compiled like this, I've grown tired of decreasing my productivity so that people who should never be allowed to write programs professionally can be restrained in their idiotic ways.

  14. We're not talking about a change to the type system, though -- does the Eclipse compiler change the type system of Java? No, it just emits byte code that does something like 'throw new CompilationError("blabla");' whenever there's an error it can work around.

    _The IDE will still display the errors! They are not optionally errors._

    This is not at all related to fixing bugs early. A change in a type hierarchy that affects a big part of a program might cause tens or hundreds of compile errors. If the change can't be handled by a simple refactoring, and many changes can't, you will now have to stop what you were doing and fix all the errors _right now_ (and often you will not really fix the actual thing that broke; you will just insert a TODO comment and just enough code to make it compile).

    If some of these errors could be propagated to runtime, you can continue and finish what you were doing, make it work, and _then_ fix the rest of the compilation problems one by one.

    But if there are programmers out there who would actually ship code that has compilation errors... I think they have bigger problems than this and possibly shouldn't be programmers.

  15. If you don't want to keep the original incorrect type information, I think gradual typing should be the solution. 

  16. I believe Haskell has or will have capability similar to this.  I just watched a talk where Simon Peyton-Jones was mentioning that basically what will happen is Haskell will memo-ize (for lack of a better term) the error that would have been a compiler type error.  Then if you ever run that code you effectively get the error at run time you would have seen during compilation.  My description might be a bit off, but that's the general idea...

  17. do you have a pointer to SPJ's talk in question?

  18. if (leslee_smarts >= james) {
      return witty_statement;
    }
    I don't actually have anything intelligent to say here, but this blog desperately needs a woman's input.

  19. I think it's in this interview: http://channel9.msdn.com/Blogs/Charles/YOW-2011-Simon-Peyton-Jones-Closer-to-Nirvana

  20. "My project is to automatically detect Composite design pattern in Java 1.4 source code."  He wanted to do analysis on the code, not deploy it.  I'm not clear on why he needed to compile unless he was analyzing the byte code and accidentally wrote "source code" but either way I don't think that's an example of your nightmare scenario.  I've put other comments into the article.

  21. Around 27:00 SPJ responds to a question about optional typing with something related to what I'm talking about and then makes it clear that it's not optional typing because some programs that would work under optional typing won't work under his proposed scheme.

  22. The reason for the sophistication in SPJ's discussion about wrappers that throw exceptions is just that Haskell is lazy and can get away with some things that won't fly in strict languages.  GHC doesn't have to do those wrappers, but it can so why not?

  23. Interesting idea, although your "replace offending code with an exception" needs further development -- you'll later want to replace the exception with the original code in order to correct it, etc. Perhaps comments or #ifdef's of some sort would work better.

    I used to work with a developer who when he left work every night left his code in a compilable state. I never managed that -- often due to the sort of issues you describe. It was odd, I couldn't understand how he could develop without breaking some eggs, and he couldn't understand why I had to.

  24. I'm not talking about modifying source code, I'm talking about the object code (byte code, machine code, whatever) emitted by the compiler.  

  25. He talked about that a couple of times at YOW! in Brisbane.

    He mentioned it during his keynote which will eventually be posted:

    http://channel9.msdn.com/Blogs/Charles/YOW-2011-Simon-Peyton-Jones-and-John-Hughes-Its-Raining-Haskell#c634604558582399127

  26. This is one thing I love about Objective-C: type checks are warnings by default, and not errors.  In 99.9% of cases, the warning is correct and you're doing the wrong thing; in the 0.1% case where it's not correct, you can use a C-style cast to tell the compiler that you know what you're doing.  It's a dynamically-typed language with static type checking, and works extremely well for both large and small code bases.

  27. Maybe, instead, "scoped" compilation, or lazy compilation as is done with C++ templates: code only produces type errors if it is instantiated (i.e., invoked, generated).

    What you're in need of is this "scoped" compilation mode for tests: compile everything needed from this entry point.

    No need to ignore type errors, only unnecessary code.

    How a language as dynamic as Java can do that is another matter.

  28. Obviously not. You didn't have anything intelligent to say.

  29. I think it's a good idea. Let me add that the code should not simply be replaced, but kept in some way (commented out? I hate commented-out code, duh). Otherwise, you'll lose any sense of what you actually intended at that point.
