Friday, April 24, 2009

Java Has Type Inference and Refinement Types (But With Strange Restrictions)

When enthusiasts talk about Scala they frequently start with "it's a lot like Java but with all this other cool stuff like type inference and structural types and..." I guess it's okay to introduce Scala that way. I've done it myself. But it does give the impression that Scala has somehow heaped complexity onto the Java. Oddly enough, it hasn't, at least not in the areas of type inference and structural types. Java has both. Sorta. They're just not very useful.

In this short article I'm going to try to show that while type inference and structural types appear to be additions to Scala over and above Java, they can also be explained as the removal of peculiar restrictions from Java. The goal is not just to illustrate these two points, but to make a broad statement: Scala isn't a just bunch of features heaped on top of Java. To a significant degree it's a generalization of things already found in Java. The result is that many aspects of Scala are both simpler and more regular than Java.

Type Inference

Java folk get excited about Scala's type inference. Usually positively, but sometimes negatively. But here's the thing: every practical statically typed language has some type inference. Every one. Even Java has type inference. It's just very limited. Let me show you by starting with a simple class

class Foo {
  public Foo bar() { return this; }
  public String toString() { return "Foo!"; }
}

With that I can write

System.out.println(new Foo().bar().bar().bar());

It's silly code but it illustrates my point. Java doesn't need me to explicitly annotate the return type of each call to bar(). Java infers the type resulting from new Foo().bar() as being a Foo, which in turn can be sent a bar() message resulting in a Foo, etc. If I change things just a little the type system complains

System.out.println(new Foo().bar().bar().baz()); // bzzt error!

In Java I "only" have to annotate types when I want to assign to a variable or return from a method. Which, come to think of it, is an odd restriction. The compiler obviously knows the type, so why do I have to tell it yet again?

Foo f = new Foo().bar().bar().bar();

Thus, rather than saying Scala adds type inference we can equally well say that it removes a silly restriction from Java's type inference[1].

Structural Refinement Types

The fact that you don't have to annotate types when building expressions in Java is well known even if people don't usually call it type inference. What's not as well known is that Java has refinement types. So I'll have to explain, this time starting with Scala.

In Scala you can say method doSomething takes any object with a bar method, like so

def doSomething(x : {def bar : Unit}) = ...

That's called structural typing - the doSomething method puts a constraint on the structure of objects that may be passed to it. But actually, in Scala, that's just shorthand for the following refinement type

def doSomething(x : AnyRef {def bar : Unit}) = ...

That says that x must be a subtype of AnyRef and must have a bar method. The type of x is a "refinement" of AnyRef with an additional structural constraint. Scala's refinement types aren't limited to just refining AnyRef, but that's really not important to my point here. My point is that Java has refinement types, too.

If you're a Java programmer, you're probably scratching your head thinking I've lost my sanity. But I promise I haven't. I ask you, in Java what exactly is the type of this expression?

new Object() { 
  public void bar() {
     System.out.println("bar!"); 
  } 
}

If you said "Object!" think again. In Java it's actually something more sophisticated. Go paste this into your favorite Java IDE and run it. I'll wait

public class Test {
  public static void main(String[] args) {
    (new Object() { 
      public void bar() { 
        System.out.println("bar!"); 
     } 
    }).bar();
  }
}

Surprised? Remember my point about limited type inference above? Here, Java has clearly inferred that the result of the expression is an Object refined with a bar method[2]. And so it lets you send a bar() message to the object. Rename either the method or the call to "baz" and you get a type error. But if you want to do anything interesting with the (new Object {public void bar() {...}};) object, like assign it to a variable, return it from a method, or apply it as an argument to a method then Java has a problem. It has no way to denote the refinement. So the refinement gets "lost" from the static type system. Poof. Gone. It's part of the object, you can find it via reflection, but the static type system won't let you get at it. The static type system only lets you get at a refinement "immediately" in a surrounding expression. Which is pretty close to useless. When you look at it that way, Scala hasn't so much added refinement types as removed another silly Java restriction.

Conclusion

When Java people first start looking into Scala it's common to think of Scala as being some complicated mishmash. In large part, evangelists (including me) are to blame. It's surprisingly hard to convey a nice feature of language A to users of language B without making it sound like something totally new, hence something that will complicate their lives. Hopefully this article has helped illustrate that in a very real sense some of the things that appear to be "new" in Scala are really "old" but generalized to be made more useful.

So this article is a call to all Scala pundits out there: when comparing Scala to another language, especially Java, go ahead and get excited about a "new" feature...but then see if you can't explain it as a generalization of an "old" one. You might be surprised how often you can and your audience might come away less intimidated.

Foot Notes

  1. Technically, Scala's type inference is quite a bit more sophisticated than Java's. But I won't let inconvenient facts like that get in the way of making a good point.
  2. Technically, I think the Java spec says that the type isn't structural, but some anonymous nominative type that the compiler can infer and use over a very limited scope. Then again, maybe not. I'm too lazy to go re-read the spec. Also, as already established, I won't let inconvenient facts like that get in the way of making a good point.