Wednesday, November 10, 2010

Local Version Control

My company uses a centralized version control system. It works reasonably well, but of course has all the downsides of centralized VCS. So I want to explain how I use a local version control system to supplement it, even when there isn't a real "bridge" like git-svn.


Before I start, let me sum up why you might want to do the same.

  1. It's not uncommon to work on task A only to have it interrupted by super critical task B. Local VCS makes it easy to put aside the work in progress for task A, go back to the state of the system before task A started, get task B done, and then pick up task A again.
  2. Centralized version control almost never manages an entire configuration. In every development environment I've ever worked there are some local settings that you can't check in because they are for you only. Examples include editor preferences, personal credentials for a service, and machine specific configurations. A local VCS can hold your personal configuration just as well as shared configuration.
  3. It can be hard to get the manager of a centralized version control system to create branches for your experiments. And checking in half-assed code to a central VCS's main line is usually frowned upon. A local VCS means you can take short steps forward, roll back, and branch in different directions at a whim all while maintaining as much history as you want.
  4. You can do all the above even while offline.
  5. If your central VCS uses a lock/edit/unlock (ick!) model then a local VCS makes it possible to do an edit/merge model locally and only lock the moment you're ready to commit.
  6. If your local VCS is a distributed VCS (DVCS) then you can share your ongoing work with immediate teammates without polluting the main code line with something experimental or half-baked.

Which One?

Which local VCS should you use? Other than for the last use case it doesn't matter much. Just beware that Subversion isn't so good at merging branches multiple times. Also, if your central VCS likes to play with read/write permissions then it's a bad idea to use a local VCS that also wants to do so. More generally, the local and centralized VCS can't keep their metadata in the same place, so Subversion can't serve as both the local and the central system. I'd heartily recommend git, Mercurial, Darcs, or Bazaar: whichever you're most comfortable with.

The Basic Model

It's very simple to use a local VCS for the most common workflow. The basic idea is to have a local repository that manages the same directories managed by your centralized VCS. Then a typical work session might look like:

  1. Synch with central VCS
  2. Immediately commit to local VCS with a comment like "synch" so you know it wasn't your work
  3. Do some work
  4. Commit to local VCS
  5. Do some more work
  6. Commit to local VCS
  7. Finish your work
  8. Commit to local VCS
  9. Synch from central VCS, merging as necessary
  10. Make sure your stuff still works
  11. Commit to local VCS with a comment like "merge"
  12. Commit to centralized VCS
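Concretely, with git as the local layer a session like the one above might look like this. This is only a sketch: the echo commands stand in for real work and for the central synchs, whose actual commands depend on your central VCS.

```shell
set -e
cd "$(mktemp -d)"                        # stand-in for your central checkout
git init -q
git config user.email "me@example.com"
git config user.name "Me"
echo '.svn/' > .gitignore                # keep the two systems' metadata apart
                                         # (if your central VCS keeps metadata
                                         # like .svn/ in the tree)

echo 'v1' > app.txt                      # step 1: synch from the central VCS
git add -A
git commit -q -m "synch"                 # step 2: mark the synch point

echo 'v2' > app.txt                      # steps 3-4: work, then commit locally
git commit -q -am "task A: in progress"

echo 'v3' > app.txt                      # steps 5-8: finish and commit again
git commit -q -am "task A: done"

git log --oneline                        # local commits the central VCS never sees
```

At step 12 the central VCS just sees the net change from v1 to v3; the intermediate history stays local.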

That's as simple as it can be. Other than the occasional commit to local VCS, it's exactly what you do now. The process works because your local VCS and your centralized VCS track what has changed differently. The centralized VCS doesn't know anything about all the intermediate commits to local VCS. It just thinks you've made a bunch of changes and committed them. Similarly, the local VCS is oblivious to the central VCS. At step 2 it thinks you've just checked in some massive change.

There are more advanced uses of a local VCS but the above model seems to cover about 95% of what I use mine for.


A slightly more advanced model when using DVCS is to clone the local "master" repository and work in the clone. This model can be very useful when working on something very experimental which is likely to take a long time with frequent interruptions for higher priority tasks. Upstream commits and synchs become a bit more complicated, but "shelving" is automatic: to jump on a quick high priority task just go work in the master repository or make another clone from the master.
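With git, for example, the clone model is just a local `git clone`. A sketch with made-up directory names, using `--allow-empty` commits to stand in for real work:

```shell
set -e
cd "$(mktemp -d)"
git init -q master                       # the local "master" repository
(cd master \
  && git config user.email "me@example.com" && git config user.name "Me" \
  && git commit -q --allow-empty -m "synch")

git clone -q master experiment           # the long-running experiment lives here
(cd experiment \
  && git config user.email "me@example.com" && git config user.name "Me" \
  && git commit -q --allow-empty -m "experiment: first step")

# A critical interruption arrives: just work in master. The experiment
# sits shelved in its own directory, untouched.
(cd master && git commit -q --allow-empty -m "urgent task B")
```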

Alternatively, if your local VCS supports another branching model then the same idea can be done with a branch.


If you're using a DVCS then work-in-progress can be shared across a small team within the larger organization. However, it's probably a good idea to have one person (or one repository) responsible for synching from the centralized VCS to avoid spurious merge conflicts since the local VCS has no way to know that different people synching from the central VCS are pulling files with a common history.

The Fine Print

If you use an IDE to control your central VCS you may have to stick to the command line for your local VCS or, if you prefer, use something like one of the Tortoise* style plugins for a graphical file manager. I've never experimented with getting an IDE to understand that one set of directories is managed by two version control systems, but I can't imagine that it would be pretty.

Some centralized VCSs really, really prefer to know what you're working on. They may have a lock/edit/unlock cycle (ick!) or they may just use knowledge of your "working" files as an optimization (Perforce does this). If so, cloning can be a problem because the checkout bit won't be tracked in the clone. If you modify files in the clone then push back to the master the centralized VCS will disagree with you about what state the file is in. The same problem goes for a shared DVCS model. It's not an insurmountable problem, but one to be aware of. Basically before committing to centralized VCS all the changed files will need to be "checked out" and merged with your work.


Stick to the basic model at first and you'll see it's very easy to get most of the benefits of local VCS. You'll quickly realize that the local VCS is an incredible safety net even with just the simple model. Later you can branch out *ahem* into more advanced uses.

Given the simplicity and safety of having a local VCS it's insane to work with only a centralized VCS. And who knows, maybe your little experiment will be the catalyst for moving your company to a DVCS.

Monday, October 18, 2010

Phantom Types In Haskell and Scala

Some literate Haskell and Scala to demonstrate 
1) phantom types
2) I am a masochist
3) ???
4) profit!

The code is probably best viewed with a syntax colorizer
for one language or the other but I've colorized all my comments.

> {-# LANGUAGE EmptyDataDecls #-}
> module RocketModule(test1, test2, createRocket, addFuel, addO2, launch) where

object RocketModule {

None of these data types have constructors, so there are
no values with these types. That's okay because I only 
need the types at compile time. Hence "phantom" - 
ethereal and unreal.

> data NoFuel
> data Fueled
> data NoO2
> data HasO2

   sealed trait NoFuel
   sealed trait Fueled
   sealed trait NoO2
   sealed trait HasO2

The Rocket data type takes two type parameters, fuel and
o2.  But the constructor doesn't touch them.  I don't
export the constructor so only this module can create Rockets.

> data Rocket fuel o2 = Rocket

   final case class Rocket[Fuel, O2] private[RocketModule]()

createRocket produces a rocket with no fuel and no o2.

> createRocket :: Rocket NoFuel NoO2
> createRocket = Rocket

   def createRocket = Rocket[NoFuel, NoO2]()

addFuel takes a rocket with no fuel and returns one with
fuel.  It doesn't touch the o2.

> addFuel :: Rocket NoFuel o2 -> Rocket Fueled o2
> addFuel x = Rocket

   def addFuel[O2](x : Rocket[NoFuel, O2]) = Rocket[Fueled, O2]()

Similarly, addO2 adds o2 without touching the fuel

> addO2 :: Rocket fuel NoO2 -> Rocket fuel HasO2
> addO2 x = Rocket

   def addO2[Fuel](x : Rocket[Fuel, NoO2]) = Rocket[Fuel, HasO2]()

launch will only launch a rocket with both fuel and o2

> launch :: Rocket Fueled HasO2 -> String
> launch x = "blastoff"

   def launch(x : Rocket[Fueled, HasO2]) = "blastoff"

This is just a pretty way of stringing things together,
stolen shamelessly from F#.  Adding infix operations is
a bit verbose in Scala.

> a |> b = b a

   implicit def toPiped[V] (value:V) = new {
      def |>[R] (f : V => R) = f(value)
   }
Create a rocket, fuel it, add o2, then 
launch it

> test1 = createRocket |> addFuel |> addO2 |> launch

   def test1 = createRocket |> addFuel |> addO2 |> launch

This compiles just fine, too.  It doesn't matter which 
order we put in the fuel and o2.

> test2 = createRocket |> addO2 |> addFuel |> launch

   def test2 = createRocket |> addO2 |> addFuel |> launch

// This won't compile - there's no fuel
// > test3 = createRocket |> addO2 |> launch

//    def test3 = createRocket |> addO2 |> launch

// This won't compile either - there's no o2

// > test4 = createRocket |> addFuel |> launch

//    def test4 = createRocket |> addFuel |> launch

// This won't compile because you can't add fuel twice

// > test5 = createRocket |> addFuel |> addO2 |> addFuel |> launch

//    def test5 = createRocket |> addFuel |> addO2 |> addFuel |> launch

}

Friday, October 15, 2010

Systems Design: Making a New Feature (Nearly) Too Much Trouble

EDIT: Since I wrote this BART has finally implemented an integration between parking payment and the Clipper card. It's a little more complicated to set up than it should be, but it works. I'll leave the rant up because the meta-point still stands.

This article will look like a random rant about public transportation, but it's really a random rant about bad systems design, especially when extending an existing system without understanding the interactions between the new and the old.

Every day I ride BART (the San Francisco bay area subway system) to and from work. The default procedure for BART is to put some cash or credit into a machine and get a ticket of that value. Then every time you want to ride you run the ticket through a turnstile, ride the train, then run the ticket through an exit turnstile to have value deducted from your ticket based on the distance between the stations. Keep doing that until the ticket is nearly depleted and then add some more value at a ticket machine.

In order to minimize the number of trips to the ticket vending machine I was buying the max value it would allow, $60. But the paper tickets are easily torn, easily lost, and um, easily destroyed in a washing machine. Before you ask yes I did. Fortunately it was a nearly depleted ticket.

Losing $60 on a maxed ticket wouldn't be pleasant. So I decided to get a Clipper card, an RFID based card that you can tie to a prepaid or credit account and use for BART, Muni (San Francisco's municipal bus and light-rail system), and Cal-Train (a commuter rail that connects San Francisco to Silicon Valley). Sounds like a good thing. But in practice it turns out to be (nearly) too much trouble.

I skipped over a point in describing my default procedure with tickets. You see, I have to park at the BART station near my house and BART charges a small $1 fee for parking. So the real procedure is: feed the ticket to the entrance turnstile, use the ticket at another machine to pay for parking, then ride the train, etc. That's important because, get this, you can't use the Clipper card for parking. I have no idea why not, but you can't. Period.

Okay, that's annoying. But I knew that when I got the card so I had a plan: I would continue to buy BART tickets but for much smaller amounts, like $10, in order to pay for parking. It wouldn't be perfect but still much better. Washing away, er accidentally tearing up $10 is much less painful than losing $60.

It was a good plan. But my plan was dashed by some system designer's brain fart. The machine that accepts a parking fee will not accept your ticket unless it has been used at the entrance turnstile. So if I enter using the Clipper card I can't use a ticket for parking.

If you're mentally asking "WTF?" at this point then you and I are on the same page. The designers had to work extra to add this restriction. Maybe it seemed like a good idea before the card system existed: somebody buying parking with a ticket not marked for entrance may have jumped the turnstile. Or something. I don't know. I do know that with the new system it's just a way to render the Clipper card (nearly) too much trouble.

Why too much trouble? Because other than tickets the parking machine only takes cash and only in $1 and $5 denominations. Not credit cards, not debit cards, not twenties. Nope. Only ones and fives. I don't know about you but I often have nothing but twenties because that's what ATMs dispense. Besides, the goal of me getting the Clipper card was to get rid of a bit of paper, not create a need to carry more bits of paper (with pictures of dead white guys).

The card is now very nearly too much trouble. Nearly. I'll still keep the Clipper card. I can use it for my return trips and thus cut the value of my tickets in half. And I can use it for Muni and Cal-Train. But my original goal, getting rid of paper, is dead, slaughtered on the altar of bad systems design.

Now let me get meta. The above sad but true story is an object lesson in extending an existing system. The next time you add a feature to a system you're working on ask yourself if interactions with existing features are going to cause your users problems. At the very least such bad interactions can lower the user's valuation of all your hard work in creating the new feature. But that's not all.

One of the strange things about user experience is that users subjectively put far more weight on negative experiences than positive ones. A new feature should create a positive experience and make users want to use your system even more. But if a new feature interacts poorly with old features then the user's impression can actually be more negative than it was before you extended your system. It would be a damn shame if trying to help your users eventually led them to conclude your whole system was (entirely) too much trouble.

Friday, October 8, 2010

How To Estimate Software

A guide to estimating percentage complete and remaining time.

I haven't looked at the problem. Completed: 0% Time estimate: about 2 weeks.

I've looked at the problem. Completed: 50% Time estimate: about 2 more weeks.

I've implemented almost everything. All that remains is the difficult stuff that I don't really know how to do. Completed: 90% Time estimate: about 2 more weeks.

I've done everything. All that remains is documentation, code review, tests and error handling. Completed: 99% Time estimate: about 2 more weeks.

I still haven't finished documentation, code review, tests, or error handling. The truth is I've been gold plating what I already wrote. But I just saw another problem that seems more interesting. Completed: 100% Time estimate: 0 more weeks.

Monday, September 13, 2010

Moron Why C Is Not Assembly

In response to a previous article a poster in some forum called me an idiot and said "everybody knows that C is a portable assembly language with some syntax sugar." The idiot insult hurt deeply and I cried, but once the tears were dry I resolved to write a bit more on why C isn't assembly, or is at best an assembly for a strange, lobotomized machine. Also, I may have misspelled "more on" in this article's title.

To compare the two things I kinda need to define what I'm comparing. By C I mean an ANSI C standard, whichever version floats your boat. Most real C implementations will go some distance beyond the standard(s), of course, but I have to draw the line somewhere. Assembly is more problematic to define because there have been so many and some have been quite different. But I'll wave my hands a bit and mean standard assembly languages for mainstream processors. To level set: my assembly experience is professionally all x86 and I had a bit of RISC stuff in college plus I did some 6502 back in the day.

Assembly-Like Bits

There are some very assembly-like things about C, and it's worth mentioning them. The main one is that C gives you very, very good control over the layout of data structures in memory. In fact, on modern architectures, I'd bet that this fact is the primary ingredient in the good showing that C (and C++) have in the language shootout. Other languages produce some damn fine executable code, but their semantics make it hard to avoid pointer indirection, poor locality of reference, or other performance losses.

There are a few other assembly-ish things in C. But not much. Now to why C is just not assembly. I'll start with...

The Stack

As far as I'm concerned the number one thing that makes C not assembly is "The Stack." On entry into a function C allocates space needed for bookkeeping and local variables and on function exit the memory gets freed. It's very convenient, but it's also very nearly totally opaque to the programmer. That's completely unlike any mainstream assembly where "the stack" (lower case) is just a pointer, probably held in a register. A typical assembly has no problem with you swapping the stack (e.g. to implement continuations, coroutines, whatever) while C defines no way to do much of anything with the stack. The most power that C gives you on the stack is setjmp/longjmp - setjmp captures a point on the stack and longjmp lets you unwind back to that point. setjmp/longjmp certainly have an assembly-like feel to them when compared to, say, exceptions, but they are incredibly underpowered when compared to the stack swizzling abilities of a typical assembly language.
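To make that concrete, here's roughly the whole of C's portable stack control in one sketch (`fail` and `run` are made-up names):

```c
#include <setjmp.h>

static jmp_buf checkpoint;

/* longjmp unwinds straight back to the setjmp call site, like a one-shot
   escape continuation.  That's about the limit of what portable C allows;
   real assembly lets you save, swap, and resurrect stacks at will. */
static void fail(void) {
    longjmp(checkpoint, 42);
}

/* Returns 42: setjmp yields 0 on the first pass, then 42 when fail()
   longjmps back to the checkpoint. */
static int run(void) {
    int code = setjmp(checkpoint);
    if (code == 0) {
        fail();
        return -1;   /* never reached */
    }
    return code;
}
```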

Speaking of continuations, let's talk about...

Parallel and Concurrent Code

The C standard has almost nothing to say about parallel and concurrent code except that the "volatile" keyword means that a variable might be changed concurrently by something else so don't optimize. That's about it. Various libraries and compilers have something to say on the subject, but the language is pretty quiet. A real assembly, or at least a real assembly on any modern architecture, is going to have all kinds of stuff to say about parallelism and concurrency: compare and swap, memory fences, Single Instruction Multiple Data (SIMD) operations, etc. And that leads a bit into...


Optimization

Consider this very ordinary, very sequential bit of C:

void doubleArray(int * restrict dest, 
     int const * restrict src, size_t size) {
   for (size_t i = 0; i < size; i++) {
      dest[i] = src[i] * 2;
   }
}
Under gcc -std=c99 -S at -O1 I get pretty standard sequential code. Under -O3 I get highly parallel code using x86 SIMD operations. An assembler that changed my code that way would be thrown out as garbage but a C compiler that DIDN'T parallelize that code on hardware that could support it would seem underpowered. And given the highly sequential nature of the original code the transformation to parallel code goes way beyond mere de-sugaring.

Optimization leads me to ...

Undefined Behavior

The C standard says a bunch of stuff leads to undefined behavior. Real compilers take advantage of that to optimize away code, sometimes with surprising results.
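A classic example is signed overflow (a sketch; what actually happens depends on your compiler and flags, and the function name is made up):

```c
#include <limits.h>

/* Signed integer overflow is undefined behavior, so the compiler may assume
   it never happens.  Under that assumption x + 1 > x is always true, and
   gcc/clang at -O2 typically fold this whole function to "return 1" -- even
   for INT_MAX, where two's complement hardware would wrap to INT_MIN.
   Calling it with INT_MAX is undefined, so -O0 and -O2 builds may disagree,
   and both are "right". */
static int increment_is_bigger(int x) {
    return x + 1 > x;
}
```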

What assembly has undefined behavior on that scale? mov al, [3483h] ;(Intel syntax) is going to try to copy the byte at 3483h into the al register, period. It might trap due to memory protection but the attempt will be made.


Alan Perlis famously said "a programming language is low level when its programs require attention to the irrelevant." What's relevant depends on the problem and the domain, so a corollary would be that a language that's at too high a level prevents paying attention to some of what is relevant for your needs.

C is a low level language as compared with most languages used for most day to day programming. If you spend all day hacking Ruby or Haskell then C might look like assembly. But C is actually at a substantially higher level than assembly. Assembly both permits and requires you to do things that C hides.

Please stop perpetuating the myth that C is sugar for assembly because real compilers make complicated, non-obvious transformations to get from C to machine code that assemblers don't. And please stop saying that C is a portable assembly because if it is then it's the assembly for a very peculiar machine that is both underspecified and a bit stupid when compared to real hardware.

Friday, August 27, 2010

Why Scala's "Option" and Haskell's "Maybe" types will save you from null

Cedric Beust makes a bunch of claims in his post on Why Scala's "Option" and Haskell's "Maybe" types won't save you from null, commenters say he doesn't get it and he says nobody has argued with his main points. So I thought I'd do a careful investigation to see what, if anything, is being missed.

First, right off the top here: Scala has true blue Java-like null; any reference may be null. Its presence muddies the water quite a bit. But since Beust's article explicitly talks about Haskell he's clearly not talking about that aspect of Scala because Haskell doesn't have null. I'll get back to Scala's null at the end but for now pretend it doesn't exist.

Second, just to set the record straight: "Option" has roots in programming languages as far back as ML. Both Haskell's "Maybe" and Scala's "Option" (and F#'s "Option" and others) trace their ancestry to it.

Now, to the post.

First Objection

in practice, I find that hash tables allowing null values are rare to the point where this limitation has never bothered me in fifteen years of Java.

Really? Let me give one concrete example. You're writing a cache system (in the form of an associative map) in front of a persistent storage with nullable values. Question: how do you differentiate "I haven't cached this value" from "I've cached the value and it was null in the database"? The answer is that you can't use "null" to mean both things; it's ambiguous. That may not "bother" Beust, but I find it irritating that you end up having to use two different mechanisms to say this in Java.

With Maybe/Option you just say that the map holds optional values. A fetch result of None/Nothing means you haven't put that key in your map. A result of Some(None)/Just Nothing means you've got the key but it maps to nothing, and a result of Some(value)/Just value means that the key is in the map and that there's a "real" value there in storage.
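To see the shape of that, here's the cache sketched in Java itself, using the Optional type that later arrived in Java 8 (a hypothetical Cache class; the post's Option/Maybe play the same role):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// A cache in front of storage with nullable values.  The outer Optional
// answers "was this key cached at all?"; the inner one answers "was the
// stored value null?".  A bare null can't express both questions at once.
class Cache {
    private final Map<String, Optional<Integer>> cached = new HashMap<>();

    // Record what the database returned, null included.
    void recordFetch(String key, Integer valueOrNull) {
        cached.put(key, Optional.ofNullable(valueOrNull));
    }

    // empty() = never cached; of(empty()) = cached null; of(of(v)) = cached v.
    Optional<Optional<Integer>> lookup(String key) {
        return Optional.ofNullable(cached.get(key));
    }
}
```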

Save Me!

Back to Beust

The claim that Option eliminates NullPointerException is more outrageous and completely bogus. Here is how you avoid a null pointer exception with the Option class (from the blog):

val result = map.get( "Hello" )
result match {
  case None => print "No key with that name!"
  case Some(x) => print "Found value" + x
}

See what's going on here? You avoid a NullPointerException by... testing against null, except that it's called None. What have we gained, exactly?

Here is, IMHO, the heart of what people are arguing with Beust about. He thinks this code is no different from the kind of "if (result == null)" code you write in Java. Sure, identical. Except for one, huge, glaring, major, stab you in the face difference: in Java the type system says Map<T>#get will return a T. It doesn't say it MIGHT return a T. That's left to the documentation where it's out of reach for the compiler. Here it is in Java.

Map<String, Integer> myStuff = new HashMap<String, Integer>();
myStuff.put("hello", 42);
int result = myStuff.get("goodbye") * 2; // runtime NPE

Java NPEs when you try to use the result of the fetch. On the other hand, when you use a Scala map you get an Option[T] while Haskell's map says lookup returns a Maybe t. The type system says you can't use the underlying value directly. Here's me trying to pull something from a map and manipulate it directly in both Haskell

myStuff = fromList [("hello", 42)]
result = (lookup "goodbye" myStuff) * 2 -- compile time error

And Scala

val myStuff = Map("hello" -> 42)
val result = (myStuff get "goodbye") * 2 // compile time error

In both cases I get a static error.

Now...there are a lot of (IMHO crappy) debates on static vs dynamic typing. But one thing must be very clear: the two situations are at least different. This isn't the same as a Java NPE (or the equivalents in Ruby/Python/whatever). Option/Maybe tell the static type system that you may or may not have what you're looking for.

Blowups Can Happen


The worst part about this example is that it forces me to deal with the null case right here. Sometimes, that's what I want to do but what if such an error is a programming error (e.g. an assertion error) that should simply never happen? In this case, I just want to assume that I will never receive null and I don't want to waste time testing for this case: my application should simply blow up if I get a null at this point.

There is a cavernous hole in Haskell Maybe and Scala Option. Beust's post does not talk about it at all, though. The hole is that Haskell and Scala are Turing complete, and that means that the type system cannot prevent you from throwing optionality out if that's what you really want to do. In Haskell

loopForever :: a
loopForever = loopForever -- really, forever

stripOption :: Maybe t -> t
stripOption (Just x) = x
stripOption _ = loopForever

-- or better yet, using fromMaybe from Data.Maybe
stripOption = fromMaybe loopForever

And Scala

final def loopForever[A] : A = loopForever // and ever, amen

def stripOption[T](o : Option[T]) : T = o match {
  case Some(x) => x
  case None => loopForever
}

// or, better still

def stripOption[T](o : Option[T]) : T = o getOrElse loopForever

In their respective languages the following compile just fine but will cause non-termination at runtime.

result = (stripOption (lookup "goodbye" myStuff)) * 2
val result = stripOption(myStuff get "goodbye") * 2

Of course, if you really want to write that in real code you would replace "loopForever" with throwing an exception. That's exactly what Scala's Option#get and Haskell's Data.Maybe.fromJust do.

result = (fromJust (lookup "goodbye" myStuff)) * 2
val result = (myStuff get "goodbye").get * 2

Why did I write a silly loop? To make a point: Turing completeness punches a hole in the type system. "Throw" style operators exploit essentially the same hole. Every programmer should know this in the bottom of their heart.

Whether using the loopForever version or the exception version, stripOption/get/fromJust is different from null dereferencing. It requires an explicit step on the part of the programmer. The programmer says "I know what I'm doing here, let me continue, blow up if I'm wrong." And, IMHO, it's easy enough that it kills his objection about being "forced" to deal with null when he'd rather just blow up. Yes, there's a hoop to jump through, but it's tiny. KABOOM!

Nullable Types

Next point:

When it comes to alleviating the problems caused by null pointer exceptions, the only approach I've seen recently that demonstrates a technical improvement is Fantom (Groovy also supports a similar approach), which attacks the problem from two different angles, static and runtime:

Static: In Fantom, the fact that a variable can be null or not is captured by a question mark appended to its type:

Str   // never stores null
Str?  // might store null

This allows the compiler to reason about the nullability of your code.

One technical quibble. Since Groovy doesn't have static types this static bit isn't relevant to Groovy.

Anyway, the equivalents in Scala and Haskell are String vs Option[String]/Maybe String. More characters, but the same idea. Except that Fantom and Nice and other languages with nullable types don't tend to support saying ??String whereas Option[Option[String]] and Maybe (Maybe String) are totally reasonable. See my caching example above.

Safe Invoke

Runtime: The second aspect is solved by Fantom's "safe invoke" operator, ?.. This operator allows you to dereference a null pointer without receiving a null pointer exception:

// hard way
Str? email := null
if (userList != null)
  user := userList.findUser("bob")
  if (user != null) email = user.email
// easy way
email := userList?.findUser("bob")?.email

Note how this second example is semantically equivalent to the first one but with a lot of needless boiler plate removed.

And in Scala that might look like

userList flatMap (_ findUser "bob") flatMap (_.email)

while Haskell would use

userList >>= findUser "bob" >>= email

There are some more keystrokes indeed, but here's the thing...


So, can someone explain to me how Option addresses the null pointer problem better than Fantom's approach?

Option/Maybe/flatMap/>>= are all ordinary library code. No magic. The types are algebraic data types, an idea that is massively useful. The methods belong to a class of types called Monad. If Option/Maybe don't work for you as-is you can build your own versions. With nullable types and the ?. operator you're stuck with what the language provides. If they don't work for your particular case then you can't create your own variants. Quick, in Fantom extend the ?. operator so that it works for exception-like semantics such that failure carries a reason instead of just computing null. In Haskell and Scala that already exists and it's ordinary library code in a type called Either.
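To underline the "ordinary library code" point, here's a minimal right-biased Either sketched in Java (a made-up class, not any standard library's): the same flatMap chaining as Option, but a failure carries a reason.

```java
import java.util.function.Function;

// A failure (Left) short-circuits the chain and carries a reason;
// a success (Right) feeds its value to the next step.
abstract class Either<L, R> {
    abstract <R2> Either<L, R2> flatMap(Function<R, Either<L, R2>> f);

    static final class Left<L, R> extends Either<L, R> {
        final L reason;
        Left(L reason) { this.reason = reason; }
        <R2> Either<L, R2> flatMap(Function<R, Either<L, R2>> f) {
            return new Left<>(reason);  // skip f entirely, keep the reason
        }
        @Override public String toString() { return "Left(" + reason + ")"; }
    }

    static final class Right<L, R> extends Either<L, R> {
        final R value;
        Right(R value) { this.value = value; }
        <R2> Either<L, R2> flatMap(Function<R, Either<L, R2>> f) {
            return f.apply(value);
        }
        @Override public String toString() { return "Right(" + value + ")"; }
    }
}
```

A chain of flatMaps then ends in either a Right holding the result or the first Left explaining what went wrong, which is exactly the exception-like semantics a baked-in ?. operator can't be taught.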


In practice, outside of learning, I've never ever gotten the equivalent of an NPE by abusing Haskell's fromJust or Scala's Option#get, not even in testing. On the other hand NPEs are a fact of my life when dealing with nulls in Java et al. And while it's true that testing has revealed the vast majority of potential NPEs (or equivalent) in my life, it certainly hasn't dug up all of them. I've seen plenty of production systems brought to their knees by NPEs.

Exceptions plus a deliberate weakening of the type system allow developers to create "oh, just blow up" code when that's the right answer. But the developer has to explicitly say so.

Fantom/Nice style nullable types plus a "safe invoke" operator are a nice improvement over plain old null. But they are not composable (no ??T) and they are special purpose magic aimed at a very narrow domain instead of general purpose magic such as algebraic data types and higher order functions that are broadly applicable to huge problem domains.

So there you have it, Cedric, what are we missing in your post?

PostScript on Scala and Null

Which brings me back to Scala's null. For interoperability reasons Scala has a full blown Java-like null. But the experience of all the Scala developers I know is that NPE is mostly only a problem when dealing with existing Java libraries, because Scala programmers and libraries avoid it. Plus, Scala's "Option(foo)" promotes foo to either Some(foo) or None depending on whether it's null or not, which at least enables the map/flatMap style of "safe invoke." It's far less than perfect, but about the best that can be done when dealing directly with Java.

For Haskell, null is a non-issue.

Friday, August 13, 2010

Sometime in 1977


"I know, let's open the book with a program that displays 'hello, world.'"


"Great idea! I'll start the patent search."

Monday, August 2, 2010

On Removing Java Checked Exceptions By Means of Perversion


What do you do in Java when you need to throw a checked exception in a context where it's not allowed, for instance when implementing a third party interface with no declared throws clause? If you're smart and it's allowed you just write the code in another language. Otherwise you probably just wrap the bastard checked exception in an unchecked exception and be on your way.

But the wrapping solution a) requires an extra object allocation and b) unwrapping the original exception is a bit messy.

No, no, those justifications for what I'm about to do are horse shit rationalizations. The fact is I just want to pervert Java. I make it my job to pervert languages for their own good. Or at least, my own amusement.

Instead of "throw" meet "chuck" which turns checked exceptions into unchecked chucked exceptions. This post is an executable Java program.

Edit: This is an idea that can be traced back to Java Puzzlers by Bloch and Gafter and perhaps further. I've just refined it a bit with a bottom return and a way to catch the exception.



import java.io.IOException;

public class Unchecked {


The key to today's lesson in programming aberration is the fact that Java allows you to cast to a generic type parameter but can't actually check the cast at runtime due to type erasure. The following method does an unchecked cast from a Throwable to some generically specified subtype of Throwable and then immediately throws it. Neither the static nor the dynamic type system ever has a chance to detect any deviant behavior such as casting to a "wrong" type. The code also lets the caller specify any return type. A previous article explains why that's useful.


   private static <T extends Throwable, A> 
     A pervertException(Throwable x) throws T {
      throw (T) x;
   }


Then chuck puts some lipstick on that kinky little pig by telling it to act like it's throwing a RuntimeException. Et voilà, no need to declare the exception in a "throws" clause.


   public static <A> A chuck(Throwable t) {
      return Unchecked.
        <RuntimeException, A>pervertException(t);
   }


Some sample code shows how to use chuck.


   public static int testChuck() {


We can't throw an IOException because it's checked


      // throw new IOException("checked, oh noes");


But we can chuck an IOException


      return chuck(new IOException("unchecked, hellsyeah"));


And, just like with a throw the next line won't compile because it's unreachable.


      // System.out.println("dead code");
   }


If you never want to catch the exception or you want to handle it with a top level "catch Exception" then that's all there is to it. But if you want to catch and handle the IOException specifically then there's one tiny flaw in the system. Sadly, the following won't compile because IOException isn't statically known to be thrown by testChuck() and Java wouldn't want you to catch something that can't be thrown now would it?


//   public static int wontCompile() {
//      try {
//         testChuck();
//      } catch (IOException e) {
//         System.out.println("Why did you leave me with" + 
//                            " the sad clown of life?");
//      }
//   }


The fix is to add a way to tell the compiler what exceptions we might chuck, kinda like the "throws" keyword but this one chucks. Totally different. The chucks method threatens to throw a T but in a sudden fit of shame it does nothing - per a tip from a commenter it takes a class argument to avoid Java's hideous generic method invocation syntax.


   public static <T extends Throwable> 
     void chucks(Class<T> clazz) throws T {}


And here we go, we can catch our chucked exception.


   public static void main(String[] args) {
      try {
         chucks(IOException.class);
         testChuck();
      } catch (IOException e) {
         System.out.println("Caught chucked exception " + 
                             e + ".");
      }
   }
}

Running that should display "Caught chucked exception java.io.IOException: unchecked, hellsyeah."

My depraved perversion work is done. Now I may rest. I leave you with this imponderable: how many unchecked checked exceptions will Unchecked.chuck() chuck?


Tuesday, July 20, 2010

When CONSTants Vary


Occasionally somebody will suggest that statically typed language X (Java, C#, Scala, F#, whatever) would be better off with C++'s "const", a type qualifier that supposedly makes objects immutable in certain contexts[1]. I'm not convinced. Const requires a fair amount of work and provides fairly small guarantees. This post is a bit of a literate C++ program demonstrating some of the weaknesses in const.

There are a couple of holes that I want to dismiss quickly. First there's the weakness that const can be cast away. But hey, the C and C++ motto is "casting away safety since 1972." Blowing your leg off with a cast is half the fun of using the C derived languages. Second, there's the useful, if slightly misnamed, weakness of the "mutable"[2] keyword which can be applied to a field and says "this field can be mutated even in const contexts." Mutable is useful for things like memoizing.

But even ignoring those very deliberate issues, there are still a couple of fundamental weaknesses in how C++ deals with const.

For my example I'm going to use a simple toy class, a Counter that can be incremented (mutating its state) and then queried to see what the current count is.


#include <iostream>

using namespace std;

class Counter {
   int _count;

public:
   Counter() : _count(0) {}
   void inc() {
      ++_count;
   }

A perfectly good use of const. The count() query doesn't change the state of this Counter.


   int count() const {
      return _count;
   }
};



So far, so good. But we can easily destroy const with a bit of indirection. Imagine I need something that holds a pair of Counters and, for whatever reason, I need to do so via pointers.


class CounterPair {
   Counter *_c1;
   Counter *_c2;

public:
   CounterPair() {
      _c1 = new Counter();
      try {
         _c2 = new Counter();
      } catch(...) {
         delete _c1;
         throw;
      }
   }

   // technically should have copy constructor 
   // and operator=, but I'm too lazy

   ~CounterPair() {
      delete _c1;
      delete _c2;
   }

These two methods query the state of the CounterPair and are safely const.


   int count1() const {
      return _c1->count();
   }
   int count2() const {
      return _c2->count();
   }


Up to this point everything is golden. But these next two methods modify the state of the CounterPair, yet are marked const.


   void inc1() const {
      _c1->inc();
   }
   void inc2() const {
      _c2->inc();
   }
};


What gives? Well, C++ doesn't appear to understand that state and memory are two different things. I didn't modify the memory that lies directly in one CounterPair envelope, but I'm clearly modifying the state of the CounterPair. "Transitive constness" is one area where the D programming language does much better than C++.


An even more subtle issue happens with aliasing. The following function does not modify its first argument but does modify its second. The first is marked const.


void incSecondCounter(Counter const &c1, Counter &c2) {
   c2.inc();
}

int main(int argc, char** argv) {
   Counter c;


But if we create a bit of aliasing then c1's referent gets modified.


   Counter const &c1 = c;
   Counter &c2 = c;
   cout << c1.count() << endl;
   incSecondCounter(c1, c2);
   cout << c1.count() << endl;
   return 0;
}


In this toy program that looks silly, but in larger programs aliasing can be quite subtle. And aliasing is a tricky, tricky problem to deal with well at the type level so even the D language will have an equivalent hole in const. At best what const says here is that "the object won't be modified via that particular reference," which is a heck of a lot weaker than "the object won't be modified."


Const correctness requires a lot of typing of both the push-the-buttons-on-the-keyboard kind and the keep-certain-static-properties-consistent variety. Yet it gives back relatively little. Even without poking holes in const using casting or "mutable" it's easy to get mutation in something that's supposed to be immutable. C++'s variety of const doesn't deal with indirection at all well and aliasing is a tricky nut that would substantially alter a language's type system. Is const worth adding to other languages? I'm not sure.


  1. C99 also has a const keyword with very similar semantics. So if you replace references with pointers and strip out the thin veneer of OO, this article would cover C as well.
  2. Misnamed because fields are generally mutable even without the keyword. But I suspect the committee would balk at "mutable_even_under_const."


Friday, July 9, 2010

Code Monkeyism's Post Is Unfit For Serious Reading

Stephan Schmidt makes several points in Scala Is Unfit For Serious Development. Some are even valid. Unfortunately, the valid points are buried in over-generalizations and attacks on the attitudes of the developers and community.

It's unfit because the developers and the community are unwilling.

At the end of his little diatribe he thanks 2 members of the community for helping him with his problems. Sounds pretty willing to me.

1. The developers are interested in language research and writing papers, not in making a language for real development

So far from true. Yes, they write papers. But real development is very much a goal of Scala. If it weren't then Java interoperability would have been ditched yesterday. Java interop is a giant albatross around Scala's theoretical neck. Some examples of how Java interop fights Scala: null sucks, the JVM doesn't do general tail calls, structural types are hacked in using reflection, and traits/mixins require a Rube Goldberg copy mechanism. Java interop is only a practical concern for real programmers who want to get stuff done now with existing libraries.

2. The version hell that is 2.8 - calling this version 2.8.0 is deceptive. As Python (3000) or Ruby (2.0) this should have been a major version called 3.0

Some truth. Ruby made nasty breaking changes in 1.9.1, and Python isn't immune to them either. Nonetheless, it really should be called Scala 3.0 to communicate the extent of the changes.

3. The developers do not care about binary compatibility, deprecated, soft migration or API compatibility. Things are changed on a whim - from one RC to the next, and they change major parts of the language/libraries in point releases.

There is a significant technical challenge in maintaining binary compatibility given that the JVM has no multiple-inheritance mechanism. But that shouldn't be confused with a lack of caring. Martin Odersky has promised that the next major release will focus on solving the binary compatibility issue once and for all. Deprecation - could have been done better. Definitely. Then again, Schmidt is playing with beta software, where API INcompatibilities are pretty much guaranteed.

4. They remove and add classes just as they see fit.

A repeat of point 3.

5. The community loves cutting edge. Instead of fixing bugs they upgrade to all new released RCs - so have you. It's called cutting edge for a reason: You'll cut yourself. And it's called bleeding edge because you will bleed.

An overgeneralization. "The community" is apparently represented by the one library that caused Schmidt's problem. Other developers seem to be more careful. lift, for instance, has been very careful about its migration policy and just released version 2.0 against the stable version of Scala.


Scala's binary compatibility sucks due to a real technical challenge in interoperating efficiently with the JVM. The 2.8 series is really 3.0. Deprecation could be done better. One library that Schmidt used decided to fix a bug and compile against a beta version of Scala. Schmidt jumped into the beta version. A small bit of versioning hell ensued. From this Schmidt concluded that the Scala development team does not care and that Scala is unfit for development.

In the meantime, back in reality, the small Scala development team is continuing to move forward with a great language. The community continues to put out awesome libraries. Businesses like Novell, Twitter, and FourSquare continue to make bets on Scala.

Is Scala flawed? Hell yes. But if you think it's a great language then contribute. Help make it and its ecosystem better. Help the developers of libraries maintain bug fixes on stable branches, or help test for breaking changes and push for a deprecation model, or contribute constructively to the binary compatibility challenge. Don't take potshots at the developers from the sidelines.

Thursday, May 27, 2010

Who Will Throw the Hammer This Time?

[Image: Steve Jobs as the talking head of the 1984 ad - who will throw the hammer this time?]

During the 1984 Super Bowl, Apple released an ad where gray, drab people are transfixed on a giant screen with a talking head who tells them how utopian their uniform world is. A female athlete jogs in and throws a hammer at the screen, the screen explodes, and a new voice announces the release of the Apple Macintosh.

The ad riffed off of George Orwell's dystopian novel Nineteen Eighty-Four. Orwell's 1984 features a powerful central state that controls society absolutely. One of its mechanisms for control lies in saying one thing while meaning the opposite: the Ministry of Peace is in charge of war, the Ministry of Truth is in charge of falsification and censorship, etc.

In an exchange with Gawker's Ryan Tate, Steve Jobs attempts some verbal judo but comes out with some near doublespeak of his own. Tate says "Revolutions are about freedom" and Jobs replies "Yep...Freedom from porn."

Look, I'm not advocating for porn. It seems to be doing just fine without me. But I am irritated by Jobs deliberately muddling "freedom" as in "in a democracy you have the freedom to vote" with the opposite "in a dictatorship you have freedom from worrying who to vote for." Combined with licensing that squashes Flash, Scratch and RunRev and the Apple world starts looking gray and drab. Meanwhile, Jobs explains that all this "freedom from" stuff is really what we want in his magic little utopia.

So we come full circle and I ask who will throw the hammer this time?

Post Script

BoingBoing has a video with a premise similar to my picture.

Tuesday, May 25, 2010

Anatomy of an Annoyance

The warning "Type safety : A generic array of T is created for a varargs parameter" is so irritating it makes me want to beat Java with a snoracle, an implement that allows defunct tech companies to breathe while underwater. This article will explain what the warning is, where it comes from, and why it is pretty boneheaded. The warning also shows just how hard it is to extend an existing language without creating pain for your users.

My sample use case: anytime you build a Map in Java it takes approximately 6 metric tons of boilerplate.

Map<Integer, String> numbers = 
   new HashMap<Integer, String>(3);
numbers.put(1, "one");
numbers.put(2, "two");
numbers.put(3, "three");
// etc

It would sure be nice if that could be done in one line of code as in most sane languages. So let's fix it! First, we need the ubiquitous Pair class[1].

public static class Pair<A, B> {
   private A first;
   private B second;

   public Pair(A first, B second) {
      this.first = first;
      this.second = second;
   }
   public A getFirst() {
      return first;
   }
   public B getSecond() {
      return second;
   }
   /* a real Pair class would also have equals and 
      hashCode, but for this code I don't need them */
}

And then, to make it more pleasant to use, we write a static function that can take advantage of Java's half-assed type inference.

public static <A, B> Pair<A, B> pair(A first, B second) {
   return new Pair<A, B>(first, second);
}

Finally, the pièce de résistance

public static <A, B> Map<A, B> map(
      Pair<A, B>... pairs) {
   Map<A, B> map = 
      new HashMap<A, B>(pairs.length);
   for(Pair<A, B> pair : pairs) {
      map.put(pair.getFirst(), pair.getSecond());
   }
   return map;
}

And so, with faces smiling as the sunoracle shines down upon us, we use it

Map<Integer, String> numbers = 
   map(pair(1, "one"), pair(2, "two"), pair(3, "three"));

Ah, sweet sweet beauty[2]. Except one nasty, tiny, shoot-me-now-and-bury-my-carcass-in-downtown-Santa-Clara annoyance. That single lovely line of code that so innocently tried to use all our wonderful machinery is tagged with the warning "Type safety : A generic array of T is created for a varargs parameter." And the only way to "fix" it is to nuke the whole area from orbit with '@SuppressWarnings("unchecked")', possibly hiding real problems in the same method. Gah.
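A footnote from later: Java 7 eventually added @SafeVarargs for exactly this situation - a promise at the declaration site that the method won't abuse the varargs array, which silences the warning at every call site. A sketch of how it applies to a builder like ours, with java.util's SimpleEntry standing in for the Pair class:

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.HashMap;
import java.util.Map;

public class Safer {
   // @SafeVarargs (Java 7+) asserts that this method never stores into or
   // leaks the varargs array, so callers no longer get the generic-array warning.
   @SafeVarargs
   static <A, B> Map<A, B> map(SimpleEntry<A, B>... pairs) {
      Map<A, B> m = new HashMap<A, B>(pairs.length);
      for (SimpleEntry<A, B> p : pairs) {
         m.put(p.getKey(), p.getValue());
      }
      return m;
   }

   static <A, B> SimpleEntry<A, B> pair(A a, B b) {
      return new SimpleEntry<A, B>(a, b);
   }

   public static void main(String[] args) {
      Map<Integer, String> numbers =
         map(pair(1, "one"), pair(2, "two"), pair(3, "three"));
      System.out.println(numbers.get(2));
   }
}
```

Note that @SafeVarargs is still just a promise; the compiler mostly takes your word for it, so a method like the pairs() example below could wear the annotation while lying.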

History, Part The First

To understand the problem we have to go way back in history, at least as far as Java 1.0. Maybe even to Oak or to the egg and sperm that fused to make James Gosling. Whatever. In designing Java there was a question: since String is a subtype of Object, shouldn't String[] be a subtype of Object[]? After all, sometimes that's useful: if I have an array of Strings and you want to write a function that iterates through an array of Objects, aren't we good if I send you my Strings?

Well, no, we're not good. We're very bad. If String[] is a subtype of Object[] then Bad Things Happen™

void badThing(Object[] objects) {
   objects[0] = new Integer(42);
}

void test() {
   String[] strings = {"a", "b", "c"};
   badThing(strings);
   int len = strings[0].length(); // length of an integer?
}
If the above were allowed then we'd be trying to find the length of an Integer and that, um, won't work. Reading is okay but writing is bad. So to solve the problem the original Java designers made a pact with Satan. Instead of statically preventing code like that they added a dynamic check: anytime you store into a non-primitive array Java does an instanceOf-like check to make sure it's legit. In my sample code, the assignment is checked in "objects[0] = new Integer(42)". Failure will give an ArrayStoreException. That means every array store has to do expensive extra work and, more importantly to this article, at time of construction an array must be associated with a dynamic type tag. Like all pacts with Satan the negative consequences of this deal weren't obvious until much later.
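The dynamic check is easy to see in action. A minimal self-contained sketch:

```java
public class StoreCheck {
   static boolean triggersStoreException() {
      String[] strings = {"a", "b", "c"};
      Object[] objects = strings; // legal: Java arrays are covariant
      try {
         // the runtime tag check on array stores fires here
         objects[0] = Integer.valueOf(42);
         return false;
      } catch (ArrayStoreException e) {
         return true;
      }
   }

   public static void main(String[] args) {
      System.out.println("ArrayStoreException thrown: " + triggersStoreException());
   }
}
```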

History, Part the Second

With Java 1.5, Java programmers were finally able to move to the forefront of 1970's era technology with parametric polymorphism, er, excuse me, generics. But see, there was this huge collection library, plus 3rd party libraries that extended it. How the hell are we going to make the future safe for the past?

The decision was to compile with "type erasure" - dynamic type tags no longer carry all the information in the static type. Now, there's nothing evil about type erasure. Some of my best friends are languages that compile with even more type erasure than Java. It's just that Java has "instanceof" and other similar things that don't play well with it. Relevant to this article we have the aforementioned pact with Satan where arrays need a dynamic type tag or Bad Things Happen™.

Concerned with Bad Things, the Java designers started spackling over some holes. The upshot: this code fails to compile with the static error "Cannot create a generic array of T"

public <T> T[] arrayOfT(T a1, T a2, T a3) {
   return new T[]{a1, a2, a3}; // bzzt static error!
}

Same deal with this, "Cannot create a generic array of Pair<A, B>".

public <A, B> Pair<A, B>[] twoPairs(
      A a1, B b1, A a2, B b2) {
   return new Pair<A, B>[]{pair(a1, b1), pair(a2, b2)};
}

But this is a very shallow spackling indeed. Keep reading.

History, Part the Third

Also with Java 1.5, the designers added varargs: the ability to pass an arbitrary number of arguments of the same static type to a method. So convenient! But, uh, apparently the varargs guys and the generics guys didn't talk to each other until late in the game. See, the varargs folks decided to implement varargs as sugar over arrays rather than something that's actually type safe.

public static <T> void whatClassIsIt(T... things) {
   System.out.println(things.getClass());
}

whatClassIsIt(pair(1, "one"), 
   pair(2, "two"), pair(3, "three"));

There's our old buddy erasure. The above code prints "class [LPair;" - array of Pairs. But note it's not dynamically known to be an array of Pair<Integer, String>.

Making Bad Things Happen™

Now we've got enough knowledge to screw things up nicely. We've got statically unsafe arrays that depend on runtime type tags to provide a reasonably early error. And we have an ability to create arrays that don't carry all the information needed to do the runtime check.

public static <A, B> Pair<A, B>[] pairs(
      Pair<A, B>... pairs) {
  return pairs;
}

Pair<Integer, String>[] pairs = 
  pairs(pair(1, "one"), pair(2, "two"), pair(3, "three"));
Object[] objects = pairs;
// no error in this assignment, statically or dynamically!
objects[0] = pair('c', 2.0); 

String s = pairs[0].getSecond(); // arrgg ClassCastException

The code doesn't have an error at the point of assignment. The error does finally occur later when I attempt to use a value. That means that in real code the error could be very far from the real cause. The only diagnostic that prevents this Bad Thing™ from happening is the warning on the first line; our lovely "Type safety : A generic array of T is created for a varargs parameter".[3] And that warning has to occur in a place that doesn't know whether the use of the array is actually bad or not. Hence, our code's beauty (or at least what passes for beauty in Java) is shot.
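The whole Bad Thing™ can be packaged as one runnable sketch; Pair here is cut down to the bare essentials of the class defined earlier:

```java
public class HeapPollution {
   static class Pair<A, B> {
      final A first;
      final B second;
      Pair(A first, B second) { this.first = first; this.second = second; }
   }

   // Returns the caller-created varargs array, whose runtime tag is plain
   // Pair[], not Pair<Integer, String>[] -- erasure at work.
   static <A, B> Pair<A, B>[] pairs(Pair<A, B>... ps) {
      return ps;
   }

   static boolean blowsUpOnUse() {
      Pair<Integer, String>[] ps =
         pairs(new Pair<Integer, String>(1, "one"));
      Object[] objects = ps;
      // no error here, statically or dynamically: the array's tag is just Pair
      objects[0] = new Pair<Character, Double>('c', 2.0);
      try {
         // the compiler's hidden cast to String fails way over here
         String s = ps[0].second;
         return false;
      } catch (ClassCastException e) {
         return true;
      }
   }

   public static void main(String[] args) {
      System.out.println("ClassCastException on use: " + blowsUpOnUse());
   }
}
```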


Ah, if only. If only Java didn't have covariant arrays. Or, since it does, if only the designers had used something other than arrays to carry varargs. Woulda coulda shoulda. This article shows just how hard it can be to extend an existing language without shooting yourself and your users in the ass. Runtime checked array assignment + varargs in arrays + type erasure == pain.

Now you know why Scala's arrays are invariant.


  1. Q: What makes Java so manly? A: It forces every programmer to grow a Pair.
  2. Please don't rag me about how I wrote 17 lines of code to avoid writing 3 - presumably I'd reuse this infrastructure a lot.
  3. For an even nastier problem with the erasure scheme and arrays see this example of trying to write higher order code using generics and varargs.

Wednesday, May 5, 2010

Types à la Chart

I recently saw a debate on whether a certain language had a "weak" or "strong" type system. Naturally the debate ended in flames. As far as I could tell neither side knew what the other was talking about, which isn't surprising as I've seen almost as many definitions of "weak" and "strong" typing as there are developers who use the words. Frankly, the words have too many meanings to be useful.

Ambiguity is only the first part of the problem, though. Even if you get something crisp like "static" vs "dynamic" you've still condensed out a lot of useful information. Agda and Java live in incredibly different spaces even though they are both statically typed. Or how 'bout the world of difference between E and Ruby, two dynamically typed languages?

In this article I'm going to try to convince you that the space is far more complicated than the typical debates would allow. The design space has a great many dimensions, perhaps infinitely many, but I'll boil it down to 2 important ones that none-the-less already show considerably more sophistication. Here's the set up: what can you know statically about an expression, like f(x), without executing/evaluating it, if all you know are static types? The two dimensions I use are 1) how "safe" is it - are there back channels that might violate abstractions? And 2) how rich and sophisticated can the static type be?

One big, fat caveat: this is almost certainly wrong in many details. I've listed a bunch of languages. There just aren't enough "21 days" in a lifetime to understand in detail exactly what each one can promise and how one type logic does or does not contain another. Naturally if you are an expert on one of these languages and I got something a bit wrong you will conclude I'm an idiot even if everything else is 100% right. Such is the price of writing for a technical audience. In spite of any such errors I contend that even if some details are wrong the overall picture should be reasonably accurate and should convey my central thesis that this space is too complicated for binary distinctions.


Chart of Static Type Properties


The horizontal axis is "power" or "sophistication" of the static type system's underlying logic. For this analysis I focused only on the ability of the static type system to describe actual data structures and computations. While it's interesting that C++, GHC Haskell, and Scala all have Turing complete type systems, none of their type systems seem to have the same ability to usefully describe computation as Agda's. Of course, I could be wrong. Feel free to bump the asterisked languages to the right accordingly.

The horizontal axis can also be read backwards in terms of type system simplicity.

The far left requires some explanation. The languages in this column tend to be dynamically typed which at first seems like a challenge for thinking about them in static terms. But there's a nice theoretical framework where you think of them as having one static type. A valid Scheme expression has "the Scheme type" and an invalid one does not. Different people who mean to emphasize different things will call these languages "untyped," "unityped", or "dynamically typed." For my purpose "unityped" gives the right connotation. This column corresponds to the untyped lambda calculus.

Moving to the right there are simple first order type systems. Such type systems are a bit richer than the unityped ones, but don't have any form of type abstraction: they don't allow a user to define types that will be parameterized by other types. This column corresponds to the simply typed lambda calculus.

Further right are second order type systems. These allow type abstraction qualified with "forall", and correspond to System F. In between first and second order is HM (Hindley-Milner) which allows full type inference but at the price of not allowing some second order things like higher rank polymorphism. Moving further right are higher kinded languages, corresponding to System Fn up through System Fω.

At the far right are dependently typed languages: languages that allow types to be parameterized by values.[1][2]


The vertical axis is various kinds of "safety." For safety analysis I'm completely ignoring foreign function interfaces and the like. FFIs are useful and pragmatic but also make otherwise pretty safe languages no more safe than Assembly.[3]

You can also read this axis going the opposite direction in terms of a kind of operational power - Assembly isn't safe, but it lets you make the machine dance.

At the bottom end are memory unsafe languages. An expression like f(x) might very well totally hose memory.

A bit up from that are memory safe languages. Such languages have a few built in abstractions (like you can't write past the end of an array) that f(x) can't possibly violate. A couple of steps up is abstraction safety. That means that one developer's abstractions can be protected from another developer in some fashion. For instance, private variables can only be accessed through functions defined in a scope with access to the variable. Many popular "dynamic languages" are not abstraction safe because their powerful dynamic meta-programming facilities allow abstractions to be smashed. In between I've listed a few JVM languages as being configurably safe: the Java standard includes some pretty powerful reflection capabilities that allow many developer written abstractions to be violated, but the JVM standard includes a security mechanism that allows that to be turned off. Move these points up and down accordingly.
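Java's spot in the configurably-safe band is easy to demonstrate: with reflection permitted, one developer's private field is everyone's field. A minimal sketch (the class and field names are made up):

```java
import java.lang.reflect.Field;

public class AbstractionSmash {
   static class Sealed {
      private int secret = 42;
      int secret() { return secret; }
   }

   static int smash() throws Exception {
      Sealed s = new Sealed();
      Field f = Sealed.class.getDeclaredField("secret");
      f.setAccessible(true); // the step a security policy can be configured to forbid
      f.setInt(s, 7);        // mutate the "private" field from outside
      return s.secret();
   }

   public static void main(String[] args) throws Exception {
      System.out.println(smash());
   }
}
```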

At the next level are capability safe languages. In most languages, any function call can have all manner of harmful side effects including formatting your hard drive or launching missiles. But at the capability safe level the expression cannot possibly do such crazy things unless granted that power in the form of an "object" that represents the capability to do so. While you cannot statically know whether f(x) has received a missile launching capability, you can statically know that if f(x) lives in a program that has not been granted missile launching capability then it will not launch missiles.

The pure level is for languages with a strict, statically enforced wall between side effecting expressions and non-side effecting ones. Examining the static type is sufficient to know the difference. Finally, the "pure & total" level is for languages where the expression f(x) is either guaranteed to terminate normally (no infinite loops, no exceptions)[4] or where static types indicate possible non-totality.

Random Notes

In many languages safety properties are enforced dynamically so a knee jerk reaction is "why is this stuff in a discussion about static properties." But remember the point was what I can statically know about an arbitrary expression f(x) without evaluating it. In Ruby I know with absolute conviction that, barring implementation bugs and foreign function interfacing, any arbitrary f(x) absolutely does not write past the end of an array. In C I don't know that.

It might seem odd to put Java further to the right than Haskell '98 but it's true. Haskell'98 has a slightly extended Hindley-Milner type system while Java has a kind of warped System F<: (use nesting to create rank-n polymorphism). Of course, a chart that ranked type systems on "bang for the buck", versatility vs ease of use, would rank things quite differently.

C# gets stuffed into the unsafe row because of the "unsafe" keyword and associated pointer machinery. But if you want to think of these as a kind of inline foreign function mechanism feel free to bump C# skyward.

I've marked Haskell '98 as pure but GHC Haskell as not pure due to the unsafe* family of functions. If you disable unsafe* then jump GHC Haskell up accordingly.

Assembly is problematic. In a certain weird sense some Assemblies are "memory safe" because they do protect their language abstractions by virtue of having almost none.


"Weak" and "strong" mean so many things that they mean nothing. "Static" and "dynamic" mean something but don't really convey many differences. This chart is not the end-all-be-all of language categorization, but it is the beginning of the start of an inkling of a glimpse at how rich the space really is. Hopefully this will make static typing advocates understand that dynamic typing doesn't necessarily mean total insanity while dynamic typing advocates can begin to see the range of sophisticated expression afforded by many modern statically typed languages. I highly recommend "What To Know Before Debating Type Systems" for more on the subject.


Inspired in part by this post and a follow up discussion, David MacIver is investigating the dimensions of language design that are actually useful in picking the right tool for the job.


  1. Type theory folk will note that I've pretty casually collapsed two legs of the "lambda cube" into one dimension of power. That's not very valid but that seems to work out in practice. It's hard to find a sophisticated dependently typed language (types indexed by values) that isn't also very rich when it comes to types indexed by types.
  2. C++'s ability to parameterize templates by a few forms of literal shouldn't be confused with dependent typing.
  3. Technically there's no such thing as "Assembly;" it's a family of vaguely related languages with semantics tied extremely closely to underlying machines.
  4. A total language can of course crash due to resource exhaustion or hardware failure. There's no way to prevent that.

Monday, April 12, 2010

C Is Not Assembly

In my last article I dug down into some of the joys of undefined pointer behavior in C. But I did it in the context of an architecture that most developers aren't too likely to ever see again, the Intel 8086. I wanted to show that this stuff matters even with more mainstream architectures because compilers are free to do a lot of non-obvious things. C is not assembly.

The United States Computer Emergency Readiness Team (US-CERT) "is charged with providing response support and defense against cyber attacks for the Federal Civil Executive Branch (.gov) and information sharing and collaboration with state and local government, industry and international partners."

With a U.S. Department of Homeland Security badge on their page you know they're serious. When you overflow your buffers the terrorists win. So you'd think they'd take C seriously.

I found a real WTF gem caused by programmers treating C like assembly. Vulnerability Note VU#162289

"In the C language, given the following types:

        char *buf;
        int len;

some C compilers will assume that buf+len >= buf. As a result, code that performs wrapping checks similar to the following:

len = 1<<30;
if(buf+len < buf)  /* wrap check */
  [...overflow occurred...]

are optimized out by these compilers; no object code to perform the check will appear in the resulting executable program. In the case where the wrap test expression is optimized out, a subsequent manipulation of len could cause an overflow. As a result, applications that perform such checks may be vulnerable to buffer overflows."

The advisory is careful to admit that compilers are free to do just that. Here's why: greatly simplified, the C standard says a pointer must point at a valid object or just past the end. Any pointer arithmetic that might cause a pointer to step outside those bounds yields undefined behavior. Now, there's some code missing from their snippet, but I have to assume that the compiler knows that buf points at the beginning of something and not into the middle. So by definition either buf + len >= buf or the program is free to do anything up to and including launching shoulder mounted kitten missiles at Capitol Hill.

Still, there is a WTF to lay on the compiler writer here. If a programmer writes an "if" test then presumably he or she had some reason to believe that sometimes the test might be true. Before optimizing away the conditional the compiler really should have issued a warning.

In order for this code to have any hope of working a few assumptions must hold: sizeof(int) <= sizeof(char *), overflow must happen "silently", etc. But there's another major assumption here: the buffer pointed to by buf must be located at the end of its address space. With a check like this, if there are any objects located higher in the same address space then those objects are getting some kitten missiles. So another WTF is a developer making an assumption about how a compiler works in the face of undefined code.

Now, there are a few scenarios where all these assumptions might be justified. For instance, if you're targeting some special purpose embedded device then memory layout might be well understood. In such situations, the optimization performed by the compiler might be shocking indeed, even if technically permissible.

The problem is that the developer is thinking at the assembly code level but the C standard says the compiler doesn't have to "think" the same way. In assembly the distance between what you write and the object code generated is pretty small (barring some kind of complicated macro magic). In C the distance is quite a bit larger.

Repeat after me, C is not assembly.

The Solution?

In my mind the real WTF is the US-CERT's proposed solution.

"Cast objects of type char* to uintptr_t before comparison. The faulty wrapping check listed above would be written

#include <stdint.h>
if((uintptr_t)buf+len < (uintptr_t)buf)

Alternatively, developers can use size_t on platforms that do not provide the uintptr_t type."

Yes, that's right, the proposed solution is to replace one set of undefined behavior with another. Well...WTF? Didn't we get here because the compiler was free to do anything in the face of undefined semantics?

C is not assembly. Not even when you do some casting. Pointers might be segmented in such a way that overflow behavior for pointers is totally different from that of integral types. Or sizeof(int) > sizeof(uintptr_t). Or the cast to an integral type could do anything at all like invert the bits. Or shoulder mounted kitten missiles. Whatever.

The Real Solution

The best solution is "don't do that." Give the buffer a size and compare len to that size. The assumption that the buffer is the last object in its address space is just playing with fire. But I'll assume they really did need to use memory at the end of the address space. Fine. What's the solution?

It's this: you can't say "end of address space" portably in C, so tread carefully when you use undefined behavior. Minimize the amount of undefined behavior you depend on. Pull the non-portable stuff out into separate compilation units and document the assumptions. Maybe even write the non-portable stuff in assembly where you know what's going to happen, but at least write a unit test that ensures that the compiler behaves the way you think it does. And always, always remember that C is not assembly.
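To make "give the buffer a size and compare len to that size" concrete, here's a minimal sketch. The names (append_bytes, BUF_CAPACITY) and the fixed capacity are mine, not US-CERT's; the point is only that the check compares lengths, which cannot overflow, rather than pointers, which can.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical fixed capacity; in real code this would come from
   wherever the buffer was allocated. */
#define BUF_CAPACITY 4096

/* Append len bytes of src to buf; *used tracks how much of buf is
   filled. Returns 0 on success, -1 if the write would not fit.
   Because *used is always in [0, BUF_CAPACITY], the subtraction
   below cannot underflow and the comparison involves no pointer
   arithmetic, defined or otherwise. */
int append_bytes(char *buf, size_t *used, const char *src, size_t len) {
    if (len > BUF_CAPACITY - *used)
        return -1;  /* would overrun: rejected without any UB */
    memcpy(buf + *used, src, len);
    *used += len;
    return 0;
}
```

Even the advisory's len = 1<<30 attack value bounces off this check harmlessly, no matter where in the address space buf happens to live.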

Thursday, April 8, 2010

Good Math, Bad Pointer Math

Heard about the new C bandwagon? According to April's Tiobe report C has eclipsed Java as the number 1 most popular language by a whopping 0.007%. So while all the rest of you Java suckers have just lost all professional relevance, I'm OK. I cut my professional teeth on C.

Some time ago on Twitter I gave in-jest advice to new C programmers: all your code and data are in one giant mutable array indexed by pointers, good luck. David MacIver responded that the C standard didn't say that. Whereupon I swore a mighty blood oath of vengeance upon him for ruining my joke with facts.

Vengeance is hard so I'll get started later, maybe tomorrow or next week. Right now I'll write a bit about the facts behind pointers. One of the trickiest aspects of writing portable C programs is that a great deal of the specification says that the behavior of certain constructs is undefined. Old C hands sometimes point out some undefined behavior in a C newbie's program whereupon the newbie fires up gcc or msvc or whatever and shows that the code behaves exactly as it "should." Who's right? What does it mean to be undefined?

C was designed to be both 1) portable across a wide range of platforms and 2) work at a very low level. Those two goals are pretty much in opposition - low level details can vary widely across platforms. Don't think MacIntel vs Wintel, think IBM System/360 vs DEC PDP-11 vs Motorola 68x vs x86. And so the C standard says that certain things are syntactically valid but the behavior can depend on compiler, options, hardware, and the day of the week. For example according to the standard dereferencing a null pointer is undefined. It may segfault; silently appear to succeed but actually return garbage; update your Facebook page with your masturbation frequency; or none of the above. It's all good.

Why should you care about portability if you aren't planning to port? Well, for one thing, undefined behavior can vary across versions of the same compiler or even different compiler options so planning not to port is not a realistic option. For another, even when you think you know what is going to happen on your target platform you might be wrong in some cases - some security critical cases, even. And finally, if you're good at hacking C there's a good chance you'll someday be called upon to write code for some device with a very different architecture. You might as well get into good habits now.


While I was unable to find an example of Facebook updating I was able to find examples of two other distinct null pointer behaviors. The following program derefs a null:

#include <stdio.h>

int main() {
   /* the %s directive makes printf chase the null pointer */
   char *foo = NULL;
   printf("%s\n", foo);
   return 0;
}
On one common platform the program caused a segfault as you might expect. But on a second, mystery platform (to be revealed shortly) the program ran to "successful" completion with the output "║►╤". Yup, garbage, not crash.

Here's another one. The standard says that a null pointer can textually be written with a 0 in cases where the compiler can statically prove that you want a pointer, but it does not guarantee that a null pointer will in fact consist of a bit pattern of all zeros. So if you declare

const int n = 0;

Then the following will always be true

NULL == 0;
n == 0;

But the following might or might not be true

NULL == (void *)n;
(int)NULL == n;
(int)NULL == 0;

Unfortunately, it's a bit difficult to find a platform where the null pointer is not a pattern of 0 bits so this non-portable code is surprisingly portable, but see the comp.lang.c FAQ.

Good Math, Bad Pointer Math

Here is an example of some code that does behave very, very differently.

#include <stdio.h>

int main() {
   const unsigned long base = 0xfffc;

   /* table header */
   printf("\t\tlong\t\tchar *\n");

   /* size of data types */
   printf("size\t\t%d\t\t%d\n", sizeof(base), sizeof((char *)base));

   /* the core table */
   for (unsigned long i = 0; i < 8; i++) {
      unsigned long n = base + i;
      char *p = (char *)base + i;
      printf("%08lx + %ld\t%08lx\t%08lx\n", base, i, n, p);
   }

   return 0;
}

C aficionados will be horrified that I'm using non-portable %08lx instead of portable %p to output a pointer, but using %p would reveal too much. On the first, common platform the result was

  long  char *
size  4  4
0000fffc + 0 0000fffc 0000fffc
0000fffc + 1 0000fffd 0000fffd
0000fffc + 2 0000fffe 0000fffe
0000fffc + 3 0000ffff 0000ffff
0000fffc + 4 00010000 00010000
0000fffc + 5 00010001 00010001
0000fffc + 6 00010002 00010002
0000fffc + 7 00010003 00010003

So both longs and pointers are 4 bytes (32 bits) and both long and pointer math are pretty ordinary.

Now, on the second, mystery platform with one set of compiler options I get this

  long  char *
size  4  4
0000fffc + 0 0000fffc 0000fffc
0000fffc + 1 0000fffd 0000fffd
0000fffc + 2 0000fffe 0000fffe
0000fffc + 3 0000ffff 0000ffff
0000fffc + 4 00010000 10000000
0000fffc + 5 00010001 10000001
0000fffc + 6 00010002 10000002
0000fffc + 7 00010003 10000003

Still 32 bits but pointer math is, um, strange. 3 nibbles (12 bits) have been skipped. It gets perhaps weirder with a different set of compiler options on the mystery platform.

  long  char *
size  4  4
0000fffc + 0 0000fffc 0000fffc
0000fffc + 1 0000fffd 0000fffd
0000fffc + 2 0000fffe 0000fffe
0000fffc + 3 0000ffff 0000ffff
0000fffc + 4 00010000 00000000
0000fffc + 5 00010001 00000001
0000fffc + 6 00010002 00000002
0000fffc + 7 00010003 00000003

So now my 32 bit pointer is wrapping around at a mere 16 bits. And that's a clue to our mystery: we're kicking it old skool.

Mystery Revealed

For my first, common platform I'm using "gcc -std=c99" on 64 bit Ubuntu x86 but compiling using "-m32" to force 32 bit compilation. The second, mystery platform is the Watcom C compiler in C99 mode (wcc -za99) on the FreeDOS MS-DOS compatible operating system running in a VirtualBox virtual machine.

Using DOS keeps the machine in "real mode" - which basically means the chip (or virtual chip) behaves as an 8086[1]. Now, the 8086 is a peculiar beast. It's a 16 bit chip - registers are all 16 bits and most instructions work with 16 bit words - but it has a 20 bit memory address width. How do you squeeze 20 bits into a 16 bit word? Well, internally the 8086 uses a highly optimized form of data compression based on lzw that...

No, wait, that's not right. The answer is that the 8086 un-squeezes 20 bits into 32 bits - two 16 bit words instead of one. But here's where it gets weird. Instead of saying "we'll allow 32 bits of addressing but make it a hardware fault if anything above the low 20 bits is set" the 8086 designers said, well, this:

  XXXX0  the "segment" shifted left by 4 bits
+  XXXX  the "offset"
  XXXXX  final 20 bit address

A segment register is shifted left 4 bits and then added to an offset register to get an actual address. So now it's clear why 0x0000fffc + 4 = 0x10000000 with one set of compiler flags: the segment has been incremented by 1 in the bits that count[2]. But why is the result 0x00000000 with a different set of compiler flags?

In order to get the 0x0000fffc + 4 = 0x10000000 behavior the machine code emitted by the compiler has to keep track of overflow - so every pointer add is not just a register add, but a register add plus an overflow check and perhaps a second register update. There may also be a move of two registers to/from memory instead of just one. That could be quite a performance hit on a lowly 4.77MHz machine (the original clock speed on the IBM PC).

So, as an optimization, most C compilers for 8086 real mode allow you to say that no structure is ever going to be larger than 2^16 bytes (64K) so that the emitted code doesn't always need to do all those gymnastics. The upshot: pointers silently overflow at 16 bits. 0x0000fffc + 4 = 0x00000000 and 0x1000fffc + 4 = 0x10000000.

If you're playing along at home the first result was achieved using the -mh ("huge" memory model) compiler flag and the second one was with -ml ("large" memory model). Both were linked with "wlink name test file test.obj". For more about the gory details of real mode memory models see 16 bit memory models[3].

The Standard

Back to earth now. The C standard is fine with all of the above behavior. As a gross oversimplification the standard says that

  1. A pointer must be NULL or must point at something valid or must point just past the end of something valid. Pointer math that would result in a pointer that is outside of those bounds is undefined. It might work like any of my examples, or it might put a phone order in for pizza.
  2. A dereference of a NULL pointer, or of a pointer that does not point at something valid, is also undefined — even if the pointer only points one step past the end of something valid. You pray for a crash.

In short, the C standard doesn't pretend that your program is one giant array. It pretends each individual allocation creates an island unto itself (an island that may be part of a larger island or composed of smaller islands). The C standard is pretty quiet about what can happen between those islands. It's just that on modern platforms memory tends to behave like one giant flat mutable array, so it's easy to assume that that's part of the C model.

One way to look at it is that the hypothetical C standard memory model treats pointers somewhat the way higher level languages treat references and array indexing, except that those higher level languages define bad behavior as either being statically impossible (no pointer math) or dynamically checked (null pointer exceptions and array index checks) where the C standard just says "undefined."

Obviously looping around some arbitrary pointer location is not going to be portable, but this stuff has less obvious consequences. Code like the following is not guaranteed to work

char *buf = malloc(BUF_SIZE);
char *end_of_buf = buf + BUF_SIZE;

/* ... offset gets computed somewhere in here ... */

/* check for buffer overrun */
if (buf + offset >= end_of_buf) {
  /* buffer overrun */
} else {
  /* everything is okay */
}
As the loop above showed, buf + offset might in fact be less than buf on an overrun. The correct answer is to compare offset with BUF_SIZE.

In Defense of Undefined

When people explain C's choices it's common to say that it's about implementation efficiency. Indeed that's a huge part of it. But remember that C was specifically designed to allow very low level access; to allow much (though by no means all) of what assembly allows. So C's ability to write to arbitrary bits of memory allows useful things like talking to memory mapped devices. That kind of code is not portable, but there are important bits of some kinds of programs that aren't going to be portable no matter what you do. The trick is to keep those bits walled off and as small as possible.

So be careful with your pointers and I'll see all you former Java programmers in the unemployment line as I drive by in my new car with the custom plates "PTRMATH%saccount=34389983829 password=h3ll0k1ttySegmentation fault.


  1. Yes, I'm keeping a "virtual" machine in "real" mode. I'm just as confused as you are.
  2. Actually, one consequence of this funky scheme is that the same location in memory can be addressed multiple ways. 0x00120 + 0x0003 = 0x00123 and 0x00100 + 0x00023 = 0x00123. It's just that in the huge memory model the compiler's emitted code for pointer math keeps things canonical by using only the highest 4 bits of the segment.
  3. For even more weirdness associated with this segmentation stuff read a bit about the A20 Line, something that causes heartburn to bootloader writers to this day.

Monday, March 8, 2010

Robert Fischer Finally Admits that Scala is Functional

Robert Fischer says

If your language doesn’t lead people to re-discover point free programming at least in the small, then the language really isn’t taking function manipulation and functional language type conceptions seriously.

Thus Fischer has finally recognized that Scala is functional.

val double : Int => Int = 2 * // point free function definition
List(1,2,3) map double // removes one point from List(1,2,3) map {x => double(x)}

Thank you Robert for finally getting to the truth.

Oh, the Underscore

Now, Robert may object because I didn't combine map and double into one super function, mapDouble. Okay

val mapDouble : List[Int] => List[Int] = _ map double

Crap, there's an underscore. Is it a "point"? Good question. And the answer is no. In fact, what it is is a hole in the expression. Oleg Kiselyov, somebody who actually knows about functional programming, says they're related to delimited continuations and what could be more functional than delimited continuations?

Also, Erlang Remains Unfunctional

For some reason Fischer really hates on Erlang. Point-free is definitely not Erlang's strong suit at all due to its ability to overload on arity. I don't know why the Erlangers don't go picket his blog.

Post Script on Converses

David MacIver points out (correctly) that my post assumes that a statement implies its converse. To that I say "pffft, write your own damn parody, David!"