Saturday, January 9, 2010

On Erwann Wernli's idea of having the VM take care of dependency injection

My dear friend Erwann asks in his blog whether dependency injection should not be the task of the virtual machine. He considers code like this:

List l = new ArrayList();

He then notices (correctly, to my mind), that this statement contains more detail than the developer meant to express. "Whether the list is a new one or not, I don’t care, all I need is an empty one," Erwann writes. Fair enough.

There's real problems connected with the observation. The Java specification rules that new has to allocate a new object. Hence, the method

static void slack() {
new Object();
return;
}

cannot be optimized into a nop. Now, I believe you've already thought of a code snippet that cannot be optimized due to this rule. And of course it gets worse if the language forces us to use new all the time. it is the only way to get an object, even though we would have been fine with an old one, should it fulfill our criteria.

Here's the solution he proposes:

List l = obtain list ( size=0 );

To undo the awkwardness, he proposes hiding this monstrum in syntactic sugar, although it isn't clear to me what the sugar should look like in this instance.

Besides avoiding the new operator, his proposal leaves the concrete type of variable l open. He now proposes that depending on the "context," the VM can do voodoo to determine exactly which class may be most appropriate at the point in time. As an example, he cites changing the default FileInputStream to a zipped version of itself, as disk space runs low.

I find his thinking interesting, because it asks the good old question of who's responsible for what in programming. Having the machine figure stuff out for you is great if the figuring out is easy or secondary. Everyone loves Ruby's ActiveRecord, which creates the database table belonging to an entity class of yours all by itself. But choosing the right kind of a list? That's harder. There's a disadvantage to having the machine figure stuff out for you: your program gains in being opaque. The text-file that stores your program code would diverge further from the code to be executed at run time, because important decisions about the code are delayed to the run-time of the program. Since us humans can only see text, this is a real drawback.

I'm obviously sceptical to ubiquitious dependency injection, and I'm not the only one. Here's what Russ Olsen has to say about the matter (In his book Design Patterns in Ruby):

The best way to go wrong with any of the object creation techniques that we have examined … is to use them when you don't need them. Not every object needs to be produced in a factory. In fact, most of the time you will want to create most of your objects with a simple MyClass.new. … Remember, chances are You Ain't Gonna Need It.

Well, we have to balance our skepticism against what voodoo may bring us. Suffice it to say that I don't really believe that I want the VM to have any say in whether I zip my files or not. If I write a file to the disk that then the bash will assume to be a plain text email message, it better not be zipped on 5 % of my clients' computers. Another advantage of the voodoo is that we may also get to re-using objects that are laying around anyhow, waiting to be garbage collected. There's charm to that idea, although I wonder if the picture is as rose as Erwann paints it. After all, you'd need to find an object to be disposed first. Object Pooling sucks up memory and CPU cycles for their management.

Let me summarize that I'm skeptical about the solution that is proposed. Nonetheless, Erwann raises legitimate and important questions, such as: Don't our languages force the developer to supply and commit to more detail than the developer wishes? A mismatch between the mental model of the developer and the programming language is a flaw that deserves our tending to.

There's one little afterthought. Erwann claims that the mismatch he identified can be tackled only by making both language and VM aware of the problem. I believe the strength and beauty of Smalltalk is to come a far way in allopwing new ways of doing things, without having to dig to the VM and the language (See the blog post by Benjamin Pollack: Your Language Features Are My Libraries). Well, I think that Newspeak's late binding of class names could implement all of Erwann's ideas in a library. Here's the idea: If you type Array.new, then you may expect Array to point to a specific and global class in a namespace that is the same everywhere. What if all lookups had to go through a "lobby," as it were, and each module could not demand the module, it would have to be passed. Then, we could have the lobby figure out what List.new is supposed to mean, and have it be context-dependent. (Newspeak is a dream of a programming language. It breaks my heart to see how it isn't used widely. See the Wikipedia article for an overview.)

Summary

Programming languages do not currently reflect the mental model of the developer in instructions as easy as

List l = new ArrayList();

They specify more details than the programmer wishes to, and as a result, optimizations and opportunities for automation are lost. I suspect that Newspeak would be a great platform for experiments, and I'd like to thank Erwann Wernli for sharing his concerns. His blog is great by the way. Please go and read it!

Here is Erwann Wernli's article: Erwann Wernli: Should DI and GC be unified?.