Research, huh! What is it good for?

...absolutely nothing, according to this screed. Yes, I know it's half a year old or so, but I only just came across it. Which is a nice segue, because it (and that I felt the need to make that qualification) partly illustrates something that annoys me not a little about the industry: a short memory. Also, not looking stuff up. While the overall message (that contemporary Comp Sci research is a titanic fraud that should be abolished) is both shrill and apparently right-libertarian propaganda, I have a degree of sympathy with it.

We are asked to consider those mighty figures of the past, working at institutions like the MIT AI Lab, producing "fundamental architectural components of code which everyone on earth points their CPU at a zillion times a day". OK. Let's consider them. And it's admitted that "some of [them] - perhaps even most - were, at some time in their long and productive careers, funded by various grants or other sweetheart deals with the State." No kidding.

Go take a look at, for example, the Lambda the Ultimate papers, a fine product of MIT. Now, MIT is a wealthy, independent, private university. So who paid for Steele and the rest to gift the world with the profound work of art that is Scheme? AI Memo 349 reports that the work was funded in part by an Office of Naval Research contract. The US Navy wanted "An Interpreter for Extended Lambda Calculus"? Not exactly. In 1975 "AI" meant something rather different and grander than it does today. And it was largely a government-funded exercise. This Google talk gives a compelling sketch of the way that The Valley is directly a product of the military-industrial complex, that is, of interventionist government funding. Still today, the military are a huge funder of research, and buyer of software development effort and hardware to run the results upon, which (rather indirectly) pushes vast wodges of public cash into the hands of technology firms. Or even directly: Bell Labs, for instance, received direct public funding in the form of US government contracts.

In the UK, at least, the government agencies that pay for academic research (of all kinds) are beginning to wonder, in quite a serious, budget-slashing kind of way, if they're getting value for money. So, naturally, the research community is doing some research to find out. One reason that this is of interest to me is that my boss (or, my boss's boss anyway) did some of this research. Sorry that those papers are behind a pay gate. I happen to be a member of the ACM, so I've read this one and one of the things it says is that as of 2000 the leading SCM tools generated revenues of nearly a billion dollars for their vendors. And where did the ideas in those valuable products come from? They came, largely, from research projects. What the Impact Project is seeking to do is to identify how ideas from research feed forward into industrial practice, and they are doing this by tracing features of current systems back to their sources. Let's take SCM.

The observation is made that the diff [postscript] algorithm (upon which, well, diffing and merging rely) is a product of the research community. From 1976. With subsequent significant advances made (and published in research papers) in 1984, '91, '95 and '96. Other research ideas (such as using SCMs to enforce development processes) didn't make a significant impact in industry.

Part of the goal of Impact is to:
"educate the software engineering research community, the software practitioner community, other scientific communities, and both private and public funding sources about the return on investment from past software engineering research [and] project key future opportunities and directions for software engineering research and practice [and so] help to identify the research modalities that were relatively more successful than others."
In other words, find out how to do more of the stuff that's more likely to turn out to be valuable. The bad news is that it seems to be hard to tell what those are going to be.

I focus a little on the SCM research because that original blog post that got me going claims that
"most creative programming these days comes from free-software programmers working in their spare time. For example, the most interesting new software projects I know of are revision control systems, such as darcs, Monotone, Mercurial, etc. As far as Washington knows, this problem doesn't even exist. And yet the field has developed wonderfully."
I would be very astonished to find that a contemporary (I write in early 2008–yes, 2008 and I still don't have a flying car!) update of that SCM study would conclude that the distributed version control systems were invented out of thin air in entirely independent acts of creation. (The study's authors do mention that the SCM vendors, when asked, tended to claim this of their products.)

The creation of Mercurial was a response to the commercialization of BitKeeper, and BitKeeper would seem to have been inspired by/based on TeamWare. Those seem to have all been development efforts hosted inside corporations, which is cool. I'd be interested to learn that McVoy at no time read any papers or met any researchers that talked about anything that resembled some sort of version control that was kind-of spread around the place. The Mercurial and Bazaar sites both cite this fascinating paper [pdf] which cites this posting. Which tells us that McVoy's approach to DVCS grew out of work at Sun (TeamWare) done to keep multiple SCCS repositories in sync. Something that surely more people than McVoy wanted to do. SCCS was developed at Bell Labs (and written up in this paper [pdf], in IEEE Transactions in 1975).

One of the learnings from Impact is that what look like novel ideas from a distance in general turn out, upon closer inspection, to have emerged from a general cloud of research ideas that were knocking around at the time. The techniques used in the Impact studies have developed, and this phenomenon is much more clearly captured in the later papers. So what does that tell us?

Well, it tells us that it's terribly hard to know where ideas came from, once you have them. And that makes it terribly hard to guess well what ideas are going to grow out of whatever's going on now. So perhaps there isn't a better way than to generate lots of solutions, throw them around the place and see what few of them stick to a problem. Which is going to be expensive and inefficient–upsetting for free-marketeers, but then perhaps research should be about what we need, not merely what we want (which the market can provide just fine, however disastrous that often turns out to be). Anyway, once I heard somewhere that you can't always get what you want.

Back to that original, provocative, blog posting. It's claimed, of the problems that the recent crop of DVCS systems address, that "As far as Washington knows, this problem doesn't even exist." Applying a couple of levels of synecdoche and treating "Washington" as "the global apparatus of directly and indirectly publicly-funded research", then it would perhaps be better to say that "Washington thinks that it already paid for this problem to be solved decades ago". Washington might be mistaken to think that, but it's a rather different message.
