peripatetic axiom: 2008

Interview form Agile 2008

At Agile 2008 I was interviewed by Amr Elssamadisy for InfoQ, and the video is now up.

Since it's the holidays I get to tourist around London at bit in a way that I tend not to much through the rest of the year. Today I took in the Rothko show at Tate Modern. It's the first time in a long time (maybe ever, or at least since Rothko gave them away) that so many of the colour field paintings have been in the same place at the same time.

Although these paintings are (at first sight) relatively featureless the viewer is intended to get right up close to them, to feel as if they are in the painting, to be overwhelmed. Although they are not figurative or representational they do carry a lot of content, a lot of emotional impact. There were people sitting in the room where the Seagram murals are, completely transfixed by the paintings. People have been known to burst into tears at the sight of these works.

This sort of modern art is perhaps one of the easiest for the "my four-year-old could do better" school of art critic to avoid thinking about. Of course, their four-year-old couldn't. One room of the exhibition goes some way towards explaining why not. These paintings are on (very) close examination revealed to be extremely complicated objects. Rothko put a lot of technique into these paintings: a lot of careful choice of materials, a lot of careful surface treatments, a lot of careful brush work. In the language of the curators most of them aren't a "painting" at all. They are mixed media on canvas, so complicated is the structure.

But none of this shows up when one stands before the works, only the emotional effect comes through. That's mastery. As with painting, as Steve says over here with reference to music (and code), it's all about the expressiveness.

Craftsmanship

If you read this blog you might be the kind of person who will be interested in the Software Craftsmanship conference being planned for next year in London. More proposals are needed to round out the programme, so get your thinking caps on.

XP Day London 08

Xp Day is done for another year.

I think that the incorporation of a large Open Space aspect worked, with some notes taken for doing it better in future. The programmed sessions seemed to be well received too.

Notes from many session are going up on the wiki, you might like to look, comment or contribute.

Deployment?

I'm sitting here in Zurich with some colleagues working on some guidelines for teams who are starting to use Agile in an evironment formely exclusively RUP-based. We have a stack of Agile books, all by big-name authors, from which we are farming references for them to use. Not one of these books has anything (so far as we can find) to say about deployment, actually getting the software into the hands of the user. The very word is not in the index of any of them. Oops.

"just work"

There's an idea in the management world that some work is "just work". I once had a holiday job in a factory that involved a good deal of one kind of "just work"—unpack a box of bottles of nose spray originally destined for one market and re–label them for another, put them into a different box. Well defined tasks, repeated many, many times. At the limit "just work" comes close to the physicist's idea of work: it's what happens when a force displaces its point of application.

Over the last century or so "just work" has been the topic of intensive study. Techniques for organising the execution of "just work" with the goal of maximum output are very well developed. But what is "just work" for programmers? The question arose in the comments attached to this discussion of distributed development. It also relates to some thoughts I've been having recently about the rush to adopt Lean practices in some quarters.

Andres says that

In knowledge work, “just work” is learning. [his emphasis]

If that's true then I think we have to say that a lot of contemporary development activity isn't knowledge work. Wiring up GUI controls? Transducing between HTTP requests and SQL queries? Mashing up web services? You have to learn how to do these things at all, it's true. But what do you learn while doing it once you know how? What learning takes place that is the work?

I'm reluctant to accept "finding out how to hack your way around the deficiencies of the current framework" as a form of knowledge work. Meanwhile, this maybe does give me a way to think about a curious claim (see comments #3 and #4 on that posting) from the Lean camp:

A pair team has to produce twice as much output at a given level of quality as two solo programmers in order to be worthwhile.

What would we have to believe, in order to think that this claim were true? I think that we would have to believe that the paired programmers are engaged in quite a strong sense of "just work". What is it that a pair of programmers necessarily do half as much of because they are pairing? I can't think of anything but typing. Are keystrokes really what it is that programmers do that adds value? At a slightly higher level of sophistication we might say (as Corey in fact does) that the measure of value added is function points. That's a much better measure than typing (which implies code)

Measuring programming progress by lines of code is like measuring aircraft building progress by weight.—Bill Gates

but still gives me slight pause. Function points are still a measure from the technical solution space, not the business problem space. One of the things that makes me a bit nervous about the rush to Lean is the de–emphasis on conversations with the user/customer (which Leaners have a habit of dismissing as wasteful "planning") in favour of getting tons of code written as fast as possible.

What I really want to see is getting users empowered to make money (or whatever it is they value) as fast as possible. That's knowledge work. And that requries learning.

Can you tell your POJO's from your NOJO's?

Ivan describes a nice distinction between Plain Old Java Objects in general and Non Object-oriented Java Objects in particular.

Certain (web, in particular) frameworks lead on in the direction of NOJO's. It begins with the DTO (just one kind of NOJO) and spreads from there. Is that good or bad? Can't say in the completely abstract, but if your current framework does lead you that way have you given thought to what design choices and architectural stances your are being lead to? Do you like them?

Another take on interviewing

Very nice posting here from Simon Johnson regarding the selection of candidates for programming jobs.

His blog doesn't allow comments, so I'll follow up here: The only real criticism I'd have of Simon's approach is that the list of desirable qualities in a candidate is backwards. That is, it should read

The person in front of you will complement your pre-existing team
The person in front of you can actually write a program
The person in front of you is smart

If the candidate won't complement your team it is of no interest whether or not they can program. Even if they are the semi–mythical 100x productive programmer (hint: those folks are not applying for your position).

If they can't write a program (of the kind that your company finds valuable) it is of no interest how smart they are. Folk wisdom in many parts of the industry suggests that enough smart can outweigh any other shortfall in a programmer. This is only true in very, very limited and highly specific circumstances (hint: you are not in the business of writing hard real–time life–critical massively–parallel seven–nines embedded control software for the black hole at the centre of the galaxy).

If you can get along with them, and if they can code, smarter is preferable to dumber. Your only problem now is to find some way to recognise "smarter", which turns out to be very, very hard indeed.

Registration open for XP Day London 'O8

Registration for XPDay is now open. XPDay sells out each year, so please book now. We operate a waiting list once the conference is full but only a handful of places subsequently become available.

The full cost of the conference is £350 but the first 50 people to book get a special “Early Bird” price of £275.

The few programmed sessions are set, and we encourage registered attendees to think about the topics they'd like to see discussed in the Open Space sessions. Attendees have the option to prepare a 5 minute "lightening talk" to promote their preferred topics, and these will go into the Open Space topic selection..

I look forward to seeing you there.

Programming Interview Questions

There's a cottage industry devoted to devising questions to ask folks applying for "programming" jobs, to collating such questions and then posting answers to them on the internet. I was reminded of this recently when an announcement of a new Google tool, moderator, came along. The example issue being moderated was "Interview Questions for Software Engineers". At the time of writing the winner (with 4 upvotes and 0 down votes) is my own proposal:

Would you please sit down here at this workstation and write some code with me?

If you are reading this then you have almost certainly been through many an interview for a job that will require some programming to be done. Think back and ask yourself how many times you were offered such a job without ever having been in the same room as a computer and an interviewer.

In my own experience (I've had 9 jobs involving programming, and many more interviews) it's been quite rare. One has to wonder why that is. I suspect that having a candidate do some programming is considered very expensive—and yet hiring the wrong candidate is unbelievably expensive. Hence the questions, which seem to be aimed at finding out if someone will be able able to program without watching them do any.

The second most popular question on the moderator site right as I write is

Given a singly linked list, determine whether it contains a loop or not

I can see what's being aimed at here, and yet it feels as if, in the stress of an interview setting, all the answer given will tell you is whether or not the candidate has seen the textbook answer before. How does knowing that help?

By the way, this is a systems programming question. The vast majority of working programmers don't come anywhere near any such class of problem. They do applications programming. There seems to be a sort of folk wisdom in a large part of the industry that having the skills (and, see above, mere knowledge) of a systems programmer will suite them well to being an application programmer. It's far from clear to me that this is true: the two areas of programming are quite distinct and require different (albeit overlapping) skill sets. I wonder how many interviewers use questions like this to create the impression (maybe even to themselves) that their organisation is involved in sexy, impressive systems work, whether or not that's true?

A while ago we did an informal survey of folks who work for Zuhlke in the UK, asking what had made them apply to us and what had made them subsequently accept an offer. We (the managers) were surprised by how many said that the very recruitment process itself had encouraged them to accept an offer.

This is what we do: Line mangers (such as myself) read CVs. There is an HR department, they don't come anywhere near recruitment. CVs that seem to describe a credible technical career go to the Chairman of the company. The candidates he likes the sound of are invited in for a first interview (of around an hour) to see if they are firstly, someone we'd be prepared to put in front of a client under our brand and secondly, someone we would enjoy working alongside. There's also a short written test of programming knowledge.

Those who make it through that are invited back for an interview with a small panel of our consultants who explore their technical knowledge in depth. First the candidate is asked to give a technical presentation on an interesting piece of work they did recently: stand at a white-board, explain the architecture, the technology choices, how it worked, how the project worked, what they did etc.

Then we set to finding out what they know. We have a checklist of technologies that we've used in projects and the questions take the form "what do you know about $technology?" No tricks, no puzzles. We do a breadth first pass to eliminate the things that the candidate doesn't know (and the process rewards optimistic answers at this point with a lot of embarrassment later on) and then we drill down into each area until at least one of the candidate's or the interviewers' knowledge is exhausted. This goes on for several hours.

Those who make it through that, and people have been known to walk out, get invited back to do some pair programming. That's the point (and not before) at which we decide if we'd like to make an offer.

And yes, this is all unbelievably expensive (I mean, the Chairman does first interviews?) And very, very effective. Our false positive rate is essentially zero and best of all, clients love our engineers and ask for them back again and again. Which makes it all worth while. I can't think of a cheaper way that would actually work.

PS: yes we are hiring. If you'd like a go a this, drop me a line.

97 Things Every Software Architect Should Know

The 97 things project is approaching a milestone: having 97 things which, in the opinion of one self–selecting group (which includes myself), every software architect should know. The intention is that the "best" 97 things go into a book to be published by O'Reilly, but that the forum will continue to be active, to gather new so–called "axioms" and discussion of same.

What do you think every software architect should know?

What do Android (and the rest) really mean?

If the FOSS rumblings from the mobile OS world are leaving you puzzled, you might benefit from a visit to Tim Joyce's blog. Tim's job is to understand what these things mean, and he's kindly embarking on a blog to share that understanding with the world.

Shameless self promotion dept.

I've been interviewed by Michael Hunter in his "five questions" column for Dr. Dobbs

Optimism

At the Tuesday Club the other week there was a bit of chat about what seems to be a foundational assumption of Agile: that in fact we can successfully execute software development projects, so we should run them in a way that brings about the greatest benefit.

This in contrast to what appears to be the (undeclared) foundational assumption of many other approaches: development project are so hard that we are most likely to fail, so we'd better find a way to do that as cheaply as possible.

This carried across into a panel discussion at the recent Unicom conference, where one of the attendees came up with this summary description of Agile:

Collaborative optimism to solve the right problems (and only the right problems) in an incremental way

I quite like that "collaborative optimism" bit.

Agile Adoption Patterns

I've just done a first pass through Amr Elssamadisy's cracking new book Agile Adoption Patterns. It's a timely work—enough teams have adopted enough Agile practices enough times that there should be patterns by now. And here they are.

There's a lot to like about the book. I especially appreciate the consistent message throughout that we do these practices because of their tie to business value. Not for truth and beauty, not for right versus wrong, but because they make our customers happy through the means of more money sooner, more often. There is an explicit mapping from various enhancements to business value onto clusters of the patterns. There is also mappings by business or process smell (of which there is a good, short catalogue presented). For each kind of business benefit the patterns within each cluster are logically related and given a partial order with respect to how well they help with that goal.

Separately, the clusters of patterns are described themselves as patterns, which is a nice organising principle. Agile Adoption Patterns is quite explicitly a handbook for finding out why your organisation might want to adopt Agile and then building a strategy to make that happen. A damn handy thing to have.

Scattered through the pattern catalogue there are little diagrams relating the patterns. I might wish for more of these.

The patterns themselves contain a "Sketch" section, in which a surprisingly large cast of characters play out little vignettes. Dave Developer comes to realise the value of stand–up meetings because he sees Scott ScrumMaster [sic] removing his blockers, and so forth. I know that this style of thing is very fashionable these days, but it makes me cringe. Not only that, but I have a terrible memory for names, so keeping these fourteen characters lined up in my head is a strain.

The patterns have the usual "Therefore" section with the things–to–do in it, followed by and "Adoption" section telling you how to get there. They are all quite meaty, concrete and honest. And best of all is the "But" section telling you common mis–steps, gotchas, misunderstandings and how to recognise same. This is outstanding! A common shorthand description of what a pattern is (in fact, it's in this book too) is words to the effect of a "solution to a problem in a context" Which is sort–of true a far as it goes but has its problems, mostly around the term ''solution".

Oh, how we love "solutions" in the IT industry. A more accurate but less compelling description what patterns are is that they are mappings form one context to another, in which certain conflicting forces are more agreeably disposed (or resolved) in the latter than in the former. These "But" sections describe common scenarios (I've seen a lot of them, and so have you, I'll bet: Continuous Integration but continuously broken build, anyone?) that you could have and still make some sort of Jesuitical argument to be in agreement with the "Therefore", but actually are very wide of the mark. The next time I write any patterns, they will have a "But" section.

The highest level organisation of the patterns is into those concerned primarily with Feedback, technical practices, supporting practices and the cluster patterns. I was most interested to note that "Coach" is a supporting practice, but the role of ScrumMaster is assumed throughout.

I am one of the latter, but I'd rather be one of the former. Not so much of an option these days. Well, that's the power of marketing for you.

Meanwhile, one of the nicest things about having patterns is having a shared vocabulary. I think that Agile Adoption Patterns will become the standard reference for talking about Agile adoption, which will make my professional life easier and that of may others. A service to the community.

"Opportunities for Improvement"

The pattern form used is has nine sections and I'm not sure it helps to have each section shown in the table of contents. I feel the lack of a ready–to–hand one or two page list of all the patterns by name. Twelve pages of TOC for 320 pages of content seems a bit excessive. The same syndrome shows up in the index. If all the parts of a pattern are within a four page range do they really all need their own individual index entry? This is really a shame as the book otherwise benefits from the several other complementary ways to navigate it.

There is a stunningly egregious formatting botch on page 207. I'm sure this volume will go into multiple printings and I hope that someone at A–W gets that fixed.

The Non–functional in a Functional world?

This post shows an implementation of map/reduce in Haskell. It's a one–liner (if you ignore all the other lines) and begins with this claim:

the type says it all
p_map_reduce :: ([a] -> b) -> (b -> b -> b) -> [a] -> b

The type says a lot, that's for sure. I says that map/reduce results in a value of type b when given a list of values of type a, a function that gives one b given two b's and a function that gives one b given a list of a's. That is a lot of information and it's impressive that it (at least) can be stated on one line. Is it "all", though?

Even allowing for some inter–tubes posting hyperbole "all" is a very strong claim. A Haskell programmer should know that. If one already knows what map/reduce does then the type tells you a lot about how to use this implementation of it. And if you don't, maybe not so much.

But that's not what caught my attention. The point of map/reduce is to recruit large pools of computational resource to operate on very large data sets efficiently. The (14 line, in fact—which is still quite impressive) implementation uses Haskell's Control.Parallel features to chop the list of a's into 16 chunks. That's not in the type. The intention is to have enough lumps of work to

bump all your cores to 100% usage, and linearly increase overall performance

which it may well do on a large enough data set, if I have fewer than seventeen cores. Currently, I do. There are nine cores in my flat right now. If my other laptop were here, it'd be eleven. But if I'm planning to use map/reduce in the first place, maybe I'm going to recruit all the machines in the office, which would currently be worth over twenty cores. This map/reduce isn't going to pin all of them.

Would Haskell's type system support giving p-map-reduce a dependent type with the number of worker processes in it? I don't know enough about it to tell. And why would we care? Well, that number 16 is a very significant part of the implementation of that function.

The only reason we'd bother to use map/reduce is because of its non–functional characteristics (as the properties of software that make it actually useful are called) and these are no–where to be seen in its type as a function.

Computer Weekly IT Blog Awards

The shortlist for the Computer Weekly IT Blog Awards 2008 is just out.

This blog was nominated (most kind, thanks), as were those of my colleagues Ben and Daryn, but Daryn's has made it onto the shortlist. This is fitting recognition for the fine work he has put into his blog, particularly his series of Ruby/Rails tutorials which are gathering some recongnition in that world. Feel free to go and vote for him!

XPDay London Session Submission Process

If you want to propose a session for XP Day London '08 (and I hope that you do), then here's what you need to do.

We want the conference to be largely self–organizing this year, so the submission process is suitably light weight.

Experience Reports

Please send a very brief outline of your report to submissions2008@xpday.org You will receive a URL to a google document. Make sure that you include an email address corresponding to a google account (or let us know if the address you send from is the one to use). The document will be writeable by you and the committee and all other submitters. Develop your report there. As in past years we invite the community to collaborate together to help maximize the quality of all submissions. After the close of submissions the committee will make a selection of experience reports to be presented at the conference. The selected reports will be shepherded between acceptance and the conference.

Conference Sessions

Most of the duration of the conference (other than experience reports) will be OpenSpace. There will be Lightning Talks early in the day where people can publicize topics that they would like to discuss. We encourage people to engage in this way. There will be optional rehearsals at XtC in the weeks running up to the conference. You do not need to attend these to make a lightening talk at the conference. If you would like to have such a rehearsal, send a title and a google account email address to submissions2008@xpday.org You will be allocated to a rehearsal slot. We invite groups outside London to schedule rehearsals too, and are happy to help with scheduling those. There will be a google calendar with the slots shown.

Some kinds of session can benefit from some preparation of materials and some logistics on the day. For example, a workshop involving some sort of server, or extensive materials. There will be a limited number of time slots during the conference for such sessions. Please submit a brief outline to submissions2008@xpday.org indicting an email address associated with a google account. You will be sent the URL of a google document within which to develop your proposal in collaboration with other submitters. After the close of submissions these proposals will be assessed by the committee and suitable sessions will be selected for the program.

We have decided to extend the submission period, so submissions for experience reports, rehearsals and programmed sessions now closes on Friday August 8th 2008.

This year we want to de–emphasize sessions introducing Agile or Scrum or XP or TDD or... and promote topics of current interest to practitioners. We want the conference to be a forum in which the state of the art is advanced. That doesn't mean that only experts are welcome, or welcome to present. Experts have their failure modes, simple questions from journeymen often reveal the essence.

All are welcome, so get your thinking caps on!

TDD, Mocks and Design

Mike Feathers has posted an exploration of some ideas about and misconceptions of TDD. I wish that more people were familiar this story that he mentions:

John Nolan, the CTO of a startup named Connextra [...] gave his developers a challenge: write OO code with no getters. Whenever possible, tell another object to do something rather than ask. In the process of doing this, they noticed that their code became supple and easy to change.

That's right: no getters. Well, Steve Freeman was amongst those developers and the rest is history. Tim Mackinnon tells another part of the story. I think that there's actually a little bit missing from Michael's decription. I'll get to it at the end.

A World Without Getters

Suppose that we want to print a value that some object can provide. Rather than writing something like statement.append(account.getTransactions()) instead we would write something more like account.appendTransactionsTo(statement) We can test this easily by passing in a mocked statement that expects to have a call like append(transaction) made. Code written this way does turn out to be more flexible, easier to maintain and also, I submit, easier to read and understand. (Partly because) This style lends itself well to the use of Intention Revealing Names.

This is the real essence of TDD with Mocks. It happens to be true that we can use mocks to stub out databases or web services or what all else, but we shouldn't. Not doing that leads us to write code for each sub-domain within our application in terms of very narrow, very specific interfaces with other sub-domains and to write transducers that sit at the boundaries of those domains. This is a good thing. At the largest scale, with functional tests, it leads to hexagonal architecture. And that can apply equally well recursively down to the level of individual objects.

The next time someone tries to tell you that an application has a top and a bottom and a one-dimensional stack of layers in between like pancakes, try exploring with them the idea that what systems really have is an inside and a outside and a nest of layers like an onion. It works remarkable wonders.

If we've decided that we don't mock infrastructure, and we have these transducers at domain boundaries, then we write the tests in terms of the problem domain and get a good OO design. Nice.

The World We Actually Live In

Let's suppose that we work in a mainstream IT shop, doing in-house development. Chances are that someone will have decided (without thinking too hard about it) that the world of facts that our system works with will live in a relational database. It also means that someone (else) will have decided that there will be a object-relational mapping layer, based on the inference that since we are working in Java(C#) which is deemed by Sun(Microsoft) to be an object-oriented language then we are doing object-oriented programming. As we shall see, this inference is a little shaky.

Well, a popular approach to this is to introduce a Data Access Object as a facade onto wherever the data actually lives. The full-blown DAO pattern is a hefty old thing, but note the "transfer object" which the data source (inside the DAO) uses to pass values to and receive values from the business object that's using the DAO. These things are basically structs, their job is to carry a set of named values. And if the data source is hooked up to an RDBMS then they more-or-less represent a row in a table. And note that the business object is different from the transfer object. The write-up that I've linked to is pretty generic, but the inference seems to be invited that the business object is a big old thing with lots of logic inside it.

A lot of the mechanics of this are rolled up into nice frameworks and tools such as Hibernate. Now, don't get me wrong in what follows: Hibernate is great stuff. I do struggle a bit with how it tends to be used, though. Hibernate shunts data in and out of your system using transfer objects, which are (lets say) Java Beans festooned with getters and setters. That's fine. The trouble begins with the business objects.

Irresponsible and Out of Control

In this world another popular approach is, whether it's named as such or not, whether it's explicitly recognized or not, robustness analysis. A design found by robustness analysis (as I've seen it in the wild, which may well not be be what's intended, see comments on ICONIX) is built out of "controllers", big old lumps of logic, and "entities", bags of named values. (And a few other bits and bobs) Can you see where this is going? There are rules for robustness analysis and one of them is that entities are not allowed to interact directly, but a controller may have many entities that it uses together.

Can you imagine what the code inside the update method on the GenerateStatementController (along with its Statement and Account entities) might look like?

Hmmm.

Classy Behaviour

Whenever I've taught robustness analysis I've always contrasted it with Class Responsibility Collaboration, a superficially similar technique that produces radically different results. The lesson has been that RA-style controllers always, but always, hide valuable domain concepts.

It's seductively easy to bash in a controller for a use case and then bolt on a few passive entities that it can use without really considering the essence of the domain. What you end up with is the moral equivalent of stored procedures and tables. That's not necessarily wrong, and it's not even necessarily bad depending on the circumstances. But it is completely missing the point of the last thirty-odd years worth of advances in development technique. One almost might as well be building the system in PRO*C

Anyway, with CRC all of the objects we find are assumed to have the capability of knowing things and doing stuff. In RA we assume that objects either know stuff or do stuff. And how's a know-nothing stuff-doer get the information to carry out its work? Why, it uses a passive knower, an entity which (ta-daaah!) pops ready made out of a DAO in the form of a transfer object.

And actually that is bad.

Old Skool

Back in the day the masters of structured programming[pdf] worried a lot about various coupling modes that can occur between two components in a system. One of these is "Stamp Coupling". We are invited to think of the "stamp" or template from which instances of a struct are created. Stamp coupling is considered (in the structured design world) one of the least bad kinds of coupling. Some coupling is inevitable, or else your system won't work, so one would like to choose the least bad ones, and (as of 1997) stamp coupling was a recommended choice.

OK, so the thing about stamp coupling is that it implicitly couples together all the client modules of a struct. If one of them changes in a way that requires the shape of the struct to change then all the clients are impacted, even if they don't use the changed or new or deleted field. That actually doesn't sound so great, but if you're bashing out PL/1 it's probably about the best you can do. Stamp coupling is second best, with only "data" coupling as preferable: the direct passing of atomic values as arguments. Atomic data, eh? We'll come back to that.

However, the second worst kind of coupling that the gurus identified was "common coupling" What that originally meant was something like a COMMON block in Fortran, or global variables in C, or pretty much everything in a COBOL program: just a pile of values that all modules/processes/what have you can go an monkey with. Oops! Isn't that what a transfer object that comes straight out of a (single, system-wide) database ends up being? This is not looking so good now.

What about those atomic data values? What was meant back in the day was what we would now call native types: int, char, that sort of thing. The point being that these are safe because it's profoundly unlikely that some other application programmer is going to kybosh your programming effort by changing the layout of int.

And the trouble with structs is that they can. And the trouble with transfer objects covered in getters and setters is that they can, too. But what if there were none...

Putting Your Head in a Bag Doesn't Make you Hidden

David Parnas helped us all out a lot when in 1972 he made some comments[pdf] on the criteria to be used in decomposing systems into modules

Every module [...] is characterized by its knowledge of a design decision which it hides from all others. Its interface or definition was chosen to reveal as little as possible about its inner workings.

Unfortunately, this design principle of information hiding has become fatally confused with the implementation technique of encapsulation.

If the design of a class involves a member private int count then encapsulating that behind a getter public int getCount() hides nothing. When (not if) count gets renamed, changed to a big integer class, or whatever, all the client classes need to know about it.

I hope you can see that if we didn't have any getters on our objects then this whole story unwinds and a nasty set of design problems evaporate before our eyes.

What was the point of all that, again?

John's simple sounding request: write code with no getters (and thoroughly test it, quite important that bit) is a miraculously clever one. He is a clever guy, but even so that's good.

Eliminating getters leads developers down a route that we have known is beneficial for thirty years, without really paying much attention. And the idea has become embedded in the first really new development technique to come along for a long time: TDD. What we need to do now is make sure that mocking doesn't get blurred in the same was as a lot of these other ideas have been.

First step: stop talking about mocking out infrastructure, start talking about mocks as a design tool.

Image

Rick Minerich asks the question "Why are our programs still represented by flat files?" To which my answer is "not all of them are."

Judging by the tag cloud on his blog, Rick works mainly in the Microsoft end of the enterprisey world. If VisualStudio is the main front end to your system then I can see why the question would arise. Of course, in other parts of the IT world programs stopped being in flat files a long time ago. Or even never were. But you have to go looking for that world. As luck would have it, Gilad Bracha has just recently explained some of the very pleasant aspects of that world. But so many programmers either don't know it exists, or once shown it refuse to enter.

A Painful Learning Experience Revisited

Painful but valuable. This old thing has popped up onto folks' radars again, with some interesting discussion here.

I'm sad that the PoMoPro experiment kind-of fizzled out. I occasionally try to talk Ivan into doing a second conference, but without success.

XP Day London Call

I'm pleased to announce that the call for submissions to XpDay London 2008 is now open.

Get your thinking caps on, and in due course an electronic submission system will be announced.

The Process for Programming (and why it might not help)

Paul Johnson writes about the infamous 1:10 productivity ratios in programing, and suggests that we don't know how to program, that is, there is no process for programming.

I take the view that on the contrary, we do know how to program, and there pretty much is a process for programming, but it isn't much recognized because it doesn't do what so many people so desperately want a process for programming to do.

The process for programming that I have in mind is very much like the process for doing maths—I suspect that this is why (reputedly) everyone who goes to work as a programmer at Microsoft is given a copy of How to Solve It. What both activities have in common is a need to do imaginative, creative problem solving within strict constraints.

Paul lists a bunch of activities, of which he suggests that "the core activity is well understood: it is written down, taught and learned". One of these is "curing a disease".

I find that questionable. Have you ever watched House? Apart from being a very entertaining update of Sherlock Holmes, what goes on in the early seasons of House is (within the limits of television drama, to the eyes of this layman) pretty close to the actual investigations and diagnoses I have read in The Medical Detectives, The Man Who Mistook His Wife for a Hat and a few other of the bits of source material. To a surprising extent, they aren't making that stuff up. See how the diagnostic team blunder about? See how they follow false leads? See how some apparently unrelated fact combines in the prepared mind with a huge mass of prior experience to prompt new ideas? See how they must take a chance on something that they don't know will work?

There are definite tools that are used again and again, primarily differential diagnosis (which, BTW, makes the average episode of House into a good short course on debugging), and the various physiological tests and so forth, but mostly it's a synthesis of knowledge and experience. If you ever get the chance to hear Brian Marick tell his story about how vets learn to recognize "bright" vs "dull" cows you'll get a good idea of just how the teaching of medicine works. And it works to a large extent by the inculcation of tacit knowledge.

Tacit knowledge is terribly hard stuff to deal with, it's in the "unknown knowns" quadrant. One of the things that makes teaching a skill difficult is the manifestation of tacit knowledge *. A bit part of what a lot of people want a "process" for programming to do is to make explicit exactly that tacit knowledge.

But there is only one way to do that really well, which is exactly why things like teaching hospitals exist. The way to do it is to try repeatedly with feedback from an expert. The external feedback is crucial. You may have heard somewhere that practice makes perfect. In fact, all practice does by itself is make consistent. It's the expert feedback that leads to improvement.

This one reason why folks are unhappy with what we do know about how to program: it cannot be learned quickly from a book, from a one-week course. There is another reason, perhaps worse: what we know about programming does not (and if parallel with other disciplines based on skill and judgment, it never will) equip programmers to produce a given outcome on demand, on time, every time.

The idea has got around (originating perhaps with "One Best Way" Taylor) that a process for any given job should be like a recipe—follow it exactly and get the same result every time. In some spheres of work such things are possible. Programming does not seem to be one of them, no matter how much we might wish it.

* Design (and other kinds) of patterns are in part about capturing this stuff, which they do variously well. I suspect that this is part of the reason why more experienced folks who haven't seen them before say things like "that's obvious" or "anyone who doesn't know/couldn't work that out shouldn't be a programmer".

XpDay 2008

XpDay London will be on 11 and 12 of December, at Church House in the City of Westminster. The call for submissions will come soon.

We're aiming for something a little different this year. The conference committee (in as far as it has one) has decided that if we belong to a community that really does value individuals and their interactions more than processes and tools, and responding of change more than following a plan, then our conference should work in a way consonant with that. I'm Programme Co-chair, so a lot of the mechanics of how that will work in practice fall to me to figure out–and this is almost done. I'll keep you posted.

Why You Aren't Donald Knuth

Nice interview over at InformIT (it's that Binstock guy again, why had I not heard of him a couple of weeks ago?) with Prof Knuth.

He makes this observation:

[...]]the idea of immediate compilation and "unit tests" appeals to me only rarely, when I’m feeling my way in a totally unknown environment and need feedback about what works and what doesn’t. Otherwise, lots of time is wasted on activities that I simply never need to perform or even think about. Nothing needs to be "mocked up."

I believe him. I also draw a few inferences.

One is that Knuth probably doesn't spend much time working on systems that do their work by intimately coördinating (with) six or eight other systems with very different usage patterns, metaphors, implementation stacks, space and time complexities, latencies, interface styles/languages/protocols etc. etc. etc.

Another is that he probably doesn't spend much time working on problems that are incompletely and ambiguously described (and that can't be fixed in a meaningful time and/or budget), part of an ad–hoc domain, intrinsically fuzzy, aiming or a moving target from a moving platform, subject to capricious and arbitrary constraints, etc. etc. etc.

And thus is the working life of the research computer scientist different from the working life of the commercial software engineer. I'm tempted to say (and so I will) that the jobbing industrial programmer spends more time in an unknown environment and needs more feedback (and, right now!) about what works and what doesn’t more often than a researcher does.

Postscript

By the way, anyone* who reads that interview and thinks to themselves "ha! See! Knuth doesn't unit test, so I don't need to either" needs to consider a couple of things:

you aren't Don Knuth
are you really prepared to do all the other stuff that Knuth does so that his programs still work without unit testing?
if so, whom do you imagine is going to pay for you to?

* The "I'm too clever to test" crowd does exist. I'd be amazed if any of them read this blog, but those of you who do will likely meet them from time to time. An electron-microscopically tiny proportion of them are right (see consideration 1).

Creation Under Constraints

It's a symptom of the bourgeoisie's self-hatred that creation (or worse yet, creativity) is seen as a wild, undisciplined, anarchic thing. And yet every person who'd be called an artist that I've ever met has been passionately interested in form and structure and rules and schemata and in doing work under constraints that force them to try new things. That's right, introducing constraints can increase creativity, can bring to light new and exiting possibilities that were, so to speak, hidden by the freedom.

Haiku. Silverpoint drawing. Strict counterpoint. You wouldn't want to not have any other way to express yourself, but what a learning exercise!

Andrew Binstock relates a certain set of constraints within which to practice OO programming.

One of them I like a huge amount:

Wrap all primitives and strings. [...] So zip codes are an object not an integer, for example. This makes for far clearer and more testable code.

It certainly does, which was the topic of the session which Ivan and I ran at Spa this year.

Another is this:

Don’t use setters, getters, or properties. [...] “tell, don’t ask.”

Indeed. What we now know as jMock was invented to support this style of programming. It works a treat.

This one, though, I'm a bit equivocal about:

Don’t use the ‘else’ keyword. Test for a condition with an if-statement and exit the routine if it’s not met.

See, depends on what's meant. It could be that a trick is being missed. On the one hand, I'm a huge fan of code that returns early (if(roundHole.depthOf(squarePeg).sufficient()) return;), and also of very aggressive guard clauses (if(i.cantLetYouDoThat(dave)) throw new DeadAstronaut();) On the other, I believe that the constraint that does most to force you to "get" OO is this:

Don't use any explicit conditionals. Never mind else, use no if.

Can you see how that would work? I learned this so long ago that I can't remember where from (by some route from "Big" Dave Thomas, perhaps?)

Oh yes, and for bonus points, Andrew mentions this: "Don’t use any classes with more than two instance variables". I suggest that you try not using any classes that have any instance variables, and see how far you get. You might find that to make progress you need a better language.

Pedanto-linguistics

So, there's a couple of systems that I've been using fairly intensively in the last couple of months, both concerned with the programme for Agile 2008 (which is looking pretty good, BTW).

One of them complains if one tries to browse certain URLs without having first logged in. In that case it tells me "You are not authorized to access this page". Which irks me every time I see it, because of course I am authorized to access that page, I just haven't yet presented the credentials that tell the system so. Which is what the system should invite me to do, rather than telling me to get hence.

The other system allows collaborative editing of documents and shows who last edited each file. When I am the person who last edited the file it says that the editor was "me". Which should, of course, be "you".

This sort of thing is a small, but I think under-appreciated, source of the discomfort that many people feel when using IT systems. We (I write as an implementor of systems) should be more careful in this regard.

Why bzr?

I've long been a (very contented) P4 user, both at work and at home. It's less than great to be emailing patches around, though. As I've been doing more work on measure I've found that I want more and more to do version controlled work on the same code here, there and everywhere. So, the new burst of interest in DVCS's has come along at just the right time.

I've chosen Bazaar as my first DVCS. Ed asked me why, and the answer turned out to be quite interesting (to me).

Firstly, why not git? Well, I am in general not a member of the linux world mainly because computers are for me (these days, anyway) a means not an end. I no longer enjoy installing this and (re)configuring that and (re)compiling the other just for the sheer joy of using up cycles and feeling as if I know something that others don't. Nor for squeezing out the last scintilla of performance, nor whatever it is that the 1337 do these days. I want a computer to do pretty much what I need out-of-the-box. That's why I'm a Mac user. So the idea that git provides the "plumbing" and then the "porcelain" (nice metaphor...) is built on top, well I'm bored already.

What does that leave? I head that darcs has these nasty performance cliffs that it falls off of, so no dice there. I have heard good things about Bazaar (specifically, that Nat likes it) so I'll go with that for now. So far, I really like it. Setting up exactly the distributed topology I need with the protections I want is proving to be a bit of a challenge, and the documentation seems to assume that I'm very familiar with...something...that I seem not to be (and I can't quite figure out what it is to go learn it, either), but so far so good.

I find myself a bit disappointed, though, that the FOSS crew have run quite so hard and fast with this. It's kind of a shame that people like Ed and myself can semi-legitimately get involved in a conversation about which DVCS and why. Summed over all the people like Ed and myself that's a lot of mental energy being poured down a black hole, across the industry as a whole. Especially since it seems as if there is almost nothing to choose between these tools. If only a strong industry player could have set the scene against which others could vary, but BitKeeper didn't (thinks: how much of this mess is really about sticking it to ~~Larry McVoy~~ "the man"?), and git hasn't, and now we have a soup of peers and all these nugatory factors to consider. What a waste.

PS: especially irksome is that I once had the pleasure of working with VisualAge for Java Micro Edition, which got all this pretty much exactly right a good long time ago.

Pols' Perfect Purchaser

Andy Pols tells a great story about just how great it can be to have your customer 1) believe in testing and 2) completely engage in your development activities. Nice.

Subtext continues to amaze

I just watched the video describing the latest incarnation of Subtext. The video shows how the liveness, direct manipulation and example–based nature of this new table–based Subtext makes the writing and testing of complex conditionals easy.

If you develop software in a way that makes heavy use of examples and tables, you owe it to yourself to know about Subtext.

A Catalogue of Weaknesses in Python

You may have read somewhere that "Patterns are signs of weakness in programming languages", and perhaps even that "16 of 23 [GoF] patterns are either invisible or simpler [in Lisp]", from which some conclude that dynamic languages don't have/need patterns, or some such.

Interesting, then, that a recent Google dev day presentation (pdf, video) provides us with a list of ~~terrible weaknesses~~ great patterns in Python.

Language

Steve Freeman ponders if maybe the reason that I find statistical properties that seem to resemble those of natural languages in the code of jMock is because jMock was written to be like a language...

Some Spa 2008 Stuff

Chris Clarke has made a post here exploring the ideas that Ivan Moore and I presented at our Spa 2008 workshop Programming as if the Domain Mattered. Ivan has written up some of his learnings from the session here. I'll be doing the same myself soon.

Chris makes this most excellent point:

I wish people would be a bit braver and use the code to express what they are trying to do and not worry about whether the way they are doing it is against Common Practice. Remember, the majority of software projects are still failures, so why follow Common Practice - it isn’t working!

Quite.

In other news, my experience report on the effects of introducing checked examples (aka automated acceptance/functional/user/whatever "tests") gets this thorough write up from "Me" (who are you, Me?) and also a mention in this one from Pascal Van Cauwenberghe.

Thanks, folks.

Design Patterns of 1994

So, Mark Dominus' Design Patterns of 1972 has made it back to the front page of proggit.

He offers a pattern-styled write up of the idea of a "subroutine" and opines that:

Had the "Design Patterns" movement been popular in 1960, its goal would have been to train programmers to recognize situations in which the "subroutine" pattern was applicable, and to implement it habitually when necessary

which would have been disastrous. Also:

If the Design Patterns movement had been popular in the 1980's, we wouldn't even have C++ or Java; we would still be implementing Object-Oriented Classes in C with structs

Actually, I think not. Because we did have patterns in the 1990's and, guess what, programming language development did not cease. Not even in C++.

Debunking Debunking Cyclomatic Complexity

Over at SDTimes, this article by Andrew Binstock contains a claim that this result by Enerjy somehow "debunks" cyclomatic complexity as an indicator of problems in code. He suggests that what's shown is that for low complexity of methods (which is overwhelmingly the most common kind of complexity of methods) increasing complexity of methods is not (positively) correlated with the likelihood of defects. Binstock suggests that:

What Enerjy found was that routines with CCNs of 1 through 25 did not follow the expected result that greater CCN correlates to greater probability of defects.

Not so. What Enerjy say their result concerns is:

the correlation of Cyclomatic Complexity (CC) values at the file level [...] against the probability of faults being found in those files [...]). [my emphasis]

It's interesting all by itself that there's a sweet spot for the total complexity of the code in a file, which for Java pretty much means all the methods in a class. However, Binstock suggests that

[...] for most code you write, CCN does not tell you anything useful about the likelihood of your code’s quality.

Which it might not if you only think about it as a number attached to a single method, and that there are no methods of high complexity. But there are methods of high complexity—and they are likely to put your class/file into the regime where complexity is shown to correlate with the likelyhood of defects. Watch out for them.

Red and Green

Brian Di Croce sets out here to explain TDD in three index cards. It's a nice job, except that I'm not convinced by the faces on his last card.

Brian shows a sad face for the red bar, a happy face for the green, and a frankly delirious face for the refactoring step. There's something subtly wrong with this. We are told that when the bar is green the code is clean, which is great. But the rule is that we only add code to make a failing test pass, which implies a red bar. So, the red bar is our friend!

When I teach TDD I teach that green bar time is something to get away form as soon as possible. Almost all the time a green bar occurs when a development episode is incomplete: not enough tests have been written for the functionality in hand, more functionality is expected to go into the next release, or some other completeness condition is not met.

It's hard to learn form a green bar, but a red bar almost always teaches you something. Experienced TDDer's are very (and rightly) suspicious of a green-to-green transition. The green bar gives a false sense of security.

Generally speaking, in order to get paid we need to move an implementation forward, and that can only be done on a red bar. Wanting to get to the next red bar is a driver for exploring the functionality and the examples that will capture it.

I tell people to welcome, to embrace, to seek out the red bar. And that when the bar is red, we're forging ahead.

TDD at QCon

Just finished a presentation of the current state of my TDD metrics thinking at QCon. The slides [pdf] are up on the conference site, video should be there soon, too.

Tests and Gauges

At the recent Unicom conference re Agility and business value I presented on a couple of cases where I'd seen teams get real, tangible, valuable...value from adopting user/acceptance/checked-example type testing. David Peterson was also presenting. He's the creator of Concordion, an alternative to Fit/Library/nesse. Now, David had a look at some of my examples and didn't like them very much. In fact, they seemed to be the sort of thing that he found that Fit/etc encouraged, of which he disapproved and to avoid which he created Concordion in the first place. Fair enough. We tossed this back and forth for a while, and I came to an interesting realization. I would in fact absolutely agree with David's critique of the Fit tests that I was exhibiting, if I though that they were for the purpose that David thinks his Concordion tests are for. Which probably means that any given project should probably have both. But I don't think that, so I don't.

Turning the Tables

David contends that Fit's table–oriented approach affords large tests, with lots of information in each test. He's right. I like the tables because most of the automated testing gigs that I've done have involved financial trading systems and the users of those eat, drink, and breath spreadsheets. I love that I can write a fixture that will directly parse rows off a spreadsheet built by a real trader to show the sort of thing that they mean when they say that the proposed system should blah blah blah. The issue that David sees is that these rows probably contain information that is, variously: redundant, duplicated, irrelevant, obfuscatory and various other epithets. He's right, often they do.

What David seems to want is a larger number of smaller, simpler tests. I don't immediately agree that more, simpler things to deal with all together is easier than fewer, more complex things, but that's another story. And these smaller, simpler tests would have the principle virtue that they more nearly capture a single functional dependency. That's a good thing to have. These tests would capture all and only the information required to exercise the function being tested for. This would indeed be an excellent starting point for implementation.

There's only one problem: such tests are further away from the users' world and close to the programmers' . All that stuff about duplication and redundancy is programmer's talk. And that's fair enough. And its not enough. I see David's style of test as somewhat intermediate between unit tests and what I want, which is executable examples in the users' language. When constructing these small, focussed tests we're already doing abstraction, and I don't want to make my users do that. Not just yet, anyway.

So then I realised where the real disagreement was. The big, cumbersome, Fit style tests are very likely too complicated and involved to be a good starting point for development. And I don't want them to be that. If they are, as I've suggested, gauges, then they serve only to tell the developers whether or not their users' goals have been met. The understanding of the domain required to write the code will, can (should?) come from elsewhere.

Suck it and See

And this is how gauges are used in fabrication. You don't work anything out from a gauges. What you do is apply it to see if the workpiece is within tolerance or not. And then you trim a bit off, or build a bit up, or bend it a bit more, or whatever, a re–apply the gauge. And repeat. And it doesn't really matter how complicated an object the gauge itself is (or how hard it was to make—and it's really hard to make good gauges), because it is used as if it were both atomic and a given. It's also made once, and used again and again and again and again...

Until this very illuminating conversation with David I hadn't really fully realised myself quite the full implications of the gauge metaphor. It actually implies something potentially quite deep about how these artifacts are built, used and managed. Something I need to think about some more.

Oh, and when (as we should) we start to produce exactly those finer–grained, simpler, more focussed tests that David rightly promotes, and we find out that the users' understanding of their world is all denormalised and stuff, what interesting conversations we can have with them then about how their world really works, it turns out.

Might even uncover the odd onion in the varnish. But let's not forget that having the varnish (with onion) is more valuable to them than getting rid of the onion.

Curiously Apt

A friend is embarking upon a conversion MSc into the wonderful world of software development. He's become interested in the currently en vogue paradigms of programming, their relationships and future. It seems to him (he says) that OO is very much about standing around a whiteboard with your friends, sipping a tall skinny latte while your pizza goes cold. And by contrast functional programming is like sitting alone, crying into your gin with your head held in your hands over some very, very, hard maths.

Perceptive. He'll go far. And not just because he's already had this other successful career dealing with actual customers (which should be a stronger prerequisite for becoming a commercial developer than any of that comp sci stuff).

Compiler Warnings

This Daily WTF reminded me of a less than glorious episode from my programming past. At a company that shall remain namless (and doesn't exist anymore) I was working on a C++ library to form part of a much larger application.

One fine day a colleague came stomping (that's the only verb that suits) over to my desk and boldly announced that "your code has crashed my compiler". Somewhat alarmed at this I scurried (yes, scurried) back to his desk. "Look" he said, and did a little cvs/make dance. Lo and behold, when the compiler got to my code indeed it fell to silently chugging away. "See," he resumed, "it's crashed."

"No," I said, "it's just compiling."

"So where," he asked, "are the messages?"

"What messages?" I replied. He scrolled the xterm up a bit.

"These. All the warnings and stuff"

"Doesn't generate any," I said.

He boggled. "What, have you found some way of turning them off?"

"No," I said, "I just wrote the code so that it doesn't generate any warnings."

He boggled some more. "Why," he eventually managed to gasp, "would you bother doing that?"

I didn't last long there. It was best for all concerend, really.

Golden Hammers vs Silver Bullets?

The old wiki page on Golden Hammers has popped up on reddit. One commentator suggests that a golden hammer seems to be the same as a silver bullet.

Probably. But I find the two phrases suggestive when placed side-by-side. To me, it seems as if a golden hammer is a tool that's very familiar, simple to apply, and words best on things within striking distance: everyday problems. All these screws sticking out all over, bam, bam, bam. A golden hammer is something we already have, and it worked great lots of time before, so lets carry on using it for whatever comes next.

A silver bullet, on the other hand, seems like something that is directed in from the outside. It is a projectile, the sender has no control over it once launched. The silver bullet is new and alien. It requires complex tooling to make it work. It is deployed against hairy monsters that jump out at you. Nothing ever worked on them before, maybe the silver bullet will?

A golden hammer, then, would be the over-applied old tool of a worker who doesn't learn new ones. The silver bullet is the disengaged outsider's agent of violent change.

I love having new distinctions to play with.

Agile 2008

Just a reminder that the call for submissions to the Agile 2008 conference closes on the 25th for Feb. I strongly encourage those who are interested in agile methods to go, and I strongly encourage those who are going to submit sessions.

I'm assistant producer (vice Steve Freeman) of the "stage" called Committing to Quality, which is a home for sessions concerned with the strong focus on internal and external quality in agile development.

I've been watching the submission system since the call opened and there are some really interesting sessions being proposed. Looks as if it's going to be a good conference.

See you there!

UK Conference Season

It's getting round to be conferece season again. If you happen to be in the UK in the next few months, maybe I'll see you at UNICOM, where I'll be talking about some adventures with automated testing.

Or perhaps at QCon, where I'll be presenting the latest news on my metrics work and joining in a panel with Beck and others, both part of the XPDay Taster track, a cross-over from the XPDay events.

Or even at Spa (that oh so magical automated testing again).

Tolstoy's Advice on Methodology

So, Tolstoy tells us that all happy families are all the same, whereas each unhappy family is unhappy in its own way. I think that the same applies to development projects: all successful projects are the same, unsuccessful ones fail in their own ways.

This occurred to me while chatting with one of my Swiss colleagues recently. The Swiss end of the business does a lot of successful RUP engagements: that's right, successful RUP. The reason that they are successful is twofold. Firstly, they always do a development case so they always carefully pick and choose which roles, disciple and the rest they will or will not use on a project. Secondly, they understand that RUP projects really are supposed to be really iterative and really incremental, that almost all the disciplines go on to a greater or lesser extent in all phases. A (different) Swiss colleague once asked me what the difference was between Agile and RUP. My only semi-flippant answer was that if you do RUP the way Philippe Kruchten wants you to do it, then not much.

Another take on DSLs, or "Why Ruby doesn't quite do it for me"

If you liked that, then you might also like this.

Research, huh! What is it good for?

...absolutely nothing, according to this screed. Yes, I know it's half a year old or so, but I only just came across it. Which is a nice segue, because it (and that I felt the need to make that qualification) partly illustrates something that annoys me not a little about the industry: a short memory. Also, not looking stuff up. While the overall message (that contemporary Comp Sci research is a titanic fraud that should be abolished) is both shrill and apparently right-libertarian propaganda, I have a degree of sympathy with it.

We are asked to consider those mighty figures of the past, working at institutions like the MIT AI Lab, producing "fundamental architectural components of code which everyone on earth points their CPU at a zillion times a day". OK. Let's consider them. And it's admitted that "some of [them] - perhaps even most - were, at some time in their long and productive careers, funded by various grants or other sweetheart deals with the State." No kidding.

Go take a look at, for example, the Lambda the Ultimate papers, a fine product of MIT. Now, MIT is a wealthy, independent, private university. So who paid for Steele and the rest to gift the world with the profound work of art that is Scheme? AI Memo 349 reports that the work was funded in part by an Office of Naval Research contract. The US Navy wanted "An Interpreter for Extended Lambda Calculus"? Not exactly. In 1975 "AI" meant something rather different and grander than it does today. And it was largely a government-funded exercise. This Google talk gives a compelling sketch of the way that The Valley is directly a product of the military-industrial complex, that is interventionist government funding. Still today, the military are a huge funder of research, and buyer of software development effort and hardware to run the results upon, which (rather indirectly) pushes vast wodges of public cash into the hands of technology firms. Or even directly: Bell Labs, for instance, received direct public funding in the form of US government contracts.

In the UK, at least, the government agencies that pay for academic research (of all kinds) are beginning to wonder, in quite a serious, budget-slashing kind of way, if they're getting value for money. So, naturally, the research community is doing some research to find out. One reason that this is of interest to me is that my boss (or, my boss's boss anyway) did some of this research. Sorry that those papers are behind a pay gate. I happen to be a member of the ACM, so I've read this one and one of the things it says is that as of 2000 the leading SCM tools generated revenues of nearly a billion dollars for their vendors. And where did the ideas in those valuable products come from? They came, largely, from research projects. What the Impact Project is seeking to do is to identify how ideas from research feed forward into industrial practice, and they are doing this by tracing features of current systems back to their sources. Let's take SCM.

The observation is made that the diff [postscript] algorithm (upon which, well, diffing and merging rely) is a product of the research community. From 1976. With subsequent significant advances made (and published in research papers) in 1984, '91, '95 and '96. Other research ideas (such as using SCMs to enforce development processes) didn't make a significant impact in industry.

Part of the goal of Impact is to:

educate the software engineering research community, the software practitioner community, other scientific communities, and both private and public funding sources about the return on investment from past software engineering research [and] project key future opportunities and directions for software engineering research and practice [and so] help to identify the research modalities that were relatively more successful than others.

In other words, find out how to do more of the stuff that's more likely to turn out to be valuable. The bad news is that it seems to be hard to tell what those are going to be.

I focus a little on the SCM research because that original blog post that got me going claims that

most creative programming these days comes from free-software programmers working in their spare time. For example, the most interesting new software projects I know of are revision control systems, such as darcs, Monotone, Mercurial, etc. As far as Washington knows, this problem doesn't even exist. And yet the field has developed wonderfully.

I would be very astonished to find that a contemporary (I write in early 2008–yes, 2008 and I still don't have a flying car!) update of that SCM study would conclude that the distributed version control systems were invented out of thin air in entirely independent acts of creation. They do mention that the SCM vendors, when asked, tended to claim this of their products).

The creation of Mercurial was a response to the commercialization of BitKeeper, and BitKeeper would seem to have been inspired by/based on TeamWare. Those seem to have all been development efforts hosted inside corporations, which is cool. I'd be interested to learn that McVoy at no time read any papers or met any researchers that talked about anything that resembled some sort of version control that was kind-of spread around the place. The Mercurial and and bazaar sites both cite this fascinating paper [pdf] which cites this posting. Which tells us that McVoy's approach to DVCS grew out work at Sun (TeamWare) done to keep multiple SCCS repositories in sync. Something that surely more people that McVoy wanted to do. SCCS was developed at Bell Labs (and written up in this paper [pdf], in IEEE Transactions in 1975)

One of the learnings from Impact is that what look like novel ideas from a distance in general turn out, upon closer inspection, to have emerged from a general cloud of research ideas that were knocking around at the time. The techniques used in the Impact studies have developed, and this phenomenon is much more clearly captured in the later papers. So what does that tell us?

Well, it tells us that its terribly hard to know where ideas came from, once you have them. And that makes it terribly hard to guess well what ideas are going to grow out of whatever's going on now. So perhaps there isn't a better way than to generate lots of solutions, throw them around the place and see waht few of them stick to a problem. Which is going to be expensive and inefficient–upsetting for free-marketeers, but then perhaps research should be about what we need, not merely what we want (which the market can provide just fine, however disastrous that often turns out to be). Anyway, once I heard somewhere that you can't always get what you want.

Back to that original, provocative, blog posting. It's claimed that, as far as the problems that the recent crop of DVCS systems address that "As far as Washington knows, this problem doesn't even exist." Applying a couple of levels of synecdoche and treating "Washington" as "the global apparatus of directly and indirectly publicly-funded research", the it would perhaps be better to say that "Washington thinks that it already payed for this problem to be solved decades ago". Washington might be mistaken to think that, but it's a rather different message.

Rubbing our noses in it this time

How hard can it be? What I want (and I know I'm not alone in this) is a 12" (ok, these days 13" if you must) MacBook Pro. With an optical drive built in. And user upgradeable RAM and hard-drive. And battery. With a pukka graphics card.

Such a machine would already be as thin and as light as I care about, and as capable as I want. But I can't have one, apparently. I could have this new "air" frippery–and doesn't it photograph well? Don't you love they way that in that 3/4 overhead shot you can't see the bulge underneath and it looks even thinner that it actually is? But really, a laptop with a off-board optical drive? Well, maybe that's the future, what with renting movies on iTunes and what have you, but...

Apple (ie, Jonathan Ive) have gotten really good at ~~recycling~~ a ~~forty year old~~ design language for products that pundits often hate and (partly because) the public loves, but this gadget seems a little too far ahead of its time. Ah well.

The Learning Curve

I'm going to let the dust settle on my recent posting on patterns, and then do a follow up—some interesting stuff has come out of it. For that, I think, I need a bit of supporting material some of which is here.

Recently this little gem resurfaced on reddit, which prompted a certain line of investigation. Normally I spread a lot of links around my posts[*] partly as aides memoir for myself, partly because my preferred influencing style is association, and partly because I believe that links are content. But I really want you to go and look at this cartoon, right now. Off you go.

Back again? Good. This "learning curve" metaphor is an interesting one. Before getting into this IT lark I was in training to be a physicist so curves on charts have are strongly suggestive to me. I want them to represent a relationship between quantities that reveals something about an underlying process. I want the derivatives at points and areas under segments to mean something. What might these meanings be in the case of the learning curve?

Most every-day uses of the phrase "learning curve" appeal to a notion of how hard it is to acquire some knowledge or skill. We speak of something difficult having a "steep" learning curve, or a "high" learning curve, or a "long" one. We find the cartoon is funny (to the extent that we do—and you might find that it's in the only-funny-once category) because our experience of learning vi was indeed a bit like running into a brick wall. Learning emacs did indeed feel a little bit like going round in circles, it did indeed seem as if learning Visual Studio was relatively easy but ultimately fruitless.

But where did this idea of a learning curve come from? A little bit of digging reveals that there's only one (family of) learning curve(s), and what it/they represents is the relationship between how well one can perform a task vs how much practice once has had. It is a concept derived from the worlds of the military and manufacturing, so "how well" has a quite specific meaning: it means how consistently. And how consistently we can perform an action is only of interest if we have to perform the action many, many times. Which is what people who work in manufacturing (and, at the time that the original studies were done, the military) do.

And it turns out, in pretty much all the cases that anyone has looked at, that the range of improvement that is possible is huge (thus all learning curves are high), and that the vast majority of the improvement comes from the early minority of repetitions (thus all learning curves are steep). Even at very high repetition counts, tens or hundreds of thousands, further repetitions can produce a marginal improvement in consistency (thus all learning curves are long). This is of great interest to people who plan manufacturing, or other, similar, operations because they can then do a little bit of experimentation to see how many repetitions a worker needs to do to obtain a couple of given levels of consistency. They can then fit a power-law curve trough that data and predict how many repetitions will be needed to obtain another, higher, required level of consistency.

Actual learning curves seem usually to be represented as showing some measure of error, or variation, starting at some high value and then dropping, very quickly at first, as the number of repetitions increases.

Which is great is large numbers of uniform repetitions is how you add value.

But, if we as programmers believe in automating all repetition, what then for the learning curve?

[*] Note: for those of you who don't like this trait, recall that you don't have to follow the links.

The Problem with Problems with Patterns

There seems to be a new venom in the informal blogsphere-wide movement against patterns. I'm not the only one to notice it.

Commentators are making statements like this:

In certain programming cultures, people consider Design Patterns to be a core set of practices that must be used to build software. It isn’t a case of when you need to solve a problem best addressed by a design pattern, then use the pattern and refer to it by name. It’s a case of always use design patterns. If your solution doesn’t naturally conform to patterns, refactor it to patterns.

What Reg originally said was "Most people consider Design Patterns to be a core set of practices that must be used to build software." Most? Really most? I challenged this remarkable claim, and Reg subsequently re-worded it as you see above. Quite who's in this certain culture I don't know. And apparently neither does Reg:

We could argue whether this applies to most, or most Java, or most BigCo, or most overall but not most who are Agile, or most who have heard the phrase "Design Patterns" but have never read the book, or most who have read the book but cannot name any pattern not included in the book, or...

But it seems off topic. So I have modified the phrase to be neutral.

But if that "certain culture" doesn't exist, then what's the problem? And if it does exist, then why is it so hard to pin down?

There are anecdotes:

When I was [...] TA for an intro programming class [...] I was asked to whip up a kind of web browser "shell" in Java. [...] Now, the first language that I learned was Smalltalk, which has historically been relatively design-pattern-free due to blocks and whatnot, and I had only learned Java as an afterthought while in college, so I coded in a slightly unorthodox way, making use of anonymous inner classes (i.e., shitty lambdas), reflection, and the like. I ended up with an extremely loosely coupled design that was extremely easy to extend; just unorthodox.

When I gave it to the prof, his first reaction, on reading the code, was...utter bafflement. He and another TA actually went over what I wrote in an attempt to find and catalog the GoF patterns that I'd used when coding the application. Their conclusion after a fairly thorough review was that my main pattern was, "Code well."

So, maybe there are some teachers of programming (an emphatically different breed from industry practitioners) who over-emphasize patterns. That I can believe. And it even makes sense, because patterns are a tool for knowledge management, and academics are in the business of knowing stuff.

It's where Patterns came from

But what's this? "Smalltalk [...] has historically been relatively design-pattern-free due to blocks and whatnot" That's simply not true. In fact, the idea of using Alexander's pattern concept to capture recurrent software design ideas comes from a group that has a large intersection with the Smalltalk community. There's a Smalltalk-originated pattern called MVC, you may have heard of it being applied here and there. Smalltalk is pattern-free? Where would such confusion come from? I think that the mention of blocks is the key.

That Smalltalk has blocks (and the careful choice of the objects in the image) gives it a certain property, that application programmers can create their own control-flow structures. Smalltalk doesn't have if and while and what have you baked in. Lisps have this same property (although they do tend to have cond and similar baked in) through the mechanism of macros.

Now, this is meant to mean that patterns disappear in such languages because design patterns are understood to be missing language features, and in Lisp and Smalltalk you can add those features in. So there shouldn't be any missing. The ur-text for this sort of thinking is Norvig's Design Patterns in Dynamic Programming.

The trouble is that he only shows that 16 of the 23 GoF patterns are "invisible or simpler" in Lisp because of various (default) Lisp language features. This has somehow been inflated in people's minds to be the equivalent of "there are no patterns in Lisp programs", where "Lisp" is understood these days as a placeholder for "my favourite dynamic language that's not nearly as flexible as Lisp".

But there are

And yet, there are patterns in Lisp. Dick Gabriel explains just how bizarre the world would have to be otherwise:

If there are no such things as patterns in the Lisp world, then there are no statements a master programmer can give as advice for junior programmers, and hence there is no difference at all between an beginning programmer and a master programmer in the Lisp world.

Because that's what patterns are about, they are stories about recurrent judgments, as made by the masters in some domain.

Just in passing, here's a little gem from Lisp. Macros are great, but Common Lisp macros have this property called variable capture, which can cause problems. There are a bunch of things you can do to work around this, On Lisp gives 5. Wait, what? Surely this isn't a bunch of recurrent solutions to a problem in a context? No, that couldn't be. As it happens, it is possible to make it so that macros don't have this problem at all. What you get then is Scheme's hygienic macros, which the Common Lisp community doesn't seem too keen to adopt. Instead, they define variable capture to be a feature.

Now, it's tempting to take these observations about the Lisp world and conclude that the whole "no patterns (are bunk because) in my language" thread is arrant nonsense and proceed on our way. But I think that this would be to make the mistake that the anti-patterns folk are making, which is to treat patterns as a primarily technical and not social phenomenon.

Actually, a little bit too social for my taste. I've been to a EuroPLoP and work-shopped (organizational) patterns of my own co-discovery, and it'll be a good while before I go to another one–rather too much of this sort of thing for my liking. But that's my problem, it seems to work for the regulars and more power to them.

Contempt

Look again at Reg's non-definition of the "certain programming cultures":

[...] most, or most Java, or most BigCo, or most overall but not most who are Agile, or most who have heard the phrase "Design Patterns" but have never read the book, or most who have read the book but cannot name any pattern not included in the book,[...]

doesn't it seem a little bit as if these pattern-deluded folks are mainly "programmers I don't respect"?

This is made pretty explicit in another high-profile pattern basher's writing

The other seminal industry book in software design was Design Patterns, which left a mark the width of a two-by-four on the faces of every programmer in the world, assuming the world contains only Java and C++ programmers, which they often do. [...] The only people who routinely get excited about Design Patterns are programmers, and only programmers who use certain languages. Perl programmers were, by and large, not very impressed with Design Patterns. However, Java programmers misattributed this; they concluded that Perl programmers must be slovenly [characters in a strained metaphor].

Just in case you're in any doubt, he adds

I'll bet that by now you're just as glad as I am that we're not talking to Java programmers right now! Now that I've demonstrated one way (of many) in which they're utterly irrational, it should be pretty clear that their response isn't likely to be a rational one.

No, of course not.

A guy who works for Google once asked me why I thought it should be that, of all the Googlers who blog, Steve has the highest profile. It's probably got something to do with him being the only one who regularly calls a large segment of the industry idiots to their face.

And now don't we get down to it? If you use patterns in your work, then you must be an idiot. Not only are you working on one of those weak languages, but you don't even know enough to know that GoF is only a bunch of work-arounds for C++'s failings, right? I think Dick Gabriel's comment goes to the heart of it. If patterns are the master programmers advice to the beginning programmer, codified, then to use patterns is to admit to being a beginner. And who'd want to do that?

"Opportunities for Improvement"

A World Without Getters

The World We Actually Live In

Irresponsible and Out of Control

Classy Behaviour

Old Skool

Putting Your Head in a Bag Doesn't Make you Hidden

What was the point of all that, again?

Postscript

Turning the Tables

Suck it and See

It's where Patterns came from

Contempt

My Links

Blogs of team members past

Other current or ex-Zuhlke Bloggers

Their Headlines

Disclaimer

Blog Archive