Bayesian Testing?

Introduction

I'm tossing this idea out into the world. It's half-formed and I'm learning as I go along. It may be invalid, it may be old news, it may not. What I'm hoping for is that someone who knows more about at least one of testing and Bayesian inference than I do will come and set me straight.

UPDATE: Laurent Bossavit turned out to be that person. The results below have been adjusted significantly as a result of a very illuminating conversation with him. Whatever virtue these results now have is due to him (and the defects remain my responsibility). Laurent, many thanks.

In addition, a bunch of folks kindly came along to an open space session at XP Day London this year. Here is the commentary of one. The idea became better formed as a result, and this article reflects that improvement. Thanks, all. If you want to skip the motivation and cut to the chase, go here.


Evidence

You may have read that absence of evidence is not evidence of absence. Of course, this is exactly wrong. I've just looked, and there is no evidence to be found that the room in which I am sitting (nor the room in which you are, I'll bet: look around you right now) contains an elephant. I consider this strong evidence that there is no elephant in the room. Not proof, and in some ways not the best reason for inferring that there is no elephant, but certainly evidence that there is none. This seems to be different from the form of bad logic that Sagan is actually criticising, in which the absence of evidence that there isn't an elephant in the room would be considered crackpot-style evidence that there was an elephant in the room.
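That intuition is just Bayes' rule at work: if an elephant would very probably be seen were one present, then failing to see one must lower the probability that one is there. A minimal sketch, with entirely made-up numbers:

```python
# Bayes' rule: P(elephant | no sighting) is proportional to
# P(no sighting | elephant) * P(elephant). All numbers are invented.
p_elephant = 0.01            # prior: an elephant is in the room
p_see_given_elephant = 0.99  # elephants are hard to miss
p_see_given_none = 0.0       # we don't hallucinate elephants

p_miss_given_elephant = 1 - p_see_given_elephant
p_miss_given_none = 1 - p_see_given_none

posterior = (p_miss_given_elephant * p_elephant) / (
    p_miss_given_elephant * p_elephant + p_miss_given_none * (1 - p_elephant)
)
print(posterior)  # about 0.0001, far below the prior of 0.01
```

However small the prior, the no-sighting evidence drives it lower still; only if looking told us nothing either way would the prior be left untouched.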

You may also have read (on page 7 of that pdf) that program testing can be used to show the presence of bugs, but never to show their absence! I wonder. In the general case this certainly seems to be so, but I'm going to claim that working programmers don't often address the general case.

Dijkstra's argument is that, even in the simple example of a multiplication instruction, we do not have the resources available to exhaustively test the implementation but we still demand that it should correctly multiply any two numbers within the range of the representation. Dijkstra says that we can't afford to take even a representative sample (whatever that might look like) of all the possible multiplications that our multiplier might be asked to do. And that seems plausible, too. Consider how many distinct values a numerical variable in your favourite language can take, and then square it. That's how many cases you expect the multiplication operation in your language to deal with, and deal with correctly. As an aside: do you expect it to work correctly? If so, why do you?
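To put a number on that, assume (purely for illustration) a 32-bit integer type:

```python
# Count the input pairs an exhaustive test of multiplication would need,
# assuming a 32-bit representation (an assumption for illustration only).
values = 2 ** 32     # distinct values one operand can take
pairs = values ** 2  # distinct (a, b) pairs to multiply
print(pairs)         # 18446744073709551616, about 1.8 * 10**19
```

At one test per nanosecond that is still centuries of testing, which is Dijkstra's point.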


A Small Example of Confidence

Let's say that we wish to write some code to recognise if a stone played in a game of Go is in atari or not (this is my favourite example, for the moment). The problem is simple to state: a stone with two or more "liberties" is not in atari, a stone with one liberty is in atari. A stone can have 1 or more liberties. In a real game situation it can be some work to calculate how many liberties a stone has, but the condition for atari is that simple.

A single stone can have only 1, 2, 3 or 4 liberties and those are the cases I will address here. I write some code to implement this function and I'll say that I'm fairly confident I've got it right (after all, it's only an if), but not greatly so. Laurent proposed a different question to ask from the one I was asking before—a better question, and he helped me find and understand a better answer.
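For concreteness, here is one way the function might look. This is a hypothetical Python rendering, since the original code isn't shown:

```python
# Hypothetical implementation of the atari predicate: a single stone with
# exactly one liberty is in atari. Counting liberties is assumed to be
# done elsewhere; a stone on the board always has at least one.
def in_atari(liberties: int) -> bool:
    if liberties < 1:
        raise ValueError("a stone on the board has at least one liberty")
    return liberties == 1

print([in_atari(n) for n in (1, 2, 3, 4)])  # [True, False, False, False]
```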

The prior probability of correctness that question leads to is 1 ⁄ 16. This is because there are 16 possible functions from {1, 2, 3, 4} to {T, F} and only one of them is the correct function. Thus, the prior is the prior probability that my function behaves identically to some other function that is correct by definition.

How might a test result influence that probability of correctness? There is a spreadsheet which shows a scheme for doing that using what very little I understand of Bayesian inference, slightly less naïvely applied than before.

Cells in the spreadsheet are colour–coded to give a guide as to how the various values are used in the Bayesian formula. The key, as discussed in the XP Day session, is how to count cases to find the conditional probabilities of seeing the evidence.

The test would look something like this:
One Liberty Means Atari
liberties  atari?
1          true

The posterior probability of correctness is 0.125
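That figure can be checked by brute force. A sketch of the counting argument, enumerating all 16 candidate functions and keeping those consistent with the evidence:

```python
from itertools import product

# All 16 possible functions from {1, 2, 3, 4} to {True, False},
# each represented as a lookup table.
inputs = (1, 2, 3, 4)
candidates = [dict(zip(inputs, outs)) for outs in product((True, False), repeat=4)]
print(len(candidates))     # 16, so the prior is 1/16 = 0.0625

# The passing test "1 liberty -> atari" eliminates every candidate that
# maps 1 to False, leaving 8 of the 16.
survivors = [f for f in candidates if f[1] is True]
print(1 / len(survivors))  # 0.125
```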


Adding More Test Cases

Suppose that I add another case that shows that when there are 2 liberties the code correctly determines that the stone is not in atari.
One Liberty Means Atari
liberties  atari?
1          true
2          false
Using the same counting scheme as in the first case, and using the updated probability from the first case as the prior in the second, the updated probability of correctness with the new evidence increases to 0.25, as this sheet shows.

But suppose that the second test actually showed an incorrect result: 2 liberties and atari true.
One Liberty Means Atari
liberties  atari?
1          true
2          true
Then, as we might expect, the updated probability of correctness falls to 0.0 as shown here. And as the formula works by multiplication of the prior probability by a factor based on the evidence, the updated probability will stay at zero no matter what further evidence is presented—which seems like the right behaviour to me.
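Because the update multiplies the prior by an evidence-based factor, a zero is absorbing. A trivial sketch with made-up factors:

```python
# Once the probability of correctness reaches zero, no amount of further
# favourable evidence can revive it: every Bayesian update multiplies
# the prior by a factor, and anything times zero is zero.
posterior = 0.0
for factor in (2.0, 0.5, 10.0):  # made-up likelihood factors
    posterior *= factor
print(posterior)  # 0.0
```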

This problem is very small, so in fact we can exhaustively test the solution. What happens to the probability of correctness then? Extending test coverage to these cases
One Liberty Means Atari
liberties  atari?
1          true
2          false
3          false
gives an updated probability of 0.5 as shown here.

One more case remains to be added:
One Liberty Means Atari
liberties  atari?
1          true
2          false
3          false
4          false
and the posterior probability of correctness is updated to 1.0 as shown here.
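The whole progression can be replayed by counting candidates: each passing test halves the consistent set, doubling the probability of correctness, until exhaustive coverage leaves exactly one function standing. A sketch:

```python
from itertools import product

# Start with all 16 functions from {1, 2, 3, 4} to {True, False} and
# filter by each test case in turn, recording the probability that the
# implementation is the one correct function.
inputs = (1, 2, 3, 4)
candidates = [dict(zip(inputs, outs)) for outs in product((True, False), repeat=4)]

posteriors = []
for liberties, expected in [(1, True), (2, False), (3, False), (4, False)]:
    candidates = [f for f in candidates if f[liberties] == expected]
    posteriors.append(1 / len(candidates))

print(posteriors)  # [0.125, 0.25, 0.5, 1.0]
```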

That result seems to contradict Dijkstra: exhaustive testing, in a case where we can do that, does show the absence of bugs. He probably knew that.


Next?

My brain is fizzing with all sorts of questions to ask about this approach: I talked here about retrofitted tests; can it help with TDD? Can this approach guide us in choosing good tests to write next? How can the structure of the domain and co-domain of the functions we test guide us to high confidence quickly? Or can't they? Can the current level of confidence be a guide to how much further investment we should make in testing?

Some interesting suggestions are coming in in the comments, many thanks for those.

My next plan I think will be to repeat this exercise for a slightly more complex function.

New article for BCW

If you read this blog then there's likely little new for you in this article for Business Computing World, but it might amuse.

Innovation Games

Next Tuesday there will be a special XtC event at Zuhlke's office in London. Luke Hohmann will be demonstrating his innovation games for Agile teams. Should be good.

Details here.

Places remain at XP Day London 2009

XP Day London is filling up, but places remain. The programme is looking very good. Register here.

Sketches

One of the things I like to do in my free time is to dabble, in the most unschooled fashion imaginable, in music composition. Composing is hard. About as hard (and remarkably similar to) programming. Arnold Schoenberg offers this "advice for self-criticism" to students of composition:
6. MAKE MANY SKETCHES
Join the best sketches to produce others and improve them until the result is satisfactory.

To make sketches is a humble and unpretentious approach toward perfection.

Fundamentals of Musical Composition, Ch XII
I think that this applies equally well to programming.

XP Day London 09: Programme

After a lot of wrangling the almost-but-not-quite final programme for XP Day London is now available. Because of illness and other asynchronous distractions some of the presenters had to change at the last minute, and we still have to nail down one session, but this will be pretty much it.

This year we have a lot of excellent experience reports from a range of practitioners who've been doing exciting new things and some really outstanding keynotes.

Scheduling by value?

David Peterson has started a new blog on Kanban (and snaffled a very tasty URL for it). He presents this discussion of scheduling features into a development team. The case David presents is related to a behaviour I sometimes see with inexperienced teams who've just had someone go and learn Scrum. Come the next planning meeting, this idea pops up that the backlog needs to be ordered by "business value" so that the "most valuable" features can be delivered earliest.

This can easily lead to some very nasty scenes where the Scrum Master demands that the Product Owner produce a "value" for each story—actually write a number on the card. The problem comes to a head when it turns out that the Product Owner not only doesn't know the value of the stories they are putting on the backlog, but also has no way of finding out what the value of a story is. And this isn't because they are stupid, nor incompetent, nor malicious. It's because finding that value is far, far too difficult and time consuming an activity. And there's a good chance that any answer that came out of it would be so well hedged as to be meaningless.

Sometimes the Product Owner does know, or can find out at reasonable cost, a value for a story or feature. Being able to trade a new asset class probably can be valued. Changing a flow to give 10% higher conversion probably can be valued. Improving a model to get 1% higher efficiency in the machines it's used to design can probably be valued. These valuations will be functions of time and various other parameters. If you really had to, you could get a number for them that's valid today (and perhaps only today). David makes the point that even if you do know that number for a feature, scheduling the next one simply on the basis of highest value might not be the smartest move. There are other variables to consider.

There is a case to be made that within the context of a project value isn't the best figure of merit to use anyway, since someone should have made a go/no-go decision at some point that the planned budget and planned value seemed reasonable. That decision should be re-assessed frequently (far too little of this goes on) based on progress to date, and action taken if the actuals have come too far adrift, but in-between those times trying to optimise on value is perhaps not worth it.

Another option is to indeed demand (and obtain) those value numbers and then schedule work primarily on the basis of business value and dispense with effort estimates, so-called "naked planning". This has caused eyebrows to be raised. The underlying claim is that
value varies along an exponential scale while development costs vary along a linear scale. Therefore delivering the most valuable features trumps any consideration of whether or not the most valuable feature is cheap or easy to develop
which, if true of your environment, might give pause for thought. How this interacts with the desire to schedule so as to maximise throughput at the bottleneck is an open question, for me at least.

Service-Oriented Architecture

I'm currently embroiled in the long and fraught process of having telephony and data services installed in a certain location. One supplier steadfastly and consistently refused to respond to my offers to become a paying customer, so I selected another who were very responsive at first, but have become less and less so over time. In fact, it's about two months since I signed and still no service has been provided (although bills have been sent).

Part of my frustration with this is that it's very hard to find out what's going on. The company, a British telecoms provider and let's leave it at that, was once a monolithic monopoly but now has been dissected into multiple different business units, components, we might almost call them, each—I suppose—focussing on its so–called Core Competence (and more on that in a later post). Each of these components has its various workflows that it does and one or more contracts with other components for services it supplies or consumes and the components communicate by passing electronic messages to one another. Sometimes they pass electronic messages to me, complete with the URL of some other component where I have to go and do some action. It's all very slick and automated and orchestrated and, indeed, seems to have a mind of its own.

For instance, the putting-in-wires component received a message telling it to come to my location and do just that. Unfortunately, the agent of the no-I-mean-really-putting-in-wires component to which they delegated implementation of that action was not able to complete it. He sent a message saying so and various exception flows kicked off, requiring a lot of manual intervention, oh yes.

Meanwhile, the arrange-for-telephony component turned out to have a clock running and when a certain (unpublicised) duration had elapsed without it receiving a notification of success from the putting-in-wires component (which was busy with some recovery actions on the no-I-mean-really-putting-in-wires component) it triggered a flow that cancelled my original request to have some services. A notification was received by one (but not all) of the taking-money-off-you components and one of them sent me a message telling me that my order for some services had been cancelled. A good thing, because otherwise I would have been blissfully unaware of the situation. On the other hand I am now angrily aware of the situation.

Now, here's the fun bit: irrespective of which component sends me a message, no agent working for that component can explain to me what the message means, because whatever it means, that meaning belongs to whatever other component sent the earlier message that led to the message I received being sent. And no, they can't put me through to an agent in that component. There is no interoperability layer.

Today I spoke with five agents in three different components. One of them gave me quite the run–around because although I had contacted him through the callback given in the message I'd received from his component I had mis–configured part of my message header leading to my message being dispatched to the wrong agent because I had misunderstood the published specification for that header which he freely admitted was itself a shoddy piece of work with unreasonable and misleading contents but it was still my problem that I'd botched the message send.

Also, I've learned that to get to speak to an agent at all I have to go twice around the loop of failing a handshake because I can't provide a piece of data that the protocol requires but that I won't get until the request succeeds. After two failures in a row a supervisory process notices and I'm failed over to a more generic service through which I can contact an agent, but that service is not exposed on a public URL.

To all of which I say: bring back the mainframe.

Observations on Estimation

Teams following a process like Scrum tend to estimate the "size" of stories as an aid to figuring out a commitment for a sprint. My view is that this is a transitional practice, and that the aim should be to learn how to make stories all roughly the same size so that commitments (also a transitional practice) can be determined by counting.

While all of that is going on teams that want to use a numerical scale to estimate (rather than, say, "t-shirt" sizing) tend to choose a scale, a sequence of licit values from which estimates must be drawn. The various planning tools that demand a numerical field be filled in tend to force this issue.

I've noticed a tendency for "expert" level practitioners to want to use some clever non-linear scale, maybe Fibonacci numbers (1,2,3,5,8,13), maybe a geometric series (1,2,4,8,16) and they will have some sophisticated reason why this or that series is preferred. And I've noticed that a lot of teams aren't comfortable with this. They want to use a linear scale.

It seems to be traumatic enough that the estimates don't have units, or even dimensions. The idea that estimates are dimensionless but also structured can be a double cause of confusion.

Anecdote: a team had been estimating and planning and delivering consistently for a good long time. Their velocity was fairly constant, but drifted over time (fair enough). One day it turned out that their velocity happened to be numerically equal to the number of team members times the number of days to the next planning horizon. Someone noticed this and with a huge sigh of relief the team concluded that these mysterious "units" in which they estimated were actually man-days in disguise. Now they finally understood what they were estimating! And they promptly lost the ability to estimate: their next planning session was all over the place and it took some time for their planning activities to converge again. My inference was that it's actually quite important that estimates are dimensionless.

Anecdote: a User Experience expert at a client had been involved in some research whereby (as a side effect) members of the general public had to create a scale that made sense to them within which to rank the usability of features. These folks were presented with different generic objects and asked to give them a "size", and then to give a corresponding "size" to some other generic objects in order to create a scale that made sense to them, which would then be applied to the merit of the system features that were the actual target of the research. They created linear scales.

[After seeing this he added the observation that this process was in aid of avoiding what often happens with the strongly disagree, disagree, no preference... type of scale which is either polarised or bland results, neither of which is that useful]

That surprised me at first, since I know that the physics of our sensory apparatus are generally non-linear, and memory is non-linear and so forth. But thinking about it some more I realised that our experience tends to seem to be linear, even if the underlying phenomena aren't.

Meanwhile, if one did want to use a particular scale for estimating the size of stories, why not use one of the series of preferred values? They are very well established in engineering and product design and offer interesting error-minimising properties. On the other hand, it might be a real struggle to get a team to decide if a story was a 1.6 or a 3.15.
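For the curious, the preferred values in question are the Renard series: a fixed number of logarithmically even steps per decade, rounded for convenience, which is where oddly precise sizes like 1.6 and 3.15 come from. A sketch:

```python
# Renard preferred-number series: n logarithmically even steps per decade.
# R5 rounds to 1, 1.6, 2.5, 4, 6.3; R10 includes 3.15.
def renard(n, decades=1):
    return [10 ** (i / n) for i in range(n * decades + 1)]

print([round(x, 2) for x in renard(5)])
# [1.0, 1.58, 2.51, 3.98, 6.31, 10.0]
```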

I don't have a grand narrative into which to fit these observations, but here is another related anecdote about estimation.

Real Engineers

There's a recurring fashion for beating up those who would build software for a living with tales of how "real" engineers do it. This started with the woefully misguided "Software Engineering" conferences of the late 1960's and continues on-and-off to this day.

As a response to this I like to collect examples of "real" engineers screwing up. Not out of malice, but out of a desire to ground certain aspects of my professional life in something resembling fact. Here are some reported facts about the Lockheed Martin F-22 "Raptor" fighter aircraft:

  1. the aircraft has recently required more than 30 hours of maintenance for every hour in the skies

  2. the canopy needs refurbishing after 331 hours of flying, less than half the design goal of 800 hours, and this costs $120,000 a pop

  3. the aircraft is almost twenty thousand dollars more expensive to fly per hour than its predecessor (which costs an already eye-watering 30+ grand per hour—what do they run on, Chanel No 5? Single malt whisky?)
And so on. Just to add to the fun, the development cycle for this aircraft was so long that the class of mission for which it was designed no longer takes place.

Is this the fault of the folks at Lockheed Martin? Not really. It's the fault of a system that put politics ahead of engineering, a circumstance under which no-one can be successful.

Change

We aim to keep all our offices ISO 9000 certified. This means updating the procedures to reflect our increasing tendency to run software development projects under an Agile process. It's fallen to me to update our existing "Change Control" procedure which of its kind is very good, but does build up from the assumption that to have a change to control is an exceptional (although expected) event.

I keep wanting to turn that inside out and say—deal with every requirement this way, make this the default, make the threshold of cost/schedule/risk impact for triggering change control be 0.0 in all cases, have the Change Control Board meet every week, have them assess the impact of every item and decide whether or not to schedule it, have no other way to schedule any activity than via the change log. And bingo! Your project would pretty much be agile.

In fact, I might just do that.

De–skilling through Technology: friend and foe

Once, and for a mercifully short time, I lived on the very western edge of Bournemouth. You might recall that Ford considered his article for the Guide describing how to have a good time in Bournemouth to be one of his finest pieces of fiction. So, I was often wanting to go away, farther away than I could sensibly go on my pseudo-vintage motorcycle in what time was available. That meant going from the County Gates to Bournemouth railway station. This could be done on the cramped, slow local bus or at great expense by taxi.

But I noticed that the long–distance coaches coming in from the even further south and west of England made one last stop before Bournemouth Coach Station (adjacent to the railway station) in Westbourne, just around the corner from me. These coaches are large, have luggage space and go fast on major roads. And the on–line booking system did not blink an electronic eye at me buying and downloading to print out a 90 pence ticket to ride from Westbourne to Bournemouth. The drivers of the coaches did blink when I got on there, but I had a ticket so they shrugged and went about their business. That's nice. Imagine the counter–arguments that would likely come from a human ticket agent who knows only too well the various alternatives to doing so preposterous a thing as riding a long–haul coach half–way across the town where you live.

Now, much more recently I was on the very edge of the very agreeable city of Charlotte, North Carolina. I had been down–town to check out the excellent Mint Museum. Being a citizen of the Socialist People's Republic of Europe I naturally tried to do the journey by public transport. Charlotte has a pretty good public transport system, C.A.T.S. However, it turned out that the bus service that would otherwise run from exactly the last stop of the light rail system (the "Lynx") to exactly my hotel was not running that day. Nice try, but I have to do the last few miles by cab. Now, I am a stranger to the city and am in any case in a fairly obscure part of it. I call a cab company and while I can explain with a certain degree of accuracy where I want to go, I don't know where I am.

And that's a problem.

The cabs are guided by GPS, so they need an exact street address for pickup and set down. This is so that the drivers do not need to know their way around. That's a pretty shocking concept for a resident of London, where cab drivers know their way around so well it changes their brains. Troublesome de–skill #1.

All I know is that I'm at the last stop of the Blue Line, but I don't know where that is. And neither does the dispatcher at the cab company—after all, all they do is pass on the co–ordinates of two places they don't know the location of to drivers who don't know the route between them. De–skill #2. Luckily for me, the dispatcher thought that they knew someone who might be able to figure out where this Lynx station was. I should call back in a few minutes. I do so, and by virtue of the elevated situation of the station I can call off enough landmarks (that is, names of shopping malls) for this person at the other end to work out where I am. My hotel, for some reason, they can look up very easily.

Some time later a cab rolls up. And it turns out that all this marvellous de–skilling has been wasted because the guy is a veteran driver and knows the district like the back of his hand.

Also, once I'm seated and strapped in, his first question is: so, where are we going?

Finish-start Dependencies: just say no!

I'm working with a team that have (as do we all) a tendency when under stress to fall back upon what they are comfortable with—in particular, to serialising their activities. "We can't x because they haven't finished y because they're waiting for z..."

To help them remember that this isn't the best move they can make, I've devised this symbol to display in the team area. You are welcome to use it yourself.

Creative Commons License
No Serialised Activities by Keith Braithwaite is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 2.0 UK: England & Wales License.
Based on a work at 1.bp.blogspot.com.

Learning from Architects

From dog–houses to skyscrapers, the discipline of the building architect has long been a rich source of metaphor for system architects. I don't disagree, indeed in one of my contributions to the 97 Things Every Software Architect Should Know I recommended that software architects should learn from architects of buildings. So, you can imagine that my eye was caught by Matthew Frederick's 101 Things I Learned in Architecture School. This small book contains the eponymous count of hints, tips and tricks from a qualified architect to architecture students.

This is a good resource for those of us who would learn from architects of buildings, as it is advice to would–be practitioners from a practitioner and as such relates to what building architects actually do, rather than what someone who isn't a practitioner thinks they must, surely, do. This latter is a common failure mode of software professionals seeking inspiration from other disciplines—all the way back to the very first Software Engineering conferences of the late 1960's, in which an hallucinatory notion of what "engineers" do was foisted upon us. But I digress.

Some of the 101 are very low level and very specific (eg Number 90 "Roll your drawings for transport or storage with the image side facing out"), others are much broader and seem to me to have relevance for any sort of design work.

Number 15 tells us that "A parti is the central idea or concept of a building." Wikipedia tells us that parti is from the French prendre parti "to make a decision". The parti captures, presents and summarises the highest level decision that has been made about the organising principle of an entire building or building project, and examples are given where the parti expresses all that in one, highly abstract, diagram.

A parti sounds to me a lot like a system metaphor.

Number 100 tells us that the parti should have a name, such as "half–eaten donut" or "meeting of strangers". Could the nominal (or de facto [*]) architect of the system that you are working on draw such a diagram? Would it tell anyone anything if they did? Could they name the parti, the very highest level design decision from which the rest of the system design flows?

Number 28 tells us that a good designer isn't afraid to throw away a good idea. Notice: a good idea. An idea can be good and not fit with the parti, in which case it has no place in the design. We are advised to "save [...] good but ill–fitting ideas for another time and project—and with the knowledge that they might not work then, either." When was the last time you (or your team) threw out a good idea?

Number 46 tells us to "Create architectural richness through informed simplicity or an interaction of simples rather than through unnecessarily busy agglomerations". Frederick warns particularly against "busying up a project with doodads because it is boring without them; agglomerating many unrelated elements without concern for their unity because they are interesting in themselves." What interesting doodads does your current project have?

One reason to follow the guidance of Number 46 is given in Number 51, which observes that "Beauty is due more to the harmonious relationships among the elements of a composition than to the elements themselves". One achieves this beauty through a design process, or system.

Number 77 cautions that "No design system is or should be perfect. Designers are often hampered by a well–intentioned but erroneous belief that a good design solution is perfectly systematic [...] but nonconforming oddities can be enriching, humanising aspects of your project." 77 also observes that "exceptions to the rule are often more interesting than the rules themselves."

Number 81 notes that "Properly gaining control of the design process tends to feel like one is losing control of the design process". 81 advises that the designer should "accept uncertainty. Recognise as normal the feeling of lostness that attends much of the process. Don't seek to relieve your anxiety by marrying yourself prematurely to a design solution; design divorces are never pretty". No, they never are.

Number 99 can help. It says "Just do something. [...] don't wait for clarity to arrive before beginning to draw. Drawing is not simply a way of depicting a design solution; it is itself a way of learning about the problem you are trying to solve." I think that much the same can be said for coding. Are the design procedures in your team aligned with these principles?

[*] Even the most self–organised, most cross–functional, most Agile, most collectively–code–owning software development team will have one individual who knows most about (and likely has most influence over) the architecture of the system. You can pretend otherwise, or you can take advantage of it.

Plaudit

Apparently, this is the 141st most "top" blog for developers (as of Q2 2009). That's probably pretty far out along some long tail of topness. Ah well.

The metric used captures something like degree of interest of the rest of the blogosphere. So, thanks for your interest.

Let's talk about feelings

I seem to recall that back in the days when he was prone to wild outbursts of public self-examination (this even before blogging and twitter) John Cleese gave an interview in part about his early experiences with therapy, with a "talking" cure. His therapist would begin, "How do you feel?" and John would say "Well, I think..." and his therapist would interrupt "No. How do you feel?" and John would say "Well, I think..." and his therapist would interrupt "No. How do you feel?" And so on. You can see the problem. And he couldn't, I suppose. Which was the problem.

More and more these days I want to ask people how they feel about their code. Here's part of why.

Ivan Moore and "the other" Mike Hill have this conference session that they do called Programming in the Small [pdf]. I love it. They put little examples of code in front of folks and invite them to refactor, just a little bit. And then reflect on the refactoring. One of the things they've noticed that programmers tend to do under these conditions is futz around with the code before getting down to actually improving the design.

Mike and Steve Freeman do a similar session called "Programming without Getters". This is in the so-called "dojo" format, where a revolving pair of programmers is invited to take some fairly typical "enterprise" class code and refactor it to remove the getters. There are those of us who believe that there can be something quite special about OO code with that property, but the session didn't really get there. That's because the pairs couldn't bring themselves to do the refactoring as asked: they couldn't even start to remove any getters until they'd futzed around with the code first. And since a new person rolls into the pair every five minutes or so, it's pretty much a non-stop futz-fest.

Linda Rising saw a talk I give about this design metric that I've been playing with over the last couple of years (I have a day job, so progress has been slow). She was struck by my observation that folks who've tried this metric on their own code have reported that refactoring which made them happier with the code also increased the value of the metric. Linda wanted to know if they really said "happier". They really do.

It turns out that Linda did some research back in the day on a design metric of her own. During this work she had noticed that in general, programmers like to futz around with code before they get down to work on it. If I recall correctly, she asked Dick Gabriel about this and he said that "programmers do that", along with some allusion to that metaphorical aphorism about the flavour of soup and pissing in it. I'm sure a lot of that goes on. But what Linda (again, this is as well as I recall) further noticed was that they tended to do this much less with code that scored well on her metric. And they described themselves as being more comfortable working on this code than on code that had a low score.

Now, I've nothing against metrics in principle (after all, I seem to be in the middle of inventing one) but I'm rather dubious about all this dashboardery and piechartism that's going on these days, all this getting of the CI server to dish up trends of metrics and blah, blah, blah. It's nice to have the numbers, it's nice to see a healthy trend, but...

How do you feel about your code? Does it make you happy? Is it comfortable?


XP Day London 09: Call for Sessions

Submissions are now open for programmed sessions at XpDay London 2009, to be held 7th and 8th December 2009. http://www.xpday.org/

You are invited to propose a session for the first day of the conference. We are particularly interested in the following
  • Experience reports—share your stories of challenge and success with Agile and Lean techniques. Experience reports will be intensively shepherded by experienced practitioners.
  • Hands-on technical sessions—share techniques and practices in practical sessions: workshops, tutorials, simulations
  • Practitioners' advances in the art—share the techniques of expert Agile and Lean practitioners, work with them to move the craft forward.
The second day of the conference will be an OpenSpace session with topics selected at the end of the first day. Programmed sessions are most suitable for topics requiring some set up or extensive preparation.

To submit a session, please go to http://xpday-london.editme.com/XpDay2009Submissions

Submissions will be accepted until Friday 14th August.

Flow: are two dimensions enough?

Inspired by a comment by Joseph at the recent Agile Coaches Gathering I've been experimenting with the use of this model in mid-year reviews for my team.

One of the guys observed that it seems to have a missing dimension. It's possible to have a case where an individual is working on a solidly challenging problem, well up the y-axis, and they have the high level of skill to meet that challenge, well along the x-axis...and when all's said and done they'd really rather be doing something a bit more enjoyable.

Missing axis: fun.

Bridging the Communication Gap

Call it automated acceptance testing, functional test driven development, checked examples or what you will—the use of automatic validation is one of the most effective tools in the Agile developer's kit.

It's a large and involved field, touching on almost every aspect of development. Handy, then, that Gojko Adzic has published a very comprehensive guide to the use of automated acceptance tests in contemporary Agile development practice, Bridging the Communication Gap: Specification by example and agile acceptance testing.

This book is a much-needed checkpoint in the on-going adventure to discover (and re-discover) how to write software effectively. Gojko is a very energetic enthusiast for these ideas, and a very experienced practitioner of them. His knowledge and expertise are evident on every page.

A strong theme runs through the book—that the reason we capture examples of required behaviour and automate validation of them is to improve communication. Examples turn out to be a very powerful way to understand a problem domain and to explore a solution suitable for that domain. There turn out to be fascinating reasons for why this is true, but Gojko quite reasonably focusses on practical advice.

The main body of the book tells a story, a story of understanding, finding, and using examples to create shared understanding across a team. Gojko gives very concrete advice in a series of short chapters and explains how to do this. How to organise a workshop to find examples, how to find good examples, how to use tools to automate validation, how to use the resulting tests to guide development. Each chapter ends with a handy bullet list of key points. Together with other material on the best use for developers to make of such checked examples, and how to fit example discovery and capture into a typical Agile development process, Bridging the Communication Gap provides as close to a vade mecum for newcomers to the discipline of functional test driven development as we are likely to see.

Gojko draws informative parallels with other techniques more or less strongly aligned with the Agile development world. This places the practice of Agile acceptance testing in context, and as a team-wide activity, reinforcing the cross-functional nature of the tool. Always the emphasis is on helping the various stakeholders in a development project communicate better.

There is a survey of tools available for this kind of work, which I might wish were slightly broader in scope and a little more detailed, but it does give a good overview of the market leaders. "Market leaders" in the weakest sense, since it turns out that the best tools for this kind of work are all FOSS: big-ticket corporate testing tools really aren't in this game.

Various points regarding writing and using tests are illustrated with (of course) illuminating examples. Also described are limitations of these techniques and some pitfalls to watch out for, something that more promoters of development techniques should provide.

The book is self-published and my copy was printed by Lightning Source. Books produced this way are getting better all the time, but are still not presented at the level of quality one would expect from a commercial publishing house. The pages seem very full, and the choice of font makes the text a very dark colour that I don't find easy to read. The section and sub-section headings are sometimes over-long and are not laid out well, a combination that I found made the book less easy to navigate than it might have been.

I will be using this book with clients and recommending it to them for future reference. A boon to the community.

The Quality of non-declining Velocity

I'm supposing that anyone reading this blog will be familiar with the stuff-left-to-do vs time chart used by many Agile teams to track the progress and (with due care and attention) predict the outcome of a development episode.

In his Zurich Lean/Scrum/Agile show keynote Ken Schwaber presented an interesting slant on this chart (apparently he got this from Ron Jeffries). The argument is that if you have poor internal quality—in particular if your "definition of done" is not strong enough and does not require you to keep your code in tip-top condition—then there will be a hidden accumulation of stuff-left-to-do. So your chart is in effect more like this:
This is interestingly different from the technique that some use to show scope added during an episode, with a stepped baseline. Here, additional work is accumulating, invisibly, inside your code. Work that you will have to do at some point to get a releasable increment. This has the effect of making the true burn-down line shallower than the chart suggests, meaning that there will be either unanticipated under-delivery of scope or (worse yet) a need to slip delivery.
If the causes of poor internal quality are not rectified, then this effect will repeat, and over successive episodes the team in question will get slower and slower (or deliver less and less).
Talking about this with Karl Scotland and Joseph Pelrine in the bar afterwards we tossed around the idea that this shows internal quality (traditionally a hard thing to measure) to be something like the first derivative of project velocity with respect to time.

And now to stretch the metaphor to breaking point. For that to make sense quantitatively it seems as if what we're really saying is that if the internal quality Q is less than some threshold Qcv (the quality of non-decreasing velocity) then the velocity V will decrease over time:
∂V/∂t ∝ Q - Qcv
Well, it seems reasonable that this will hold for low quality, when Q - Qcv is negative. But what about when Q - Qcv is positive? Is it possible to take a team that is writing code at the level of quality required for non-decreasing velocity—that is, not accumulating hidden extra work—and then increase velocity by increasing quality?

I think it is.

I think that a team can push really hard on internal quality and have it turn out that there is less work to do than they thought to get finished.

And maybe that's obvious—and maybe it isn't—but certainly I now feel as if I have a much better handle on how to explain to someone why (as the Software Craftsmanship folks say) the only way to go fast is to go well.
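To make that a little more concrete, here's a toy numerical sketch of that relation. The quality scale, the threshold and the rate constant are all invented for illustration; this is not measured from any real team.

```python
# Toy Euler integration of dV/dt = k * (Q - Qcv): velocity drifts
# downward when internal quality Q sits below the threshold Qcv
# (the quality of non-declining velocity), and upward when above it.
def velocity_over_time(v0, quality, q_cv, k=0.1, sprints=10):
    v, history = v0, [v0]
    for _ in range(sprints):
        v = max(v + k * (quality - q_cv), 0.0)  # velocity can't go negative
        history.append(v)
    return history

low_quality = velocity_over_time(v0=20.0, quality=3.0, q_cv=5.0)
high_quality = velocity_over_time(v0=20.0, quality=7.0, q_cv=5.0)
# Below threshold the team slows sprint on sprint; above it they speed up.
assert low_quality[-1] < 20.0 < high_quality[-1]
```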

The Vendor/Client Relationship in Real Life

What would happen[youtube] if clients tried to deploy the kinds of arguments they use with "professional services" suppliers in other situations...

Epic user interface fail of Homeric proportions

A little while ago I was riding the now sadly degenerate East Coast Main Line service in a Mk IV coach and noticed a bit of a kerfuffle in the vestibule (I love it that British railway carriages are still referred to as having "vestibules"). Paying attention I discovered that an elderly lady had required assistance from the train staff with the door of the toilet. This seemed a little odd, so after a suitably discreet interval I went to investigate. One has to make one's own entertainment on the train.

What I found was a flabbergasting cascade of fail.

This advice is welcome:
DSC00106
Judging by the design and finish, this is the original signage.

But how to close and lock the door?
DSC00104
Ok, slightly non-obvious. Also, lacking some rather crucial information, it will turn out.

Seems as if more people than the lady I saw have had problems with the door, since there was this later, auxiliary sign:
DSC00107
Apologies for the poor quality. The text at the bottom reads "If the 'lock' button is not illuminated, the toilet door is NOT locked". It might very well be closed, you see, but not locked. The original signage has braille attached, not so this vital little nugget of information (as I recall).

It seems that not even this prompt has quite been doing the job, as this third sign had also been added:
DSC00105
The visually impaired (and those on urgent business) are now in serious trouble.

Being of an enquiring mind, and finding the complexity of this control system too hard to believe, I did a few experiments and made an important discovery not covered by any of the above instructions but of no little importance, I think.

If you press, or, let us say, accidentally nudge, the 'lock' button while the door is closed and locked, the door (which is powered in the interest of the mobility impaired, a good thing in itself) both unlocks and opens.

Now, how hard could this be? What am I missing from this design?

Still two buttons, their respective behaviour being:
  1. If the door is open, close and lock it. If the door is closed and locked, no action.
  2. If the door is closed and locked, open it. If the door is open, no action.
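For what it's worth, that proposed behaviour is simple enough to sketch as a two-state machine. This is my own illustrative Python, not anything from a real train control system.

```python
# Two buttons, two states: a door that cannot be unlocked by an
# accidental nudge of the 'close and lock' button.
class PoweredDoor:
    def __init__(self):
        self.state = "open"          # or "closed_and_locked"

    def press_close_and_lock(self):  # button 1
        if self.state == "open":
            self.state = "closed_and_locked"
        # already closed and locked: no action

    def press_open(self):            # button 2
        if self.state == "closed_and_locked":
            self.state = "open"
        # already open: no action

door = PoweredDoor()
door.press_close_and_lock()
door.press_close_and_lock()  # the accidental nudge: nothing happens
assert door.state == "closed_and_locked"
door.press_open()            # a deliberate press opens the door
assert door.state == "open"
```

The point of the design is that each button is a no-op in one of the two states, so no single accidental press can expose the occupant.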

I'll be popping over to Switzerland in a couple of weeks, a place where they still take trains (and much else) seriously. And shall on the trip from Flughafen Zurich to the Hauptbahnhof be paying close attention to the toilet doors.

Second Opinions

Got an opinion piece up in E&T magazine regarding second opinions.

Many thanks to all who responded to my earlier posts and enquiries about that. I'll be posting digests of some stories soon.

Identity as a Process

Returning to a topic I've thought about before: identity is a problem.

I expect most of you to be familiar with the "my grandfather's axe/boat/knife..." problem. In short, how does the identity of a composite object vary (or not) as the identities of the components vary? One proposed solution is the so–called perdurantist approach which hinges on the observation (at once both banal and deeply challenging) that what we think of as objects in the world are really structures with extent in three spatial and one temporal dimension (pace Kaluza-Klein type arguments). We don't seem to be very good at that sort of thinking. Note that perdurantism seems still to be talking about fixing the boundaries of a thing in order to identify it. I think that's missing a trick.

This [pdf] (via Michael Feathers) is a treatment of that trick I've not seen before.

In that paper the biological concept of autopoiesis (and the complex of ideas around it) is used to analyse the working of the glider pattern in Conway's Game of Life. I read the paper as telling us that the identity of a glider is the extent of the continuation of the process which at any given time looks to us like a glider.
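That reading can be checked mechanically. Below is a minimal Game of Life step of my own (illustrative, not from the paper): after four generations the same configuration reappears translated one square diagonally, even though the particular live cells have all changed along the way. The "glider" persists as a process, not as a set of cells.

```python
from collections import Counter
from itertools import product

def step(cells):
    """One Game of Life generation over a set of live (row, col) cells."""
    # Count the live neighbours of every cell adjacent to a live cell
    neighbour_counts = Counter(
        (r + dr, c + dc)
        for (r, c) in cells
        for dr, dc in product((-1, 0, 1), repeat=2)
        if (dr, dc) != (0, 0))
    # A cell is live next generation with exactly 3 live neighbours,
    # or with 2 if it is live already (the standard B3/S23 rule)
    return {cell for cell, n in neighbour_counts.items()
            if n == 3 or (n == 2 and cell in cells)}

glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
later = glider
for _ in range(4):
    later = step(later)
# The same shape, one square down and one square right
assert later == {(r + 1, c + 1) for (r, c) in glider}
```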

Now, how to apply this understanding elsewhere...

QCon Panel: A Great Leap Forward or Exposed Artery?

A QCon panel discussion re "Transparency" is now up. It features Kent Beck, Chris Matts, John Nolan and myself, with Steve Freeman in the chair.

The Treppenwitz was pretty strong on this one. What I'm thinking now is that instead of all that ghastly droning on I did, I should have said "the truth shall make you free". The only thing to watch out for is that being free, although far superior to the alternative, might not be particularly comfortable.

But it is still far superior.

Software Craftsmanship 2010

Apparently there will be one. Excellent! For all my growing distaste for the This Movement and That Manifesto and all that's springing up around the parts of the industry that I see most closely, a gathering that happens to share a name with all that, where people can come together and share an enjoyment of good code and coding, is a fine thing.

Jason says that he's "banning talks, presentations or any other kind of session that doesn't involve real live coding". I appreciate the gesture, but I'm not sure that's quite right. I found Ade's session on mapping personal practices particularly valuable—although it could have benefited from another hour or so. Nonetheless a programmers' conference with programming as its core topic illustrated through programming is a too–rare thing and Jason is to be commended for making one happen.

Both of the sessions that I ran there involved real coding by attendees and even though I wasn't coding myself I learned plenty from the reactions of those who were, such as Gojko's reaction to the TDD session. I've not come across a better forum elsewhere for that kind of discussion on that sort of scale than the SC conference, so having another one seems like a very good thing.

Fallacy as tool

In the comments to this post regarding the level of debate regarding climate change in certain circles Pithlord has this to say:
Most fallacies aren’t really fallacies when you reinterpret them as Bayesian reasons to give an idea more credence rather than iron-clad syllogisms. Without [...] the “ad hominem fallacy” [...] you’d give all your money to Nigerian spammers.
That's a very nice formulation of the idea that the "fallacies" of logic are amongst the tools of rhetoric. This is a notion that has fascinated me ever since I first read Zen and the Art of Motorcycle Maintenance. The rhetorical approach may not demand agreement (in the way that a sound and valid argument would) but it does tend to persuade. And the answers given by fallacious arguments are not necessarily wrong, just uncertain.
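Pithlord's point can be made quantitative with Bayes' rule: who asserts a claim legitimately changes how much the assertion should move you. A toy sketch, with entirely invented numbers:

```python
def posterior(prior, p_assert_if_true, p_assert_if_false):
    """Bayes' rule: probability the claim is true, given that this
    particular source asserted it."""
    numerator = prior * p_assert_if_true
    return numerator / (numerator + (1 - prior) * p_assert_if_false)

# Same claim, same 50/50 prior; only the source differs.
reliable = posterior(0.5, p_assert_if_true=0.9, p_assert_if_false=0.1)
spammer = posterior(0.5, p_assert_if_true=0.1, p_assert_if_false=0.9)
# Discounting the spammer's claim ("ad hominem") is just rational updating
print(round(reliable, 3), round(spammer, 3))  # 0.9 0.1
```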

Many folks in the IT industry seem to want to bludgeon their interlocutors into agreement with something that looks a lot like a proof (mea culpa). As I get older the gentle art of persuasion seems more and more attractive.

Seduced by the drama?

Have you ever watched those shows on the Discovery channel (or similar) where the huge construction project goes a bit wrong? If it were more mainstream programming the story would pretty quickly stop being about the stuff and start being about the people, but since it is intended for 14-year-old boys of every age these programmes don't quite go down that route. Big yellow machines, yum! But they do always have that drama to them. Drama comes from conflict and in these shows the conflict is between what the plan says should happen and the conditions in the world. A smooth and straightforward project would be less than gripping.

There are lots of those, but they wouldn't make great television. I find that many of these shows still don't make especially great television anyway because what actually happens is that, for example, a smooth technocrat (often German, the best sort) arrives, points out a few alternatives, gets everything back on track and off we go to a successful delivery. That's how it mostly is in the grown-up world of proper stuff. Risks materialise as issues (as it is expected they will, from time to time) and are dealt with in a calm and orderly way. There's the occasional stoppage, the odd bout of overtime, the best crane driver in the world has to be lured out of retirement for this one last job or whatever it may be. But there's no food fight. Food fight? Bear with me.

Scott Berkun has an essay out called "Ugly Teams Win", a part of the forthcoming Beautiful Teams. He presents an...interesting model:
[...] when things get tough, it's the ugly teams that win. People from ugly teams expect things to go wrong and show up anyway.
Well. He passes through some interesting observations about the, what shall we say, challenging personalities of several stellar individuals in several fields. You might be very happy to have a Picasso in your house, but to have had Picasso himself would be a different matter. Fine. Of course, a gang of stellar performers is a very different thing from a well-performing team.

However, Berkun's point is well made that the best people to have on your project for the good of the project might not appear to be the best people generally, in all sorts of ways. Building effective teams is a tricky business. He goes further, though:
The only use of beauty applied to teams that makes sense is the Japanese concept of wabi-sabi. Roughly, wabi-sabi means there is a special beauty found in things that have been used.
I'm not sure that it does, although things that have been used often end up wabi-sabi. Here's a description of wabi-sabi as I've come to understand it:
Wabi refers to that which is humble, simple, normal, and healthy, while sabi refers to elegant detachment and the rustic maturity that comes to something as it grows old. It is seen in the quiet loneliness of a garden in which the stones have become covered with moss or an old twig fence that seems to grow naturally from the ground. In the tearoom it is seen in the rusty tea kettle (sabi literally means rusty). The total effect of wabi and sabi is not gloominess or shabbiness, however, but one of peace and tranquillity
In summary, "wabi-sabi refers to the delicate balance between the pleasure we get from things and the pleasure we get from freedom from things".

One interesting aspect of this is that things made from natural materials (wood, stone, leather) may acquire charm with wear, but things made from synthetic materials seem not to. Objects can become wabi-sabi through use, wear, or the simple passage of time and the natural processes that they take part in. Berkun wants that to apply to teams:
In this sense, the ugly teams I described at the beginning of this chapter, the underdog, the misfit, represent the wabi-sabi teams
That seems like a huge leap to me, especially when we find out what he means for a team to be used. Here's how he describes the early days of a project, the "Channels" functionality of IE 4:
The deals we made forced legal contracts into the hands of the development team: the use of data [...] had many restrictions and we had to follow them, despite the fact that few doing the design work had seen them before they were signed. Like the day the Titanic set sail with thousands of defective rivets, our fate was sealed well before the screaming began. Despite months of work, the [...] team failed to deliver. The demos were embarrassing. The answers to basic questions were worse.
"Our fate was sealed well before the screaming began" Oooh-kay. And this is the steady-state:
somewhere in our fourth reorg, under our third general manager and with our fifth project manager for Channels, the gallows humor began. It is here that the seeds of team wabi-sabi are sown. Pushed so far beyond what any of us expected, our sense of humor shifted into black-death Beckett mode. It began when we were facing yet another ridiculous, idiotic, self-destructive decision where all options were comically bad. "Feel the love," someone would say.
At least one of Berkun and I have completely and utterly failed to understand what wabi-sabi means.

But that's the least of the issues I have with this. It looks to me as if this team is not being used in the way that an old shoe was used (and so gained its comfort, charm, personality and identity). This team is being abused. Here's how they ended up:
Late in the project, I became the sixth, and last, program manager for Channels. My job was to get something out quickly for the final beta release, and do what damage control I could before it went out the door in the final release. When we pulled it off and found a mostly positive response from the world, we had the craziest ship party I'd ever seen. It wasn't the champagne, or the venue, or even how many people showed up. It was how little of the many tables of food was eaten: in just a few minutes, most of it had been lovingly thrown at teammates and managers.
And there's the food-fight. Don't be distracted by the hysterical high-jinks (as bad a sign as they are). Note that they've had six programme managers by this point. The end-point of the project is described in terms of damage control. Does that sound like the wabi-sabi of the moss-covered stone lantern in the quiet garden? Does it in fact sound like an in any way attractive or desirable outcome? Berkun certainly seems to want to call this a success. He says "The few who remained to work on Internet Explorer 5.0 had a special bond". No doubt they did, no doubt they did.

Here's where this story starts to turn my stomach a little. You see, after going through this dreadful experience "the few that remained" went on to make "Internet Explorer 5.0 [...] the best project team I'd ever work on, and one of the best software releases in Microsoft's history" Maybe so. But at what cost? The majority that didn't remain (it's hard not to imagine them being considered washouts, dropouts, failures), how did they feel about being placed in this outrageous position on IE4? And what are we to infer about teams toughing it out through pre-doomed projects?

One of the few cogent points to have emerged from the recent spasm of interest in so-called Software Craftsmanship is the idea that a "craftsman" has a line that they will not cross, things that they will not do. I tend to agree. I think that the industry would be in globally better shape if more people were prepared to locally say "no" to destructive madness of, well, of exactly the kind reported here for IE4 Channels. I don't consider the members of that team heroic for having made it through and bonded and all that stuff. Well, certainly not "the few". I'm inferring that some folks gave up on this project, walked away from the screaming and got on with something less harmful to themselves. Those are the folks I want to celebrate. And at every scale.

I'm very disturbed that a story such as this one is going to make it into a book about "beautiful teams", even if the point of it is supposed to be the subsequent success of the IE 5 team after their traumatic bonding experience.

This story celebrates failure. And it celebrates a particularly seductive kind of failure, one with which the IT industry is riddled. It celebrates a macho bullshit kind of failure that looks like success to stupid, evil people. It celebrates a kind of failure that too many programmers have come to (secretly) enjoy, and that too many businesses have come to expect that their programmers will (secretly) enjoy and therefore put up with.

In the grown-up world of proper stuff stupid, doomed, destructive projects get cancelled. And that is a successful outcome. We should do more of that.




Life in APL, programming in a live environment

This gorgeous screencast of an APL session shows an implementation of Conway's Game of Life being derived in APL.

It's a very delicious demonstration of what can be achieved in 1) a "live" computational environment (rather than a mere language 'n' platform) with 2) a language that works by evaluating expressions (rather than merely marshalling operations) and 3) a system that already knows how to show you what complicated values look like (because it understands its work to be symbolic not operational).

Ponder what the Dyalog folks (no affiliation) are showing you there, ponder just how far towards that same goal you'd get in whatever programming system you use at work in 7 minutes 47 seconds (even if you'd done as much rehearsal as I'm sure they did) and then ponder the state of our industry.

Then have a stiff drink.

Agile Coaches Gathering

I'm going to the Agile Coaches Gathering at (and partly in aid of) Bletchley Park in May. I commend the event to you.

The Lives of Others, the Lives of Ourselves

I've kept to a rule with this blog, that the posts on it should be of broadly technical and mostly professional nature. I'm going to break that rule now.

Some time ago I watched the film Das Leben der Anderen, a bitter-sweet offering from Florian Henckel von Donnersmarck. It's a love story, of several kinds, and a thriller and a little bit of (and a little bit of a response to) an Ostalgie comedy. In one scene Hauptmann Wiesler, a Stasi interrogator and surveillance officer, teaches a class of prospective secret policemen. He puts great emphasis on the importance of the handling of the seat cover after an interview. This was a real thing: during interrogation a cloth would be used to absorb the personal odours of the subject and then sealed in a jar. When, not if, the Stasi needed to find you the jar would be unsealed and used to give tracker dogs your scent.

I recall hearing about this in a John Peel piece many years ago. He went to interview some punks in the former DDR and they spoke of this and many other horrors of living in an authoritarian surveillance society. And this sort of thing was considered a genuine horror. A fine example of the reason why it was worth the grinding fear of the Cold War, to avoid living in that sort of society.

Between the time of watching that film and writing this piece I got into a conversation with a Daily Mail type of fellow, who was complaining that "the UK is becoming a complete police state. Like[sic], on the level of Nazi Germany". This is well known to be a bad way to argue. My response was that it wasn't really, not even on the level of Communist Germany, go watch Das Leben der Anderen to see what a real police state looked like.

Today I'm thinking that I might owe that guy a little bit of an apology. This report [pdf] from the Rowntree Reform Trust makes dismaying reading:
  • A quarter of the public-sector databases reviewed are almost certainly illegal under human rights or data protection law
  • Fewer than 15% of the public databases assessed in this report are effective, proportionate and necessary, with a proper legal basis for any privacy intrusions
  • The benefits claimed for data sharing are often illusory.
Well, joy.

Top of the list of problematical databases is the National DNA Database. A little background: England and Wales (Scotland and Northern Ireland have their own arrangements) are Common Law countries and like other such used to have a distinction between summary offences, misdemeanours and felonies. We don't any more (much as we don't have Grand Juries, either); we have only summary vs indictable offences. But that only applies if you are brought to trial.

We used also to have a distinction between arrestable offences (effectively, felonies) and non-arrestable offences. Over the years so many more offences have been created that this distinction was seen as unhelpful and at first it was blurred and eventually, in the Serious Organised Crime and Police Act 2005, abolished.

Under section 110 of that act a constable may arrest, without warrant, anyone he or she "has reasonable grounds for suspecting to be about to commit an offence" (any offence at all, remember, however minor) if this is "necessary". For example "to enable the name of the person in question to be ascertained" (and for various other reasons listed in the Act). What this amounts to is a summary power of arrest, of anyone, at any time. This was spectacularly under-reported at the time. And here's the punchline: everyone who is arrested is obliged to give a DNA sample and that goes on the National DNA Database.

And it stays there. It stays there if you are released without charge. It stays there even if you are charged, go to trial and are acquitted. This was already known to be in breach of the European Convention on Human Rights. In that unanimous ruling the European Court of Human Rights made a number of interesting statements:
The Court was struck by the blanket and indiscriminate nature of the power of retention in England and Wales. In particular, the data in question could be retained irrespective of the nature or gravity of the offence with which the individual was originally suspected or of the age of the suspected offender; the retention was not time-limited; and there existed only limited possibilities for an acquitted individual to have the data removed from the nationwide database or to have the materials destroyed.
And indeed it turns out to be non-trivial to get the DNA of innocent persons removed from the database.

Also,
The Court noted that England, Wales and Northern Ireland appeared to be the only jurisdictions within the Council of Europe to allow the indefinite retention of fingerprint and DNA material of any person of any age suspected of any recordable offence.
and
The Court expressed a particular concern at the risk of stigmatisation, stemming from the fact that persons in the position of the applicants, who had not been convicted of any offence and were entitled to the presumption of innocence, were treated in the same way as convicted persons.
So we are, here in the UK, faced with a police system that in this regard at least makes no distinction between the innocent and the guilty. And on a grand scale. The full version[pdf] of the RRT report states that
Over half a million innocent people (people not convicted, reprimanded, given a final warning or cautioned, and with no proceedings pending against them) – including over 39,000 children – are now on the database
and to no good effect
doubling the number of people on the database from about 2m to about 4m has not increased the proportion of crimes solved using DNA, which remains steady at about 1 in 300. Indeed, in 2007 the number actually fell slightly
I'm no longer sure that I know how this is much different from the Stasi bottling seat covers.

Except maybe that we do still have a chance to do something about it. A shame that we must appeal to the European institutions for protection from our own executive. Time to join Liberty. Hope I'm not too late.

Slipping through our fingers?

It was close. So close.

What's the really exciting thing about the Agile development movement? Is it getting to ship working software? That's pretty good. Is it not having to lie about status all the time? That's great too! Is it being able (and allowed, and required) to take true responsibility for our work? Wonderful stuff! But I don't think these are the best part.

I think that the best part is that we were starting to normalise the idea that programmers are valuable knowledge workers who collaborate with other valuable workers in the business as peers. Big chunks of the industry had to (re)learn how to do all that shipping and telling the truth and stuff in order to get to that point, but that's only the means. The end is to be dealt with by our paymasters as if we make a genuine contribution to our businesses. And even that is only a starting point for the really interesting stuff.

And now along come folks shouting about Lean. A lot of them are Agile folks looking (quite properly) for what to do next that's better. And the message seems to be: programmers are operators of a technical workflow, and nothing else. They pull to-do items from a queue (which someone else manages) and work them through various states of a purely technical nature until they can be handed off into production (where someone else creates value using them). Again and again and again forever. And it is claimed that if they are organised this way then they will be more efficient at shipping those to-do items than if organised in other ways. This is almost certainly true.

In which case it is going to be hard to avoid being shoe-horned into that mode once the development managers wake up to it. Back down into the engine room. Back to being, rather more literally than before, a cog in a machine.

The Agile approach to developing software (or, something a lot like it) is gaining more and more interest because, along with the stuff about getting to look up at the stars, it is actually a more productive way to develop than a lot of the incumbent approaches. If Lean is more productive again but doesn't even have that social stuff as a hidden agenda, I'm not sure that we as inhabitants of this corner of the industry will turn out to have done ourselves much of a favour.

Fauxtrovert

It took me a long time to overcome my distaste for blogs. I'm still not a huge fan, but writing peripatetic axiom does seem to be better than useless. After a certain amount of prodding I've started to dabble with twitter.

I'm not finding it easy. One problem is that a lot of the time I'm doing things that I'm not supposed to tell anyone else about (because they are commercially sensitive) and a lot of the rest of the time I'm doing things that I just can't believe anyone would be interested to know about. That second part would seem to be a part of the introvert type.

As with blogs in the early days ("I like beer", "isn't my cat cute" etc.) the art of twittering all the miscellaneous stuff of one's life seems fairly pointless in a way I've found difficult to explain. But now redditor shenpen has expressed it very well for me. The twitter stance would appear often to be not introversion, and not extroversion, but fauxtroversion.

Lumpy Kanban

The coalescing of bits and bobs in this posting to the kanbandev Y! group has brought me to a realisation of why Kanban-for-software seems to make me cringe so much. I thought that I'd replied in the group, but apparently not, so I'll do it here. Update: reply now here.

Compared with what I'm used to seeing (expect to see, want to see) in an Agile development organisation, the Kanban that I've seen explained in numerous posts, conference sessions and so forth has far too large a batch size. There's far too much work in progress. The flow of value is far too uneven. I mean, really, the Kanbanistas want us to organise our work around something as big and lumpy and lengthy as an entire feature!?

Consulting Engineers

For the last couple of years I've lived at the bottom of Sydenham Hill (on the Kent side). At the top is the place to which the Crystal Palace was moved after the Great Exhibition. The Palace and the park built around it had many water features, and for this and other reasons water towers were erected to give sufficient head of pressure, something otherwise hard to achieve at the top of the biggest hill for miles around. Big park, lots of features, huge towers.

The engineer who built these towers proved to be mentally unsound and the towers structurally unsound, so Brunel was brought in to rebuild them (which might happen again). The story goes that he was at first most reluctant to pass judgement on the work of another engineer, however bonkers. Gentlemen didn't do that to one another. 

This was in 1855, when engineering as a profession was young, and about twenty years before the Tay Bridge disaster firmly fixed in people's heads the idea that one engineer won't do. Much as even accountancy firms themselves have to get an accountant in to audit their books, on a big enough job engineering firms get one another in to check that they've got their sums right. This is the work of Consulting Engineers.

In the IT world we bandy the word "consultant" around with a certain amount of levity. In the US a consultant doesn't really seem much different from what in the UK we call a contractor: in everything but name a temporary employee. In the UK there is a slight difference, which a contractor once expressed to me (a consultant) in this way: contractors don't have to write reports.

More generally, contractors build the system the way they are told to; consultants have an opinion about how the system should be built. This distinction is not lost on the tax authorities.

Anyway, what with all this talk of "craftsmanship" there's been about the place recently, thoughts naturally turn towards the more mature disciplines. A significant aspect of those, in many cases, is that it's awfully hard to get certified to practise.

That's in part so that individuals can bear some liability if the job goes tits-up, because the certifying bodies certainly will. This is as imperfect as any human institution, but at least they try. Another aspect of this is genuine consultancy. If you, you personally, are going to bear liability for it all going pear-shaped then maybe you should spread some of the blame load by getting someone else in.

Notice that in the medical field, if you aren't sure of a diagnosis then you can ask for a second opinion. If your doctor isn't sure, they'll go and get one by themselves.

This seems not to happen much in the IT field. Perhaps another symptom of our immaturity? When was the last time you worked on a project where the prime contractor brought in a competitor to check that they'd "got their sums right"?

Have we had our Tay Bridge yet? Maybe. In some domains, avionics, for instance, you do hear of multiple independent implementations. That's not quite what I mean, though. 

Why don't IT companies running projects (of a certain size or complexity) routinely get the competition in to express an opinion? Why don't clients demand this as part of their risk mitigation strategy? What is it going to take for folks to bring a genuinely professional standard of conduct to IT?

Update: it's two days later and no-one has thrown down the gauntlet. I was expecting some bright individual to come back at me with a "you work for a consultancy, so you tell us why this doesn't happen". But no-one seems to be biting....

The Rush to Lean Makes Me Nervous

And I'm not the only one.

Now, I am not an expert on manufacturing, but I have seen some. Software development is not manufacturing. I don't think that it's even very much like manufacturing. Manufacturing is about achieving uniformity across huge numbers of multiples and making some trade-off between the highest and the most economic rate of production of those multiples. 

Software development is about making one.

I believe that software development is more like product development than manufacturing.

I'm not an expert in product development either, but I work for a company that, amongst other things, does develop products. Product development is very different from manufacturing (which we don't do), although the two are very closely related. A big part of product development is to arrive at a design that can be effectively manufactured. 

Unfortunately a lot of our clients don't want you to know that we designed their products, so I can't brag too much. Which is a shame as a lot of them are waaaay cool. However, this flier[pdf] is public and describes one episode of product development. The flier mentions PEP, our process for developing products. As a deliberate policy we try to keep the processes that we use to develop products and the processes we use to execute software development projects as closely aligned as possible. 

Both are iterative. Both involve exploration. Both are adapted to circumstances while maintaining certain principles (such as actively managing risk). Both have delivered a very high success rate over many, many engagements. So I feel on fairly firm ground in claiming this similarity between software development and product development.

As Steve Freeman points out, by a strict Lean interpretation of the manufacturing school product development looks wasteful. And it is. And that's OK because it isn't manufacturing. The economics and the goals are different.

It's worth noting that the highly-publicised Lean successes seem to be concerned largely with operational activities: the never-ending, ongoing care and feeding of an existing system. To the extent that your activities are like that, a more Lean approach is likely to work well for you, I think. I've yet to hear of a success story of strict Lean (no iterations, no retrospectives, all the rest of it) run in a project setting to produce a new product.

I don't say it can't be done, but I've not heard of it. If you have, I'd love to.


Let over Lambda

Let over Lambda is a new self-published book on lisp by Doug Hoyte. I'm not quite sure what to make of it, overall.

It's great to see a book full of straight-ahead programming these days rather than mere system composition. It's especially great to see an extended work dealing with programming in the small. It's great fun to see someone who really likes programming as an activity in its own right exhibit their enjoyment. It's a great pleasure to see assembly language appear in a programming book of the 21st century. I find the curious combination of lisp being at once both a very high level language suited to symbolic programming and being very close to the metal most stimulating. That's a pair of properties that the programming language design mainstream seems to have abandoned. Java, for instance, doesn't feel particularly close even to the JVM.

It's especially great to see a lisp aficionado standing up for vi.

Assembler arises in a couple of spots where the impact of macros, the parsimony of lisp's internal representations and the intelligence of the lisp compiler combine to collapse quite sophisticated source code down to startlingly few opcodes. Which is all very fascinating. So much so that I was inspired to resurrect the SBCL installation on my Mac and go refresh my memory of how cute the disassemble function is. However, it feels to me as if an opportunity has been lost to take that just a bit further and come up with some Programming Pearls-like startling observations about performance.
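For readers without a lisp to hand, a roughly analogous peek at what a compiler emits (my illustration, not the book's, and standing in for Common Lisp's disassemble) can be had in Python with the standard dis module:

```python
import dis

# A tiny function; the compiler reduces the whole thing to a handful
# of bytecode instructions, which dis can show us. This is a stand-in
# for (disassemble #'square) at an SBCL prompt, not the book's own code.
def square(x):
    return x * x

dis.dis(square)  # prints the bytecode listing

# The entire function compiles down to just a few instructions.
print(len(list(dis.get_instructions(square))))
```

Of course CPython bytecode is a long way up from the opcodes SBCL emits, but the habit of looking at what your compiler actually produces translates directly.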

The book builds up to a very interesting exercise in implementing Forth. It's very nicely done and a great illustration of how easy it is to implement one interesting language given another. Lisp/Prolog used to be the canonical pair, I think. This illustration makes a good case for lisp/forth being roughly as illuminating.
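The Forth exercise itself is the book's and is in lisp; as a rough, hypothetical sketch of just how little machinery a bare-bones Forth needs (the function name and word set here are mine, not Hoyte's), a toy evaluator might look like this in Python:

```python
def forth_eval(program, stack=None):
    """Evaluate a whitespace-separated Forth-like program on a data stack.

    A toy sketch: integer literals plus a few core words, nothing more
    (no colon definitions, no return stack, no error handling)."""
    stack = [] if stack is None else stack
    for token in program.split():
        if token == "+":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif token == "*":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        elif token == "dup":
            stack.append(stack[-1])
        elif token == "swap":
            stack[-1], stack[-2] = stack[-2], stack[-1]
        elif token == "drop":
            stack.pop()
        else:
            stack.append(int(token))
    return stack

# "3 dup *" squares the top of stack: 3 3 * leaves 9
print(forth_eval("3 dup *"))
```

Nothing here exploits macros, of course, which is exactly the interesting part of doing it in lisp rather than in a language like this.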

Along the way there are several not-quite-consistent claims about what the book is for, and the big build-up to the alleged principle of “duality of syntax” is a very long run for a very short slide. Again, it feels as if an opportunity to do something really startling has been lost. There's a sort of plonking “here it is” presentation of this and other material. It's often good and interesting material, but it needs a little bit more development in places.

It's perhaps not so great to have the, what shall I call it? unfettered enthusiasm of the author for lisp, macros and all that they imply coming at you unmoderated. I don't think that a commercial editor would have allowed quite so much polemic to make it onto the page. There's a bit too much direct quotation of Paul Graham material (“Blub”, “secret weapon”, you know the sort of stuff) that makes it quite clear that there are on the one hand people who get it, and on the other dullards. This is made very explicit on the back cover blurb:
Only the top percentile of programmers use lisp and if you can understand this book you are in the top percentile of lisp programmers.
Hmm. I have a strong feeling that I understand most of what's in the book, and also that I'm not in the top of the top. Whatever that means. I'm not even a “lisp programmer” in any very serious sense of the term. Faced with a little light programming to do, in the absence of any other constraints I'm likely to reach for Scheme. And that brings me to another item that a commercial editor probably wouldn't have let through.

You might imagine that the differing trade-offs made in Scheme and Common Lisp are something that reasonable people could agree to disagree about. Hoyte wants his reader to understand very clearly that this is not so: the choices made in Scheme are wrong (emphasis in the original) and those made in CL are right (emphasis also in the original). The first of these assertions was amusing enough. The second, not so much. And they just keep on coming. Hoyte is far too young to be settling scores from some X3J13 punch-up, which would be embarrassing enough, so it all ends up looking a bit adolescent to me.

One last thing...at least in the print-on-demand form I've got from Lightning Source UK the book looks absolutely ghastly. “Made with lisp”, says the front matter. Lisp with a lot of help from TeX, and that's really not good enough for 2009, not without a lot more tuning than has gone into this. And Lightning Source (or whoever did the camera-ready copy) have originated the work at too low a resolution. That, and the lazy choice of CMR combined with the glossy toner, makes the actual print a less than comfortable read. Self-publishing has a long way to go yet.