This thread came from here, and will be continued...
Just walked out of the first keynote at Agile 2007: I don't need to hear about an amateur mountaineer's love life (yes, really), no matter how "inspirational" it's supposed to be. What have we come to?
Anyway, that gives me time to get ready for the latest adventure in test-driven development and complexity.
Well, that didn't go quite as smoothly as I'd hoped. The room I'd been given was in some sort of bunker under the hotel, and while it did technically have a wireless network connection, the feeble signal couldn't support as many laptops as there were in the room. So it was a bit of a struggle to get folks set up to use the tool.
However, some people did manage to gather some interesting data, some of which I hope will be shared here. Certainly, folks have found some interesting correlations between the numbers that the tool emits and their experiences working with their code. Especially encouraging is that folks are applying the tool to long-lived codebases of their own and looking at historical trends. These are the sorts of stories that I need to gather now.
Note: the tool is GPL (and the source is in the measure.jar along with the classes). Several folks are interested in C# and Ruby versions, which I'd love to see and am happy to help with.
I sat down with Laurent Bossavit and we experimented to see whether we could get equally interesting results from looking at the distribution of size (i.e., counting line ends) in a codebase, and it turns out we can't. Which is a shame, as that would be easier to talk about, but it's also what I expected, so that's kind-of OK.
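The size-counting part of that experiment is simple enough to sketch. This is my reconstruction, not the tool itself (which works on complexity, not size), and file-level rather than method-level granularity is an assumption here:

```python
from collections import Counter

def size_distribution(paths):
    """Tally how many files have each size, where "size" is just the
    number of line ends -- the crude measure Laurent and I tried.
    Returns sorted (size, file_count) pairs, ready for a log-log plot."""
    sizes = Counter()
    for path in paths:
        with open(path, errors="ignore") as f:
            sizes[sum(1 for _ in f)] += 1
    return sorted(sizes.items())
```

Plotting those pairs on log-log axes is then the same exercise as with the complexity numbers; it just doesn't produce the same clean line.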
A lot of good questions came up in the session, pointers to where I need to look next: Is there a difference between code written by solo programmers vs teams? Do you get the same effect from using functional (say, Fit style) tests as from unit tests? Is there any correlation with test coverage? Exactly what effect does refactoring have on the complexity distribution? Thanks all for these.
Laurent seemed at one point to have a counterexample to my hypothesis (which was a very exciting proposition): code that he knew had been done with strong test-first, but that had a Pareto slope of about 1.59 (and an amazing R^2 of 1.0). On closer examination, though, it turned out that the codebase was a mixture of solid TDD code that by itself had a slope of 2.41, and some other code (which we had good reason to believe was (a) poor and (b) not test-driven) that by itself had a slope of 1.31.
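For anyone wanting to reproduce these numbers: a Pareto slope and its R^2 come from an ordinary least-squares fit on log-log axes. This is a minimal sketch of that fit; I'm assuming here that the input is (complexity value, method count) pairs, which may differ from exactly what the tool reports:

```python
import math

def pareto_fit(counts):
    """Fit a power-law slope to a distribution by least-squares
    regression on log-log axes.

    counts: (value, frequency) pairs, e.g. (complexity, method count).
    Returns (slope, r_squared), with the slope as a positive exponent."""
    pairs = [(math.log(c), math.log(n)) for c, n in counts if c > 0 and n > 0]
    k = len(pairs)
    mx = sum(x for x, _ in pairs) / k
    my = sum(y for _, y in pairs) / k
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return -sxy / sxx, (sxy * sxy) / (sxx * syy)
```

A perfectly straight log-log line gives R^2 of 1.0, which is why that figure for Laurent's mixed codebase looked so striking before we split it apart.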
Unfortunately, I wasn't able to make it to the research paper session where this thing was discussed, or this. But I need to catch up with those folks. In particular, the IBM group report that with TDD they don't see the complexity of the code increase in the way that they would expect from experience with non-TDD projects.