Interestingly, a similar (albeit brief) discussion sparked up on the XP egroup. I posted a link to my previous blog post there, but it seemed to get lost in the flood. Oh well. It will be interesting to compare my results with those in the Muller paper.
Anyway, I've managed to find the time to look at a very small sample of Java codebases, find the distribution of per-method cyclomatic complexity in each, fit it to a Pareto distribution, and look at the slope of the best-fit straight line on log-log axes. Here's the outcome, with codebases ordered by published unit tests per unit of total cyclomatic complexity (where applicable); a sketch of the slope-fitting step follows the table:
| codebase | #tests / total CC | slope |
|---|---|---|
| jasml 0.10 | 0 | 1.18 |
| logica smpp library | 0 | 1.39 |
| itext 1.4.1 | 0 | 1.96 |
| jfreechart 1.0.1 | 0.02 | 2.43 |
| junit 3.8.2 | 0.14 | 2.47 |
| ust (proprietary) | 0.35 | 2.79 |
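For concreteness, here's a minimal sketch of the slope-fitting step described above, assuming the per-method cyclomatic complexities have already been extracted by some other tool: bucket the methods by complexity value, take logs of both complexity and count, and least-squares fit a straight line; the magnitude of that line's slope is the kind of figure tabulated above. The class name, method names, and sample data are illustrative rather than the script actually used, and whether the original fit was to the histogram or to the cumulative distribution isn't stated; this sketch fits the histogram.

```java
import java.util.*;

/** Minimal sketch (not the original script): estimate the Pareto slope of a
 *  cyclomatic-complexity distribution by least-squares fitting a straight
 *  line to the log-log histogram of per-method complexities. */
public class ParetoSlope {

    /** Count how many methods have each complexity value. */
    static SortedMap<Integer, Integer> histogram(int[] complexities) {
        SortedMap<Integer, Integer> counts = new TreeMap<>();
        for (int cc : complexities) {
            counts.merge(cc, 1, Integer::sum);
        }
        return counts;
    }

    /** Ordinary least-squares slope of log(count) against log(complexity). */
    static double logLogSlope(SortedMap<Integer, Integer> counts) {
        int n = counts.size();
        double sumX = 0, sumY = 0, sumXX = 0, sumXY = 0;
        for (Map.Entry<Integer, Integer> e : counts.entrySet()) {
            double x = Math.log(e.getKey());
            double y = Math.log(e.getValue());
            sumX += x;
            sumY += y;
            sumXX += x * x;
            sumXY += x * y;
        }
        return (n * sumXY - sumX * sumY) / (n * sumXX - sumX * sumX);
    }

    public static void main(String[] args) {
        // Hypothetical per-method cyclomatic complexities for one codebase.
        int[] complexities = {1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 5, 8, 13};
        double slope = logLogSlope(histogram(complexities));
        // The fitted slope is negative (fewer methods at higher complexity);
        // its magnitude is the kind of figure reported in the table.
        System.out.printf("fitted slope: %.2f%n", Math.abs(slope));
    }
}
```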
A few points present themselves. Each of the codebases with no published tests has a substantially lower slope (and so a substantially greater representation of more complex methods) than any of those with tests. Among those with published tests, the number of tests per "unit" of complexity is positively correlated with slope, and so with a preference for simpler methods, at about 0.96 (very good for "social science", reasonable for "hard science", but this is "computer science", so who knows?).
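As a quick sanity check on that figure, here's a small sketch (again illustrative, not the original analysis) that computes the Pearson correlation between tests-per-CC and slope for the three codebases with published tests, using the values from the table; it comes out at roughly 0.96.

```java
/** Sanity check: Pearson correlation between tests-per-CC and fitted slope
 *  for the three codebases with published tests, taken from the table above. */
public class SlopeCorrelation {

    static double pearson(double[] x, double[] y) {
        int n = x.length;
        double mx = 0, my = 0;
        for (int i = 0; i < n; i++) {
            mx += x[i];
            my += y[i];
        }
        mx /= n;
        my /= n;
        double sxy = 0, sxx = 0, syy = 0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - mx) * (y[i] - my);
            sxx += (x[i] - mx) * (x[i] - mx);
            syy += (y[i] - my) * (y[i] - my);
        }
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] testsPerCC = {0.02, 0.14, 0.35}; // jfreechart, junit, ust
        double[] slope      = {2.43, 2.47, 2.79};
        // Prints roughly 0.96, in line with the figure quoted above.
        System.out.printf("r = %.2f%n", pearson(testsPerCC, slope));
    }
}
```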
The story continues here