Sunday, October 7, 2007

Hackystat and Crap4J

The folks at Agitar, who clearly have a sense of humor in addition to being excellent hackers, have recently produced a plug-in for Eclipse called Crap4J that calculates a measure of your code's "crappiness".

From their web page:

There is no fool-proof, 100% objective and accurate way to determine if a particular piece of code is crappy or not. However, our intuition – backed by research and empirical evidence – is that unnecessarily complex and convoluted code, written by someone else, is the code most likely to elicit a “This is crap!” response. If the person looking at the code is also responsible for maintaining it going forward, the response typically changes into “Oh crap!”

Since writing automated tests (e.g., using JUnit) for complex code is particularly hard to do, crappy code usually comes with few, if any, automated tests. The presence of automated tests implies not only some degree of testability (which in turn seems to be associated with better, or more thoughtful, design), but it also means that the developers cared enough and had enough time to write tests – which is a good sign for the people inheriting the code.

Because the combination of complexity and lack of tests appear to be good indicators of code that is potentially crappy – and a maintenance challenge – my Agitar Labs colleague Bob Evans and I have been experimenting with a metric based on those two measurements. The Change Risk Analysis and Prediction (CRAP) score uses cyclomatic complexity and code coverage from automated tests to help estimate the effort and risk associated with maintaining legacy code. We started working on an open-source experimental tool called “crap4j” that calculates the CRAP score for Java code. We need more experience and time to fine tune it, but the initial results are encouraging and we have started to experiment with it in-house.
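For concreteness, the formula Agitar has published combines the two measurements per method; here is a minimal Java rendering, with coverage expressed as a fraction from 0.0 to 1.0 rather than a percentage:

    // CRAP(m) = comp(m)^2 * (1 - cov(m))^3 + comp(m), where comp(m) is the
    // cyclomatic complexity of method m and cov(m) is the fraction of m
    // exercised by automated tests.
    public static double crap(int complexity, double coverage) {
      return Math.pow(complexity, 2) * Math.pow(1.0 - coverage, 3) + complexity;
    }

So a method with complexity 10 and no test coverage scores 110, while full coverage brings that same method down to 10. Crap4J then reports the percentage of methods whose score exceeds a threshold (30, if I recall correctly).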

Here's a screenshot of Crap4J after a run over the SensorShell service code:

[Screenshot: Crap4J results for the SensorShell service code]
Immediately after sending this link to the Hackystat Hackers, a few of us started playing with it. While the metric seems intuitively appealing (and requires one to use lots of bad puns when reporting on the results), its implementation as an Eclipse plug-in is quite limiting. We have found, for example, that the plug-in fails on the SensorBase code, not through any fault of that code (whose unit tests run quite happily within both Eclipse and Ant), but seemingly because of some interaction with Agitar's auto-test invocation or coverage mechanism.

Thus, this seems like an opportunity for Hackystat. If we implement the CRAP metric as a higher-level analysis (for example, at the Daily Project Data level), then any combination of tools that sends Coverage data and FileMetric data (which supplies cyclomatic complexity) can produce CRAP. Hackystat can thus measure CRAP independently of Eclipse, or even of Java.
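To make the idea concrete, here is a sketch of what such an analysis might look like. The MethodData interface and its accessors are placeholders for whatever the DPD layer would actually expose, not the real Hackystat API:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Placeholder for per-method data assembled at the DPD level from
    // whatever tools sent FileMetric and Coverage sensor data.
    interface MethodData {
      String name();
      int complexity();  // cyclomatic complexity, from FileMetric data
      double coverage(); // fraction covered (0.0-1.0), from Coverage data
    }

    public class CrapDpdAnalysis {
      // Computes the CRAP score for every method, regardless of which
      // tool (or language) produced the underlying sensor data.
      public static Map<String, Double> crapPerMethod(List<MethodData> methods) {
        Map<String, Double> scores = new HashMap<String, Double>();
        for (MethodData m : methods) {
          double score = Math.pow(m.complexity(), 2)
              * Math.pow(1.0 - m.coverage(), 3) + m.complexity();
          scores.put(m.name(), score);
        }
        return scores;
      }
    }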

The Agitar folks go on to say:

We are also aware that the CRAP formula doesn’t currently take into account higher-order, more design-oriented metrics that are relevant to maintainability (such as cohesion and coupling).

Here is another opportunity for Hackystat: it would be trivial, once we have a DPD analysis that produces CRAP, to provide a variant CRAP calculation that factors in Dependency sensor data (which provides measures of coupling and cohesion).
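No such formula exists yet, so the weighting below is pure invention on my part, but it shows how little extra machinery a coupling-aware variant would need:

    // Illustrative only: scale the base CRAP score by a penalty derived
    // from efferent coupling (available from Dependency sensor data).
    // The divisor of 10 is an arbitrary choice for this sketch; picking
    // a defensible weighting is part of the research question.
    public static double crapWithCoupling(int complexity, double coverage,
                                          int efferentCoupling) {
      double base = Math.pow(complexity, 2) * Math.pow(1.0 - coverage, 3) + complexity;
      return base * (1.0 + efferentCoupling / 10.0);
    }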

Then we could do a simple case study: run these two measures of CRAP over a code base, order its classes by their level of crappiness according to each measure, and ask experts to assess which ordering appears more consistent with the code's "true" crappiness.
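As a preliminary sanity check before involving the experts, we could quantify how much the two orderings actually disagree using a rank correlation; here is a sketch of Spearman's rho, assuming no tied ranks:

    // Spearman rank correlation between two orderings of the same classes.
    // ranksA[i] and ranksB[i] give class i's rank under each CRAP variant;
    // rho near 1.0 means the two measures largely agree, so expert
    // judgments would only be interesting where rho is low.
    public static double spearman(int[] ranksA, int[] ranksB) {
      int n = ranksA.length;
      double sumSquaredDiffs = 0.0;
      for (int i = 0; i < n; i++) {
        double d = ranksA[i] - ranksB[i];
        sumSquaredDiffs += d * d;
      }
      return 1.0 - (6.0 * sumSquaredDiffs) / (n * ((double) n * n - 1.0));
    }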

I think such a study could form the basis for a really crappy B.S. or M.S. Thesis.
