Wouldn’t it be useful to know, project-by-project and over time, how much time your team is spending on finding and fixing bugs? It could sure make for some pretty graphs and visualisations!

This article shows you how to use microformats in your Git commit messages to gather useful and interesting metrics on your software projects.

Why gather metrics?

  • Determine how much time is spent finding and fixing bugs. You may think your team are working on features most of the time, only to discover that they are actually spending most of their hours firefighting! If so, it’s better to know than to carry on regardless.

  • Track differential over time. Do bugs get quicker to find as the team grow more familiar with the client’s business aims and codebase? Are things taking longer and longer to diagnose and fix? This could be a sign that you need to stop and address technical debt. In our particular case, we have developed some hypotheses about tooling and methodologies to support isolating and identifying defects, and we want to test their effectiveness over time.

  • Automatable — it’s not a great deal of effort. If this information could be useful in the future, then now is a good time to start at least recording it!

Defining a microformat

Many tools already support a commit message format you’re likely familiar with: using [#Numeric ID] to reference a ticket or issue, as employed by Trac, GitHub, Pivotal Tracker and others.

When tracking your own data, you should use a format that’s easily parseable by humans and machines, and is easily distinguishable from normal commit message content.

For ease of use and consistency, we decided on the following format:

Fix pagination bug

Pagination was causing a buffer overflow by repeatedly hammering the server for every page in parallel rather than requesting one page at a time.

Added test that verifies that this no longer occurs and counts the number of requests made.

ttc:2
ttf:1.5

So, all we have is ttc:{number of hours to the nearest 0.5}{newline}ttf:{number of hours to the nearest 0.5}. Very short and simple! There are just two fields: ttc is Time To Cause, how long it took to work out exactly what the problem was, and ttf is Time To Fix, how long it took to fix it. TTF may include things like writing a test to prevent regressions.

Of course, these things are not hard and fast. There’s little point setting a stopwatch; we’re not pretending that this is a precise science. That’s why we’re only going to a half-hour level of precision. If it’s less than 15 minutes, then just put 0; it will all average out.

TTC and TTF are exclusive periods of time - so in this example, the engineer spent 2 hours finding and 1.5 hours fixing, for a total of 3.5 hours spent.
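
As a rough illustration of the rounding rule, here is a minimal Python sketch that turns two durations in minutes into the lines you would append to a commit message. The helper names to_half_hours and metric_lines are made up for this example, and nothing here is part of Git itself; it simply assumes you have a rough idea of how many minutes each phase took.

```python
import math


def to_half_hours(minutes):
    """Round a duration in minutes to the nearest 0.5 hour.

    Anything under 15 minutes rounds down to 0, as suggested above.
    """
    return math.floor(minutes / 30 + 0.5) / 2


def metric_lines(ttc_minutes, ttf_minutes):
    """Build the two microformat lines (hypothetical helper)."""
    ttc = to_half_hours(ttc_minutes)
    ttf = to_half_hours(ttf_minutes)
    # The "g" format drops a trailing ".0", so 2.0 is written as "2".
    return f"ttc:{ttc:g}\nttf:{ttf:g}"


# Roughly two hours of diagnosis and an hour and a half of fixing.
print(metric_lines(125, 85))
```

With those inputs the output matches the example commit above: ttc:2 on one line and ttf:1.5 on the next.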

Collating the results

Now that your commit messages are being properly annotated, it’s time to extract the raw data. Fortunately, GitHub makes this very easy by allowing web hooks to be attached to a repository and receive commit data on every push. If you’ve ever tinkered with GitHub’s hooks system, you know the structure of that data looks something like this:

{
  "ref": "refs/heads/master",
  /* ... */
  "commits": [
    {
      "id": "33eb08ab524a6b8dc962aa3a4be62a313730b34f",
      "distinct": true,
      "message": "Fix pagination bug\n\nPagination was causing a buffer...", // ...
      "timestamp": "2014-05-27T18:40:37-04:00",
      "author": {
        "name": "Nate Abele",
        "email": "[email protected]",
        "username": "nateabele"
      },
      /* ... */
    }
  ],
  /* ... */
  "repository": {
    /* ... */
  }
}

Once the hook is set up, it can iterate through each push’s commit data, pattern-matching on the message field, like so: /^TT[CF]:?\s*([0-9]*\.?[0-9]+?)$/img. Once parsed, the results may be stored in a database for later analysis.
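
The parsing and storage step might look something like the following Python sketch. It is only one possible approach under a few assumptions: the pattern is a Python rendering of the one above, altered to also capture the field letter so that TTC and TTF can be told apart; the function names extract_metrics and store_push and the bug_times table are invented for illustration; and an in-memory SQLite database stands in for whatever store you actually use. The payload is a cut-down version of the push data shown earlier.

```python
import re
import sqlite3

# Python rendering of the article's pattern. One change (an assumption of
# this sketch): the C/F letter is captured so the two fields can be separated.
METRIC_RE = re.compile(
    r"^TT([CF]):?\s*([0-9]*\.?[0-9]+)$", re.IGNORECASE | re.MULTILINE
)


def extract_metrics(message):
    """Return e.g. {"ttc": 2.0, "ttf": 1.5} for fields found in a message."""
    metrics = {}
    for letter, hours in METRIC_RE.findall(message):
        metrics["tt" + letter.lower()] = float(hours)
    return metrics


def store_push(conn, payload):
    """Record one row per annotated commit in a push payload."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bug_times "
        "(commit_id TEXT PRIMARY KEY, committed_at TEXT, ttc REAL, ttf REAL)"
    )
    for commit in payload.get("commits", []):
        metrics = extract_metrics(commit.get("message", ""))
        if not metrics:
            continue  # ordinary commit with no ttc/ttf lines
        conn.execute(
            "INSERT OR REPLACE INTO bug_times VALUES (?, ?, ?, ?)",
            (
                commit["id"],
                commit["timestamp"],
                metrics.get("ttc"),
                metrics.get("ttf"),
            ),
        )
    conn.commit()


# Cut-down push payload with the same shape as the example above.
payload = {
    "ref": "refs/heads/master",
    "commits": [
        {
            "id": "33eb08ab524a6b8dc962aa3a4be62a313730b34f",
            "message": "Fix pagination bug\n\nAdded test.\n\nttc:2\nttf:1.5",
            "timestamp": "2014-05-27T18:40:37-04:00",
        }
    ],
}

conn = sqlite3.connect(":memory:")
store_push(conn, payload)
total = conn.execute("SELECT SUM(ttc + ttf) FROM bug_times").fetchone()[0]
print(total)  # 3.5 hours in total for this push
```

Keying the table on the commit id means a redelivered hook overwrites its earlier row rather than double-counting the hours.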

Difficulties

While straightforward, this technique is not without some caveats. For example, if you’re working out an issue across disparate components in a multi-repository system, it may be difficult to analyse the problem in aggregate.

Further, critical bugs may require immediate fixes, even if only partial, with a final fix to be implemented at a later date. Other bugs, even after identifying the supposed root cause and adding regression tests accordingly, simply refuse to die.

Accurately measuring the impact of these issues, while possible, requires higher-level analysis, involving integration with, and careful tending of, an issue-tracking system.

Wrapping up

I always get a bit suspicious of code metrics. Whilst you can have categories of bugs, every problem is very much its own thing. It’s vital to understand that any trends and patterns that you observe are in the broadest of strokes only; take every case on its own merits. Be aware that in comparing projects you may well be comparing apples and oranges. And, more critically, if you start using metrics as a tool to beat engineers over the head, then you have failed at management in the worst way!

Dire warnings against their abuse aside, metrics are the thing that stands between your team and “blind” development. “Gut feel” will only take you so far; if you don’t have any insight into what you’re doing and whether it’s being successful, how do you know if you’re improving?

Do you use metrics in your software projects? Have they been helpful? How do you measure technical debt?