Drupal Maintainability III – Self Documentation

It’s been a while since I’ve done an article in the Maintainability series, but that’s how the web business goes – alternating between insane activity and manageable momentum.

This time we’re talking about writing self-documenting code. For those of you who write code, be it theming, CSS, module development, or anything else, I think self-documenting code may be the most important way to make your code understandable to other people who try to read it. And really, it’s not just about making life easy for other people: you’ll thank yourself when you open that old project six months later. A really smart dude* once said:

I wouldn’t give a nickel for the simplicity on this side of complexity, but I would give my life for the simplicity on the other side of complexity.

Variable and Function Names That Mean Something

Let me start with an example.

<?php
foreach ($values as $k => $v) {
  if ( array_key_exists($k, $objects)) {
    if ( ! isset( $ids['location'] ) ||
         ! is_array( $ids['location'] ) ) {
      $ids['location'] = array( );
    }
    if ($k == 'location_id') {
      $ids['location'][$locNo]['id'] = $v;
      //store location type id
      $ids['location'][$locNo]['location_type_id'] = $value['location_type_id'];
    } else {
      $ids['location'][$locNo][$objects[$k]] = $v;
    }
  } else if (is_array($v)) {
    //build phone/email/im/openid ids
    if ( in_array ($k, array('phone', 'email', 'im', 'openid')) ) {
      $no = 1;
      foreach ($v as $k1 => $v1) {
        if (substr($k1, strlen($k1) - 2, strlen($k1)) == "id") {
          $ids['location'][$locNo][$k][$no] = $v1;
          $no++;
        }
      }
    }
  }
}
?>

There are so many generic names in this code: values, k, v, objects, ids, locNo, k1, v1, no. If it weren’t for the two in-line comments, this code would be completely incomprehensible. Try to make names specific and meaningful. Concatenate two words if it helps. This holds true for markup too: don’t name your divs col1, col2, col3.
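To make the difference concrete, here is a minimal sketch loosely based on the inner loop of the snippet above. It is not a rewrite of the original code: the function name `collect_contact_detail_ids()` and the shape of the `$location` array are assumptions made up for this illustration.

```php
<?php
/**
 * Collects the record IDs of the contact details stored on one location.
 *
 * Hypothetical helper: the name and the input layout are assumptions.
 */
function collect_contact_detail_ids(array $location) {
  $contact_detail_types = array('phone', 'email', 'im', 'openid');
  $ids_by_type = array();

  foreach ($contact_detail_types as $detail_type) {
    // Skip detail types that this location does not have.
    if (empty($location[$detail_type]) || !is_array($location[$detail_type])) {
      continue;
    }
    foreach ($location[$detail_type] as $field_name => $field_value) {
      // Only fields whose name ends in "id" hold record IDs.
      if (substr($field_name, -2) === 'id') {
        $ids_by_type[$detail_type][] = $field_value;
      }
    }
  }

  return $ids_by_type;
}

// Made-up sample data.
$location = array(
  'phone' => array('phone_id' => 12, 'number' => '555-0100'),
  'email' => array('email_id' => 34, 'address' => 'someone@example.com'),
);
$ids_by_type = collect_contact_detail_ids($location);
```

With names like `$detail_type` and `$field_name`, the loop reads almost like a sentence, and the single comment only has to explain the one non-obvious rule.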

This also applies to functions that you call from within your code. If you must call a generic function, call it only once if possible and store the result in a local variable with a meaningful name. For example, in Drupal, instead of:

<?php
  ... arg(1)));

  $bar = my_module_bar_get(arg(1), arg(2));
}
?>

you might do this to make things more understandable:

<?php
  ... $uid));

  $bar = my_module_bar_get($uid, $foo_id);
}
?>
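As a self-contained sketch of the same idea: Drupal’s `arg()` returns one component of the current path, so it is stubbed here with a made-up path so the example runs outside Drupal. The callback name `my_module_page_title()` and the path `my-module/42/7` are assumptions for illustration only.

```php
<?php
/**
 * Stand-in for Drupal's arg(); the path below is made up for this sketch.
 */
function arg($index) {
  $path_components = explode('/', 'my-module/42/7');
  return isset($path_components[$index]) ? $path_components[$index] : NULL;
}

/**
 * Hypothetical page callback.
 */
function my_module_page_title() {
  // Give the path components meaningful names once, at the top.
  $uid = (int) arg(1);
  $foo_id = (int) arg(2);

  // From here on the code never mentions arg() again.
  return "Foo $foo_id of user $uid";
}

$title = my_module_page_title();
```

If the path layout ever changes, only the two assignments at the top need to be touched.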

Singular and Plural

In most cases scalars (strings, integers, etc.) and objects are single things, while arrays are often lists. Make sure that your variable names reflect that: $bars and $people for lists, $bar and $person for single items. This can apply even to CSS and markup.
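A short sketch of how this reads in a loop; the data is made up for the example:

```php
<?php
// Hypothetical sample data.
$people = array(
  array('name' => 'Ada', 'city' => 'London'),
  array('name' => 'Grace', 'city' => 'New York'),
);

$names = array();
foreach ($people as $person) {
  // $people is the whole list, $person is one entry from it.
  $names[] = $person['name'];
}
```

The plural/singular pair makes it obvious at every line whether you are holding the list or one item from it.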

Abbreviations

I have a pet peeve about abbreviations. Does ‘vid’ mean ‘video’, ‘vocabulary ID’, ‘version ID’, ‘vehicle ID’, or something completely different? Of course there are some universal conventions; I think it’s fairly safe to assume that in programming ‘ID’ means ‘identifier’. You might say “but full names take longer to type”. They don’t have to: the tab key auto-completes on the command line, and your IDE auto-completes variable and function names too. If you’re not yet using an IDE, you really should be; it will make your life a whole lot easier.

Molecularly Understandable Code

What I mean by this is that each chunk of code should be understandable. Ten points to anyone who can tell me what is going on here:

<?php
    ... => array(1, 4),
    2 => array(1, 4),
    3 => array(1),
    4 => array(4),
    5 => array(1),
    7 => array(1, 4),
  );
}
?>

I’m not making up these code examples. Function names have been slightly altered to protect the (not so) innocent but this is real code that someone wrote and I’ve found myself needing to read/call/fix. I have no idea what is going on in the above snippet. If you’re passing around multi-dimensional arrays of seemingly random numbers you probably need to re-architect things.
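One way out, sketched under heavy assumptions: give every number a name. Since nobody knows what the numbers in the snippet above stand for, the content types, roles, and function names below are all invented for illustration.

```php
<?php
// Invented meanings: the original numbers carry no hint of what they are.
define('CONTENT_TYPE_ARTICLE', 1);
define('CONTENT_TYPE_EVENT', 2);
define('ROLE_EDITOR', 1);
define('ROLE_MANAGER', 4);

/**
 * Returns the roles allowed to publish each content type (hypothetical).
 */
function publishing_roles_by_content_type() {
  return array(
    CONTENT_TYPE_ARTICLE => array(ROLE_EDITOR, ROLE_MANAGER),
    CONTENT_TYPE_EVENT => array(ROLE_MANAGER),
  );
}

/**
 * Checks whether a role may publish a given content type (hypothetical).
 */
function role_can_publish($role_id, $content_type_id) {
  $roles_by_type = publishing_roles_by_content_type();
  return isset($roles_by_type[$content_type_id])
    && in_array($role_id, $roles_by_type[$content_type_id], TRUE);
}
```

The data is exactly as compact as before, but a reader can now tell what each row means without hunting through the rest of the module.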

Actual Documentation

Now, in a perfect world you might be able to write code that people could fully understand without any extra help. Unfortunately, this doesn’t appear to be possible. One of the most important ways to include actual documentation in your code is with DocBlock / Javadoc / Doxygen / whatever-you-want-to-call-them comment blocks. These are used on almost all Drupal functions and are really helpful, because your IDE can show you the comments as you type a function name or when your cursor is on one.

Here’s the basic format (each line with a leading ‘*’ is indented by one space so that the asterisks line up):
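A minimal sketch of such a comment block, following the Doxygen-style layout used in Drupal code. The function `my_module_greeting()` is hypothetical and exists only to give the comment something to describe.

```php
<?php
/**
 * Builds the greeting shown at the top of a user's page.
 *
 * @param string $name
 *   The display name of the user.
 * @param int $unread_count
 *   The number of unread messages the user has.
 *
 * @return string
 *   The greeting text.
 */
function my_module_greeting($name, $unread_count) {
  return "Hello $name, you have $unread_count unread messages.";
}
```

The first line is a one-sentence summary, followed by one entry per parameter and one for the return value.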

Note that for functions that implement hooks or override theme functions there’s no need to duplicate all of the documentation that’s already in core. You can simply put “Implementation of hook_foo().” or “Override of theme_bar().”.
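For a hook implementation the whole comment block can shrink to one line. In this sketch the module name, the path `my-module/report`, and the callback `my_module_report_page` are made up; only `hook_menu()` and the array keys come from Drupal.

```php
<?php
/**
 * Implementation of hook_menu().
 */
function my_module_menu() {
  $items = array();
  // Hypothetical path and page callback.
  $items['my-module/report'] = array(
    'title' => 'Report',
    'page callback' => 'my_module_report_page',
    'access arguments' => array('access content'),
  );
  return $items;
}
```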

The second kind of comment is the in-line comment. I think that you should have an in-line comment roughly every 10 to 15 lines. So a function might look something like:
<?php
  ... ingredients as $ingredient) {

  }

  // Mix the ingredients.

  // Cook the ingredients.

  return $cooked_dish;

}
?>
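Filled in as a complete, runnable sketch, the same pattern might look like this. The function `cook_recipe()` and the layout of the `$recipe` array are assumptions invented for the example.

```php
<?php
/**
 * Turns a recipe into a description of the finished dish (hypothetical).
 */
function cook_recipe(array $recipe) {
  // Gather the ingredients, skipping any that are out of stock.
  $gathered = array();
  foreach ($recipe['ingredients'] as $ingredient) {
    if ($ingredient['in_stock']) {
      $gathered[] = $ingredient['name'];
    }
  }

  // Mix the ingredients.
  $mixture = implode(' and ', $gathered);

  // Cook the ingredients.
  $cooked_dish = 'baked ' . $mixture;

  return $cooked_dish;
}

// Made-up sample data.
$recipe = array(
  'ingredients' => array(
    array('name' => 'flour', 'in_stock' => TRUE),
    array('name' => 'saffron', 'in_stock' => FALSE),
    array('name' => 'butter', 'in_stock' => TRUE),
  ),
);
$cooked_dish = cook_recipe($recipe);
```

Reading only the three comments already tells you the outline of the function; the code underneath fills in the details.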

Document Edge Cases

If you do something strange like:
<?php
if (... > 2) {

}
else {

}
?>
Give us a few words explaining what’s going on: why do you need to do things differently if there are more than two widgets?
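A sketch of what such an explanation could look like. The function `render_widgets()` and the reason given in the comment are both invented; the point is only that the comment answers the “why”.

```php
<?php
/**
 * Renders a list of widget names as text (hypothetical).
 */
function render_widgets(array $widgets) {
  // Made-up reason: the sidebar only has room for two widgets side by
  // side, so three or more fall back to a stacked list.
  if (count($widgets) > 2) {
    return implode("\n", $widgets);
  }
  else {
    return implode(' | ', $widgets);
  }
}
```

Without the comment, the number 2 looks arbitrary; with it, the next developer knows what to check before changing it.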

Writing self-documenting code may slow you down a bit: you have to type a bit more, and you have to think things through a bit more. But I think the benefits of being able to understand things later far outweigh the few percentage points of extra time. You’ll have a better product in the long run.

* Einstein