Thursday, 27 July 2006

Programming idioms #2: Design by contract

This is the second in my "Programming idioms" series, the first (on mixin classes) can be found here.

Design by contract (DBC) is a practical implementation of Floyd-Hoare logic which is intended to eliminate certain classes of bugs from imperative (i.e. procedural and OO) languages. One of the nice things about DBC is that although it's firmly rooted in the theoretical world, for the "working" programmer it's really very practical. So, no need to learn all about tedious categories and arrows and other such weirdness.

In DBC (and Floyd-Hoare logic) each block of code can have associated with it pre- and post-conditions. Pre-conditions are predicates (relations) which must hold true directly before the block is executed at run-time and post-conditions must hold true directly after. In DBC the runtime system running your program will barf if a pre-condition or post-condition is not met. You may also have invariants, which must always be true. For example:

// Pre: True

x = 43

// Post: x == 43


class BankAccount:

    # Inv: balance > -overdraft_limit

    self.balance = ...


    # Pre: w > 0

    # Post: balance == balance' - w

    def withdraw(w):

        self.balance = self.balance - w

DBC gotchas

As I'll comment on in the next section, DBC conditions and invariants are often annotations in comments. Many systems exist to implement assertion checking for DBC in languages that were designed without it. However, it's worth watching out for a few idioms which suck and you'll want to avoid.

Firstly, DBC should really be used to document and check for semantic conditions which are *not* documented and caught elsewhere. I've seen the following idiom in student code and (shockingly) in lecture notes:

/* Pre: a is a String (VERY bad) */

/* Pre: a isinstanceof String (Better but still dumb) */

public void foobar(String a) { ... }

The problem with this code is that the declaration "String a" already documents that a is a String and provides run time checking of that assertion. Notating the same thing in a comment is just asking the programmer to read the same thing twice and introduces an opportunity for error (what if you change the type declaration and forget to change the comment?). The second comment is better than the first because it's more precise (think: what does 'is a' mean?) and could be automatically checked if a suitable system were available.

Pedagogically this anti-pattern is very damaging. Whenever I've seen students taught this they end up commenting every since ****ing type declaration and never manage to use DBC for any other sort of semantic condition.

Practical implementations

Bertrand Meyer designed the language Eiffel which has DBC built in, although many other languages have pre-processor and similar systems which add DBC onto non-DBC languages.

Since Eiffel never really took-off in the way that Java and Python have, most of the time the only place you'll see DBC is in comments (as above) and assert statements.  It's a shame, because DBC enforces a useful discipline about state management, it provides extremely precise documentation and it adds a relatively small runtime cost (of condition and invariant checking).

However, some of the power of DBC can be added to languages in other ways, and it's worth considering how this might work itself out in production languages of the future. Tim Sheard has a nice concept he calls Generalized Algebraic Data Structures, which are basically type definitions with extra semantic information. So, while in DBC you might declare an invariant that says "an 'age' must be an integer over zero" in GADS you would just create a new int type which only ranges over positive integers and let the type checker (rather than an assertion checker) barf if you try to store a zero or negative as an age.

Blogged with Flock

No comments:

Post a Comment