Practical Invariants
A guide to using invariants in everyday programming. —
For many programmers, the word invariant will surface memories of their algorithms class that were once carefully buried. The tome Introduction to Algorithms by Cormen et al., ubiquitous in algorithms courses, introduces the concept early and is used nearly thirty times in the book in dry, formal proofs of various algorithms.1 It’s unfortunate that this is often to be the first—and equally as often the last—formal introduction to invariants that most programmers get, because structuring code around them makes the code easier to plan, build, and test.
Put broadly, an invariant is an assertion that is always true about the state of a program at a certain point.
A good invariant is like the thesis statement of your code. Like a good thesis, a good invariant should:
- Have its purpose encoded in the statement; it will be self-evident.
- Convey the code’s point of view or approach to the problem.
- Be specific about the state of all the data.
- Inform the reader of the high-level considerations without diving into unnecessary detail.
Invariants are a roadmap
One of the most common problems I see when conducting coding interviews is that interviewees often jump right into coding without their thesis. You should always have your invariant (or at least a draft of it) before building anything even moderately complex.
Here’s an invariant from the Xinu operating system’s memory manager:
All free blocks of memory are kept on a linked list; blocks on the free list are ordered by increasing address.2
It would be fairly easy to write malloc
and free
3 functions using that invariant as a roadmap.
The invariant could be made slightly better if it indicated how the list was terminated because it could be used as the deciding factor about how to handle edge-cases.
Writing invariants
Why did the memory manager get an invariant, but not the malloc
function or the OS?
Invariants are best used to describe successive changes to interdependent state.
The memory manager performs repeated malloc
and free
operations to an
explicitly stated list, and some other implicit structure that holds blocks of
memory when it’s not on the list.
Before trying to write out the invariant for your module, you should know:
- What data the module is going to contain.
- What data-structures you’ll be using.
- How states and failure conditions will be represented in the data.
Once you have those, try writing an invariant.
For example, imagine you were responsible for writing a module that loaded plugins. The module would probably have a list of plugins to load, some state about whether they’ve been initialized or not, and some functions to operate as hooks. A good invariant would clarify when it was valid to call all of these states, and how interactions between them would work. It might look something like this:
Plugins are registered, initialized all at once, then hooks can be called; when a hook is called each initialized plugin is executed with a copy of the event in non-deterministic order.
Testing with invariants
A well structured invariant will help you to test new functions in a consistent way. While testing isn’t as strong as formal proofs, it’s as close as most practicing software developers need to get.
Invariant tests can be written as helpers that are used in other tests as a form of correctness check.
Below is an example of how you might write and test the memory manager from Xinu if it were written in go
:
func testMMUInvariant(t *testing.T, mmu *MMU) {
// check that list is in ascending order
// check that blocks don't overlap
// check that there are no adjacent blocks
}
func TestMalloc(t *testing.T) {
mmu := MockMMU()
// set up mock
testMMUInvariant(t, mmu) // assert the mock is valid
// unit test for malloc
testMMUInvariant(t, mmu) // assert malloc left the system in a good condition
}
The invariant test helper is used to validate both that the test input
is valid and that each call to malloc
leaves the system in a known good state.
Extracting the invariant into its own helper allows the invariant to be updated in the future. It also eliminates lazy testing because it’s easier to test the full invariant than to rewrite just portions of it for each function’s unit-test. Pretesting mocks ensures that your tests won’t degrade over time as code gets refactored and the mocks become less like reality. They should help find weak mocks that could cause false-positive tests after assumptions change.
Additionally, these test helpers can open up your code to new kinds of testing.
It would be straightforward to write a poor-man’s fuzzer that randomly cycles through ten-thousand malloc
and free
calls.
Finding a way to test the state at each call would be nightmarish without tightly coupling the testing logic to the implementation.
A “good enough” solution would be to test the state at the beginning and end along with the invariant at each step.
Conclusion
Invariants don’t have to be the rigid pedantic structure you learned in algorithms class. In fact, you’ve probably been using some variant of them already to guide your thinking, structure your code, and drive your tests even if you didn’t have a concrete concept around it. Next time you approach a problem with complex state and interactions, try explicitly stating an invariant to see if it’ll give the problem some clarity.
T. H. et al Cormen, Introduction to algorithms. Cambridge, MA: MIT Press, 2003. ↩︎
D. Comer, Operating System Design, 2nd ed. Boca Raton: CRC Press, 2015. ↩︎
The Xinu OS provides four different functions to allocate and free stack and heap space, for clarity they’re replaced here with the commonly known C style versions. ↩︎