Friday, March 30, 2018

TDD: Tales of the Fischer King

Chess960 is a variant of chess introduced by Bobby Fischer in the mid-1990s. The key idea is that each player's pieces, rather than being assigned to fixed initial positions on the home rank, are instead randomized, subject to the following constraints:
  • The bishops shall be placed on squares of opposite colors.
  • The king shall be positioned between the two rooks.
There are 960 different positions that satisfy these constraints.
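
The count is small enough to confirm by brute force. Here is a minimal Python sketch that enumerates every ordering of the eight pieces and keeps the ones that satisfy both constraints. The names is_legal and legal_positions are my own illustration, not something from the original exercise.

```python
from itertools import permutations


def is_legal(rank):
    """True if the rank satisfies both Chess960 constraints."""
    bishops = [i for i, piece in enumerate(rank) if piece == "B"]
    rooks = [i for i, piece in enumerate(rank) if piece == "R"]
    king = rank.index("K")
    # Adjacent files alternate in color, so opposite colors means opposite parity.
    opposite_colors = bishops[0] % 2 != bishops[1] % 2
    king_between_rooks = rooks[0] < king < rooks[1]
    return opposite_colors and king_between_rooks


# The set removes duplicates caused by swapping identical pieces.
legal_positions = sorted(
    {"".join(p) for p in permutations("RNBQKBNR") if is_legal(p)}
)
print(len(legal_positions))  # 960
```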

In December of 2002, Peter Seibel proposed:
Here's an interesting little problem that I tried to tackle with TDD as an exercise....  Write a function (method, procedure, whatever) that returns a randomly generated legal arrangement of the eight white pieces.
Looking back at the discussion, what surprises me is that we were really close to a number of ideas that would become popular later:
  • We discuss property based tests (without really discovering them)
  • We discuss mocking the random number generator (without really discovering that)
  • We get really very close to the idea that random numbers, like time, are inputs
I decided to retry the exercise this year, to explore some of the ideas that I have been playing with of late.
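
To make the "random numbers are inputs" idea concrete, here is a rough Python sketch in which the generator receives its source of randomness as an argument. That makes a property-style check over many seeds straightforward. The function names and the rejection-sampling approach are assumptions of mine for illustration; they are not what the 2002 discussion or my later implementation used.

```python
import random


def is_legal(rank):
    """True if bishops sit on opposite colors and the king is between the rooks."""
    bishops = [i for i, piece in enumerate(rank) if piece == "B"]
    rooks = [i for i, piece in enumerate(rank) if piece == "R"]
    king = rank.index("K")
    return bishops[0] % 2 != bishops[1] % 2 and rooks[0] < king < rooks[1]


def random_arrangement(rng):
    """Shuffle until legal. `rng` is an input: any object with a shuffle() method."""
    pieces = list("RNBQKBNR")
    while True:
        rng.shuffle(pieces)
        if is_legal(pieces):
            return "".join(pieces)


# Property-style check: for every seed tried, the output is a legal arrangement.
for seed in range(200):
    rank = random_arrangement(random.Random(seed))
    assert sorted(rank) == sorted("RNBQKBNR")
    assert is_legal(rank)
```

Because the random source is a parameter, the same seed always yields the same arrangement, so nothing needs to be mocked.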

Back in the day, Red Green Refactor was understood as a cycle; but these days, I tend to think of it as a protocol.  Extending an API is not the same flavor of Red as putting a new constraint on the system under test; the Green in the test calibration phase is not the same as the Green after refactoring; and so on.

I tend to lean more toward modules and functions than I did back then; our automated check configures a specification that the system under test needs to satisfy.  We tend to separate out the different parts so that they can be run in parallel, but fundamentally it is one function that accepts a module as an argument and returns a TestResult.

An important idea is that the system under test can be passed to the check as an argument.  I don't feel like I have the whole shape of that yet, but the rough idea is that once you are inside the check, you are only talking to an API.  In other words, the checks should be reusable to test multiple implementations of the contract.  Of course, you have to avoid a naming collision between the check and the test.
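
A rough Python sketch of that shape follows: one check function that takes the system under test as an argument and returns a TestResult. Everything here is an assumption for illustration, including the arrangement(seed) API, the check_chess960 name, and the two stand-in implementations; the code from the exercise itself looks different.

```python
import random
from dataclasses import dataclass, field
from types import SimpleNamespace


@dataclass
class TestResult:
    """Collects failure messages; an empty list means the check passed."""
    failures: list = field(default_factory=list)

    @property
    def passed(self):
        return not self.failures


def check_chess960(sut, seeds=range(50)):
    """Run the specification against any object exposing arrangement(seed)."""
    result = TestResult()
    for seed in seeds:
        rank = sut.arrangement(seed)
        if sorted(rank) != sorted("RNBQKBNR"):
            result.failures.append(f"seed {seed}: wrong pieces in {rank!r}")
            continue
        bishops = [i for i, piece in enumerate(rank) if piece == "B"]
        rooks = [i for i, piece in enumerate(rank) if piece == "R"]
        if bishops[0] % 2 == bishops[1] % 2:
            result.failures.append(f"seed {seed}: bishops share a color in {rank!r}")
        if not rooks[0] < rank.index("K") < rooks[1]:
            result.failures.append(f"seed {seed}: king not between rooks in {rank!r}")
    return result


# Two hypothetical implementations of the same contract.
KNOWN_LEGAL = ["RNBQKBNR", "BBQNNRKR"]
stand_in = SimpleNamespace(
    arrangement=lambda seed: random.Random(seed).choice(KNOWN_LEGAL)
)
broken = SimpleNamespace(arrangement=lambda seed: "KRRNBBQN")

print(check_chess960(stand_in).passed)  # True
print(check_chess960(broken).passed)    # False
```

The check never looks behind the API, so it applies unchanged to either implementation.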

One consequence of that approach is that the test suite can serve the role of an acceptance test if you decide to throw away the existing implementation and start from scratch.  You'll need a new set of scaffolding tests for the new implementation to drive the dynamo.  Deleting tests that are overfitting a particular implementation is a good idea anyway.

One of the traps I fell into in this particular iteration of the experiment: using the ordering bishops-queens-knights is much easier to work with than attacking the rook-king-rook problem.  I decided to push through it this time, but I didn't feel like it progressed nearly as smoothly as it did the first time through.
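
For comparison, here is a sketch of why the bishops-queen-knights ordering is the easy road: once those five pieces are down, the three remaining squares can only be filled rook, king, rook, so that constraint takes care of itself. This is an illustration in Python under my own naming, not the code from either run of the exercise. It takes the "random" choice as a plain integer input in the range 0 to 959.

```python
import random
from itertools import combinations

# The 10 ways to pick 2 of the 5 squares left after bishops and queen.
KNIGHT_PAIRS = list(combinations(range(5), 2))


def free_squares(rank):
    """Indexes of empty squares, in ascending order."""
    return [i for i, piece in enumerate(rank) if piece is None]


def arrangement(n):
    """Map an integer in range(960) to a legal home rank."""
    if not 0 <= n < 960:
        raise ValueError("n must be in range(960)")
    n, odd_bishop = divmod(n, 4)   # 4 choices among odd-indexed squares
    n, even_bishop = divmod(n, 4)  # 4 choices among even-indexed squares
    n, queen = divmod(n, 6)        # 6 free squares remain; n is now 0..9
    rank = [None] * 8
    rank[2 * odd_bishop + 1] = "B"
    rank[2 * even_bishop] = "B"
    rank[free_squares(rank)[queen]] = "Q"
    free = free_squares(rank)
    for k in KNIGHT_PAIRS[n]:
        rank[free[k]] = "N"
    # Three squares remain; filling them left to right puts the king between the rooks.
    for square, piece in zip(free_squares(rank), "RKR"):
        rank[square] = piece
    return "".join(rank)


def random_arrangement(rng):
    """Thin shell: the only place randomness enters."""
    return arrangement(rng.randrange(960))


print(len({arrangement(n) for n in range(960)}))  # 960
```

Each of the 960 inputs produces a distinct arrangement, so a uniform choice of the integer gives a uniform choice of position.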

There's hidden duplication in the final iteration of the problem; the strategies that you use for placing the pieces are tightly coupled to the home row.  In this exercise, I didn't even go so far as encapsulating the strategies.

Where's the domain model?  One issue with writing test first is that you are typically crossing the boundary between the tests and the implementation; primitives are the simplest thing that could possibly work, as far as the interface is concerned.

Rediscovering that "the simplest thing" was originally a remedy for writer's block was a big help in this exercise.  If you look closely at the commits, you'll see a big gap in the dates as I looked for a natural way to implement the code that I already had in mind.  A more honest exercise might have just wandered off the rails at that point.

I deliberately put down my pencil before trying to address the duplication in either the test or the implementation.  J. B. Rainsberger talks about the power of the TDD Dynamo; but Sandi Metz rightly points out that keeping things simple leaves you extremely well placed to make the next change easy.

During the exercise, I discovered that writing the check is a test, in the sense described by James Bach and Michael Bolton.
Testing is the process of evaluating a product by learning about it through exploration and experimentation
This is precisely what we are doing during the "Red" phase; we are exploring our candidate API, trying to evaluate how nice it is to work with.  The implementation of the checks can later stand in as an example of how to consume the API.

The code artifacts from this exercise, and the running diary, are available on GitHub.


