Thursday, January 4, 2018

Property based testing: Mars Rover

My go to TDD exercise has been the mars rover kata.  Trying a property test first approach to it has been a struggle.  Compared with an example driven approach, there seems to be a whole lot of front loaded pain without an obvious payoff.

I reached out to Paul Holser to see if I was missing a bet, and of course the discussion turned to what sorts of properties I had discovered in the design.

First property: for all inputs, the output is an ordered collection of representations of rover positions.  If we are checking within the domain boundary, this is pretty trivial -- we can use the type system to enforce this constraint.  If we are checking from outside of the boundary, then we're checking that the lines of text/bytes that have been returned to us can be parsed; three entries per represe, the first two entries are coordinates/integers, the last entry is a member of the set of cardinal directions.

Second property: for all inputs, the number of representations in the output equals the number of rovers described by the input.  Rovers are conserved.

Third property: for any inputs, you get the same output each time you run it.

Fourth property: for any pair of inputs, reversing the order of execution does not change the outputs.


In other words, its a pure function.  I don't actually know the battery of tests that you should run to establish this fact.

A rover with an empty instruction set remains at the same coordinates in the same orientation.  This gives us our fifth property: for all inputs, if we replace a single rover's instructions with an empty instruction set (thereby creating a different input), the corresponding output will have the same position.  Note that this indirectly demonstrates that the order of the outputs matches the order of the inputs.

This property would not apply if the rovers moved other than by consuming their own instructions. For example, if one rover were to push another rover out of the way, then the property would not hold.

The above identity leads to two additional properties, both variations of different paths, same destination.

The sixth property takes advantage of the identity to allow us to establish that the behavior of the rovers is consistent with their programs being executed in the right order.  So for all inputs, we can select any rover, and create a new program in the following way: the positions an instructions of the rovers before the selection in the list are copied directly, the positions of the remaining rovers are copied with empty instructions.  We get the corresponding output, and then build a second program -- the output positions are taken as inputs, with empty instruction sets assigned to the rovers before the selected rover, and the original instruction sets assigned to the remainder.  The property to be checked is that the output of this second program matches the original that we started with.

The seventh property is similar; instead of breaking up the original program along neat rover boundaries, we can also split the selected rover's instructions into two pieces, running part of the instruction set with the rovers ahead of it, then running the remaining instructions with the subsequent rovers.

Both of these are basically establishing that we're dealing with a stateless process; the next collection of positions depends only on the current collection of positions and the next instruction.

Up until this point, we haven't really been looking at the semantics of the output at all; we've got a list of coordinates and orientations, but all we've done is check for equality.

Putting this another way, the simplest thing that could possibly work is still to ignore the instruction sets entirely, and simply return the original positions and orientations in each case.

The plateau is rectangular, and presumably the boundary effects are equivalent on all sides.  If you flip the rectangle upside down -- exchanging north for south and east for west -- you get the same configuration of rovers in a different coordinate system.   Right and Left have the same effect on orientation that they did before the flip.  Any interactions with the boundary will still appear at the same points in the evaluations of the rovers.

This yields our eighth property: for any input, if we flip the input position, compute the output, and then flip the output, we should get the same answer that we would get from running the original output as is.

There is a similar result for quarter rotations, if you adjust the input so that the plateau itself is square.

Rotations are mostly immune to boundary effects and collisions, so it's probably reasonable to start there.

One assumption that we probably need to make is that all valid inputs have rovers that start within the region of the specified plateau.  The problem statement doesn't address that point.  I don't think property based testing helps much here - running the checks can only tell you if the test and implementation made compatible assumptions, not correct assumptions.

Property nine: for all inputs, we can select any rover and replace it's original instruction set with one that has all of the move instructions removed.  When we run this program:
  • The coordinates of the output should match those of the input
  • The orientation of the output will match that of the input if, and only if, the difference in the count of left instructions and the count of right instructions is a multiple of four. 
Finally, we have the possibility of an input that forces us to change the position of the rover!

This establishes that we've got four-symmetry, that left and right cancel each other out, and that rotations preserve position.

Property ten: for all inputs, we can select any rover and replace it's original instruction set with a one that elides all of the left and right instructions.  When we run this program, the orientation of the selected rover will be unchanged.

EDIT: come to think of it, any single instruction leaves at least two of the three properties unchanged; the remaining property changes by at most one unit.  The complication of the move instruction is that you need the orientation to know which property is changing in which direction.

We can introduce into this domain the concept of a displacement - we're in a taxicab geometry.  The number of moves in an instruction set establishes an upper bound on the displacement that can be achieved; we know that the rover will be found somewhere in the circle.

Property eleven: for all inputs, we can measure the displacement of the output coordinates from the input coordinates.  For each rover, the displacement will be less than or equal to the count of moves in its instruction set.

Furthermore, if the circle doesn't reach the boundary of the plateau, then we don't need to worry about boundary effects for that rover.  Taking the initial positions of the rovers, and the size of the bounding circles allows us to compute a bounding box; if the box doesn't reach the plateau boundaries, then the simulation is completely free of boundary effects.

If the plateau is large enough to enclose the bounding box at two different locations, then we can establish twelfth property - that of translation symmetry in the following way.  Given any valid input, we can compute the extents of a bounding box that encloses the possible programs of all rovers.  Choose a displacement in positive x and y.  Compute the dimensions of a plateau by adding the displacement to the extent of the bounding box.  Create two inputs using this plateau: for the first, compute the initial positions of the rovers such that the lower left corner of the bounding box is at the origin; for the second, compute initial positions of the rovers such that the upper right corner of the bounding box is at the upper right corner of the plateau.  Note that the relative displacment of the input coordinates of the corresponding rovers should be the same.  Compute outputs for both programs; the displacements of the corresponding outputs should all match.

The thirteenth property also plays with bounding boxes to establish this property - that if the bounding circles of two rovers don't overlap, then they don't have any interference effects.  The check goes something like this: first you run a single program with all of the rovers participating.  Then you run a program that isolates each rover on the same plateau, and confirm that the final positions of the isolated rovers matches that predicted by the single program containing all of the rovers.

It's not clear how you transform input with overlapping bounding circles into one that doesn't have them.  Do you move the rovers? shrink the bounding circles? skip examples that don't match the criteria?

But in the main, it seems to be a very powerful pattern to use an input as a seed from which you can compute an input that has the properties you want to verify -- using an identity transformation wherever possible.

It's not so clear to me that these constraints are driving the design in any useful way.  For example, we don't have any properties that establish that left and right are correctly oriented.  We don't have any tests that actually demonstrate that the rover has moved.





No comments:

Post a Comment