Thursday, November 26, 2020

TDD: Controlled Data

I have been paying more attention to parsing of late, and that in turn led me to some new understandings of TDD.

Back in the day when I was first learning about RED/GREEN/REFACTOR, the common approach to attacking a problem was to think about what the design might look like at the end, choose a candidate class from that design, and see what could be done to get it working.

Frequently, people would begin from a leaf class, expecting to build their way up, and that seemed wrong to me.  The practical flaw was the amount of work they would need to do before they could finally point to something that delivered value.  The theoretical flaw was this: I already knew how to guess what the design should be.  What I wanted, and what I thought I was being offered, was emergent design -- or failing that, experiments that would honestly test the proposition that a sound design could emerge from tests.

The approach I preferred would start from the boundary; let's create a sample client, and using that requirement begin enumerating the different behaviors we expect, and discover all of the richness underneath by rearranging the implementation until it is well aligned with our design sense.

Of course, we're still starting with a guess, but it's a small guess -- we know that if our code is going to be useful there must be a way to talk to it.  Bootstrapping can be a challenge -- what does the interface to our code look like?

And in group exercises, I've had a fair amount of success with this simple heuristic: choose something that's easy to type.

Let's take a look at the bowling game, in the original Klingon.  These days, I most commonly see the bowling game exercise presented as a series of drills (practice these tests and refactorings until it becomes second nature); but in an early incarnation it was a re-enactment of a pair programming episode.

Reviewing their description, two things stand out.  First, that they had initially guessed at an interface with extra data types, but rejected it when they realized the code that they "wanted to type" would not compile.  And second, that they deferred the question of what the implementation should do with an input that doesn't belong to the domain:

"tests in the rest of the system will catch an invalid argument."

I want to take a moment to frame that idea using a different language.  Bertrand Meyer, in Object-oriented Software Construction, begins with a discussion of Correctness and Robustness.  What I see here is that Koss and Martin chose to defer Robustness concerns in order to concentrate on Correctness.  For the implementation that they are currently writing, properly sanitized inputs are a pre-condition imposed on the client code.

But when you are out near an unsanitized boundary, you aren't guaranteed that all inputs can be mapped to sanitized inputs.  If your solution is going to be robust, then you are going to need graceful affordances for handling abnormal inputs.

For an implementation that can assume sanitized inputs, you can measure the code under test with a relatively straightforward client like
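Something like this sketch, say -- here in Python, with `BowlingGame`, `roll`, and `score` as my guesses at the names, and a minimal implementation included so the example stands alone:

```python
class BowlingGame:
    """Assumes pre-sanitized rolls: integers 0..10, in a legal sequence."""

    def __init__(self):
        self.rolls = []

    def roll(self, pins):
        self.rolls.append(pins)

    def score(self):
        total, i = 0, 0
        for _ in range(10):                   # ten frames
            if self.rolls[i] == 10:           # strike: next two rolls as bonus
                total += 10 + self.rolls[i + 1] + self.rolls[i + 2]
                i += 1
            elif self.rolls[i] + self.rolls[i + 1] == 10:  # spare
                total += 10 + self.rolls[i + 2]
                i += 2
            else:                             # open frame
                total += self.rolls[i] + self.rolls[i + 1]
                i += 2
        return total


# the relatively straightforward client: twenty-one throws of five pins
# (all spares, plus the bonus throw) should score 150
game = BowlingGame()
for pins in [5] * 21:
    game.roll(pins)
assert game.score() == 150
```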


But near the boundary?
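Perhaps something like this sketch -- the guard clause and the choice of `ValueError` are my assumptions about how the boundary might reject bad input:

```python
class BowlingGame:
    """A boundary-guarding variant: inputs outside the domain are rejected
    by raising an unchecked exception."""

    def __init__(self):
        self.rolls = []

    def roll(self, pins):
        if not isinstance(pins, int) or not 0 <= pins <= 10:
            raise ValueError(f"not a throw: {pins!r}")
        self.rolls.append(pins)


# the example we can write: it pins down that bad input raises,
# but demonstrates nothing about handling the exception gracefully
game = BowlingGame()
try:
    game.roll(11)          # not a valid number of pins
except ValueError:
    pass                   # "expected" -- and that's the whole demonstration
```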


I don't find that this produces a satisfactory usage example.  Even if we were to accept that throwing an unchecked exception is a satisfactory response to an invalid input, this example doesn't demonstrate graceful handling of the thrown exception.

Let's look at this first example again.  What is the example really showing us?

I would argue that what this first example is showing us is that we can create a bowling score from a sanitized sequence of inputs.  It's a recipe, that requires a single external ingredient.

Can we do the same with our unsanitized inputs? Yes, if we allow ourselves to be a little bit flexible in the face of ambiguity.

A parser is just a function that consumes less-structured input and produces more-structured output. -- Alexis King

I want to offer a definition that is consistent in spirit, but with a slightly different spelling: a parser is just a function that consumes less-structured input and produces ambiguous more-structured output.

When we parse our unsanitized sequence of throws, what gets produced is either a sanitized sequence of throws or an abnormal sequence of throws, but without extra constraints on the input sequence we don't necessarily know which.

In effect, we have a container with ambiguous, but sanitized, contents.
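A sketch of what such a parser might look like, with a tagged tuple standing in for a proper sum type (the tag names are my invention, and this version only sanitizes individual throws, not frame structure):

```python
def parse_throws(unsanitized):
    """Consume less-structured input; produce ambiguous more-structured
    output: either a sanitized sequence of throws, or an abnormal one."""
    throws = []
    for raw in unsanitized:
        if isinstance(raw, int) and 0 <= raw <= 10:
            throws.append(raw)
        else:
            return ("abnormal", raw)      # first input outside the domain
    return ("sanitized", throws)


assert parse_throws([5, 5, 5]) == ("sanitized", [5, 5, 5])
assert parse_throws([5, 11]) == ("abnormal", 11)
```

In a language with real sum types you would reach for `Either`/`Result` here; the point is only that the caller receives a container whose contents are sanitized but whose tag is ambiguous until inspected.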

That ambiguity is part of the essential complexity of the problem when we have to include sanitizing our inputs.  And therefore it follows that we should be expressing that idea explicitly in our design.

We don't have to guess all of the complexity at once, because we can start out limiting our focus to those controlled inputs that should always produce a bowling score; all of the other cases that we haven't considered yet can be lumped into an "unknown" category.  That's safe, because a correct implementation must not take the unknown code path when provided with pre-sanitized inputs.

When we later replace this two-alternative parser with one that produces more alternatives -- that's just more dead branches of code that can again be expressed as unknown.

In the simple version of the bowling game exercise, we need three ingredients:

  • our unsanitized input
  • a transform to use when we terminate the example with a valid score
  • a transform to use when we terminate the example with abnormal inputs

So we can immediately reach for whatever our favorite pattern for transforming different results might be.

Following these ideas, you can spin up something like this on your first pass:
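For instance, this sketch wires the three ingredients together in continuation-passing style -- `bowling_score`, the tag names, and the two transforms are all my guesses at a shape, not a definitive interface:

```python
def parse_throws(unsanitized):
    # two-alternative parser: sanitized throws, or everything else
    # lumped into "unknown"
    throws = []
    for raw in unsanitized:
        if isinstance(raw, int) and 0 <= raw <= 10:
            throws.append(raw)
        else:
            return ("unknown", unsanitized)
    return ("sanitized", throws)


def score_sanitized(throws):
    # standard ten-frame scoring; assumes a legal, sanitized sequence
    total, i = 0, 0
    for _ in range(10):
        if throws[i] == 10:                       # strike
            total += 10 + throws[i + 1] + throws[i + 2]
            i += 1
        elif throws[i] + throws[i + 1] == 10:     # spare
            total += 10 + throws[i + 2]
            i += 2
        else:                                     # open frame
            total += throws[i] + throws[i + 1]
            i += 2
    return total


def bowling_score(unsanitized, on_score, on_abnormal):
    """Three ingredients: the unsanitized input, and a transform for
    each way the example can terminate."""
    tag, payload = parse_throws(unsanitized)
    if tag == "sanitized":
        return on_score(score_sanitized(payload))
    return on_abnormal(payload)


assert bowling_score([5] * 21, str, repr) == "150"
assert bowling_score([5, "gutter"], str, lambda bad: "abnormal") == "abnormal"
```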


And now that everything is compiling, you can dig into the RED/GREEN/REFACTOR cycle and start exploring possible designs.

Now, I've palmed a card here, and I want to show it because it is all part of the same lesson.  The interface provided here embraces the complexity of unsanitized data, but it drops the ball on timing -- I built into this interface an assumption that the unsanitized data all arrives at the same time.  If we are designing for a context where unsanitized throws arrive one at a time, then our examples may need to show how we explicitly handle memory; and we may have to make good guesses about whether we need to announce the abnormal input at once, or whether we can let it wait until the score method is invoked.
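One sketch of the throws-arrive-one-at-a-time shape, choosing to defer the announcement until score is asked for -- again, the names are my guesses, and the frame scoring is elided in favor of a simple pin count:

```python
class UnsanitizedGame:
    """Accumulates raw throws as they arrive; defers the
    sanitized-vs-abnormal decision until score() is invoked."""

    def __init__(self):
        self.raw = []          # the memory we now have to handle explicitly

    def roll(self, raw_throw):
        self.raw.append(raw_throw)    # never rejects; just remembers

    def score(self, on_score, on_abnormal):
        if all(isinstance(t, int) and 0 <= t <= 10 for t in self.raw):
            # pin count as a stand-in for real frame scoring in this sketch
            return on_score(sum(self.raw))
        return on_abnormal(self.raw)
```

The alternative design -- announcing abnormal input the moment `roll` sees it -- pushes the ambiguity back to each call site; which guess is right depends on the context we are designing for.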

The good news: often we have oversimplified because our attention was on the most common case; so in the case that we discover our errors late, there can be a lot of re-use in our implementation.

The bad news: if our clients are oversimplified, then the implementations that are using our clients are going to need rewrites.

If we're aware, we can still tackle the problem in slices -- first releasing a client that delivers business value in the simplest possible context, and then expanding our market share with new clients that deliver that value to other contexts.