Wednesday, August 25, 2021

TDD: Thinking in Top Down

We did so by writing high level tests that were checking special patterns (gliders…). That was a nightmare! It felt like trying to reverse engineer the rules of the game from real use cases. It did not bring us anywhere.  -- Philippe Bourgau

I don't have this problem when I work top down.

The reason that I don't have this problem is that the "rules of the game" are sitting right there in the requirements, so I don't have to reverse engineer them from a corpus of examples.

What I do have to do is show the work.  That happens during the refactoring phase, where I explicitly demonstrate how we arrive at the answer from the parameters of the question.

For a problem like the glider, the design might evolve along these lines: why is this cell alive?  Because its live-neighbor count in the previous generation was three.  Where did that three come from?  Well, it's a sum: we count one for each neighbor that is alive and zero for each that is dead, across all eight neighbors.  Where do the neighbors come from?  We identify them by making eight separate offset calculations from the coordinates of the target cell.  And so on.
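
To make that chain of questions concrete, here is a rough Java sketch of where it might land; the names (Cell, Rules, aliveInNextGeneration, liveNeighborCount, NEIGHBOR_OFFSETS) are invented for the sketch, not taken from any particular codebase.

    import java.util.Set;

    record Cell(int row, int column) {}

    class Rules {
        static final int[][] NEIGHBOR_OFFSETS = {
            {-1, -1}, {-1, 0}, {-1, 1},
            { 0, -1},          { 0, 1},
            { 1, -1}, { 1, 0}, { 1, 1}
        };

        // Why is this cell alive?  Its live-neighbor count in the previous
        // generation was three (or two, if it was already alive).
        boolean aliveInNextGeneration(Set<Cell> previousGeneration, Cell cell) {
            int liveNeighbors = liveNeighborCount(previousGeneration, cell);
            return liveNeighbors == 3
                    || (liveNeighbors == 2 && previousGeneration.contains(cell));
        }

        // Where did that three come from?  It's a sum: one for each live
        // neighbor, zero for each dead one, across the eight neighbors.
        int liveNeighborCount(Set<Cell> previousGeneration, Cell cell) {
            int count = 0;
            for (int[] offset : NEIGHBOR_OFFSETS) {
                Cell neighbor = new Cell(cell.row() + offset[0], cell.column() + offset[1]);
                if (previousGeneration.contains(neighbor)) {
                    count += 1;
                }
            }
            return count;
        }
    }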

Sometimes, I imagine this as adding a comment to the hard-coded answer, explaining why that answer is correct, and then introducing that same calculation in the code so that the comment is no longer necessary.
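
In miniature, and again with names invented for the sketch, that intermediate state might look like this; the comment carries the reasoning only until the calculation replaces it:

    // Before: the hard-coded answer passes the test, and the comment says why it is correct.
    boolean aliveInNextGeneration(Set<Cell> previousGeneration, Cell cell) {
        // Alive because exactly three of its eight neighbors were alive in the previous generation.
        return true;
    }

    // After: the comment's reasoning has become code, so the comment can go.
    // (Still partial: the two-live-neighbor survival rule arrives in a later step.)
    boolean aliveInNextGeneration(Set<Cell> previousGeneration, Cell cell) {
        return liveNeighborCount(previousGeneration, cell) == 3;
    }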

Paraphrasing Ward Cunningham, our goal is to produce code aligned with what we understand to be the proper way to think about the problem.  Because we understand the rules of the game, we can align to them during the refactor phase without waiting for more examples to test.

Top down doesn't mean that you must jump to lion taming in one go.  Top-down refactorings tend to run deep, so it often makes sense to start with examples that are narrow.  It's not unreasonable to prefer several lighter-weight tests to a single "and the kitchen sink too" example.
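
For example (my tests, not anyone's canonical kata, reusing the Rules and Cell names from the sketch above and assuming JUnit 5): each narrow example pins down one rule with a tiny fixture, and the glider itself can wait.

    import java.util.Set;
    import org.junit.jupiter.api.Test;
    import static org.junit.jupiter.api.Assertions.assertFalse;
    import static org.junit.jupiter.api.Assertions.assertTrue;

    class RulesTest {
        // Narrow example #1: under-population, with the smallest possible fixture.
        @Test
        void aLiveCellWithNoLiveNeighborsDies() {
            Set<Cell> generation = Set.of(new Cell(0, 0));
            assertFalse(new Rules().aliveInNextGeneration(generation, new Cell(0, 0)));
        }

        // Narrow example #2: birth on exactly three live neighbors.
        @Test
        void aDeadCellWithThreeLiveNeighborsComesToLife() {
            Set<Cell> generation = Set.of(new Cell(0, 0), new Cell(0, 1), new Cell(0, 2));
            assertTrue(new Rules().aliveInNextGeneration(generation, new Cell(1, 1)));
        }
    }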


Thursday, August 5, 2021

TDD: Duplication

We had a long discussion on Slack today about duplication, and refactoring before introducing the second test.  I didn't come away with the sense that ideas were being communicated clearly, but I suppose that's one of the hazards of talking about it, instead of showing/pairing on it.

Or the idea was just too alien -- always a possibility.

In the process, I found myself digging out the Fibonacci problem again, because I remembered that Kent Beck's demonstration of it "back in the day" had supported the point I wanted to make.  After looking in all of the "obvious" places, I thought to check the book.  Sure enough, it appears in Appendix II of Test-Driven Development: By Example.

(Rough timeline: Kent created a Yahoo group in February 2002; the Fibonacci exercise came up in March, and as of July the current draft had a chapter on that topic.)

Kent's refactoring in the text looks like:
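
Reconstructing from memory of Appendix II, rather than quoting it, the step is approximately this; treat the exact literals and guards as my approximation of the book's code:

    // Before: the easy-to-type literal that passed the fib(3) test.
    int fib(int n) {
        if (n == 0) return 0;
        if (n <= 2) return 1;
        return 2;
    }

    // After the first step: the literal rewritten as a sum,
    // one step closer to fib(n - 1) + fib(n - 2).
    int fib(int n) {
        if (n == 0) return 0;
        if (n <= 2) return 1;
        return 1 + 1;
    }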

This is his first step in removing duplication: replacing the easy-to-type literal that he used to pass the test with an expression that brings us one step closer to what we really mean.

Of course, there's nothing "driving" the programmer to make this change at this point; Kent is just taking advantage of the fact that he has tests to begin cleaning things up.  As it happens, he can clean things up to the point that his implementation completely solves the problem.
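
Continuing that cleanup (again my reconstruction of the general shape, not a quotation): each 1 turns out to be a Fibonacci number in its own right, and once the recursive expression also covers the earlier special cases, the general solution is simply what's left.

    // The sum becomes the recurrence: for the n under test, 1 + 1 is fib(n - 1) + fib(n - 2).
    int fib(int n) {
        if (n == 0) return 0;
        if (n <= 2) return 1;
        return fib(n - 1) + fib(n - 2);
    }

    // The recurrence also covers n == 2, so the special case collapses.
    int fib(int n) {
        if (n <= 1) return n;
        return fib(n - 1) + fib(n - 2);
    }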

Today, there were (I think) two different flavors of objection to this approach.  

One of them focused on the test artifacts you end up with - if you implement the solution "too quickly", then your tests are deficient when viewed as documentation.  Better, goes the argument, to document each behavior as clearly as possible with its own test; and if those tests are part of your definition of done, then you might as well introduce the behaviors and the tests in rhythm.

It's an interesting thought - I don't agree with it today ("tests are code, and code is a liability" would be my counterargument) - but it's certainly worth consideration, and I wouldn't be surprised to discover that there are circumstances where that's the right trade-off to make.

The other objection came back to tests "driving" the design.  In effect, the suggestion seems to be that you aren't allowed to introduce the correct implementation until it is also the simplest implementation that passes all of the tests.  I imagine an analogy to curve fitting: until you have two tests, you can't implement a line; until you have three tests, you can't implement a parabola; and so on.
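
Taken literally, that discipline produces something like the following for Fibonacci (my illustration of the objection, not anyone's actual code): each new example buys one more special case, and the general expression is always deferred.

    // After examples for fib(0) through fib(4), the "simplest implementation that
    // passes all the tests" is still a lookup table wearing an if-statement costume.
    int fib(int n) {
        if (n == 0) return 0;
        if (n == 1) return 1;
        if (n == 2) return 1;
        if (n == 3) return 2;
        return 3;  // passes fib(4) == 3; fib(5) will demand yet another clause
    }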

That, it seems to me, leads you straight to the Owl.  Or worse, leaves us in the situation that Jim Coplien warned us of years ago - that we carry around a naive model that gives us the illusion of progress.