Friday, January 6, 2023

Schools of Test Driven Development

There are two schools of test driven development.  There is the school that believes that there are two schools; and there is the school that believes that there is only one school.


The school of "only one school" is correct.

As far as I can tell, "Chicago School" is an innovation introduced by Jason Gorman in late 2010.  Possibly this is a nod to the fact that Object Mentor was based in Evanston, Illinois.

"Chicago School" appears to be synonymous with "Detroit School", a term proposed by Michael Feathers in early 2009.  Detroit, here, because the Chrysler Comprehensive Compensation team was based in Detroit; and the lineage of the "classical" TDD practices could be traced there.

With that proposal, Feathers also introduced the original "London School" label, for practices more directly influenced by innovations at Connextra and the London Extreme Tuesday Club.

Feathers proposed the "school" labels because, at the time, he was finding the "Mockist" and "Classicist" labels unsatisfactory.

The notion that we have two different schools of TDD practitioners comes from Martin Fowler in 2005.

This is the point where things get muddled - to wit, was Fowler correct to describe these schools as doing TDD differently, or are they instead applying the same TDD practices to a different style of code design?

Steve Freeman, writing in 2011, offered this observation:

...There are some differences between us and Kent. From what I remember of implementation patterns, in his tradition, the emphasis is on the classes. Interfaces are secondary to make things a little more flexible (hence the 'I' prefix). In our world, the interface is what matters and classes are somewhere to put the implementation.

In particular, there seems (in the London tradition) to be an emphasis on the interfaces between the test subject and its collaborators.

I've been recently reviewing Wirfs-Brock/Wilkerson's description of the Client/Server model.

Any object can act as either a client or a server at any given time

As far as I can tell, everybody has always been in agreement about how to design tests that evaluate whether the test subject implements its server responsibilities correctly.

But for the test subject's client responsibilities?  Well, you either ignore them, or you introduce new server responsibilities to act as a proxy measure for the client responsibilities (reducing the problem to one we have already solved), or you measure the client responsibilities directly.
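A minimal sketch may make the distinction concrete.  The Account and its ledger below are invented for the illustration; only the difference between the two tests matters.  The first checks a server responsibility by asking the subject for an answer; the second measures a client responsibility directly, by verifying the message sent to a collaborator:

    import unittest
    from unittest.mock import Mock

    class Account:
        def __init__(self, ledger):
            self._balance = 0
            self._ledger = ledger  # a collaborator; Account acts as its client

        def deposit(self, amount):
            self._balance += amount
            self._ledger.record("deposit", amount)

        def balance(self):
            return self._balance

    class AccountTests(unittest.TestCase):
        def test_server_responsibility(self):
            # The style everybody agrees on: exercise the subject,
            # then check the answers it gives back.
            account = Account(ledger=Mock())
            account.deposit(100)
            self.assertEqual(100, account.balance())

        def test_client_responsibility_measured_directly(self):
            # The London-school move: verify the message the subject
            # sends to its collaborator.
            ledger = Mock()
            account = Account(ledger=ledger)
            account.deposit(100)
            ledger.record.assert_called_once_with("deposit", 100)

    if __name__ == "__main__":
        unittest.main()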

Mackinnon 2008 reported that extending the test subject with testing responsibilities was "compromising too far", and John Nolan had therefore challenged the team to seek other approaches.

Reflecting in 2010, Steve Freeman observed: 

The underlying issue, which I only really understood while writing the book, is that there are different approaches to OO. Ours comes under what Ralph Johnson calls the "mystical" school of OO, which is that it's all about messages. If you don't follow this, then much of what we talk about doesn't work well. 

Similarly, Nat Pryce:

There are different ways of designing how a system is organised into modules and the interfaces between those modules....  Mock Objects are designed for test-driving code that is modularised into objects that communicate by "tell, don't ask" style message passing.
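A hedged illustration of the style Pryce is describing (the monitor and alarm are invented for the example): the object is told to act, decides internally, and sends a message to its collaborator -- which is exactly the kind of interaction a mock can verify.

    from unittest.mock import Mock

    class Monitor:
        def __init__(self, threshold):
            self._threshold = threshold
            self._reading = 0

        def record(self, reading):
            self._reading = reading

        def check(self, alarm):
            # "Tell, don't ask": the decision lives inside the object;
            # the collaborator is told what to do, not queried for state.
            if self._reading > self._threshold:
                alarm.trigger()

    # The test measures the outgoing message directly.
    alarm = Mock()
    monitor = Monitor(threshold=10)
    monitor.record(15)
    monitor.check(alarm)
    alarm.trigger.assert_called_once_with()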

My take, today, is still in alignment with the mockists: the TDD of the London school is the same TDD as everybody else's -- controlling the gap between decision and feedback, test-first with red/green/refactor, and so on.

The object designs are different, and so we also see differences in the test design - because tests should be fit for purpose.


Thursday, February 10, 2022

TDD: Tabula Rasa

What does Test Driven Development look like when you are staring at a blank page?

I was reminded again this week that there are a lot of different approaches that one might use, and they don't all answer that question the same way.  So let's try a better question: what does Test Driven Development look like when I am staring at a blank page?

It'll help to have a specific example to work from, so let's consider something like a model for a calculator app; something that will eventually have buttons for input, and a display for output.  The kinds of tests that we expect to end up with ask questions like "after I push buttons in this sequence, what information is on the display?"

You will, I hope, recognize that this is a "toy" problem.  It's not very big.  We don't need to worry about integrating with anything else.  The domain is general and familiar.  We can probably make a fair bit of headway by starting with a small number of "buttons", and then extending our model to support a "scientific calculator" or a "programmer calculator".

Furthermore, I'm going to whistle on past the open issues of how "button presses" become inputs to the model, or how outputs from the model appear on the display.  So out of the gate, before I've even written anything down, I'm carving up the bigger problem into modules, and exercising judgment about which are "important".

But the page is still blank.  Now what?

And if we were stuck more than a minute, I'd stop and say, "Kent, what's the simplest thing that could possibly work?" -- Ward Cunningham, 2004

My immediate goal is to crack through the analysis paralysis and writer's block to get something/anything into play.

Two features here: first of all, because this is a "programmer test", I'm going to reach immediately for whatever language I plan to use for the production code.  That's one less thing I need to worry about as I context-shift between design and checking for mistakes.

The second is that my design criterion is "easy to type".  I don't (yet) need to worry about whether I want to decouple these tests from a specific implementation.  I don't (yet) need to worry about whether I want to separate the specification from the test framework (if any).  I don't (yet) need to worry about code style, or physical design.  I'm just boosting myself past the point of static friction.
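Concretely, that first scrawl might look like the sketch below (Calculator doesn't exist yet; every name here is a placeholder shaped for easy typing, in Python because that's what I'm pretending the production language is):

    import unittest

    class CalculatorTests(unittest.TestCase):
        def test_addition(self):
            # Deliberately naive: push buttons, look at the display.
            # Calculator doesn't exist yet, so the first run is RED
            # (an error rather than a failure, but red all the same).
            calculator = Calculator()
            for button in "1+1=":
                calculator.press(button)
            self.assertEqual("2", calculator.display())

    if __name__ == "__main__":
        unittest.main()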

Choosing something other than a trivial behavior is common at this point.  I normally get away with it because faking a complicated behavior is not significantly harder than faking a simple behavior, and what I get in exchange is a chance to experience describing a more complicated check, so that I don't get deeply invested in the wrong interfaces.

Now we have code on the page, and the "RED" task is happening, and I can fuss over things like getting arguments in the right order for my test framework calls, and do I want to use a different representation of the data to make the intent of the test clearer, and are the reports we get when the test fails what we expect them to be, and so on.

There's a bunch of saw sharpening that makes sense now; after you have real code on the table to argue about, but before you are deeply committed to the specifics of the design.
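As one sketch of that sharpening (still assuming the hypothetical Calculator): the same check, reshaped so that the test data reads more like the problem statement, and so that a failure report names the example that broke:

    import unittest

    class CalculatorTests(unittest.TestCase):
        def test_display(self):
            examples = [
                ("1+1=", "2"),
                # more rows become cheap to add later
            ]
            for buttons, expected in examples:
                # subTest makes the failure report name the broken example
                with self.subTest(buttons=buttons):
                    calculator = Calculator()
                    for button in buttons:
                        calculator.press(button)
                    self.assertEqual(expected, calculator.display())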

Or we can judge that this design should be considered disposable, with the expectation that it is going to act as a placeholder until we have gathered more evidence about what the longer-lived test design should look like.

And when we're finally bored with the pre-flight rituals? Fake it to get the red test to green in a minimal amount of wall-clock time (we already know that's going to be easy to type, because the expression we need in the return statement is already written in the test).  And get the hustle on.
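In this sketch, the fake is nothing more than the literal from the test, moved into the production code:

    class Calculator:
        """Just enough to turn the first test green."""

        def press(self, button):
            pass  # ignore the input entirely, for now

        def display(self):
            return "2"  # the same literal we already typed into the test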

Tuesday, January 11, 2022

Mock Object Bibliography

Primitive Obsession Bibliography

  • 2005 https://www.jamesshore.com/v2/blog/2005/primitiveobsession
  • 2005 https://chriswheeler.blogspot.com/2005/05/my-favourite-smells.html
  • 2007 https://grabbagoft.blogspot.com/2007/12/dealing-with-primitive-obsession.html
  • 2011 https://blog.ploeh.dk/2011/05/25/DesignSmellPrimitiveObsession/
  • 2013 https://wiki.c2.com/?PrimitiveObsession
  • 2013 https://fsharpforfunandprofit.com/posts/designing-with-types-single-case-dus/
  • 2013 https://blog.thecodewhisperer.com/permalink/primitive-obsession-obsession
  • 2015 https://blog.ploeh.dk/2015/01/19/from-primitive-obsession-to-domain-modelling/
  • 2015 https://www.industriallogic.com/blog/multiple-asserts-are-ok/
  • 2018 https://softwareengineering.stackexchange.com/questions/365017/when-is-primitive-obsession-not-a-code-smell


Wednesday, August 25, 2021

TDD: Thinking in Top Down

We did so by writing high level tests that were checking special patterns (gliders …). That was a nightmare! It felt like trying to reverse engineer the rules of the game from real use cases. It did not bring us anywhere.  -- Philippe Bourgau

I don't have this problem when I work top down.

The reason that I don't have this problem is that the "rules of the game" are sitting right there in the requirements, so I don't have to reverse engineer them from a corpus of examples.

What I do have to do is show the work.  That happens during the refactoring phase, where I explicitly demonstrate how we arrive at the answer from the parameters of the question.

For a problem like the glider, the design might evolve along these lines: why is this cell alive?  Because its live-neighbor count in the previous generation was three.  Where did that three come from?  Well, it's a sum: we count 1 for each neighbor that is alive, and 0 for each that is dead, across all eight neighbors.  Where do the neighbors come from?  We identify them by making eight separate calculations using the coordinates of the target cell.  And so on.

Sometimes, I imagine this as adding a comment to the hard coded answer, explaining why that answer is correct, and then introducing the same calculation in the code so that the comment is no longer necessary.
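In code, that move might look something like this sketch (the names, and the set-of-live-cells representation, are mine rather than from any particular kata solution):

    # First, the hard-coded answer, annotated with the reason it is correct...
    def is_alive_next_generation(grid, cell):
        # ...because its live-neighbor count in the previous
        # generation was three.
        return True

    # ...then the refactoring that turns the comment into code.
    def neighbors(cell):
        row, column = cell
        return [
            (row + dr, column + dc)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if (dr, dc) != (0, 0)
        ]

    def live_neighbor_count(grid, cell):
        # 1 for each live neighbor, 0 for each dead one, summed
        # over all eight neighbors.
        return sum(1 for each in neighbors(cell) if each in grid)

    def is_alive_next_generation(grid, cell):
        count = live_neighbor_count(grid, cell)
        return count == 3 or (count == 2 and cell in grid)

    # grid is modeled as the set of live cells in the previous generation.
    glider = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}
    assert is_alive_next_generation(glider, (1, 0))  # three live neighbors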

Paraphrasing Ward Cunningham, our goal is to produce code aligned with what we understand to be the proper way to think about the problem.  Because we understand the rules of the game, we can align to them during the refactor phase without waiting for more examples to test.

Top down doesn't mean that you must jump to lion taming in one go.  Top down refactorings tend to run deep, so it often makes sense to start with examples that are narrow.  It's not unreasonable to prefer more tests of lighter weight to a single "and the kitchen sink too" example.


Thursday, August 5, 2021

TDD: Duplication

We had a long discussion on slack today about duplication, and refactoring before introducing the second test.  Didn't come away with the sense that ideas were being communicated clearly, but I suppose that's one of the hazards of talking about it, instead of showing/pairing on it.

Or the idea was just too alien -- always a possibility.

In the process, I found myself digging out the Fibonacci problem again, because I remembered that Kent Beck's demonstration of the Fibonacci problem "back in the day" had supported the point I wanted to make.  After looking in all of the "obvious" places, I thought to check the book.  Sure enough, it appears in Appendix II of Test Driven Development by Example.

(Rough timeline: Kent created a Yahoo group in February 2002; the Fibonacci exercise came up in March, and as of July the current draft had a chapter on that topic.)

Kent's refactoring in the text looks like this (the book's code is Java; what follows is a paraphrase from memory, rendered in Python):
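    # Before: the easy-to-type literal that passed the newest test.
    def fib(n):
        if n == 0:
            return 0
        if n <= 2:
            return 1
        return 2

    # The first step rewrites the literal as a sum...
    #     return 1 + 1
    # ...and then each 1 as the thing it really stands for:
    def fib(n):
        if n == 0:
            return 0
        if n <= 2:
            return 1
        return fib(n - 1) + fib(n - 2)

    assert [fib(n) for n in range(7)] == [0, 1, 1, 2, 3, 5, 8]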

This is his first step in removing duplication; replacing the easy-to-type literal that he used to pass the test with an expression that brings us one step closer to what we really mean.

Of course, there's nothing "driving" the programmer to make this change at this point; Kent is just taking advantage of the fact that he has tests to begin cleaning things up.  As it happens, he can clean things up to the point that his implementation completely solves the problem.

Today, there were (I think) two different flavors of objection to this approach.  

One of them focused on the test artifacts you end up with - if you implement the solution "too quickly", then your tests are deficient when viewed as documentation.  Better, goes the argument, to document each behavior as clearly as possible with its own test; and if those tests are part of your definition of done, then you might as well introduce the behaviors and the tests in rhythm.

It's an interesting thought - I don't agree with it today (tests are code, code is a liability would be my counter-argument) - but it's certainly worth consideration, and I wouldn't be surprised to discover that there are circumstances where that's the right trade-off to make.

The other objection came back to tests "driving" the design.  In effect, the suggestion seems to be that you aren't allowed to introduce a correct implementation until it is the simplest one that passes all the tests.  I imagine an analogy to curve fitting - until you have two tests, you can't implement a line; until you have three tests, you can't implement a parabola; and so on.
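Carried to its conclusion, that constraint only ever licenses implementations like this caricature (mine, not anyone's actual proposal):

    # After tests for fib(0) through fib(3), the "simplest implementation
    # that passes all the tests" is arguably still a lookup:
    def fib(n):
        if n == 0:
            return 0
        if n == 1:
            return 1
        if n == 2:
            return 1
        return 2  # forever one test behind the general rule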

That, it seems to me, leads you straight to the Owl.  Or worse, leaves us in the situation that Jim Coplien warned us of years ago - that we carry around a naive model that gives us the illusion of progress.

Sunday, April 25, 2021

TDD: When the API changes

If I discover my API is bad after writing 60 tests, I have to change a lot! -- shared by James Grenning

My experience with the TDD literature is that it tends to concentrate on situations that are fixed.  Here's a problem, implement a solution in bite-sized steps, done.

And to be fair, in a lot of cases -- where the implications of "bite-sized steps" are alien -- that's enough.  There's a lot going on, and eliminating distractions via simplification is, I believe, a good focusing technique.  Red-Green-Refactor is the "new" idea, and we want to keep that centered until the student can exercise the loop without concentration.

But -- if we intend that TDD be executed successfully outside of lessons -- we're going to need to re-introduce complicating factors so that students can experience what they are going to face in the wild.