Monday, August 7, 2023

TDDbE: Equality for All

If I went through these steps, though, I wouldn't be able to demonstrate the third and most conservative implementation strategy: Triangulation.

And honestly: I kind of wish he hadn't done that.

Some background: early after Kent shared the first draft of the book that eventually became Test Driven Development by Example, an objection was raised: there should be a second test to "drive" the refactoring that Kent had demonstrated in the multiplication example.  Until that second test is introduced, "You Aren't Going to Need It" should apply.  Furthermore, in Kent's approach, during the refactoring task, he makes changes that don't actually preserve the behavior; a function that ignores its inputs is converted into a function that is sensitive to its inputs.

So a bunch of things all get tangled up here.  And when the dust clears, triangulation ends up with a lot more ink than is strictly justified, remove duplication gets a lot less ink, and a bunch of beginners fall into a trap (confession - I was one who did fall in, back in the day).

Kent's earliest remarks on triangulation emphasize that he doesn't find it particularly valuable.  And certainly he didn't come up with a particularly well motivated example when introducing it.

The metaphor itself isn't awful - sometimes you do need more information to be certain of your course - just overemphasized.

For example: it takes up this entire chapter.

I find the motivation for implementing Dollar::equals to be a little bit underwhelming.

If you use Dollars as the key to a hash table, then you have to implement hashCode() if you implement equals().

This rather suggests that we should be writing a check that demonstrates that Dollar can be used as a hash key.

The "implementation" of Dollar::equals really leaves me twitching (see Effective Java by Joshua Bloch) - this is an override that doesn't match the semantics of the parent class (yet).  I'd want to add the additional checks needed to finish this implementation before moving on to some other more interesting thing.
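For reference, a sketch of where I'd want that implementation to land - the null and type checks are my additions (the book's version at this point casts unconditionally), and the field layout here is assumed, not quoted:

```java
// Hypothetical sketch of a finished Dollar.equals/hashCode pair, following the
// equals contract (reflexive, symmetric, null-safe).  The "amount" field name
// matches the book; everything else is illustration.
public class Dollar {
    private final int amount;

    public Dollar(int amount) {
        this.amount = amount;
    }

    @Override
    public boolean equals(Object object) {
        if (this == object) return true;
        if (!(object instanceof Dollar)) return false;  // also rejects null
        Dollar dollar = (Dollar) object;
        return amount == dollar.amount;
    }

    @Override
    public int hashCode() {
        // equal objects must produce equal hash codes,
        // so that Dollar actually works as a hash key
        return Integer.hashCode(amount);
    }
}
```

With both methods overridden together, the "Dollar as hash key" motivation from the chapter actually holds up: two equal Dollars land in the same bucket.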

Sunday, August 6, 2023

TDDbE: Degenerate Objects

  1. Write a test.
  2. Make it run.
  3. Make it right.

Again, "get to green quickly" implies that we are going to transition from the design we have to the design we want via many more, much smaller steps.

Beck offers "architecture driven development" as a label for the opposite approach, where implementing the clean design happens before we can know whether the clean design will in fact solve the problem.  I don't love the name, and to be honest I'm not sure there's that much advantage in pretending that we don't know what the final design is going to look like.

I might argue as well that "invent the interface you wish you had" sounds to me a lot like solving the clean code part first.  If we can train our brains to do that well, why not the other?  Of course, Kent immediately takes the same view, but reversed:

Our guesses about the right interface are no more likely to be perfect than our guesses about the right implementation.

But fine - sometimes we have to go back and change up the interface -- which is one reason that we implement a new failing test only when there are no other failing tests; we don't want to make extra work for ourselves by over committing when we know the least about our final design.

Something I notice: for this second chapter, Beck extends his original test into a more complicated scenario, rather than introducing a new test.  That's fine, I guess - I'm not sure I remember when disciplined assertions became a thing.

The longer I do this, the better able I am to translate my aesthetic judgments into tests.  When I can do this, my design discussions become much more interesting.

What catches my attention here is the sequence: judgment, test, implementation.  In this example, Kent doesn't wait until a mistake "discovers" that a different design would be easier to work with in the long term, but instead motivates his code change via "that doesn't work how I want it to".  And without a doubt, I think treating this Dollar implementation as an immutable value is reasonable.

But there's certainly no "the test told us to do it" here.  This design change comes about because we're all carrying memories of designs that made things unnecessarily difficult, and we have no particular interest in repeating the experience.
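The design being argued for, sketched minimally - the `amount()` accessor is my own addition for illustration (the book exposes the field directly):

```java
// Sketch of the chapter's design move: Dollar as an immutable value,
// where times() returns a new object instead of mutating the receiver.
public class Dollar {
    private final int amount;   // final: no mutation after construction

    public Dollar(int amount) {
        this.amount = amount;
    }

    public Dollar times(int multiplier) {
        return new Dollar(amount * multiplier);  // receiver is left unchanged
    }

    public int amount() {
        return amount;
    }
}
```

The payoff is exactly the "memories of designs that made things unnecessarily difficult" point: a five-dollar object stays a five-dollar object no matter who multiplies it.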

Thursday, August 3, 2023

TDDbE: Multi-Currency Money

Alright, chapter one, and we're introduced to the process of writing the first test:

... we imagine the perfect interface for our operation.  We are telling ourselves a story about how the operation will look from the outside.  Our story won't always come true, but it's better to start from the best-possible application program interface (API) and work backward

Which I find interesting on two fronts - first, because where does this understanding of perfection come from? and second, because immediately after his first example he itemizes a number of ways in which the provided interface is unsatisfactory.

Another aspect to this example that I find unsatisfactory is that the presentation is rather in medias res.  Beck is essentially working from the WyCash context - we already have a portfolio management system, and there's a gap in the middle of that system that we want to fill in.  Or if you prefer, a system where the simpler calculations are in place, but we want to refactor those calculations into an isolated module so that we might more easily make the new changes we want.

So we might imagine that we already have some code somewhere that knows how to multiply `dollars(5)` by `exchangeRate(2)`, and so on, and what we are trying to do is create a better replacement for that code.

I'm not entirely satisfied with this initial interface for this case, however - it rather skips past the parts of the design where we take the information in the form we have it and express it in the form that we need it.  In this case, we're looking at the production of a report, where the inputs are some transient representation of our portfolio position, then the outputs are some transient representation of the report.

In effect, `new Dollar` is a lousy way to begin, because the `Dollar` class doesn't exist yet, so the code we have can't possibly be passing information to us that way.

I don't think it matters particularly much, in the sense that I don't think that the quality of the design you achieve in the end is particularly sensitive to where you start in the solution.  And certainly there are a number of reasons that you might prefer to begin by exploring "how do we do the useful work in memory" before addressing the question of how we get the information we need to the places we need it.

Another quibble I have about the test example (although it took me many years to recognize it) is that we aren't doing a particularly good job about distinguishing between the measurement and the application of the "decision rule" that tells us if the measured value is satisfactory.

Moving on: an important lesson

We want the bar to go green as quickly as possible

The green task should be evaluated in wall-clock time: we want less of it, because we want the checks in place while we are making mistakes.

A riddle - should TDD have a bias towards designs that produce quick greens, and if so is that a good thing?

(Isolating the measurement affords really quick greens via guard clauses and early returns.  I'll have to think more on that.)
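To make that riddle concrete, here's a hypothetical sketch (all names invented) of what "isolating the measurement" looks like: the measurement produces a value, the decision rule judges it, and a guard clause in the measurement gets a degenerate case green almost immediately:

```java
// Hypothetical illustration: measurement and decision rule kept separate.
public class ExchangeReport {
    // measurement: compute the converted amount
    static int convert(int amount, int rate) {
        if (rate == 1) return amount;   // guard clause: identity rate, instant green
        return amount * rate;
    }

    // decision rule: is the measured value the one we require?
    static boolean satisfies(int measured, int required) {
        return measured == required;
    }
}
```

The guard clause is the "really quick green": it passes the first test without committing to the general calculation at all.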

Once again, I notice that Kent's example racks up four compile errors before he starts working toward the green bar, where "nanocycle TDD" would encourage prioritizing the green bar over finishing the test.  I'm not a big fan of nanocycle, myself, so I like having this evidence in hand when challenged.

We need to generalize before we move on.

You can call it generalizing, or you can call it removing duplication, but please notice that Kent is encouraging that we perform cleanup of the implementation before we introduce another failing test.

(There is, of course, room to argue with some of the labels being used - generalizing can change behaviors that we aren't constraining yet, so is it really "refactoring"?  Beck and Fowler aren't on the same page here - I think Kent addresses this in a later chapter.)

By eliminating duplication before we go on to the next test, we maximize our chances of being able to get the next test running with one and only one change.

Ten years later, this same advice

for each desired change, make the change easy (warning: this may be hard), then make the easy change

How much work we have to do to make the change easy is likely an interesting form of feedback about the design we start with; if it's often a lot of work, maybe our design heuristics need work.

The final part of the chapter is an important lesson on duplication that I don't think really got the traction that it should have.  Here, we have a function that returns 10 -- but there are lots of functions in the world that return 10, so we should do the work to make it clear why this specific function returns 10.  
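The progression, roughly as the book plays it out (reconstructed from memory, so treat the intermediate steps as a sketch - the Dollar at this point in the book is still mutable):

```java
// Sketch of the chapter's duplication removal: the fake "10" duplicates
// the data in the test (5 * 2), and each step makes that explicit.
public class Dollar {
    int amount;

    Dollar(int amount) {
        this.amount = amount;
    }

    void times(int multiplier) {
        // the fake:    amount = 10;              (10 duplicates the test's 5 * 2)
        // first split: amount = 5 * 2;           (the duplication is now visible)
        // generalize:  amount = 5 * multiplier;
        amount *= multiplier;                     // why 10?  because 5 * 2.
    }
}
```

After the last step, this function no longer just happens to return 10; it returns 10 because five dollars times two is ten dollars.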

(Heh: only now do I notice that we don't actually reach the multi-currency bits hinted at by the chapter title.)

Wednesday, August 2, 2023

TDDbE: Introduction

The introduction is an interesting story of technical debt: the work on improving the internal quality of the code, and in particular the outsourced `Dollar` object, had been on-going for six months before the arrival of the business opportunity that needed the improved design.

(The debt metaphor first appears in Ward's 1992 experience report, which describes this same portfolio management system).

The transition from testing that computations are within tolerance to testing that computations are exact hides an important lesson: code changes that relocate rounding within a calculation are not refactorings in domains where the calculations are regulated, and you do want tests that are sensitive to those changes if refactoring is a core value.
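A toy illustration (mine, not the book's) of why relocating rounding isn't behavior-preserving: rounding each item before summing can give a different total than summing and then rounding.

```java
// Two calculations that differ only in where the rounding happens.
import java.math.BigDecimal;
import java.math.RoundingMode;

public class Rounding {
    // round each component first, then sum
    static BigDecimal roundThenSum(BigDecimal a, BigDecimal b) {
        return a.setScale(2, RoundingMode.HALF_UP)
                .add(b.setScale(2, RoundingMode.HALF_UP));
    }

    // sum first, then round the total
    static BigDecimal sumThenRound(BigDecimal a, BigDecimal b) {
        return a.add(b).setScale(2, RoundingMode.HALF_UP);
    }
}
```

With inputs of 0.105 and 0.105, the first path yields 0.22 and the second 0.21 - identical arithmetic, different reported totals.  In a regulated domain, a test pinned to the exact value catches that "refactoring" changing the answer.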

In a broader sense, we're also getting a statement of good design; a design is good if it is ready to be changed.

I suspect that the story takes place circa 1992, so of course the team isn't doing "TDD", and there's no particular indication that the team is taking a test first approach.  The promise here is really that TDD can bring you to this place that the Wyatt team got to another way (not too different though -- "if you assume all the good ideas in here are Ward's" and all that).

We also get our first look at the two simple rules

  • Write a failing automated test before you write any code.
  • Remove duplication.

The first of these, I think, fails, in the sense that it describes an XP approach where you turn up the dials to eleven to see how it works.  And it works a lot, but it doesn't work eleven - the costs of writing automated tests are not uniform, and expensive tests against unstable specifications are generally a miserable experience for everybody.

In regarding the second rule, it's probably worth noting that, at the time, Kent had a reputation for being something of a wizard at recognizing duplication, and pursuing a sequence of refactorings that would eventually make the duplication obvious to everyone else, and finally taking steps to remove it.

I suspect that there is a similar skill, in which we take two things that look to be the same and improve the design until it becomes clear that they are not, in fact, duplicates.

Remove duplication hints at the four rules of simple design.  One of the questions I'm still carrying around is whether the rules of simple design are part of the TDD bundle -- are we still "doing TDD" if we replace simple design with some other collection of design heuristics?

Monday, July 31, 2023

TDDbE: Preface

Opening paragraph: the goal of Test Driven Development is "clean code that works" - which is a bit of sloganeering describing the stop condition.  TDD is a way to apply our effort to achieve the stop condition.

First promise: the long bug trail after "done" goes away.  Certainly a lot of the faults in complicated logic should have been identified and eliminated, because we will be making sure to design our systems so that testing the complicated logic is cost effective (i.e. relentlessly reducing the coupling between complicated logic and modules that are difficult or expensive to test).  Depending on how much of your system is complex logic, that could be the bulk of your problem right there.

It gives you a chance to learn all of the lessons that the code has to teach you.

This one, I'll need to chew on.  You've got structured time to focus attention, but there's an opportunity cost associated with each of these lessons.

  • Write new code only if an automated test has failed
  • Eliminate duplication

These are two simple rules....

Eliminate duplication being a rule is kind of important, in so far as it pulls one of the pillars of the four rules of simple design into a definition of TDD.  Sandro Mancuso has suggested that the four rules of simple design are separate from TDD, and while I'm receptive to the idea that they should be, the rule listed here means it isn't quite so easy.

A lot of lip service was paid to the "only if" rule early on.  Michael Feathers introduced Humble Dialog before TDDbE was published, and we already had people using test doubles as an isolation mechanism.  I think you can interpret that a lot of different ways - either the rule isn't so absolute, or the rule is absolute but the TDD process isn't sensible for all coding problems.

Running code providing feedback between decisions

This is a riddle I've been thinking on lately: the obvious feedback we get between decisions is red/green bar, announcing whether the values measured in our experiments match the described specifications.  But the feedback for the design decisions - running the code doesn't give us very much there.  Feedback for design decisions must come from somewhere else.

Refactor - Eliminate all of the duplication created in merely getting the test to work.

This is a big one: in my experience, most to nearly all practitioners move on from the refactor phase while duplication is still present.  Back in the day, Beck had a reputation for being something of a wizard in spotting duplication that wasn't obvious and immediately taking steps to eliminate it.

Next, lots of emphasis on implications of reduced defect density.  Just something to notice, possibly in contrast to his earlier writings about test first being primarily an analysis and design technique.

I'm noticing in passing that he isn't making any attempt here to clarify whether he is talking about reduced defect density because defects are being discovered and removed, or reduced defect density because fewer defects are being introduced.

TDD is an awareness of the gap between decision and feedback during programming, and techniques to control that gap.

 One of my two favorite quotations from the book.

Some software engineers learn TDD and then revert to their earlier practices, reserving TDD for special occasions when ordinary programming isn't making progress.

Huh.  If this sentence is written in 2009 - fine, there's been lots of time for people to try it, learn it, and decide it's a sometimes useful tool in the kit.  On the other hand, if that sentence goes back to the earliest drafts, then it becomes a very interesting statement about some of the early adopters.

There certainly are programming tasks that can't be driven solely by tests.  Security software and concurrency, for example, are two topics where TDD is insufficient to mechanically demonstrate that the goals of the software have been met.

 Horses for courses.

Before teeing off on the examples as being too simple

Well, I certainly do do that often enough - mostly because the starting position (money as model code) already elides a bunch of interesting decisions about module boundaries.  And weren't we promising that the tests would "drive" those decisions?


Book Club: Test Driven Development by Example

I think I'll try a review of Test Driven Development by Example.  I'll be using the 14th printing (October 2009) paperback.

Friday, January 6, 2023

Schools of Test Driven Development

There are two schools of test driven development.  There is the school that believes that there are two schools; and there is the school that believes that there is only one school.


The school of "only one school" is correct.

As far as I can tell, "Chicago School" is an innovation introduced by Jason Gorman in late 2010.  Possibly this is a nod to the fact that Object Mentor was based in Evanston, Illinois.

"Chicago School" appears to be synonymous with "Detroit School", a term proposed by Michael Feathers in early 2009.  Detroit, here, because the Chrysler Comprehensive Compensation team was based in Detroit; and the lineage of the "classical" TDD practices could be traced there.

Feathers brought with that proposal the original London School label, for practices more readily influenced by innovations at Connextra and London Extreme Tuesday Club.

Feathers was proposing the "school" labels, because he was at that time finding that the "Mockist" and "Classicist" labels were not satisfactory.

The notion that we have two different schools of TDDer comes from Martin Fowler in 2005.

This is the point where things get muddled - to wit, was Fowler correct to describe these schools as doing TDD differently, or are they instead applying the same TDD practices to a different style of code design?

 Steve Freeman, writing in 2011, offered this observation

...There are some differences between us and Kent. From what I remember of implementation patterns, in his tradition, the emphasis is on the classes. Interfaces are secondary to make things a little more flexible (hence the 'I' prefix). In our world, the interface is what matters and classes are somewhere to put the implementation.

In particular, there seems (in the London tradition) to be an emphasis on the interfaces between the test subject and its collaborators.

I've been recently reviewing Wirfs-Brock/Wilkerson's description of the Client/Server model.

Any object can act as either a client or a server at any given time

As far as I can tell, everybody has always been in agreement about how to design tests that evaluate whether the test subject implements its server responsibilities correctly.

But for the test subject's client responsibilities?  Well, you either ignore them, or you introduce new server responsibilities to act as a proxy measure for the client responsibilities (reducing the problem to one we have already solved), or you measure the client responsibilities directly.
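A minimal sketch of that third option, with invented names throughout: the test subject is exercised as a client, and the test listens with a hand-rolled recording double standing in for the collaborator.

```java
// Measuring a client responsibility directly: the subject's job is to *tell*
// its collaborator something, so the test records what it was told.
import java.util.ArrayList;
import java.util.List;

interface AuditLog {                        // the collaborator's interface
    void record(String entry);
}

class Transfer {                            // the test subject, acting as a client
    private final AuditLog log;

    Transfer(AuditLog log) {
        this.log = log;
    }

    void execute(int amount) {
        log.record("transferred " + amount);    // the client responsibility
    }
}

class RecordingLog implements AuditLog {    // the hand-rolled test double
    final List<String> entries = new ArrayList<>();

    public void record(String entry) {
        entries.add(entry);
    }
}
```

The assertion then lands on the messages sent, not on any state of the subject - which is exactly the emphasis on interfaces between the subject and its collaborators described above.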

Mackinnon 2008 reported that extending the test subject with testing responsibilities was "compromising too far", and John Nolan had therefore challenged the team to seek other approaches.

Reflecting in 2010, Steve Freeman observed: 

The underlying issue, which I only really understood while writing the book, is that there are different approaches to OO. Ours comes under what Ralph Johnson calls the "mystical" school of OO, which is that it's all about messages. If you don't follow this, then much of what we talk about doesn't work well. 

Similarly, Nat Pryce:

There are different ways of designing how a system is organised into modules and the interfaces between those modules....  Mock Objects are designed for test-driving code that is modularised into objects that communicate by "tell, don't ask" style message passing.

My take, today, is still in alignment with the mockists: the TDD of the London school is the same TDD as everybody else: controlling the gap between decision and feedback, test first with red green refactor, and so on.

The object designs are different, and so we also see differences in the test design - because tests should be fit for purpose.