Cascade Failure: August 2023

Tuesday, August 29, 2023

TDDbE: Make It

We can't mark our test for $5 + $5 done until we've removed all of the duplication.

Huh. Because I have really felt like we've marked earlier tests as done before the duplication is removed.

We don't have code duplication, but we do have data duplication....

My experience is that people often miss this duplication - not necessarily in the long term, but certainly in the short term. Data duplication gives you "permission" to do a lot of the refactoring early (after first test), rather than deferring it until later.

In this case, why 10? Because there are two 5s -- great, so "let's make the code say that".

Instead... ugh, Beck's example really seems to be going off the rails here. Sums with augends and addends? Who ordered that?

In passing, I do think it is important to notice that the details of the implementation are bleeding into the tests. These are programmer tests - they are there to measure the interactions of the internal elements in our module. We aren't restricted to only testing via the "published" API.

Wrote a test to force the creation of an object we expected to need later.

This has been a peeve for a long time - a lot of the mythology of test driven development, especially early on, centered the fact that the design arrives organically from the needs of the tests. At the same time, practitioners were constantly guessing at designs ("I'm going to need a class that does/represents mumble") and then creating a bunch of scaffolding around their guess, rather than attacking the problem that they actually have in front of them right now.

I don't think there's anything wrong, necessarily, with having a mental model of how the code should read in the end, and creating a sequence of tests to get there. I do think, though, that if that's what we are doing we really ought to be upfront about it.

Monday, August 28, 2023

TDDbE: Addition, Finally

Beck introduces the idea of rewriting the TODO list as both a warmup exercise, and as permission to clean up the little items first rather than copying them to the new list.

I'm not sure how to write the story of the whole addition, so we'll start with a simpler example

Again, the lesson of shifting to smaller steps as a countermeasure to uncertainty; accompanied by my usual interjection that baby-footin' is optional, not mandatory. Have the skill to work at multiple cadences and shift between them, then use the cadence that is appropriate at the time.

I hope you will see through this how TDD gives you control over the size of the steps.

Vindication!

TDD can't guarantee that we will have flashes of insight at the right moment. However, confidence-giving tests and carefully factored code give us preparation for insight, and preparation for applying that insight when it comes.

That said, I think that testSimpleAddition really goes sideways. I think there are two flaws here: first, that the metaphors aren't very good. Second, the experimenting with the metaphor is something that should emerge during refactoring, not during the red task.

What we should be doing is taking the metaphor we understand (adding dollars to dollars), and making that work, then refactoring the underlying implementation to find the working metaphor (encoding into the design our improved understanding of the proper way to think), then lifting the new metaphor into the published API and finally (assuming it makes sense) sunsetting the dollars to dollars metaphor.

In other words:

for each desired change, make the change easy (warning: this may be hard), then make the easy change -- Kent Beck, 2012

This is not a fair fight: I'm writing 10+ years after Beck shared this approach, whereas the work I'm criticizing was written ten years before. If he waited to write the book until he understood everything thoroughly, the book would not exist for me to critique.

Part of the point of this review exercise is to rediscover where we were, in the expectation that we will find past practices that should be improved.

Thursday, August 24, 2023

TDDbE: The Root of All Evil

A quick lesson on how to get rid of a pair of classes that shouldn't have been introduced in the first place.

One thing stands out here: he doesn't do a lot of work to verify that the tests he wants to get rid of are redundant. Think, zap, done. And here, that's the right thing (toy problem, not too much to keep in your head, and of course it happens the tests are actually unnecessary).

Of course, if we are deleting tests that are passing, there's no problem in the sense that the production code is still right. The potential difficulty would be if we continue to assume that some part of the code is covered by tests when in fact it is not, and then start working on that code expecting to be afforded the same protections we had when the test was in place, and start making mistakes.

Sunday, August 20, 2023

TDDbE: Interesting Times

Whoa! Code without a test? Can you do that?

This is a strange little digression, on two points.

First, the error messages for the equality tests are something that should have been observed when the equality tests were created during RED. That would have been the obvious time to notice the messages and make improvements to them. Perhaps addressing the issue then would have made the earlier chapter more confusing.

Second, the change to the output message doesn't actually clarify the problem - Kent calls attention to this in the text, of course, but I'm amused that the extra work thrown in at this point made the failure harder to recognize, rather than easier.

Kent decides to back out the change and start over: here, he calls it the conservative approach. Fifteen years later, he'll introduce test && commit || revert -- a way of committing to the conservative approach.

With the tests you can decide whether an experiment would answer the question faster. Sometimes you should just ask the computer.

I like the experiment language quite a bit.

I'm beginning to feel that the main lesson up to this point is something along the lines of: be flexible until you know what you are doing.

Friday, August 18, 2023

TDDbE: Times We're Livin' In

The answer in my practice is that I will entertain a brief interruption, but only a brief one, and I will never interrupt an interruption.

That's a useful constraint right there.

This is the kind of tuning you will be doing constantly with TDD. Are the teeny-tiny steps feeling restrictive? Take bigger steps. Are you feeling a little unsure? Take smaller steps. TDD is a steering process -- a little this way, a little that way. There is no right step size, now and forever.

Yet another definition of TDD. Heh.

As usual, I'm not particularly fond of the currency test; again, it lacks connection to the business needs. Expressed another way, there's no assurance that the test, as written, is a good example for future clients because we aren't actually looking at the needs of the client. We're just sort of assuming that we're going to end up with micro-methods eventually anyway, so we might as well proceed to them without passing go.

Sunday, August 13, 2023

TDDbE: Makin' Objects

The two subclasses of Money aren't doing enough work to justify their existence, so we would like to eliminate them.

Minor complaint: that's a pretty clear conclusion to reach as soon as you start thinking about them. Again, my charitable interpretation is that we are witnessing a demonstration of how to climb out of a hole that you probably shouldn't have dug in the first place.

And that's perhaps concerning, in so far as we could have had a lot more coupling to the "wrong" interface if we hadn't discovered the improvement as early as we did.

The first step is interesting - replacing the direct invocation of the object constructors with factory methods (aka "named constructors"). That's a good idea often enough that I wouldn't have minded terribly if the "perfect interface" we imagined in chapter one had used them.

Kevlin Henney once observed that if your program is really object oriented, then the only place you would see the name of a concrete class is at the point that the constructor is invoked - everything else would be interfaces or (rarely) abstract classes.
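A minimal sketch of the factory-method step, assuming an abstract `Money` parent with an integer `amount`; it is an illustration, not the book's listing. Callers ask `Money` for a dollar or a franc, so the concrete class names appear only where the constructors are invoked.

```java
// Illustrative sketch; the exact shape of these classes is an assumption.
abstract class Money {
    final int amount;

    Money(int amount) {
        this.amount = amount;
    }

    // Factory methods ("named constructors"): callers never write `new Dollar`.
    static Money dollar(int amount) {
        return new Dollar(amount);
    }

    static Money franc(int amount) {
        return new Franc(amount);
    }

    abstract Money times(int multiplier);
}

final class Dollar extends Money {
    Dollar(int amount) {
        super(amount);
    }

    @Override
    Money times(int multiplier) {
        return new Dollar(amount * multiplier);
    }
}

final class Franc extends Money {
    Franc(int amount) {
        super(amount);
    }

    @Override
    Money times(int multiplier) {
        return new Franc(amount * multiplier);
    }
}
```

A client written as `Money five = Money.dollar(5);` depends only on `Money`, which is what makes the later removal of the subclasses cheap.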

Friday, August 11, 2023

TDDbE: Apples and Oranges

We would like to use a criterion that makes sense in the domain of finance, not in the domain of Java's objects. But we don't currently have anything like a currency, and this doesn't seem to be sufficient reason to introduce one.

Working outside in, we'd probably have one. After all, the multi-currency report shown in Chapter 1 includes ISO 4217 currency codes.

Many kinds of values will want to have both an amount and unit, and currency code is the logical expression of unit in this domain.

The thought occurs to me: are we floundering about because we don't actually know what we are doing (or are pretending that we don't know what we are doing?)

Thursday, August 10, 2023

TDDbE: Equality for All, Redux

I tried this already

I'm not entirely sure how I want to interpret this remark. On the one hand, spikes are a thing, and we might reasonably run some quick experiments in a sandbox to guess which approach we would prefer to use. On the other hand, cooked examples are a thing; demonstrating a practiced solution is not nearly as compelling as demonstrating an unpracticed solution.

I'm not fond of this demonstration at all.

In particular, it's hard to tell at this point whether we're supposed to be learning that this is the way to do it, or whether this is intended to be some flavor of "even though we started badly, we got there in the end."

Maybe it's a lesson about choice - introducing the complexity of Francs before we've invested too much in Dollars reduces the cost of change, with the penalty of having more "not yet done" things to keep track of?

It's hard for me to shake the notion that the viability of this approach is tightly coupled to the fact that this is a toy problem.

Wednesday, August 9, 2023

TDDbE: Franc-ly Speaking

A copy paste chapter - I almost feel that I should copy a previous post, just to stay on theme.

The different phases have different purposes. They call for different styles of solution, different aesthetic viewpoints.

In this case: copy-and-hack is an option. OK, that doesn't bother me too much. Why copy and hack instead of some other idea and hack? Well, it's easy to type? I suppose that's reason enough.

If we weren't willing to copy-and-hack, I suppose the alternatives would be to refactor toward a more flexible design first, then leverage that into a more dignified introduction of the new functionality.

It's probably also worth noting one reason that copy-and-hack is a fast option in wall clock terms: there's not a lot of code here yet - less than 15 lines if we are willing to ignore the white space. So 10 out of 10 for changing course while it's still easy to do.

I think the "eliminate duplication" message would be stronger if it didn't carry over from chapter to chapter.

Tuesday, August 8, 2023

TDDbE: Privacy

So having invested in an equals method, Beck now rewrites the multiplication test to use it. At this point, the tests for both multiplication and equality are completely decoupled from the data structures hidden behind the Dollar facade.

In effect, he takes an approach similar to the "unit test" approach described by Beizer - if Dollar::equals is well tested, then we can have the same confidence in that method that we do in the standard library, and therefore we can use it in testing other methods.

(I wouldn't call Dollar::equals well tested yet, of course. It still doesn't satisfy the general contract of Object::equals. "This is a risk we actively manage in TDD.")

In addition, since the tests are only using the facade, the implementation details can be made private, reducing future costs should we decide later that the underlying data representation should change - a la Parnas 1971.

Kent seems to be much happier with the clarity of the multiplication test here, but I'm not certain that this improvement is worth the emphasis.

Fundamentally, my issue is this: Dollar::equals is unmotivated. When we review the report we have in chapter 1, what do we find? We need to be able to "display" Dollars, and sum a column of Dollars, and multiply a price (Dollars / share) and number of shares to determine a position. The report tells us that presentation and arithmetic have business value.

But I see nothing in the report that indicates that logic has value. Equality isn't a thing that we need to satisfy a customer need right now.

Instead, the demand for equality came about because we wanted a "value object", and because we wanted test design that looked nice. Which feels to me as though we're chasing our own tail down a rabbit hole.

Expressed another way - by using equality, we've decoupled the current array of tests from the amount. However, this is unsatisfactory because some representation of amount is precisely the thing we need to get out of the Dollar object when we are producing the report.

The representation in the report is effectively part of the human interface, and we should be, perhaps, reluctant to couple our tests too tightly to that interface ("abstractions should not depend upon details"), but we do need some abstraction of amount to produce a useful report.

A black box that accepts integers and returns booleans does not satisfy.

What this reminds me of: early TDD exercises where the actual business value was delivered at the end, when all of the design-in-the-mind was finally typed into the computer, as opposed to ensuring the value first, then refactoring until there are no more improvements to be made.

Monday, August 7, 2023

TDDbE: Equality for All

If I went through these steps, though, I wouldn't be able to demonstrate the third and most conservative implementation strategy: Triangulation.

And honestly: I kind of wish he hadn't done that.

Some background - early after Kent shared the first draft of the book that eventually became Test Driven Development by Example, an objection was raised: there should be a second test to "drive" the refactoring that Kent had demonstrated in the multiplication example. Until the second test is introduced, "You Aren't Going to Need It" should apply. Furthermore, in Kent's approach, during the refactoring task, he makes changes that don't actually preserve the behavior; a function that ignores its inputs is converted into a function that is sensitive to its inputs.

So a bunch of things all get tangled up here. And when the dust clears, triangulation ends up with a lot more ink than is strictly justified, remove duplication gets a lot less ink, and a bunch of beginners fall into a trap (confession - I was one who did fall in, back in the day).

Kent's earliest remarks on triangulation emphasize that he doesn't find it particularly valuable. And certainly he didn't come up with a particularly well motivated example when introducing it.

The metaphor itself isn't awful - sometimes you do need more information to be certain of your course - it's just overemphasized.

For example: it takes up this entire chapter.

I find the motivation for implementing Dollar::equals to be a little bit underwhelming.

If you use Dollars as the key to a hash table, then you have to implement hashCode() if you implement equals().

This rather suggests that we should be writing a check that demonstrates that Dollar can be used as a hash key.
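Such a check might look something like the sketch below. The `HashKeyCheck` helper and its method name are hypothetical, and the `equals`/`hashCode` pair shown is one conventional way to write them, not the implementation from the book.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch; not the book's Dollar.
final class Dollar {
    private final int amount;

    Dollar(int amount) {
        this.amount = amount;
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        // instanceof is false for null, so no separate null check is needed.
        if (!(other instanceof Dollar)) {
            return false;
        }
        return amount == ((Dollar) other).amount;
    }

    // Equal Dollars must report equal hash codes, or HashMap lookups fail.
    @Override
    public int hashCode() {
        return Integer.hashCode(amount);
    }
}

// Hypothetical helper: the "check" that a Dollar works as a hash key.
final class HashKeyCheck {
    static boolean usableAsHashKey() {
        Map<Dollar, String> labels = new HashMap<>();
        labels.put(new Dollar(5), "five");
        // A distinct but equal Dollar should find the same entry.
        return "five".equals(labels.get(new Dollar(5)));
    }
}
```

If `hashCode` were left at the default inherited from `Object`, the lookup with a second `new Dollar(5)` would normally miss, which is the behavior a check like this is meant to catch.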

The "implementation" of Dollar::equals really leaves me twitching (see Effective Java by Joshua Bloch) - this is an override that doesn't match the semantics of the parent class (yet). I'd want to add the additional checks needed to finish this implementation before moving on to some other more interesting thing.

Sunday, August 6, 2023

TDDbE: Degenerate Objects

Write a test.
Make it run.
Make it right.

Again, "get to green quickly" implying that we are going to transition from the design we have to the design we want via many more much smaller steps.

Beck offers "architecture driven development" as a label for the opposite approach, where implementing the clean design happens before we can know whether the clean design will in fact work the problem. I don't love the name, and to be honest I'm not sure there's that much advantage in pretending that we don't know what the final design is going to look like.

I might argue as well that "invent the interface you wish you had" sounds to me a lot like solving the clean code part first. If we can train our brains to do that well, why not the other? Of course, Kent immediately takes the same view, but reversed:

Our guesses about the right interface are no more likely to be perfect than our guesses about the right implementation.

But fine - sometimes we have to go back and change up the interface -- which is one reason that we implement a new failing test only when there are no other failing tests; we don't want to make extra work for ourselves by over committing when we know the least about our final design.

Something I notice: for this second chapter, Beck extends his original test into a more complicated scenario, rather than introducing a new test. That's fine, I guess - I'm not sure I remember when disciplined assertions became a thing.

The longer I do this, the better able I am to translate my aesthetic judgments into tests. When I can do this, my design discussions become much more interesting.

What catches my attention here is the sequence: judgment, test, implementation. In this example, Kent doesn't wait until a mistake "discovers" that a different design would be easier to work with in the long term, but instead motivates his code change via "that doesn't work how I want it to". And without a doubt, I think treating this Dollar implementation as an immutable value is reasonable.

But there's certainly no "the test told us to do it" here. This design change comes about because we're all carrying memories of designs that made things unnecessarily difficult, and we have no particular interest in repeating the experience.
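The judgment in question can be sketched as follows. `MutableDollar` is a hypothetical name for the version with the side effect; both classes are illustrations under that assumption rather than the book's listings.

```java
// Illustrative sketch; MutableDollar is a hypothetical name for the "before" design.
final class MutableDollar {
    int amount;

    MutableDollar(int amount) {
        this.amount = amount;
    }

    // Side effect: each call changes the receiver, so results depend on call history.
    void times(int multiplier) {
        amount *= multiplier;
    }
}

// The "after" design: an immutable value.
final class Dollar {
    final int amount;

    Dollar(int amount) {
        this.amount = amount;
    }

    // No side effect: the receiver is untouched and a new value is returned.
    Dollar times(int multiplier) {
        return new Dollar(amount * multiplier);
    }
}
```

With the mutable version, multiplying five by 2 and then by 3 leaves 30 behind; with the value version, `five.times(2)` and `five.times(3)` give 10 and 15 while `five` stays 5.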

Thursday, August 3, 2023

TDDbE: Multi-Currency Money

Alright, chapter one, and we're introduced to the process of writing the first test

... we imagine the perfect interface for our operation. We are telling ourselves a story about how the operation will look from the outside. Our story won't always come true, but it's better to start from the best-possible application program interface (API) and work backward

Which I find interesting on two fronts - first, because where does this understanding of perfection come from? and second, because immediately after his first example he itemizes a number of ways in which the provided interface is unsatisfactory.

Another aspect to this example that I find unsatisfactory is that the presentation is rather in medias res. Beck is essentially working from the WyCash context - we already have a portfolio management system, and there's a gap in the middle of that system that we want to fill in. Or if you prefer, a system where the simpler calculations are in place, but we want to refactor those calculations into an isolated module so that we might more easily make the new changes we want.

So we might imagine that we already have some code somewhere that knows how to multiply `dollars(5)` by `exchangeRate(2)`, and so on, and what we are trying to do is create a better replacement for that code.

I'm not entirely satisfied with this initial interface for this case, however - it rather skips past the parts of the design where we take the information in the form we have it and express it in the form that we need it. In this case, we're looking at the production of a report, where the inputs are some transient representation of our portfolio position, and the outputs are some transient representation of the report.

In effect, `new Dollar` is a lousy way to begin, because the `Dollar` class doesn't exist yet, so the code we have can't possibly be passing information to us that way.

I don't think it matters particularly much, in the sense that I don't think that the quality of the design you achieve in the end is particularly sensitive to where you start in the solution. And certainly there are a number of reasons that you might prefer to begin by exploring "how do we do the useful work in memory" before addressing the question of how we get the information we need to the places we need it.

Another quibble I have about the test example (although it took me many years to recognize it) is that we aren't doing a particularly good job about distinguishing between the measurement and the application of the "decision rule" that tells us if the measured value is satisfactory.

Moving on: an important lesson

We want the bar to go green as quickly as possible

The green task should be evaluated in wall clock time - we want less of that, because we want the checks in place when we are making mistakes.

A riddle - should TDD have a bias towards designs that produce quick greens, and if so is that a good thing?

(Isolating the measurement affords really quick greens via guard clauses and early returns. I'll have to think more on that.)

Once again, I notice that Kent's example racks up four compile errors before he starts working toward the green bar, where "nanocycle TDD" would encourage prioritizing the green bar over finishing the test. I'm not a big fan of nanocycle, myself, so I like having this evidence in hand when challenged.

We need to generalize before we move on.

You can call it generalizing, or you can call it removing duplication, but please notice that Kent is encouraging that we perform cleanup of the implementation before we introduce another failing test.

(There is, of course, room to argue with some of the labels being used - generalizing can change behaviors that we aren't constraining yet, so is it really "refactoring"? Beck and Fowler aren't on the same page here - I think Kent addresses this in a later chapter.)

By eliminating duplication before we go on to the next test, we maximize our chances of being able to get the next test running with one and only one change.

Ten years later, this same advice

for each desired change, make the change easy (warning: this may be hard), then make the easy change

How much work we have to do to make the change easy is likely an interesting form of feedback about the design we start with; if it's often a lot of work, maybe our design heuristics need work.

The final part of the chapter is an important lesson on duplication that I don't think really got the traction that it should have. Here, we have a function that returns 10 -- but there are lots of functions in the world that return 10, so we should do the work to make it clear why this specific function returns 10.
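A minimal sketch of that lesson, assuming a `Dollar` with an integer `amount` that is multiplied in place. The name `timesFaked` is mine, for illustration, and this is not the book's listing. The faked version passes a test for $5 * 2, but its 10 merely repeats data that already lives in the test; the second version says why the answer is 10.

```java
// Illustrative sketch; timesFaked is a hypothetical name.
final class Dollar {
    int amount;

    Dollar(int amount) {
        this.amount = amount;
    }

    // Faked: satisfies a test for $5 * 2, but the literal 10 duplicates
    // the 5 and the 2 written in the test, and ignores both inputs.
    void timesFaked(int multiplier) {
        amount = 10;
    }

    // Duplication removed: the result is derived from the amount and the multiplier.
    void times(int multiplier) {
        amount *= multiplier;
    }
}
```

Both versions produce 10 for $5 * 2; only the second produces 15 for $5 * 3, because only the second records where the 10 came from.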

(Heh: only now do I notice that we don't actually reach the multi-currency bits hinted at by the chapter title.)

Wednesday, August 2, 2023

TDDbE: Introduction

The introduction is an interesting story of technical debt: the work on improving the internal quality of the code, and in particular the outsourced `Dollar` object, had been on-going for six months before the arrival of the business opportunity that needed the improved design.

(The debt metaphor first appears in Ward's 1992 experience report, which describes this same portfolio management system).

The transition from testing that computations are within tolerance to testing that computations are exact hides an important lesson: code changes that relocate rounding within a calculation are not refactorings in domains where the calculations are regulated, and you do want tests that are sensitive to those changes if refactoring is a core value.
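A small worked example of why relocating rounding is a behavior change. The class name, method names, and input values here are invented for illustration and have nothing to do with the actual WyCash calculations; the only assumption is that amounts are rounded to cents with `RoundingMode.HALF_UP`.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.List;

// Illustrative sketch; names and rounding policy are assumptions.
final class RoundingPlacement {
    // Round every line item to cents, then add the rounded values.
    static BigDecimal roundEachThenSum(List<BigDecimal> values) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal value : values) {
            total = total.add(value.setScale(2, RoundingMode.HALF_UP));
        }
        return total;
    }

    // Add the exact values, then round the total to cents once.
    static BigDecimal sumThenRound(List<BigDecimal> values) {
        BigDecimal total = BigDecimal.ZERO;
        for (BigDecimal value : values) {
            total = total.add(value);
        }
        return total.setScale(2, RoundingMode.HALF_UP);
    }
}
```

For three line items of 0.335, `roundEachThenSum` gives 1.02 (each item rounds up to 0.34) while `sumThenRound` gives 1.01 (the exact total 1.005 rounds once). A test that only checks "within tolerance" can miss that one cent shift; a test that checks the exact value cannot.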

In a broader sense, we're also getting a statement of good design; a design is good if it is ready to be changed.

I suspect that the story takes place circa 1992, so of course the team isn't doing "TDD", and there's no particular indication that the team is taking a test first approach. The promise here is really that TDD can bring you to this place that the Wyatt team got to another way (not too different though -- "if you assume all the good ideas in here are Ward's" and all that).

We also get our first look at the two simple rules

Write a failing automated test before you write any code.
Remove duplication.

The first of these, I think, fails, in the sense that it describes an XP approach where you turn up the dials to eleven to see how it works. And it works a lot, but it doesn't work eleven - the costs of writing automated tests are not uniform, and expensive tests against unstable specifications are generally a miserable experience for everybody.

Regarding the second rule, it's probably worth noting that, at the time, Kent had a reputation for being something of a wizard at recognizing duplication, and pursuing a sequence of refactorings that would eventually make the duplication obvious to everyone else, and finally taking steps to remove it.

I suspect that there is a similar skill, in which we take two things that look to be the same and improve the design until it becomes clear that they are not, in fact, duplicates.

Remove duplication hints at the four rules of simple design. One of the questions I'm still carrying around is whether the rules of simple design are part of the TDD bundle -- are we still "doing TDD" if we replace simple design with some other collection of design heuristics?