Thursday, January 10, 2019

Why I Practice TDD

TL;DR: I practice because I make mistakes; each mistake I recognize is an opportunity to learn, and thereby to avoid making that mistake where it would be more expensive to do so.

Last night, I picked up the Fibonacci kata again.

I believe it was the Fibonacci kata where I finally got my first understanding of what Kent Beck meant by "duplication", many years ago.  So it is something of an old friend.  On the other hand, there's not a lot of meat to it - you go through the motions of TDD, but it is difficult to mine for insights.

Nonetheless, I made three significant errors, absolute howlers that have had me laughing at myself all day.

Error Messages

Ken Thompson has an automobile which he helped design. Unlike most vehicles, it has neither a speedometer, nor gas gauge, nor any of the other numerous idiot lights which plague the modern driver. Rather, if the driver makes a mistake, a giant "?" lights up in the center of the dashboard. "The experienced driver," says Thompson, "will usually know what's wrong."

In the early going of the exercise, I stopped to review my error messages, writing up notes about the motivations for doing them well.

Mid exercise, I had a long stare at the messages.  I was in the middle of a test calibration, and I happened to notice a happy accident in the way that I had made the test(s) fail.

But it wasn't until endgame that I finally discovered that I had transposed the expected and actual arguments in my calls to the Assertions library.

The contributing factors -- I've abandoned TestNG in favor of JUnit5, but my old JUnit habits haven't kicked back in yet.  While all the pieces are still fresh in my mind, I don't really see the data.  I just see pass and fail, and a surprise there means revert to the previous checkpoint.
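The transposition matters because the failure message labels each argument. Here's a minimal sketch (the helper is mine, not the JUnit source) of why swapping the arguments sends you hunting in the wrong place -- JUnit 5's Assertions.assertEquals takes (expected, actual), while TestNG's takes (actual, expected):

```java
// A minimal sketch of the failure-message shape; not the actual JUnit code.
public class ArgumentOrder {
    static String failureMessage(Object expected, Object actual) {
        return "expected: <" + expected + "> but was: <" + actual + ">";
    }

    public static void main(String[] args) {
        int expected = 55;  // the known Fibonacci value
        int actual = 34;    // what the buggy implementation returned

        // Correct JUnit 5 order: the message points at the implementation.
        System.out.println(failureMessage(expected, actual));
        // Transposed (an old TestNG habit): the message blames the wrong side.
        System.out.println(failureMessage(actual, expected));
    }
}
```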

Fence Posts

The one really interesting point in Fibonacci is the relationship between a recursive implementation and an iterative one.  A recursive implementation falls out pretty naturally when removing the duplication (as I learned from Beck long ago), but as Steve McConnell points out in Code Complete: recursion is a really powerful technique and Fibonacci is just an awful application of it.

What recursion does give you is a lovely oracle that you can use to check your production implementation.  It's an intermediate step along the way.
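The oracle pattern might look something like this (names and the property loop are my own sketch): the slow recursive transcription of the definition stays around to check the faster iterative candidate.

```java
// The obviously-correct recursive definition acts as an oracle for the
// iterative implementation that will actually ship.
public class FibonacciOracle {
    // The oracle: slow, but a direct transcription of the definition.
    static long recursive(int n) {
        return n < 2 ? n : recursive(n - 1) + recursive(n - 2);
    }

    // The production candidate: iterative, O(n).
    static long iterative(int n) {
        long a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            long next = a + b;
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        // Property: the two implementations agree on every small input.
        for (int n = 0; n <= 20; n++) {
            assert iterative(n) == recursive(n) : "disagree at n=" + n;
        }
        System.out.println("iterative matches the recursive oracle for n in [0,20]");
    }
}
```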

In "refactoring" away from the recursive implementation, I managed to introduce an off-by-one error in my implementation.  As a result, my implementation was returning a Fibonacci number, just not the one it was supposed to.

Well, that happens, right? You run the tests, they go red, and you discover your mistake.

Not. So Much.

I managed to hit a perfect storm of errors.  At the time, I had failed to recall the idea of using an oracle, so I didn't have that security blanket.  I went after the recursion as soon as it appeared, so I was working in the neighborhood of fibonacci(2), which is of course equal to fibonacci(1).

My other tests didn't catch the problem, because I had tests that measured the internal consistency of the implementation, rather than independently verifying the result.  That is one of the hazards of testing an implementation "in the calculus of itself."  Using property tests was "fine", but I needed one more unambiguous success before relying on them.

Independent Verification

One problem I don't think I've ever seen addressed in a Fibonacci kata is integer overflow.  The Fibonacci series grows roughly geometrically.  Java integers have a limited number of bits, so ending up with a number that is too big is inevitable.

But my tests kept passing - well past the point where my estimates told me I should be seeing symptoms.

The answer?  In Java, integer operators do not indicate overflow or underflow in any way (JLS §4.2.2).

And that's just as true in the test code as it is in the production code.  In precisely the same way that I was only checking the internal consistency of my own algorithm, I also fell into the trap of checking the internal consistency of the plus operator.

There are countermeasures one can take -- asserting a property that the returned value must be non-negative, for instance.
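A sketch of both the hazard and a countermeasure: the plain + operator wraps silently, while Math.addExact (available since Java 8) throws ArithmeticException on overflow. The fib helpers here are my own illustration.

```java
// Silent wraparound versus explicit overflow detection.
public class OverflowCheck {
    static int fibWrapping(int n) {
        int a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            int next = a + b;  // wraps silently past Integer.MAX_VALUE
            a = b;
            b = next;
        }
        return a;
    }

    static int fibExact(int n) {
        int a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            int next = Math.addExact(a, b);  // throws ArithmeticException on overflow
            a = b;
            b = next;
        }
        return a;
    }

    public static void main(String[] args) {
        // fib(47) = 2971215073 does not fit in an int; the wrapping version
        // quietly returns a negative number -- hence the non-negative property.
        System.out.println(fibWrapping(47) < 0);  // true
        try {
            fibExact(47);
        } catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
    }
}
```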

Conclusions

One thing to keep in mind is that production code makes a really lousy test of the test code, especially in an environment where the tests are "driving" the implementation.  The standard for test code should be that it obviously has no deficiencies, which doesn't leave much room for creativity.

Monday, January 7, 2019

A study in ports and adapters

I've got some utilities that I use in the interactive shell which write representations of documents to standard out.  I've been trying to "go fast to go fast" with them, but that's been a bit rough, so I've been considering a switch to a "go slow to go smooth" approach.

My preference is to work from the outside in; if the discipline of test driven development is going to be helpful as a design tool, then we should really be allowing it to guide where the module boundaries belong.

So let's consider an outside in approach to these shell applications.  My first goal is to establish how I can disconnect the logic that I want to test from the context that it runs in.  Having that separation will allow me to shift that logic into a test harness, where I can measure its behavior.

In the case of these little shell tools, there are three "values" I need to think about.  The command line arguments are one, the environment that the app is executing in is a second, and the output effect is the third.  So in Java, I can be thinking about a world that looks like this:


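The original snippet isn't reproduced here, but the surrounding text names the pieces: TheApp::main as the adapter and TheApp::f as the port, with the three values as explicit parameters. A hedged reconstruction along those lines:

```java
import java.io.PrintStream;
import java.util.Map;

// The adapter (main) connects the Java runtime to the port (f); the body
// of f is a stand-in of my own, not the original application logic.
public class TheApp {
    public static void main(String[] args) {
        f(args, System.getenv(), System.out);
    }

    // The port: arguments, environment, and the output effect all arrive
    // as explicit parameters.
    static void f(String[] args, Map<String, String> env, PrintStream out) {
        out.println("hello, " + (args.length > 0 ? args[0] : "world"));
    }
}
```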
In the ports and adapters lingo, TheApp::main is taking on the role of an adapter, connecting the java run time to the port: TheApp::f.

For testing, I don't want effects, I want values.  More specifically, I want values that I can evaluate for correctness independently from the test subject.  And so I'm aiming for an adapter that looks like a function.

So I'll achieve that by providing a PrintStream that I can later query for information.

In my tests, I can then verify, independently, that the array of bytes returned by the adapter has the correct properties.
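Concretely, the test-side adapter might look like this (the names are mine): effects go in, a plain value comes out, and that value can be inspected with ordinary assertions.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.util.Map;

// Wrapping the port in a function from inputs to the bytes it would have
// written; the port body is an illustrative stand-in.
public class TestAdapter {
    // The port under test.
    static void port(String[] args, Map<String, String> env, PrintStream out) {
        out.println("args: " + args.length);
    }

    // The adapter: looks like a function, returns a value instead of an effect.
    static byte[] run(String[] args, Map<String, String> env) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        port(args, env, new PrintStream(buffer));
        return buffer.toByteArray();
    }

    public static void main(String[] argv) {
        byte[] output = run(new String[] {"a", "b"}, Map.of());
        // The returned bytes can now be checked independently of the subject.
        System.out.println(new String(output).trim());  // args: 2
    }
}
```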

Approaching the problem from the outside like this, I'm no longer required to guess about my modules -- they naturally arise as I identify interesting decisions in my implementation through the processes of removing duplication and introducing expressive names.

Instead, my guesses are about the boundary itself: what do I need from the outside world? How can I describe that element in such a way that it can be implemented completely in process?

Standard input, standard output, and standard error are all pretty straightforward in Java, which somewhat obscures the general case.  Time requires a little bit of thought.  Connecting to remote processes may require more thought still -- we probably don't want to be working down at the level of a database connection pool, for example.

We'll often need to invent an abstraction to express each of the roles that are being provided by the boundary.  A rough cut that allows the decoupling of the outer world gives us a starting point from which to discover these more refined boundaries.




Wednesday, January 2, 2019

Don't Skip Trivial Refactorings

Summary: if you are going to use test driven development as your process for implementing new features, and you don't want to just draw the owl, then you need to train yourself to refactor aggressively.

My example today is Tom Oram's 2018-11-11 essay on Uncle Bob's Transformation Priority Premise.  Oram uses four tests and seven distinct implementations of even?

If he were refactoring aggressively, I think we would see three implementations from two tests.
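The exact sequence is my guess rather than anything taken from Oram's essay, but "three implementations from two tests" might play out like this, with the final step being pure refactoring -- removing the data duplication between test and code rather than triangulating from a third test:

```java
// A hypothetical aggressive-refactoring sequence for even?
public class Even {
    // Test 1: isEven(2) is true.   Implementation 1: return true;
    // Test 2: isEven(3) is false.  Implementation 2: return n == 2;
    // Refactoring (remove the duplicated data shared by tests and code):
    // Implementation 3, with no third test required:
    static boolean isEven(int n) {
        return n % 2 == 0;
    }

    public static void main(String[] args) {
        System.out.println(isEven(2) && !isEven(3));  // true
    }
}
```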

Kent Beck, in the face of objections to such a refactoring under the auspices of "removing duplication", wrote:
Sorry. I really really really am just locally removing duplication,
data duplication, between the tests and the code. The nicely general formula
that results is a happy accident, and doesn't always appear. More often than
not, though, it does.

No instant mental triangulation. No looking forward to future test cases.
Simple removal of duplication. I get it now that other people don't think
this way. Believe me, I get it. Now I have to decide whether to begin
doubting myself, or just patiently wait for most of the rest of the world to
catch up. No contest--I can be patient.
At that time, Beck was a bit undisciplined with his definition of "refactoring".  Faced with the same problem, my guess is that he would have performed the constant -> scalar transformation during his "refactoring" step.  If there isn't a test, then it's not observable behavior?

Personally, I find that the disciplined approach reduces error rates.


Wednesday, December 26, 2018

Study: SimpleDateFormat

I got an interesting lesson in code duplication today.

In reviewing some of our legacy code, I happened across this simple-looking piece of code:

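The actual snippet and format strings from the codebase aren't shown in the post; something along these lines is representative (the format string and names here are mine):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

// A stand-in for the kind of code in question: a format string passed
// directly to the SimpleDateFormat constructor, inline at the call site.
public class Formatting {
    static String timestamp(Date when) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss");
        format.setTimeZone(TimeZone.getTimeZone("UTC"));
        return format.format(when);
    }

    public static void main(String[] args) {
        System.out.println(timestamp(new Date(0L)));  // 1970-01-01T00:00:00
    }
}
```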
and it occurred to me to wonder... how many places are we calling that constructor?

And the answer is: a lot.

A quick scan of the various repositories I had cached locally showed that three different format strings were each being used to initialize the SimpleDateFormat at more than 50 different places in our code.

“Don’t Repeat Yourself” was never about code. It’s about knowledge. -- Mathias Verraes

Could it really be the case that this code was, in fact, DRY? Had we written the code the most effective way we could?

The strongest piece of evidence I have? Timestamps on the changes. The lines of code in question are old. They haven't needed to change in the years since they were written.

Why not? A large chunk of the credit must belong with the fact that SimpleDateFormat is part of the standard library, and the Java team has ensured that functionality has been sufficiently stable for our use.

A secondary possibility is that the date format is taking an in-memory representation of a date-time and converting it into a human-readable representation to be included in a message, or vice versa. The adapters are stable because the message formats have been stable.

Because the message formats are stable, the code in question has very little reason to change. As a consequence, we don't get burned when we take what could be a call into a common method and instead choose to inline it.

What I think this suggests: when/if we decide to update the date formatting and parsing logic, we should be expecting to keep the actual format strings inline. Encapsulating the implementation details is, I think, easier to justify when you are in the middle of an exercise to change those details.

Tuesday, November 6, 2018

Ayende: Fear of an Empty Source File

And yet, probably the hardest thing for me is to start writing from scratch. If there is no code already there, it is all too easy to get lost in the details and not actually be able to get anywhere.  -- Oren Eini
This paralysis is the same thing I try to attack with "the simplest thing that could possibly work".

Dynamic friction encourages more progress than static friction.

Saturday, October 27, 2018

REST, Command Messages and URI Design

I posted the examples above into a discussion about how to send command messages to a domain model.

The domain model is the inner core of the onion. In particular, it shouldn't be directly coupled to HTTP.

The client sends one HTTP request to a single socket, and the listener on that socket translates the HTTP request into a message that the domain model understands. There's no particular advantage, from the point of view of the domain model, that some of the information is provided in the message body, and some in the headers, when it is all being transformed before it reaches the core domain logic.

Consider, for example, a form in a web page. It is a perfectly straightforward exercise to ensure that the application/x-www-form-urlencoded representation of the message is complete by adding a few hidden fields that the consumer doesn't need to worry about. When this is done, the particular URL used in the action really doesn't matter.

This is generally true: the resource identifier doesn't have to support the domain, because we have the message body for that.
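A sketch of the idea (the names are my own): the adapter translates the form-encoded body, hidden fields and all, into a command for the domain model, and the request URI never enters into that translation.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Translating an application/x-www-form-urlencoded body into a command
// message, independent of the URI the form was posted to.
public class CommandAdapter {
    static Map<String, String> parse(String body) {
        Map<String, String> command = new HashMap<>();
        for (String pair : body.split("&")) {
            String[] kv = pair.split("=", 2);
            command.put(
                URLDecoder.decode(kv[0], StandardCharsets.UTF_8),
                URLDecoder.decode(kv.length > 1 ? kv[1] : "", StandardCharsets.UTF_8));
        }
        return command;
    }

    public static void main(String[] args) {
        // A hidden "command" field makes the message self-describing.
        Map<String, String> command = parse("command=placeOrder&quantity=3");
        System.out.println(command.get("command") + " x" + command.get("quantity"));
    }
}
```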

The target resource identified by the request line becomes a free variable.

In HTTP, GET doesn't have a message body. When we request a resource from the server, any parameters that the server requires to discriminate between resources need to be included in the identifier itself. Because HTTP is designed for large-grain hypermedia transfer, there are provisions for the server to provide the client with the appropriate caching semantics.

The real significance of the URI for POST is this -- it gives us the means to invalidate the representations of a specific resource from the client's cache.

Saturday, October 20, 2018

Wumpus RNG and the functional core

I've been experimenting with refactoring the wumpus, and observed an interesting interaction with the functional core.

For illustration, let's consider the shooting of an arrow.  After fighting a bit with the console, the player submits a firing solution.  This we can pass along to the game module, but there are two additional twists to consider.

First: if the firing solution requires a tunnel between rooms that doesn't exist, the arrow instead travels randomly -- and might end up bouncing several more times. 

Second: if the arrow should miss, the wumpus may move to another room.  Again, randomly.

In my refactorings, the basic shape of the method is a function of three arguments: the current state of the game, the firing solution, and some source of "random" numbers.  At this abstraction level, shooting the arrow looks to be atomic, but in some scenarios the random numbers are consumed.

I took some time today to sketch the state machine, and a slightly different picture emerges.  The game is Hunting, and invoking the shootArrow method takes the game to some other state; an incorrectly specified firing solution takes us to a BouncingArrow state, where we invoke the bounceArrow method.  When the arrow has finished bouncing, we may end up in the ArrowMissed state, in which case we wakeWumpus, and discover whether or not the hunter has met his end.
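The state and method names below follow that description; the transition details and the three-exit simplification are my own. The point is that the core consumes "random" numbers through an explicit source, so a deterministic supplier makes every transition reproducible in memory:

```java
import java.util.function.IntSupplier;

// A sketch of the hunting state machine with an injected random source.
public class Wumpus {
    enum State { HUNTING, BOUNCING_ARROW, ARROW_MISSED, GAME_OVER }

    // shootArrow: a firing solution through a nonexistent tunnel sends the
    // arrow bouncing; otherwise (simplified here) the arrow misses.
    static State shootArrow(boolean tunnelExists) {
        return tunnelExists ? State.ARROW_MISSED : State.BOUNCING_ARROW;
    }

    // bounceArrow: consume a "random" number to pick the next room.
    static State bounceArrow(IntSupplier random) {
        int exit = random.getAsInt() % 3;  // pretend: three exits per room
        return exit == 0 ? State.ARROW_MISSED : State.BOUNCING_ARROW;
    }

    // wakeWumpus: consume another "random" number; the wumpus may wander
    // into the hunter's room.
    static State wakeWumpus(IntSupplier random) {
        return random.getAsInt() % 4 == 0 ? State.GAME_OVER : State.HUNTING;
    }

    public static void main(String[] args) {
        // Deterministic "random" sources make the run reproducible.
        State s = shootArrow(false);
        System.out.println(s);          // BOUNCING_ARROW
        s = bounceArrow(() -> 3);
        System.out.println(s);          // ARROW_MISSED
        s = wakeWumpus(() -> 1);
        System.out.println(s);          // HUNTING
    }
}
```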



The internal core knows nothing about sources of data - no command line, no random numbers, just computing transitions from one state to the next. This means that it is very easy to test, everything just sits in memory. The orchestration logic gets to worry about where the data comes from, but is completely decoupled from the core logic of the game.