Saturday, August 10, 2019

Purchase Approval

One problem I've had with Domain Driven Design is coming up with good realistic examples that exhibit the sorts of complexity we need to be thinking about, without getting lost in the weeds.

My latest idea is to try working through a purchase approval in an analog office.

Bob wants the company to pay for something.  So he gets a blank form, and fills in the details, and drops off the form with Alice.  Alice does the work of comparing the details to the current policies, and approving / rejecting the request.  The resolved request is returned to Bob, so that he can act on the decision that has been made.

There are a lot of requests, and checking the details is a lot of work.  So Alice has become a bottleneck.  She offloads some of the work to Terry the intern; Terry does the legwork for requests when the approval doesn't require Alice's domain expertise.

As a proxy for easy, we'll use a trivial condition like "amount less than 100 USD".

The form acts as a sort of lock; an actor in this protocol can only change the paper when they have physical control of it.  So the process is serial, only one person can record information at a time.

That means we need to think more precisely about how the requests are shared by Alice and Terry.  Perhaps all requests go first to Alice, and she passes the easy requests to Terry; or perhaps the requests all go to Terry, and the hard cases are forwarded to Alice.

We can think of the request as a single form, that gets modified.  Alternatively, we can think of an envelope filled with "immutable" documents; each actor adds new paperwork to the envelope.

The process is asynchronous, in this sense - the request can be in Alice's office even though Alice herself is out at lunch, or home sick.  The movement of paper allows the people to communicate, even though they aren't necessarily in the office at the same time.

The paperwork is anemic - all of the domain knowledge is locked in the heads of Alice and Terry (and, to some degree, Bob).  The paperwork is just the bookkeeping.

It's also worth noting that the paper is immutable, in this sense: once the paperwork has left Bob's control, he cannot correct errors until the paperwork is returned to him.

Bob's "view" of this process is the stack of blank forms, and his collection of resolved requests.  Alice and Terry have similar views: stacks of pending requests.

Exercise 1: what changes when we take this process digital?  So instead of physical paperwork moving from place to place, we now have information being copied from one place to another.

Exercise 2: what changes when we extend the digital process to automate Terry's work?

Saturday, August 3, 2019

TDD: Random is Arbitrary

I happened across a Yahtzee kata today. Although I didn't work that exercise today, it got me thinking again about random behaviors.
Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.
If your goal is to write fast, deterministic tests that you can use to detect unintended changes during a refactoring, then you need to treat a random number generator the way you would a clock.  The test itself has to decide which random sequence to provide, and pass it to the test subject.

But there's a second problem, which is this -- if a random choice from a list is an acceptable behavior, then so to must be the same random choice from every permutation of that list.

If we can map a sequence of random numbers to [HEADS, TAILS], and produce from that an acceptable behavior, then it must also be true that mapping the same sequence of random numbers to [TAILS, HEADS] must also be acceptable.  You can replace one with the other any time you want.

But doing that isn't a refactoring; when we make this change, the same inputs produce different outputs, which are measured by the test harness.

The choice of which permutation to use is an implementation detail; any tests that depend on the specific requirements are overfit.

Can you beat this by passing in an ordered list of choices?  Not really, for the same reason - if the permutation of items that you pass in produces an acceptable behavior, then it sill also be acceptable for the code under test to re-order those items before using them.

Anoher related problem is that there are different ways to interact with the random number generator that produce equally acceptable results.  If we want to randomly toss three coins, we can pull three random numbers and project each onto [0,1], and encode the result, or we can pull a single random number, project onto [0,7], and then use a squashed encoding to interpret the result.  If one is valid, then so to is the other -- but they leave the RNG in different states, and therefore we have additional risks of overfitting.

What does this mean?  That simply decoupling the random number generator isn't enough.  We need to be passing the result of the random choice - pass in the coin after it has been flipped, pass in the dice after they have been rolled, pass in the deck after it has been shuffled.

It's not enough to inject the random number generator into the test; you need to leave the arbitrary mapping of the random number to some result value in the imperative shell as well.