Sunday, February 19, 2017

TDD Probability Theory

I recently recommended that a friend review the probability kata, which got me back into it again.

The probability kata was published by Greg Young in June 2011.  The core of the problem statement was:

Write a probability value object. It should contain the following methods:
  • Probability CombinedWith(Probability)
  • Probability Either(Probability)
  • Probability InverseOf()


Well, that certainly seems straightforward enough....

WARNING: Spoilers ahead



Part of the motivation for revisiting the kata myself: I had been looking at property-based testing, and came across Scott Wlaschin's presentation.  When I reached slide #55, I recognized an old friend.

I had already given myself a perverse challenge in the probability kata: to see how far I could really go building out an implementation test first.  The challenge became figuring out what I could test.

Write a probability value object.

What a value object gives you, out of the box, is an interface that allows you to test another reference of the same type for equality.  That's good: it gives us enough traction that we can begin to imagine writing tests against this interface.

Within the frame of property-based testing, equality makes checking identities possible.

So a simple identity that we can check out of the box might look like
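Perhaps something along these lines, where anyProbability() stands in for whatever source of arbitrary values the test harness supplies:

    @Test
    public void aProbabilityIsAlwaysEqualToItself() {
        // reflexivity: the equality contract we get just from being a value
        Probability p = anyProbability();
        assertEquals(p, p);
    }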



Another common property that we consider is interchangeability. Eric Evans wrote:
We don't care which instance we have of a VALUE OBJECT
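In test form, that might come out as checking that two values built the same way can stand in for one another (the constructor call here is just a placeholder for however a Probability actually gets built from some input x):

    @Test
    public void valuesBuiltTheSameWayAreInterchangeable() {
        Probability p = new Probability(x);   // placeholder constructor
        Probability q = new Probability(x);
        assertEquals(p, q);                   // we don't care which instance we have
        assertEquals(q, p);
    }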




The first thing I observed is that the theory of identity with one input gives rise to an analogous theory with two inputs; which is to say, in one case we construct, from an arbitrary starting point, a candidate to verify some identity, and in the other we investigate the class of values similar to our construction.
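Roughly, the two shapes look like this (the constructor and anyProbability() are stand-ins; assumeTrue is the usual JUnit assumption):

    // one input: construction is a direct participant in the check
    Probability candidate = new Probability(x);            // placeholder constructor
    assertEquals(candidate, new Probability(x));

    // two inputs: construction acts as a filter -- we only examine arbitrary
    // values that turn out to be similar to the one we constructed
    Probability constructed = new Probability(x);
    Probability arbitrary = anyProbability();
    assumeTrue(constructed.equals(arbitrary));              // assumption, not assertion
    assertEquals(constructed.hashCode(), arbitrary.hashCode());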


That's about all I can manage from the supposition that Probability is a value.

If we turn our attention to the probability interface, it's probably easiest to start with the query that takes no arguments at all, and treat that as a warm-up for the harder cases.
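For inverseOf(), the natural round trip to try is applying it twice and expecting to land back where we started:

    @Test
    public void invertingTwiceGetsUsBackWhereWeStarted() {
        Probability p = anyProbability();
        assertEquals(p, p.inverseOf().inverseOf());
    }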



I think the appearance of the tautology here is fascinating, and I'm trying to find the right way to give it some lift.

The first refactoring to introduce is one that makes the construction explicit as a concept.
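Something like this -- the details of how a Probability gets built are hidden behind a named step, so the theories can refer to "the construction" directly (the constructor call remains a placeholder):

    // construction, named, so the theories can talk about it directly
    private Probability construct(double seed) {
        return new Probability(seed);   // placeholder: whatever constructor/factory we have
    }

    @Test
    public void invertingTwiceGetsUsBackWhereWeStarted() {
        Probability p = construct(anySeed());
        assertEquals(p, p.inverseOf().inverseOf());
    }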



By teasing out the construction as an explicit concept, the relation between my one-argument theories becomes a bit clearer: in one case construction is a participant in the check, whereas in the other it is being used as a filtering mechanism, supporting broader exploration.

A bit of additional refactoring helps to surface the tautology, perhaps
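Perhaps along these lines, treating the round trip through inverseOf() as just another way of constructing the value:

    Probability original = construct(seed);
    Probability roundTrip = construct(seed).inverseOf().inverseOf();
    // the same shape as a theory we have already checked:
    // two constructions of the same value are equal
    assertEquals(original, roundTrip);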


Perhaps tautology isn't quite right -- but as you can see it's equivalent to a theory that we have already checked. If we want to do that, we should be explicit and choose a more obvious spelling.


At first blush, there's not a lot that you can say about combinedWith() and either(). There is a symmetry in the arguments that we can exploit -- it looks like both of these operations are commutative
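In test form, with p and q being arbitrary probabilities:

    // commutativity: swapping the arguments shouldn't change the answer
    assertEquals(p.combinedWith(q), q.combinedWith(p));
    assertEquals(p.either(q), q.either(p));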


The excitement ebbs.

In truth, it's right around the corner.
The math is surprisingly not the main part of the exercise.
Heh, let's do some anyway.

combinedWith() and either(), together, are not particularly exciting.  But when you combine them with inverseOf(), you unlock De Morgan's laws, which give you extra identities to play with.
the negation of a conjunction is the disjunction of the negations
the negation of a disjunction is the conjunction of the negations
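Restated against this interface, with combinedWith() playing the conjunction, either() the disjunction, and inverseOf() the negation:

    // the inverse of a combination is the either of the inverses
    assertEquals(p.combinedWith(q).inverseOf(),
                 p.inverseOf().either(q.inverseOf()));

    // the inverse of an either is the combination of the inverses
    assertEquals(p.either(q).inverseOf(),
                 p.inverseOf().combinedWith(q.inverseOf()));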




Scott reached a similar conclusion:
we have understood the requirements in a deeper way.

Punchlines

Me? I was really startled when I hit this point.
Use TDD when writing this object.
I've got all these tests, and this deep understanding of the requirements, and this giddy feeling of accomplishment... and a quote-unquote production implementation of the behavior that has no state. Every function in my implementation looks exactly the same
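Roughly this -- no fields, and every method with the same body:

    public Probability combinedWith(Probability other) { return this; }
    public Probability either(Probability other) { return this; }
    public Probability inverseOf() { return this; }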



No joke! My first implementation of each method returned null, which would fail a check as soon as I tried to chain the return value into some other function; I replaced the failing code with the simplest thing that could possibly work, and kept going, and going, and going.

I exaggerate mildly -- equals() doesn't return this, of course, because that would be a compile-time error.  It returns true.  None of these identities, all of which hinge on equality, actually drive an implementation for the equals method.


Turning this observation around, all of our current theories about probability can be satisfied by a singleton type.

Developing a non-trivial implementation of equals requires a theory that introduces distinct values. This might be a postulate
we assert that there exist distinguishable values Boolean.TRUE and Boolean.FALSE

or a constraint on a production rule
given distinguishable A,B, then Probability.from(A) is distinct from Probability.from(B)


In other words, we introduce a factory method, and develop theories about its operation
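A theory about that factory might read like this (always and never are just illustrative names):

    @Test
    public void distinguishableInputsProduceDistinctProbabilities() {
        Probability always = Probability.from(Boolean.TRUE);
        Probability never = Probability.from(Boolean.FALSE);
        assertNotEquals(always, never);
    }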

Notice that the introduction of the factory method doesn't constrain the data model in any way -- we can still use a singleton, or a boolean, or a bitset, or rationals, or reals, or even a binomial opinion under the covers.

That's the good news.  The bad news is that we've introduced a counting problem: if the range of possible inputs exceeds the range of implemented probability values, then an injection is impossible by the pigeonhole principle.  So the theory that distinct inputs produce distinct values lacks a firm foundation.

So we need some sort of specification that allows that assumption, without introducing a phantom requirement.  One hint is that, in floating point math, we rarely want to be thinking about equals, but instead look for tolerance.
Two inputs that differ by more than Δ must produce different values.

We deliberately leave unconstrained the exact granularity required -- we just specify a bound on it.  This doesn't constrain the value type at all, but it does put some restrictions on the factory, which the tests need.  Note that reasonable restrictions don't come from the implementations of the classes, but from the business requirements of the interface: what's the minimum number of digits required to support the use case?  At the API level, we might have multiple functions that return factories, each supporting a distinct specification for the minimum precision of support.  The test, in our case, would state its needs, the entry point would provide a factory, and the test would use the factory to verify its theories (both its theories about the values, and its theories about the factory that it has been assigned).
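In outline, something like this -- the factory type and the entry point are invented names, standing in for whatever the API actually exposes:

    // the test states the precision it needs...
    ProbabilityFactory factory = Probabilities.guaranteeingAtLeastDigits(3);

    // ...and then uses whichever factory it is handed to verify its theories
    Probability p = factory.from(0.125);
    Probability q = factory.from(0.250);
    assertNotEquals(p, q);   // inputs differing by more than the tolerance stay distinct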

Up to this point, the tests have still given us leeway with the implementation of Probability we choose.
The internal state should be held as a decimal.
Well, ain't that a kick in the teeth.  Floating point numbers don't actually satisfy the identity we had created for inverseOf -- they aren't evenly distributed, and that gives us problems when we calculate the inverse of a very small number.

OK, let's try to work this backwards.  If storing probability state as a floating point number is acceptable to the business, then that implies that the equality specification we selected was, in fact, over-fitting.  In much the same way that we previously discussed having some minimum accuracy built into the production rules, we also need to think about a specification of tolerance when checking the identities.
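So the check on the inverseOf round trip might loosen into something like this, where withinTolerance() is an invented helper expressing whatever tolerance the business actually cares about:

    Probability original = factory.from(x);
    Probability roundTrip = original.inverseOf().inverseOf();
    // not exact equality -- just "close enough" by the agreed tolerance
    assertTrue(withinTolerance(original, roundTrip, delta));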


I'm comfortable with that thought -- in this exercise, we were driving toward a goal without a business justification to compare against.  Maybe the real requirement is for something tight, perhaps instead it is for something loose; perhaps with explicit uncertainty built in.
