Cascade Faliure: Ruminations on State

Over the weekend, I took another swing at trying to understand how boundaries should work in our domain models.

Let's start with some assumptions.

First, we capture information in our system because we think it is going to have value to us in the future. We think there is profit available from the information, and therefore we capture it. Write-only databases aren't very interesting; we expect that we will want to read the data later.

Second, for any project successful enough to justify additional investment, we are going to know more later than we do today.

Software architecture is those decisions which are both important and hard to change.

We would like to defer hard-to-change decisions until as late in the game as possible.

One example of such a hard decision would be carving up information into different storage locations. So long as our state is ultimately guarded by a single lock, we can experiment freely with different logical arrangements of that data and the boundaries within the model. But separating two bounded sets of information into separate storage areas with different locks, and then discovering that the logical boundaries are faulty, makes a big mess.

Vertical scaling allows us to concentrate on the complexity of the model first, with the aim of carving out the isolated, autonomous bits only after we've accumulated several nines of evidence that it is going to work out as a long-term solution.

Put another way, we shouldn't be voluntarily entering a condition where change is expensive until we are driven there by the catastrophic success of our earlier efforts.

With that in mind, let's think about services. State without change is fundamentally just a cache. "Change" without state is fundamentally just a function. The interesting work begins when we start combining new information with information that we have previously captured.

I find that Rich Hickey's language helps me to keep the various pieces separate. In easy cases, state can be thought of as a mutable reference to a value S. The evolution of state over time looks like a sequence of updates to the mutable reference, where some pure function op() calculates each new value from its predecessor and the new information we have obtained.
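A minimal sketch of that shape in Python. The names Ref, swap, and Deposit, and the choice of account balances as the state, are assumptions made for illustration; only op() comes from the discussion here.

```python
from dataclasses import dataclass
from typing import Callable, Mapping

# Hypothetical state for illustration: account name -> balance.
State = Mapping[str, int]


@dataclass(frozen=True)
class Deposit:
    """Hypothetical message carrying the new information."""
    account: str
    amount: int


def op(state: State, message: Deposit) -> State:
    """Pure function: builds a new value and leaves the old one untouched."""
    balance = state.get(message.account, 0)
    return {**state, message.account: balance + message.amount}


class Ref:
    """A mutable reference to a value that is itself never modified."""

    def __init__(self, value: State) -> None:
        self.value = value

    def swap(self, f: Callable[[State, Deposit], State], message: Deposit) -> State:
        # The only mutation in the program: point the reference at the new value.
        self.value = f(self.value, message)
        return self.value


ref = Ref({"alice": 10})
before = ref.value
ref.swap(op, Deposit("alice", 5))
ref.swap(op, Deposit("bob", 7))
```

The old value held in `before` is still intact after both swaps; only the reference has moved.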

Now, this is logically correct, but it is very complicated to work with. op(), in this form, is made needlessly complicated by the fact that it is managing all of the state S. Complete generality is more power than we usually need. It's more likely that we can achieve correct results by limiting the size of the working set. Generalizing that idea, we can decompose the state to extract the working set, apply op() to the working set alone, and then compose the result back into a new value for the whole state.

The function decompose and its inverse compose allow us to focus our attention exclusively on those parts of current state that are significant for this operation.
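One way this could look in Python, as a sketch under assumptions: the State, Rest, and Deposit types and the handle function are hypothetical, and the working set is taken to be a single account balance.

```python
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class State:
    balances: Mapping[str, int]
    audit_log: Tuple[str, ...]  # irrelevant to a deposit


@dataclass(frozen=True)
class Deposit:
    account: str
    amount: int


@dataclass(frozen=True)
class Rest:
    """Everything in the state that this operation does not need to see."""
    account: str
    other_balances: Mapping[str, int]
    audit_log: Tuple[str, ...]


def decompose(state: State, message: Deposit) -> Tuple[int, Rest]:
    """Split the state into the working set (one balance) and the rest."""
    others = {k: v for k, v in state.balances.items() if k != message.account}
    balance = state.balances.get(message.account, 0)
    return balance, Rest(message.account, others, state.audit_log)


def compose(balance: int, rest: Rest) -> State:
    """Inverse of decompose: rebuild a whole state from the two pieces."""
    return State({**rest.other_balances, rest.account: balance}, rest.audit_log)


def op(balance: int, message: Deposit) -> int:
    """The operation now sees only the working set, not all of S."""
    return balance + message.amount


def handle(state: State, message: Deposit) -> State:
    balance, rest = decompose(state, message)
    return compose(op(balance, message), rest)
```

Here op() shrinks to a one-line calculation, and the knowledge of where a balance lives inside the state is confined to decompose and compose.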

However, I find it more enlightening to consider that it will be convenient for maintenance if we can re-use elements of a decomposition for different kinds of messages. In other words, we might instead define the decomposition in terms of a few named parts of the state, and let each kind of message work with the parts it needs.

In the language of Domain Driven Design, we've identified re-usable "aggregates" within the current state that will be needed to correctly calculate the next change, while masking away the irrelevant details. New values are calculated for these aggregates, and from them a new value for the state is calculated.
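A sketch of that re-use, again with hypothetical names: an Account aggregate is extracted once, and both Deposit and Withdraw messages are handled against it. The overdraft rule is an assumption added only to give the second message some behavior of its own.

```python
from dataclasses import dataclass, replace
from typing import Mapping, Tuple, Union


@dataclass(frozen=True)
class Account:
    """Hypothetical aggregate: the part of the state a message needs."""
    account_id: str
    balance: int


@dataclass(frozen=True)
class State:
    accounts: Mapping[str, Account]
    audit_log: Tuple[str, ...]  # masked away from the calculations below


@dataclass(frozen=True)
class Deposit:
    account_id: str
    amount: int


@dataclass(frozen=True)
class Withdraw:
    account_id: str
    amount: int


def decompose(state: State, account_id: str) -> Account:
    """Shared by every message kind that works on a single account."""
    return state.accounts.get(account_id, Account(account_id, 0))


def compose(state: State, account: Account) -> State:
    """Fold the new aggregate value back into a new state value."""
    return replace(state, accounts={**state.accounts, account.account_id: account})


def deposit(account: Account, message: Deposit) -> Account:
    return replace(account, balance=account.balance + message.amount)


def withdraw(account: Account, message: Withdraw) -> Account:
    if message.amount > account.balance:
        raise ValueError("insufficient funds")  # assumed business rule
    return replace(account, balance=account.balance - message.amount)


def handle(state: State, message: Union[Deposit, Withdraw]) -> State:
    account = decompose(state, message.account_id)
    if isinstance(message, Deposit):
        updated = deposit(account, message)
    elif isinstance(message, Withdraw):
        updated = withdraw(account, message)
    else:
        raise TypeError(f"unsupported message: {message!r}")
    return compose(state, updated)
```

Adding a third kind of message that touches one account means writing one more small function; decompose and compose stay as they are.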

In an object-oriented domain model, we normally see at least one more level of indirection: wrapping the state into objects that manage the isolated elements while the calculation is in progress.

In this object-oriented spelling, the objects are mutable (we lose referential transparency), but so long as their visibility is limited to the current function, the risks are manageable.
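A small sketch of the object-oriented variant, with hypothetical names throughout. The mutable Account object is created inside handle and never escapes it, so callers still see a function from an old value to a new one.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical state for illustration: account name -> balance.
State = Mapping[str, int]


@dataclass(frozen=True)
class Deposit:
    account_id: str
    amount: int


class Account:
    """Mutable domain object wrapping one isolated element of the state."""

    def __init__(self, account_id: str, balance: int) -> None:
        self.account_id = account_id
        self.balance = balance

    def deposit(self, amount: int) -> None:
        # In-place mutation: this method is not referentially transparent.
        self.balance += amount


def handle(state: State, message: Deposit) -> State:
    # Wrap: load the relevant slice of state into a mutable object.
    account = Account(message.account_id, state.get(message.account_id, 0))
    # Calculate: let the object change itself.
    account.deposit(message.amount)
    # Unwrap: copy the result into a new state value; `account` is discarded.
    return {**state, account.account_id: account.balance}
```

Because the object is only reachable from within handle, nothing outside can observe the intermediate mutation.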

Tuesday, May 1, 2018
