Friday, October 30, 2015

Domain Events and DTO Lifetimes

The client sends commands to the write model.  If the write model doesn't understand the messages sent by the client, then (as far as that client is concerned) the model is effectively immutable.  The effective lifetime of the command itself is very brief - we need momentary agreement.

The read model shares projections with the client.  If the client doesn't understand the messages it receives, then (again, from the perspective of this client) the model is write-only.  The effective lifetime of the projection is again short; once the appropriate view has been updated in the client, the projection can be discarded - we need momentary agreement.

The write model shares events with the read model, but the pattern doesn't hold.

The distinction is simply this: events persist.

You might need to save off commands in a queue, to ensure that they don't stomp on each other, or because they need to be scheduled.  But we know that it has to be OK for commands to evaporate, because fail fast is a correct expression of congestion control when the application will not be able to meet the SLA.
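To make that concrete -- a minimal Java sketch, with invented names, of a bounded queue that lets commands evaporate rather than block:

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    public class CommandGateway {
        // Bounded capacity: the queue is congestion control, not storage.
        private final BlockingQueue<Command> pending = new ArrayBlockingQueue<>(64);

        public boolean submit(Command command) {
            // offer() returns false instead of blocking; when we're congested,
            // the command simply evaporates and the caller fails fast.
            return pending.offer(command);
        }
    }

    interface Command {}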

Similarly, you might persist projections, but that's primarily a performance optimization -- when the cache expires the projection, it will be rebuilt.  The client might want to insulate the user from the dynamic nature of the model for a time, but an eventually consistent view will eventually change.  That's its nature.

Events are more than just a representation of change pushed across the boundary between the write model and the read model.  They also cross the boundary between the write model of today and the write model of the future.

In particular, that means that putting domain objects directly into the representation of the event is dangerous, because we expect to be aggressively and continuously refining the domain model as we learn more and more about it.  In other words, the instability of the domain model in the scale of product lifetime cautions us against mapping our persistent messages too closely to the domain.

We need to prepare for event streams that include multiple instances of the same event emitted by different versions of the model.  Which suggests that, for each message in the stream, we'll need a hint in the metadata that indicates the proper recipe for restoring the domain event -- as if the model in the past had written the event knowing how it was going to be read in the future.
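Here's a rough sketch of what that hint might look like -- every name here is invented, not taken from any particular framework:

    import java.util.List;
    import java.util.Map;

    // The envelope persisted in the store; the metadata carries the hint.
    record StoredEvent(
            String eventType,      // e.g. "CargoWasHandled"
            int schemaVersion,     // which version of the model wrote it
            Map<String, String> metadata,
            byte[] payload) {}

    // A recipe knows how to restore one historical shape of an event.
    interface UpcastRecipe {
        boolean handles(String eventType, int schemaVersion);
        Object restore(StoredEvent stored);   // rebuild today's domain event
    }

    class EventReader {
        private final List<UpcastRecipe> recipes;

        EventReader(List<UpcastRecipe> recipes) { this.recipes = recipes; }

        Object restore(StoredEvent stored) {
            // The hint written in the past selects the recipe used today.
            return recipes.stream()
                    .filter(r -> r.handles(stored.eventType(), stored.schemaVersion()))
                    .findFirst()
                    .orElseThrow(() -> new IllegalStateException(
                            "no recipe for " + stored.eventType() + " v" + stored.schemaVersion()))
                    .restore(stored);
        }
    }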

Avro, and tag every event in the history of the model with the writer schema of that time?  Thrift/Protocol Buffers, and hope that the evolution of the events can be supported entirely by non-destructive schema changes?  JSON, because you get the easy part of the answer for free?  Or take the hit to upgrade the immutable events in your store, so that all events are taken from the same version of the API?

My best guess today?  You are going to need a schema eventually - this seems obvious to me as soon as other domains start subscribing to these events.

So the early guess is about how much value you can deliver before you take the plunge, and how expensive the first schema migration will be.

Friday, October 23, 2015

Can we enforce the transaction consistency boundaries with interfaces?

Maybe.


Let's first consider the case where we have an aggregate which includes a reference to another aggregate.  That's perfectly reasonable, provided that the business is satisfied that coordinated changes between the aggregates are eventually consistent.

Now, each of the aggregates has its own commands (each changing its own state).  Best practice suggests that we should only be modifying one aggregate per transaction; in other words, we should only be running commands on one aggregate or the other.
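For example (Order and Customer are invented stand-ins): the reference between the aggregates is held by identity only, and each command changes just its own aggregate:

    // Order holds a reference to Customer by id, not the aggregate itself,
    // so a command on Order has no way to reach over and change Customer.
    class Order {
        private final OrderId id;
        private final CustomerId customerId;   // a reference, nothing more

        Order(OrderId id, CustomerId customerId) {
            this.id = id;
            this.customerId = customerId;
        }

        // A command mutates this aggregate's state only; any coordinated
        // change to Customer happens later, eventually consistent.
        void cancel() {
            // ...
        }
    }

    record OrderId(String value) {}
    record CustomerId(String value) {}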

Can we organize our code to enforce that?


I've been chewing on a remark from Greg Young, that getters and setters are evil.  Setters, sure -- setters should instead be commands, written in the Ubiquitous Language.  But getters?  How on earth are you going to do anything useful with another object if you can't read it?  What are you going to do with a Specification that can't read the object it is supposed to constrain?

I've chosen, for the moment, to understand his comment in this way: getters and setters have no place in the model; getters are perfectly acceptable in an immutable projection.
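A sketch of that reading, borrowing names from the DDD Sample:

    // In the model: no getters, no setters -- commands in the
    // Ubiquitous Language that validate and mutate state.
    class Cargo {
        private RouteSpecification routeSpec;   // never exposed via a getter

        void assignToRoute(Itinerary itinerary) {   // a command
            // validate against routeSpec, then change this aggregate's state
        }
    }

    // In a projection: getters are fine, because nothing can change.
    record CargoView(String trackingId, String routingStatus) {}

    class RouteSpecification {}
    class Itinerary {}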

I'm borrowing these two ideas from Greg, which I believe he lifted from an earlier generation of CQRS experts [wrong - Greg is the earlier generation of CQRS experts].  Commands are sent to the model, which is optimized for validating and calculating all changes.  Queries are sent to a projection -- there can be several -- which is optimized for reads, but may be stale.

So if we send a command to a model, and the execution of that command requires state from some other aggregate, then we need to hydrate the appropriate projection of the remote aggregate.

I had been blocked on this until recently, because I couldn't see past needing a getter to obtain the reference to the remote aggregate to do the hydration.

But the answer to that puzzle is to pass a DomainService as one of the arguments in the command.  The root can look up the referenceId without needing to expose it, and pass that value to the service to get back an immutable projection with precisely the data that it needs.

Essentially, we are building into the signature of the command the contract that promises we won't change anybody else.
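In code, the promise might look something like this (the names are invented):

    // The command takes the DomainService in its signature; the root
    // resolves its private reference and gets back an immutable
    // projection, never a mutable aggregate it could change.
    class Order {
        private final CustomerId customerId;   // never exposed via a getter

        Order(CustomerId customerId) { this.customerId = customerId; }

        void applyDiscount(CustomerDirectory directory) {
            CustomerCreditView credit = directory.creditFor(customerId);
            if (credit.isInGoodStanding()) {
                // ... change *this* aggregate's state, and nothing else
            }
        }
    }

    interface CustomerDirectory {              // the DomainService
        CustomerCreditView creditFor(CustomerId id);
    }

    record CustomerCreditView(boolean isInGoodStanding) {}   // immutable projection
    record CustomerId(String value) {}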

Two use cases where I need more thought.  The first is factory commands: calls into this aggregate to create a new instance of that aggregate.  The second is a query on this aggregate to run a command on that one.

Another perspective on the problem: if the other aggregate is responsible for a business invariant, then it may throw a checked exception.  I don't see how I can claim to be implementing a query that changes the model (in another aggregate), or an immutable object that throws exceptions.

My guess right now is that You Don't Do That.  Instead, some hand waving happens in the Application Service fronting this mess that gets all the dancers on the correct step.
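One guess at what that hand waving might look like -- all names invented, and each save standing in for its own transaction:

    // The ApplicationService fronts the dance: it runs the command that
    // may throw on one aggregate, commits, then moves the second aggregate
    // in a separate transaction.  No query ever mutates anything.
    class TransferService {
        private final AccountRepository accounts;

        TransferService(AccountRepository accounts) { this.accounts = accounts; }

        void transfer(AccountId from, AccountId to, long amount)
                throws InsufficientFundsException {
            Account source = accounts.load(from);
            source.withdraw(amount);        // the invariant may veto, loudly
            accounts.save(source);          // transaction one

            Account target = accounts.load(to);
            target.deposit(amount);
            accounts.save(target);          // transaction two, eventually consistent
        }
    }

    class Account {
        private long balance;

        void withdraw(long amount) throws InsufficientFundsException {
            if (amount > balance) throw new InsufficientFundsException();
            balance -= amount;
        }

        void deposit(long amount) { balance += amount; }
    }

    interface AccountRepository {
        Account load(AccountId id);
        void save(Account account);
    }

    record AccountId(String value) {}

    class InsufficientFundsException extends Exception {}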

Are aggregate roots always entities?

Yes.

The critical characteristic of an aggregate root is that it acts as a transaction consistency boundary.  In other words, it is responsible for changes that must always and immediately satisfy some business invariant.

That immediately rules out the possibility that it is a ValueObject, because ValueObjects are immutable.  Any changes to a value reflected in the model are going to introduce a new instance of the value object.
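For example, a generic sketch:

    // A "change" to a value never mutates it; you get a new instance.
    record Money(long amount, String currency) {
        Money add(Money other) {
            if (!currency.equals(other.currency())) {
                throw new IllegalArgumentException("currency mismatch");
            }
            return new Money(amount + other.amount(), currency);   // a fresh value
        }
    }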

Similarly, DomainEvents are also ruled out -- events are things that happened in the past, and we don't have time machines.

As an aside, I think the DDD Sample gets this wrong; the HandlingEvent aggregate is modeled as a DomainEvent.  The description in the class header is

    HandlingEvent's are sent from different Incident Logging Applications

Written in the active voice:

    Incident Logging Applications send HandlingEvents

and now I'm suspicious.  Does the business really not track the source of these external events?  Notice, also, that we never load a DomainEvent from the repository -- we only collect a history of events that match a TrackingId, which is a value object typically created within Cargo.

DomainService is stateless, so there's no need for a transaction.

Sagas? I don't believe they fit either: sagas are long-running business processes that span multiple transactions and potentially more than one aggregate.

ApplicationService doesn't fit because the business invariant belongs in the model.