Monday, July 16, 2018

REST, Resources, and Flow

It occurred to me today that there are two important ideas in designing a REST API.

The more familiar of the two is the hypermedia constraint.  The client uses a bookmark to load a starting point, and from there navigates the application protocol by examining links provided by the server, and choosing which link to follow.

That gives the server a lot of control, as it can choose which possible state transitions to share with the client.  Links can be revealed or hidden as required; the client is programmed to choose from among the available links.  Because the server controls the links, the server can choose the appropriate request target.  The client can just "follow its nose"; which is to say, the client follows a link by using a request target selected by the server.

The second idea is that the request targets proposed by the server are not arbitrary.  HTTP defines cache invalidation semantics in RFC 7234.  Successful unsafe requests invalidate the client's locally cached representation of the request target.

A classic example that takes advantage of this is collections: if we POST a new item to a collection resource, then the cached representation of the collection is automatically invalidated if the server's response indicates that the change was successful.
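As a hedged illustration (the URIs and payload here are invented), the exchange might look like this:

POST /orders HTTP/1.1
Host: example.org
Content-Type: application/x-www-form-urlencoded

sku=12345&quantity=1

HTTP/1.1 201 Created
Location: https://example.org/orders/789

Per RFC 7234, a cache that sees a non-error response to this unsafe request must invalidate its stored representation of /orders (and of the Location URI), so a subsequent GET /orders will fetch the collection with the new item included.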

Alas, there is an asymmetry: when we delete a resource, we don't have a good way to invalidate the collections that it may have belonged to.

Thursday, June 14, 2018

CQRS Meetup

Yesterday's meetup of the Boston DDD/CQRS/ES group was at Localytics, and featured a 101 introduction talk by James Geall, and a live coding exercise by Chris Condon.
CQRS is there to allow you to optimize the models for writing and reading separately.  NOTE: unless you have a good reason to pay the overhead, you should avoid the pattern.
James also noted that good reasons to pay the overhead are common.  I would have liked to hear "temporal queries" here - what did the system look like as at a given point in the past?

As an illustration, he described possibilities for tracking stock levels as an append-only table of changes and a roll-up/view of a cached result.  I'm not so happy with that example in this context, because it implies a coupling of CQRS to "event sourcing".  If I ran the zoo, I'd probably use a more innocuous example: OLTP vs OLAP, or a document store paired with a graph database.

The absolute simplest example I've been able to come up with is an event history; the write model is optimized for adding new information to the end of the data structure as it arrives. In other words, the "event stream" is typically in message order; but if we want to show a time-series history of those events, we need to _sort_ them first.  We might also change the underlying data structure (from a linked list to a vector) to optimize for search patterns other than "tail".
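A minimal sketch of that example in Java (the names and types are mine, invented for illustration):

import java.time.Instant;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class EventHistory {
    // write model: optimized for appending new information as it arrives,
    // so the underlying list stays in message (arrival) order
    private final List<Event> stream = new ArrayList<>();

    void append(Event e) {
        stream.add(e);
    }

    // read model: a separate representation, sorted into time-series order
    // for display; it could just as easily live in a different data structure
    List<Event> timeSeriesView() {
        List<Event> view = new ArrayList<>(stream);
        view.sort(Comparator.comparing((Event e) -> e.occurredAt));
        return view;
    }

    static final class Event {
        final Instant occurredAt;
        final String payload;
        Event(Instant occurredAt, String payload) {
            this.occurredAt = occurredAt;
            this.payload = payload;
        }
    }
}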
Your highly optimized model for "things your user wants to do" is unlikely to be optimized for "things your user wants to look at".
This was taken from a section of James's presentation explaining why the DDD/CQRS/ES tuple appear together so frequently.  He came back to this idea subsequently in the talk, when responding to some confusion about the read and write models.

You will be doing roll ups in the write model for different reasons than those which motivate the roll ups in the read model.
A lot of people don't seem to realize that, in certain styles, the write model has its own roll ups.  A lot of experts don't seem to realize that there is more than one style -- I tried to give a quick sketch of an alternative style at the pub afterwards, but I'm not sure how well I was able to communicate the ideas over the background noise.

The paper based contingency system that protects the business from the software screwing up is probably a good place to look for requirements.
DDD in a nut shell, right there.

That observation brings me back to a question I haven't found a good answer to just yet: why are we rolling our own business process systems, rather than looking to the existing tooling for process management (Camunda, Activiti, and the players in the iBPMS Magic Quadrant)?  Are we getting that much competitive advantage from rolling our own?

Event sourcing gives you a way to store the ubiquitous language - you get release from the impedance mismatch for free.  A domain expert can look at a sequence of events and understand what is going on.
A different spelling of the same idea - the domain expert can look at a given set of events, and tell you that the information displayed on the roll up screen is wrong.  You could have a field day digging into that observation: for example, what does that say about UUIDs appearing in the event data?

James raised the usual warning about not leaking the "internal" event representations into the public API.  I think as a community we've been explaining this poorly - "event" as a unit of information that we use to reconstitute state gets easily confused with "event" as a unit of information broadcast by the model to the world at large.

A common theme in the questions during the session was "validation"; the audience gets tangled up in questions about write model vs read model, latency, what the actual requirements of the business are, and so on.

My thinking is that we need a good vocabulary of examples of different strategies for dealing with input conflicts.  A distributed network of ATMs, both in terms of the pattern of a cash disbursement, and also reconciling the disbursements from multiple machines when updating the accounts.  A seat map on an airline, where multiple travelers are competing for a single seat on the plane.

Chris fired up an open source instance of Event Store, gave a quick tour of the portal, and then started a simple live coding exercise: a REPL for debits and credits, writing changes to the stream, and then reading it back.  In the finale, there were three processes sharing data - two copies of the REPL, and the event store itself.

The implementation of the logic was based on the Reactive-Domain toolkit, which reveals its lineage: it is an evolution of ideas acquired from working with Jonathan Oliver's Common-Domain and with Yves Reynhout, who maintains AggregateSource.

It's really no longer obvious to me what the advantage of that pattern is; it always looks to me as though the patterns and the type system are getting in the way.  I asked James about this later, and he remarked that no, he doesn't feel much friction there... but he writes in a slightly different style.  Alas, we didn't have time to explore further what that meant.

Sunday, June 10, 2018

Extensible message schema

I had an insight about messages earlier this week, one which perhaps ought to have been obvious.  But since I have been missing it, I figured that I should share.

When people talk about adding optional elements to a message, default values for those optional elements are not defined by the consumer -- they are defined by the message schema.

In other words, each consumer doesn't get to choose their own preferred default value.  The consumer inherits the default value defined by the schema they are choosing to implement.

For instance, if we are adding a new optional "die roll" element to our message, then consumers need to be able to make some assumption about the value of that field when it is missing.

But simply rolling a die for themselves is the "wrong" answer, in the sense that it isn't repeatable, and different consumers will end up interpreting the evidence in the message different ways.  In other words, the message isn't immutable under these rules.

Instead, we define the default value in the schema - documenting that the field is xkcd 221 compliant; just as every consumer that understands the schema agrees on the semantics of the new value, they also agree on the semantic meaning of the new value's absence.
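A hedged sketch of what that looks like in consumer code (the element name, map-shaped message, and constant are invented for illustration):

import java.util.Map;

// The schema, not the consumer, owns this constant: every consumer that
// understands the "die roll" element agrees on the same default when it is absent.
final class DieRollSchema {
    static final int DEFAULT_DIE_ROLL = 4;   // chosen by fair dice roll, per xkcd 221

    private DieRollSchema() {}
}

final class Consumer {
    int dieRoll(Map<String, String> message) {
        // Wrong: rolling a die here is not repeatable, and different consumers
        // would interpret the same immutable message in different ways.
        // Right: fall back to the default defined by the schema.
        String raw = message.get("dieRoll");
        return raw == null ? DieRollSchema.DEFAULT_DIE_ROLL : Integer.parseInt(raw);
    }
}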


If two consumers "need" to have different default values, that's a big hint that you may have two subtly different message elements to tease apart.

These same messaging rules hold when your "message" is really a collection of parameters in a function call.  Adding a new argument is fine, but if you aren't changing all of the clients at the same time then you really should continue to support calls using the old parameter list.

In an ideal world, the default value of the new parameter won't surprise the old clients, by radically changing the outcome of the call.

To choose an example, suppose we've decided that some strategy used by an object should be configurable by the client.  So we are going to add to the interface a parameter that allows the client to specify the implementation of the strategy they want.

The default value, in this case, really should be the original behavior, or its semantic equivalent.
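A minimal sketch of keeping the old parameter list alive while the new one is introduced (the types and names here are hypothetical):

interface Formatter { String format(Data data); }

final class LegacyFormatter implements Formatter {
    @Override
    public String format(Data data) { return data.toString(); }   // stands in for the original behavior
}

final class Data { }

final class Report {
    // New parameter list: the client chooses the strategy.
    String render(Data data, Formatter formatter) {
        return formatter.format(data);
    }

    // Old parameter list, kept for clients that haven't migrated yet; the
    // default value of the new parameter is the original behavior.
    String render(Data data) {
        return render(data, new LegacyFormatter());
    }
}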


Wednesday, June 6, 2018

Maven Dependency Management and TeamCity

A colleague got bricked when an engineer checked in a pom file where one of the dependencyManagement entries was missing a version element.

From what I see in the schema, the version element is minOccurs=0, so the pom was still valid.
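Reconstructing from the stack trace below, the offending entry would have looked something like this (the only point of interest is the missing version element):

<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
            <!-- no version element; minOccurs=0, so the pom is still schema-valid -->
        </dependency>
    </dependencies>
</dependencyManagement>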

Running the build locally, the build succeeded.  A review of the dependency:tree output was consistent with an unmanaged dependency -- two of the submodules showed different resolved versions (via transitive dependencies).

Running the build locally, providing the version element, we could see the dependencies correctly managed to the same version in each of the submodules.

But in TeamCity?  Well, the version of the project with the corrected pom built just fine.  Gravity still works.  But the bricked pom produced this stack trace.


[18:28:22]W:  [Step 2/5] org.apache.maven.artifact.InvalidArtifactRTException: For artifact {org.slf4j:slf4j-log4j12:null:jar}: The version cannot be empty.
 at org.apache.maven.artifact.DefaultArtifact.validateIdentity(DefaultArtifact.java:148)
 at org.apache.maven.artifact.DefaultArtifact.<init>(DefaultArtifact.java:123)
 at org.apache.maven.bridge.MavenRepositorySystem.XcreateArtifact(MavenRepositorySystem.java:695)
 at org.apache.maven.bridge.MavenRepositorySystem.XcreateDependencyArtifact(MavenRepositorySystem.java:613)
 at org.apache.maven.bridge.MavenRepositorySystem.createDependencyArtifact(MavenRepositorySystem.java:120)
 at org.apache.maven.project.DefaultProjectBuilder.initProject(DefaultProjectBuilder.java:808)
 at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:617)
 at org.apache.maven.project.DefaultProjectBuilder.build(DefaultProjectBuilder.java:405)
 at org.apache.maven.DefaultMaven.collectProjects(DefaultMaven.java:663)
 at org.apache.maven.DefaultMaven.getProjectsForMavenReactor(DefaultMaven.java:654)
 at sun.reflect.GeneratedMethodAccessor1975.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:606)
 at org.jetbrains.maven.embedder.MavenEmbedderImpl$4.run(MavenEmbedderImpl.java:447)
 at org.jetbrains.maven.embedder.MavenEmbedderImpl.executeWithMavenSession(MavenEmbedderImpl.java:249)
 at org.jetbrains.maven.embedder.MavenEmbedderImpl.readProject(MavenEmbedderImpl.java:430)
 at org.jetbrains.maven.embedder.MavenEmbedderImpl.readProjectWithModules(MavenEmbedderImpl.java:336)
 at jetbrains.maven.MavenBuildService.readMavenProject(MavenBuildService.java:732)
 at jetbrains.maven.MavenBuildService.sessionStarted(MavenBuildService.java:206)
 at jetbrains.buildServer.agent.runner2.GenericCommandLineBuildProcess.start(GenericCommandLineBuildProcess.java:55)


It looks to me like some sort of pre-flight check by the MavenBuildService before it surrenders control to maven.

If I'm reading the history correctly, the key is https://issues.apache.org/jira/browse/MNG-5727

That change went into 3.2.5; TeamCity's mavenPlugin (we're still running 9.0.5, Build 32523) appears to be using 3.2.3.

Part of what was really weird? Running the build on the command line worked. In my development environment, I had been running 3.3.9, so I had this "fix", and everything was groovy. When I sshed into the build machine, I was running... 3.0.4. Maybe that's too early for the bug to appear? Who knows - I hit the end of my time box.

If you needed to read this... good luck.

Thursday, May 31, 2018

Bowling Kata v2

Inspired by

Part One

Produce a solution to the bowling game kata using your favorite methods.

Part Two

Changes in bowling alley technology are entering the market; the new hardware offers cheaper, more efficient, environmentally friendly bowling.

But there's a catch... the sensors on the new bowling alleys don't detect the number of pins knocked down by each roll, but instead the number that remain upright.

Without making any changes to the existing tests: create an implementation that correctly scores the games reported by the new bowling alleys.

Examples

A perfect game still consists of twelve throws, but when the ball knocks down all of the pins, the sensors report 0 rather than 10.

Similarly, a "gutter game" has twenty throws, and the sensors report 10 for each throw.

A spare where 9 pins are first knocked down, and then 1, is reported as (1,0).

Part Three

The advanced bonus round

Repeat the exercise described in Part Two, starting from an implementation of the bowling game kata that is _unfamiliar_ to you.

Update

I found Tim Riemer's github repository, and forked a copy to use as the seed for part three.  The results are available on github: https://github.com/DanilSuits/BowlingGameKataJUnit5

Thursday, May 24, 2018

enforce failed: Could not match item

The Symptom

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce (enforce-pedantic-rules) on project devjoy-security-kata: Execution enforce-pedantic-rules of goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce failed: Could not match item :maven-deploy-plugin:2.8.2 with superset -> [Help 1]

The Stack Trace

org.apache.maven.lifecycle.LifecycleExecutionException: Failed to execute goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce (enforce-pedantic-rules) on project devjoy-security-kata: Execution enforce-pedantic-rules of goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce failed: Could not match item :maven-deploy-plugin:2.8.2 with superset
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:212)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:153)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:145)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:116)
 at org.apache.maven.lifecycle.internal.LifecycleModuleBuilder.buildProject(LifecycleModuleBuilder.java:80)
 at org.apache.maven.lifecycle.internal.builder.singlethreaded.SingleThreadedBuilder.build(SingleThreadedBuilder.java:51)
 at org.apache.maven.lifecycle.internal.LifecycleStarter.execute(LifecycleStarter.java:128)
 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:307)
 at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:193)
 at org.apache.maven.DefaultMaven.execute(DefaultMaven.java:106)
 at org.apache.maven.cli.MavenCli.execute(MavenCli.java:863)
 at org.apache.maven.cli.MavenCli.doMain(MavenCli.java:288)
 at org.apache.maven.cli.MavenCli.main(MavenCli.java:199)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launchEnhanced(Launcher.java:289)
 at org.codehaus.plexus.classworlds.launcher.Launcher.launch(Launcher.java:229)
 at org.codehaus.plexus.classworlds.launcher.Launcher.mainWithExitCode(Launcher.java:415)
 at org.codehaus.plexus.classworlds.launcher.Launcher.main(Launcher.java:356)
Caused by: org.apache.maven.plugin.PluginExecutionException: Execution enforce-pedantic-rules of goal org.apache.maven.plugins:maven-enforcer-plugin:3.0.0-M1:enforce failed: Could not match item :maven-deploy-plugin:2.8.2 with superset
 at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:145)
 at org.apache.maven.lifecycle.internal.MojoExecutor.execute(MojoExecutor.java:207)
 ... 20 more
Caused by: java.lang.IllegalArgumentException: Could not match item :maven-deploy-plugin:2.8.2 with superset
 at com.github.ferstl.maven.pomenforcers.model.functions.AbstractOneToOneMatcher.handleUnmatchedItem(AbstractOneToOneMatcher.java:63)
 at com.github.ferstl.maven.pomenforcers.model.functions.AbstractOneToOneMatcher.match(AbstractOneToOneMatcher.java:55)
 at com.github.ferstl.maven.pomenforcers.PedanticPluginManagementOrderEnforcer.matchPlugins(PedanticPluginManagementOrderEnforcer.java:142)
 at com.github.ferstl.maven.pomenforcers.PedanticPluginManagementOrderEnforcer.doEnforce(PedanticPluginManagementOrderEnforcer.java:129)
 at com.github.ferstl.maven.pomenforcers.CompoundPedanticEnforcer.doEnforce(CompoundPedanticEnforcer.java:345)
 at com.github.ferstl.maven.pomenforcers.AbstractPedanticEnforcer.execute(AbstractPedanticEnforcer.java:45)
 at org.apache.maven.plugins.enforcer.EnforceMojo.execute(EnforceMojo.java:202)
 at org.apache.maven.plugin.DefaultBuildPluginManager.executeMojo(DefaultBuildPluginManager.java:134)
 ... 21 more

The Cause

<plugin>
    <artifactId>maven-deploy-plugin</artifactId>
    <version>2.8.2</version>
</plugin>

The Cure

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-deploy-plugin</artifactId>
    <version>2.8.2</version>
</plugin>

The Circumstances

I had created a new maven project in Intellij IDEA 2018.1. That produced a build/pluginManagement/plugins section where the plugin entries did not include a groupId element. Maven (3.3.9), as best I could tell, didn't have any trouble with it; but the PedanticPluginManagementOrderEnforcer did.

My best guess at this point is that there is some confusion about the plugin search order.

But I'm afraid I'm at the end of my time box. Good luck, DenverCoder9.

Friday, May 11, 2018

Tests, Adapters, and the lifecycle of an API Contract

The problem that I faced today was preparing for a change of an API; the goal is to introduce new interfaces with new spellings that produce the same observable behaviors as the existing code.

Superficially, it looks a bit like paint by numbers.  I'm preparing for a world where I have two different implementations to exercise with the same behavior, ensuring that the same side effects are measured at the conclusion of the automated check.

But the pattern of the setup phase is just a little bit different.  Rather than wiring the automated check directly to the production code, we're going to wire the check to an adapter.


The basic principles at work are those of a dependency injection friendly framework, as described by Mark Seemann.
  • The client owns the interface
  • The framework is the client
In this case, the role of the framework is played by the scenario, which sets up all of the mocks, and verifies the results at the conclusion of the test.  The interface is a factory, which takes a description of the test environment and returns an instance of the system under test.

That returned instance is then evaluated for correctness, as described by the specification.

Of course, if the client owns the interface, then the production code doesn't implement it -- the dependency arrow points the wrong direction.

We beat this with an adapter; the automated check serves as a bridge between the scenario specification and a version of the production code.  In other words, the check stands as a demonstration that the production code can be shaped into something that satisfies the specification.
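A rough sketch of the shape, with names of my own invention rather than the production code:

import java.util.Collections;
import java.util.Map;

// The scenario (the "framework") owns this interface: given a description of
// the test environment, hand back an instance of the system under test.
interface SystemUnderTestFactory {
    Api create(TestEnvironment env);
}

// The production code does not implement the scenario's interface; an adapter
// shapes the production composition root into something that satisfies it.
final class ProductionAdapter implements SystemUnderTestFactory {
    @Override
    public Api create(TestEnvironment env) {
        return ProductionCompositionRoot.create(env.settings());
    }
}

// Minimal stand-ins so the sketch hangs together; the real types are richer.
interface Api { }

final class TestEnvironment {
    Map<String, String> settings() { return Collections.emptyMap(); }
}

final class ProductionCompositionRoot {
    static Api create(Map<String, String> settings) { return new Api() { }; }
}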

This pattern gives us a reasonably straightforward way to push two different implementations through the same scenario, allowing us to ensure that the implementation of the new API provides equivalent capabilities to its predecessor.

But I didn't discover this pattern trying to solve that problem...

The problem that I faced was that I had two similar scenarios, where the observable outcome was different -- the observable behavior of the system was a consequence of some configuration settings. Most of my clients were blindly accepting the default hints, and producing the normal result. But in a few edge cases, a deviation from the default hints produced a different result.

The existing test suite was generally soft on this scenario. My desired outcome was twofold -- I wanted tests in place to capture the behavior specification now, and I wanted artifacts that would demonstrate that the edge case behavior needed to be covered in new designs.

We wouldn't normally group these two cases together like this. We're more likely to have a suite of tests ensuring that the default configuration satisfies its cases, and that the edge case configuration satisfies a different suite of results.

We can probably get closer to the desired outcome by separating the scenario and its understanding of ExpectedResult from the specifications.

And likewise for the edge case.

In short, parallel suites with different expected results in shared scenarios, with factory implementations that are bound to a specific implementation.

The promise (actually, more of a hope) is that as we start moving the API contracts through their life cycles -- from stable to legacy/deprecated to retired -- we will along the way catch that there are these edge cases that will need resolution in the new contracts.  Choose to support them, or not, but that choice should be deliberate, and not a surprise to the participants.

Tuesday, May 1, 2018

Ruminations on State

Over the weekend, I took another swing at trying to understand how boundaries should work in our domain models.

Let's start with some assumptions.

First, we capture information in our system because we think it is going to have value to us in the future.  We think there is profit available from the information, and therefore we capture it.  Write-only databases aren't very interesting; we expect that we will want to read the data later.

Second, that for any project successful enough to justify additional investment, we are going to know more later than we do today.

Software architecture is those decisions which are both important and hard to change.
We would like to defer hard-to-change decisions as late as possible in the game.

One example of such a hard decision would be carving up information into different storage locations.  So long as our state is ultimately guarded by a single lock, we can experiment freely with different logical arrangements of that data and the boundaries within the model.  But separating two bounded sets of information into separate storage areas with different locks, and then discovering that the logical boundaries are faulty, makes a big mess.

Vertical scaling allows us to concentrate on the complexity of the model first, with the aim of carving out the isolated, autonomous bits only after we've accumulated several nines of evidence that it is going to work out as a long term solution.

Put another way, we shouldn't be voluntarily entering a condition where change is expensive until we are driven there by the catastrophic success of our earlier efforts.

With that in mind, let's think about services.  State without change is fundamentally just a cache.   "Change" without state is fundamentally just a function.  The interesting work begins when we start combining new information with information that we have previously captured.

I find that Rich Hickey's language helps me to keep the various pieces separate.  In easy cases, state can be thought of as a mutable reference to a value S.  The evolution of state over time looks like a sequence of updates to the mutable reference, where some pure function calculates new values from their predecessors and the new information we have obtained, like so
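Something like this, as a rough Java-flavored sketch (the names and types are illustrative assumptions of mine):

import java.util.function.BiFunction;

final class Service<S, M> {
    private S state;                              // mutable reference to an immutable value S
    private final BiFunction<S, M, S> op;         // pure function: (current value, new information) -> next value

    Service(S initial, BiFunction<S, M, S> op) {
        this.state = initial;
        this.op = op;
    }

    void onMessage(M message) {
        state = op.apply(state, message);         // the only mutation is swapping the reference
    }
}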


Now, this is logically correct, but it is very complicated to work with. op(), as shown here, is made needlessly complicated by the fact that it is managing all of the state S. Complete generality is more power than we usually need. It's more likely that we can achieve correct results by limiting the size of the working set. Generalizing that idea could look something like
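Again as a rough sketch, with decompose and compose supplied as assumed function parameters:

import java.util.function.BiFunction;
import java.util.function.Function;

final class Service<S, A, M> {
    private S state;
    private final Function<S, A> decompose;       // extract only the working set this operation needs
    private final BiFunction<S, A, S> compose;    // fold the recalculated working set back into the whole
    private final BiFunction<A, M, A> op;         // pure calculation, now limited to the working set

    Service(S initial, Function<S, A> decompose, BiFunction<S, A, S> compose, BiFunction<A, M, A> op) {
        this.state = initial;
        this.decompose = decompose;
        this.compose = compose;
        this.op = op;
    }

    void onMessage(M message) {
        A workingSet = decompose.apply(state);
        state = compose.apply(state, op.apply(workingSet, message));
    }
}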


The function decompose and its inverse compose allow us to focus our attention exclusively on those parts of current state that are significant for this operation.

However, I find it more enlightening to consider that it will be convenient for maintenance if we can re-use elements of a decomposition for different kinds of messages. In other words, we might instead have a definition like
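A sketch of that, where the same decomposition is shared by several different message handlers, and only the pure calculation varies (again, the names are mine):

import java.util.function.BiFunction;
import java.util.function.Function;

final class Service<S, A> {
    private S state;
    private final Function<S, A> decomposeAggregate;     // the same decomposition...
    private final BiFunction<S, A, S> composeAggregate;  // ...and recomposition, reused across message kinds

    Service(S initial, Function<S, A> decompose, BiFunction<S, A, S> compose) {
        this.state = initial;
        this.decomposeAggregate = decompose;
        this.composeAggregate = compose;
    }

    // different kinds of messages reuse the same decomposition; each supplies its own calculation
    <M> void onMessage(M message, BiFunction<A, M, A> op) {
        A aggregate = decomposeAggregate.apply(state);
        state = composeAggregate.apply(state, op.apply(aggregate, message));
    }
}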


In the language of Domain Driven Design, we've identified re-usable "aggregates" within the current state that will be needed to correctly calculate the next change, while masking away the irrelevant details. New values are calculated for these aggregates, and from them a new value for the state is calculated.

In an object oriented domain model, we normally see at least one more level of indirection - wrapping the state into objects that manage the isolated elements while the calculation is in progress.
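A sketch of that extra level of indirection, with a hypothetical Repository and Aggregate standing in for the object wrapper:

interface Aggregate<M> {
    void handle(M message);                        // the object mutates its own working copy
}

interface Repository<S, M> {
    Aggregate<M> load(S state);                    // wrap the relevant slice of state in an object
    S update(S state, Aggregate<M> root);          // fold the object's changes back into the value
}

final class Service<S, M> {
    private S state;
    private final Repository<S, M> repository;

    Service(S initial, Repository<S, M> repository) {
        this.state = initial;
        this.repository = repository;
    }

    void onMessage(M message) {
        Aggregate<M> root = repository.load(state);   // mutable, but visible only within this call
        root.handle(message);
        state = repository.update(state, root);
    }
}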


In this spelling, the objects are mutable (we lose referential transparency), but so long as their visibility is limited by the current function the risks are manageable.

Sunday, April 22, 2018

Spotbugs and the DDD Sample application.

As an experiment, I decided to run spotbugs against my fork of the DDDSample application.

For the short term, rather than fixing anything in the first pass, I decided to simply suppress the warnings to address them later.

   9 @SuppressFBWarnings("EI_EXPOSE_REP")
   8 @SuppressFBWarnings("EI_EXPOSE_REP2")
These refer to the fact that java.util.Date isn't an immutable type; exchanging a reference to a Date means that the state of the domain model could be changed from the outside.  Primitive values and immutable types like java.lang.String don't have this problem.
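A hedged sketch of the usual remedy (this is not the actual DDDSample code, just the defensive-copy idiom the warning is pointing at):

import java.util.Date;

final class Delivery {
    private final Date estimatedArrival;

    Delivery(Date estimatedArrival) {
        // defensive copy on the way in: the caller keeps its own Date,
        // and can no longer mutate our state through it
        this.estimatedArrival = new Date(estimatedArrival.getTime());
    }

    Date estimatedArrival() {
        // defensive copy on the way out, for the same reason
        return new Date(estimatedArrival.getTime());
    }
}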

6 @SuppressFBWarnings("SE_BAD_FIELD")
This warning caught my attention, because it flags a pattern that I was already unhappy with in the implementation: value types that hold a reference to an entity.  Here, the ValueObject abstraction is tagged as Serializable, but the entities are not, and so it gets flagged.

That the serialization gets flagged is just a lucky accident.  The bigger question to my mind is whether or not nesting an entity within a value is an anti-pattern.  In many cases, you can do as well by capturing the identifier of the entity, rather than the entity itself.
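For illustration, with types invented for the purpose rather than taken from the sample application:

final class Product { /* mutable entity with its own life cycle */ }
final class ProductId { /* immutable identifier value */ }

// A value type holding a reference to the entity itself -- the pattern the warning trips over.
final class LineItem {
    private final Product product;
    LineItem(Product product) { this.product = product; }
}

// Capturing only the entity's identifier keeps the value self-contained;
// the entity is looked up again when it is actually needed.
final class LineItemByReference {
    private final ProductId productId;
    LineItemByReference(ProductId productId) { this.productId = productId; }
}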

Those two issues alone cover about 75% of the issues flagged by spotbugs.

Friday, March 30, 2018

TDD: Tales of the Fischer King

Chess960 is a variant of chess introduced by Bobby Fischer in the mid 1990s. The key idea is that each player's pieces, rather than being assigned to fixed initial positions on the home rank, are instead randomized, subject to the following constraints:
  • The bishops shall be placed on squares of opposite colors
  • The king shall be positioned between the two rooks.
There are 960 different positions that satisfy these constraints.

In December of 2002, Peter Seibel proposed:
Here's an interesting little problem that I tried to tackle with TDD as an exercise....  Write a function (method, procedure, whatever) that returns a randomly generated legal arrangement of the eight white pieces.
Looking back at the discussion, what surprises me is that we were really close to a number of ideas that would become popular later:
  • We discuss property based tests (without really discovering them)
  • We discuss mocking the random number generator (without really discovering that)
  • We get really very close to the idea that random numbers, like time, are inputs
I decided to retry the exercise again this year, to explore some of the ideas that I have been playing with of late.

Back in the day, Red-Green-Refactor was understood as a cycle; but these days, I tend to think of it as a protocol.  Extending an API is not the same flavor of Red as putting a new constraint on the system under test, the Green in the test calibration phase is not the same as the green after refactoring, and so on.

I tend to lean more toward modules and functions than I did back then; our automated check configures a specification that the system under test needs to satisfy.  We tend to separate out the different parts so that they can be run in parallel, but fundamentally it is one function that accepts a module as an argument and returns a TestResult.

An important idea is that the system under test can be passed to the check as an argument.  I don't feel like I have the whole shape of that yet, but the rough idea is that once you are inside the check, you are only talking to an API.  In other words, the checks should be reusable to test multiple implementations of the contract.  Of course, you have to avoid a naming collision between the check and the test.
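A sketch of the shape I have in mind, with names and a contract of my own rather than anything from the actual exercise:

// The contract that any implementation ("module") must satisfy.
interface Chess960Module {
    String startingPosition(long seed);   // e.g. an eight-character home-rank description
}

// The check is just a function from module to result; the same check can be
// pointed at any implementation of the contract.
final class Checks {
    static TestResult bishopsOnOppositeColors(Chess960Module module) {
        String rank = module.startingPosition(42L);
        int first = rank.indexOf('B');
        int second = rank.indexOf('B', first + 1);
        // adjacent squares alternate color, so an odd distance means opposite colors
        boolean oppositeColors = (second - first) % 2 == 1;
        return oppositeColors ? TestResult.pass() : TestResult.fail("bishops share a color: " + rank);
    }
}

final class TestResult {
    final boolean passed;
    final String message;
    private TestResult(boolean passed, String message) { this.passed = passed; this.message = message; }
    static TestResult pass() { return new TestResult(true, "ok"); }
    static TestResult fail(String message) { return new TestResult(false, message); }
}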

One consequence of that approach is that the test suite can serve the role of an acceptance test if you decide to throw away the existing implementation and start from scratch.  You'll need a new set of scaffolding tests for the new implementation to drive the dynamo.  Deleting tests that overfit a particular implementation is a good idea anyway.

One of the traps I fell into in this particular iteration of the experiment: the ordering bishops-queens-knights is much easier to work with than attacking the rook-king-rook problem.  I decided to push through it this time, but I didn't feel like it progressed nearly as smoothly as it did the first time through.

There's hidden duplication in the final iteration of the problem; the strategies that you use for placing the pieces are tightly coupled to the home row.  In this exercise, I didn't even go so far as encapsulating the strategies.

Where's the domain model?  One issue with writing test first is that you are typically crossing the boundary between the tests and the implementation; primitives are the simplest thing that could possibly work, as far as the interface is concerned.

Rediscovering that "the simplest thing that could possibly work" was originally a remedy for writer's block was a big help in this exercise.  If you look closely at the commits, you'll see a big gap in the dates as I looked for a natural way to implement the code that I already had in mind.  A more honest exercise might have just wandered off the rails at that point.

I deliberately put down my pencil before trying to address the duplication in either the test or the implementation.  Joe Rainsberger talks about the power of the TDD Dynamo; but Sandi Metz rightly points out that keeping things simple leaves you extremely well placed to make the next change easy.

During the exercise, I discovered that writing the check is a test, in the sense described by James Bach and Michael Bolton.
Testing is the process of evaluating a product by learning about it through exploration and experimentation
This is precisely what we are doing during the "Red" phase; we are exploring our candidate API, trying to evaluate how nice it is to work with.  The implementation of the checks can later stand in as an example of how to consume the API.

The code artifacts from this exercise, and the running diary, are available on Github.



Sunday, March 11, 2018

TDD in a sound bite

Test Driven Development is a discipline that asserts that you should not implement functionality until you have demonstrated that you can document the requirements in code.

Monday, February 5, 2018

Events in Evolution

During a discussion today on the DDD/CQRS slack channel, I remembered Rinat Abdullin's discussion of evolving process managers, and it led to the following thinking on events and boundaries.

Let us begin with our simplified matching service; the company profits by matching buyers and sellers.  To begin, the scenario we will use is that Alice, Bob, and Charlie place sell orders, and David places a buy order.  When our domain expert Madhav receives this information, he uses his knowledge of the domain and recognizes that David's order should be matched with Bob's.

Let's rephrase this, using the ideas provided by Rinat, and focusing on the design of Madhav's interface to the matching process.  Alice, Bob, Charlie, and David have placed their orders.  Madhav's interface at this point shows the four unmatched orders; he reviews them, decides that Bob and David's orders match, and sends a message describing this decision to the system.  When that decision reaches the projection, it updates the representation, and now shows two unmatched orders from Alice and Charlie, and the matched orders from Bob and David.

Repeating the exercise: if we consider the inputs of the system, then we see that Madhav's decision comes after four inputs.

So the system uses that information to build Madhav's view of the system. When Madhav reports his decision to the system, it rebuilds his view from five inputs:

With all five inputs represented in the view, Madhav can see his earlier decision to match the orders has been captured and persisted, so he doesn't need to repeat that work.

At this point in the demonstration, we don't have any intelligence built into the model. It's just capturing data; Udi Dahan might say that all we have here is a database. The database collects inputs; Madhav's interface is built by a simple function - given this collection of inputs, show that on the screen.

Now we start a new exercise, following the program suggested by Rinat; we learn from Madhav that within the business of matching, there are a lot of easy cases where we might be able to automate the decision. We're not trying to solve the whole problem yet; in this stage our ambitions are very small: we just want the interface to provide recommendations to Madhav for review. Ignore all the hard problems, don't do anything real, just highlight a possibility. We aren't changing the process at all - the inputs are the same that we saw before.

We might iterate a number of times on this, getting feedback from Madhav on the quality of the recommendations, until he announces that the recommendations are sufficiently reliable that they could be used to reduce his workload.

Now we start a new exercise, where we introduce time as an element. If there is a recommended match available, and Madhav does not override the recommendation within 10 seconds, then the system should automatically match the order.

We're introducing time into the model, and we want to do that with some care. In 1998, John Carmack told us
If you don't consider time an input value, think about it until you do -- it is an important concept


This teaches us that we should be thinking about introducing a new input into our process flow.

Let's review the example, with this in mind. The orders arrive as before, and there are no significant changes to what we have seen before.

But treating time as an input introduces an extra row before the match is made.

While that seems simple enough, something arbitrary has crept in. For example, why would time be an input in only 10 second bursts?

Or perhaps it's a better separation of concerns to use a scheduler.

And we start to notice that things are getting complicated; what's worse, they are getting complicated in a way that Madhav didn't care about when he was doing the work by hand.

What's happening here is that we've confused "inputs" with "things that we should remember". We need to remember orders, we need to remember matches -- we saw that when Madhav was doing the work. But we don't need to remember time, or scheduling; those are just plumbing constructs we introduced to allow Madhav to intercede before the automation made an error.

Inputs and things we should remember were the same when our system was just a database. No, that's not quite right; they weren't the same, they were different things that looked the same. They were different things that happened to have the same representation because all of the complicated stuff was outside of the system. They diverged when we started making the system more intelligent.

With this revised idea, we have two different ways of thinking about the after situation; we can consider its inputs

Or instead we can think about its things to remember

"Inputs" and "Things to Remember" are really really lousy names, and those spellings don't really lend themselves well to the ubiquitous language of domain modeling. To remain in consistent alignment with the literate, we should use the more common terminology: commands and events.

In the design described so far, we happen to have an alignment of commands and events that is one to one. To illustrate, we can think of the work thus far as an enumeration of database transactions, that look something like:
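Reading back through the earlier scenario, the one-to-one shape would be something like this (my own paraphrase, not the original table):

transaction 1: command PlaceOrder(Alice)        -> event OrderPlaced(Alice)
transaction 2: command PlaceOrder(Bob)          -> event OrderPlaced(Bob)
transaction 3: command PlaceOrder(Charlie)      -> event OrderPlaced(Charlie)
transaction 4: command PlaceOrder(David)        -> event OrderPlaced(David)
transaction 5: command MatchOrders(Bob, David)  -> event OrdersMatched(Bob, David)

One command arrives, one event is remembered, per transaction.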

In the next exercise, consider what would happen if we tried to introduce as an invariant that there should be no easily matched items (following the rules we were previously taught by Madhav) left unmatched. In other words, when David's order arrives, it should be immediately matched with Bob's, rather than waiting 10 seconds. That means that one command (the input of David's order) produces two different events; one to capture the observation of David's order, the other to capture the decision made by the automation as a proxy for Madhav.

The motivation for treating these as two separate events is this: it most closely aligns with the events we were generating when Madhav was making all of the decisions himself. Whether we use Madhav in the loop making the decisions, or simply reviewing scheduled decisions, or leaving all of the decisions to the automation, the captured list of events is the same. That in turn means that these different variations in implementation do not impact the other implementations at all. We're limiting the impact of the change by ensuring that the observable results are consistent in all three cases.

Tuesday, January 16, 2018

Events are messages that describe state, not behavior

I have felt, for some time now, that the literature explains event sourcing poorly.

The basic plot, that current state is just a left fold over previous behaviors, is fine, so far as it goes.  But it rather assumes that the audience is clear on what "previous behaviors" means.

And I certainly haven't been.

Many, perhaps even most, domain models can be thought of as state machines:

Cargo begins its life cycle when it is booked; our delivery document changes state when the cargo is received, when an itinerary is selected, when it gets loaded in port, when the itinerary changes.  Each of these messages arrives and changes the state of the delivery document.

So we can think of the entire process as a state machine; each message from the outside world causes the model to follow a transition edge out of one state node and into another.

If we choose to save the state of the delivery document, so that we can resume work on it later, there are three approaches we might take
  • We could simply save the entire delivery document as is
  • We could save the sequence of messages that we received
  • We could save a sequence of patch documents that describe the differences between states
Event sourcing is the third one.

We call the patches "events", and they have domain specific semantics; but they are fundamentally dumb documents that decouple the representation of state from the model that generated it.

This decoupling is important, because it allows us to change the model without changing the semantics of the persisted representation.

To demonstrate this, let's imagine a simple trade matching application.  Buy and Sell orders come in from different customers, and the model is responsible for pairing them up.  There might be elaborate rules in place for deciding how matches work, but to save the headache of working them out we'll instead focus our attention on a batch of buy and sell orders that can be paired arbitrarily -- the actual selections are going to be determined by the model's internal tie breaker strategy.

So we'll imagine that a new burst of messages appears, at some new price -- we don't need to worry about any earlier orders.  The burst begins...


After things have settled down, we restart the service. That means that the in memory state is lost, and has to be recovered from what was written to the persistent store. We now get an additional burst of messages.


Using a first in, first out tiebreaker, we would expect to see pairs (A,1), (B,2), (C,3), and (D,4). If we were using a last in, first out tiebreaker, we would presumably see (D,1), (E,2), (C,3), (B,4).

But what happens if, during the shutdown, we switch from FIFO to LIFO? During the first phase, we see (A,1) matched, as before. After that, we should see (D,2), (E,3), (C,4).

In order to achieve that outcome, the model in the second phase needs knowledge of the (A,1) match before the shutdown. But it can only know about that match if there is evidence of the match written to the persistent store. Without that knowledge, the LIFO strategy would expect that (D,1) were already matched, and would in turn produce (C,2), (E, 3), and (A,4). The last of these conflicts with the original (A,1) match. In other words, we're in a corrupted state.

Writing the entire document to the event store works just fine: we read a representation that suggests that A and 1 are unavailable, and the domain model can proceed accordingly. Writing the sequence of patches works, because when we fold the patches together we get the state document. It's only the middle case, where we wrote out representations that implied a particular model, that we got into trouble.

The middle approach is not wrong, by any means. The LMAX architecture worked this way; they would write the input messages to a journal, and in the event of a failure they would recover by starting a copy of the same model. The replacement of the model behavior happened off hours.

Similarly, if you have the list of inputs and the old behavior, you can recover current state in memory, and then write out a representation of that state that will allow a new model to pick up from where the old left off.

Not wrong, but different. One approach or the other might be more suitable for your unique collection of operational constraints.

Monday, January 15, 2018

Command messages in REST

Kevin Swiber, of SIREN fame, just taught me a nifty trick for describing commands to a resource.

Background: commands are semantically unsafe operations, as described in the HTTP specification. They are messages that expect to induce a change of state on the server, they invalidate caches, and so on.

In considering a design that conforms to the REST architectural style, it's useful to ask "how did we do it on the web?" We had URI, and HTTP, and HTML; those elements alone were enough to support catastrophic success.

Now, in HTML, the only readily available link for unsafe operations was the form, which supported only GET and POST. When there's only one road, you take it: servers would send HTML representations that would include Forms, that would describe to the client how to create the command message.

Notice that this is very much a hypermedia approach - if the server wishes to communicate that the command should not be sent, it simply removes the form from the representation.  Elements that are no longer necessary can be removed from the form.  New optional elements can be added by simply including a new input control and providing a reasonable default value.

What this teaches us is that a POST message with an application/x-www-form-urlencoded payload is a perfectly satisfactory way of modeling a command sent to a resource.

But I've got to admit,  that doesn't feel much like "REST".

So here's another way of thinking about it.

HTTP Patch affords the modification of resources.  We tend to think about the payload as a list of changes to make; but another way of thinking about it is that the payload describes a list of commands.  For example, application/json-patch+json is a document that describes operations; add, remove, and so on.
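For example, a small application/json-patch+json document (using the operation vocabulary from RFC 6902; the paths and values here are invented) looks like:

[
  { "op": "add",     "path": "/estimatedArrival", "value": "2018-01-20" },
  { "op": "remove",  "path": "/holdExpiration" },
  { "op": "replace", "path": "/status", "value": "IN_PORT" }
]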

First trick: any resource that supports PATCH application/json-patch+json can just as easily provide the same functionality via POST application/json-patch+json.  The semantics of PATCH are more specific than those of POST, but they don't fundamentally conflict with each other.

Second trick: application/json-patch+json is just a representation of a sequence of operations; with those operations taken specifically from the vocabulary of manipulating a json document.  We don't particularly want to be borrowing those semantics unless they happen to be a particularly good fit for our domain; we likely have our own names for operations, particular parameters, and so on. So we instead might choose our own patch document representation; and write it up so that clients can be coded against it.

Third trick: these media types don't have any magical properties; they are just rules for taking semantic meaning and encoding that meaning as bytes, and then reversing the process at the other end. We can use any byte representation we like, provided that the client and the server agree on the rules being used. We could use text/plain, or application/json. Alas, both of those representations suffer from the problem that we need additional out of band communication to ensure that both ends of the conversation understand the meaning the same way.

Fourth trick: if our command message can be represented as a collection of key/value pairs (likely the case, if we account for hierarchical keys), then application/x-www-form-urlencoded is another possible representation that we might use. It shares the same problem we just saw: the client and server need to agree on semantics. But with one important difference: we already possess a standard format with facilities for describing application/x-www-form-urlencoded representations in band: HTML.

So, yeah -- it's REST.

That still doesn't mean "easy"; the client still has to know what it wants to do, how to find the right form to do it, how to find the fields it wants to set before submitting the form.  In much the same way that JSON Patch puts semantics on top of JSON, you need something that describes your domain semantics on top of HTML (or whatever other hypermedia representation you prefer).


An advantage in this approach is that it allows us to take advantage of the cache invalidation semantics of POST; which is to say, because we are sending the message to be handled by "the" resource, as opposed to some other resource that understands the semantics of a specific command, the caches are able to invalidate the representations that are actually being changed, rather than invalidating an irrelevant side channel.


Friday, January 5, 2018

Refactoring: paint by numbers

I've been working on a rewrite of an API, and I've been trying to ensure that my implementation of the new API has the same behavior as the existing implementation.  This has meant building up a suite of regression checks, and an adapter that allows me to use the new implementation to support a legacy client.

In this particular case, there is a one-to-one relationship between the methods in the old API and the new -- the new variant just uses different spellings for the arguments and introduces some extra seams.

The process has been "test first" (although I would not call the design "test driven").  It begins with a check, using the legacy implementation.  This stage of the exercise is to make sure that I understand how the existing behavior works.



We call a factory method in the production code, which acts as a composition root to create an instance of the legacy implementation. We pass a reference to the interface to a check, which exercises the API through a use case, validating various checks along the way.

Having done this, we then introduce a new test; this one calling a factory method that produces an instance of an adapter, that acts as a bridge between legacy clients and the new API.


The signature of the factory method here is a consequence of the pattern that follows, where I work in three distinct cycles
  • RED: begin a test calibration, verifying that the test fails
  • GREEN: complete the test calibration, verifying that the test passes
  • REPLACE: introduce the adapter into the mix, and verify that the test continues to pass.
To begin, I create an implementation of the API that is suitable for calibrating a test, by ensuring that a broken implementation fails. This is straightforward; I just need to throw UnsupportedOperationExceptions.

Then, I created an abstract decorator, implementing the legacy API by simply dispatching each method to another implementation of the same interface.

And then I define my adapter, which extends the wrapper of the legacy API, and also accepts an instance of the new API.

Finally, with all the plumbing in place, I return a new instance of this class from the factory method.
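A compressed sketch of that scaffolding (the interface and names are hypothetical; the real interfaces are much wider):

// The legacy API we need to keep satisfying (stand-in for the real thing).
interface LegacyApi {
    String lookup(String id);
}

// Calibration facade: every method fails, so we can verify that a broken
// implementation makes the check go red.
final class CalibrationFacade implements LegacyApi {
    public String lookup(String id) { throw new UnsupportedOperationException(); }
}

// Abstract decorator: dispatches every method to another implementation of the same interface.
abstract class LegacyApiWrapper implements LegacyApi {
    protected final LegacyApi delegate;
    protected LegacyApiWrapper(LegacyApi delegate) { this.delegate = delegate; }
    public String lookup(String id) { return delegate.lookup(id); }
}

// Adapter: extends the wrapper, holds an instance of the new API, and is where
// methods get re-implemented one at a time in terms of v2 during the REPLACE phase.
final class LegacyToV2Adapter extends LegacyApiWrapper {
    private final NewApi v2;
    LegacyToV2Adapter(LegacyApi fallback, NewApi v2) {
        super(fallback);
        this.v2 = v2;
    }
}

interface NewApi { }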

My implementation protocol then looks like this; first, I run the test using the adapter as is. With no overrides in place, each call in the api gets directed to TEST_CALIBRATION_FACADE, which throws an UnsupportedOperationException, and the check fails.

To complete the test calibration, I override the implementation of the method(s) I need locally, directing them to a local instance of the legacy implementation, like so:

The test passes, of course, because we're using the same implementation that we used to set up the use case originally.

In the replace phase, the legacy implementation gets inlined here in the factory method, so that I can see precisely what's going on, and I can start moving the implementation details to the new API.

Once I've reached the point that all of the methods have been implemented, I can ditch this scaffolding, and provide an implementation of the legacy interface that delegates all of the work directly to v2; no abstract wrapper required.

There's an interesting mirroring here; the application to model interface is v1 to v2, then I have a bunch of coordination in the new idiom, but at the bottom of the pile, the v2 implementation is just plugging back into v1 persistence. You can see an example of that here - Booking looks fairly normal, just an orchestration with the repository element. WrappedCargo looks a little bit odd, perhaps -- it's not an "aggregate root" in the usual sense, it's not a domain entity lifted into a higher role. Instead, it's a plumbing element wrapped around the legacy root object (with some additional plumbing to deal with the translations).

Longer term, I'll create a mapping from the legacy storage schema to an entity that understands the V2 API, and eventually swap out the O/RM altogether by migrating the state from the RDBMS to a document store.

Thursday, January 4, 2018

Property based testing: Mars Rover

My go-to TDD exercise has been the Mars Rover kata.  Trying a property test first approach to it has been a struggle.  Compared with an example driven approach, there seems to be a whole lot of front loaded pain without an obvious payoff.

I reached out to Paul Holser to see if I was missing a bet, and of course the discussion turned to what sorts of properties I had discovered in the design.

First property: for all inputs, the output is an ordered collection of representations of rover positions.  If we are checking within the domain boundary, this is pretty trivial -- we can use the type system to enforce this constraint.  If we are checking from outside of the boundary, then we're checking that the lines of text/bytes that have been returned to us can be parsed: three entries per representation, where the first two entries are coordinates/integers and the last entry is a member of the set of cardinal directions.

Second property: for all inputs, the number of representations in the output equals the number of rovers described by the input.  Rovers are conserved.

Third property: for any inputs, you get the same output each time you run it.

Fourth property: for any pair of inputs, reversing the order of execution does not change the outputs.


In other words, it's a pure function.  I don't actually know the battery of tests that you should run to establish this fact.

A rover with an empty instruction set remains at the same coordinates in the same orientation.  This gives us our fifth property: for all inputs, if we replace a single rover's instructions with an empty instruction set (thereby creating a different input), the corresponding output will have the same position.  Note that this indirectly demonstrates that the order of the outputs matches the order of the inputs.

This property would not apply if the rovers moved other than by consuming their own instructions. For example, if one rover were to push another rover out of the way, then the property would not hold.
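A sketch of how that fifth property might be checked; the RoverSimulation interface here is a hypothetical stand-in for the system under test, and a seeded Random stands in for a property-testing library's generator:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical API for the system under test; the real kata code would differ.
interface RoverSimulation {
    List<String> run(String plateau, List<String> startingPositions, List<String> instructions);
}

final class EmptyInstructionsProperty {
    // Property five: replacing a selected rover's instructions with the empty
    // string leaves the corresponding output equal to that rover's input position.
    static boolean holdsFor(RoverSimulation sut, String plateau,
                            List<String> positions, List<String> instructions, long seed) {
        int selected = new Random(seed).nextInt(positions.size());

        List<String> modified = new ArrayList<>(instructions);
        modified.set(selected, "");                                // the only change to the input

        List<String> output = sut.run(plateau, positions, modified);
        return output.get(selected).equals(positions.get(selected));
    }
}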

The above identity leads to two additional properties, both variations of different paths, same destination.

The sixth property takes advantage of the identity to allow us to establish that the behavior of the rovers is consistent with their programs being executed in the right order.  So for all inputs, we can select any rover, and create a new program in the following way: the positions and instructions of the rovers before the selection in the list are copied directly, and the positions of the remaining rovers are copied with empty instructions.  We get the corresponding output, and then build a second program -- the output positions are taken as inputs, with empty instruction sets assigned to the rovers before the selected rover, and the original instruction sets assigned to the remainder.  The property to be checked is that the output of this second program matches the original that we started with.

The seventh property is similar; instead of breaking up the original program along neat rover boundaries, we can also split the selected rover's instructions into two pieces, running part of the instruction set with the rovers ahead of it, then running the remaining instructions with the subsequent rovers.

Both of these are basically establishing that we're dealing with a stateless process; the next collection of positions depends only on the current collection of positions and the next instruction.

Up until this point, we haven't really been looking at the semantics of the output at all; we've got a list of coordinates and orientations, but all we've done is check for equality.

Putting this another way, the simplest thing that could possibly work is still to ignore the instruction sets entirely, and simply return the original positions and orientations in each case.

The plateau is rectangular, and presumably the boundary effects are equivalent on all sides.  If you flip the rectangle upside down -- exchanging north for south and east for west -- you get the same configuration of rovers in a different coordinate system.   Right and Left have the same effect on orientation that they did before the flip.  Any interactions with the boundary will still appear at the same points in the evaluations of the rovers.

This yields our eighth property: for any input, if we flip the input position, compute the output, and then flip the output, we should get the same answer that we would get from running the original input as is.

There is a similar result for quarter rotations, if you adjust the input so that the plateau itself is square.

Rotations are mostly immune to boundary effects and collisions, so it's probably reasonable to start there.

One assumption that we probably need to make is that all valid inputs have rovers that start within the region of the specified plateau.  The problem statement doesn't address that point.  I don't think property based testing helps much here - running the checks can only tell you if the test and implementation made compatible assumptions, not correct assumptions.

Property nine: for all inputs, we can select any rover and replace its original instruction set with one that has all of the move instructions removed.  When we run this program:
  • The coordinates of the output should match those of the input
  • The orientation of the output will match that of the input if, and only if, the difference in the count of left instructions and the count of right instructions is a multiple of four. 
Finally, we have the possibility of an input that forces us to change the position of the rover!

This establishes that we've got four-symmetry, that left and right cancel each other out, and that rotations preserve position.

Property ten: for all inputs, we can select any rover and replace its original instruction set with one that elides all of the left and right instructions.  When we run this program, the orientation of the selected rover will be unchanged.

EDIT: come to think of it, any single instruction leaves at least two of the three properties unchanged; the remaining property changes by at most one unit.  The complication of the move instruction is that you need the orientation to know which property is changing in which direction.

We can introduce into this domain the concept of a displacement - we're in a taxicab geometry.  The number of moves in an instruction set establishes an upper bound on the displacement that can be achieved; we know that the rover will be found somewhere in the circle.

Property eleven: for all inputs, we can measure the displacement of the output coordinates from the input coordinates.  For each rover, the displacement will be less than or equal to the count of moves in its instruction set.

Furthermore, if the circle doesn't reach the boundary of the plateau, then we don't need to worry about boundary effects for that rover.  Taking the initial positions of the rovers, and the size of the bounding circles allows us to compute a bounding box; if the box doesn't reach the plateau boundaries, then the simulation is completely free of boundary effects.

If the plateau is large enough to enclose the bounding box at two different locations, then we can establish the twelfth property - that of translation symmetry - in the following way.  Given any valid input, we can compute the extents of a bounding box that encloses the possible paths of all rovers.  Choose a displacement in positive x and y.  Compute the dimensions of a plateau by adding the displacement to the extent of the bounding box.  Create two inputs using this plateau: for the first, compute the initial positions of the rovers such that the lower left corner of the bounding box is at the origin; for the second, compute initial positions of the rovers such that the upper right corner of the bounding box is at the upper right corner of the plateau.  Note that the relative displacement of the input coordinates of the corresponding rovers should be the same.  Compute outputs for both programs; the displacements of the corresponding outputs should all match.

The thirteenth property also plays with bounding boxes, establishing that if the bounding circles of two rovers don't overlap, then they don't have any interference effects.  The check goes something like this: first you run a single program with all of the rovers participating.  Then you run a program that isolates each rover on the same plateau, and confirm that the final positions of the isolated rovers match those predicted by the single program containing all of the rovers.

It's not clear how you transform an input with overlapping bounding circles into one that doesn't have them.  Do you move the rovers? Shrink the bounding circles? Skip examples that don't match the criteria?

But in the main, it seems to be a very powerful pattern to use an input as a seed from which you can compute an input that has the properties you want to verify -- using an identity transformation wherever possible.

It's not so clear to me that these constraints are driving the design in any useful way.  For example, we don't have any properties that establish that left and right are correctly oriented.  We don't have any tests that actually demonstrate that the rover has moved.





Tuesday, January 2, 2018

Property based testing: thoughts of a novice.




The tension between these two ideas drives me nuts.  Thinking way harder means that I'm not delivering features faster.

Example based testing is straight forward; choose an input, hard code the output, remove the duplication, repeat until you have no more examples that produce the wrong output.  You may even be able to estimate the minimum required number of tests just by thinking about the cyclomatic complexity of the problem.

But this in turn means that you can't easily judge "complete" just from looking at the demonstrated examples.  As Scott Wlaschin points out, a finite suite of example tests can be beaten by a pathological implementation that is fitted to the suite.

Property based tests handle this concern better -- they explore the domain, rather than just sampling it.  That's a lie, of course; what property based tests are really doing is sampling a lot of the domain -- enough that the pathological fake is more expensive than just solving the problem.

My most startling test result, ever, came about from a property test which revealed that I had completely failed to recognize that the properties that I thought would hold were not consistent with the implementation I had chosen.

But it didn't come from randomly exploring the space, but rather from choosing an example from the domain that I recognized was "weird".  Fortunately, there were lots of weird values in the domain, and they all demonstrated that my implementation didn't support the property in question.  So I got "lucky" without having to write four billion tests.

I'm not at all sold on the idea of using a random input generator to find a needle in the haystack.