Thursday, January 26, 2017

A RESTful Kitchen Oven

Some time back, I chatted with Asbjørn Ulsberg about an example of a REST api using an oven.  Shortly there after, he presented his conclusions at the Nordic APIs 2016 Platform Summit. [slides]
What follows is my own work, but clearly I was influenced by his point of view.

 In my kitchen, there is a free standing gas range.  On the inside, it's got various marvels that we no longer think about much: an ignition system, and a thermostat, safety valves, etc.

But as a home cook, I really don't need to worry about those details at all.  I work with the touch pad at the top of the unit.  Bake, plus-plus-plus-plus, Start, and I get a beep and a display message letting me know that the oven is preheating.  Some time after that, another beep let's me know that the oven has reached the target temperature and on good days the actual temperature stays near that target until the oven is shut off.

Let's explore, for a time, what it would look like to control the oven from a web browser.

For our first cut, we can look at an imperative approach.

HTTP/1.0 gave us everything we need.  GET allows us to retrieve the information identified by the request uri, and POST allows us to provide a block of data to a data handling process.

In a browser, it might look like this: we load the bookmark URI.  That would get some information resource -- perhaps a menu of the services available, perhaps a representation of the current state of the stove, maybe both.  One of the links would include a relation (as well as human readable cues) that communicates that this is the link to follow to set the oven temperature.  Following that link would GET another resource that is a web form; in this case it would just be an annotated field for temperature (set with the default value of 350F), semantic cues, and a submit button.  Submitting the form would post the contents to the server, which would in turn interpret the submitted document as instructions to present to the oven.  In other words, the web server reads the submitted data, and pushes the control buttons for us.

Having done that, the server would return a 200 status to us, with a representation of the action it just took, and links from there to perhaps the status page of the oven, so that we can read updates to know if the oven has reached temperature.  In a simple interface like the one on my stove, the updates will only announce that the oven is preheating.  A richer interface might include the current temperature, an estimate of the expected wait time, and so on.

Great, that gets us a website that allows a human being to control the stove from a web browser, but how do we turn that into a real API?  Answer: that is a real API.  Anybody can grab their favorite http client and their favorite html parser, write an agent that understands the link relations and form annotations (published as part of this API).  The agent uses the client to load the bookmark url, loads the result into the parser, searches the resulting DOM for the elements it needs, submits the form, and so on.  Furthermore, the agent can talk to any stove at all.

And -- bonus trick: if you want to test the agent, but don't have a stove handy, you can just point it at any web server with test cases represented as graphs of html documents.  After all, the only URI that the agent actually knows anything about is the start point.  From that point on, it's just following links.

That's a nice demonstration of hypermedia, and the flexibility that comes from using the uniform interface.  But it misses two key lessons, so let's try again.

This time, we'll go with a declarative approach.  We need another verb, PUT, defined in the HTTP/1.1 spec (although it had appeared in the quarantine of the earlier appendix D).  Puts early definition got right to the heart of it.
The PUT method requests that the enclosed entity be stored under the

supplied Request-URI.
What the heck does that mean for an oven?

To the oven, it means nothing -- but we aren't implementing an oven, we're implementing a web api for an oven.  To be a web-api for something interesting means to express the access to that bit of interesting as though it were a document store.

In other words, the real goal here is that we should be able to control the stove from any HTTP aware document editor.

Here's how that might work.  We start from the bookmark page as we did before.  This includes a hypermedia link to a representation of the current temperature settings of the oven.  For instance, that representation might be a json document which includes a temperature field somewhere in the schema.  So the document can GET the current state represented as a json document (important note: this representation does NOT include hyperlinks -- we're not going to allow the document editor to modify those.  Instead, transitions away from the editable representation of the document are described in the Link header field.)

We load the document of the current state into our editor, which then follows the "self" relation with an OPTIONS call to discover if PUT is supported.  Discovering that this is so, the editor enables the editing controls.  The human operator can now replace the representation of the current temperature of the oven with a representation of the desired temperature, and click save.  The document editor does a simple PUT of the saved representation to the web server.

PUT is analogous to replace, in this use case.  In the case of a document store, we would simply overwrite the existing copy of the document.  We need to give some thought to what it means for an oven to mimic that behavior.

One possibility is that the resource in question represents, not the current *state* of the oven, but the current *program*.  So we initially connect to the oven the state of the program would be do nothing, and we would replace that with a representation of the program "set the temperature to 300 degrees", and the API would "store" that program by actually commanding the oven to heat to the target temperature.  PUT of a new program actually matches very well with the semantics of my oven, which treats entering the desired state as an idempotent operation.

A separate, read-only, resource would be used to monitor the current state of the oven.

In this idiom, "turning off the oven" has a natural analog for our document editor -- deleting the document.  The editor can just as easily determine that DELETE is supported as PUT, and enable the property controls for the user.

If we don't like the "update the program" approach, we can work with the current state document directly.  We enable PUT on the resource for the editable copy of the current oven state, enabling the edit controls in the document editor.  The agent can describe the document that they want, and submit the result.

The tricky bit, that is key to the declarative approach: the API needs to compare the current state with the target state, and determine for itself what commands need to be sent to the oven to bring about that change.  Just as the keypad insulates us from the details of the burners and valves, so to does the API insulate the consumers from the actual details of interacting with the oven.

Reality ensues: the latency for bringing the oven up to temperature isn't really suitable for an HTTP response SLA.  So we need to apply a bit of lateral thinking; the API reports to the document store that the proposed edit has been Accepted (202).  It's not committal, but it is standard.  The response would likely include a link to the status monitor, which the client could load to see that the oven was preheating.

Once again, automating this is easy -- you teach your agent how to speak oven (the standard link relations, the media-types that describe the states and the programs).  You use any HTTP away document editor to publish the agents changes to the server.  You test by directing the agent to a document store full of fake oven responses.
Do we need two versions of the API now? one for an imperative approach, another for the declarative approach?  Not at all -- the key is the bookmark URL.  We require that the agents be forgiving about links they do not recognize -- those can simply be ignored.  So on the bookmark page, we have a link relation that says "this way to the imperative setOvenTemperature interface", and another that says "this way to the declarative setOvenTemperature interface", and those two links can sit next to each other on the page.  Clients can follow whichever link they recognize.

Document editing -- especially for small documents -- is reasonably straight forward in html as well; the basic idea is that we have a resource which renders the current representation of our document inside a  text area in a form, which POSTs the modified version of the document to a resource that interprets it as a replacement state or a replacement program as before.

You can reasonably take two different approaches to enabling this protocol from the bookmark page.  One approach would be to add a third link (inventing a new link relation).  Another would be to use content negotiation -- the endpoint of the setOvenTemperature interface checks to see if the Accept-Type is html (in which case the client is redirected to the entry point of that protocol, otherwise directed back to the previously implement PUT based flow).

Using the HTML declarative flow also raises another interesting point about media types.  Text areas aren't very smart, they support free form text, so you end up relying on the operator not making any data entry errors.  With a standardized media-type, and a document editor that is schema aware, the editor can help the operator do the right thing.  For instance the document editor may be able to assist the agent with navigation hints and a document object model, allowing the agent to seek directly to the document elements relevant to its goal.

Edit: go figure -- having written up my thoughts, I went back to look at Asbjørn Ulsberg's slides and realized that we had originally been talking about toasters, not ovens.

On Anemic Domain Models

The data model has the responsibility to store data, whereas the domain model has the responsibility to handle business logic. An application can be considered CRUD only if there is no business logic around the data model. Even in this (rare) case, your data model is not your domain model. It just means that, as no business logics is involved, we don’t need any abstraction to manage it, and thus we have no domain model.
 Ouarzy, It Is Not Your Domain.

I really like this observation, in that it gives me some language to distinguish CRUD implementations, where we don't have any invariant to enforce, and anemic domains, where there is an invariant, but we don't keep the invariant and the state changing behavior in the same logical unit.

We probably need more variations here
  • There is no invariant (CRUD)
  • There is an invariant, but it is maintained by human beings -- the invariant is not present in the code
  • There is an invariant in the code, but the code is arranged such that modification of state can happen without maintaining the invariant (anemic)
  • There is an invariant in the code, and the code is arranged so that any change of state is checked against the invariant.
Having some anemic code paths isn't wrong -- it's a desirable property for the software to be able to get out of the way so that the operator in the present can veto the judgment of the implementer in the past. 

But if the software is any good, bypassing the domain model should be unusual enough that additional steps to disable the safeties is acceptable.

Thursday, January 19, 2017

Whatever it takes

Whatever it takes is a large cash bonus, paid up front.

A RESTful supply closet

At stackflow, another question was submitted about REST API design for non CRUD actions.  In thinking about it, I found an analogy that may help explain the point.

Imagine a supply closet; an actual physical closet in the real world.  The stock includes boxes of pencils.  What does a web API for the supply closet look like?

As Jim Webber pointed out years ago, HTTP is a document management application.  So the first thing to realize is that we are trying to create an interface that supports the illusion that the supply closet is a document store.  If we want to know about the current state of the closet, we read the latest report out of the store.  To change the state of the closet, we propose changes to the document store.

How do we convert the current state of the closet to a document?  In the real world (think 1950s office), we would ask the quartermaster for the latest inventory document.  If a recent one is available, the quartermaster gives us a copy of that document.  Otherwise, he can look in the closet, count the boxes, and produce send us a copy of the fresh report.

That, fundamentally, is GET.  We ask the API for a copy of the inventory report.  Maybe the API just copies the report that's posted on the closet door, maybe the API goes inside the closet to count everything, maybe the closet isn't accessible, so we just get the last report the API saw.  Doesn't matter, we got a document.

Now, key in the next stage is to realize that the document is not the closet; when we edit the document, boxes of pencils don't magically appear in the closet.  What we need to implement is the illusion that the closet really is a document store.

In our real world model, we read the inventory report, and there aren't enough pencils.  So we create a new document -- a memo to the quartermaster that says "stock more pencils".  When we deliver the memo to the quartermaster, he decides how to get more pencils for the closet -- maybe he gets boxes out of storage, or buys some from the store next door.  Maybe he updates his todo list (another document)  and tells you he'll get back to you.

This is the basic idiom of HTML forms.  We create (POST) a new document to the API, and the API interprets that document as changes to be made to the closet.  The requisition document and the inventory document are different resources.  For that matter, the collection of requisition documents and the inventory document are different resources.  So you need a different namespace of identifiers to work with.

HTTP (but not HTML) also supports another approach.  Instead of interacting with the closet by submitting new documents, we could interact with the closet by proposing edits to the existing documents.

This, to my mind, feels a bit more declarative -- you describe in the edited document the state that you want the closet to be in, and its up to the API to figure out the details of making that happen.  In our analogy, we've sent to the quartermaster a copy of the inventory with a bunch of corrections made to it, and he changes the state of the closet to match the document.

This is PUT -- specifically a PUT to the inventory resource.  Notice that it doesn't change what work the quartermaster needs to do to fill the closet; it doesn't change his schedule for doing that work, it doesn't change the artifacts that he generates while doing the work.  It just changes which document manipulation illusion we are choosing to support.

Now, HTTP is specific about the behavior of the imaginary document store we are mimicking, which is that PUT is an upsertIf we want fine grained control of the contents of the closet ("more pencils, leave everything else alone"), then we need to upsert to a resource with a matching grain.

PATCH is another alternative to introducing finer grained resources; we send the patch to the server, it compares the patched version of the inventory document to the original version, and from there decides what changes need to be made to the closet.

These are all variations of the same fundamental idea - the HTTP request describes the desired end state, and the implementation sitting behind the API figures out how to realize that end.

Thursday, January 5, 2017


Until you have performed a successful restore, it's not a backup.

Same idea, different spelling: nobody needs backups; what they need are restores.

Wednesday, January 4, 2017

TDD: A Tale of two Katas

Over the Holiday, I decided to re-examine Peter Siebel's Fischer Random Chess Kata.

I have, after all, a lot more laps under my belt than when I was first introduced to it, and I thought some of my recent studies would allow me to flesh things out more thoroughly.

Instead, I got a really educational train wreck out of it.  What I see now, having done the kata twice, is that the exercise (once you get the insight to separate the non determinable generator from the board production rules) is about applying specifications to a value type, rather than evaluating the side effects of behaviors.

You could incrementally develop a query object that verifies that some representation of a fair chess row satisfies the constraints -- the rules would come about as you add new tests to verify that that some input satisfies/does-not-satisfy the specification, until the system under test correctly understands the rules.  Another way of saying the same thing: you could write a factory that accepts as input an unconstrained representation of a row, and produces a validated value type for only those inputs that satisfy the constraints.

But, as RFC 1149.5 taught us, you can't push a stateless query out of a local minima.

Realizing this -- that the battle I thought I was going to write about was doomed before I even reached the first green bar, I decided to turn my attention to the bowling game kata.

Amusingly enough, I started from a first passing test, and then never moved off of the original green bar.

Part of the motivation for the exercise was my recent review of Uncle Bob's Dijkstra's Algorithm kata.  I wanted to play more with the boundaries in the test, and get a better feel for how they arise as a response to the iteration through the tests.

So I copied (less than perfectly) Uncle Bob's first green bar, and then started channeling my inner Kent Beck

Do you have some refactoring to do first?
With that idea in mind, I decided to put my attention on "maximizes clarity".  There's some tension in here -- the pattern that emerges is so obviously generic that one is include to suggest I was the victim of big design up front, and that I wasn't waiting for the duplication in tests to realize that pattern for me.  So on some level, one might argue that I've violated YAGNI.  On the other hand, if you can put a name on something, then it is already been realized -- you are simply choosing to acknowledge that realization, or not.

In doing that, I was surprised -- there are more boundaries in play than I had previously recognized.

There's a boundary between the programmer and the specification designer.  We can't think at the IDE and have it do the right thing, we actually need to type something; furthermore, that thing we type needs to satisfy a generic grammar (the programming language).

The specification designer is code that essentially responsible for "this is what the human being really meant."  It's the little DSL we write that makes introducing new specifications easy.

There's a boundary between specification design and the test harness -- we can certainly generate specifications for a test in more than one way, or re-use a specification for more than one test.  Broadly, the specification is a value type (describing input state and output state) where the test is behavior -- organized interactions with the system under test.

The interface between the test and the system under test is another boundary.  The specification describes state, but it is the responsibility of the test to choose when and how to share that state with the system under test.

Finally, there is the boundary within the system under test -- between the actual production code we are testing, and the adapter logic that aligns the interface exposed by our production code with that of the test harness.

This bothered me for a while - I knew, logically, that this separation was necessary if the production code was have the freedom to evolve.  But I couldn't shake the intuition that I could name that separation now, in which case it should be made explicit.

The following example is clearly pointless code

And yet this is the code we write all the time

And that's not a bad idea -- we are checking that two outcomes, produced in different ways, match.
But the spellings are wrong: the names weren't telling me the whole story. In particular, I'm constantly having problems remembering the convention of which argument comes first.

It finally sank in: the boundary that I am trying to name is time. A better spelling of the above is:

We write a failing test, then update an implementation to satisfy a check written in the past, and then we refactor the implementation, continuing to measure that the check is still satisfied after each change. If we've done it right, then we should be able to use our earliest checks until we get an actual change in the required behavior.

I also prefer a spelling like this, because it helps to break the symmetry that gives me trouble -- I don't need to worry any longer about whether or not I'm respecting screen direction, I just need to distinguish then from now.

The adapter lives in the same space; it's binding the past interface with the present interface.  The simplest thing that could possible work has those two things exactly aligned.  But there's a trap there -- it's going to be a lot easier to make this seam explicit now, when there is only a single test, than later, when you have many tests using the wrong surface to communicate with your production code.

There's another interpretation of this sequence.  In many cases, the implementation we are writing is an internal element of a larger application.  So when we write tests specifically for that internal element, we are (implicitly) creating a miniature application that communicates more directly with that internal element.  The test we have written is communicating with the adapter application, not with the internal element.

This happens organically when you work from the outside in - the tests are always interfacing with the outer surface, while the rich behaviors are developed within.

The notion of the adapter as an application is a deliberate one -- the dependency arrow points from the adapter to the test harness.  The adapter is a service provider, implementing an interface defined by the test harness itself.  The adapter is also interfacing with the production code; so if you were breaking these out into separate modules, the adapter would end up in the composition root.

Key benefit of these separations; when you want to take a new interface out for a "test drive", you don't need to touch the tests in any way -- the adapter application serves as the first consumer of the new production interface.

Note that the checks were defined in the past, which is the heuristic that reminds you that checking is the responsibility of the test harness, not the adapter application.  The only check that the adapter can reliably perform is "is the model in an internally consistent state", which is nice, but the safety to refactor comes from having an independent confirmation that the application outputs are unchanged.

Another benefit to this exercise: it has given me a better understanding of primitive obsessionBoundaries are about representations, and primitives are a natural language for describing representations.  Ergo, it makes sense that we describe our specifications with primitives, and we use primitives to communicate across the test boundary to the (implicit) adapter application, and from there to our proposed implementation.  If we aren't aware of the intermediate boundaries, or are deferring them, there's bound to be a lot of coupling between our specification design and the production implementation.