Thursday, January 26, 2017

A RESTful Kitchen Oven

Some time back, I chatted with Asbjørn Ulsberg about an example of a REST API using an oven.  Shortly thereafter, he presented his conclusions at the Nordic APIs 2016 Platform Summit. [slides]
What follows is my own work, but clearly I was influenced by his point of view.

In my kitchen, there is a freestanding gas range.  On the inside, it's got various marvels that we no longer think about much: an ignition system, a thermostat, safety valves, etc.

But as a home cook, I really don't need to worry about those details at all.  I work with the touch pad at the top of the unit.  Bake, plus-plus-plus-plus, Start, and I get a beep and a display message letting me know that the oven is preheating.  Some time after that, another beep lets me know that the oven has reached the target temperature, and on good days the actual temperature stays near that target until the oven is shut off.

Let's explore, for a time, what it would look like to control the oven from a web browser.


For our first cut, we can look at an imperative approach.

HTTP/1.0 gave us everything we need.  GET allows us to retrieve the information identified by the request URI, and POST allows us to provide a block of data to a data handling process.

In a browser, it might look like this: we load the bookmark URI.  That would get some information resource -- perhaps a menu of the services available, perhaps a representation of the current state of the stove, maybe both.  One of the links would include a relation (as well as human readable cues) that communicates that this is the link to follow to set the oven temperature.  Following that link would GET another resource that is a web form; in this case it would just be an annotated field for temperature (set with the default value of 350F), semantic cues, and a submit button.  Submitting the form would post the contents to the server, which would in turn interpret the submitted document as instructions to present to the oven.  In other words, the web server reads the submitted data, and pushes the control buttons for us.

Having done that, the server would return a 200 status to us, with a representation of the action it just took, and links from there to perhaps the status page of the oven, so that we can read updates to know if the oven has reached temperature.  In a simple interface like the one on my stove, the updates will only announce that the oven is preheating.  A richer interface might include the current temperature, an estimate of the expected wait time, and so on.

Great, that gets us a website that allows a human being to control the stove from a web browser, but how do we turn that into a real API?  Answer: that is a real API.  Anybody can grab their favorite HTTP client and their favorite HTML parser, and write an agent that understands the link relations and form annotations (published as part of this API).  The agent uses the client to load the bookmark URL, loads the result into the parser, searches the resulting DOM for the elements it needs, submits the form, and so on.  Furthermore, the agent can talk to any stove at all that publishes the same relations and annotations.

And -- bonus trick: if you want to test the agent, but don't have a stove handy, you can just point it at any web server with test cases represented as graphs of html documents.  After all, the only URI that the agent actually knows anything about is the start point.  From that point on, it's just following links.
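Here is a rough sketch of what such an agent might look like, using only the Python standard library.  The link relation `set-oven-temperature`, the field name `temperature`, and all of the URIs are names I made up for illustration; a real oven API would publish its own.  The "stove" is just a dictionary of HTML documents, which is the test trick described above.

```python
from html.parser import HTMLParser

# Hypothetical names; a real API would document its own relation and field.
SET_TEMPERATURE_REL = "set-oven-temperature"
TEMPERATURE_FIELD = "temperature"


class OvenPageParser(HTMLParser):
    """Collects links by relation, plus the first form's action and named inputs."""

    def __init__(self):
        super().__init__()
        self.links = {}          # rel -> href
        self.form_action = None  # action URI of the first form on the page
        self.fields = {}         # input name -> default value

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and "rel" in attrs and "href" in attrs:
            self.links[attrs["rel"]] = attrs["href"]
        elif tag == "form" and self.form_action is None:
            self.form_action = attrs.get("action")
        elif tag == "input" and "name" in attrs:
            self.fields[attrs["name"]] = attrs.get("value", "")


def parse(html):
    parser = OvenPageParser()
    parser.feed(html)
    return parser


def plan_set_temperature(fetch, bookmark_uri, degrees):
    """Follow links from the bookmark; return (form action, form data) to POST.

    `fetch` is any callable that maps a URI to an HTML string, so the same
    agent can run against a real HTTP client or a dictionary of test pages.
    """
    menu = parse(fetch(bookmark_uri))
    form_page = parse(fetch(menu.links[SET_TEMPERATURE_REL]))
    data = dict(form_page.fields)          # start from the form's defaults
    data[TEMPERATURE_FIELD] = str(degrees)
    return form_page.form_action, data


# A fake stove: a graph of HTML documents keyed by URI.
FAKE_STOVE = {
    "/": (
        '<a rel="set-oven-temperature" href="/forms/temperature">Set temperature</a>'
        '<a rel="status" href="/status">Oven status</a>'
    ),
    "/forms/temperature": (
        '<form method="post" action="/commands">'
        '<input name="temperature" value="350">'
        '<input type="submit" value="Start">'
        "</form>"
    ),
}

action, data = plan_set_temperature(FAKE_STOVE.__getitem__, "/", 425)
```

The only URI the agent hard-codes is the bookmark; everything else it discovers from the documents it is handed.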

That's a nice demonstration of hypermedia, and the flexibility that comes from using the uniform interface.  But it misses two key lessons, so let's try again.

This time, we'll go with a declarative approach.  We need another verb, PUT, defined in the HTTP/1.1 spec (although it had appeared in the quarantine of Appendix D of the earlier HTTP/1.0 spec).  PUT's early definition got right to the heart of it:
    "The PUT method requests that the enclosed entity be stored under the supplied Request-URI."

What the heck does that mean for an oven?


To the oven, it means nothing -- but we aren't implementing an oven, we're implementing a web API for an oven.  To be a web API for something interesting means to express the access to that bit of interesting as though it were a document store.

In other words, the real goal here is that we should be able to control the stove from any HTTP aware document editor.

Here's how that might work.  We start from the bookmark page as we did before.  This includes a hypermedia link to a representation of the current temperature settings of the oven.  For instance, that representation might be a JSON document which includes a temperature field somewhere in the schema.  So the document editor can GET the current state represented as a JSON document.  (Important note: this representation does NOT include hyperlinks -- we're not going to allow the document editor to modify those.  Instead, transitions away from the editable representation of the document are described in the Link header field.)

We load the document of the current state into our editor, which then follows the "self" relation with an OPTIONS call to discover if PUT is supported.  Discovering that this is so, the editor enables the editing controls.  The human operator can now replace the representation of the current temperature of the oven with a representation of the desired temperature, and click save.  The document editor does a simple PUT of the saved representation to the web server.
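As a minimal sketch of that last step, here is how an editor might build the PUT once the operator clicks save.  The URI, the field names, and the JSON layout are all assumptions of mine, not anything a particular oven defines.  Nothing is sent here; the function only constructs the request.

```python
import json
import urllib.request


def build_put(self_uri, current_document, desired_temperature):
    """Build (but do not send) a PUT that replaces the stored document.

    The whole edited representation is sent, not just the changed field,
    because PUT means "store this in place of what you have".
    """
    edited = dict(current_document)
    edited["temperature"] = desired_temperature  # hypothetical field name
    body = json.dumps(edited).encode("utf-8")
    return urllib.request.Request(
        self_uri,
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Hypothetical URI and document, for illustration only.
request = build_put(
    "http://oven.example/settings",
    {"temperature": 0, "units": "F"},
    350,
)
```

Passing the result to `urllib.request.urlopen` would actually perform the save.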

PUT is analogous to replace, in this use case.  In the case of a document store, we would simply overwrite the existing copy of the document.  We need to give some thought to what it means for an oven to mimic that behavior.

One possibility is that the resource in question represents, not the current *state* of the oven, but the current *program*.  So when we initially connect to the oven, the state of the program would be "do nothing", and we would replace that with a representation of the program "set the temperature to 300 degrees", and the API would "store" that program by actually commanding the oven to heat to the target temperature.  PUT of a new program actually matches very well with the semantics of my oven, which treats entering the desired state as an idempotent operation.

A separate, read-only, resource would be used to monitor the current state of the oven.

In this idiom, "turning off the oven" has a natural analog for our document editor -- deleting the document.  The editor can just as easily determine that DELETE is supported as PUT, and enable the appropriate controls for the user.
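One way the editor might make that decision is to read the Allow header that comes back from OPTIONS.  This is only a sketch; the control names are invented, and a real editor would have its own.

```python
def allowed_methods(allow_header):
    """Parse the value of an Allow header into a set of method names."""
    return {m.strip() for m in allow_header.split(",") if m.strip()}


def editor_controls(allow_header):
    """Decide which (hypothetical) editor controls to enable for a resource."""
    methods = allowed_methods(allow_header)
    return {
        "edit_and_save": "PUT" in methods,  # replace the document
        "delete": "DELETE" in methods,      # in this idiom: turn the oven off
    }


program_controls = editor_controls("GET, HEAD, OPTIONS, PUT, DELETE")
status_controls = editor_controls("GET, HEAD, OPTIONS")
```

The read-only status resource would simply advertise neither method, and the editor would leave its controls disabled.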

If we don't like the "update the program" approach, we can work with the current state document directly.  We enable PUT on the resource for the editable copy of the current oven state, enabling the edit controls in the document editor.  The agent can describe the document that they want, and submit the result.

The tricky bit, that is key to the declarative approach: the API needs to compare the current state with the target state, and determine for itself what commands need to be sent to the oven to bring about that change.  Just as the keypad insulates us from the details of the burners and valves, so too does the API insulate the consumers from the actual details of interacting with the oven.
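To make that comparison concrete, here is a small sketch of the planning step.  I'm assuming a keypad like mine, where Bake starts from a default of 350F; the 5 degree step per press, the Cancel button, and the shape of the state documents are guesses for the sake of the example.

```python
DEFAULT_BAKE_F = 350  # assumed default shown after pressing Bake
STEP_F = 5            # assumed change per press of plus or minus


def plan_commands(current, target):
    """Return the keypad presses that move the oven from `current` to `target`.

    States are dicts such as {"mode": "off"} or
    {"mode": "bake", "temperature": 370}.  Assumes pressing Bake while the
    oven is already baking simply reprograms it.
    """
    if current == target:
        return []                       # nothing to do; PUT stays idempotent
    if target["mode"] == "off":
        return ["Cancel"]
    delta = target["temperature"] - DEFAULT_BAKE_F
    if delta % STEP_F != 0:
        raise ValueError("temperature is not reachable from the keypad")
    presses = ["+" if delta > 0 else "-"] * (abs(delta) // STEP_F)
    return ["Bake", *presses, "Start"]


preheat = plan_commands({"mode": "off"}, {"mode": "bake", "temperature": 370})
shut_off = plan_commands({"mode": "bake", "temperature": 370}, {"mode": "off"})
```

The client never sees any of this; it only describes the document it wants.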

Reality ensues: the latency for bringing the oven up to temperature isn't really suitable for an HTTP response SLA.  So we need to apply a bit of lateral thinking; the API reports to the document store that the proposed edit has been Accepted (202).  It's not committal, but it is standard.  The response would likely include a link to the status monitor, which the client could load to see that the oven was preheating.
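A sketch of what that non-committal answer might carry follows.  The `status` relation name, the URI, and the body text are my assumptions; the 202 status code and the Link header field are standard HTTP.

```python
def accept_edit(status_uri):
    """Build (status, headers, body) for an edit that was queued, not finished."""
    headers = {
        # Points the client at the read-only monitor; relation name is assumed.
        "Link": '<%s>; rel="status"' % status_uri,
        "Content-Type": "text/plain; charset=utf-8",
    }
    return 202, headers, b"Edit accepted; the oven is preheating.\n"


status, headers, body = accept_edit("/oven/status")
```

The client can then poll the linked resource for as long as it cares to.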

Once again, automating this is easy -- you teach your agent how to speak oven (the standard link relations, the media-types that describe the states and the programs).  You use any HTTP aware document editor to publish the agent's changes to the server.  You test by directing the agent to a document store full of fake oven responses.
 
Do we need two versions of the API now? one for an imperative approach, another for the declarative approach?  Not at all -- the key is the bookmark URL.  We require that the agents be forgiving about links they do not recognize -- those can simply be ignored.  So on the bookmark page, we have a link relation that says "this way to the imperative setOvenTemperature interface", and another that says "this way to the declarative setOvenTemperature interface", and those two links can sit next to each other on the page.  Clients can follow whichever link they recognize.
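The forgiving behavior is simple to sketch.  The relation names below are invented stand-ins for the two setOvenTemperature relations; the point is only that anything unrecognized is skipped rather than treated as an error.

```python
def choose_link(links, understood):
    """Pick the first relation the agent understands.

    `links` is a list of (rel, href) pairs from the bookmark page;
    `understood` lists the agent's known relations in order of preference.
    Unrecognized relations are ignored.
    """
    by_rel = {}
    for rel, href in links:
        by_rel.setdefault(rel, href)  # keep the first href seen for each rel
    for rel in understood:
        if rel in by_rel:
            return rel, by_rel[rel]
    return None


# Hypothetical bookmark page contents.
BOOKMARK_LINKS = [
    ("set-oven-temperature-imperative", "/forms/temperature"),
    ("set-oven-temperature-declarative", "/oven/program"),
    ("some-future-relation", "/oven/whatever"),
]

old_agent = choose_link(BOOKMARK_LINKS, ["set-oven-temperature-imperative"])
new_agent = choose_link(
    BOOKMARK_LINKS,
    ["set-oven-temperature-declarative", "set-oven-temperature-imperative"],
)
```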

Document editing -- especially for small documents -- is reasonably straightforward in HTML as well; the basic idea is that we have a resource which renders the current representation of our document inside a text area in a form, which POSTs the modified version of the document to a resource that interprets it as a replacement state or a replacement program as before.

You can reasonably take two different approaches to enabling this protocol from the bookmark page.  One approach would be to add a third link (inventing a new link relation).  Another would be to use content negotiation -- the endpoint of the setOvenTemperature interface checks whether the Accept header asks for HTML; if so, the client is redirected to the entry point of that protocol, and otherwise it is directed back to the previously implemented PUT-based flow.
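The content negotiation option might be sketched like this.  It is deliberately naive: it looks only for an exact `text/html` entry and ignores wildcards such as `*/*`, so that a generic HTTP client is not mistaken for a browser.  The two target URIs and the choice of a 303 redirect are assumptions.

```python
def quality(accept_header, media_type):
    """Return the q-value given to an exact media type, or 0.0 if it is absent."""
    for item in accept_header.split(","):
        parts = [p.strip() for p in item.split(";")]
        if parts[0].lower() != media_type:
            continue
        q = 1.0  # a media type listed without a q parameter has q=1
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                q = float(value)
        return q
    return 0.0


def entry_point(accept_header):
    """Choose where to redirect a client of the declarative interface."""
    if quality(accept_header, "text/html") > 0:
        return 303, "/oven/temperature/form"   # HTML text-area flow
    return 303, "/oven/temperature/document"   # PUT-based flow


browser = entry_point("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8")
editor = entry_point("application/json")
```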

Using the HTML declarative flow also raises another interesting point about media types.  Text areas aren't very smart; they support only free-form text, so you end up relying on the operator not making any data entry errors.  With a standardized media-type, and a document editor that is schema aware, the editor can help the operator do the right thing.  For instance, the document editor may be able to assist the agent with navigation hints and a document object model, allowing the agent to seek directly to the document elements relevant to its goal.

Edit: go figure -- having written up my thoughts, I went back to look at Asbjørn Ulsberg's slides and realized that we had originally been talking about toasters, not ovens.




