Context

Granary will be a standalone deployable open source application. Users' primary way of interacting with Granary will be through its REST API, rather than through a UI, at least for the foreseeable future. This focus puts a larger burden than usual on the quality of our API documentation.

We'd like to evaluate different ways to keep API documentation up-to-date. For each strategy, we want to know:

What are the general pros and cons?
How do we find out that our docs have drifted from the API?
How can we document different versions of the API at the same time?

The contenders

I evaluated three software solutions that rely on code or documentation generation, plus one process solution (maintaining the spec by hand). The three libraries I considered were rho, tapir, and guardrail. For the software solutions, I created a small repository here.

Manual API spec maintenance

The most successful way we've done this in the past was a spec-first development pattern. In this pattern, we added endpoints to the API spec, then implemented them only after the spec changes were merged. We drifted away from this over time, and eventually wound up with a checklist item in our pull request template asking authors to confirm that the API spec had been updated. For a long time, our spec wasn't valid Swagger, which we only discovered when we tried to create a docs site. For a while after that, parts of our spec were incorrect, which we only discovered when users attempted to use the spec for API interaction (though there were other problems as well).

How do we find out that our docs have drifted from the API?

The way we answered this question before was to go through the spec and make sure that at least the happy path was correctly documented, i.e., that a Python client generated with Yelp's bravado library could interact with the API without errors. It was a manual process that relied heavily on our Python client repository. A different strategy we could consider would be to generate a Scala client from the hosted version of a spec on SwaggerHub and ensure that we can drop the generated client's data model in place of the existing data model. Testing this would require investing some software development time in tooling. Another option is to rely on upcoming panrec features to parse the generated client's data model and the existing data model and ensure that they agree.

Both of these strategies ensure only that the data models are correct. Neither detects, e.g., whether we've moved a route. They are a stronger consistency check than anything we've done before, but they still leave a lot of room for the spec to be subtly wrong, and every such error costs us support time (triaging issues, responding to help requests) that is really spec maintenance.
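To make the gap concrete, here is a minimal sketch of the kind of route-level check that neither strategy provides. It assumes we could extract a set of method and path pairs from both the spec and the running server. The `Route` and `Drift` types and the example routes are hypothetical, not existing tooling.

```scala
object DriftCheck {
  // A route as it would appear in either the spec or the running server.
  final case class Route(method: String, path: String)

  // The two ways the spec and the server can disagree about routes.
  final case class Drift(undocumented: Set[Route], unimplemented: Set[Route]) {
    def isEmpty: Boolean = undocumented.isEmpty && unimplemented.isEmpty
  }

  // Compare what the spec claims with what the server actually serves.
  def compare(documented: Set[Route], served: Set[Route]): Drift =
    Drift(
      undocumented = served -- documented,  // served but missing from the spec
      unimplemented = documented -- served  // in the spec but not served
    )

  // Hypothetical example: GET /models was moved to /v2/models without a spec update.
  val documented: Set[Route] = Set(Route("GET", "/models"), Route("POST", "/models"))
  val served: Set[Route] = Set(Route("GET", "/v2/models"), Route("POST", "/models"))
  val drift: Drift = compare(documented, served)
}
```

A data model comparison would pass in this example, because the models themselves did not change. Only a route-level comparison reports the move.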

How can we document different versions of the API at the same time?

The only strategy I've come up with for the manual maintenance version is a lot of copying and pasting. Suppose one route exists at /v1/models and another at /v2/models. I don't know how to use OpenAPI to share things between those two endpoints. Later changes that affect both, like adding a new response type, would I think need to be written by hand in both places. This sounds like a headache.

`tapir`

Changes to generate docs with tapir are here. Docs are served on localhost:8080/api/hello/docs.yaml (you can put that directly into swagger editor).

tapir is a library for separating the description of APIs from their implementation and interpreting those descriptions into different outputs. For example, an endpoint description can be interpreted into documentation (a YAML string) or into a server, given a function that maps the inputs described in the endpoint into the outputs described in the endpoint.

Endpoints in tapir explicitly encode input types, output types, and errors. An Endpoint[I, E, O, S] maps inputs of type I to outputs of type O, returning errors of type E, in streams of type S. So far I have not needed the stream type for anything. tapir makes it easy to add inputs to an endpoint (chain .in calls on the endpoint), to add outputs (.out), and to add metadata (.name and .description).
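The idea of one description with several interpretations can be sketched without tapir itself. The following is a toy model, not tapir's API: the `Endpoint` case class, `toDocs`, `toServer`, and the `hello` endpoint are all hypothetical names, and the stream type parameter is omitted. It only illustrates why the docs and the server share a single source.

```scala
object EndpointSketch {
  // Toy stand-in for an endpoint description (NOT tapir's actual types).
  // I = input type, E = error type, O = output type.
  final case class Endpoint[I, E, O](
      method: String,
      path: List[String],
      description: String,
      decodeInput: String => Option[I], // how to read I from a raw request
      encodeOutput: O => String,        // how to render O in a response
      encodeError: E => String          // how to render E in a response
  )

  // Interpretation 1: documentation, rendered as a YAML-like string.
  def toDocs[I, E, O](e: Endpoint[I, E, O]): String =
    List(
      "/" + e.path.mkString("/") + ":",
      "  " + e.method.toLowerCase + ":",
      "    description: " + e.description
    ).mkString("\n")

  // Interpretation 2: a server, given business logic I => Either[E, O].
  // The result maps a raw request to a (status code, body) pair.
  def toServer[I, E, O](e: Endpoint[I, E, O])(logic: I => Either[E, O]): String => (Int, String) =
    (raw: String) =>
      e.decodeInput(raw) match {
        case None => (400, "could not decode input")
        case Some(in) =>
          logic(in) match {
            case Right(out) => (200, e.encodeOutput(out))
            // This toy cannot discriminate between errors, so every Left becomes a 400.
            case Left(err) => (400, e.encodeError(err))
          }
      }

  // Hypothetical endpoint used only for illustration.
  val hello: Endpoint[String, String, String] = Endpoint[String, String, String](
    method = "GET",
    path = List("api", "hello"),
    description = "Greets the caller by name",
    decodeInput = (raw: String) => if (raw.nonEmpty) Some(raw) else None,
    encodeOutput = (out: String) => out,
    encodeError = (err: String) => err
  )

  val docs: String = toDocs(hello)
  val server: String => (Int, String) =
    toServer(hello)((name: String) =>
      if (name == "nobody") Left("unknown user") else Right("Hello, " + name))
}
```

Because `docs` and `server` are both derived from `hello`, changing the path or description in one place changes both outputs.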

The worst thing that happened to me while using tapir was that I accidentally wound up with unreachable routes. It seems like tapir's http4s interface wants us not to mount services onto paths (e.g. Router("/v1" -> new V1API, "/v2" -> new V2API)), but instead to include all path components in the endpoint descriptions.

I tested serving the docs for a route that returns an algebraic data type, and adding an authenticated route, to make sure that I understood how both of those paths work. Both were straightforward, and the ADT response was correctly encoded as a oneOf. The default response from a Left in my authentication function was a 400 instead of a 401, but that's primarily a consequence of my extremely simplified endpoints: they don't know how to encode specific errors, so they can't choose a response based on the error.

tapir endpoints can also be interpreted as clients, but I did not test this feature.

How do we find out that our docs have drifted from the API?

Our docs cannot drift from the API, because the docs and the server are interpretations of the same endpoints.

How can we document different versions of the API at the same time?

I think we can do this by separating endpoint components. The authentication docs mention defining the auth input first, so that it can be shared by many endpoints, and I believe we could do something similar with version inputs, e.g.,

object Endpoints {
  // Path segments are written without a leading slash; tapir joins them with "/".
  val v1 = endpoint.in("v1")
  val v2 = endpoint.in("v2")
  val scenesEndpointV1 = v1.in("scenes")...
  val scenesEndpointV2 = v2.in("scenes")...
}

Then each versioned collection of endpoints could be served off of its version prefix.
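As a rough illustration of why this avoids the copy-and-paste problem, here is a self-contained toy model (again, not tapir's API). The `Desc` type and the response names `SceneList`, `NotFound`, and `PageToken` are hypothetical. The shared `scenes` definition is written once and applied to each version prefix.

```scala
object VersionedEndpoints {
  // Toy endpoint description: path segments plus documented response types.
  final case class Desc(path: Vector[String], responses: Vector[String]) {
    def in(segment: String): Desc = copy(path = path :+ segment)
    def out(response: String): Desc = copy(responses = responses :+ response)
    def render: String = "/" + path.mkString("/") + " -> " + responses.mkString(" | ")
  }

  val endpoint: Desc = Desc(Vector.empty, Vector.empty)

  // Version prefixes are defined once.
  val v1: Desc = endpoint.in("v1")
  val v2: Desc = endpoint.in("v2")

  // Shared definition: a response type added here shows up in every version.
  def scenes(base: Desc): Desc = base.in("scenes").out("SceneList").out("NotFound")

  val scenesEndpointV1: Desc = scenes(v1)
  // A version-specific addition stays local to that version.
  val scenesEndpointV2: Desc = scenes(v2).out("PageToken")
}
```

Adding another `.out(...)` inside `scenes` would update the rendered description of both versions at once, which is the property the manual spec lacks.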

`rho`

Changes to generate docs with rho are here. Docs are served on localhost:8080/api/hello/swagger.json (you can put that directly into swagger editor).

rho is a library in the http4s ecosystem that provides an alternative routing DSL and automatically generates Swagger documentation from it. Route and parameter descriptions are combined with the routing logic to create RhoRoutes[F], which can be transformed into normal HttpRoutes[F] with a RhoMiddleware that also serves the API documentation as JSON on a configurable endpoint.

The worst part about rho is having to keep a number of odd operators in your head. For example, capturing query parameters is +?, capturing headers is >>>, attaching a body decoder is ^, adding descriptions is **, and binding the route to a function for business logic is |>>. It's possible we'd get used to these over time, but I had to look each of them up again to write what they were.

rho generates Swagger (OpenAPI 2.0) specifications as JSON. Because of this, it does not have access to the oneOf keyword for describing responses that might have one of several different schemas. The generated JSON included one error: a route referred to a Json schema (generated from an endpoint returning circe's Json type) that was never defined.

How do we find out that our docs have drifted from the API?

Our docs cannot drift from the API, because the RhoMiddleware creates docs for what our routes are actually doing.

How can we document different versions of the API at the same time?

Each service can serve its own docs, so mounting a service in the http4s Router will also mount documentation for that service.

`guardrail`

guardrail's http4s support is not currently documented, so I did not investigate this library further.

Decision

We should use tapir for automatically generating API documentation. The ADT support and straightforward API (inputs use .in, outputs use .out, auth extractors use auth) will flatten out the learning curve, and we'll have a stable and correct reference point for API documentation that users setting up their own deployments can refer to. We can call this out at the beginning of the README and hopefully save ourselves from having to answer an entire category of questions.

Consequences

The first routes added to the API will be slightly more difficult, because they'll include writing API routes with a new library for the first time.
The README should be updated to point to the location of the API documentation.