GraphQL and Type Systems

posted 28 May 2023

There will always be an impedance mismatch between your problem domain and the generic tools you use, whether that’s a programming language, a database or a communication protocol. Picking the right tool is usually all about making an educated guess as to which tool creates the least amount of friction and boilerplate.

I wish the problem domain was the only source of complexity, but a system is not an island. Things like development speed, public API usage, versioning and client distribution have to be taken into consideration as well, and I’d take a guess and assume that’s what we focus the most on¹.

GraphQL is kinda interesting in that regard. Before it, if you were to pick a communication protocol, you’d likely go for either REST (via OpenAPI) or gRPC/Protobuf (and similar binary protocols). Ember.js did put JSON:API on the map, though that was probably something you only knew about if you worked with Ember.

Not Generic Enough

Most posts discussing whether you should use GraphQL all (justifiably) talk about the big architectural differences between it and REST. I’m not going to repeat those arguments, though in my opinion, it summarises roughly to this:

Clients have a more flexible API where they can mix and match what they want, and the onus of making those queries fast is on the server.

That has some side effects, such as not being able to leverage HTTP caching and other things, but I’m not going to write about that either! No, I’ve been curious about the schema and types GraphQL provides, and what it doesn’t provide.

See, the GraphQL type system feels very limiting to me, and it kinda annoys me. So let’s dig into what I think is iffy.

Newtypes

The smallest component I find lacking is newtypes: A type that’s represented by a concrete scalar, but isn’t interchangeable with it.

If I were to make an API, I’d like to distinguish between the IDs denoting one type of object and another – for example, users and blog posts. I can’t do that without making a custom scalar, but scalars don’t tell the system what the underlying type is. Since GraphQL doesn’t restrict what a scalar can be, GraphQL generators can’t say whether the scalar value is sortable or printable, meaning you’re either forced to define those types yourself or find a way to inject that information through directives.

Newtypes aren’t only useful for IDs, they’re also useful for denoting different types of units. Consider this:

type Fuel {
  name: String!
  normalBoilingPoint: Float!
}

Is the boiling point in degrees Celsius, degrees Fahrenheit or in Kelvin? You’re either forced to do Hungarian notation, document the field, or make sure the GraphQL generators can somehow interpret the scalar’s underlying type.

The idea would be to have something like this:

newtype Kelvin Float

and let the underlying type be one of the predefined scalars (Int, Float, String, Boolean, ID).
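
For comparison, this is roughly the behaviour I’m after, sketched with Python’s typing.NewType. The names UserID, BlogPostID, Kelvin and both helper functions are made up for illustration, and the distinction is only enforced by a static type checker such as mypy, not at runtime.

```python
from typing import NewType

# Hypothetical ID and unit types; the names are illustrative only.
UserID = NewType("UserID", str)
BlogPostID = NewType("BlogPostID", str)
Kelvin = NewType("Kelvin", float)


def fetch_blog_post_title(post_id: BlogPostID) -> str:
    """Stand-in for a real lookup; only the signature matters here."""
    return f"post:{post_id}"


def to_celsius(temperature: Kelvin) -> float:
    """Convert a Kelvin value to degrees Celsius."""
    return temperature - 273.15


user_id = UserID("42")
post_id = BlogPostID("42")

# A type checker rejects fetch_blog_post_title(user_id), even though
# both IDs are plain strings at runtime.
title = fetch_blog_post_title(post_id)

# Because the underlying type is known to be float, tooling also knows
# that Kelvin values can be sorted and printed. The numbers are arbitrary.
temperatures = sorted([Kelvin(300.0), Kelvin(111.65)])
boiling_point_c = to_celsius(temperatures[0])
```

The important bit is that the underlying type is part of the declaration, which is exactly what a custom GraphQL scalar doesn’t give you.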

Parametric Polymorphism

If you download GitHub’s public GraphQL schema and search for type .*Edge {, you’ll get 133 matches. Almost all of them contain only the fields cursor and node, with only the node field differing between them. You end up with the same result when you search for type .*Connection {.

It feels a little weird that parametric polymorphism (a.k.a. generics) isn’t part of the specification, especially when the suggested pagination method² forces you to create an additional two types per type you want to paginate over!

For example, here’s how GitHub defines its commit connections:

type CommitConnection {
  edges: [CommitEdge]
  nodes: [Commit]
  pageInfo: PageInfo!
  totalCount: Int!
}

type CommitEdge {
  cursor: String!
  node: Commit
}

type Commit {
  # ...
}

It would be so much more convenient to have the connection specified as follows:

type Connection<T> {
  edges: [Edge<T>]
  nodes: [T]
  pageInfo: PageInfo!
  totalCount: Int!
}

type Edge<T> {
  cursor: String!
  node: T
}

# then in some query call
type mytype {
  connection: Connection<Commit>!
}

Not only because it’s shorter to write, but also because type systems with generics can leverage this information to make generic pagination iterators on top of the Connection<T> type.
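
Here’s a minimal sketch of what such a generic iterator could look like on the client side, written in Python. It assumes a PageInfo with hasNextPage and endCursor (as in the Relay-style connections), leaves out nodes and totalCount for brevity, and uses a made-up in-memory fetch_commits function in place of a real API call.

```python
from dataclasses import dataclass
from typing import Callable, Generic, Iterator, List, Optional, TypeVar

T = TypeVar("T")


@dataclass
class PageInfo:
    has_next_page: bool
    end_cursor: Optional[str]


@dataclass
class Edge(Generic[T]):
    cursor: str
    node: Optional[T]


@dataclass
class Connection(Generic[T]):
    edges: List[Edge[T]]
    page_info: PageInfo


def iterate_nodes(
    fetch_page: Callable[[Optional[str]], Connection[T]],
) -> Iterator[T]:
    """Yield every non-null node, following cursors page by page."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)
        for edge in page.edges:
            if edge.node is not None:
                yield edge.node
        if not page.page_info.has_next_page:
            return
        cursor = page.page_info.end_cursor


# Hypothetical in-memory "server" returning two commits per page.
COMMITS = ["a1", "b2", "c3", "d4", "e5"]


def fetch_commits(after: Optional[str]) -> Connection[str]:
    start = 0 if after is None else int(after) + 1
    chunk = COMMITS[start:start + 2]
    edges = [Edge(cursor=str(start + i), node=c) for i, c in enumerate(chunk)]
    has_next = start + len(chunk) < len(COMMITS)
    end_cursor = edges[-1].cursor if edges else None
    return Connection(edges=edges, page_info=PageInfo(has_next, end_cursor))


all_commits = list(iterate_nodes(fetch_commits))
```

iterate_nodes is written once and works for any node type. Without generics in the schema, a code generator has no way of knowing that CommitConnection and, say, an issue connection share the same shape.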

There are other places where this makes sense as well. For example, where I currently work, a lot of the data we have is either estimated, known or user-defined (i.e. overriding the defaults). It’d be nice to have it specified like so:

enum Source {
  ESTIMATED
  KNOWN
  USER_DEFINED
}

type WithSource<T> {
  value: T!
  source: Source!
}

However, right now we either have to make one type for each type we attach such metadata to, or bundle the source in the top-level object by adding a valueSource field. It’s not fun to do this either for the server OR the client.

As a final example, assume you have a simple CRUD app. How would you go about editing multiple values on a single object? For example, assume you have a GraphQL blog post API, and you want to provide the means to update it. A reasonable first attempt would be something like this:

type BlogPost {
  id: ID!
  content: String!
  published: Boolean!
  tags: [TagID!]!
  responseTo: BlogPostID

  # metadata fields here
}

input BlogPostUpdate {
  content: String
  published: Boolean
  tags: [TagID!]
  responseTo: BlogPostID
}

type Mutation {
  updateBlogPost(id: BlogPostID!, update: BlogPostUpdate!): BlogPost!
}

All fields are optional to send, and when not sent, the field’s value is left as-is.

This works fine for everything except for responseTo, which you now cannot set to null because it’s masked. You can make a small type to bypass that problem:

input NullableBlogPostID {
  id: BlogPostID
}

But again, a generic type would avoid the redundant nullable wrappers that you’d make all over the place. And as a bonus, such a type can be easily applied to all input types on your CRUD type, depending on how ! is transferred through the type system:

input Optional<T> {
  value: T
}

input BlogPostUpdate {
  content: Optional<String!>
  published: Optional<Boolean!>
  tags: Optional<[TagID!]!>
  responseTo: Optional<BlogPostID>
}
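
To show how a server could interpret such a wrapper, here’s a small Python sketch. It assumes the update arrives as a plain dictionary where a wrapped field looks like {"value": ...}; the apply_update function and the sample post are hypothetical.

```python
from typing import Any, Dict

# Hypothetical stored blog post; field names follow the schema above.
post = {
    "content": "Hello",
    "published": False,
    "tags": ["graphql"],
    "responseTo": "post-1",
}


def apply_update(current: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of `current` with every wrapped field applied.

    A field absent from `update` (or sent as null) is left untouched;
    a field sent as {"value": ...} is overwritten, even when the inner
    value is None.
    """
    result = dict(current)
    for field, wrapper in update.items():
        if wrapper is None:
            continue  # no wrapper sent: leave the field as-is
        result[field] = wrapper["value"]
    return result


# Clear responseTo and publish the post, leaving content and tags alone.
updated = apply_update(
    post,
    {"responseTo": {"value": None}, "published": {"value": True}},
)
```

The wrapper gives you two layers: the outer one says whether to touch the field at all, and the inner one carries the new value, which may itself be null.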

Sum Types

It makes sense that the GraphQL type system is asymmetric, but that obviously comes with annoyances. Take unions, for example: Only the server can respond with unions. That means the client is forced to send input types that look like this:

# NB: Only set one field, leave the others alone!
input OneOfThree {
  a: A
  b: B
  c: C
}

Here, you have 3 valid states, and 5 invalid ones that are not caught by the type system.

We actually have this situation over at Maritime Optima, the place where I currently work. When we route shipping vessels from one location to another, the waypoints can either be a vessel’s current location, a port or a point in the map. They are all different, so we have this in our GraphQL schema (simplified):

input LatLngInput {
  lat: Float!
  lng: Float!
}

# NB: Only set one field, leave the others alone!
input WaypointInput {
  vessel: ID
  port: ID
  point: LatLngInput
}

input RouteRequest {
  # ...

  # Must contain at least 2 waypoints
  waypoints: [WaypointInput!]!
}

A sum/union type here would make sure we don’t have to do any runtime type checks.
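
The runtime check in question looks roughly like the Python sketch below. The validate_waypoint function is hypothetical and treats the input as a plain dictionary; the loop afterwards just enumerates every set/unset combination of the three fields to confirm the 3 valid vs. 5 invalid split.

```python
from itertools import product
from typing import Any, Dict

WAYPOINT_FIELDS = ("vessel", "port", "point")


def validate_waypoint(waypoint: Dict[str, Any]) -> str:
    """Return the name of the single field that is set, or raise ValueError."""
    set_fields = [f for f in WAYPOINT_FIELDS if waypoint.get(f) is not None]
    if len(set_fields) != 1:
        raise ValueError(
            f"expected exactly one of {WAYPOINT_FIELDS}, got {set_fields}"
        )
    return set_fields[0]


# Enumerate every set/unset combination of the three nullable fields.
valid = 0
invalid = 0
for flags in product([False, True], repeat=len(WAYPOINT_FIELDS)):
    candidate = {
        field: ("x" if is_set else None)
        for field, is_set in zip(WAYPOINT_FIELDS, flags)
    }
    try:
        validate_waypoint(candidate)
        valid += 1
    except ValueError:
        invalid += 1

print(valid, invalid)  # prints "3 5"
```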

Additionally, you can inline errors with sum types. Let’s go back to the commit connection example:

type CommitConnection {
  edges: [CommitEdge]
  nodes: [Commit]
  pageInfo: PageInfo!
  totalCount: Int!
}

type CommitEdge {
  cursor: String!
  node: Commit
}

Do you see how the edges and nodes are optional, as well as the node inside a commit edge? My guess is that it’s to allow for error handling. The current GraphQL specification has an example of how that is intended to work in practice:

{
  "errors": [
    {
      "message": "Name for character with ID 1002 could not be fetched.",
      "locations": [{ "line": 6, "column": 7 }],
      "path": ["hero", "heroFriends", 1, "name"]
    }
  ],
  "data": {
    "hero": {
      "name": "R2-D2",
      "heroFriends": [
        {
          "id": "1000",
          "name": "Luke Skywalker"
        },
        {
          "id": "1002",
          "name": null
        },
        {
          "id": "1003",
          "name": "Leia Organa"
        }
      ]
    }
  }
}

… however, I cannot be certain that this is the reason GitHub has it optional in their schema. The schema doesn’t distinguish between a value that’s optional because it’s not present/unset, or a value that’s optional because fetching it may error out. The sad truth is that you need to read the documentation to understand why a field can be nullable, and when the documentation doesn’t tell you why, you’re left guessing.

As with the previous example, we’re also left in a state where we can represent illegal states. If you combine the value of the name field with whether an error was reported for it, you get this matrix of valid vs. invalid states:

Value	Error	Valid state
`null`	No	…maybe?
`"C-3PO"`	No	✅
`null`	Yes	✅
`"C-3PO"`	Yes	❌

Additionally, I just find it weird that I’ll have to look in a different section of the response to find out if there was an error fetching this particular field. Why can’t it be inlined into the field itself?
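
To make that concrete, here’s a Python sketch of the lookup a client ends up doing. It uses a trimmed-down copy of the response above (without locations), and the errors_at and value_at helpers are made up for this post.

```python
from typing import Any, Dict, List, Sequence, Union

PathSegment = Union[str, int]

# Trimmed-down version of the response shown above.
response: Dict[str, Any] = {
    "errors": [
        {
            "message": "Name for character with ID 1002 could not be fetched.",
            "path": ["hero", "heroFriends", 1, "name"],
        }
    ],
    "data": {
        "hero": {
            "name": "R2-D2",
            "heroFriends": [
                {"id": "1000", "name": "Luke Skywalker"},
                {"id": "1002", "name": None},
                {"id": "1003", "name": "Leia Organa"},
            ],
        }
    },
}


def errors_at(resp: Dict[str, Any], path: Sequence[PathSegment]) -> List[str]:
    """Collect the messages of all errors reported for exactly this path."""
    return [
        error["message"]
        for error in resp.get("errors", [])
        if error.get("path") == list(path)
    ]


def value_at(resp: Dict[str, Any], path: Sequence[PathSegment]) -> Any:
    """Walk `data` along the path and return whatever is found there."""
    current: Any = resp["data"]
    for segment in path:
        current = current[segment]
    return current


friend_name_path = ["hero", "heroFriends", 1, "name"]
# A null here is only explained by checking the separate errors list.
name = value_at(response, friend_name_path)
messages = errors_at(response, friend_name_path)
```

The value and the reason it’s missing live in two different places, and it’s up to the client to join them on the path.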

If sum types were available, you could do this:

sumtype Fallible<T> {
  Ok T
  Error ErrorType
}

type Hero {
  name: Fallible<String!>
  heroFriends: [Hero!]!
}

That way, you not only get documentation that this field may fail, but you can also distinguish between the required field (Fallible<String!>) and the optional one (Fallible<String>), and the error is inlined into the response.
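
On the consuming side, a Fallible value could be handled along these lines. This is a Python sketch with made-up Ok and Error classes, not something any GraphQL client library provides as far as I know.

```python
from dataclasses import dataclass
from typing import Generic, List, TypeVar, Union

T = TypeVar("T")


@dataclass(frozen=True)
class Ok(Generic[T]):
    value: T


@dataclass(frozen=True)
class Error:
    message: str


# Fallible[T] is either Ok[T] or Error, never both and never neither.
Fallible = Union[Ok[T], Error]


def describe(name: Fallible[str]) -> str:
    """Handle both variants of a fallible name field."""
    if isinstance(name, Ok):
        return name.value
    return f"<error: {name.message}>"


friends: List[Fallible[str]] = [
    Ok("Luke Skywalker"),
    Error("Name for character with ID 1002 could not be fetched."),
    Ok("Leia Organa"),
]
rendered = [describe(friend) for friend in friends]
```

The invalid row from the matrix (a value and an error at the same time) simply cannot be constructed here.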

This is kinda possible with the unions that exist today, but since unions only work on different object types, you may be forced to write wrapper unions to achieve what you want.

In Search of a Petrol Station

It’s not easy to find documentation for why something wasn’t added to, or considered for, a language or specification, and the same applies here. There’s some backstory in the GraphQL Documentary, but for the most part it talks about the timeline from GraphQL’s inception to its first adoption outside of Facebook. The details aren’t really discussed, so I @’d Lee Byron on Twitter, and he replied with

Yeah it’s just a tradeoff of expressibility and complexity. Lots of tools and languages need to integrate with GraphQL schema, so it’s useful to keep it simple.

I know some servers offer template types behind the scenes, so similar expressivity but each instance named.

And that makes sense. Parametric polymorphism and sum types are hard to implement, and hard to use idiomatically in some languages with “inferior” type systems. Adoption will suffer if it’s hard to make an implementation: Having hobbyists able to make their own implementation is a feature, not an incidental bonus.

Additionally, I think it’d be super hard to include this in a backwards-compatible manner. GraphQL has optionality as the default and non-null as an opt-in, and this doesn’t compose well with generics. For example,

type MyType<T> {
  value: T!
}

wouldn’t make sense in the type MyType<Int!>, as you can’t have an Int!!. If you were to flip it around and say that non-null is the default and that optionality is denoted by a sum type, the semantics seem a bit more obvious. For example,

sumtype Opt<T> {
  Some T
  None
}

type MyType<T> {
  value: Opt<T>
}

would work fine, even if you write MyType<Opt<Int>>.
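
Python’s type hints happen to show the same effect from the other direction, so here’s a small sketch as an analogy (it’s not GraphQL, and the Some and Nothing classes are made up). Nullability as a modifier collapses when you stack it, whereas an explicit sum type keeps the layers apart.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar, Union

T = TypeVar("T")

# Nullability as a modifier does not nest: the two layers collapse into one.
collapsed = Optional[Optional[int]] == Optional[int]


@dataclass(frozen=True)
class Some(Generic[T]):
    value: T


@dataclass(frozen=True)
class Nothing:
    pass


# Opt[T] as an explicit sum type; both variant names are illustrative.
Opt = Union[Some[T], Nothing]

# With a sum type, the layers stay distinct values.
outer_missing: Opt[Opt[int]] = Nothing()
inner_missing: Opt[Opt[int]] = Some(Nothing())
present: Opt[Opt[int]] = Some(Some(3))
```

Here outer_missing and inner_missing are different values, which is what lets MyType<Opt<Int>> mean something sensible.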

As I wrote earlier, I find GraphQL’s type system very limiting. But I come from languages like Haskell, OCaml and Elm, where the features I’ve mentioned are the norm, not the exception. While I am annoyed with them not being present, I can see that implementing some hacky workaround for them in Go, Java and C#, the languages that I guess are the most prevalent server-side languages these days, would be even more annoying for consumers. And as I wrote, I’m pretty sure it’d be detrimental to the adoption of GraphQL to have it more sophisticated than it is.

It’s tempting to say “Customers ask for a faster horse, not a car”, but it’s pretty moot to have one when there aren’t any auto repair shops or petrol stations in your day-to-day environment.

¹ Whether that is sensible or not is a different blog post.
² I think the connection types are needlessly complex for almost all situations. Total count, for example, is expensive to compute up front, and I don’t feel I’ve ever been in a situation where using a cursor in the middle of the connection made sense. If you read this, at least evaluate this connection type instead:
```
type CommitPage {
  elements: [Commit!]!
  # If nextPageCursor is unset, it means the
  # end of the list has been reached.
  nextPageCursor: String # or ID
}
```
Here I’ve also forced all elements to be present, because it’s easier to say “could not load page” instead of saying “could not load this particular element”, as I don’t really see this happening that often… but it could be relevant for you, so pick the optionality you want.