There have not been many attempts to set a standard for REST APIs. It is an enormous challenge, given that the definition and meaning of REST are still disputed. One attempt is relatively popular, though: JSON:API (unfortunately, the name is quite un-googleable). Put simply, the JSON:API specification sets rules for how a REST API should behave. Over the last couple of months, I had the chance to work on a project built upon these principles, so in this post I want to talk about my experience of working with it. Before I share my impressions, I'd like to give a brief overview of the specification.
Like GraphQL, JSON:API tries to answer the question of how to architect standardized APIs that are robust and easy to deal with. First and foremost, the specification is compatible with REST principles. On top of that, it provides abstractions that REST APIs in the wild are missing.
JSON:API defines standards for common patterns like pagination, error handling, filtering, sorting and the format of the data model. It also aims to reduce the number of requests clients need to make to get their data.
Let's start by imagining a web application that deals with a couple of data models, like below:
type Address = {
  street: string;
  city: string;
  postcode: string;
  country: string;
};
type Customer = {
  name: string;
  email: string;
  addresses: Address[];
};
And let's express this in our relational DB schema:
CREATE TABLE customers (id serial PRIMARY KEY, name text, email text);
INSERT INTO customers VALUES(1, 'John Doe', 'johndoe@alican.codes');
CREATE TABLE addresses (id serial PRIMARY KEY, customer_id integer REFERENCES customers(id), street text, postcode text, city text, country text);
INSERT INTO addresses VALUES(1, 1, 'Unter den Linden', '10000', 'Berlin', 'Germany');
Customer and Address entities have a one-to-many relationship. This means that a Customer can have one or more addresses that belong to them.
With JSON:API, we can make a single API call to fetch all the data associated with a Customer entity, like this:
GET /api/customers/:id?include=addresses
Content-Type: application/vnd.api+json
{
  "data": {
    "type": "customer",
    "id": "1",
    "attributes": {
      "name": "John Doe",
      "email": "johndoe@alican.codes"
    },
    "relationships": {
      "addresses": {
        "data": [{"id": "1", "type": "address"}]
      }
    }
  },
  "included": [
    {
      "type": "address",
      "id": "1",
      "attributes": {
        "street": "Unter den Linden",
        "postcode": "10000",
        "city": "Berlin",
        "country": "Germany"
      }
    }
  ]
}
If we take a close look, we can see that at the root level of the JSON payload we get a data property for the resource we requested. The data property consists of type, id, attributes (fields that belong directly to that resource) and relationships (entities that belong to or are associated with the requested resource).
Meanwhile, the included field in the payload contains the full resources that are referenced in the relationships fields of the resources in the payload and that were requested through the include parameter. When an included entity has an associated resource of its own, that resource is also contained in the included array.
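To make the structure concrete, here is a minimal sketch of how a client might type such a payload and look up an included resource. The type names and the findIncluded helper are my own illustration, not something the specification defines, and the types cover only the members used in this post.

```typescript
// Minimal shapes for the payload shown above (not the full specification).
type ResourceIdentifier = { type: string; id: string };

type Resource = ResourceIdentifier & {
  attributes?: Record<string, unknown>;
  relationships?: Record<string, { data: ResourceIdentifier[] }>;
};

type JsonApiDocument = {
  data: Resource;
  included?: Resource[];
};

// Hypothetical helper: resolve a { type, id } reference against `included`.
function findIncluded(doc: JsonApiDocument, ref: ResourceIdentifier): Resource | undefined {
  for (const candidate of doc.included ?? []) {
    if (candidate.type === ref.type && candidate.id === ref.id) {
      return candidate;
    }
  }
  return undefined;
}

const customerDoc: JsonApiDocument = {
  data: {
    type: "customer",
    id: "1",
    attributes: { name: "John Doe", email: "johndoe@alican.codes" },
    relationships: { addresses: { data: [{ id: "1", type: "address" }] } },
  },
  included: [
    {
      type: "address",
      id: "1",
      attributes: { street: "Unter den Linden", postcode: "10000", city: "Berlin", country: "Germany" },
    },
  ],
};

// The relationship holds only a reference; the attributes live in `included`.
const firstAddress = findIncluded(customerDoc, { type: "address", id: "1" });
```

The relationship entry itself never carries attributes, so every lookup goes through the type and id pair.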
With all that, we can see that JSON:API sends the data in a flat hierarchy, much like it is stored in a relational database. The core principle of JSON:API is to treat the data the way it is stored in a traditional relational database, and by doing so to give clients the ability to rebuild a derivative database on the client side.
One of the biggest challenges for single-page applications, or for any client that relies heavily on client-side caching, is ensuring that a given data point is consistent throughout the application. JSON:API pushes clients to store resources in a flat hierarchy so that the data is normalized in the local cache.
Let's imagine an API that looks like this for our customer/address example:
GET /api/customers
Content-Type: application/json
{
  "data": [{
    "id": "1",
    "name": "John Doe",
    "email": "johndoe@alican.codes",
    "addresses": [{
      "id": "1",
      "street": "Unter den Linden",
      "postcode": "10000",
      "city": "Berlin",
      "country": "Germany"
    }]
  }]
}
The client can store the whole response in a key-value store. Now suppose we want to let users update their addresses. When a user updates an address, the client cache entry that stores the entire customers response has to be invalidated to prevent inconsistencies.
On the other hand, JSON:API's flat resource payload encourages us to cache entities separately, in a normalized way. When the customer resource is fetched, two entries are created: one for customer and one for address. When an update is triggered from the UI, only the relevant resource needs to change. Hence, there is no need to invalidate the customer entry.
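A rough sketch of such a normalized cache could look like the following. The "type:id" key format and the function names are assumptions I picked for illustration; real client libraries organize this differently.

```typescript
type CachedResource = { type: string; id: string; attributes: Record<string, unknown> };

// One cache entry per resource, keyed by "type:id" (an assumed key format).
const resourceCache: Record<string, CachedResource> = {};

function cacheKey(type: string, id: string): string {
  return `${type}:${id}`;
}

// Store the primary resource and everything from `included` as separate entries.
function storeResources(resources: CachedResource[]): void {
  for (const resource of resources) {
    resourceCache[cacheKey(resource.type, resource.id)] = resource;
  }
}

// Update a single entry; entries of other resources are left untouched.
function updateAttributes(type: string, id: string, changes: Record<string, unknown>): void {
  const key = cacheKey(type, id);
  const existing = resourceCache[key];
  if (!existing) return;
  resourceCache[key] = { ...existing, attributes: { ...existing.attributes, ...changes } };
}

const customerEntry: CachedResource = {
  type: "customer",
  id: "1",
  attributes: { name: "John Doe", email: "johndoe@alican.codes" },
};
const addressEntry: CachedResource = {
  type: "address",
  id: "1",
  attributes: { street: "Unter den Linden", postcode: "10000", city: "Berlin", country: "Germany" },
};

storeResources([customerEntry, addressEntry]);
// The user edits their address: only the "address:1" entry is replaced.
updateAttributes("address", "1", { city: "Hamburg" });
```

The customer entry only ever refers to the address by id, which is why it can stay in the cache as it is.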
The rules regarding filtering are pretty relaxed in JSON:API. Filtering must be implemented on a case-by-case basis, and it should be applied using the reserved query parameter filter.
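Since the specification only reserves the parameter name, the exact shape of a filter is up to each server. As a sketch, assuming a server that accepts the commonly seen filter[field]=value form (an assumption, not a rule of the specification), a client could build the query like this. The buildFilterQuery helper is hypothetical.

```typescript
// Hypothetical helper: turn { field: value } pairs into filter[field]=value pairs.
// Brackets are left unencoded for readability, matching the URLs in this post.
function buildFilterQuery(filters: Record<string, string>): string {
  const parts: string[] = [];
  for (const field of Object.keys(filters)) {
    parts.push(`filter[${field}]=${encodeURIComponent(filters[field])}`);
  }
  return parts.join("&");
}

const filterQuery = buildFilterQuery({ country: "Germany", city: "Berlin" });
// filterQuery is "filter[country]=Germany&filter[city]=Berlin"
```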
To access a limited number of resources segmented into pages, the response payload provides links to certain points in the data collection (first, prev, next and last). The number of pages can also be provided in the top-level meta property.
An example response for a request to get the first page of customers, with one customer per page, may look like this:
GET /api/customers?page[number]=1&page[size]=1&include=addresses
Content-Type: application/vnd.api+json
{
  "meta": {
    "totalPages": 5
  },
  "data": [{
    "type": "customer",
    "id": "1",
    "attributes": {
      "name": "John Doe",
      "email": "johndoe@alican.codes"
    },
    "relationships": {
      "addresses": {
        "data": [{"id": "1", "type": "address"}]
      }
    }
  }],
  "links": {
    "self": "https://alican.codes/api/customers?page[number]=1&page[size]=1&include=addresses",
    "first": "https://alican.codes/api/customers?page[number]=1&page[size]=1&include=addresses",
    "prev": null,
    "next": "https://alican.codes/api/customers?page[number]=2&page[size]=1&include=addresses",
    "last": "https://alican.codes/api/customers?page[number]=5&page[size]=1&include=addresses"
  },
  "included": [
    {
      "type": "address",
      "id": "1",
      "attributes": {
        "street": "Unter den Linden",
        "postcode": "10000",
        "city": "Berlin",
        "country": "Germany"
      }
    }
  ]
}
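On the server side, the links object in a response like this can be derived from the current page, the page size and the total number of pages. The following is only a sketch: the buildPaginationLinks function and its parameters are hypothetical, and it assumes the page[number]/page[size] strategy used in the example.

```typescript
type PaginationLinks = {
  self: string;
  first: string;
  prev: string | null;
  next: string | null;
  last: string;
};

// Hypothetical helper: compute pagination links for page-number based paging.
function buildPaginationLinks(
  baseUrl: string,
  page: number,
  size: number,
  totalPages: number,
  extraQuery: string = ""
): PaginationLinks {
  const suffix = extraQuery ? `&${extraQuery}` : "";
  const urlFor = (n: number): string => `${baseUrl}?page[number]=${n}&page[size]=${size}${suffix}`;
  return {
    self: urlFor(page),
    first: urlFor(1),
    // No previous page on the first page, no next page on the last one.
    prev: page > 1 ? urlFor(page - 1) : null,
    next: page < totalPages ? urlFor(page + 1) : null,
    last: urlFor(totalPages),
  };
}

const pageLinks = buildPaginationLinks("https://alican.codes/api/customers", 1, 1, 5, "include=addresses");
```

A client can then keep following next until it is null, without knowing how the server pages its data.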
An API request can fail for various reasons. In those cases, the response contains information about the context of the error or errors, so that clients know what went wrong. In JSON:API, the payload must have a root-level errors array that gives us this insight. A single error object should ideally provide the following:
status: HTTP status code applicable to the given error (404, 401 etc.)
code: An application-specific code for the error
title: An intelligible summary of the error
detail: More information about the error in a readable form
An example error payload might look like this:
GET /api/customers/67
Content-Type: application/vnd.api+json
{
  "errors": [
    {
      "status": "404",
      "code": "E-1001",
      "title": "Resource does not exist",
      "detail": "The customer with id 67 does not exist."
    }
  ]
}
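On the client, a predictable error shape means a single piece of code can handle every failed request. Here is a minimal sketch; the type and function names are mine, and all four members are treated as optional because a server may leave any of them out.

```typescript
type JsonApiError = { status?: string; code?: string; title?: string; detail?: string };
type ErrorDocument = { errors: JsonApiError[] };

// Type guard: does a parsed response body carry a root-level errors array?
function isErrorDocument(body: unknown): body is ErrorDocument {
  return typeof body === "object" && body !== null && Array.isArray((body as { errors?: unknown }).errors);
}

// Hypothetical formatter: collapse one error object into a single log line.
function describeError(error: JsonApiError): string {
  const head = [error.status, error.code].filter(Boolean).join(" ");
  const body = [error.title, error.detail].filter(Boolean).join(": ");
  return head ? `${head} - ${body}` : body;
}

const notFoundBody: unknown = {
  errors: [
    {
      status: "404",
      code: "E-1001",
      title: "Resource does not exist",
      detail: "The customer with id 67 does not exist.",
    },
  ],
};

const errorMessages: string[] = isErrorDocument(notFoundBody) ? notFoundBody.errors.map(describeError) : [];
// errorMessages[0] is "404 E-1001 - Resource does not exist: The customer with id 67 does not exist."
```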
One of the biggest selling points of GraphQL is the ability to fetch exactly the data a client needs. JSON:API gets part of the way there with the include query parameter in the URL, which adds or removes associated resources for the requested entity. That alone doesn't give it functional parity with GraphQL. However, with sparse fieldsets, it is also possible to get only the requested fields of a resource.
Imagine that we want to get the name of a Customer along with the country field of the address resource. We can achieve that with the following query:
GET /api/customers/:id?include=addresses&fields[customer]=name&fields[address]=country
Content-Type: application/vnd.api+json
{
  "data": {
    "type": "customer",
    "id": "1",
    "attributes": {
      "name": "John Doe"
    },
    "relationships": {
      "addresses": {
        "data": [{"id": "1", "type": "address"}]
      }
    }
  },
  "included": [
    {
      "type": "address",
      "id": "1",
      "attributes": {
        "country": "Germany"
      }
    }
  ]
}
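Writing such query strings by hand gets tedious quickly, so a client would probably wrap them in a small helper. This is only a sketch, and buildResourceQuery is a hypothetical name of my own.

```typescript
// Hypothetical helper: build the include and fields[...] parts of a JSON:API query.
// Brackets and commas are left unencoded for readability, matching the URLs in this post.
function buildResourceQuery(include: string[], fields: Record<string, string[]>): string {
  const parts: string[] = [];
  if (include.length > 0) {
    parts.push(`include=${include.join(",")}`);
  }
  for (const resourceType of Object.keys(fields)) {
    parts.push(`fields[${resourceType}]=${fields[resourceType].join(",")}`);
  }
  return parts.join("&");
}

const sparseQuery = buildResourceQuery(["addresses"], { customer: ["name"], address: ["country"] });
// sparseQuery is "include=addresses&fields[customer]=name&fields[address]=country"
```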
Now that we have established a basic idea of what JSON:API is, we can start looking at the shortcomings of some of its design choices.
The examples I've used in this post to demonstrate JSON:API payloads are quite readable and understandable. However, in the real world, where you have to deal with lists of more than 10 items, things get tricky. Imagine a resource that has several one-to-many relationships with other resources, each of which has relationships of its own. In those cases, it is quite easy to end up with an included field holding more than 100 items of different types. That list grows rapidly with every level of nesting. And when you try to debug the relationship between some of those resources, there's a lot of going back and forth between ids to find out what went wrong.
One of the fundamentals of JSON:API is to find a way to represent flat data structures in a maintainable way using the JSON format. This way, the data that is fetched is normalized. And even though this is not part of the specification, the resources tend to be an exact representation of the data layer (inertia of similarity). In many cases, that data layer is a traditional relational database. Any change in the data layer has a high likelihood of impacting clients by changing the payload of the API. It might be an unlikely scenario, but when you move from a relational database to a schemaless one, the whole landscape of resources changes, and with it the way you handle the data in clients.
In short, JSON:API makes it easy for developers to reuse server-side data structures on the client side. This exposes implementation details and therefore leads to leaky abstractions.
Because of the flat structure, clients most of the time find themselves having to denormalize data. In other words, keeping normalized data on the client forces it to do the heavy lifting of turning flat structures into hierarchical objects. At the end of the day, you find yourself reinventing SQL's JOINs and WHEREs in the client.
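To give an idea of what that heavy lifting looks like, here is a rough sketch of a one-level denormalization step. The denormalize function and its types are hypothetical, and it only handles to-many relationships one level deep; a real client would also need to deal with nested and circular relationships.

```typescript
type Ref = { type: string; id: string };

type FlatResource = Ref & {
  attributes: Record<string, unknown>;
  relationships?: Record<string, { data: Ref[] }>;
};

// Rebuild a nested object from a flat resource and the `included` list.
function denormalize(resource: FlatResource, included: FlatResource[]): Record<string, unknown> {
  // Index the included resources by "type:id" so references can be resolved.
  const byKey: Record<string, FlatResource> = {};
  for (const item of included) {
    byKey[`${item.type}:${item.id}`] = item;
  }

  const result: Record<string, unknown> = { id: resource.id, ...resource.attributes };
  const relationships = resource.relationships ?? {};
  for (const name of Object.keys(relationships)) {
    const related: Record<string, unknown>[] = [];
    for (const ref of relationships[name].data) {
      const target = byKey[`${ref.type}:${ref.id}`];
      // This lookup is the client-side equivalent of a JOIN.
      if (target) related.push({ id: target.id, ...target.attributes });
    }
    result[name] = related;
  }
  return result;
}

const flatCustomer: FlatResource = {
  type: "customer",
  id: "1",
  attributes: { name: "John Doe", email: "johndoe@alican.codes" },
  relationships: { addresses: { data: [{ id: "1", type: "address" }] } },
};
const flatAddress: FlatResource = {
  type: "address",
  id: "1",
  attributes: { street: "Unter den Linden", postcode: "10000", city: "Berlin", country: "Germany" },
};

// nestedCustomer has the same shape as the nested, non-JSON:API response shown earlier.
const nestedCustomer = denormalize(flatCustomer, [flatAddress]);
```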
Let's think about a client: an app that displays wiki pages for movies and actors. Even though there is a many-to-many relationship between those two data models, every page has a top-level resource: either a movie or an actor. In a way, this helps us avoid circular references and removes the need for flat data. Simply put, UI requirements dictate a hierarchical structure.
<Movie movie={movie}>
<ActorList actors={movie.actors} />
</Movie>
Dealing with normalized data definitely has its advantages. It makes it easy to avoid stale data on the client side. By constructing relational-database-like table structures (or leveraging IndexedDB on the web platform), it is possible to update a resource without worrying about invalidating other resources that have a relation with the updated one. Since no resource encompasses another one in the local cache, it is easy to make sure that there is no inconsistency in the UI. But this does not hold in all cases. When you create or delete a resource, you have to invalidate all other resources that might have a relation with the created or deleted resource.
To clarify, let me give an example. In an app, we might have a bunch of todo items, and the result is cached. Since the result is a list of relationships, when we create a new todo item, we have two choices: invalidate the cached result, or optimistically update the local cache manually on the client side. In my opinion, both totally defeat the purpose of dealing with normalized data on clients. If I need to think about cache invalidation proactively most of the time, there is no way to overlook the other intricacies imposed by JSON:API.
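The two choices can be sketched as follows. Everything here is hypothetical and stripped down to the bare minimum: one map of normalized todo entries plus one cached list of ids.

```typescript
type Todo = { id: string; title: string };

// Normalized entries, plus the cached list result (null means "must be refetched").
const todosById: Record<string, Todo> = {};
let todoListIds: string[] | null = null;

function cacheTodoList(todos: Todo[]): void {
  const ids: string[] = [];
  for (const todo of todos) {
    todosById[todo.id] = todo;
    ids.push(todo.id);
  }
  todoListIds = ids;
}

function getTodoListIds(): string[] | null {
  return todoListIds;
}

// Choice 1: drop the cached list so that the next read has to refetch it.
function invalidateTodoList(): void {
  todoListIds = null;
}

// Choice 2: optimistically patch both the entry and the cached list by hand.
function addTodoOptimistically(todo: Todo): void {
  todosById[todo.id] = todo;
  if (todoListIds !== null) {
    todoListIds = [...todoListIds, todo.id];
  }
}

cacheTodoList([{ id: "1", title: "Write post" }]);
addTodoOptimistically({ id: "2", title: "Publish post" });
```

Either way, the list has to be handled explicitly; storing the new todo entry alone does not make it show up in the cached result.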
In the end, we can argue that this outcome is a result of trying to create another source of truth on the client side: namely, a replicated DB. Dealing with caches is not easy in any case, and using JSON:API does not set us free from that either. Considering that many modern applications are collaborative and have to deal with frequently evolving data, polling and invalidating the cache are more effective strategies to overcome the issue.
Being a GraphQL alternative is certainly not the motivation behind the JSON:API specification, but you can argue that it is one. Especially when you consider sparse fieldsets, it is quite similar to what GraphQL proposes. But with its query language, GraphQL forces you to think first about the data you want to get. With sparse fieldsets, that is mostly an afterthought: the approach is mostly about fetching data all at once and hoping that it'll be useful later. Considering performance and newer strategies like prefetching on the client side, it feels like an outdated solution.
Another area where the specification is lacking is discoverability and documentation. JSON:API does not offer a way to expose the resource types or the error codes that might be returned by a certain API. So there is no way to ensure that the contract between server and client is valid.
The intention behind the specification is quite solid. Countless hours have been spent discussing minor details of how an API should behave, and more will be spent. Although the idea of taming and standardizing REST APIs is noble, there are no silver bullets here. The JSON:API specification provides a solution that is no better than PostgREST: a thin wrapper around the data layer, with all the effort moved elsewhere.
In small applications (in particular, apps that rely heavily on CRUD operations), you are fine without it, and you can do wonders by designing each API around your own needs: pagination, filtering, error handling, you name it. In big applications, where lots of people need to collaborate and align, discoverability and robustness have higher priority. For that purpose, GraphQL is likely a better choice.