JSON Serialization Patterns

This document collects some emerging patterns for JSON serializations. If you are developing your own data specification, you may benefit from reading the different solutions to design problems discussed below.

- Include data about a related object
- Remove unnecessary complexity
  - Pagination and metadata
- Flag non-specification properties

Include data about a related object

It is very common for a JSON document to embed data about a related object. For example, it makes sense to put information about a bill’s sponsors on the bill object. In this example, you may start off with a relatively flat structure like:

{
  "name": "HR 3501",
  "title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
  "sponsorships": [
    {
      "entity_type": "person",
      "entity_id": "400260",
      "name": "Thaddeus McCotter",
      "chamber": "lower",
      "sponsorship_type": "sponsor",
      "primary": true
    }
  ]
}

Two problems:

Each Sponsorship object in the sponsorships array contains fields about both the sponsorship (sponsorship_type and primary) and the sponsor (name and chamber). A single object therefore describes two distinct things, which is not object-oriented.
The entity_type and entity_id fields make the Sponsorship class polymorphic, i.e. a Sponsorship object represents either a person or a committee depending on the values of those two fields. If polymorphism can be avoided, the data model will be simpler.

To eliminate the polymorphism, we can replace the Sponsorship objects with Person or Committee objects, renaming entity_type and entity_id to type and id:

{
  "name": "HR 3501",
  "title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
  "sponsorships": [
    {
      "type": "person",
      "id": "400260",
      "name": "Thaddeus McCotter",
      "chamber": "lower",
      "sponsorship_type": "sponsor",
      "primary": true
    }
  ]
}

Each object in the sponsorships array is now either a Person or a Committee object. If the type of the sponsor is unknown, we can omit the type field. However, we are still left with the first problem of mixing data about the sponsorship with data about the sponsor.

To avoid injecting the sponsorship_type and primary properties onto the Person object, consider:

{
  "name": "HR 3501",
  "title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
  "sponsorships": [
    {
      "person": {
        "id": "400260",
        "name": "Thaddeus McCotter",
        "chamber": "lower"
      },
      "sponsorship_type": "sponsor",
      "primary": true
    }
  ]
}

The Sponsorship object uses a person field, whose value is an abbreviated Person object (that is, the Person object omits fields that are irrelevant to the sponsorship, e.g. the person’s date of birth). If the sponsor were a committee, the Sponsorship object would use a committee field. If the type of the sponsor is unknown, we can instead use a field corresponding to a superclass, like thing.¹
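As a rough illustration, the conversion from the flat structure to this nested one could be sketched in Python as follows. The function name nest_sponsor and the SPONSORSHIP_FIELDS constant are hypothetical, and the sketch assumes that every field other than sponsorship_type and primary describes the sponsor.

```python
import json

# Fields that describe the sponsorship itself (taken from the examples above).
SPONSORSHIP_FIELDS = ("sponsorship_type", "primary")


def nest_sponsor(flat):
    """Move the sponsor's fields into a nested object (hypothetical helper)."""
    # Use the sponsor's type as the field name; fall back to the generic
    # "thing" field when the type is unknown.
    key = flat.get("entity_type", "thing")
    sponsor = {}
    if "entity_id" in flat:
        sponsor["id"] = flat["entity_id"]
    for name, value in flat.items():
        if name in ("entity_type", "entity_id") or name in SPONSORSHIP_FIELDS:
            continue
        # Assumption: every remaining field describes the sponsor.
        sponsor[name] = value
    nested = {key: sponsor}
    for name in SPONSORSHIP_FIELDS:
        if name in flat:
            nested[name] = flat[name]
    return nested


flat = {
    "entity_type": "person",
    "entity_id": "400260",
    "name": "Thaddeus McCotter",
    "chamber": "lower",
    "sponsorship_type": "sponsor",
    "primary": True,
}
print(json.dumps(nest_sponsor(flat), indent=2))
```

A real converter would need an explicit list of sponsor fields rather than treating everything left over as belonging to the sponsor.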

A less object-oriented approach to isolating the data about the sponsorship from the data about the sponsor is to prefix all sponsor fields, e.g.:

{
  "name": "HR 3501",
  "title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
  "sponsorships": [
    {
      "person_id": "400260",
      "person_name": "Thaddeus McCotter",
      "person_chamber": "lower",
      "sponsorship_type": "sponsor",
      "primary": true
    }
  ]
}

However, this approach should be avoided, as it makes the fields for the related object more difficult to identify and collect than the previous approach does.
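The difference shows up in consumer code. The following Python sketch contrasts the two approaches; both function names are hypothetical.

```python
def sponsor_from_prefixed(sponsorship, prefix="person_"):
    """Rebuild the sponsor by scanning every key and stripping the prefix."""
    return {
        key[len(prefix):]: value
        for key, value in sponsorship.items()
        if key.startswith(prefix)
    }


def sponsor_from_nested(sponsorship, field="person"):
    """The related object is already a self-contained value."""
    return sponsorship[field]


prefixed = {
    "person_id": "400260",
    "person_name": "Thaddeus McCotter",
    "person_chamber": "lower",
    "sponsorship_type": "sponsor",
    "primary": True,
}
nested = {
    "person": {
        "id": "400260",
        "name": "Thaddeus McCotter",
        "chamber": "lower",
    },
    "sponsorship_type": "sponsor",
    "primary": True,
}
print(sponsor_from_prefixed(prefixed) == sponsor_from_nested(nested))
```

The prefix scan is also fragile: a sponsorship field that happened to start with person_ would be collected as a sponsor field by mistake.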

1. The Web Ontology Language (OWL) defines an owl:Thing class, which is the superclass to all classes. Schema.org also defines a Thing class.

Remove unnecessary complexity

A response should not carry structure that adds no information. For example, consider the following response from an API call:

{
  "objects": [
    {
      "id": "foo"
    },
    {
      "id": "bar"
    }
  ]
}

The array datatype already captures the fact that the response is a list of objects. The surrounding object is redundant:

[
  {
    "id": "foo"
  },
  {
    "id": "bar"
  }
]
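From the consumer's side, the difference might look like this minimal Python sketch, which parses both forms of the response:

```python
import json

wrapped = '{"objects": [{"id": "foo"}, {"id": "bar"}]}'
bare = '[{"id": "foo"}, {"id": "bar"}]'

# With the wrapper, the consumer must know the key name "objects".
ids_wrapped = [item["id"] for item in json.loads(wrapped)["objects"]]

# Without it, the parsed document is the list itself.
ids_bare = [item["id"] for item in json.loads(bare)]

print(ids_wrapped == ids_bare)
```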

Pagination and metadata

Django REST framework, Tastypie and other API frameworks return responses like:

{
  "count": 20,
  "next": "/path?limit=20&offset=20",
  "previous": null,
  "results": [
    {
      "id": "foo"
    },
    ...
  ]
}

Or with all metadata fields inside a single meta object:

{
  "meta": {
    "limit": 20,
    "next": "/path?limit=20&offset=20",
    "offset": 0,
    "previous": null,
    "total_count": 50
  },
  "objects": [
    {
      "id": "foo"
    },
    ...
  ]
}

To keep the metadata out of the response body, APIs like GitHub’s put this information in the Link HTTP header, following RFC 5988 (since obsoleted by RFC 8288):

GET /path HTTP/1.1
Host: example.com

HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Link: <https://example.com/path?offset=20>; rel="next"


[
  {
    "id": "foo"
  },
  ...
]
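A client then reads the pagination links from the header rather than from the body. The Python sketch below is deliberately simplified and is not a complete Link header parser: it assumes that rel is the first parameter after each target, and it treats a space-separated list of relation types as a single value. The names LINK_PATTERN and parse_link_header are hypothetical.

```python
import re

# Matches one link-value such as:
#   <https://example.com/path?offset=20>; rel="next"
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="?([^";,]+)"?')


def parse_link_header(value):
    """Return a dict mapping each rel value to its target URL."""
    return {rel: url for url, rel in LINK_PATTERN.findall(value)}


header = '<https://example.com/path?offset=20>; rel="next"'
links = parse_link_header(header)
next_url = links.get("next")  # None when there is no next page
print(next_url)
```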

Flag non-specification properties

It is impractical to define every possible property for a given class in order to satisfy all possible use cases. Some properties used for specific use cases will therefore not be defined in the data specification. For example, the following JSON document adds a hair_colour property:

{
  "name": "Mr. John Q. Public, Esq.",
  "hair_colour": "brown"
}

Data consumers who are less familiar with the specification might assume that the hair_colour property is part of it, and expect the property to appear on other documents, with the same semantics, over the long term.

To avoid misinterpretation, some implementations flag additional properties that may be subject to change using one of the two strategies below:

Prefix the additional property with a + character, e.g.:

 {
   "name": "Mr. John Q. Public, Esq.",
   "+hair_colour": "brown"
 }

Collect all additional properties into a subdocument, e.g.:

 {
   "name": "Mr. John Q. Public, Esq.",
   "extra": {
     "hair_colour": "brown"
   }
 }
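Both strategies can be applied mechanically once the set of specified property names is known. In the Python sketch below, SPEC_PROPERTIES and both function names are hypothetical; a real implementation would derive the property names from the specification's schema.

```python
# Hypothetical set of property names defined by the specification.
SPEC_PROPERTIES = {"name"}


def flag_with_prefix(document, spec=SPEC_PROPERTIES, prefix="+"):
    """Prefix every property that the specification does not define."""
    return {
        (key if key in spec else prefix + key): value
        for key, value in document.items()
    }


def flag_with_subdocument(document, spec=SPEC_PROPERTIES, field="extra"):
    """Move every property that the specification does not define into a subdocument."""
    flagged = {k: v for k, v in document.items() if k in spec}
    extra = {k: v for k, v in document.items() if k not in spec}
    if extra:
        flagged[field] = extra
    return flagged


doc = {"name": "Mr. John Q. Public, Esq.", "hair_colour": "brown"}
print(flag_with_prefix(doc))
print(flag_with_subdocument(doc))
```

The subdocument strategy lets a consumer ignore all additional properties by skipping a single field, whereas the prefix strategy keeps them at the same level as the specified ones.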