This document collects some emerging patterns for JSON serializations. If you are developing your own data specification, you may benefit from reading the different solutions to design problems discussed below.
It is very common for a JSON document to embed data about a related object. For example, it makes sense to put information about a bill’s sponsors on the bill object. In this example, you may start off with a relatively flat structure like:
{
"name": "HR 3501",
"title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
"sponsorships": [
{
"entity_type": "person",
"entity_id": "400260",
"name": "Thaddeus McCotter",
"chamber": "lower",
"sponsorship_type": "sponsor",
"primary": true
}
]
}
Two problems:
1. Each object in the sponsorships array contains fields about both the sponsorship (sponsorship_type and primary) and the sponsor (name and chamber), which is not object-oriented.
2. The entity_type and entity_id fields make the Sponsorship class polymorphic, i.e. a Sponsorship object represents either a person or a committee depending on the values of those two fields. If polymorphism can be avoided, the data model will be simpler.
To eliminate the polymorphism, we can replace the Sponsorship objects with Person or Committee objects, like:
{
"name": "HR 3501",
"title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
"sponsorships": [
{
"type": "person",
"id": "400260",
"name": "Thaddeus McCotter",
"chamber": "lower",
"sponsorship_type": "sponsor",
"primary": true
}
]
}
Each object in the sponsorships array is now either a Person or a Committee object. If the type of the sponsor is unknown, we can omit the type field. However, we are still left with the first problem of mixing data about the sponsorship with data about the sponsor.
To avoid injecting the sponsorship_type and primary properties onto the Person object, consider:
{
"name": "HR 3501",
"title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
"sponsorships": [
{
"person": {
"id": "400260",
"name": "Thaddeus McCotter",
"chamber": "lower"
},
"sponsorship_type": "sponsor",
"primary": true
}
]
}
The Sponsorship object uses a person field, whose value is an abbreviated Person object (that is, the Person object omits fields that are irrelevant to the sponsorship, e.g. the person’s date of birth). If the sponsor were a committee, the Sponsorship object would use a committee field. If the type of the sponsor is unknown, we can instead use a field corresponding to a superclass, like thing. [1]
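As a rough illustration of the nested layout, the following Python sketch converts a flat sponsorship record, shaped like the first example, into the nested form. The function name nest_sponsorship and the SPONSOR_FIELDS set are assumptions made for this example; they are not part of any specification.

```python
# Fields that describe the sponsor rather than the sponsorship.
# This set is an assumption for the sketch, based on the example above.
SPONSOR_FIELDS = {"name", "chamber"}


def nest_sponsorship(flat):
    """Move the sponsor fields of a flat sponsorship record into a nested object.

    The nested object is keyed by the sponsor's type ("person" or "committee"),
    or by "thing" when the type is unknown.
    """
    sponsor = {}
    if "entity_id" in flat:
        sponsor["id"] = flat["entity_id"]

    sponsorship = {}
    for key, value in flat.items():
        if key in ("entity_type", "entity_id"):
            continue  # already handled above
        if key in SPONSOR_FIELDS:
            sponsor[key] = value
        else:
            sponsorship[key] = value

    sponsor_key = flat.get("entity_type") or "thing"
    return {sponsor_key: sponsor, **sponsorship}
```

Calling nest_sponsorship on the flat record from the first example yields an object with a person field holding id, name and chamber, next to sponsorship_type and primary.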
A less object-oriented approach to isolating the data about the sponsorship from the data about the sponsor is to prefix all sponsor fields, e.g.:
{
"name": "HR 3501",
"title": "Humanity and Pets Partnered Through the Years (HAPPY) Act",
"sponsorships": [
{
"person_id": "400260",
"person_name": "Thaddeus McCotter",
"person_chamber": "lower",
"sponsorship_type": "sponsor",
"primary": true
}
]
}
However, this approach should be avoided, as it is more difficult to identify and collect the fields for the related object than the previous approach.
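To make the extra work concrete, here is a minimal Python sketch of what a consumer has to do to recover the related object from prefixed fields. The helper name collect_prefixed is hypothetical.

```python
def collect_prefixed(record, prefix):
    """Split a record into (related_object, remaining_fields) by key prefix."""
    related, remaining = {}, {}
    for key, value in record.items():
        if key.startswith(prefix):
            # Strip the prefix to recover the related object's own field name.
            related[key[len(prefix):]] = value
        else:
            remaining[key] = value
    return related, remaining
```

The consumer must know the prefix in advance and string-match every key, whereas with the nested layout the related object is a single lookup such as record["person"].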
1. The Web Ontology Language (OWL) defines an owl:Thing class, which is the superclass of all classes. Schema.org also defines a Thing class.
For example, consider the following response from an API call:
{
"objects": [
{
"id": "foo"
},
{
"id": "bar"
}
]
}
The array datatype already captures the fact that the response is a list of objects, so the surrounding object is redundant:
[
{
"id": "foo"
},
{
"id": "bar"
}
]
Django REST framework, Tastypie and other API frameworks return responses like:
{
"count": 20,
"next": "/path?limit=20&offset=20",
"previous": null,
"results": [
{
"id": "foo"
},
...
]
}
Or with all metadata fields inside a single meta object:
{
"meta": {
"limit": 20,
"next": "/path?limit=20&offset=20",
"offset": 0,
"previous": null,
"total_count": 50
},
"objects": [
{
"id": "foo"
},
...
]
}
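A client that has to cope with more than one of these response shapes might normalize them before use. The sketch below is one way to do that in Python; the function name extract_items is hypothetical, and the key names it checks are only the ones that appear in the examples above.

```python
def extract_items(payload):
    """Return the list of items from a bare array or a known envelope."""
    if isinstance(payload, list):
        return payload  # bare array, no envelope
    if isinstance(payload, dict):
        # Envelope keys taken from the example responses above.
        for key in ("results", "objects"):
            if isinstance(payload.get(key), list):
                return payload[key]
    raise ValueError("unrecognized response shape")
```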
To eliminate the metadata from the response body, APIs like GitHub’s put this information in the Link HTTP header, following RFC 5988 (since obsoleted by RFC 8288):
HTTP/1.1 200 OK
Content-Type: application/json; charset=utf-8
Link: <https://example.com/path?offset=20>; rel="next"
[
{
"id": "foo"
},
...
]
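On the client side, the pagination URLs then have to be read from the header rather than from the body. The following Python sketch shows one simplified way to do so. It is not a complete Web Linking parser: it assumes each link has a quoted rel parameter directly after the target, as in the example above, and the function name parse_link_header is hypothetical.

```python
import re

# Matches one link-value such as: <https://example.com/path?offset=20>; rel="next"
LINK_PATTERN = re.compile(r'<([^>]*)>\s*;\s*rel="([^"]*)"')


def parse_link_header(value):
    """Return a dict mapping each rel value to its target URL."""
    links = {}
    for url, rels in LINK_PATTERN.findall(value):
        # A rel parameter may list several space-separated relation types.
        for rel in rels.split():
            links[rel] = url
    return links
```

A client would follow the URL stored under next until that key no longer appears in the parsed header.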
It is impractical to define every possible property for a given class in order to satisfy all possible use cases. Some properties used for specific use cases will therefore not be defined in the data specification. For example, the following JSON document adds a hair_colour property:
{
"name": "Mr. John Q. Public, Esq.",
"hair_colour": "brown"
}
Data consumers who are less familiar with the specification itself might assume that the hair_colour property is part of the specification and expect it to be defined on other documents with the same semantics over the long term.
To avoid misinterpretation, some implementations flag additional properties that may be subject to change using one of the two strategies below:
Prefix the additional property with a + character, e.g.:
{
"name": "Mr. John Q. Public, Esq.",
"+hair_colour": "brown"
}
Collect all additional properties into a subdocument, e.g.:
{
"name": "Mr. John Q. Public, Esq.",
"extra": {
"hair_colour": "brown"
}
}
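Both strategies can be applied mechanically once the set of properties defined by the specification is known. The Python sketch below illustrates this under assumptions: the function name flag_additional, its strategy parameter and the defined_properties set are all hypothetical and would depend on the specification in question.

```python
def flag_additional(document, defined_properties, strategy="prefix"):
    """Return a copy of document with properties outside the specification flagged.

    strategy="prefix" renames each additional property with a leading "+".
    strategy="subdocument" moves additional properties under an "extra" key.
    """
    if strategy not in ("prefix", "subdocument"):
        raise ValueError("strategy must be 'prefix' or 'subdocument'")

    flagged, extra = {}, {}
    for key, value in document.items():
        if key in defined_properties:
            flagged[key] = value
        elif strategy == "prefix":
            flagged["+" + key] = value
        else:
            extra[key] = value

    if extra:
        flagged["extra"] = extra
    return flagged
```

With defined_properties set to {"name"}, the hair_colour document above is rewritten to either of the two flagged forms shown.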