U/Data/JSON tools

JavaScript Object Notation (JSON) is a data exchange format designed to be “minimal, portable, textual, and a subset of JavaScript”[1]. JSON was originally defined as part of JavaScript, and the format is now widely used in many applications.

JSON Format

The JSON grammar, data structures, and conformance rules are specified in reference [3].

Below is an example that uses Python's json module to serialize a JSON instance taken from reference [4]:

import json

ex1 = {
    "Image": {
        "Width":  800,
        "Height": 600,
        "Title":  "View from 15th Floor",
        "Thumbnail": {
            "Url":    "http://na/image/001",
            "Height": 125,
            "Width":  100
        },
        "Animated" : False,
        "IDs": [116, 943, 234, 38793]
    }
}
print(json.dumps(ex1, indent=2))
{
  "Image": {
    "Width": 800,
    "Height": 600,
    "Title": "View from 15th Floor",
    "Thumbnail": {
      "Url": "http://na/image/001",
      "Height": 125,
      "Width": 100
    },
    "Animated": false,
    "IDs": [
      116,
      943,
      234,
      38793
    ]
  }
}

JSON has three literal names, all lowercase: false, true, and null.
Note that the output printed above shows false, whereas the Python input uses False; json.dumps likewise converts Python True to true and None to null.
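The literal mapping can be checked with a short round trip through the standard json module. This is a minimal sketch; the keys "Published" and "Caption" are made up for illustration and do not appear in reference [4].

```python
import json

# Python literals (False, True, None) correspond to JSON literals (false, true, null).
# "Published" and "Caption" are hypothetical keys used only for this illustration.
record = {"Animated": False, "Published": True, "Caption": None}

# Serializing turns the Python literals into their lowercase JSON forms.
text = json.dumps(record)
print(text)

# Parsing reverses the mapping, so the round trip gives back an equal dict.
restored = json.loads(text)
print(restored == record)
```

The first print shows the lowercase JSON literals, and the second confirms that parsing restores the original Python values.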

JSON Schema

For larger datasets, a base schema or structure template helps to infer, create, modify, and validate incoming JSON instances, and so to ensure correct data exchange. A brief history of JSON Schema, with links, can be found in reference [5]. A schema object describes the structure of the elements within a JSON dataset [6].

Note that multiple Python tools are available for generating JSON schemas. The discussion below uses the genson package to illustrate the use of a schema; other schema tools should work as well.
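To make the idea of validating an instance against a schema concrete, here is a minimal sketch that checks only three keywords: "type", "required", and "properties" (plus "items" for arrays). The helper names check and _matches are hypothetical, and this is not a complete JSON Schema validator; a dedicated validation library should be preferred for real use.

```python
# Map JSON Schema type names to Python types (a small subset of the specification).
JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
}


def _matches(value, name):
    """Return True if value has the JSON type called name (hypothetical helper)."""
    # bool is a subclass of int in Python, so exclude it from the numeric types.
    if name in ("integer", "number") and isinstance(value, bool):
        return False
    if name == "null":
        return value is None
    return isinstance(value, JSON_TYPES.get(name, object))


def check(instance, schema, path="$"):
    """Return a list of problems; an empty list means the instance passed."""
    problems = []
    allowed = schema.get("type")
    if allowed is not None:
        # "type" may be a single name or a list of alternatives.
        names = allowed if isinstance(allowed, list) else [allowed]
        if not any(_matches(instance, name) for name in names):
            problems.append(f"{path}: expected {allowed}, got {type(instance).__name__}")
            return problems
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                problems.append(f"{path}: missing required key {key!r}")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:
                problems.extend(check(instance[key], sub, f"{path}.{key}"))
    elif isinstance(instance, list) and "items" in schema:
        for i, item in enumerate(instance):
            problems.extend(check(item, schema["items"], f"{path}[{i}]"))
    return problems


# A hand-written schema and two sample records, used only for this illustration.
schema = {
    "type": "object",
    "properties": {
        "Width": {"type": "integer"},
        "Title": {"type": "string"},
        "IDs": {"type": "array", "items": {"type": "integer"}},
    },
    "required": ["Title", "Width"],
}
good = {"Width": 800, "Title": "View from 15th Floor", "IDs": [116, 943]}
bad = {"Width": "701", "IDs": [116, "943"]}

print(check(good, schema))
for problem in check(bad, schema):
    print(problem)
```

The first record passes with an empty problem list. The second is reported for a missing "Title", a string-valued "Width", and a string inside "IDs".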

To generate a schema from a single object using the genson package:

from genson import SchemaBuilder

builder = SchemaBuilder()
builder.add_object(ex1['Image'])
builder.to_schema()
{'$schema': 'http://json-schema.org/schema#',
 'type': 'object',
 'properties': {'Width': {'type': 'integer'},
  'Height': {'type': 'integer'},
  'Title': {'type': 'string'},
  'Thumbnail': {'type': 'object',
   'properties': {'Url': {'type': 'string'},
    'Height': {'type': 'integer'},
    'Width': {'type': 'integer'}},
   'required': ['Height', 'Url', 'Width']},
  'Animated': {'type': 'boolean'},
  'IDs': {'type': 'array', 'items': {'type': 'integer'}}},
 'required': ['Animated', 'Height', 'IDs', 'Thumbnail', 'Title', 'Width']}

To generate a schema from a list of objects (records) using the genson package:

ex2 = [ex1['Image']] + [
    {
        "Width":  "701",
        "Height": -1.0,
        "Title":  "View from 16th Floor",
        "new_field": "This is a new field"
    }
]

builder = SchemaBuilder()
builder.add_object(ex2)
print(json.dumps(builder.to_schema(), indent=2))
{
  "$schema": "http://json-schema.org/schema#",
  "type": "array",
  "items": {
    "type": "object",
    "properties": {
      "Width": {
        "type": [
          "integer",
          "string"
        ]
      },
      "Height": {
        "type": "number"
      },
      "Title": {
        "type": "string"
      },
      "Thumbnail": {
        "type": "object",
        "properties": {
          "Url": {
            "type": "string"
          },
          "Height": {
            "type": "integer"
          },
          "Width": {
            "type": "integer"
          }
        },
        "required": [
          "Height",
          "Url",
          "Width"
        ]
      },
      "Animated": {
        "type": "boolean"
      },
      "IDs": {
        "type": "array",
        "items": {
          "type": "integer"
        }
      },
      "new_field": {
        "type": "string"
      }
    },
    "required": [
      "Height",
      "Title",
      "Width"
    ]
  }
}

The output of to_schema() is a JSON-compatible Python dictionary and can be edited as needed. The genson package also provides functions to update a schema and to create an extended schema builder.

The module util.jsonsch accepts a schema object and provides a summary. Note that JSON formats and schemas are used in a wide variety of ways. The util.jsonsch module is still under development and supports only a subset of keywords.
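As an illustration of editing a generated schema by hand, the sketch below starts from a fragment copied from the list-of-records output shown earlier and tightens it using ordinary dictionary operations. The description text and the choice of a zero minimum are assumptions made for the example, not part of the original data.

```python
import json

# A fragment shaped like the to_schema() output for the list of records,
# copied by hand so that this example runs without genson installed.
schema = {
    "type": "object",
    "properties": {
        "Width": {"type": ["integer", "string"]},
        "Height": {"type": "number"},
        "Title": {"type": "string"},
    },
    "required": ["Height", "Title", "Width"],
}

# Tighten the inferred schema: accept only integer widths and document the field.
# The description wording is an assumption for illustration.
schema["properties"]["Width"] = {
    "type": "integer",
    "description": "Image width in pixels",
}

# Reject negative heights with the standard "minimum" keyword (assumed policy).
schema["properties"]["Height"]["minimum"] = 0

# The edited schema is still plain data and serializes like any other JSON document.
print(json.dumps(schema, indent=2))
```

Because the schema is an ordinary dictionary, edits like these need no special API; the result can be saved with json.dumps and reused later.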

Reference