JSON Lines format

The GeoSpock ingestor expects JSON Lines data to comply with the following rules:

  • Each line must contain a single JSON-encoded value, terminated by a line feed. For more information, see http://jsonlines.org/
  • The root element must be an object
  • The property values of that object must be numbers, booleans or strings
  • The object properties may be in any order. The ingestor uses the property name to differentiate the fields, so you must ensure that property names are consistent between files
  • File names should end with the .jsonl suffix
  • The file must contain only flat objects
  • Your data can include spaces around the structural elements of each JSON Lines row:
    • between the braces
    • before or after quotes surrounding content
    • around the colon or comma

Example content:

{"uuid": "2aadb-99d-97943", "lat": 42.32365, "lon": 44.538375, "calories": 12.5, "timestamp": 1041037198}
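The rules above can be checked before ingestion with a short script. The following Python sketch is illustrative only and is not part of the GeoSpock tooling: the function names check_line and check_file are hypothetical, and accepting a JSON null value is an assumption based on the data validation rules for this format.

```python
import json

# Scalar types allowed as property values. None (JSON null) is accepted here
# only as an assumption, because the data validation rules describe how a
# null value is interpreted.
ALLOWED_TYPES = (str, int, float, bool, type(None))


def check_line(line):
    """Return a list of problems found in one JSON Lines row (empty if none)."""
    try:
        value = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(value, dict):
        return ["root element is not an object"]
    # A nested object or array as a property value means the object is not flat.
    return [
        f"property {name!r} is not a number, boolean or string"
        for name, item in value.items()
        if not isinstance(item, ALLOWED_TYPES)
    ]


def check_file(path):
    """Yield (line_number, problem) pairs for each non-conforming row in a file."""
    with open(path, encoding="utf-8") as handle:
        for number, line in enumerate(handle, start=1):
            for problem in check_line(line):
                yield number, problem
```

Calling check_line on the example content above returns an empty list, while a row containing a nested object or an array as its root element returns a description of the problem.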

Describing JSON Lines data

To ingest your source input data, you need to provide the ingestor with a description of that data. The ingestor uses this data source description to store the ingested data correctly in the GeoSpock database, enabling you to run your queries and analyze your data. For more information, see Creating a data source description for a dataset.

The following table shows the fields you must provide when describing this format of data in a data source description.

Setting           Description
id                The name of the column in the SQL table. The ID specified should contain only numbers, lowercase letters and underscores. Example: event_elevation
sourceFieldName   The name of the field in the JSON object. Example: "height1"
purpose           (Optional) This setting enables you to identify the following fields: latitude, longitude, elevation, source_id. See Special fields (purpose) for more information.
sqlType           The data type for this field. For more information about the data types supported, see Types of data. Example: REAL

For example:

{
	"id": "taxi_id",
	"sourceFieldName": "tid",
	"purpose": "SOURCE_ID",
	"sqlType": "VARCHAR"
}		
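A field description such as the one above can be sanity-checked before it is submitted. The following Python sketch is an illustration under stated assumptions and not the ingestor's own validation: check_field is a hypothetical helper, "lowercase letters" is assumed to mean ASCII a to z, and only the settings listed in the table are considered.

```python
import re

# Rule for id: numbers, lowercase letters and underscores only.
# Restricting letters to ASCII a-z is an assumption.
ID_PATTERN = re.compile(r"[a-z0-9_]+")

# purpose is optional; the other settings in the table must be provided.
REQUIRED_SETTINGS = ("id", "sourceFieldName", "sqlType")


def check_field(field):
    """Return a list of problems found in one field description (empty if none)."""
    problems = [
        f"missing setting: {name}"
        for name in REQUIRED_SETTINGS
        if name not in field
    ]
    field_id = field.get("id")
    if isinstance(field_id, str) and not ID_PATTERN.fullmatch(field_id):
        problems.append(f"invalid id: {field_id!r}")
    return problems
```

For the taxi_id example, check_field returns an empty list. An id such as "Taxi-ID" is reported as invalid because it contains uppercase letters and a hyphen.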

Data validation

For JSON Lines format data, each row of source input data must comply with the JSON specification or it will be rejected.

For a given row, if a source field:

  • is referenced in the data source description but does not exist in the row, its value will be interpreted as NULL
  • is an empty string, it will be interpreted as an empty string (whether this is valid depends on its field specification)
  • has the value null, it will be interpreted as NULL (whether this is valid depends on its field specification)
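The three cases above can be mirrored in a few lines of Python. This is a minimal sketch of the described interpretation, not the ingestor's implementation: extract_values is a hypothetical helper, and Python's None stands in for NULL.

```python
import json


def extract_values(line, source_field_names):
    """Map each requested source field to its value in one JSON Lines row."""
    row = json.loads(line)
    # dict.get returns None both when the property is missing and when its
    # value is JSON null; an empty string is passed through unchanged.
    return {name: row.get(name) for name in source_field_names}


values = extract_values('{"tid": "", "lat": null}', ["tid", "lat", "height1"])
# values == {"tid": "", "lat": None, "height1": None}
```

Here tid stays an empty string, lat is null in the row, and height1 is absent from the row, so the last two both come out as None.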