Managing your data

This section gives data engineers and data scientists guidance on how to manage data in your GeoSpock stack, covering:

  • dataset authoring
  • data ingestion
  • dataset management

The GeoSpock stack processes, indexes and stores your source input data, ready for use by analytics tools such as illumin8 and extrapol8. The resulting datasets are accessible through the dashboard. You can also access your illumin8 galleries through the dashboard.

Dataset authoring

To be able to ingest your source input data into the GeoSpock stack, you need to:

  • prepare your data, making sure that it is in a supported format and that the file sizes and directory structure facilitate the ingestion process; see Preparing your data for ingest
  • create a dataset description, the JSON configuration file that defines how the ingestor should process the data to create the indexes. Your account manager will provide you with a dataset description based on your data and data analysis requirements.
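
Because the dataset description is a JSON file, it can be worth confirming that it still parses after any hand edits and before you pass it on. The following is a minimal sketch in Python using only the standard library. The function name is hypothetical, and the check that the top level is a JSON object is an assumption; the sketch does not validate the file against the GeoSpock dataset description schema.

```python
import json


def check_dataset_description(text: str) -> tuple[bool, str]:
    """Return (ok, message) for the text of a candidate dataset description.

    Only checks JSON syntax; it knows nothing about the fields the
    ingestor expects.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"not valid JSON: {err.msg} (line {err.lineno}, column {err.colno})"
    # Assumption: a dataset description is a JSON object at the top level.
    if not isinstance(parsed, dict):
        return False, "valid JSON, but the top level is not an object"
    return True, "parses as a JSON object"
```

To check a file, read its contents and pass them in, for example `check_dataset_description(open("description.json").read())`, where the file name is a placeholder.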

Data ingestion

Once you have your source input files and dataset description, your data can be ingested into the GeoSpock stack.

So that your stack administrator can create an ingestion job, let them know the S3 bucket location of your:

  • source input data
  • dataset description
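
Before sending the two locations to your stack administrator, you may want to confirm that each one is a well-formed S3 URI. The sketch below uses only the Python standard library. The function names, bucket name and paths are hypothetical placeholders, and the `s3://bucket/key` form is an assumption about how you record the locations; the sketch does not contact S3 or check that the objects exist.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"expected s3://bucket/key, got {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def ingestion_request(source_uri: str, description_uri: str) -> str:
    """Build a short plain-text note listing both locations."""
    # Raises ValueError if either location is malformed.
    split_s3_uri(source_uri)
    split_s3_uri(description_uri)
    return (
        f"Source input data: {source_uri}\n"
        f"Dataset description: {description_uri}"
    )


# Hypothetical locations, for illustration only.
print(ingestion_request(
    "s3://example-bucket/source-data/",
    "s3://example-bucket/descriptions/dataset.json",
))
```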

Dataset management

When your source input data has been ingested into the GeoSpock stack, it is available as datasets in the dashboard; see Viewing your datasets.

Using an analysis tool, such as illumin8, you can then explore the data and create galleries to share your discoveries with others; see Accessing your galleries.