Managing your data


This section gives data engineers and data scientists guidance on how to manage data in your Platform, covering:

  • dataset authoring
  • data ingestion
  • dataset management

The GeoSpock stack processes, indexes, and stores your source input data so that it is ready for use by analytics tools such as the Data Explorer and the Analytics API. The resulting datasets are accessible through the dashboard. You can also access your Data Explorer galleries through the dashboard.



Dataset authoring

Before your source input data can be ingested into the Platform, you need to:

  • prepare your data, making sure that it is in a supported format and that the file size and directory structure facilitate the ingestion process; see Preparing your data for ingest
  • create a dataset description, the JSON configuration file that tells the ingestor how to process the data to create the indexes. Your account manager will provide you with a dataset description based on your data and data analysis requirements
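
Because the dataset description is a JSON file, it can be worth confirming that it parses cleanly before handing it over for ingestion. The following is a minimal sketch only: the function name is hypothetical, it assumes the description is a JSON object at the top level, and it does not check any Platform-specific fields, since the schema is not described here.

```python
import json


def is_well_formed_description(text: str) -> bool:
    """Return True if `text` parses as a JSON object.

    Hypothetical pre-check: it validates JSON syntax only and knows
    nothing about the fields a dataset description must contain.
    """
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    # Assumption: a configuration file is a JSON object at the top level.
    return isinstance(parsed, dict)
```

To check a file on disk, read it as text first, for example `is_well_formed_description(open("description.json", encoding="utf-8").read())`, where `description.json` is a placeholder file name.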

Data ingestion

Once you have your source input files and dataset description, your data can be ingested into the Platform.

So that your Platform administrator can create an ingestion job, let them know the S3 bucket location of your:

  • source input data
  • dataset description
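
Before sending the two locations to your administrator, you may want to confirm that each one is a well-formed S3 URI. This is a minimal sketch using only the Python standard library; the function name, bucket name, and key prefixes are placeholders rather than values the Platform requires, and the check does not contact S3 or verify that the objects exist.

```python
from typing import Tuple
from urllib.parse import urlparse


def parse_s3_location(uri: str) -> Tuple[str, str]:
    """Split an s3:// URI into (bucket, key_prefix).

    Raises ValueError if the URI does not use the s3 scheme or has no
    bucket name. This is a syntax check only.
    """
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not a valid S3 URI: {uri!r}")
    # The key prefix is the path without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")


# Placeholder locations for illustration only.
source_bucket, source_prefix = parse_s3_location("s3://example-bucket/source-data/")
desc_bucket, desc_key = parse_s3_location("s3://example-bucket/config/description.json")
```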

Dataset management

When your source input data has been ingested into the Platform, it is available as datasets in the dashboard; see Viewing your datasets.

Using an analysis tool, such as Data Explorer, you can then explore the data and create galleries to share your discoveries with others; see Accessing your galleries.