Processing your data
The GeoSpock database provides the administrative tools and processes that enable you to:
- load your data into the database
- index it for use with the analysis tools, including the automatic creation of AWS resources to support the ingestion of your data
- manage your data throughout its life cycle, so that you can:
  - incrementally add data as it becomes available
  - delete data that you no longer require
Ingested data is stored as a dataset; see Using datasets for more information.
Before you can load your source input data into the GeoSpock database, you need to prepare it by making sure that:
- it is in a supported format
- the file size and directory structure will facilitate the ingestion process
See Preparing your data for ingest for guidance on how to format and organize your source input data.
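As a rough illustration of the kind of pre-ingest check described above, the following Python sketch walks a source directory and reports files with unexpected extensions or unusually large sizes. The function name, the set of allowed extensions, and the size threshold are all hypothetical values chosen for the example; they are not GeoSpock limits, so take the real requirements from Preparing your data for ingest.

```python
from collections import Counter
from pathlib import Path


def summarize_source_dir(root, allowed_suffixes, max_bytes):
    """Walk `root` and report files that may need attention before ingest.

    `allowed_suffixes` and `max_bytes` are illustrative assumptions supplied
    by the caller; they are not limits defined by the GeoSpock database.
    """
    suffix_counts = Counter()
    unexpected = []  # files whose extension is not in allowed_suffixes
    oversized = []   # files larger than max_bytes

    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        suffix = path.suffix.lower()
        suffix_counts[suffix] += 1
        if suffix not in allowed_suffixes:
            unexpected.append(path)
        if path.stat().st_size > max_bytes:
            oversized.append(path)

    return {
        "suffix_counts": dict(suffix_counts),
        "unexpected": unexpected,
        "oversized": oversized,
    }
```

For example, `summarize_source_dir("source-data", {".csv"}, 512 * 1024 * 1024)` would list every non-CSV file and every file over 512 MiB under a hypothetical `source-data` directory, which you could then fix or split before ingesting.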
To start analyzing your data, you need to load it into the GeoSpock database. To do this, you need:
- your source input data; see Preparing your data for ingest for guidance on making your data ready for loading into the database
- a schema for your source input data. This JSON file configures how your data is stored and indexed when it is loaded into the database; see Creating a data source description for a dataset
You can then ingest your data; see Ingesting source input data for guidance on how to do this.
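Because the schema is a JSON file, it can be worth confirming that the file parses before you start an ingest. The sketch below checks only that a file is well-formed JSON and reports where parsing failed; it does not validate the file against the data source description format, and the helper name is a hypothetical one chosen for this example.

```python
import json
from pathlib import Path


def load_json_file(path):
    """Return the parsed contents of `path`.

    Raises ValueError naming the file, line, and column if the file is not
    well-formed JSON. This does not check any GeoSpock-specific fields.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        raise ValueError(
            f"{path}: invalid JSON at line {err.lineno}, "
            f"column {err.colno}: {err.msg}"
        ) from err
```

Calling `load_json_file` on your schema file before ingesting catches syntax slips such as a trailing comma early, when they are cheap to fix.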
When new data becomes available, you can add it to an existing dataset using an incremental ingest.
The GeoSpock database makes ingested data available as datasets.
Using the GeoSpock CLI, you can:
- list the datasets loaded into the GeoSpock database
- view the status and history of each dataset
- delete datasets that you no longer require
See Using datasets for guidance on using the GeoSpock CLI.