The GeoSpock database
The GeoSpock database is a cloud-based solution that processes big data and makes it available for analysis, enabling you to find patterns and trends in your data. It comprises:
- a command line interface (CLI) that enables you to manage your data and user accounts
- components that ingest and index your source input data
- an SQL interface that gives you access to your data for further analysis
- an architecture that enables you to integrate your existing analysis tools
Because the GeoSpock database is installed in the cloud, you can add components and resources as your analysis needs change. See The GeoSpock database architecture for further details about its components.
The first step in using the GeoSpock database is to load your data into it. You ingest each source of data using a schema that describes the field types of the source data (Data source description files). As new data becomes available, you can add it to the dataset (Ingesting data for an existing dataset).
Ingesting your data into the GeoSpock database (Ingesting data) creates a set of indexes and stored data that are optimized for big data searches and queries. The ingested data is then available as a dataset that you can explore, analyze, and query using the GeoSpock database tools and APIs (Dataset administration).
The data management process is driven programmatically through a command line tool, the GeoSpock CLI, that enables you to ingest your data and manage your datasets (The GeoSpock CLI).
You can integrate your existing Business Intelligence (BI) and visualization tools with the GeoSpock database, so that you can use the database in your existing data analysis workflows (Integrating your third party tools).
You can access the GeoSpock database:
- by using the Presto CLI to run queries on your ingested data; see Running GeoSpock database queries from the Presto CLI
- by running queries from a Business Intelligence (BI) tool, such as Tableau, QuickSight or Power BI; see Using Business Intelligence (BI) tools with the GeoSpock database
- programmatically, for example from Python or Java; see Accessing the GeoSpock database programmatically
Once you've ingested your data and connected your tools, you can run queries on your data (Querying your data). A number of SQL functions have been optimized for use with the GeoSpock database. Refer to GeoSpock database optimized SQL functions for a list of these functions.
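As a rough illustration of programmatic access, the sketch below runs an SQL query from Python through a DB-API 2.0 connection. It is not taken from the GeoSpock documentation: the helper run_query, the table trips, and its columns are hypothetical names. An in-memory SQLite database stands in for a real deployment so that the example runs anywhere. Against the GeoSpock database you would instead pass a connection created by a Presto-compatible DB-API client, assuming one is available in your environment.

```python
import sqlite3
from typing import Any, List, Tuple


def run_query(conn: Any, sql: str) -> List[Tuple]:
    """Run one SQL statement on a DB-API 2.0 connection and return all rows."""
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


# Stand-in connection so the sketch runs without a database deployment.
# The table and column names below are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (vehicle_id TEXT, speed_kmh REAL)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("a", 40.0), ("a", 60.0), ("b", 30.0)],
)

# Average speed per vehicle, ordered so the result is deterministic.
rows = run_query(
    conn,
    "SELECT vehicle_id, AVG(speed_kmh) FROM trips "
    "GROUP BY vehicle_id ORDER BY vehicle_id",
)
print(rows)
```

Keeping the helper independent of any particular driver means that only the line creating the connection should need to change when you point the code at a real deployment.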