Querying your data
The GeoSpock database enables you to run SQL queries on your ingested data, and has optimized a subset of SQL functions that are used frequently in the analysis of geo-spatial data. These optimized functions use the indexes generated when your source input data was ingested.
For the best performance and to make sure that you are taking advantage of the benefits of the indexed data, you should follow the guidelines for creating optimized queries. Refer to Using the GeoSpock DB Presto connector for more information.
Auto-scaling SQL cluster behavior
As you run queries, the GeoSpock database automatically adds worker machines to your SQL cluster as required. Be aware that the new machines are not used for queries that are already in the queue at the time the cluster scales up; only queries submitted afterwards benefit from them. The GeoSpock database scales the SQL cluster down once it has completed all the queries in the queue.
The number of machines in the SQL cluster never falls below the minimum count value (supplied when deploying the cluster) and never rises above the maximum count value (also supplied when deploying).
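The bounds described above can be pictured as a simple clamp. The following is a minimal illustrative sketch, not the GeoSpock database's actual scaling logic; the function name `clamp_worker_count` and the example bounds are assumptions made for illustration only.

```python
def clamp_worker_count(desired: int, min_count: int, max_count: int) -> int:
    """Return the worker count after applying the deployment's min/max bounds.

    Illustrative model only: `desired` stands in for whatever number of
    workers the autoscaler would otherwise choose.
    """
    if min_count > max_count:
        raise ValueError("min_count must not exceed max_count")
    return max(min_count, min(desired, max_count))


# Hypothetical deployment bounds: at least 2 workers, at most 10.
print(clamp_worker_count(desired=0, min_count=2, max_count=10))   # queue drained
print(clamp_worker_count(desired=25, min_count=2, max_count=10))  # heavy load
```

However many workers the load would otherwise call for, the result stays between the minimum and maximum counts supplied at deployment.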
Optimized SQL functions
Optimized functions include:
- ST_Within()
- ST_Distance()
- line_locate_point()
- ST_Intersects()
Custom GeoSpock functions include:
- GS_Distance_Within()
- gs_great_circle_distance_within()
For details, see GeoSpock database optimized SQL functions.
Accessing the GeoSpock database
There are a number of ways to access the GeoSpock database:
- running queries from a Business Intelligence (BI) tool, such as Tableau, QuickSight or Power BI; see Using Business Intelligence (BI) tools with the GeoSpock database
- using the Presto CLI to run queries on your ingested data; see Running GeoSpock database queries from the Presto CLI
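As a rough sketch of the CLI route, the snippet below assembles a Presto CLI invocation as an argument list. The `--server`, `--catalog`, `--schema` and `--execute` options are standard Presto CLI options; the executable name `presto`, the host, the catalog and schema names, and the table name are placeholders assumed for illustration, and any authentication options your deployment requires are not shown.

```python
import shlex
from typing import List


def presto_cli_args(server: str, catalog: str, schema: str, query: str) -> List[str]:
    """Build the argument list for a one-off Presto CLI query.

    All values are supplied by the caller; nothing here is specific to a
    particular GeoSpock deployment.
    """
    return [
        "presto",  # assumed executable name; yours may differ
        "--server", server,
        "--catalog", catalog,
        "--schema", schema,
        "--execute", query,
    ]


# Placeholder values for illustration only.
args = presto_cli_args(
    server="sql.example.com:8080",
    catalog="my_catalog",
    schema="default",
    query="SELECT count(*) FROM my_dataset",
)
print(shlex.join(args))  # shell-quoted command line, suitable for copy and paste
```

Passing the list to `subprocess.run(args)` avoids shell quoting issues; `shlex.join` (Python 3.8+) is used here only to display the command.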
Geotemporal functions
In your data analysis, to find events that occur within a specific time range or at a certain time, use the GeoSpock database's optimized time queries by specifying a:
- single time window
- time interval
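A time-window filter might look like the query built below. This is a sketch under assumptions: the table name `events` and column name `event_time` are hypothetical, and the exact predicate forms that the GeoSpock database optimizes are described in its own function reference rather than confirmed here. The `TIMESTAMP '...'` literal is standard Presto SQL.

```python
from datetime import datetime


def time_window_query(table: str, time_column: str,
                      start: datetime, end: datetime) -> str:
    """Build a SQL query selecting rows whose timestamp lies in [start, end).

    Identifiers are interpolated directly, so pass only trusted table and
    column names.
    """
    fmt = "%Y-%m-%d %H:%M:%S"
    return (
        f"SELECT * FROM {table} "
        f"WHERE {time_column} >= TIMESTAMP '{start.strftime(fmt)}' "
        f"AND {time_column} < TIMESTAMP '{end.strftime(fmt)}'"
    )


# Hypothetical table and column names: all events on 1 January 2020.
print(time_window_query("events", "event_time",
                        datetime(2020, 1, 1), datetime(2020, 1, 2)))
```

A half-open range (inclusive start, exclusive end) is used so that adjacent windows do not count the same event twice.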
Geometry functions
As part of your data analysis, you may want to focus on a specific region or postcode area to answer specific questions about your dataset. The GeoSpock database provides the following optimized geometry functions:
- Simple bounding box
- Points within, or intersecting with, a geometry shape
- Line locate point
- Recorded geometries intersect with a query geometry shape
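To make the first two items concrete, the sketch below builds a bounding-box predicate and a point-in-polygon predicate. `ST_Within`, `ST_Point` and `ST_GeometryFromText` are standard Presto geospatial functions; the column names `longitude` and `latitude`, the table name `events`, and the coordinates are assumptions for illustration, and whether a given predicate is served from the index depends on the guidelines for optimized queries.

```python
def bounding_box_predicate(lon_col: str, lat_col: str,
                           min_lon: float, min_lat: float,
                           max_lon: float, max_lat: float) -> str:
    """Build a WHERE-clause fragment restricting points to a rectangle."""
    return (
        f"{lon_col} BETWEEN {min_lon} AND {max_lon} "
        f"AND {lat_col} BETWEEN {min_lat} AND {max_lat}"
    )


def within_polygon_predicate(lon_col: str, lat_col: str, polygon_wkt: str) -> str:
    """Build a WHERE-clause fragment testing points against a WKT polygon."""
    escaped = polygon_wkt.replace("'", "''")  # escape quotes for the SQL literal
    return (
        f"ST_Within(ST_Point({lon_col}, {lat_col}), "
        f"ST_GeometryFromText('{escaped}'))"
    )


# Hypothetical columns and a small rectangle, expressed both ways.
box = bounding_box_predicate("longitude", "latitude", -0.5, 51.3, 0.3, 51.7)
poly = within_polygon_predicate(
    "longitude", "latitude",
    "POLYGON ((-0.5 51.3, 0.3 51.3, 0.3 51.7, -0.5 51.7, -0.5 51.3))",
)
print(f"SELECT * FROM events WHERE {box}")
print(f"SELECT * FROM events WHERE {poly}")
```

Note that `ST_Point` takes the x coordinate (longitude) first and the y coordinate (latitude) second, matching the coordinate order used in the WKT polygon.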
Using User-Defined Functions (UDFs)
You may wish to write your own custom functions and deploy them with the SQL cluster. The GeoSpock DB Presto connector can then use your UDFs in a SQL query.