Using Spark datasets in the Analytics API
The Analytics API performs operations on data that it retrieves from one or more Spark datasets. A Spark dataset is the equivalent of a data layer in a GeoSpock dataset, as used by the Data Explorer. When your source data is ingested into a GeoSpock dataset, a number of data layers are created. Each data layer contains a subset of your data, such as taxi pick-up points, supermarket locations, or ad requests.
Using the Analytics API, you can:
- create a Spark dataset; see the instructions in Creating a Spark dataset object and the example code in A worked example using a Spark dataset
- combine two datasets; see Combining GeoSpock datasets with other datasets using the Analytics API
- get the history for the devices in your dataset; see Getting the device history for more information
- save the dataset in memory to improve the performance of subsequent operations; see Caching a Spark dataset for more information