Using datasets

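A dataset is a set of indexed data from a single source, such as event data, Point of Interest (POI) data, or sensor data, that has been processed and loaded into the GeoSpock database. You can access and manage your datasets using the GeoSpock CLI. Using this command line tool, you can:

  • list the datasets that you have permission to view
  • get information about a dataset
  • give a user group permission to access a dataset
  • remove a user group's permission to access a dataset
  • delete a dataset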

See The GeoSpock CLI for more information about how to install and use this command line interface.

Listing datasets

To list all the datasets that you have permission to view, use the dataset-list command:

$ geospock dataset-list --page-index <value> --page-size <value>

You can supply the page index (starting at page 0), the number of datasets per page, or both. By default, page 0 is returned with 1000 datasets listed per page.

For more information about this command, use the GeoSpock CLI's help command.
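If you script against the CLI, you can assemble the paging flags described above programmatically. The following is a minimal Python sketch, not part of the GeoSpock tooling: the helper name dataset_list_args is hypothetical, and it only builds the argument list from the --page-index and --page-size flags and the defaults stated above. The lower bound on the page size is an assumption.

```python
from typing import List


def dataset_list_args(page_index: int = 0, page_size: int = 1000) -> List[str]:
    """Build the argument list for `geospock dataset-list`.

    Hypothetical helper: the defaults mirror the documented CLI defaults
    (page 0, 1000 datasets per page).
    """
    if page_index < 0:
        raise ValueError("page_index starts at 0")
    if page_size < 1:
        # Assumption: a page must hold at least one dataset.
        raise ValueError("page_size must be at least 1")
    return [
        "geospock", "dataset-list",
        "--page-index", str(page_index),
        "--page-size", str(page_size),
    ]
```

You could pass the returned list to subprocess.run, assuming geospock is installed and on your PATH. For example, dataset_list_args(page_index=1, page_size=50) builds the command for the second page of 50 datasets.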

Getting information about a dataset

Using the GeoSpock CLI, you can get the following information about a specific dataset:

  • its current status: use the dataset-status command to get information about a specified dataset, including its title, a summary of its contents, any groups that have permission to access it and the status of the most recent operation on that dataset. For example:

    $ geospock dataset-status --dataset-id nycTaxiData
    {
        "id": "ingesttest1",
        "title": "ingesttest1",
        "description": "Ingested from \u201cingesttest1\u201d on 1/23/2020-09:45:49",
        "createdDate": "2020-01-23T09:45:50Z",
        "permissions": {
            "id": "readTable",
            "entitiesWithAccess": []
        },
        "operationStatus": {
            "id": "opr-ingesttest1-7",
            "label": "Data ingested",
            "type": "INGEST",
            "status": "COMPLETED",
            "lastModifiedDate": "2020-01-22T16:57:14.985Z",
            "createdDate": "2020-01-23T09:45:51Z"
        }
    }
    
    For more information about this command, use the GeoSpock CLI's help command.

  • its history: use the dataset-operations command to get a history of all the operations that have been performed on that dataset. For example:

    $ geospock dataset-operations --dataset-id nycTaxiData 
    {
        "listInfo": {
            "totalItemCount": 2,
            "pageCount": 1
        },
        "operations": [
            {
                "label": "Data ingested",
                "type": "INGEST",
                "status": "COMPLETED",
                "createdDate": "2020-01-23T09:45:51Z",
                "lastModifiedDate": "2020-01-23T10:00:09.651Z"
            },
         ...
    
    
    For more information about this command, use the GeoSpock CLI's help command.

  • its data source description: use the dataset-data-source-description command to get the data source description that was used during the ingestion of a specified dataset. For example:

    $ geospock dataset-data-source-description --dataset-id nycTaxiData 
    {
        "data-source-description": {
            "format": "COLUMNAR",
            "columnarFormatSeparator": ",",
            "properties": [
                {
                    "id": "longitude",
                    "type": "LONGITUDE",
                    "sourceFieldIndex": 5
                },
                … (other properties) … 
            ],
            "indexes": [
                {
                    "propertyIDs": [
                        "farecategory"
                    ]
                }
            ]
        }
    }
    

    For more information about this command, use the GeoSpock CLI's help command.
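The JSON printed by the commands above can be inspected from a script. The following is a minimal Python sketch under stated assumptions: it assumes that the CLI prints only the JSON document to standard output, and it relies solely on the field names visible in the example outputs ("operationStatus", "status", "data-source-description", "indexes", "propertyIDs"). The function names are hypothetical, and status values other than COMPLETED are not shown in this guide.

```python
import json
from typing import List


def latest_operation_completed(status_output: str) -> bool:
    """Return True if the most recent operation reported by
    `geospock dataset-status` has the status COMPLETED."""
    status = json.loads(status_output)
    return status.get("operationStatus", {}).get("status") == "COMPLETED"


def indexed_property_ids(description_output: str) -> List[str]:
    """Collect the property IDs listed under "indexes" in the output of
    `geospock dataset-data-source-description`."""
    description = json.loads(description_output)["data-source-description"]
    ids: List[str] = []
    for index in description.get("indexes", []):
        ids.extend(index.get("propertyIDs", []))
    return ids
```

Both functions take the captured command output as a string, so you can test them against saved output without running the CLI.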

Giving a GeoSpock database user group permission to access the dataset

To give GeoSpock database users access to a dataset, you need to grant a group access to that dataset. Use the following command to give a group permission to access a specified dataset:

$ geospock dataset-permission-grant --dataset-id <dataset-id> --group-name <group-name>

For example:

$ geospock dataset-permission-grant --dataset-id nycTaxiData --group-name newGroup
[
    {
        "entityId": "newGroup"
    }
]

For more information about this command, use the GeoSpock CLI's help command.

Refer to Managing data access and Adding permissions to your ingested data for more information about groups.

Removing permission to access a dataset from a user group

To remove a group's permission to access a specified dataset, use the following command:

$ geospock dataset-permission-revoke --dataset-id <dataset-id> --group-name <group-name>

For example:

$ geospock dataset-permission-revoke --dataset-id nycTaxiData --group-name newGroup
[]

For more information about this command, use the GeoSpock CLI's help command.
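In the grant and revoke examples, the CLI prints a JSON array of objects with an "entityId" field. The following minimal Python sketch checks that output from a script. It assumes that the array lists the entities that have access to the dataset after the command has run, which the examples suggest but this guide does not state explicitly. The function names are hypothetical.

```python
import json
from typing import List


def entities_with_access(permission_output: str) -> List[str]:
    """Extract the "entityId" values from the JSON array printed by
    dataset-permission-grant or dataset-permission-revoke."""
    return [entry["entityId"] for entry in json.loads(permission_output)]


def group_has_access(permission_output: str, group_name: str) -> bool:
    """Return True if group_name appears in the printed array."""
    return group_name in entities_with_access(permission_output)
```

With the example outputs above, group_has_access returns True for newGroup after the grant and False after the revoke.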

Deleting a dataset

To delete a dataset and its associated data from the GeoSpock database, use the dataset-delete command as follows:

$ geospock dataset-delete --dataset-id nycTaxiData 

It takes a short while for the dataset to be deleted. You can confirm that it has been removed by listing your datasets with the dataset-list command.

For more information about this command, use the GeoSpock CLI's help command.
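Because deletion is not immediate, a script may need to poll the list of datasets until the deleted dataset no longer appears. The following is a minimal Python sketch of such a polling loop. The output format of dataset-list is not shown in this guide, so the sketch takes a caller-supplied function, list_dataset_ids (hypothetical), that returns the IDs of the datasets currently listed. The timeout and polling interval are arbitrary assumptions.

```python
import time
from typing import Callable, Iterable


def wait_until_deleted(
    dataset_id: str,
    list_dataset_ids: Callable[[], Iterable[str]],
    timeout_seconds: float = 300.0,  # assumption: arbitrary upper bound
    poll_seconds: float = 10.0,      # assumption: arbitrary interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until dataset_id is no longer listed.

    Returns True once the dataset has disappeared from the list, or
    False if it is still listed when the timeout expires.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if dataset_id not in set(list_dataset_ids()):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_seconds)
```

The sleep parameter is injectable so that you can exercise the loop in tests without waiting.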