Source input data formats
The GeoSpock ingestor supports the following data formats:
All field values must be string, numeric or boolean.
If your source input data is in a different format, you will have to process it so that it conforms to one of the supported ingest formats.
The files may be uncompressed, or compressed with:
- bzip2 (with the .bz2 suffix)
- lzo (with the .lzo suffix)
- gzip (with the .gz suffix)
- Snappy (with the .snappy suffix)
- lz4 (with the .lz4 suffix)
The ingestor does not support split archives, so make sure that each data file is small enough to be compressed as a single archive; for further guidance, see the documentation about file size.
Each compressed data file must have the file extension that matches its compression format (for example, .gz for gzip) so that the ingestor can process the data correctly.
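As a pre-upload check, the suffix rule above can be sketched in Python. This is a minimal illustration and not part of the ingestor: the function name `detect_compression` and the codec labels are hypothetical, and only the suffixes themselves come from the list above. The sketch matches suffixes case-sensitively, which is an assumption.

```python
from pathlib import Path
from typing import Optional

# Suffixes taken from the list of supported compression formats above.
# The codec labels on the right are illustrative names only.
SUPPORTED_SUFFIXES = {
    ".bz2": "bzip2",
    ".lzo": "lzo",
    ".gz": "gzip",
    ".snappy": "snappy",
    ".lz4": "lz4",
}


def detect_compression(filename: str) -> Optional[str]:
    """Return the codec implied by the file's final suffix.

    Returns None when the final suffix is not one of the supported
    compression suffixes, i.e. the file would be read as uncompressed.
    """
    suffix = Path(filename).suffix  # e.g. ".gz" for "events.csv.gz"
    return SUPPORTED_SUFFIXES.get(suffix)
```

Calling `detect_compression("events.csv.gz")` returns `"gzip"`, while a gzip-compressed file saved without its suffix, such as `events.csv`, returns `None`, which is the situation the extension requirement is meant to prevent.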
During ingestion, the source input data is processed row-wise, based on the format of the source input file.
For all datasets:
- a row is considered invalid if the value of any of its fields defined in the data source description is invalid
- an invalid row will be excluded from the dataset
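The row-wise rule above can be illustrated with a short Python sketch. It is not the ingestor's implementation: the names `filter_valid_rows` and `is_supported_scalar`, the representation of a row as a dictionary, and the per-field validator functions are all hypothetical. Treating a missing field as invalid is also an assumption made for the example.

```python
from typing import Any, Callable, Dict, Iterable, List

Row = Dict[str, Any]
Validator = Callable[[Any], bool]


def is_supported_scalar(value: Any) -> bool:
    """True if the value is a string, number or boolean."""
    # bool is a subclass of int in Python; it is listed for readability.
    return isinstance(value, (str, int, float, bool))


def filter_valid_rows(rows: Iterable[Row],
                      validators: Dict[str, Validator]) -> List[Row]:
    """Keep only rows in which every described field passes its check.

    `validators` stands in for the data source description: it maps each
    field name to a hypothetical function that returns True for a valid
    value. A single failing field excludes the whole row.
    """
    valid_rows = []
    for row in rows:
        row_ok = all(
            field in row                      # assumption: missing = invalid
            and is_supported_scalar(row[field])
            and check(row[field])
            for field, check in validators.items()
        )
        if row_ok:
            valid_rows.append(row)
    return valid_rows
```

For example, with a validator that requires `lat` to lie between -90 and 90, a row with `lat` equal to 123.0 is dropped in full, even if every other field in that row is valid.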