Datasets
Arkimet supports different dataset formats, which offer different performance and indexing features to match various ways in which data is stored and queried.
Dataset configuration
Datasets are configured with simple `key = value` configuration options.
These are all the supported options:
archive age
: data older than this number of days will be moved to the dataset archive during maintenance.
delete age
: data older than this number of days will be deleted during maintenance.
eatmydata
: disable fsync/fdatasync operations while writing data to the dataset, and disable SQLite's journaling and other data integrity features. This makes acquiring data very fast, but an interrupted import or a concurrent import may cause data corruption.
format
: format of data in the dataset (one of `grib`, `bufr`, `odimh5`, `vm2`).
index
: comma-separated list of names of metadata to index for faster queries.
path
: path to the dataset, or URL for `remote` datasets.
replace
: when `yes`, importing duplicate data will replace the existing version. When `no`, importing duplicate data will be rejected. When `usn`, importing duplicate BUFR data will replace the existing version only if the BUFR Update Sequence Number is greater than the one currently in the dataset. A replace leaves the old data in the segment and appends the new data at the end, updating the index to refer to the new data. As with deleted data, disk space is only reclaimed when running `arki-check --repack`.
restrict
: comma-separated list of names that have access to the dataset. This allows filtering with the `--restrict` option on the command line.
smallfiles
: `yes` or `no`. When `yes`, the file contents are also saved in the index, to speed up extraction of data with tiny payloads like `vm2`.
step
: segmentation step for the dataset (one of `daily`, `weekly`, `biweekly`, `monthly`, and `yearly`).
type
: dataset type (one of `iseg`, `simple`, `error`, `duplicates`, `remote`, `outbound`, `discard`, `file`).
unique
: comma-separated list of names of metadata that, taken together, make a data item unique in the dataset.
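
As an illustration of how these options combine, here is a minimal sketch of a dataset configuration. It is not taken from a real deployment. The `[example]` section header, the path, the day counts, and the metadata names given to `index` and `unique` are assumptions chosen for the example, as is the use of `#` comment lines.

```ini
# Hypothetical dataset definition: the section name, the path and all
# values below are illustrative assumptions, not recommended defaults.
[example]
type = iseg
format = grib
step = daily
path = /srv/arkimet/example
# Archive data after one week, delete it after one month.
archive age = 7
delete age = 30
# Reject duplicate data on import rather than replacing it.
replace = no
index = origin, product
unique = reftime, origin, product, level, timerange
```

With a configuration along these lines, data older than 7 days would be moved to the archive and data older than 30 days deleted during maintenance, while `replace = no` would cause an item matching an existing one on all the `unique` metadata to be rejected.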