file listings

Example proposed output, note x_parameterSchema

{
    "HAPI": "3.2",
    "x_createdAt": "2017-02-21T17:27Z",
    "modificationDate": "2026-01-01T00:00Z",
    "x_parameterSchema": "list>fileList>jpgFileList",
    "parameters": [
        {
            "length": 20,
            "name": "Time",
            "type": "isotime",
            "x_format": "$Y-$m-$dT$H:$M:$SZ",
            "fill": null,
            "units": "UTC",
            "timeStampLocation" : "begin"
        },
        {
            "description": "Picture of the creek, unmodified",
            "fill": null,
            "name": "fileURI",
            "length": 26,
            "type": "string",
            "units": null,
            "stringType": {
                "uri": {
                    "base": "https://cottagesystems.com/data/hapi/pics/",
                    "mediaType": "image/jpeg"
                }
            }
        },
        {
            "description": "File modification time",
            "name": "modificationDate",
            "type": "isotime",
            "fill": null,
            "x_format": "$Y-$m-$dT$H:$MZ",
            "length": 17,
            "units": "UTC"
        },
        {
            "description": "File size in kilobytes",
            "name": "fileSize",
            "fill": null,
            "type": "integer",
            "units": "KiB"
        }
    ],
    "sampleStartDate": "2023-01-01T00:00Z",
    "sampleStopDate": "2023-02-01T00:00Z",
    "startDate": "2022-11-01T00:00Z",
    "stopDate": "2026-03-06T00:00Z",
    "cadence": "PT10M",
    "status": {
        "code": 1200,
        "message": "OK"
    }
}

One issue is how to deal with the units on the file size. We could use IEEE units, which seem to be similar to (the same as?) what is used in VOUnits and astropy units: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=9714443

Message sent 2026-04-06 to HAPI dev mailing list with status update:

For a summary of where we are now: We would like there to be a schema to indicate that a HAPI response is a listing of files that are available as URIs. (We did not provide this or encourage it so far because we don’t want providers just offering a file listing and saying they made their data available via HAPI.) If people do list files using HAPI, we would prefer that they all use the same format, so that it becomes possible to interpret file listings interoperably from any HAPI service. Therefore, we will offer a schema that, if followed, will allow clients to: a) know that they are getting a file listing, and b) interpret such a listing from any server with computer precision using a single client.

The most basic file listing will be a HAPI dataset that has only 2 required columns:

a time column as the first column (required by HAPI for any dataset); for a file listing, this represents the start time of the data in the file
filename as a URI; this is a string column that has a special string sub-type of URI (this URI sub-type is part of the existing HAPI spec as of version 3.2) with a link to the file. See here for URI string types: https://github.com/hapi-server/data-specification/blob/master/hapi-3.2.0/HAPI-data-access-spec-3.2.0.md#3616-the-stringtype-object

There can be optional elements after this for: file size, end time of data in the file, file modification time, file creation time, last file access time, checksum. If any of these items are included, there are constraints that must be followed for them to be recognized by HAPI. Following any of these optional but constrained items, a dataset may include any number of other, additional columns relevant for these files, such as wavelength, frequency range, observed target, DOI, image type, quality flag, data version, processing level, etc. HAPI does not place any restriction on the number or structure of these additional columns. They just need to be valid HAPI parameters. Any “x_” items in these parameters are of course allowed, as always.
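As a rough illustration of the two required columns, a client could check an info response along these lines. This is only a sketch: the function name check_basic_file_listing is hypothetical, and the checks are assumptions based on the description above (first parameter of type isotime, plus a string parameter carrying a uri stringType, given either as the plain string "uri" or as an object), not the final schema.

```python
def check_basic_file_listing(info: dict) -> list:
    """Return a list of problems; an empty list means both required columns were found.

    Hypothetical helper: the rules below are assumptions drawn from the
    proposal text, not from a published schema.
    """
    params = info.get("parameters", [])
    if len(params) < 2:
        return ["a file listing needs a time column and a URI column"]

    problems = []
    # The first column must be the time column, as for any HAPI dataset.
    if params[0].get("type") != "isotime":
        problems.append("first parameter must have type 'isotime'")

    def is_uri_param(p: dict) -> bool:
        st = p.get("stringType")
        has_uri = st == "uri" or (isinstance(st, dict) and "uri" in st)
        return p.get("type") == "string" and has_uri

    # At least one later column must be a string with the URI sub-type.
    if not any(is_uri_param(p) for p in params[1:]):
        problems.append("no string parameter with a 'uri' stringType found")
    return problems


# Usage with a trimmed-down version of the example above.
example_info = {
    "parameters": [
        {"name": "Time", "type": "isotime", "units": "UTC", "length": 20},
        {"name": "fileURI", "type": "string", "units": None,
         "stringType": {"uri": {"base": "https://cottagesystems.com/data/hapi/pics/",
                                "mediaType": "image/jpeg"}}},
    ]
}
print(check_basic_file_listing(example_info))  # prints []
```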

2026-04-20

Discussion about fileSize:

JavaScript does not even have integers (its numbers are all doubles), so what should size be? Catering to JSON and JavaScript is hard, since JSON doesn't have integers (or comments!)
Current thinking: use double and recommend that it be written as an integer, with full precision, so that you get the exact value; sizes above 2 GB (beyond a 32-bit signed integer) still fit exactly in a double
JavaScript: may lose precision for integers larger than 9007199254740991 (2^53 - 1)
See this binary representation converter: https://www.binaryconvert.com/result_double.html
If an integer is in the range +/- 9,007,199,254,740,991, then write it out exactly, and this value will be represented exactly as a double
Discussed and abandoned: We could suggest that people add their own x_exactFileSize as a clandestine long that is actually a JSON string, such as "123456789012345" (the quotes make it a string to JSON, and then it requires special parsing, like a BigInt)
What about making fileSize a string?
Will summarize and clean this up tomorrow.
This is useful to show that most file sizes (even much bigger than 2 GB) would be precisely represented: https://www.binaryconvert.com/convert_double.html
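The precision limit in this discussion can be checked directly. The sketch below uses Python, whose float is an IEEE 754 double like a JavaScript number; the helper names survives_double and is_safe_file_size are made up for illustration.

```python
# 2^53 - 1, the same value as JavaScript's Number.MAX_SAFE_INTEGER.
MAX_SAFE_INTEGER = 2**53 - 1


def survives_double(n: int) -> bool:
    """True if the integer n is unchanged after a round trip through a double."""
    return int(float(n)) == n


def is_safe_file_size(n: int) -> bool:
    """True if a file size in bytes lies in the range where every integer is exact."""
    return 0 <= n <= MAX_SAFE_INTEGER


print(MAX_SAFE_INTEGER)              # prints 9007199254740991
print(survives_double(2**31))        # 2 GiB in bytes: prints True
print(survives_double(2**53 - 1))    # prints True
print(survives_double(2**53 + 1))    # prints False (rounds to 2^53)
```

A file would need to be roughly 9 petabytes before its size in bytes left the exact range.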

See also: https://github.com/hapi-server/data-specification/issues/218

Sample info response for a file listing

{
   "HAPI": "3.3",
   "status": { "code": 1200, "message": "OK"},
   "$schema": "https://hapi-server.org/schemas/HAPI-3.2.json#info-fileListing",
   "startDate": "1998-001Z",
   "stopDate" : "2017-100Z",
   "parameters": [
       { "name": "time",
         "type": "isotime",
         "units": "UTC",
         "fill": null,
         "length": 24 },
       { "name": "fileURI",
         "type": "string",
         "stringType": {"uri": { "base": "https://sample.com/listing", "mediaType": "image/fits" } },
         "fill": null,
         "description": "solar images at 580 nm",
         "label": "filename"},
       { "name": "checksum",
         "type": "string",
         "length": 32,
         "stringType": {"checksum": { "algorithm": "md5" } },
         "fill": null,
         "description": "pre-calculated checksum using MD5 algorithm"},
       { "name": "stopDate",
         "type": "isotime",
         "length": 24,
         "units": "UTC",
         "fill": null,
         "description": "end date and time when the image was taken; integration times range from 10s to 30s",
         "label": "image stop date"}
   ]
}
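To show how a client might use such an info response, here is a minimal sketch that reads CSV records and builds full file URIs. It assumes every parameter is a scalar (one CSV column each) and that the uri "base" is simply prepended to each value; both are assumptions to verify against the HAPI 3.2 stringType section. The function name read_file_listing, the example.org base, and the data row are invented for illustration.

```python
import csv
import io


def read_file_listing(info: dict, csv_text: str) -> list:
    """Turn CSV records into dicts keyed by parameter name, with URIs expanded.

    Assumes scalar parameters only and plain prefixing of the uri "base".
    """
    names = [p["name"] for p in info["parameters"]]

    # Collect the base for every parameter that declares a uri stringType object.
    uri_bases = {}
    for p in info["parameters"]:
        st = p.get("stringType")
        if isinstance(st, dict) and "uri" in st:
            uri_bases[p["name"]] = st["uri"].get("base", "")

    rows = []
    for record in csv.reader(io.StringIO(csv_text)):
        if not record:  # skip blank lines
            continue
        row = dict(zip(names, record))
        for name, base in uri_bases.items():
            row[name] = base + row[name]
        rows.append(row)
    return rows


# Usage with a hypothetical two-column listing and one made-up record.
listing_info = {
    "parameters": [
        {"name": "time", "type": "isotime", "units": "UTC", "length": 24},
        {"name": "fileURI", "type": "string",
         "stringType": {"uri": {"base": "https://example.org/listing/",
                                "mediaType": "image/fits"}}},
    ]
}
listing_rows = read_file_listing(listing_info, "1998-01-01T00:00:00.000Z,1998/img001.fits\n")
print(listing_rows[0]["fileURI"])  # prints https://example.org/listing/1998/img001.fits
```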

How to handle duration of files and events

Event listings and file listings involve content that has an intrinsic time range, whereas regular HAPI data content has each row associated with a point in time, at least with respect to the query for data.

We decided to keep the query mechanism and rules the same, and will just add a statement about the need to expand a query time range to include potential edge cases, something like: Because event lists and file listings refer to items with an implied duration, a HAPI query for items in this kind of list may need to be expanded, since the query will return only items whose start time falls in the query range. If a server wants to communicate a duration, the stopDate should be used.
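A client-side sketch of this expansion might look as follows. It assumes the client knows an upper bound on item duration (the max_duration value here is made up), that each record carries a stop time, and that intervals are treated as half-open; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def expand_query_start(query_start: datetime, max_duration: timedelta) -> datetime:
    """Move the query start earlier by the longest duration an item can have."""
    return query_start - max_duration


def overlaps(item_start, item_stop, query_start, query_stop) -> bool:
    """True if [item_start, item_stop) intersects [query_start, query_stop)."""
    return item_start < query_stop and item_stop > query_start


utc = timezone.utc
query_start = datetime(2023, 1, 1, tzinfo=utc)
query_stop = datetime(2023, 1, 2, tzinfo=utc)
max_duration = timedelta(minutes=10)  # assumed upper bound on file duration

# The start time to send to the server, so files that began earlier are returned.
request_start = expand_query_start(query_start, max_duration)

# Hypothetical records returned for the expanded range: (start, stop, name).
listing = [
    (datetime(2022, 12, 31, 23, 50, tzinfo=utc), datetime(2022, 12, 31, 23, 58, tzinfo=utc), "a.jpg"),
    (datetime(2022, 12, 31, 23, 55, tzinfo=utc), datetime(2023, 1, 1, 0, 5, tzinfo=utc), "b.jpg"),
    (datetime(2023, 1, 1, 12, 0, tzinfo=utc), datetime(2023, 1, 1, 12, 10, tzinfo=utc), "c.jpg"),
]

# Keep only the items whose time range actually touches the original query.
selected = [name for start, stop, name in listing
            if overlaps(start, stop, query_start, query_stop)]
print(selected)  # prints ['b.jpg', 'c.jpg']
```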

How to handle duplicate times in file listings or event lists

Repeated time tags are allowed in fileListing or eventList data schemas. Equivalently, we could say that times must never decrease.

We just noticed that the HAPI spec never actually states that HAPI times must only ever increase, so we need to add that to the spec! The definitions of "monotonically increasing" vary, so we will avoid that language. The spec should say that time values can only ever increase, with no duplicates.
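The two ordering rules differ only in whether equal neighbors are accepted, which a short check makes concrete. This is a sketch with a hypothetical function name, operating on already-parsed times.

```python
from datetime import datetime


def times_in_order(times: list, allow_repeats: bool) -> bool:
    """Check the ordering of a time column.

    allow_repeats=True  -> times must never decrease (duplicates accepted)
    allow_repeats=False -> each time must be later than the one before it
    """
    pairs = zip(times, times[1:])
    if allow_repeats:
        return all(a <= b for a, b in pairs)
    return all(a < b for a, b in pairs)


t0 = datetime(2023, 1, 1, 0, 0)
t1 = datetime(2023, 1, 1, 0, 10)

print(times_in_order([t0, t0, t1], allow_repeats=True))    # prints True
print(times_in_order([t0, t0, t1], allow_repeats=False))   # prints False
print(times_in_order([t1, t0], allow_repeats=True))        # prints False
```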

Comments on case and capitalization

Three places where we have specific capitalization:

HTTP query parameters: we use snake case, such as include_parameters
camelCase everywhere else
AlertCamelCase for the name of the first column, the Time parameter (sort of, since it's only one word)

Defining the schema for what the parameters are

Like the unitsSchema and coordinateSystemSchema, we will use parameterSchema as the keyword.

Other options: datasetSchema - this means keywords outside the parameters have extra requirements

Could datasetSchema be an array? So far, these potential values are envisioned:

Should it just be called "dataType"? Do we need to worry about other usage of "schema"? We have "stringType" already.
