Time Series Kind Formats
Properties Shared by All Time Series Based Formats
- Frames should be sorted by the time column/field in ascending order[^3]
- The Time field(s):
  - Should have no null values
  - The field name is for display purposes only, and the field should have no labels
  - Within each frame, any additional time fields after the first time field are treated as remainder data
- Value field(s):
  - A value field is so named because it holds the value of each data point (time, value).
  - It can be a numeric or bool field. For numeric values:
    - In Go: Float64, *Float64, Int64, etc.
    - In JS: 'number'
  - The series name comes from the value field's Name property
Invalid Cases
- There is not at least one time field and one value field (except in the single-frame "no data" case)
- The "no data" case (a frame with no fields) is present alongside data
- Possibly a warning rather than an error:
  - Duplicate items (identified by name+dimensions)
  - Unsorted (time is not sorted from old to new)
Time Series Wide Format (TimeSeriesWide)
Version: 0.1
The wide format has a set of time series in a single Frame that share the same time field. It is called "wide" because it gets wider as more series are added.
Example:
| Type: Time, Name: T, Labels: nil | Type: Number, Name: cpu, Labels: {"host": "a"} | Type: Number, Name: cpu, Labels: {"host": "b"} |
| --- | --- | --- |
| 2022-04-27 5:00 | 1 | 6 |
| 2022-04-27 6:00 | 4 | 8 |
| 2022-04-27 7:00 | 2 | 5 |
| 2022-04-27 8:00 | 3 | 9 |
It should have the following properties (also see Shared Properties):
- The first field of type Time is the time index of all the time series.
- There should be only one Frame with the data type declaration.
- There should be at least one value field.
- If there are multiple numeric fields, each value field combined with the time field forms one time series (metric).
- The time field should have no duplicate values (duplicate timestamps).
Remainder Data:
- Any additional Frames without the type declaration or with a different declaration
- Any string fields in the Frame
Notes:
- A Go example of an approximation of this is here.
Time Series Multi Format (TimeSeriesMulti)
Version: 0.1
The TimeSeriesMulti format has one time series per frame. If the response has multiple series whose time values may not line up, this format must be used instead of TimeSeriesWide. The format is called "multi" because the data lives across multiple data frames.
Example:
Frame 0:
| Type: Time, Name: T, Labels: nil | Type: Number, Name: cpu, Labels: {"host": "a"} |
| --- | --- |
| 2022-04-27 5:00 | 1 |
| 2022-04-27 6:00 | 4 |
| 2022-04-27 7:00 | 2 |
| 2022-04-27 8:00 | 3 |
Frame 1:
| Type: Time, Name: T, Labels: nil | Type: Number, Name: cpu, Labels: {"host": "b"} |
| --- | --- |
| 2022-04-27 5:00 | 6 |
| 2022-04-27 6:00 | 8 |
| 2022-04-27 7:00 | 5 |
| 2022-04-27 8:00 | 9 |
It should have the following properties (also see Shared Properties):
- Each frame should have at least one time field and one numeric value field. The first occurrence of each of these field types is used for the series.
- Different Frames can have different field lengths (but within a frame, all fields must have the same length)
- Each time field should have no duplicate values (duplicate timestamps)
Remainder Data:
- Any numeric or time fields after the first of each in each frame
- Any additional Frames without the type declaration or with a different declaration
- Any string fields in the Frame
Notes:
- Go example here.
- The multi format is the only format that all the other formats can be converted to without data manipulation. It can therefore hold the series information of all the other types.
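The per-frame rule above (the first time field and the first numeric field form the series, everything else is remainder data) can be sketched as follows. The `FieldKind`, `Field`, and `SplitMultiFrame` names are hypothetical and the field model carries only a name and a type tag; this is an illustration under those assumptions, not the reference implementation.

```go
package main

// FieldKind is a hypothetical, simplified field type tag.
type FieldKind int

const (
	KindTime FieldKind = iota
	KindNumber
	KindString
)

// Field is a hypothetical, simplified field descriptor (values omitted).
type Field struct {
	Name string
	Kind FieldKind
}

// SplitMultiFrame returns the indexes of the first time field and the first
// numeric field of one frame (-1 when missing), plus the indexes of all
// other fields, which count as remainder data.
func SplitMultiFrame(fields []Field) (timeIdx, valueIdx int, remainder []int) {
	timeIdx, valueIdx = -1, -1
	for i, f := range fields {
		switch {
		case f.Kind == KindTime && timeIdx == -1:
			timeIdx = i
		case f.Kind == KindNumber && valueIdx == -1:
			valueIdx = i
		default:
			// Later time or numeric fields, and all string fields.
			remainder = append(remainder, i)
		}
	}
	return timeIdx, valueIdx, remainder
}
```

A frame for which either returned index is -1 does not carry a series under this sketch.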
Time Series Long Format (TimeSeriesLong) [SQL-Like]
Version: 0.1
This is a response format common to SQL-like systems[^4]. See the Grafana documentation "Multiple dimensions in table format" for simpler (but incomplete) examples. It currently exists as a data transformation within some datasources[^5] in the backend that query SQL-like data; see this Go example for how that code works.
The format is called "long" because it needs more rows than the "wide" format to hold the same series, and therefore grows longer.
Example:
| Type: Time, Name: T, Labels: nil | Type: String, Name: host, Labels: nil | Type: Number, Name: cpu, Labels: nil |
| --- | --- | --- |
| 2022-04-27 5:00 | a | 1 |
| 2022-04-27 5:00 | b | 6 |
| 2022-04-27 6:00 | a | 4 |
| 2022-04-27 6:00 | b | 8 |
| 2022-04-27 7:00 | a | 2 |
| 2022-04-27 7:00 | b | 5 |
| 2022-04-27 8:00 | a | 3 |
| 2022-04-27 8:00 | b | 9 |
It should have the following properties (also see Shared Properties):
- The first time field is used as the timestamps
- The time field can have duplicate timestamps (but must be sorted by time in ascending order)
- There may optionally be string fields. For each string field:
  - The column/field Name is the dimension (e.g. "label") name
  - Corresponding string values in that field (by row) are the label values
- Series are constructed by iterating over the rows of the data frame table response.
- The name of each value field/column becomes the name of the series derived from it
- The Labels property of fields is not used
Remainder Data:
- Any additional time fields after the first
- Any additional Frames without the type declaration or with a different declaration
Additional Properties or Considerations:
- In this format, the full dimension (e.g. "host"=value) is extracted from the values within a field, instead of being declared within the field's schema as in the other formats.
- Since dimensions are represented in fields that are present for all derived series, this format cannot hold mixed dimension keys, so all series will have the same set of dimension keys. For example, one could not have net.bytes{host="a"} and net.bytes{host="a",int="eth0"} together: the first would have to become net.bytes{host="a",int=""}
- It is unclear if a bool type Field should be considered a value field (e.g. an up/down metric) or a dimension (where it would be treated conceptually like labels)
Converting Between Time Series Formats
| Src | Dst | Modifies Data | Properties |
| --- | --- | --- | --- |
| Wide | Multi | No[^6] | |
| Multi | Wide | Yes | |
| Wide | Long[^7] | Yes | |
| Long | Wide | Yes[^8] | |
| Long | Multi | No | |
| Multi | Long | Yes | |
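For code that needs to decide whether a conversion is safe to apply automatically, the table above can be encoded as a lookup. This is only a sketch: `Format`, `conversion`, and `ModifiesData` are hypothetical names, and the caveats attached to individual cells by footnotes are not captured.

```go
package main

// Format names one of the three time series formats.
type Format string

const (
	Wide  Format = "TimeSeriesWide"
	Multi Format = "TimeSeriesMulti"
	Long  Format = "TimeSeriesLong"
)

// conversion is a (source, destination) pair used as a map key.
type conversion struct{ src, dst Format }

// modifiesData mirrors the "Modifies Data" column of the conversion table.
var modifiesData = map[conversion]bool{
	{Wide, Multi}: false,
	{Multi, Wide}: true,
	{Wide, Long}:  true,
	{Long, Wide}:  true,
	{Long, Multi}: false,
	{Multi, Long}: true,
}

// ModifiesData reports whether converting src to dst changes the data.
// ok is false for pairs the table does not list (including src == dst).
func ModifiesData(src, dst Format) (modifies, ok bool) {
	modifies, ok = modifiesData[conversion{src, dst}]
	return modifies, ok
}
```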
Notes
[^3]: This is because sorting is generally expensive in terms of resources, and in most cases is best done by the database behind a datasource.
[^4]: I don't believe our current SQL datasources strictly follow this, but some Azure ones do. This was due to miscommunication about the intent of this format and the upgrade to Grafana 8, a lack of understanding about breaking changes, or both.
[^5]: This transformation happens when queried with "Format As=Time Series". The problem with the transformation happening at this stage of the pipeline is that, while it does give the user time series for a common time-series-in-table format, the "Table View" of the data doesn't line up with what SQL returns from their query. TODO: Define this general concept later, maybe call it "What you see is NOT what you get", "Data Miscommunication", or something similar. This means we either need to return two things (sort of like exemplars?), or the operation should be moved, or something else.