Time series in data modeling
Cognite's core data model has a built-in CogniteTimeSeries concept for ingesting and retrieving time series. CogniteTimeSeries integrates with the time series API, and you use that API to ingest and retrieve data points.
This article describes how time series are handled differently in data modeling and in asset-centric data models.
Create time series
In data modeling, each time series is a node in the property graph. You can create, update, and delete time series using the same API as other instances. The instanceType is node, and the combination of space and externalId identifies the time series and links it to the data points.
Time series must be of a specific type: numeric or string. You set the type when creating the time series. The type affects how the data points are stored, and to prevent potential data loss, you can't change it later. The isStep property determines how to interpolate between data points, and you can change this property later. Data modeling doesn't support all the time series properties of the asset-centric data model.
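To illustrate what the isStep property means for interpolation, here is a generic sketch (not Cognite-specific code) of the value between two adjacent data points under each setting:

```python
def value_at(t, t0, v0, t1, v1, is_step):
    """Interpolated value at time t between data points (t0, v0) and (t1, v1)."""
    if is_step:
        # Step interpolation: the previous value holds until the next data point.
        return v0
    # Linear interpolation between the two data points.
    return v0 + (t - t0) / (t1 - t0) * (v1 - v0)
```

With is_step=True, a reading of 0.0 at t=0 stays 0.0 until the next data point arrives; with is_step=False, the value is interpolated linearly between the two points.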
Create time series with the API
To create time series in data modeling, use the create or update nodes/edges endpoint. The node must have data in the cdf_cdm:CogniteTimeSeries/v1 system view to be recognized as a time series. The type is the only mandatory property.
Example request
{
  "items": [
    {
      "space": "test",
      "externalId": "minimal_example",
      "instanceType": "node",
      "sources": [
        {
          "source": {
            "type": "view",
            "space": "cdf_cdm",
            "externalId": "CogniteTimeSeries",
            "version": "v1"
          },
          "properties": {
            "name": "Time series A",
            "type": "numeric"
          }
        }
      ]
    }
  ]
}
Example request with all properties
{
  "items": [
    {
      "space": "test",
      "externalId": "hello",
      "instanceType": "node",
      "sources": [
        {
          "source": {
            "type": "view",
            "space": "cdf_cdm",
            "externalId": "CogniteTimeSeries",
            "version": "v1"
          },
          "properties": {
            "type": "numeric",
            "isStep": true,
            "unit": {
              "space": "cdf_cdm_units",
              "externalId": "temperature:deg_c"
            },
            "sourceUnit": "C",
            "name": "Hello, world!",
            "description": "My new description",
            "tags": ["TagA", "TagB"],
            "aliases": ["hello_world"],
            "assets": [],
            "equipment": [],
            "sourceId": "examples",
            "sourceContext": "documentation",
            "sourceCreatedTime": "2024-08-28T13:16:25.228Z",
            "sourceUpdatedTime": "2024-08-28T13:20:25.228Z",
            "sourceCreatedUser": "John Doe",
            "sourceUpdatedUser": "Jane Doe"
          }
        }
      ]
    }
  ]
}
Create time series with the Python SDK
from cognite.client import CogniteClient
from cognite.client.data_classes.data_modeling import DirectRelationReference, NodeId
from cognite.client.data_classes.data_modeling.cdm.v1 import CogniteTimeSeriesApply

# Instantiate Cognite SDK client:
client = CogniteClient()

# Insert a new CogniteTimeSeries
client.data_modeling.instances.apply(
    CogniteTimeSeriesApply(
        space="test",
        external_id="hello",
        name="Hello, world!",
        is_step=True,
        time_series_type="numeric",
        unit=DirectRelationReference("cdf_cdm_units", "temperature:deg_c"),
    )
)

# Insert data points
client.time_series.data.insert(
    instance_id=NodeId("test", "hello"),
    datapoints=[
        (1724845953621, 0.0),
        (1724845970101, 1.0),
    ],
)

# Retrieve data points
client.time_series.data.retrieve(
    instance_id=NodeId("test", "hello"),
    start=0,
)
For more examples using the Python SDK, see the cognite-sdk documentation.
Access control
Data modeling uses spaces for governance and access control, not data sets like asset-centric data models. You need the dataModels:read capability for the cdf_cdm space to access time series. You also need dataModelInstances access to the space that contains the time series. You have the same access to data points as you have to the time series: if you have read/write access to a time series, you can also read/write its data points.
CogniteTimeSeries instances replicate to the asset-centric data model, where they can be accessed and modified, but not deleted. The properties you set through data modeling are protected and must be updated through data modeling.
If you modify dataSetId or assetId through the asset-centric data model, the time series and data points may become accessible to users without data modeling access. Users with access to all time series will also have access to time series in data modeling.
A time series with a write-protected data set denies write access to its data points unless the user has datasets:owner access to the data set. Properties set by data modeling aren't write-protected.
Data modeling doesn't support security categories.
Migrating time series from asset-centric data models to data modeling
Migrating time series from asset-centric data models to data modeling requires careful planning. If you currently rely on data sets or security categories to control access to time series, you must set up equivalent access control in data modeling.
Ensure you don't grant users unintended access, including access to delete time series and data points. Users with access to a space in data modeling get access to all time series and data points in the space. They can modify and delete time series and data points if they have write access.
Consistency
When a core data model time series is created, its metadata is first stored in the data model storage API, and the data points themselves are stored in the time series database. The metadata is then synced from data model storage into the 'classic' time series API. In most circumstances the data is immediately consistent: as soon as data points are ingested, they're available to query. However, because of the distributed nature of the storage, there are scenarios where the data is only eventually consistent (changes take time to propagate through the system), particularly when a large volume of changes is written at the same time.
Account for this eventually consistent behaviour when designing client behaviour.
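One way to account for eventual consistency is a read retry with backoff. This is a generic sketch; read_fn stands in for whatever retrieval call your client makes:

```python
import time

def read_until_visible(read_fn, expected_count, attempts=5, base_delay_s=0.5):
    """Retry a read until it reflects a recent write, or attempts run out.

    read_fn: callable returning the list of data points currently visible
    (a stand-in for your actual data point retrieval call).
    """
    delay = base_delay_s
    points = read_fn()
    for _ in range(attempts - 1):
        if len(points) >= expected_count:
            break
        time.sleep(delay)
        delay *= 2  # exponential backoff between retries
        points = read_fn()
    return points
```

The expected count comes from the write you just performed; if the read still falls short after all attempts, the function returns whatever was visible so the caller can decide how to proceed.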
Search consistency
When you use CDF Data Explorer or the 'classic' time series API, the search database needs time to update before search results reflect the current state. Allow sufficient time (up to a few minutes) for changes to propagate.
Deleting a Core Data Model Time Series instance
When you delete a core data model time series instance, the instance itself is deleted first, and after some time (typically within seconds, but potentially over ten minutes at busy times) the corresponding time series API instance is deleted.
If data points continue to be written to the time series API after the data model instance has been deleted, the writes may appear to succeed (write requests receive a 200 OK response) until the data points storage location has been deleted. After that, all the data is deleted.
To avoid critical data loss, always ensure that data points are being written to their intended location before commencing deletion of time series core data model instances.
This deletion behaviour can also be problematic when you need to delete a time series instance and then recreate it. If the recreated time series uses the same space and externalId as the deleted instance, and the process of deleting the data points storage location hasn't completed, data points written to the new instance may be routed to the storage location that is marked for deletion. Those newly written data points will then be missing from the new instance because they were deleted as part of the previous deletion job.
Therefore, if you need to delete a time series instance and recreate it right away, Cognite recommends implementing a verification step to ensure that the removal of the old instance has fully completed and all internal caches are cleared.
For example, periodically send a POST request to the time series API /byids endpoint for the specific space and external ID you have deleted, with ignoreUnknownIds set to true. When the API returns an empty response, the instance is confirmed to be deleted and it's safe to recreate it. For more details on the /byids endpoint, see the time series API documentation.

Not implementing this pause before recreating a time series risks inserting data points into the soon-to-be-deleted instance; the newly inserted data points will not be stored.
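The verification step can be sketched as a polling loop. Here fetch_by_ids is a hypothetical wrapper around the POST /byids request (with ignoreUnknownIds set to true) for the deleted space and external ID:

```python
import time

def wait_until_deleted(fetch_by_ids, max_attempts=30, delay_s=10.0):
    """Poll until /byids no longer returns the deleted instance.

    fetch_by_ids: callable returning the list of items from a POST /byids
    request with ignoreUnknownIds: true (hypothetical HTTP wrapper).
    Returns True when the response is empty, False if attempts run out.
    """
    for _ in range(max_attempts):
        if not fetch_by_ids():
            return True  # empty response: safe to recreate the instance
        time.sleep(delay_s)
    return False
```

Only recreate the instance when this returns True; if it returns False, the deletion is still propagating and writing to the recreated instance would risk data loss.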
This information only applies to time series core data model instances, not to time series created directly through the 'classic' time series API.