Availability and business continuity

Service availability

We measure availability and performance metrics on specific tenants or CDF clusters per month, excluding downtime for maintenance. Availability refers to the fraction of valid requests that return successful responses during a calendar month.

Response codes and requests are aggregated over a calendar month, but traffic during maintenance windows is excluded from the calculation. We also adjust the calculation to account for downtime that results in zero traffic.

Maintenance and updates

We continuously release software updates and perform maintenance to keep services running smoothly. Updates are part of the continuous platform operation and don't require announcements or approvals.

Some services might be unavailable during maintenance windows.

We announce maintenance at least two weeks in advance and post notifications in the appropriate channels. We will work with you to agree on the best time for maintenance to minimize the impact on your production environments.

Because many clusters are multi-tenant, there is a chance that we can't accommodate all requests. In those cases, we must reserve the right to complete maintenance at the originally scheduled time.

Application availability

We measure application availability by the fraction of 1-minute intervals during a calendar month where:

  • less than 5% of the end-user page views are served with errors.
  • at least 90% of page loads complete in less than 20 seconds.

The error rate is measured across CDF projects and from the server that runs the application.

Load time is measured on specific bookmarks for important pages in the application.

Calculations exclude maintenance windows.
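The two criteria above can be sketched as a per-minute check. This is only an illustration, not Cognite's actual measurement code: the `MinuteSample` type and its field names are hypothetical, and treating a minute with no page views as available is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinuteSample:
    """Hypothetical aggregate for one 1-minute interval."""
    page_views: int
    error_views: int
    load_times_s: List[float]
    in_maintenance: bool = False


def minute_is_available(sample: MinuteSample) -> bool:
    # Criterion 1: less than 5% of page views are served with errors.
    if sample.page_views > 0 and sample.error_views / sample.page_views >= 0.05:
        return False
    # Criterion 2: at least 90% of load times are under 20 seconds.
    if sample.load_times_s:
        fast = sum(1 for t in sample.load_times_s if t < 20.0)
        if fast / len(sample.load_times_s) < 0.90:
            return False
    # Assumption: a minute without traffic counts as available.
    return True


def application_availability(samples: List[MinuteSample]) -> float:
    # Maintenance windows are excluded from the calculation.
    counted = [s for s in samples if not s.in_maintenance]
    if not counted:
        return 1.0
    return sum(minute_is_available(s) for s in counted) / len(counted)
```

An interval counts as available only when both criteria hold, so a minute with fast pages but 10% errors is unavailable.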

API availability

We measure API availability by the fraction of valid requests that result in a successful response during a calendar month. Traffic that results in response code 429 is ignored. Periods when the system is down for maintenance are excluded from the calculation.

To calculate availability, we subtract the fraction of the month that has unplanned downtime from the success rate outside the planned maintenance windows. We don't consider the time and traffic during the maintenance windows.

Down minutes refers to the number of minutes during a month with zero traffic on the API gateway.
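As a rough sketch, the calculation described above could look like the following. The input shape (one `(in_maintenance, status_counts)` pair per minute) is hypothetical, and two details are assumptions rather than documented rules: any non-5xx response is treated as successful, and the downtime fraction is taken over the minutes outside maintenance windows.

```python
from typing import Dict, List, Tuple


def is_success(status: int) -> bool:
    # Assumption: any response below 500 counts as successful.
    return status < 500


def api_availability(minutes: List[Tuple[bool, Dict[int, int]]]) -> float:
    """Each entry is (in_maintenance, {status_code: request_count})."""
    ok = total = down_minutes = counted_minutes = 0
    for in_maintenance, status_counts in minutes:
        if in_maintenance:
            continue  # time and traffic in maintenance windows are not considered
        counted_minutes += 1
        if sum(status_counts.values()) == 0:
            down_minutes += 1  # zero traffic on the gateway: a down minute
            continue
        for status, count in status_counts.items():
            if status == 429:
                continue  # responses with code 429 are ignored
            total += count
            if is_success(status):
                ok += count
    if counted_minutes == 0:
        return 1.0
    success_rate = ok / total if total else 1.0
    # Subtract the fraction of unplanned downtime from the success rate.
    return max(0.0, success_rate - down_minutes / counted_minutes)
```

For example, a 95% success rate combined with one down minute out of three counted minutes gives 0.95 minus 1/3.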

Extractors

To see which extractors we support, check the list of extractors in the CDF user interface. Extractors run on third-party operating systems and networks in your company environments.

You are responsible for downloading and deploying maintenance releases based on assessments communicated in CDF release notes or advice from Cognite Support.

When we deprecate an API, be aware that old extractor versions that rely on it are deprecated too. We offer new versions and updated documentation of components interfacing with CDF APIs in a reasonable time before the deprecation of an API version. In those cases, you must install and use the current versions and releases of the extractors.

Backup and restore

We ensure that data for all resource types is backed up. You can read about resource types in this article. The backup frequency and retention are listed below. Backups are encrypted at rest and stored in geo-redundant storage services provided by our cloud providers. In some cases, the backups are handled by our cloud providers, and Cognite will, as a minimum, always use a service that stores backups in two different data centers. Cognite's backups are stored in at least two different availability zones (Azure) or two regions (Google) when available.

The following tables show the backup frequency and retention for each resource type and the worst-case estimated time to restore the data. These tables only apply to restore scenario A described under Request data restore.

Multi-tenant cluster backup and restore

| Resource type | Data type | Backup frequency | Retention | Restore time |
|---|---|---|---|---|
| Authentication | All | Daily | 6 weeks* | 1 business day |
| Assets | All | Daily | 6 weeks* | 5 business days |
| Data modeling** | All | Daily | 6 weeks* | 5 business days |
| Events | All | Daily | 6 weeks* | 5 business days |
| Files | Metadata | Daily | 6 weeks* | 5 business days |
| Files | File content | Daily | 6 weeks* | 5 business days |
| Time series and sequences | All | Weekly | 6 weeks | 5 business days |
| 3D models | Model data | Daily | 6 weeks* | 5 business days |
| Application user settings (InField) | User-made configurations, lists, charts, and settings | Daily | 28 days | 1 business day |
| Audit logs | Logs | Continuous | 400 days | 5 business days |
| Relationships | All | Daily | 6 weeks* | 5 business days |
| Annotations | All | Daily | 6 weeks* | 5 business days |
| Labels | All | Daily | 6 weeks* | 5 business days |
| Entity matching and engineering diagrams | All | Daily | 6 weeks* | 5 business days |
| Extpipes | All | Daily | 6 weeks* | 5 business days |
| Functions | All | Daily | 6 weeks* | 5 business days |
| Templates | All | Daily | 6 weeks* | 5 business days |
| Geospatial | All | Daily | 6 weeks* | 5 business days |

*Backup and retention time for these resource types is 35 days in Azure-based CDF clusters.
**Restore is only available for scenario A (see below).

Single-tenant cluster backup and restore

The following table describes the backup frequency, retention, and restore time targets valid for customers on a dedicated single-tenant CDF cluster.

| Resource type | Data type | Backup frequency | Retention | Restore time |
|---|---|---|---|---|
| Authentication | All | Daily | 6 weeks* | 1 business day |
| Assets | All | Daily | 6 weeks* | 1 business day |
| Data modeling | All | Daily | 6 weeks* | 1 business day |
| Events | All | Daily | 6 weeks* | 1 business day |
| Files | Metadata | Daily | 6 weeks* | 1 business day |
| Files | File content | Daily | 6 weeks* | 1 business day |
| Time series and sequences | All | Weekly | 6 weeks | 1 business day |
| 3D models | Model data | Daily | 6 weeks* | 1 business day |
| Application user settings (InField) | User-made configurations, lists, charts, and settings | Daily | 28 days | 1 business day |
| Audit logs | Logs | Continuous | 400 days | 1 business day |
| Relationships | All | Daily | 6 weeks* | 1 business day |
| Annotations | All | Daily | 6 weeks* | 1 business day |
| Labels | All | Daily | 6 weeks* | 1 business day |
| Entity matching and engineering diagrams | All | Daily | 6 weeks* | 1 business day |
| Extpipes | All | Daily | 6 weeks* | 1 business day |
| Functions | All | Daily | 6 weeks* | 1 business day |
| Templates | All | Daily | 6 weeks* | 1 business day |
| Geospatial | All | Daily | 6 weeks* | 1 business day |

*Backup and retention time for these resource types is 35 days in Azure-based CDF clusters.
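Before requesting a restore, it can help to check whether the point in time you want still falls inside the retention window. The sketch below encodes the retention values from the tables above. The resource type keys and function names are hypothetical, and passing the check does not guarantee that a usable backup exists for that exact moment.

```python
from datetime import datetime, timedelta, timezone

DEFAULT_RETENTION_DAYS = 42  # "6 weeks*" in the tables
AZURE_RETENTION_DAYS = 35    # footnote: Azure-based CDF clusters

# Resource types without the asterisk keep the same retention on every cluster.
# The keys are illustrative names, not CDF identifiers.
FIXED_RETENTION_DAYS = {
    "time_series_and_sequences": 42,
    "application_user_settings": 28,
    "audit_logs": 400,
}


def retention_days(resource_type: str, azure: bool = False) -> int:
    if resource_type in FIXED_RETENTION_DAYS:
        return FIXED_RETENTION_DAYS[resource_type]
    return AZURE_RETENTION_DAYS if azure else DEFAULT_RETENTION_DAYS


def within_retention(resource_type: str, recovery_point: datetime,
                     now: datetime, azure: bool = False) -> bool:
    # True when the recovery point is in the past and no older than the retention.
    age = now - recovery_point
    limit = timedelta(days=retention_days(resource_type, azure))
    return timedelta(0) <= age <= limit
```

With these values, a 40-day-old recovery point for assets is inside the window on a non-Azure cluster but outside it on an Azure-based one.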

Request data restore

There are two scenarios for restoring data:

  • Scenario A - Incident: Disaster recovery after an incident with a root cause outside your control.

  • Scenario B - Request: Backup/restore after a user mistake that has corrupted data.

The party responsible for recovery depends on the incident's root cause.

We can manage a request for restoring data in scenario B when the responsible party covers the time and material costs (labor costs and service fees) for the data restore work. Data restore in scenario B only involves restoring data stores to a state from a time before the incident that triggered the request. Data can only be restored to the original project. We don't offer cross-project restore, and Cognite won't offer additional processing of data.

To request data restore

  1. Submit a request to Cognite Support asking for data recovery.

    The request must contain:

    1. A detailed description of what data needs to be restored, specified as a list of the resource types to restore.
    2. The Recovery Point Target Time: the point in time when the data was deleted or corrupted. Cognite will restore data to a point in time as close as possible before the specified Recovery Point Target Time.
    3. The URL name and project ID of the CDF project to restore data to.
    4. An approval allowing Cognite to block access to the CDF project during the restore operation.
    5. An approval allowing Cognite to delete all data that's newer than the Recovery Point Target Time, that is, data of the resource types listed in item 1 that you have updated or created after the time specified in item 2.
  2. You receive a reply with a time and material estimate and a plan for how the restore will happen.

  3. We start the data restore when the cost estimate and the plan are accepted.

Note

We can't promise that this data restore will happen as quickly as the restore times indicated in the overview tables, which only apply to scenario A.
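To avoid back-and-forth with support, you could check a restore request for completeness before submitting it. The following is a minimal sketch of such a checklist. `RestoreRequest` and its fields are hypothetical names that mirror the items the request must contain. It is not a Cognite API, and representing the project ID as a string is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class RestoreRequest:
    """Hypothetical checklist for a scenario B restore request."""
    resource_types: List[str]
    recovery_point_target_time: Optional[datetime]
    project_url_name: str
    project_id: str
    approve_block_access: bool = False
    approve_delete_newer_data: bool = False

    def missing_items(self) -> List[str]:
        # Collect every required item that is absent from the request.
        missing = []
        if not self.resource_types:
            missing.append("resource types to restore")
        if self.recovery_point_target_time is None:
            missing.append("Recovery Point Target Time")
        if not self.project_url_name or not self.project_id:
            missing.append("URL name and project ID")
        if not self.approve_block_access:
            missing.append("approval to block project access during restore")
        if not self.approve_delete_newer_data:
            missing.append("approval to delete data newer than the target time")
        return missing
```

An empty list from `missing_items()` means every required item is present. It says nothing about whether the request will be accepted or how long the restore will take.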