API rate limits - Cognite Docs

Cognite Data Fusion (CDF) enforces rate and concurrency limits on API endpoints to ensure fair usage and system stability. Both the rate of requests (denoted as requests per second, or rps) and the number of concurrent (parallel) requests are governed by limits for all CDF API endpoints. If a request exceeds one of the limits, it will be throttled with a 429: Too Many Requests response.

For more on limit types and how to avoid being throttled, see Request throttling.

How request budgets work

Limits are defined at both the overall API service level and on the API endpoints belonging to the service. Some types of requests consume more resources (compute, storage IO) than others. CDF categorizes endpoints into three request budget types:

CRUD endpoints (Create, Retrieve, Request ByIDs, Update, and Delete) are less resource-intensive and receive the largest share of the overall budget.
Analytical endpoints (List, Search, and Filter) are more resource-intensive and receive a smaller budget.
Aggregation endpoints are the most resource-intensive and receive a dedicated sub-allocation within the analytical budget.

The CRUD and analytical budgets are sub-allocations of the overall API-level budget. The aggregation budget is a further sub-allocation of the analytical budget.
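The nesting described above can be sketched as a small tree. This is only an illustration: the `Budget` class and its `chain` method are hypothetical names, and it assumes that "sub-allocation" means a request counts against its own budget and every budget that encloses it.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Budget:
    """A request budget that may be nested inside a parent budget."""
    name: str
    parent: Optional["Budget"] = None

    def chain(self) -> List[str]:
        """Return this budget and every enclosing budget, innermost first."""
        names = []
        node: Optional["Budget"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


# Hierarchy as described in the text (assumed interpretation).
API = Budget("API level")
CRUD = Budget("CRUD", parent=API)
ANALYTICAL = Budget("Analytical", parent=API)
AGGREGATION = Budget("Aggregation", parent=ANALYTICAL)

print(AGGREGATION.chain())  # ['Aggregation', 'Analytical', 'API level']
print(CRUD.chain())         # ['CRUD', 'API level']
```

Under this reading, an aggregation request can be throttled by any of three limits, whereas a CRUD request is only subject to two.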

These limits are subject to change, pending review of changing consumption patterns and resource availability over time.

Rate limits by API

Assets

See Assets for the concept overview.

Budget level		Overall	Per identity
API level	Requests per second	200	150
	Concurrent requests	40	30
CRUD	Requests per second	170	130
	Concurrent requests	30	23
Analytical	Requests per second	30	23
	Concurrent requests	15	12
Aggregation	Requests per second	15	12
	Concurrent requests	7	5

Events

See Events for the concept overview.

Budget level		Overall	Per identity
API level	Requests per second	200	150
	Concurrent requests	50	40
CRUD	Requests per second	170	130
	Concurrent requests	40	30
Analytical	Requests per second	30	23
	Concurrent requests	15	12
Aggregation	Requests per second	15	12
	Concurrent requests	7	5

Files

See Files for the concept overview.

Budget level		Overall	Per identity
API level	Requests per second	160	120
	Concurrent requests	30	23
CRUD	Requests per second	130	100
	Concurrent requests	23	17
Analytical	Requests per second	30	23
	Concurrent requests	15	12
Aggregation	Requests per second	15	12
	Concurrent requests	7	5

Raw

See Raw for the concept overview. The Raw service uses concurrency and data rate limits rather than rps limits. Under high load, some deviation may occur for short periods as the service scales up.

Limit	Per project	Per identity
Concurrency	64 parallel requests	48 parallel requests
Data rate (retrieve)	8.3 GB / 10 minutes	6.6 GB / 10 minutes
Data rate (insert)	1.6 GB / 10 minutes	1.3 GB / 10 minutes
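To compare the Raw data rate limits with throughput figures quoted per second, the 10-minute windows can be converted to an average rate. The sketch below assumes decimal units (1 GB = 1000 MB) and says nothing about how the service enforces the window internally; `average_mb_per_second` is a hypothetical helper name.

```python
WINDOW_SECONDS = 10 * 60  # the limits are stated per 10 minutes


def average_mb_per_second(gb_per_window: float) -> float:
    """Average sustained rate implied by a GB-per-10-minutes limit.

    Assumes decimal units: 1 GB = 1000 MB.
    """
    return gb_per_window * 1000 / WINDOW_SECONDS


limits_gb = {
    "retrieve, per project": 8.3,
    "retrieve, per identity": 6.6,
    "insert, per project": 1.6,
    "insert, per identity": 1.3,
}

for label, gb in limits_gb.items():
    print(f"{label}: about {average_mb_per_second(gb):.1f} MB/s on average")
```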

Streams

See Streams for the concept overview. As streams are intended to be long-lived, users are not expected to interact with these endpoints frequently.

Operation		Overall	Per identity
Create / Delete	Requests per second	2	1
	Concurrent requests	1	1
Retrieve / List	Requests per second	5	3
	Concurrent requests	3	2

Records

See Records for the concept overview. Records limits vary between mutable and immutable streams. Limits for the query endpoints (Sync, Retrieve, Aggregate) have a hierarchical structure:

All query endpoints share a common Query request budget.
The Retrieve endpoint has an additional dedicated budget checked first.
The Aggregate endpoint has an additional dedicated budget checked first.

The Sync endpoint only checks the Query request budget. The Retrieve and Aggregate endpoints must pass both their dedicated budget check and the Query request budget check.

For example, with mutable streams, you can make up to 40 rps in total across all query endpoints (Query budget limit), but at most 20 of those can be Retrieve requests (Retrieve budget limit) and at most 15 can be Aggregate requests (Aggregate budget limit). This means you could make 20 Retrieve + 15 Aggregate + 5 Sync = 40 rps in total.
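The two-level check in this example can be written out as a small function. This is a sketch only: `fits_budgets` is a hypothetical helper that models the overall mutable-stream limits quoted above for a single second, not how the service itself evaluates requests.

```python
# Overall rps limits for mutable streams, taken from the example above.
MUTABLE_LIMITS_RPS = {"query": 40, "retrieve": 20, "aggregate": 15}


def fits_budgets(sync: int, retrieve: int, aggregate: int,
                 limits: dict = MUTABLE_LIMITS_RPS) -> bool:
    """Return True if a one-second request mix stays within every budget."""
    # Dedicated budgets are checked first for Retrieve and Aggregate.
    if retrieve > limits["retrieve"]:
        return False
    if aggregate > limits["aggregate"]:
        return False
    # Every query request, including Sync, counts against the shared budget.
    return sync + retrieve + aggregate <= limits["query"]


print(fits_budgets(sync=5, retrieve=20, aggregate=15))   # True
print(fits_budgets(sync=0, retrieve=21, aggregate=0))    # False
print(fits_budgets(sync=10, retrieve=20, aggregate=15))  # False
```

The second mix fails on the dedicated Retrieve budget even though it is far below 40 rps; the third passes both dedicated checks but exceeds the shared Query budget.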

Query performance and rate limits vary between mutable and immutable streams due to their different storage characteristics.

Mutable streams provide consistent high-performance queries and higher rate limits.
Immutable streams are optimized for ingesting very large amounts of data, resulting in lower query performance and stricter rate limits.

When designing your data access patterns, use mutable streams for high-performance queries with higher rate limits, or immutable streams for high-volume data ingestion and long-term storage.

The amount of data the service can return in responses to query endpoints (Sync, Retrieve, Aggregate) is also limited. We recommend reading only the data you will actually use, by providing appropriate filters and limiting the sources to be retrieved, rather than retrieving large amounts of unused data.

Records request budgets (detailed tables)

Budget		Overall	Per identity
Ingest	Requests per second	40	30
	Concurrent requests	20	15
Query (mutable)	Requests per second	40	30
	Concurrent requests	30	22
Query (immutable)	Requests per second	10	7
	Concurrent requests	10	7
Response MB per second	4	3

The Retrieve and Aggregate endpoints have dedicated budgets checked in addition to the Query request budget. A request to these endpoints must pass both budget checks.

Budget		Overall	Per identity
Retrieve (mutable)	Requests per second	20	15
	Concurrent requests	20	15
Retrieve (immutable)	Requests per second	10	7
	Concurrent requests	10	7
Aggregate (mutable)	Requests per second	15	12
	Concurrent requests	10	7
Aggregate (immutable)	Requests per second	5	4
	Concurrent requests	5	4

In summary:

The Sync endpoint is limited only by the Query request budget.
The Retrieve endpoint is limited by both the Retrieve request budget and the Query request budget.
The Aggregate endpoint is limited by both the Aggregate request budget and the Query request budget.

Transformations

See Transformations for the concept overview. Transformations use concurrency limits rather than rps limits:

10 parallel jobs per project

Simulators

See Simulators for the concept overview.

Limit	Value
Requests per minute	1000

For simulator-specific troubleshooting and best practices, see Simulator troubleshooting.

Translating rps into data speed

A single request can retrieve up to 1000 items. In the context of Assets, Events, or Files, 1 item = 1 record. The maximum theoretical data speed at the API level is:

API	Overall	Per identity
Assets / Events	200,000 items/sec	150,000 items/sec
Files	160,000 items/sec	120,000 items/sec

For analytical queries (List, Filter, Search), the per-identity budget of 23 rps gives a theoretical maximum of 23,000 items per second per identity.
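The figures above are simply rps multiplied by 1000 items per request. The sketch below reproduces them and, as a further illustration, estimates the shortest possible time to read a given number of items. Both function names are hypothetical, and the estimate assumes every request returns a full page and completes without throttling.

```python
import math

ITEMS_PER_REQUEST = 1000  # maximum items per request, from the text


def max_items_per_second(rps: int,
                         items_per_request: int = ITEMS_PER_REQUEST) -> int:
    """Theoretical ceiling on items per second for a given rps budget."""
    return rps * items_per_request


def min_seconds_to_read(total_items: int, rps: int,
                        items_per_request: int = ITEMS_PER_REQUEST) -> float:
    """Lower bound on read time, assuming full pages and no throttling."""
    requests_needed = math.ceil(total_items / items_per_request)
    return requests_needed / rps


print(max_items_per_second(200))  # 200000 (Assets / Events, overall)
print(max_items_per_second(160))  # 160000 (Files, overall)
print(max_items_per_second(23))   # 23000 (analytical, per identity)

# One million items through the per-identity analytical budget.
print(f"{min_seconds_to_read(1_000_000, rps=23):.1f} s")
```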

Throttling response

CDF returns an HTTP 429 Too Many Requests response in the following cases:

Rate limiting — you send too many requests in a given time period.
Concurrency limiting — you send too many concurrent (parallel) requests.
Conflicting requests — concurrent requests try to modify the same resource (for example, two requests updating the same asset simultaneously).

The response body explains the cause:

{
  "error": {
    "code": 429,
    "message": "Too Many Requests"
  }
}

All requests contribute to throttling equally regardless of their source (extractors, functions, transformations, UI, SDKs, etc.).

Throttled requests still consume some resources and may be counted toward your budget. For example, a request throttled due to concurrency may still consume rate capacity.

Conflicting requests

Requests are considered conflicting when they concurrently try to modify the same resource. For example, if two concurrent requests update the same asset, one of them may be throttled. To avoid conflicting requests:

Instead of replicating every modification from the source system (“change log”-style updates), replicate only the latest state.
Reduce the number of parallel workers sending write requests.
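One way to replicate only the latest state is to collapse a batch of changes to a single update per resource before writing. The sketch below assumes the changes arrive oldest first as dictionaries keyed by an identifier field; the `latest_state` helper and the use of `externalId` as the key are illustrative choices, not a prescribed approach.

```python
from typing import Any, Dict, Iterable, List


def latest_state(changes: Iterable[Dict[str, Any]],
                 key: str = "externalId") -> List[Dict[str, Any]]:
    """Collapse a change log (oldest first) to one merged update per resource."""
    merged: Dict[Any, Dict[str, Any]] = {}
    for change in changes:
        # Later changes overwrite earlier values for the same resource.
        merged.setdefault(change[key], {}).update(change)
    return list(merged.values())


change_log = [
    {"externalId": "pump-1", "name": "Pump"},
    {"externalId": "valve-7", "name": "Valve"},
    {"externalId": "pump-1", "description": "Main pump"},
    {"externalId": "pump-1", "name": "Pump A"},
]

for update in latest_state(change_log):
    print(update)
```

Four source changes become two write operations, so no two requests in the batch target the same resource.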

Best practices

Use exponential backoff

When you receive a 429 response, retry the request after an increasing delay. A truncated exponential backoff strategy works well:

Wait 1 second, then retry.
If still throttled, wait 2 seconds, then retry.
Continue doubling the wait time up to a maximum (for example, 32 seconds).
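The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: `send` is a hypothetical callable that performs one request and returns a `(status_code, body)` tuple, and the retry count of 8 is an arbitrary choice.

```python
import time
from typing import Any, Callable, Iterator, Tuple


def backoff_delays(base: float = 1.0, cap: float = 32.0,
                   max_retries: int = 8) -> Iterator[float]:
    """Yield 1, 2, 4, ... seconds, truncated at `cap`."""
    delay = base
    for _ in range(max_retries):
        yield min(delay, cap)
        delay *= 2


def call_with_backoff(send: Callable[[], Tuple[int, Any]],
                      sleep: Callable[[float], None] = time.sleep,
                      max_retries: int = 8) -> Tuple[int, Any]:
    """Call `send`, retrying with truncated exponential backoff on HTTP 429.

    Returns the last (status_code, body) pair, which may still be a 429
    if every retry was throttled.
    """
    status, body = send()
    for delay in backoff_delays(max_retries=max_retries):
        if status != 429:
            break
        sleep(delay)
        status, body = send()
    return status, body


print(list(backoff_delays(max_retries=7)))
# [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Many implementations also add a small random jitter to each delay so that several throttled clients do not retry at the same moment.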

For more on implementing backoff strategies, see Microsoft’s retry guidance and Google’s retry strategy.

Use parallel retrieval wisely

Use parallel retrieval when a single request flow is slow due to query complexity. By splitting requests across partitions, you can tune data retrieval performance to meet your application’s needs. Use at most 10 partitions. For example, using the Assets or Events API request budget:

A single request can retrieve up to 1000 items.
Up to 23 rps can be issued for analytical queries (per identity), such as /list or /filter.
This provides a theoretical maximum of 23,000 items per second per identity.
If query complexity causes a single request to take longer than 1 second, you can retrieve fewer items per request and use parallel requests up to the theoretical maximum.

Parallel retrieval does not act as a speed multiplier on optimally running queries. Regardless of the number of concurrent requests, the overall requests per second limit still applies. For example, a single request flow returning data at approximately 18,000 items per second will benefit only marginally from a second parallel flow: only an additional 5,000 items per second can be returned before the 23,000 items per second budget limit is reached.
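The arithmetic behind this can be sketched as a capped product. The function name is hypothetical, and the model assumes each parallel flow would sustain the same rate as a single flow until the per-identity analytical budget (23 rps at 1000 items per request) is reached.

```python
def effective_items_per_second(single_flow_items_per_sec: float,
                               parallel_flows: int,
                               budget_rps: int = 23,
                               items_per_request: int = 1000) -> float:
    """Throughput of N identical flows, capped by the rps budget."""
    ceiling = budget_rps * items_per_request
    return min(single_flow_items_per_sec * parallel_flows, ceiling)


# A fast query gains little from a second flow.
one = effective_items_per_second(18_000, parallel_flows=1)
two = effective_items_per_second(18_000, parallel_flows=2)
print(one, two, two - one)  # 18000 23000 5000

# A slow query (4,000 items/s per flow) scales until it meets the ceiling.
for flows in (1, 2, 5, 10):
    print(flows, effective_items_per_second(4_000, flows))
```

In the slow-query case, five flows reach 20,000 items per second, and anything beyond six flows adds nothing because the budget ceiling of 23,000 has already been hit.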

Monitor your usage

Log API call timestamps and response codes to track request patterns.
Set up alerts for 429 responses to detect when you approach limits.
Distribute requests over time rather than making burst requests.
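A minimal sketch of the first two points: record each call's timestamp and status code, and raise a flag when the share of 429 responses in a recent window passes a threshold. The `ThrottleMonitor` class, the 60-second window, and the 5% threshold are all illustrative assumptions; in practice this would usually feed an existing logging or alerting system.

```python
from collections import deque
from typing import Deque, Tuple


class ThrottleMonitor:
    """Tracks the share of HTTP 429 responses over a sliding time window."""

    def __init__(self, window_seconds: float = 60.0,
                 alert_ratio: float = 0.05) -> None:
        self.window_seconds = window_seconds
        self.alert_ratio = alert_ratio
        self.calls: Deque[Tuple[float, int]] = deque()  # (timestamp, status)

    def record(self, timestamp: float, status_code: int) -> None:
        """Log one call and drop entries older than the window."""
        self.calls.append((timestamp, status_code))
        cutoff = timestamp - self.window_seconds
        while self.calls and self.calls[0][0] < cutoff:
            self.calls.popleft()

    def throttled_ratio(self) -> float:
        """Fraction of calls in the window that were throttled."""
        if not self.calls:
            return 0.0
        throttled = sum(1 for _, status in self.calls if status == 429)
        return throttled / len(self.calls)

    def should_alert(self) -> bool:
        return self.throttled_ratio() > self.alert_ratio


monitor = ThrottleMonitor()
for second in range(19):
    monitor.record(float(second), 200)
monitor.record(19.0, 429)
monitor.record(20.0, 429)
print(f"{monitor.throttled_ratio():.3f}", monitor.should_alert())
```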

Use unique client credentials per application

Do not share client IDs and secrets across multiple applications, even if they have common authentication requirements. Shared credentials cause applications to compete for the same per-identity budget, and can also cause issues with audit logs.

Request only what you need

Reduce the data returned per request to stay within response size limits:

Use filters to narrow results.
Limit the sources or properties you retrieve.
Use appropriate page sizes (the limit parameter) for pagination.
