Rate and concurrency limits for Cognite Data Fusion (CDF) API endpoints.
Cognite Data Fusion (CDF) enforces rate and concurrency limits on API endpoints to ensure fair usage and system stability. Both the request rate (requests per second, or rps) and the number of concurrent (parallel) requests are limited for all CDF API endpoints. A request that exceeds one of these limits is throttled with a 429 Too Many Requests response.
For more on limit types and how to avoid being throttled, see Request throttling.
Limits are defined at both the overall API service level and on the API endpoints belonging to the service. Some types of requests consume more resources (compute, storage IO) than others.
CDF categorizes endpoints into three request budget types:
CRUD endpoints (Create, Retrieve, Request ByIDs, Update, and Delete) are less resource-intensive and receive the largest share of the overall budget.
Analytical endpoints (List, Search, and Filter) are more resource-intensive and receive a smaller budget.
Aggregation endpoints are the most resource-intensive and receive a dedicated sub-allocation within the analytical budget.
The CRUD and analytical budgets are sub-allocations of the overall API-level budget. The aggregation budget is a further sub-allocation of the analytical budget.
These limits are subject to change, pending review of changing consumption patterns and resource availability over time.
See Raw for the concept overview.
The Raw service uses concurrency and data rate limits rather than rps limits. Under high load, some deviation may occur for short periods as the service scales up.
See Records for the concept overview.
Records limits vary between mutable and immutable streams. Query endpoints (Sync, Retrieve, Aggregate) share a common query budget, while Retrieve and Aggregate each have additional dedicated budgets.
The limits for these query endpoints have a hierarchical structure:
All query endpoints share a common Query request budget.
The Retrieve endpoint has an additional dedicated budget checked first.
The Aggregate endpoint has an additional dedicated budget checked first.
The Sync endpoint only checks the Query request budget. The Retrieve and Aggregate endpoints must pass both their dedicated budget check and the Query request budget check.
For example, with mutable streams, you can make up to 40 rps total across all query endpoints (Query budget limit), but only up to 20 of those can be Retrieve requests (Retrieve budget limit) and only up to 15 can be Aggregate requests (Aggregate budget limit). This means you could make 20 Retrieve + 15 Aggregate + 5 Sync = 40 rps in total.
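The two-level check in this example can be sketched in a few lines of Python. This is only an illustration of the arithmetic, using the mutable-stream figures quoted above; the names `MUTABLE_RPS` and `within_budget` are hypothetical, and the sketch does not describe how the service itself enforces the limits.

```python
# Per-second request limits for mutable streams, taken from the example above.
MUTABLE_RPS = {"query": 40, "retrieve": 20, "aggregate": 15}


def within_budget(sync: int, retrieve: int, aggregate: int) -> bool:
    """Return True if a one-second mix of requests fits every budget.

    Retrieve and Aggregate are checked against their dedicated budgets first,
    then all three endpoints are checked against the shared Query budget.
    """
    if retrieve > MUTABLE_RPS["retrieve"]:
        return False
    if aggregate > MUTABLE_RPS["aggregate"]:
        return False
    return sync + retrieve + aggregate <= MUTABLE_RPS["query"]


print(within_budget(sync=5, retrieve=20, aggregate=15))   # True: 40 in total
print(within_budget(sync=0, retrieve=21, aggregate=0))    # False: Retrieve budget exceeded
print(within_budget(sync=10, retrieve=20, aggregate=15))  # False: Query budget exceeded
```

Note that the dedicated budgets are ceilings, not reservations: 40 Sync requests alone would also fit the Query budget in this model.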
Query performance and rate limits vary between mutable and immutable streams due to their different storage characteristics.
Mutable streams provide consistent high-performance queries and higher rate limits.
Immutable streams are optimized for ingesting very large amounts of data, resulting in lower query performance and stricter rate limits.
When designing your data access patterns, use mutable streams for high-performance queries with higher rate limits, or immutable streams for high-volume data ingestion and long-term storage.
The amount of data the service can return in responses to query endpoints (Sync, Retrieve, Aggregate) is also limited. We recommend reading only the data you will actually use (by providing appropriate filters and/or limiting the sources to be retrieved) rather than retrieving excessive amounts of unused data.
Records request budgets (detailed tables)
| Budget | Limit | Overall | Per identity |
|---|---|---|---|
| Ingest | Requests per second | 40 | 30 |
| | Concurrent requests | 20 | 15 |
| Query (mutable) | Requests per second | 40 | 30 |
| | Concurrent requests | 30 | 22 |
| Query (immutable) | Requests per second | 10 | 7 |
| | Concurrent requests | 10 | 7 |
| | Response MB per second | 4 | 3 |
The Retrieve and Aggregate endpoints have dedicated budgets checked in addition to the Query request budget. A request to these endpoints must pass both budget checks.
| Budget | Limit | Overall | Per identity |
|---|---|---|---|
| Retrieve (mutable) | Requests per second | 20 | 15 |
| | Concurrent requests | 20 | 15 |
| Retrieve (immutable) | Requests per second | 10 | 7 |
| | Concurrent requests | 10 | 7 |
| Aggregate (mutable) | Requests per second | 15 | 12 |
| | Concurrent requests | 10 | 7 |
| Aggregate (immutable) | Requests per second | 5 | 4 |
| | Concurrent requests | 5 | 4 |
Retrieve and Aggregate endpoint requests are checked against both their dedicated budget and the Query request budget.
Summary: The Sync endpoint is limited only by the Query request budget. The Retrieve endpoint is limited by both the Retrieve request budget and the Query request budget. The Aggregate endpoint is limited by both the Aggregate request budget and the Query request budget.
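As a rough aid for capacity planning, the budgets that apply to a given endpoint can be looked up programmatically. The sketch below hard-codes the requests-per-second figures from the tables above; the names `RPS_LIMITS` and `budgets_for` are hypothetical, and the figures are subject to change like all limits on this page.

```python
# Requests-per-second limits from the tables above: (overall, per identity).
RPS_LIMITS = {
    ("query", "mutable"): (40, 30),
    ("query", "immutable"): (10, 7),
    ("retrieve", "mutable"): (20, 15),
    ("retrieve", "immutable"): (10, 7),
    ("aggregate", "mutable"): (15, 12),
    ("aggregate", "immutable"): (5, 4),
}


def budgets_for(endpoint: str, stream: str) -> list:
    """List the budgets a request must pass, dedicated budget first.

    Each entry is (budget name, overall rps, per-identity rps).
    """
    if endpoint not in ("sync", "retrieve", "aggregate"):
        raise ValueError(f"unknown endpoint: {endpoint}")
    # Sync only checks the shared Query budget; the others check two budgets.
    names = ["query"] if endpoint == "sync" else [endpoint, "query"]
    return [(name, *RPS_LIMITS[(name, stream)]) for name in names]


print(budgets_for("sync", "mutable"))         # [('query', 40, 30)]
print(budgets_for("aggregate", "immutable"))  # [('aggregate', 5, 4), ('query', 10, 7)]
```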
A single request can retrieve up to 1000 items. In the context of Assets, Events, or Files, 1 item = 1 record.
The maximum theoretical data speed at the API level is:
| API | Overall | Per identity |
|---|---|---|
| Assets / Events | 200,000 items/sec | 150,000 items/sec |
| Files | 160,000 items/sec | 120,000 items/sec |
For analytical queries (List, Filter, Search), the per-identity budget of 23 rps gives a theoretical maximum of 23,000 items per second per identity.
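The theoretical maximum is simply the request rate multiplied by the maximum page size. A minimal sketch of that arithmetic, assuming every request returns a full 1000 items (the function name is hypothetical):

```python
ITEMS_PER_REQUEST = 1000  # maximum items a single request can retrieve


def max_items_per_second(rps: int, items_per_request: int = ITEMS_PER_REQUEST) -> int:
    """Upper bound on items per second when every request returns a full page."""
    return rps * items_per_request


print(max_items_per_second(23))  # 23000
```

In practice, requests that return fewer than 1000 items or take longer than one second will yield less than this bound.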
CDF returns an HTTP 429 Too Many Requests response in the following cases:
Rate limiting — you send too many requests in a given time period.
Concurrency limiting — you send too many concurrent (parallel) requests.
Conflicting requests — concurrent requests try to modify the same resource (for example, two requests updating the same asset simultaneously).
The response body explains the cause:
{ "error": { "code": 429, "message": "Too Many Requests" }}
All requests contribute to throttling equally regardless of their source (extractors, functions, transformations, UI, SDKs, etc.).
Throttled requests still consume some resources and may be counted toward your budget. For example, a request throttled due to concurrency may still consume rate capacity.
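Because throttled requests can still count toward your budget, retrying immediately tends to make throttling worse. A common client-side pattern, shown here as a sketch rather than anything prescribed by CDF, is to retry 429 responses with exponential backoff and jitter. The `send` callable is a hypothetical stand-in for whatever issues the request and returns a `(status_code, body)` pair; the default delays are arbitrary assumptions.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff with full jitter, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))
    # Still throttled after the final attempt: hand the 429 back to the caller.
    return status, body


# Demo with canned responses and a no-op sleep so it runs instantly.
responses = iter([(429, "Too Many Requests"), (429, "Too Many Requests"), (200, "ok")])
print(call_with_backoff(lambda: next(responses), sleep=lambda seconds: None))  # (200, 'ok')
```

The jitter spreads retries from parallel workers over time so that they do not all hit the budget again at the same instant.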
Requests are considered conflicting when they concurrently try to modify the same resource. For example, if two concurrent requests update the same asset, one of them may be throttled.
To avoid conflicting requests:
Instead of replicating every modification from the source system (“change log”-style updates), replicate only the latest state.
Reduce the number of parallel workers sending write requests.
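The first recommendation can be illustrated by collapsing a change log into one update per resource before writing. The sketch below assumes a hypothetical change-log entry of the form `(resource_id, sequence_number, fields_to_set)`; real source systems will have their own shape, and the function name is illustrative only.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical change-log entry: (resource_id, sequence_number, fields_to_set).
Change = Tuple[str, int, Dict[str, object]]


def latest_state(changes: Iterable[Change]) -> List[Tuple[str, Dict[str, object]]]:
    """Merge a change log into one update per resource, keeping the newest values."""
    merged: Dict[str, Dict[str, object]] = {}
    # Apply changes in sequence order so later values overwrite earlier ones.
    for resource_id, _seq, fields in sorted(changes, key=lambda change: change[1]):
        merged.setdefault(resource_id, {}).update(fields)
    return list(merged.items())


changes = [
    ("pump-1", 1, {"name": "Pump"}),
    ("valve-7", 2, {"name": "Valve"}),
    ("pump-1", 3, {"name": "Pump A", "description": "Main"}),
    ("pump-1", 4, {"description": "Main feed"}),
]
updates = latest_state(changes)
print(len(changes), "->", len(updates))  # 4 -> 2
```

Sending the two merged updates instead of four individual ones means that no two write requests in the batch target the same resource.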
Use parallel retrieval when a single request flow is slow due to query complexity. By splitting requests across partitions, you can tune data retrieval performance to meet your application’s needs. Use at most 10 partitions.
For example, using the Assets or Events API request budget:
A single request can retrieve up to 1000 items.
Up to 23 rps can be issued for analytical queries (per identity), such as /list or /filter.
This provides a theoretical maximum of 23,000 items per second per identity.
If query complexity causes a single request to take longer than 1 second, you can retrieve fewer items per request and use parallel requests up to the theoretical maximum.
Parallel retrieval does not act as a speed multiplier on optimally running queries. Regardless of the number of concurrent requests, the overall requests per second limit still applies.
For example, a single request flow returning data at approximately 18,000 items per second will benefit only marginally from a second parallel request: only an additional 5,000 items per second will be returned before the budget limit of 23,000 items per second is reached.
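To estimate whether adding partitions is worthwhile, the reasoning above can be expressed as a simple model. The sketch assumes throughput scales linearly with the number of partitions until the per-identity analytical budget is reached, which is a simplification; the names are hypothetical.

```python
MAX_PARTITIONS = 10            # guidance above: use at most 10 partitions
BUDGET_ITEMS_PER_SEC = 23_000  # 23 rps per identity x 1000 items per request


def estimated_throughput(single_flow_items_per_sec: int, partitions: int) -> int:
    """Estimate items per second for a number of parallel request flows.

    Assumes linear scaling per partition, capped by the request budget.
    """
    if not 1 <= partitions <= MAX_PARTITIONS:
        raise ValueError(f"partitions must be between 1 and {MAX_PARTITIONS}")
    return min(single_flow_items_per_sec * partitions, BUDGET_ITEMS_PER_SEC)


# A fast query gains little: 18,000 -> 23,000 items per second.
print(estimated_throughput(18_000, 1))  # 18000
print(estimated_throughput(18_000, 2))  # 23000

# A slow, complex query gains a lot: 2,000 -> 20,000 items per second.
print(estimated_throughput(2_000, 10))  # 20000
```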
Do not share client IDs and secrets across multiple applications, even if they have common authentication requirements. Shared credentials cause applications to compete for the same per-identity budget, and can also cause issues with audit logs.