Cognite Data Fusion (CDF) enforces rate and concurrency limits on API endpoints to ensure fair usage and system stability. Both the rate of requests (denoted as requests per second, or rps) and the number of concurrent (parallel) requests are governed by limits for all CDF API endpoints. If a request exceeds one of the limits, it will be throttled with a 429: Too Many Requests response.
For more on limit types and how to avoid being throttled, see Request throttling.

How request budgets work

Limits are defined at both the overall API service level and on the API endpoints belonging to the service. Some types of requests consume more resources (compute, storage IO) than others. CDF categorizes endpoints into three request budget types:
  • CRUD endpoints (Create, Retrieve, Request ByIDs, Update, and Delete) are less resource-intensive and receive the largest share of the overall budget.
  • Analytical endpoints (List, Search, and Filter) are more resource-intensive and receive a smaller budget.
  • Aggregation endpoints are the most resource-intensive and receive a dedicated sub-allocation within the analytical budget.
The CRUD and analytical budgets are sub-allocations of the overall API-level budget. The aggregation budget is a further sub-allocation of the analytical budget.
These limits may change over time as consumption patterns and resource availability are reviewed.

Rate limits by API

Assets

See Assets for the concept overview.
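| Budget level | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| API level | Requests per second | 200 | 150 |
| | Concurrent requests | 40 | 30 |
| CRUD | Requests per second | 170 | 130 |
| | Concurrent requests | 30 | 23 |
| Analytical | Requests per second | 30 | 23 |
| | Concurrent requests | 15 | 12 |
| Aggregation | Requests per second | 15 | 12 |
| | Concurrent requests | 7 | 5 |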

Events

See Events for the concept overview.
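| Budget level | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| API level | Requests per second | 200 | 150 |
| | Concurrent requests | 50 | 40 |
| CRUD | Requests per second | 170 | 130 |
| | Concurrent requests | 40 | 30 |
| Analytical | Requests per second | 30 | 23 |
| | Concurrent requests | 15 | 12 |
| Aggregation | Requests per second | 15 | 12 |
| | Concurrent requests | 7 | 5 |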

Files

See Files for the concept overview.
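| Budget level | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| API level | Requests per second | 160 | 120 |
| | Concurrent requests | 30 | 23 |
| CRUD | Requests per second | 130 | 100 |
| | Concurrent requests | 23 | 17 |
| Analytical | Requests per second | 30 | 23 |
| | Concurrent requests | 15 | 12 |
| Aggregation | Requests per second | 15 | 12 |
| | Concurrent requests | 7 | 5 |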

Raw

See Raw for the concept overview. The Raw service uses concurrency and data rate limits rather than rps limits. Under high load, some deviation may occur for short periods as the service scales up.
| Limit | Per project | Per identity |
| --- | --- | --- |
| Concurrency | 64 parallel requests | 48 parallel requests |
| Data rate (retrieve) | 8.3 GB / 10 minutes | 6.6 GB / 10 minutes |
| Data rate (insert) | 1.6 GB / 10 minutes | 1.3 GB / 10 minutes |
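To compare the Raw data rate limits with a pipeline's throughput, it can help to express them per second. The sketch below is only illustrative arithmetic: it assumes decimal units (1 GB = 1000 MB), which the limits above do not specify, and the helper name `mb_per_second` is hypothetical.

```python
def mb_per_second(gigabytes: float, window_minutes: float = 10) -> float:
    """Average MB/s allowed by a limit of `gigabytes` per `window_minutes`.

    Assumes 1 GB = 1000 MB (an assumption, not stated by the source).
    """
    return gigabytes * 1000 / (window_minutes * 60)


# Per-project limits from the Raw table.
print(round(mb_per_second(8.3), 1))  # retrieve: 13.8
print(round(mb_per_second(1.6), 1))  # insert: 2.7
```

These are averages over the 10-minute window, not a guarantee about how short bursts within the window are treated.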

Streams

See Streams for the concept overview. As streams are intended to be long-lived, users are not expected to interact with these endpoints frequently.
| Operation | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| Create / Delete | Requests per second | 2 | 1 |
| | Concurrent requests | 1 | 1 |
| Retrieve / List | Requests per second | 5 | 3 |
| | Concurrent requests | 3 | 2 |

Records

See Records for the concept overview. Records limits vary between mutable and immutable streams. Limits for the query endpoints (Sync, Retrieve, Aggregate) have a hierarchical structure:
  • All query endpoints share a common Query request budget.
  • The Retrieve endpoint has an additional dedicated budget checked first.
  • The Aggregate endpoint has an additional dedicated budget checked first.
The Sync endpoint only checks the Query request budget. The Retrieve and Aggregate endpoints must pass both their dedicated budget check and the Query request budget check.
For example, with mutable streams, you can make up to 40 rps in total across all query endpoints (Query budget limit), but at most 20 of those can be Retrieve requests (Retrieve budget limit) and at most 15 can be Aggregate requests (Aggregate budget limit). This means you could make 20 Retrieve + 15 Aggregate + 5 Sync = 40 rps in total.
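The two-level check can be sketched as a small function. This is a simplified model for reasoning about a request mix, not the service's actual implementation: it uses the overall mutable-stream rps limits, ignores concurrency limits, and the function name `fits_query_budgets` is hypothetical.

```python
# Overall rps limits for mutable streams, taken from the Records limits.
QUERY_RPS = 40       # shared by Sync, Retrieve, and Aggregate
RETRIEVE_RPS = 20    # dedicated Retrieve budget
AGGREGATE_RPS = 15   # dedicated Aggregate budget


def fits_query_budgets(sync: int, retrieve: int, aggregate: int) -> bool:
    """Return True if a per-second request mix passes every budget check."""
    if retrieve > RETRIEVE_RPS:      # dedicated budget is checked first
        return False
    if aggregate > AGGREGATE_RPS:    # dedicated budget is checked first
        return False
    # Every query request also counts against the shared Query budget.
    return sync + retrieve + aggregate <= QUERY_RPS


print(fits_query_budgets(sync=5, retrieve=20, aggregate=15))   # True
print(fits_query_budgets(sync=0, retrieve=25, aggregate=0))    # False
print(fits_query_budgets(sync=30, retrieve=10, aggregate=5))   # False
```

The second mix fails the dedicated Retrieve budget even though it is well under 40 rps; the third passes both dedicated budgets but exceeds the shared Query budget.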
Query performance and rate limits vary between mutable and immutable streams due to their different storage characteristics.
  • Mutable streams provide consistent high-performance queries and higher rate limits.
  • Immutable streams are optimized for ingesting very large amounts of data, resulting in lower query performance and stricter rate limits.
When designing your data access patterns, use mutable streams for high-performance queries with higher rate limits, or immutable streams for high-volume data ingestion and long-term storage.
The amount of data the service can return in responses to query endpoints (Sync, Retrieve, Aggregate) is also limited. We recommend reading only the data you will actually use, by providing an appropriate filter and limiting the sources to be retrieved, rather than retrieving large amounts of unused data.
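| Budget | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| Ingest | Requests per second | 40 | 30 |
| | Concurrent requests | 20 | 15 |
| Query (mutable) | Requests per second | 40 | 30 |
| | Concurrent requests | 30 | 22 |
| Query (immutable) | Requests per second | 10 | 7 |
| | Concurrent requests | 10 | 7 |
| | Response MB per second | 4 | 3 |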
The Retrieve and Aggregate endpoints have dedicated budgets checked in addition to the Query request budget. A request to these endpoints must pass both budget checks.
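| Budget | Limit | Overall | Per identity |
| --- | --- | --- | --- |
| Retrieve (mutable) | Requests per second | 20 | 15 |
| | Concurrent requests | 20 | 15 |
| Retrieve (immutable) | Requests per second | 10 | 7 |
| | Concurrent requests | 10 | 7 |
| Aggregate (mutable) | Requests per second | 15 | 12 |
| | Concurrent requests | 10 | 7 |
| Aggregate (immutable) | Requests per second | 5 | 4 |
| | Concurrent requests | 5 | 4 |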
Summary:
  • The Sync endpoint is limited only by the Query request budget.
  • The Retrieve endpoint is limited by both the Retrieve request budget and the Query request budget.
  • The Aggregate endpoint is limited by both the Aggregate request budget and the Query request budget.

Transformations

See Transformations for the concept overview. Transformations use concurrency limits rather than rps limits:
  • 10 parallel jobs per project

Simulators

See Simulators for the concept overview.
| Limit | Value |
| --- | --- |
| Requests per minute | 1000 |
For simulator-specific troubleshooting and best practices, see Simulator troubleshooting.

Translating rps into data speed

A single request can retrieve up to 1000 items. In the context of Assets, Events, or Files, 1 item = 1 record. The maximum theoretical data speed at the API level is:
| API | Overall | Per identity |
| --- | --- | --- |
| Assets / Events | 200,000 items/sec | 150,000 items/sec |
| Files | 160,000 items/sec | 120,000 items/sec |
For analytical queries (List, Filter, Search), the per-identity budget of 23 rps gives a theoretical maximum of 23,000 items per second per identity.
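The conversion is a single multiplication, shown here as a minimal sketch so the same arithmetic can be applied to other budgets. The helper name `max_items_per_second` is hypothetical, and the result is a theoretical ceiling that assumes every request returns the full 1000 items.

```python
ITEMS_PER_REQUEST = 1000  # maximum items a single request can retrieve


def max_items_per_second(rps_limit: int,
                         items_per_request: int = ITEMS_PER_REQUEST) -> int:
    """Theoretical ceiling: every request returns a full page of items."""
    return rps_limit * items_per_request


print(max_items_per_second(200))  # Assets / Events, overall API level: 200000
print(max_items_per_second(120))  # Files, per identity API level: 120000
print(max_items_per_second(23))   # analytical budget, per identity: 23000
```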

Throttling response

CDF returns an HTTP 429 Too Many Requests response in the following cases:
  • Rate limiting — you send too many requests in a given time period.
  • Concurrency limiting — you send too many concurrent (parallel) requests.
  • Conflicting requests — concurrent requests try to modify the same resource (for example, two requests updating the same asset simultaneously).
The response body explains the cause:
{
  "error": {
    "code": 429,
    "message": "Too Many Requests"
  }
}
All requests contribute to throttling equally regardless of their source (extractors, functions, transformations, UI, SDKs, etc.).
Throttled requests still consume some resources and may be counted toward your budget. For example, a request throttled due to concurrency may still consume rate capacity.

Conflicting requests

Requests are considered conflicting when they concurrently try to modify the same resource. For example, if two concurrent requests update the same asset, one of them may be throttled. To avoid conflicting requests:
  • Instead of replicating every modification from the source system (“change log”-style updates), replicate only the latest state.
  • Reduce the number of parallel workers sending write requests.
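One way to replicate only the latest state is to collapse a batch of source changes per resource before writing, so each resource is updated once. The sketch below is a generic illustration; the `(resource id, changed fields)` shape of the change log and the name `latest_state` are assumptions, not part of any CDF API.

```python
from typing import Any, Dict, Iterable, List, Tuple

# (resource id, changed fields), ordered oldest first. Hypothetical shape.
Change = Tuple[str, Dict[str, Any]]


def latest_state(changes: Iterable[Change]) -> List[Change]:
    """Merge all changes per resource so each resource is written once."""
    merged: Dict[str, Dict[str, Any]] = {}
    for resource_id, fields in changes:
        # Later changes overwrite earlier values for the same field.
        merged.setdefault(resource_id, {}).update(fields)
    return list(merged.items())


change_log = [
    ("pump-1", {"description": "Pump"}),
    ("pump-2", {"description": "Spare pump"}),
    ("pump-1", {"description": "Main pump", "source": "erp"}),
]
updates = latest_state(change_log)
print(len(updates))  # 2 update items instead of 3
print(updates[0])    # ('pump-1', {'description': 'Main pump', 'source': 'erp'})
```

Because `pump-1` now appears in a single update, two parallel workers can no longer race to modify it within the same batch.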

Best practices

Use exponential backoff

When you receive a 429 response, retry the request after an increasing delay. A truncated exponential backoff strategy works well:
  1. Wait 1 second, then retry.
  2. If still throttled, wait 2 seconds, then retry.
  3. Continue doubling the wait time up to a maximum (for example, 32 seconds).
For more on implementing backoff strategies, see Microsoft’s retry guidance and Google’s retry strategy.
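The steps above can be sketched as a small retry wrapper. This is a minimal illustration, not a CDF SDK feature: `send` is a hypothetical callable that performs one request and returns a `(status_code, body)` tuple, and `max_retries`, `base_delay`, and `max_delay` are illustrative parameter names. The `sleep` parameter exists only so the wait can be replaced in tests.

```python
import time
from typing import Any, Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, Any]],
    max_retries: int = 6,
    base_delay: float = 1.0,
    max_delay: float = 32.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, Any]:
    """Call send() and retry on HTTP 429 with truncated exponential backoff."""
    status, body = send()
    for attempt in range(max_retries):
        if status != 429:
            break
        # Delay doubles each attempt (1, 2, 4, ...) and is capped at max_delay.
        delay = min(base_delay * 2 ** attempt, max_delay)
        sleep(delay)
        status, body = send()
    return status, body


# Simulated endpoint: throttled twice, then successful.
responses = iter([(429, None), (429, None), (200, {"items": []})])
delays = []
result = call_with_backoff(lambda: next(responses), sleep=delays.append)
print(result)  # (200, {'items': []})
print(delays)  # [1.0, 2.0]
```

If the request is still throttled after `max_retries` attempts, the wrapper returns the last 429 response so the caller can decide how to proceed. Adding random jitter to each delay is a common refinement when many clients retry at once; it is omitted here to match the steps as written.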

Use parallel retrieval wisely

Use parallel retrieval when a single request flow is slow due to query complexity. By splitting requests across partitions, you can tune data retrieval performance to meet your application’s needs. Use at most 10 partitions. For example, using the Assets or Events API request budget:
  • A single request can retrieve up to 1000 items.
  • Up to 23 rps can be issued for analytical queries (per identity), such as /list or /filter.
  • This provides a theoretical maximum of 23,000 items per second per identity.
  • If query complexity causes a single request to take longer than 1 second, you can retrieve fewer items per request and use parallel requests up to the theoretical maximum.
Parallel retrieval does not act as a speed multiplier on optimally running queries. Regardless of the number of concurrent requests, the overall requests per second limit still applies. For example, a single request flow returning approximately 18,000 items per second benefits only marginally from a second parallel flow: only an additional 5,000 items per second can be returned before the 23 rps budget limit is reached.
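The effect of the rps cap on parallel retrieval can be estimated with a simple model. This is a rough sketch under stated assumptions: each worker issues requests back to back, every request returns the full number of items, and only the per-identity analytical rps limit of 23 is modelled (concurrency limits are ignored). The function name `items_per_second` is hypothetical.

```python
def items_per_second(workers: int, seconds_per_request: float,
                     items_per_request: int = 1000,
                     rps_limit: int = 23) -> float:
    """Estimate throughput of parallel request flows under an rps cap."""
    requested_rps = workers / seconds_per_request  # what the workers could issue
    effective_rps = min(requested_rps, rps_limit)  # the budget caps the total
    return effective_rps * items_per_request


fast = 1 / 18  # one flow completes about 18 requests per second
print(round(items_per_second(1, fast)))   # 18000
print(round(items_per_second(2, fast)))   # 23000, only 5000 more

slow = 2.0     # a complex query taking 2 seconds per request
print(round(items_per_second(1, slow)))   # 500
print(round(items_per_second(10, slow)))  # 5000, scales with partitions
```

In this model, slow queries gain almost linearly from more partitions, while a flow already close to the rps limit gains very little.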

Monitor your usage

  • Log API call timestamps and response codes to track request patterns.
  • Set up alerts for 429 responses to detect when you approach limits.
  • Distribute requests over time rather than making burst requests.
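A lightweight way to act on the first two points is to keep a sliding window of response codes and alert when the share of 429 responses grows. The class below is a minimal sketch; the name `ThrottleMonitor`, the 60-second window, and any alert threshold you apply to the ratio are illustrative assumptions.

```python
from collections import deque
from typing import Deque, Tuple


class ThrottleMonitor:
    """Track recent response codes and report the share of 429 responses."""

    def __init__(self, window_seconds: float = 60.0) -> None:
        self.window_seconds = window_seconds
        self._events: Deque[Tuple[float, int]] = deque()  # (timestamp, status)

    def record(self, timestamp: float, status: int) -> None:
        """Store one response and drop entries older than the window."""
        self._events.append((timestamp, status))
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] <= cutoff:
            self._events.popleft()

    def throttled_ratio(self) -> float:
        """Fraction of responses in the window that were HTTP 429."""
        if not self._events:
            return 0.0
        throttled = sum(1 for _, status in self._events if status == 429)
        return throttled / len(self._events)


monitor = ThrottleMonitor(window_seconds=60)
# Timestamps are in seconds; in practice you would pass time.time().
for ts, status in [(0, 200), (10, 200), (50, 429), (65, 429), (70, 200)]:
    monitor.record(ts, status)
print(monitor.throttled_ratio())  # 2 of the 3 responses still in the window
```

A rising ratio is a signal to reduce parallelism or spread requests over a longer period before retries start to dominate the traffic.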

Use unique client credentials per application

Do not share client IDs and secrets across multiple applications, even if they have common authentication requirements. Shared credentials cause applications to compete for the same per-identity budget, and can also cause issues with audit logs.

Request only what you need

Reduce the data returned per request to stay within response size limits:
  • Use filters to narrow results.
  • Limit the sources or properties you retrieve.
  • Use appropriate page sizes (the limit parameter) for pagination.
Last modified on April 23, 2026