Why it matters
Latency outliers can impact user experience and system reliability even when median latency is low. If you assume every request completes quickly, a small number of slow queries can cause retries, thread exhaustion, cascading timeouts, and noisy alerts. Designing for variability helps your application stay responsive and predictable at scale.
Where latency variability shows up
In data modeling, latency variability typically shows up when you query instances with these endpoints:
- /models/instances/list
- /models/instances/query (including GraphQL)
A typical latency distribution looks like this:
- p50 (median) latency is low.
- p90/p95 latency is higher.
- A small percentage of requests (p99) can be much slower.
For example, a distribution might look like this:
- p50: 200 ms
- p90: 1.5 s
- p99: 4.5 s
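To see where your own workload sits relative to the percentiles above, you can compute them from client-side measurements. The following is a minimal sketch, assuming you already collect per-request latencies in milliseconds; `latency_percentiles` is a hypothetical helper name, not part of any CDF SDK.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return p50, p90, and p99 for a list of latency samples in milliseconds."""
    if len(samples_ms) < 2:
        raise ValueError("need at least two samples")
    # quantiles(n=100) returns 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p90": cuts[89], "p99": cuts[98]}
```

Percentiles from a small sample are noisy, especially p99, so collect at least a few hundred measurements before you draw conclusions.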
Why latency can vary
Latency for /list and /query depends on several factors.
Query shape and index alignment
Filters and sorting that align with defined indexes are typically much more efficient than queries that require scans or expensive sorting. To learn how indexes affect query execution, see Performance considerations.
Schema and view complexity
Wider views and more complex schemas can increase the amount of data the service needs to materialize and process. For example:
- Views mapping many containers
- Queries that traverse relations (direct or reverse)
- Large property selection sets
Data volume and distribution
The number of instances and how values are distributed affects the cost of query execution. Two queries with the same structure can behave differently if the data distribution changes over time.
Payload size
Large responses (for example, hundreds of KB to multiple MB) increase total time due to:
- Serialization on the server
- Network transfer
Backend execution conditions
Even when you run the same query repeatedly, internal conditions can change:
- Cache state (cold vs. warm)
- Execution plan preparation
- Resource scheduling and load distribution
What to expect
- Occasional slow requests (outliers) are expected.
- Outliers do not indicate service degradation if:
  - Requests complete successfully (HTTP 200).
  - You don’t see a sustained latency increase across most requests.
CDF provides availability guarantees, but not fixed per-request latency guarantees for individual endpoints.
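The distinction above between occasional outliers and a sustained latency increase can be checked in client-side monitoring. The following is a rough sketch, not a CDF feature: `looks_degraded`, the baseline value, and the `factor` threshold are all hypothetical and should be tuned to your workload.

```python
import statistics


def looks_degraded(recent_ms, baseline_p50_ms, factor=2.0):
    """Return True when most recent requests are slow, not just a few outliers."""
    if not recent_ms:
        return False
    # The median moves only when at least half of the requests are slow,
    # so a handful of p99 outliers will not trip this check.
    return statistics.median(recent_ms) > factor * baseline_p50_ms
```

Alerting on the median (or on p90 over a longer window) rather than on individual slow requests helps avoid the noisy alerts that isolated outliers would otherwise cause.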
How to design for latency variability
Design your SDKs and applications so they remain reliable when a small number of requests are slow.
Don’t assume constant response time
Avoid designs that require every call to complete within a strict time budget (for example, “always under 0.5 seconds”).
Use timeouts appropriate to context
Set client-side timeouts that match the user interaction context:
- For interactive UX, keep timeouts shorter and show progress or partial results.
- For background jobs, allow longer timeouts and add retries.
- For background jobs, allow longer timeouts and add retries.
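The two contexts above could be sketched as follows. This is an illustration under stated assumptions: the timeout values, the `TIMEOUTS` table, and `call_with_retries` are hypothetical, and `request_fn` stands in for whatever client call you use, provided it accepts a `timeout` argument and raises `TimeoutError` when the timeout expires. Skipping retries for interactive calls is also an assumption of this sketch.

```python
import random
import time

# Hypothetical timeout budgets in seconds; tune these for your own workload.
TIMEOUTS = {"interactive": 5.0, "background": 60.0}


def call_with_retries(request_fn, context="background", max_attempts=3,
                      base_delay=0.5, sleep=time.sleep):
    """Call request_fn(timeout=...) and retry on TimeoutError with backoff."""
    timeout = TIMEOUTS[context]
    # Interactive calls fail fast so the UI can show progress or partial results.
    attempts = 1 if context == "interactive" else max_attempts
    for attempt in range(1, attempts + 1):
        try:
            return request_fn(timeout=timeout)
        except TimeoutError:
            if attempt == attempts:
                raise
            # Exponential backoff with jitter to avoid synchronized retries.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Capping the number of attempts and backing off between them keeps a slow request from turning into the retry storms and cascading timeouts described earlier.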