Debugging query performance
The features described in this section are currently in alpha testing and are likely to change. Do not use these features in production systems.
Query performance directly impacts your application's scalability. Debug notices help you understand how your queries run internally and identify opportunities for optimization.
See the Performance considerations article for core principles and best practices for optimizing queries. Use debug notices to see details about filtering efficiency, index usage, sorting behavior, and execution plans.
Debug notices highlight common performance patterns but don't detect all inefficiencies.
For full details on debug notices, see the alpha version of the API specification. Look for the schema under the datamodels/instances/query or datamodels/instances/sync sections. The request schema shows the available configuration options, and the response schema lists all possible debug notices.
Enabling debug notices
- Add a debug object to your API request to activate debug notices:

  {
    "debug": {}
  }

- If your query times out, increase the timeout value and disable query results output:

  {
    "debug": {
      "emitResults": false,
      "timeout": 20000
    }
  }

Note: The maximum timeout is 60,000 milliseconds (60 seconds). Specify timeout values in milliseconds.
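For context, the debug object sits alongside the rest of the query request body. A minimal illustrative example, assuming a query with a single result set named "0" against the Pump view used in the examples later in this article:

```json
{
  "with": {
    "0": { "nodes": { "filter": { "hasData": [["schema", "Pump", "v1"]] } } }
  },
  "select": { "0": {} },
  "debug": {
    "emitResults": false,
    "timeout": 20000
  }
}
```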
Understand grades and profiling
Debug notices highlight both positive and negative aspects of your query execution. They warn about potential performance issues and confirm when your query is well optimized.
Grades
Each debug notice includes a grade that shows the importance or impact of the notice. Use this grading system to prioritize issues and recognize good practices:
| Grade | Description |
|---|---|
| A | Excellent; best practice followed |
| B | Good; minor optimizations possible |
| C | Moderate concern; worth reviewing |
| D | Significant issue; likely to affect performance |
| E | Critical issue; urgent attention needed |
The grading is conservative by default. Enable detailed profiling by setting profile: true to see the full spectrum, including critical warnings (E) and strong positive confirmations (A). Profiling provides deeper insights for more accurate grading.
Profiling
The profile option enables detailed query profiling, which gathers runtime statistics, such as execution timing and resource usage, for each part of the query. Profiling focuses on internal diagnostics, so you must disable query results by setting emitResults to false. Use this analysis for complex queries to uncover hidden performance bottlenecks.
Activate profiling with this basic example:
{
"debug": {
"emitResults": false,
"profile": true,
"timeout": 30000
}
}
Emit execution plans
Request the underlying PostgreSQL execution plan as part of the debug notices by enabling the appropriate debug option. The execution plan shows how the query is processed at the database level.
Execution plans help you analyze and diagnose query performance issues. If you have advanced PostgreSQL knowledge, you can also use these plans to understand the query execution strategy and identify optimizations.
Include execution plans with this minimal example:
{
"debug": { "includePlan": true }
}
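The includePlan option can be combined with the other debug options introduced earlier; for example, to fetch only the plan and diagnostics without returning query results (the timeout value here is illustrative):

```json
{
  "debug": {
    "includePlan": true,
    "emitResults": false,
    "timeout": 30000
  }
}
```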
Handle common debug notices
This section focuses on specific debug notices the API returns and how to resolve them.
Quick reference
| Debug notice | Issue type | Quick fix |
|---|---|---|
| sortNotBackedByIndex | Missing index | Add matching sort index. |
| unindexedThrough | Missing index | Add index on relation target. |
| significantPostFiltering | Query structure | Move filters earlier. |
| significantHasDataFiltering | View complexity | Add requires constraints. |
sortNotBackedByIndex
Problem: Your query uses a sort order without a cursorable index. The system performs an in-memory sort at query time, which increases resource usage and slows performance as data scales.
Solution: Ensure your data model includes a cursorable index that matches your query's sort order. This matters especially when sorting by multiple properties. Your query's sort order must exactly match the property order in the index for the database to use it efficiently.
If your query contains this sort:
{
"sort": [
{ "property": ["space", "container", "A"], "direction": "ascending" },
{ "property": ["space", "container", "B"], "direction": "descending" }
]
}
Ensure that "container" has a compound index on properties A,B in that order. The order matters; an index on B,A won't work. Create the index using NEAT, Schema Service, or the DMS API. To create the index using the DMS API, send this request to models/containers:
{
"items": [
{
"space": "space",
"externalId": "container",
"indexes": {
"index-identifier": {
"properties": ["A", "B"],
"cursorable": true
}
}
}
]
}
Replace index-identifier with an appropriate identifier for your index.
unindexedThrough
Problem: Your query uses a through clause to traverse a direct relation. This notice appears when the traversal direction requires a missing index. Traversing a direct relation inwards requires an index on the relation's target. Without it, the query scans many nodes to find the ones pointing to the source, which is inefficient and won't scale.
Solution: Define an index on the property that is the target of the inwards direct relation.
If your query looks like this:
{
"with": {
"0": {
"nodes": {
"filter": {
"hasData": [
[
"space",
"viewA",
"1"
]
]
}
}
},
"1": {
"edges": {
"from": "0",
"through": {
"source": {
"type": "container",
"space": "space",
"externalId": "containerB"
},
"identifier": "directRelationB",
"direction": "inwards"
}
}
},
"select": {
"1": {}
}
}
}
Because the direction is "inwards", the query engine needs to efficiently look up nodes in "containerB" that have a direct relation named "directRelationB" pointing to the nodes from step "0". Create an index on the "directRelationB" property within "containerB" as shown in the sortNotBackedByIndex section.
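Mirroring the models/containers request from the sortNotBackedByIndex section, the container update for this case might look as follows (the index identifier is a placeholder you choose yourself):

```json
{
  "items": [
    {
      "space": "space",
      "externalId": "containerB",
      "indexes": {
        "directRelationB-idx": {
          "properties": ["directRelationB"],
          "cursorable": true
        }
      }
    }
  ]
}
```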
significantPostFiltering
Problem: The query engine processed many intermediate rows before filtering them to the final result set. This is inefficient because significant work goes to data that gets discarded. This happens when filters aren't selective enough in early stages of a chained query.
Solution: Review your query to introduce more selective filters earlier. Use your knowledge about the data graph to narrow intermediate results as much as possible.
Consider this query that fetches all pumps in space "SiteA" with an edge from another pump:
{
"with": {
"0": { "nodes": { "filter": { "hasData": [["schema", "Pump", "v1"]] }}},
"1": { "edges": { "from": "0" }},
"2": {
"nodes": {
"from": "1",
"filter": {
"and": [
{ "hasData": [["schema", "Pump", "v1"]] },
{ "equals": {"property": ["node", "space"], "value": "SiteA"} }
]
}
}
}
},
"select": { "2": {} }
}
Steps 0 and 1 could fetch many pumps and connecting edges from all sites before step 2 filters them to just "SiteA".
Use additional graph knowledge to make the query more efficient. If you know all relevant pumps are at the same site, move the space filter to the first step. If you know the relevant edges are always type "FLOWS_TO", add this filter to step 1.
Apply these filters early to dramatically reduce the data processed in subsequent steps for better performance.
Modified and improved query:
{
"with": {
"0": {
"nodes": {
"filter": {
"and": [
{ "hasData": [["schema", "Pump", "v1"]] },
{ "equals": { "property": ["node", "space"], "value": "SiteA" } }
]
}
}
},
"1": {
"edges": {
"from": "0",
"filter": {
"equals": {"property": ["edge", "type"], "value": {"space": "types", "externalId": "FLOWS_TO"}}
}
}
},
"2": {
"nodes": {
"from": "1",
"filter": { "hasData": [["schema", "Pump", "v1"]] }
}
}
},
"select": { "2": {} }
}
significantHasDataFiltering
Problem: A hasData filter applies to a view composed of many underlying containers. To satisfy the filter, the query engine may perform a join operation for each container to check for data existence. This becomes computationally expensive and slows your query.
Solution: Help the query planner optimize your query by adding requires constraints to the container definitions used in the query. If you know from your data model, for example, that data presence in one container implies data presence in another, use this knowledge to help the query engine prune unnecessary joins.
Suppose you have a view V that maps three containers: A, B, and C. You know that any node with data in container B or C is guaranteed to have data in container A.
Without optimization, a hasData filter on V checks for data in A AND B AND C, resulting in multiple joins.
By defining requires constraints on containers B and C that point to A, you inform the query engine of this dependency. When it sees a hasData filter on V, it understands it only needs to check for data in container A. This simplifies the query, reduces joins, and can significantly improve performance.