Troubleshooting transformations

Use this guide to diagnose execution issues, performance problems, and monitoring gaps for CDF Transformations.

Operations and monitoring

Error 429 or 503: Service overload

Symptoms

Runs fail with HTTP 429 or 503 errors.
Multiple transformations fail in the same time window.

Cause

Peak concurrency or heavy workloads overload the service.

Resolution

Spread transformation start times to reduce concurrency peaks.
Orchestrate with Data workflows and add dependencies so jobs run only after prerequisites succeed.
Reduce per‑run data volume with incremental filters such as is_new().
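As an illustration of spreading start times, the following minimal sketch assigns each job its own start minute. It assumes schedules are written as five-field cron strings; the function name and the job names are hypothetical.

```python
def stagger_hourly(names, step_minutes=5):
    """Return {name: cron expression}, giving each job its own start minute."""
    schedules = {}
    for i, name in enumerate(names):
        # Offsets wrap around after 60 / step_minutes jobs.
        minute = (i * step_minutes) % 60
        schedules[name] = f"{minute} * * * *"
    return schedules


# Hypothetical transformation names, used only for illustration.
jobs = ["assets_load", "timeseries_load", "events_load"]
for name, cron in stagger_hourly(jobs).items():
    print(f"{name}: {cron}")
```

With a 5-minute step, twelve hourly jobs can run without any two starting in the same minute; a larger fleet needs a smaller step or a longer schedule interval.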

Prevention

Use workflow orchestration to balance load and avoid synchronized schedules.
When overload persists, rebalance schedules instead of relying on retries.

Error 408: Timeout on long-running queries

Symptoms

Runs fail with HTTP 408.
Data model queries take a long time to complete.

Cause

Slow queries or large joins push the runtime past the request timeout.

Resolution

Apply incremental filters early to reduce scan volume.
Review query patterns in SQL patterns and best practices.
For data modeling sources, use the is_new() variant on cdf_nodes() or cdf_edges().
Simplify complex joins in data modeling and test with smaller scopes first.
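The idea behind an incremental filter can be sketched conceptually as a watermark check. This is not how is_new() is implemented; the function name and the lastUpdatedTime field are assumptions chosen for illustration.

```python
def select_new(rows, last_seen):
    """Return rows updated after last_seen, plus the new watermark."""
    fresh = [r for r in rows if r["lastUpdatedTime"] > last_seen]
    # Keep the old watermark when nothing new arrived.
    watermark = max((r["lastUpdatedTime"] for r in fresh), default=last_seen)
    return fresh, watermark


rows = [
    {"key": "a", "lastUpdatedTime": 100},
    {"key": "b", "lastUpdatedTime": 200},
    {"key": "c", "lastUpdatedTime": 300},
]
first, mark = select_new(rows, last_seen=0)      # first run reads everything
rows.append({"key": "d", "lastUpdatedTime": 400})
second, mark = select_new(rows, last_seen=mark)  # later run reads only "d"
print(len(first), len(second), mark)
```

Each run after the first touches only rows changed since the previous run, which is what keeps scan volume and runtime low.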

Prevention

Keep transformations scoped to one resource type and avoid wide RAW scans.

Driver restarts or repeated failures on long jobs

Symptoms

Long runs fail and restart.
The same job repeatedly fails after long execution time.

Cause

Driver restarts can occur during rollouts or as routine maintenance of the transformation service. The service starts a new driver to handle new requests and gives the old driver time to finish its running jobs. A job that is still running when that time is up can fail during the transition.

Resolution

Split large transformations into smaller, focused jobs.
Orchestrate retries with Data workflows for automatic recovery.
Use incremental processing to shorten run times.
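Automatic recovery usually amounts to retrying with a growing delay. The sketch below is a generic client-side illustration of that pattern, not the Data workflows mechanism; run_job is a hypothetical callable that raises RuntimeError when a run fails.

```python
import time


def run_with_retries(run_job, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call run_job until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_job()
        except RuntimeError:
            if attempt == max_attempts:
                raise  # give up and surface the last failure
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical job that fails twice before succeeding.
attempts = {"count": 0}


def flaky_job():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("driver restarted")
    return "done"


print(run_with_retries(flaky_job, base_delay=0.01))
```

Retries only help when each attempt is short enough to finish between restarts, which is why splitting the job and processing incrementally come first.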

Prevention

Favor incremental processing and smaller transformation scope.

Tooling limitations

Preview results differ from full runs

Symptoms

Preview succeeds but full run fails.
Preview returns unexpected results with multi‑source joins.
Preview times out on long‑running transformations.

Cause

Preview runs on a sample taken from the first rows of the input, so it may not exercise full joins or filters. It is not a full execution and does not evaluate all input data.

Resolution

Validate logic with small but representative datasets.
Run a full execution on a limited scope (for example, a single source or time window).
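To see why a preview on the first rows can mislead for multi-source joins, consider this small simulation. The sample size of 100 and the data are arbitrary assumptions; the actual preview limits are not specified here.

```python
def inner_join(left, right, key):
    """Join two lists of dicts on a shared key."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


assets = [{"id": i, "name": f"asset-{i}"} for i in range(1000)]
# Readings reference assets in reverse order, so the early rows do not line up.
readings = [{"id": 999 - i, "value": i} for i in range(1000)]

sample_size = 100  # hypothetical preview limit
preview = inner_join(assets[:sample_size], readings[:sample_size], "id")
full = inner_join(assets, readings, "id")
print(len(preview), len(full))
```

The sampled join finds no matches even though every row matches in the full data, so an empty or odd preview result does not prove the query is wrong.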

Prevention

Treat preview as a sanity check, not as a performance or correctness benchmark.

Columns not found in RAW tables

Symptoms

Runs fail with column‑not‑found errors.
Queries worked previously but fail after new RAW writes.

Cause

RAW is schema‑less. Transformations infer schema from the first 10,000 rows. If a column is not present in that sample, it is not available to SQL.

Resolution

Use get_json_object for fields in semi‑structured payloads.
Insert a schema row that contains all expected columns, and make sure it sorts to the top of the table so that it falls within the sampled rows.
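The effect of sample-based schema inference, and of the schema-row workaround, can be simulated in a few lines. This is a conceptual sketch only; infer_columns is a hypothetical helper, not the service's inference code.

```python
def infer_columns(rows, sample_size=10_000):
    """Collect the column names seen in the first sample_size rows only."""
    columns = set()
    for row in rows[:sample_size]:
        columns.update(row)
    return columns


rows = [{"id": i, "name": f"n{i}"} for i in range(10_000)]
# A new column first appears in row 10,001, outside the sample.
rows.append({"id": 10_000, "name": "late", "status": "new"})
print("status" in infer_columns(rows))  # False

# A schema row placed first exposes every expected column to the sample.
schema_row = {"id": None, "name": None, "status": None}
print("status" in infer_columns([schema_row] + rows))  # True
```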

Prevention

Keep RAW tables stable and avoid frequent schema drift in the first rows.

Logging and diagnostics

Limited error detail in run history

Symptoms

Only request IDs are visible in run history.
Error messages do not show root cause details.

Cause

Run history surfaces high‑level errors without expanded context.

Resolution

Capture the full error response payload from the API or SDK.
Correlate request IDs with backend logs when available.
Use the expand control in run history to view detailed error messages when available.

Prevention

Use structured logging and store request IDs alongside transformation metadata.
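One way to keep request IDs searchable is to emit each failure as a single JSON log line. The sketch below uses only the Python standard library; the field names and the example IDs are hypothetical and should be adapted to your own conventions.

```python
import json
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("transformations")


def log_run_failure(transformation_id, request_id, status, message):
    """Log a failed run as one JSON line and return the logged record."""
    record = {
        "transformation_id": transformation_id,
        "request_id": request_id,
        "status": status,
        "message": message,
    }
    logger.error(json.dumps(record, sort_keys=True))
    return record


# Hypothetical values, used only for illustration.
log_run_failure(42, "abc-123", 503, "Service overload")
```

Storing the request ID next to the transformation ID means a support request can quote both without anyone having to reproduce the failure.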

Internal error or unknown failure

Symptoms

Run fails with a generic internal error.

Cause

Transient service issues or unhandled edge cases.

Resolution

Retry after reducing concurrency or data volume.
If the error persists, contact Cognite Support with the transformation ID and request ID.
