Transformation architecture and purpose

A transformation specifies how source data is mapped and written to a target structure in CDF. Transformations can enrich data with other sources or calculations, contextualize it by matching related objects, and check data quality before it reaches downstream users. A common pattern is to read staged data from CDF RAW and write it to a structured target such as a data model. This delivers consistent, queryable data to apps and workflows, for example by standardizing equipment metadata before it is used in dashboards.

Transformations are aimed at data engineers and developers who define schemas, write SQL, and manage pipelines. They run on a managed Spark SQL engine: you express logic in SQL (or map fields in the UI), and CDF handles scheduling, scaling, and access to CDF data sources.
  • Start with SQL and use transformations when your logic is declarative and best expressed as set-based operations across tables.
  • Use CDF Functions when you need Python logic, external libraries, or custom API calls.
  • Use the Cognite Toolkit when you want to manage transformations as code. The Toolkit lets you define transformations, schedules, and notifications in YAML (with optional SQL files) and deploy them through CI/CD.
  • You can also run transformations using the Cognite API or the Cognite Python SDK.
Avoid transformations for high-volume or low-latency writes such as high-frequency datapoints. For these scenarios, prioritize direct ingestion pipelines or Files to keep latency and throughput within system limits.
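
The staged-to-structured pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for the managed Spark SQL engine, and the table names (raw_equipment, equipment) and columns are hypothetical, not CDF resources. It shows one set-based statement that maps, cleans, and quality-checks staged rows on their way to a typed target.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for CDF. The tables
# below are hypothetical stand-ins for a RAW staging table and a structured target.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_equipment (key TEXT, name TEXT, manufacturer TEXT, install_year TEXT);
    CREATE TABLE equipment (
        external_id TEXT PRIMARY KEY,
        name TEXT,
        manufacturer TEXT,
        install_year INTEGER
    );
    """
)

# Staged rows arrive as untyped strings, with inconsistent casing and spacing.
conn.executemany(
    "INSERT INTO raw_equipment VALUES (?, ?, ?, ?)",
    [
        ("pump-001", "  Feed pump A ", "acme", "2015"),
        ("pump-002", "Feed pump B", "ACME", "2018"),
        (None, "Orphan row", "acme", "2020"),  # no key: fails the quality check
    ],
)

# The transformation logic is a single declarative, set-based statement:
# map columns, standardize values, cast types, and drop rows without a key.
TRANSFORM_SQL = """
    INSERT INTO equipment (external_id, name, manufacturer, install_year)
    SELECT key, TRIM(name), UPPER(manufacturer), CAST(install_year AS INTEGER)
    FROM raw_equipment
    WHERE key IS NOT NULL
"""
conn.execute(TRANSFORM_SQL)

rows = conn.execute(
    "SELECT external_id, name, manufacturer, install_year "
    "FROM equipment ORDER BY external_id"
).fetchall()
for row in rows:
    print(row)
```

Logic of this shape, with no loops, external libraries, or per-row API calls, is the kind that fits a transformation. Once a step needs any of those, CDF Functions are likely the better fit.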

Target selection and operational constraints

Decide early where data should land, and avoid defaulting to the CDF staging area (RAW) as the final destination. Use RAW for staging, then write to a target that matches how the data will be used.

Create a transformation

Define and run your first transformation in CDF.

SQL syntax and functions

Reference for SQL syntax and custom functions.
Last modified on March 18, 2026