
About Data workflows

Data workflows is a managed orchestration service within Cognite Data Fusion (CDF) that coordinates the execution of interdependent processes and tasks and integrates with CDF Transformations, Cognite Functions, API requests, and simulations.

This service includes built-in retry logic, error handling, and failure recovery, and lets you track execution history, monitor task status, and debug failures. Data workflows can be triggered on schedules or on data changes. These are the core concepts of data workflows:

Flowchart for data workflow

Workflows are collections of workflow versions. Each version is tied to a workflow definition and to the executions of that version. An external ID uniquely identifies a workflow.

Workflow versions let you handle changes and iterations systematically. Each version is tied to a workflow definition and to the executions of that version. Updating the workflow definition doesn't automatically require a new version, so you decide how to handle versioning and updates.

Workflow definitions contain the details about the tasks to be run and their interdependencies. The definition outlines the structural layout and progression of the workflow, determining the sequence of tasks to be executed, and serves as a blueprint for the execution process.
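To illustrate how the interdependencies in a definition determine the run order, here is a minimal sketch that models a definition as a mapping from each task to the tasks it depends on, and derives a valid order with Python's standard `graphlib` module. The task names and the `depends_on` mapping are hypothetical and don't reflect the format CDF uses for workflow definitions.

```python
from graphlib import TopologicalSorter

# Hypothetical definition: each task maps to the set of tasks it depends on.
# This is an illustration only, not the CDF workflow definition format.
depends_on = {
    "ingest": set(),
    "transform": {"ingest"},
    "simulate": {"transform"},
    "notify": {"transform", "simulate"},
}


def execution_order(definition: dict[str, set[str]]) -> list[str]:
    """Return one order in which every task runs after its dependencies."""
    return list(TopologicalSorter(definition).static_order())


order = execution_order(depends_on)
print(order)
```

If the mapping contains a circular dependency, `TopologicalSorter` raises `graphlib.CycleError`, which is one reason to treat a definition as a directed acyclic graph.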

Tasks are fundamental units of work within a workflow. Tasks trigger the execution of a process, like running transformations or simulations, calling functions, making API requests, or orchestrating other tasks.

Workflow executions record the run history of a workflow version. Each call to the run endpoint executes a workflow definition. A workflow execution includes details such as start time, end time, status, and relevant logs, along with task-level execution details. This information shows the progress, success, or failure of individual tasks and supports monitoring, debugging, and workflow performance analysis.
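The kind of information an execution carries can be sketched as a small data model. The class names, field names, and status values below are assumptions chosen for illustration and aren't the CDF API schema. The sketch shows how task-level records can be rolled up into an overall status and used to find failed tasks.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TaskExecution:
    task_id: str
    status: str  # hypothetical values: "running", "completed", "failed"


@dataclass
class WorkflowExecution:
    workflow_external_id: str
    version: str
    start_time: datetime
    end_time: Optional[datetime] = None
    tasks: list[TaskExecution] = field(default_factory=list)

    def status(self) -> str:
        """Roll task-level statuses up into one overall status."""
        if any(t.status == "failed" for t in self.tasks):
            return "failed"
        if self.tasks and all(t.status == "completed" for t in self.tasks):
            return "completed"
        return "running"

    def failed_tasks(self) -> list[str]:
        """Return the IDs of tasks to inspect when debugging a failure."""
        return [t.task_id for t in self.tasks if t.status == "failed"]


execution = WorkflowExecution(
    workflow_external_id="my-workflow",  # hypothetical external ID
    version="1",
    start_time=datetime(2024, 1, 1, tzinfo=timezone.utc),
    tasks=[
        TaskExecution("ingest", "completed"),
        TaskExecution("transform", "failed"),
    ],
)
print(execution.status(), execution.failed_tasks())
```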

Triggers automate the execution of workflows based on specific conditions. Instead of manually starting workflows, triggers can run workflows on a schedule or in response to data changes.
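The two trigger conditions can be pictured as simple checks. The following sketch is conceptual only: the function names are hypothetical, a fixed interval stands in for a schedule, and a change in item count stands in for a data change. It doesn't show how triggers are configured in CDF.

```python
from datetime import datetime, timedelta


def schedule_due(last_run: datetime, now: datetime, interval: timedelta) -> bool:
    """Schedule-style condition: enough time has passed since the last run."""
    return now - last_run >= interval


def data_changed(previous_count: int, current_count: int) -> bool:
    """Data-change-style condition: the watched data differs from last time."""
    return current_count != previous_count


last_run = datetime(2024, 1, 1, 0, 0)
now = datetime(2024, 1, 1, 1, 0)

# Either condition on its own is enough to start a run in this sketch.
run_on_schedule = schedule_due(last_run, now, interval=timedelta(minutes=30))
run_on_change = data_changed(previous_count=100, current_count=100)
print(run_on_schedule, run_on_change)
```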