
About data workflows

Beta

The features described in this section are currently in beta testing and are subject to change.

Data workflows is a managed process orchestration service in Cognite Data Fusion (CDF). Use data workflows to coordinate the order and timing of interdependent processes, such as CDF Transformations, Cognite Functions, dynamic tasks, and requests to the CDF APIs.

Core concepts of data workflows

The diagram below illustrates the core concepts of data workflows.

Flowchart for data workflow

Workflow: A workflow represents a collection of workflow versions, each tied to a workflow definition and its executions. A workflow is uniquely identified by an external ID.

Workflow definition: The workflow definition specifies the tasks to execute and their interdependencies, which together determine the order in which the tasks run. The definition serves as the blueprint for executing the workflow.
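
To make the blueprint idea concrete, here is a minimal sketch in plain Python that models a definition as tasks with dependencies and derives one valid run order from them. The `TaskSpec` class, its field names, the task type strings, and the task IDs are all hypothetical and do not reflect the CDF API or SDK. The ordering relies on Python's standard `graphlib` module (Python 3.9 or later).

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class TaskSpec:
    """Hypothetical task entry in a workflow definition (not the CDF schema)."""
    external_id: str
    task_type: str  # illustrative labels only, e.g. "transformation"
    depends_on: tuple[str, ...] = ()


def execution_order(tasks: list[TaskSpec]) -> list[str]:
    """Return one order in which every task runs after its dependencies."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {t.external_id: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())


# A three-step chain: each task depends on the one before it.
definition = [
    TaskSpec("load_raw", "transformation"),
    TaskSpec("clean", "function", depends_on=("load_raw",)),
    TaskSpec("publish", "cdf_request", depends_on=("clean",)),
]

order = execution_order(definition)
print(order)  # ['load_raw', 'clean', 'publish']
```

If the dependencies contained a cycle, `static_order()` would raise `graphlib.CycleError`, which is one reason to model interdependencies as a directed acyclic graph.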

Workflow version: Versions let you manage changes and iterations of a workflow systematically. Each version is tied to a workflow definition and to the executions of that version. Updating the workflow definition doesn't automatically require a new version, so you decide how to handle versioning and updates for your use case.
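
The relationship between a workflow, its versions, and their definitions can be sketched as a small in-memory model. This is an illustration only: the `Workflow` class, the `upsert_version` method, and the version labels are hypothetical and are not part of the CDF API. It shows the two options described above, updating a definition under the same version or recording the change under a new one.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical model: one external ID, many versions."""
    external_id: str
    # Maps a version label to its definition (here simply a list of task IDs).
    versions: dict[str, list[str]] = field(default_factory=dict)

    def upsert_version(self, version: str, task_ids: list[str]) -> None:
        """Create the version, or overwrite its definition if it exists."""
        self.versions[version] = list(task_ids)


wf = Workflow("daily_ingest")
wf.upsert_version("1", ["load_raw", "clean"])

# Option A: update the definition in place, keeping the same version label.
wf.upsert_version("1", ["load_raw", "clean", "publish"])

# Option B: record a larger change under a new version label.
wf.upsert_version("2", ["load_raw", "validate", "publish"])

print(sorted(wf.versions))  # ['1', '2']
```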

Task: A task is the fundamental unit of work in a workflow. Each task triggers the execution of a specific process. Tasks can be Transformations, Cognite Functions, requests to CDF APIs, dynamic tasks, and more.

For more information on task types, see Tasks in data workflows.

Workflow executions: Workflow executions record the run history of a particular workflow version. Each time the run endpoint is invoked, the workflow definition is executed. An execution includes the start time, end time, status, and relevant logs or execution details, along with task-level details that show each task's progress, success, or failure. Use this information for monitoring, debugging, and analyzing workflow performance.
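
As a rough illustration of how execution-level and task-level details fit together, the sketch below models an execution record in plain Python. The class names, field names, status strings, and the rule for deriving an overall status from task statuses are all assumptions made for this example. They do not describe the actual CDF response schema or how CDF computes status.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class TaskExecution:
    """Hypothetical task-level record."""
    task_id: str
    status: str  # assumed values: "completed", "failed", "running"


@dataclass
class WorkflowExecution:
    """Hypothetical record of one run of a workflow version."""
    workflow_external_id: str
    version: str
    start_time: datetime
    end_time: Optional[datetime]
    tasks: list[TaskExecution]

    def status(self) -> str:
        """Derive an overall status (assumed rule, for illustration)."""
        statuses = {t.status for t in self.tasks}
        if "failed" in statuses:
            return "failed"
        if statuses == {"completed"}:
            return "completed"
        return "running"

    def duration(self) -> Optional[timedelta]:
        """Elapsed time, or None while the run has no end time."""
        if self.end_time is None:
            return None
        return self.end_time - self.start_time

    def failed_tasks(self) -> list[str]:
        """IDs of tasks to inspect when debugging a failed run."""
        return [t.task_id for t in self.tasks if t.status == "failed"]


run = WorkflowExecution(
    workflow_external_id="daily_ingest",
    version="1",
    start_time=datetime(2024, 1, 1, 8, 0),
    end_time=datetime(2024, 1, 1, 8, 5),
    tasks=[
        TaskExecution("load_raw", "completed"),
        TaskExecution("clean", "failed"),
    ],
)

print(run.status(), run.duration(), run.failed_tasks())
# failed 0:05:00 ['clean']
```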