
Manage data workflows

You can manage data workflows using the Cognite API or the Cognite Python SDK.

Before you start

To add the necessary capabilities for data workflows, see assign capabilities.

You can assign a workflow to a data set during workflow creation. All read or write operations on the workflow or related resources (versions, runs, tasks, and triggers) require access to this data set.

Data workflows in CDF

Beta

The data workflows user interface is currently in beta testing and is subject to change.

Data workflows automate and manage the running of tasks and processes in CDF.

To create a new data workflow:

  1. Navigate to Data management > Data workflows.

  2. Select + Create workflow and follow the wizard to start building your workflow.

When you work with data workflows, you can:

  • Add workflow tasks.

  • Create workflow triggers.

  • Switch between editing the workflow and viewing the workflow run history.

  • View or modify workflow versions.

  • Run the workflow, with or without input data.

Run data workflows

You can trigger a workflow run by invoking the API's /run endpoint, providing a workflowExternalId and version.

Authenticate a workflow

To start a workflow run, you need a nonce: a temporary token used for authentication when the workflow triggers the processes defined by its tasks. You retrieve a nonce from the Sessions API when you create a session. When you trigger a workflow with the Python SDK, the session is created behind the scenes.
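As a minimal sketch of what a run request could look like, the snippet below only builds the URL and JSON body and sends nothing. The host name, the exact URL layout, and the body field names (`authentication`, `nonce`, `input`, `metadata`) are assumptions for illustration, and `build_run_request` is a hypothetical helper, not part of the Cognite API or SDK. Check the API reference for the authoritative request shape.

```python
import json
from urllib.parse import quote


def build_run_request(base_url, project, workflow_external_id, version, nonce,
                      input_data=None, metadata=None):
    """Build (url, body) for a workflow run request. Nothing is sent.

    Hypothetical helper: the URL layout and field names are assumptions.
    """
    url = (
        f"{base_url.rstrip('/')}/api/v1/projects/{quote(project, safe='')}"
        f"/workflows/{quote(workflow_external_id, safe='')}"
        f"/versions/{quote(version, safe='')}/run"
    )
    # The nonce comes from a session created through the Sessions API.
    body = {"authentication": {"nonce": nonce}}
    if input_data is not None:
        body["input"] = input_data
    if metadata is not None:
        body["metadata"] = metadata
    return url, body


url, body = build_run_request(
    "https://example.cognitedata.com",  # placeholder host
    "my-project",
    "my-workflow",
    "1",
    nonce="<nonce-from-sessions-api>",  # placeholder, not a real token
    input_data={"myKey": "myValue"},
)
print(url)
print(json.dumps(body, indent=2))
```

Keeping the request construction separate from the HTTP call makes it easy to inspect or log the payload before sending it with the HTTP client of your choice.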

Input data and metadata

You can provide custom input data to the workflow run and refer to it in the workflow tasks using references. Input data is limited to 100 KB, so it isn't suited for passing large amounts of data to the workflow. For larger payloads, store the data in CDF and have the workflow tasks read from and write to the CDF data stores.
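A client-side check can catch oversized input before a run is triggered. The sketch below is hedged: it assumes that 100 KB means 102,400 bytes and that the limit applies to the UTF-8 encoded JSON, neither of which is confirmed here. `check_input_size` is a hypothetical helper.

```python
import json

# Assumption: "100KB" is read as 100 * 1024 bytes of UTF-8 encoded JSON.
MAX_INPUT_BYTES = 100 * 1024


def input_size_bytes(input_data):
    """Return the size in bytes of the compact JSON encoding of input_data."""
    encoded = json.dumps(input_data, separators=(",", ":"), ensure_ascii=False)
    return len(encoded.encode("utf-8"))


def check_input_size(input_data, limit=MAX_INPUT_BYTES):
    """Return the encoded size, or raise ValueError if it exceeds the limit."""
    size = input_size_bytes(input_data)
    if size > limit:
        raise ValueError(
            f"Workflow input is {size} bytes, above the assumed limit of {limit} bytes"
        )
    return size


# Small input that points at data stored elsewhere rather than embedding it.
print(check_input_size({"sourceTable": "my_table", "batchDate": "2024-01-01"}))
```

The usage line reflects the recommendation above: pass a small pointer to the data, such as a table name, and let the tasks read the data itself from CDF.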

A workflow run can also have custom, application-specific metadata.

Other concepts

Task input and output

Each task will receive the task definition as input during the workflow run. Depending on the type, a task returns a different set of outputs.

  • Transformation tasks return the jobId of the transformation job in CDF.

  • Function tasks return the callId, id, and response of the Function in CDF. See Functions for more information.

  • Simulation tasks return the runId, outputs, logs, and statusMessage of the simulation run in CDF.

  • CDF tasks return the statusCode and response (body).

Caution

Secrets as input to workflows and tasks

Some tasks in a workflow may require access to secrets or other confidential information, such as client credentials, during the workflow run. Data workflows currently do not support secure storage of secrets as input to workflows or tasks. Instead, store the secrets securely in Cognite Functions and use a Function task in the workflow. For more information, see Functions.

References

When specifying task properties in the workflow definition, you can use static values (strings) or references (expressions). A reference is an expression that dynamically injects input into a task during the workflow run. Use references to point to the workflow's input, or to the input or output of a previous task in the workflow.

Note: The injected value must be valid in the context of the parameter the reference is used for.

References must adhere to the format ${prefix.jsonPath}, a JSON Path preceded by a prefix. The valid prefixes are:

  • <taskExternalId>.output
  • <taskExternalId>.input
  • workflow.input

The jsonPath refers to the path of a key in the JSON object defined by the prefix.

For instance, a task output reference could look like ${myTaskExternalId.output.someKey}, and a workflow input reference could look like ${workflow.input.myKey}.
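To illustrate how a reference maps onto a value, here is a minimal sketch of a resolver. It is not CDF's implementation: it assumes the path is a plain dot-separated list of keys (no array indexes or other JSON Path features), it only replaces values that consist entirely of one reference, and the names `resolve`, `lookup`, and the `context` layout are hypothetical.

```python
import re

# Matches a string that is exactly one reference, such as "${workflow.input.myKey}".
REFERENCE = re.compile(r"\$\{([^{}]+)\}")


def lookup(context, path):
    """Walk a dot-separated path through nested dicts; raises KeyError if missing."""
    value = context
    for key in path.split("."):
        value = value[key]
    return value


def resolve(value, context):
    """Replace whole-string references in nested parameters with context values."""
    if isinstance(value, dict):
        return {k: resolve(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, context) for v in value]
    if isinstance(value, str):
        match = REFERENCE.fullmatch(value)
        if match:
            return lookup(context, match.group(1))
    return value  # static values pass through unchanged


# Assumed context layout: workflow input plus the input/output of earlier tasks.
context = {
    "workflow": {"input": {"myKey": "abc"}},
    "myTaskExternalId": {"input": {}, "output": {"someKey": 42}},
}
parameters = {
    "name": "${workflow.input.myKey}",
    "previousResult": "${myTaskExternalId.output.someKey}",
    "mode": "static-value",
}
print(resolve(parameters, context))
```

Because the sketch splits the path on dots, a task external ID that itself contains a dot would not resolve correctly here.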