Use data workflows

Data workflows using the API and Python SDK

To learn about and manage data workflows using the API, see the API specifications.

For more information about using data workflows with Python, see the Cognite Python SDK documentation.

Access capabilities

To add the necessary capabilities for data workflows, see assign capabilities.

You can assign a workflow to a data set when you create the workflow. All read and write operations on the workflow and its related resources (versions, executions, tasks, and triggers) then require access to that data set.

Run data workflows

You can trigger a workflow execution by invoking the API's /run endpoint, providing a workflowExternalId and version.

Authenticate a workflow

To start a workflow run, you need a nonce: a temporary token that the workflow uses to authenticate when it triggers the processes defined by its tasks. You retrieve a nonce from the Sessions API when you create a session. When you trigger a workflow with the Python SDK, the session is created for you behind the scenes.
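As a rough illustration of how the workflowExternalId, version, and nonce fit together in a run request, the sketch below builds the request path and body without sending anything. The helper name build_run_request, the path layout, and the body field names (authentication, input, metadata) are assumptions made for this example; check them against the API specifications before relying on them.

```python
import json
from urllib.parse import quote


def build_run_request(project, workflow_external_id, version, nonce,
                      input_data=None, metadata=None):
    """Return (path, body) for a workflow run request.

    Hypothetical helper: the path layout and field names are assumptions,
    not taken from the API specifications.
    """
    path = (
        f"/api/v1/projects/{quote(project, safe='')}"
        f"/workflows/{quote(workflow_external_id, safe='')}"
        f"/versions/{quote(version, safe='')}/run"
    )
    # The nonce comes from a session created with the Sessions API.
    body = {"authentication": {"nonce": nonce}}
    if input_data is not None:
        body["input"] = input_data
    if metadata is not None:
        body["metadata"] = metadata
    return path, body


path, body = build_run_request(
    project="my-project",
    workflow_external_id="my-workflow",
    version="1",
    nonce="<nonce-from-sessions-api>",
    input_data={"myKey": "value"},
)
print(path)
print(json.dumps(body, indent=2))
```

With the Python SDK this step is not needed, because the SDK creates the session and passes the nonce on your behalf.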

Input data and metadata

You can provide custom input data to a workflow execution and refer to it in the workflow's tasks using references. Input data is limited to 100 KB, so it isn't intended for passing large amounts of data to the workflow. To work with larger volumes, store the data in CDF and have the tasks in the workflow read from and write to the CDF data stores.
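One way to catch oversized input before starting a run is to measure its serialized size on the client side. The sketch below is an approximation under stated assumptions: it treats 100 KB as 100 × 1024 bytes and measures compact UTF-8 JSON, which may differ from how the service counts the limit. The names MAX_INPUT_BYTES and check_workflow_input are hypothetical.

```python
import json

# Assumption: "100 KB" is interpreted as 100 * 1024 bytes.
MAX_INPUT_BYTES = 100 * 1024


def check_workflow_input(input_data):
    """Return the serialized size in bytes, or raise if it exceeds the limit."""
    serialized = json.dumps(input_data, separators=(",", ":"))
    size = len(serialized.encode("utf-8"))
    if size > MAX_INPUT_BYTES:
        raise ValueError(
            f"Workflow input is {size} bytes, above the {MAX_INPUT_BYTES}-byte limit; "
            "store the data in CDF and pass a reference to it instead."
        )
    return size


small_input = {"myKey": "value"}
print(check_workflow_input(small_input))
```

If the check fails, a common pattern is to write the data to a CDF data store and pass only an identifier for it in the workflow input.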

A workflow execution can also have custom, application-specific metadata.

Other concepts

Task input and output

During execution, each task receives its task definition as input. The outputs a task returns depend on its type:

  • Transformation tasks return the jobId of the transformation job in CDF.

  • Function tasks return the callId, id, and response of the Function in CDF. See Functions for more information.

  • Simulation tasks return the runId, outputs, logs, and statusMessage of the simulation run in CDF.

  • CDF tasks return the statusCode and response (body).

Caution: Secrets as input to workflows and tasks

Some tasks in a workflow may require access to secrets or other confidential information, such as client credentials, during execution. Data workflows currently do not support secure storage of such secrets as input to workflows or tasks. Instead, store the secrets securely in Cognite Functions and use a Function task in the workflow to access them. For more information, see Functions.
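A minimal sketch of that pattern is shown below: a Cognite Function entry point that reads a secret stored with the Function, while the workflow passes only non-sensitive data to the Function task. It assumes the handle entry point accepts client, data, and secrets arguments; the secret name api-key and the target field are hypothetical.

```python
def handle(client, data, secrets):
    """Entry point of a hypothetical Function called from a Function task."""
    # The secret is stored with the Function, not in the workflow input.
    api_key = secrets["api-key"]  # hypothetical secret name

    # Non-sensitive parameters arrive through the task's data.
    target = data.get("target", "unknown")

    # ... use api_key here to authenticate against the external system ...

    # Never include the secret itself in the response, because the
    # response becomes part of the task output.
    return {"target": target, "authenticated": bool(api_key)}


# Local dry run with stand-in values; no CDF connection is needed.
result = handle(client=None, data={"target": "source-system"}, secrets={"api-key": "s3cr3t"})
print(result)
```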

References

When specifying task properties in the workflow definition, you can use static values (strings) or references (expressions). A reference is an expression that dynamically injects input into a task during execution. Use references to point to the workflow's input or to the input or output of a previous task in the workflow.

Note: The injected value must be valid in the context of the parameter the reference is used for.

References must adhere to the format ${prefix.jsonPath}, a JSON Path preceded by a prefix. The valid prefixes are:

  • <taskExternalId>.output
  • <taskExternalId>.input
  • workflow.input

The jsonPath refers to the path of a key in the JSON object defined by the prefix.

For instance, a task output reference could look like ${myTaskExternalId.output.someKey}, and a workflow input reference like ${workflow.input.myKey}.
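To make the reference format concrete, the sketch below resolves a ${prefix.jsonPath} expression against a dictionary of workflow and task data. It is an illustration only, not how the workflow service is implemented: the resolve function and the layout of the context dictionary are hypothetical, and only plain dotted keys are supported rather than full JSON Path syntax.

```python
import re

# Matches a value that consists of exactly one ${...} expression.
REFERENCE = re.compile(r"^\$\{([^{}]+)\}$")


def resolve(value, context):
    """Resolve one reference against context; return static values unchanged."""
    if not isinstance(value, str):
        return value
    match = REFERENCE.match(value)
    if match is None:
        return value  # a static value, not a reference

    parts = match.group(1).split(".")
    scope = parts[1] if len(parts) > 1 else None
    # Valid prefixes: <taskExternalId>.output, <taskExternalId>.input, workflow.input
    if scope not in ("input", "output") or (parts[0] == "workflow" and scope != "input"):
        raise ValueError(f"Invalid reference prefix in {value!r}")

    # Walk the dotted path, starting with the prefix itself.
    current = context
    for key in parts:
        current = current[key]
    return current


# Hypothetical execution state: the workflow input plus two finished tasks.
context = {
    "workflow": {"input": {"myKey": 42}},
    "myTaskExternalId": {"input": {}, "output": {"someKey": "abc"}},
    "myTransformation": {"input": {}, "output": {"jobId": 123}},
}

print(resolve("${workflow.input.myKey}", context))
print(resolve("${myTaskExternalId.output.someKey}", context))
print(resolve("${myTransformation.output.jobId}", context))
print(resolve("a static string", context))
```

The third call shows how a later task could pick up the jobId that a transformation task returns.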