Use data workflows
Data workflows using the API and Python SDK
To learn about and manage data workflows through the API, see API specifications.
For more information about using data workflows from Python, see the Cognite Python SDK documentation.
Access capabilities
To add the necessary capabilities for data workflows, see assign capabilities.
You can assign a workflow to a data set when you create the workflow. All read or write operations on the workflow or its related resources (versions, executions, tasks, and triggers) then require access to this data set.
Run data workflows
You can trigger a workflow execution by invoking the API's /run endpoint, providing a workflowExternalId and version.
Authenticate a workflow
To start a workflow run, you need a nonce: a temporary token used for authentication when the workflow triggers the processes defined by its tasks. You can retrieve a nonce from the Sessions API when you create a session. When you trigger a workflow with the Python SDK, the session is created behind the scenes.
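As a rough illustration of the pieces described above, the sketch below assembles the URL and JSON body for a call to the /run endpoint. The path layout, the body field names (authentication, nonce, input, metadata), the base URL, and the helper name build_run_request are assumptions made for this example; check them against the API specifications before relying on them. No request is sent.

```python
import json
from urllib.parse import quote


def build_run_request(base_url, workflow_external_id, version, nonce,
                      input_data=None, metadata=None):
    """Return (url, body) for a hypothetical call to the /run endpoint."""
    # Assumed path layout; confirm it against the API specifications.
    url = (
        f"{base_url.rstrip('/')}/workflows/"
        f"{quote(workflow_external_id, safe='')}/versions/"
        f"{quote(version, safe='')}/run"
    )
    # Assumed body shape: the nonce authenticates the run.
    body = {"authentication": {"nonce": nonce}}
    if input_data is not None:
        body["input"] = input_data
    if metadata is not None:
        body["metadata"] = metadata
    return url, body


# Placeholder values only; the nonce would come from the Sessions API.
url, body = build_run_request(
    "https://example.invalid/api/v1/projects/my-project",
    "my-workflow",
    "1",
    nonce="<nonce-from-sessions-api>",
    input_data={"myKey": "myValue"},
)
print(url)
print(json.dumps(body, indent=2))
```

The helper only builds the request, so you can inspect it or hand it to the HTTP client of your choice.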
Input data and metadata
You can provide custom input data to the workflow execution and refer to it in the workflow tasks using references. Input data is limited to 100KB in size, so it is not intended for passing large amounts of data to the workflow. To work with larger amounts of data, use the data stores in CDF and have the tasks in the workflow read from and write to them.
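A client-side check can catch oversized input before a run is triggered. The sketch below measures the serialized JSON size of the input. It assumes that "100KB" means 100 * 1024 bytes and that the limit applies to the UTF-8 encoded JSON; the service may count differently. The function names and the sample keys are hypothetical.

```python
import json

# Assumption: "100KB" is read as 100 * 1024 bytes of UTF-8 encoded JSON.
MAX_INPUT_BYTES = 100 * 1024


def input_size_bytes(input_data):
    """Size of the input when serialized as compact JSON."""
    compact = json.dumps(input_data, separators=(",", ":"))
    return len(compact.encode("utf-8"))


def check_workflow_input(input_data, limit=MAX_INPUT_BYTES):
    """Return the serialized size, or raise ValueError if it exceeds the limit."""
    size = input_size_bytes(input_data)
    if size > limit:
        raise ValueError(
            f"workflow input is {size} bytes, above the {limit}-byte limit; "
            "store the data in CDF and pass a pointer to it instead"
        )
    return size


# Pass a small pointer to the data rather than the data itself.
small_input = {"rawDatabase": "my_db", "rawTable": "my_table"}
print(check_workflow_input(small_input))
```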
A workflow execution can also have custom, application-specific metadata.
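With the Python SDK, input data and metadata can be passed when the run is triggered. The sketch below wraps a call to client.workflows.executions.run and assumes that this method accepts input and metadata keyword arguments, as in recent versions of the Cognite Python SDK; verify this against the SDK documentation for your version. The wrapper name run_workflow and the sample values are hypothetical.

```python
def run_workflow(client, workflow_external_id, version,
                 input_data=None, metadata=None):
    """Trigger a workflow execution through an already configured client.

    Assumes `client` is a cognite.client.CogniteClient and that
    executions.run takes `input` and `metadata` keyword arguments.
    """
    return client.workflows.executions.run(
        workflow_external_id,
        version,
        input=input_data,
        metadata=metadata,
    )


# Usage, given a configured CogniteClient named `client`:
#
# execution = run_workflow(
#     client,
#     "my-workflow",
#     "1",
#     input_data={"myKey": "myValue"},
#     metadata={"triggeredBy": "nightly-job"},
# )
```

Because the SDK creates the session behind the scenes, no nonce is passed here.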
Other concepts
Task input and output
Each task will receive the task definition as input during execution. Depending on the type, a task returns a different set of outputs.
- Transformation tasks return the jobId of the transformation job in CDF.
- Function tasks return the callId, id, and response of the Function in CDF. See Functions for more information.
- Simulation tasks return the runId, outputs, logs, and statusMessage of the simulation run in CDF.
- CDF tasks return the statusCode and response (body).
Secrets as input to workflows and tasks
Some tasks in a workflow may require access to secrets or other confidential information, such as client credentials, during execution. Data workflows currently do not support secure storage of such secrets as input to workflows or tasks. For such cases, the recommended solution is to leverage the capabilities for secure storage of secrets in Cognite Functions, combined with a Function task in the workflow. For more information, see Functions.
References
When specifying task properties in the workflow definition, you can use static values (strings) or references (expressions). A reference is an expression that dynamically injects input to a task during execution. Use references to refer to the workflow's input or to the input or output of a previous task in the workflow.
Note: The injected value must be valid in the context of the parameter the reference is used for.
References must adhere to the format ${prefix.jsonPath}, a JSON Path preceded by a prefix. The valid prefixes are:
- <taskExternalId>.output
- <taskExternalId>.input
- workflow.input
The jsonPath refers to the path of a key in the JSON object defined by the prefix.
For instance, a task output reference could look like ${myTaskExternalId.output.someKey}, or a workflow input reference like ${workflow.input.myKey}.
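To make the reference format concrete, the sketch below resolves references against a small in-memory context. It is a simplified stand-in for what happens during execution, not the service's implementation: it assumes the JSON path is a plain dot-separated list of keys, it only replaces strings that consist of a single reference, and the names resolve and lookup are hypothetical.

```python
import re

# Matches a string that is exactly one reference, such as ${workflow.input.myKey}.
REFERENCE = re.compile(r"\$\{([^}]+)\}")


def lookup(context, path):
    """Find the value for 'prefix.key1.key2' in a context keyed by prefix."""
    for prefix, data in context.items():
        if path == prefix:
            return data
        if path.startswith(prefix + "."):
            value = data
            # Simplification: treat the JSON path as dot-separated keys.
            for key in path[len(prefix) + 1:].split("."):
                value = value[key]
            return value
    raise KeyError(f"unknown reference prefix in: {path}")


def resolve(value, context):
    """Recursively replace whole-string references in task parameters."""
    if isinstance(value, dict):
        return {k: resolve(v, context) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, context) for v in value]
    if isinstance(value, str):
        match = REFERENCE.fullmatch(value)
        if match:
            return lookup(context, match.group(1))
    return value


# The context holds the workflow input and the output of an earlier task.
context = {
    "workflow.input": {"myKey": "hello"},
    "myTaskExternalId.output": {"someKey": 42},
}
parameters = {
    "greeting": "${workflow.input.myKey}",
    "values": ["${myTaskExternalId.output.someKey}", "static"],
}
print(resolve(parameters, context))
```

Static values pass through unchanged, and a resolved reference keeps the type of the value it points to, which is why the injected value must be valid for the parameter it is used in.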