YAML reference library
The YAML resource configuration files are core to the Cognite Toolkit. Each of the files configures one of the resource types that are supported by the Cognite Toolkit and the CDF API. This article describes how to configure the different resource types.
The Cognite Toolkit bundles logically connected resource configuration files in modules, and each module stores the configuration files in directories corresponding to the resource types, called resource directories. The available resource directories are:
./<module_name>/
├── 3dmodels/
├── auth/
├── classic/
├── data_models/
├── data_sets/
├── extraction_pipelines/
├── files/
├── functions/
├── hosted_extractors/
├── locations/
├── raw/
├── robotics/
├── timeseries/
├── transformations/
└── workflows/
Note that a resource directory can host one or more configuration types. For example, the data_models/
directory hosts
the configuration files for spaces, containers, views, data models, and nodes. While the classic/
directory hosts the
configuration files for labels, assets, and sequences.
When you deploy, the Cognite Toolkit uses the CDF API to implement the YAML configurations in the CDF project.
In general, the format of the YAML files matches the API specification for the resource types. We recommend
that you use the externalId
of the resources as (part of) the name of the YAML file. This is to enable using the same resource
configuration across multiple CDF projects, for example, a development, staging and production project. Use number prefixes
(1.<filename.suffix>) to control the order of deployment within each resource type.
3D Models
Resource directory: 3dmodels/
Requires Cognite Toolkit v0.3.0 or later
API documentation: 3D models
3D model configurations are stored in the module's 3dmodels/ directory. You can have one or more 3D models in a single YAML file.
The filename must end with 3DModel
, for example, my_3d_model.3DModel.yaml
.
Example 3D model configuration:
name: my_3d_model
dataSetExternalId: ds_3d_models
metadata:
origin: cdf-toolkit
Assets
Resource directory: classic/
Requires Cognite Toolkit v0.3.0 or later
API documentation: Assets
Asset configurations are stored in the module's assets/ directory. You can have one or more asset in a single YAML file.
The filename must end with Asset
, for example, my_asset.Asset.yaml
.
Example Asset configuration:
externalId: my_root_asset
name: SAP hierarchy
description: The root asset in the SAP hierarchy
dataSetExternalId: ds_sap_assets
source: SAP
metadata:
origin: cdf-toolkit
The asset API uses an internal ID for the data set, while the YAML configuration files reference the external ID; dataSetExternalId
. The Cognite Toolkit resolves the external ID to the internal ID before sending the request to the CDF API.
Table formats
In addition, to yaml
Cognite Toolkit also supports csv
and parquet
formats for asset configurations. As with the yaml
format,
the filename must end with Asset
, for example, my_asset.Asset.csv
.
externalId,name,description,dataSetExternalId,source,metadata.origin
my_root_asset,SAP hierarchy,The root asset in the SAP hierarchy,ds_sap_assets,SAP,cdf-toolkit
Note that the column names must match the field names in the yaml
configuration. The exception is the metadata
field,
which is a dictionary in the yaml
configuration, but a string in the csv
configuration. This is solved by using the
notation metadata.origin
column in the csv
configuration.
Groups
Resource directory: auth/
API documentation: Groups
The group configuration files are stored in the module's auth/ directory. Each group has a separate YAML file.
The name
field is used as a unique identifier for the group. If you change the name of the group manually in CDF,
it will be treated as a different group and will be ignored by the Cognite Toolkit.
Example group configuration:
name: 'my_group'
sourceId: '{{mygroup_source_id}}'
metadata:
origin: 'cognite-toolkit'
capabilities:
- projectsAcl:
actions:
- LIST
- READ
scope:
all: {}
We recommend using the metadata:origin
property to indicate that the group is created by the Cognite Toolkit
Populate the sourceId
with the group ID from CDF project's identity provider.
You can specify each ACL capability in CDF as in the projectsAcl
example above. Scoping to dataset, space, RAW table,
current user, or pipeline is also supported (see ACL scoping).
Groups and group deletion
If you delete groups with the cdf clean
or cdf deploy --drop
command, the Cognite Toolkit skips the groups that the running
user or service account is a member of. This prevents the cleaning operation from removing access rights from
the running user and potentially locking the user out from further operation.
ACL scoping
Dataset scope
Use to restrict access to data in a specific data set.
- threedAcl:
actions:
- READ
scope:
datasetScope: { ids: ['my_dataset'] }
The groups API uses an internal ID for the data set, while the YAML configuration files reference the external ID. The Cognite Toolkit resolves the external ID to the internal ID before sending the request to the CDF API.
Space scope
Use to restrict access to data in a data model space.
- dataModelInstancesAcl:
actions:
- READ
scope:
spaceIdScope: { spaceIds: ['my_space'] }
Table scope
Use to restrict access to a database or a table in a database.
- rawAcl:
actions:
- READ
- WRITE
scope:
tableScope:
dbsToTables:
my_database:
tables: []
Current user-scope
Use to restrict actions to the groups the user is a member of.
- groupsAcl:
actions:
- LIST
- READ
scope:
currentuserscope: {}
Security categories
Requires Cognite Toolkit v0.2.0 or later
Resource directory: auth/
API documentation: Security categories
The security categories are stored in the module's auth/ directory. You can have one or more security categories in a single YAML file.
We recommend that you start security category names with sc_
and use _
to separate words. The file name is not significant,
but we recommend that you name it after the security categories it creates.
- name: sc_my_security_category
- name: sc_my_other_security_category
Data models
Resource directory: data_models/
API documentation: Data modeling
The data model configurations are stored in the module's data_models directory.
A data model consists of a set of data modeling entities: one or more spaces, containers, views, and data models. Each entity has its own file with a suffix to indicate the entity type: my.space.yaml, my.container.yaml, my.view.yaml, my.datamodel.yaml.
You can also use the Cognite Toolkit to create nodes to keep configuration for applications (for instance, InField) and to create node types that are part of the data model. Define nodes in files with the .node.yaml suffix.
The Cognite Toolkit applies configurations in the order of dependencies between the entity types: first spaces, next containers, then views, and finally data models.
If there are dependencies between the entities of the same type, you can use prefix numbers in the filename to have the Cognite Toolkit apply the files in the correct order.
The Cognite Toolkit supports using subdirectories to organize the files, for example:
data_models/
┣ 📂 containers/
┣ 📂 views/
┣ 📂 nodes/
┣ 📜 my_data_model.datamodel.yaml
┗ 📜 data_model_space.space.yaml
Spaces
API documentation: Spaces
Spaces are the top-level entities and is the home of containers, views, and data models. You can create a space with a .space.yaml file in the data_models/ directory.
space: sp_cognite_app_data
name: cognite:app:data
description: Space for InField app data
CDF doesn't allow a space to be deleted unless it's empty. If a space contains, for example, nodes that aren't governed by the Cognite Toolkit, the Cognite Toolkit will not delete the space.
Containers
API documentation: Containers
Containers are the home of properties and data. You can create a container with a .container.yaml file in the data_models/ directory. You can also create indexes and constraints according to the API specification.
externalId: MyActivity
usedFor: node
space: sp_activity_data
properties:
id:
type:
type: text
list: false
collation: ucs_basic
nullable: true
title:
type:
type: text
list: false
collation: ucs_basic
nullable: true
description:
type:
type: text
list: false
collation: ucs_basic
nullable: true
The example container definition creates a container with three properties: id
, title
, and description
.
Note that sp_activity_data
requires its own activity_data.space.yaml file in the data_models/ directory.
Views
API documentation: Views
Use views to ingest, query, and structure the data into meaningful entities in your data model. You can create a view with a .view.yaml file in the data_models/ directory.
externalId: MyActivity
name: MyActivity
description: 'An activity represents a set of maintenance tasks with multiple operations for individual assets. The activity is considered incomplete until all its operations are finished.'
version: '3'
space: sp_activity_model
properties:
id:
description: 'Unique identifier from the source, for instance, an object ID in SAP.'
container:
type: container
space: sp_activity_data
externalId: MyActivity
containerPropertyIdentifier: id
title:
description: 'A title or brief description of the maintenance activity or work order.'
container:
type: container
space: sp_activity_data
externalId: MyActivity
containerPropertyIdentifier: title
description:
description: 'A detailed description of the maintenance activity or work order.'
container:
type: container
space: sp_activity_data
externalId: MyActivity
containerPropertyIdentifier: description
This example view configuration creates a view with three properties: id
, title
, and description
.
The view references the properties from the container MyActivity
in the sp_activity_data
space. The view
exists in a space called sp_activity_model
, while the container exists in the sp_activity_data
space.
Data models
API documentation: Data models
Use data models to structure the data into knowledge graphs with relationships between views using edges. From an implementation perspective, a data model is a collection of views.
You can create a data model with a .datamodel.yaml file in the data_models/ directory.
externalId: ActivityDataModel
name: My activity data model
version: '1'
space: sp_activity_model
description: 'A data model for structuring and querying activity data.'
views:
- type: view
externalId: MyActivity
space: sp_activity_model
version: '3'
- type: view
externalId: MyTask
space: sp_activity_model
version: '2'