Resource naming reference

Consistent naming makes Cognite Data Fusion (CDF) resources easier to find, govern, and automate. This reference defines Cognite’s recommended conventions for external IDs (machine-readable identifiers) and display names (human-readable labels) across building blocks, core resource types, and data modeling resources. Use it when you create resources through the CDF API, SDKs, the CDF portal application, or deployment tooling such as the Cognite Toolkit. This reference starts with the three patterns that cover every resource, then gives you resource-specific tables to use as a lookup.

These are Cognite’s recommended naming conventions and provide a consistent baseline. Actual conventions can vary based on customer-specific requirements and existing standards. Where a customer or project already has documented naming conventions, follow those and record any deviations in project governance documentation.

Scope and prerequisites

This reference governs naming for artifacts created during a CDF deployment — both resource definitions (schema) and instances (data). It covers three resource families:

CDF building blocks — extraction pipelines, hosted extractors, transformations, functions, data workflows, CDF RAW databases, data sets, entity matching, access groups, service principals, and related deployment resources
Core resource types — time series, streams, records, 3D models and revisions, 3D scenes, location filters, and files
Data modeling resources — data models, spaces, views, containers, edges, instances, and properties

Out of scope:

Legacy CDF asset-centric data model types unless mapped to the new framework
Code repository directory structure and Cognite Toolkit module names (governed by separate standards)

You need administrative or write access to your CDF deployment to create and manage resources. The automation example later in this reference assumes familiarity with Cognite Toolkit configuration and variable substitution. A git-based workflow is recommended when you manage resource definitions as code.

The three naming patterns

Every CDF external ID follows one of three patterns. Identify which pattern your resource uses, then look up its exact token order in the resource tables further down.

Pattern	Shape	Example	Used by
Prefixed	`prefix_token_token_token`	`ep_files_valhall_sharepoint`	Most building blocks, core resources
Source-to-target	`prefix_source_location_to_target`	`tr_sap_valhall_to_cdm`	Transformations, mapping workflows
Persona-led	`persona_[scope_]type_environment`	`producer_pp_dev`	Access groups, service principals

A name is an ordered sequence of tokens — short codes such as valhall, sap, or files — running from general to specific. The prefix tells you the resource type (ep = extraction pipeline); the remaining tokens add location, source, and intent. To name a resource:

Find its row in the relevant resource table.
Assemble the external ID from approved tokens, in the order the grammar column specifies.
Expand those tokens into a sentence-case display name (ep_files_valhall_sharepoint → Valhall SharePoint file extraction).

Tokens come from your approved list, a maintained, machine-readable set of valid codes for locations, sources, personas, types, and segments (one enum per token type; see Token check in CI/CD, layer 2b). Agreeing on this list up front prevents variation sprawl (ny, newyork, and nyc all meaning the same site). If an artifact applies to all locations, use all in the location slot (for example, ep_asset_all_sap) to keep parsing consistent.

Rules

These rules apply across all three patterns. Resource families that deviate (data models, properties) note the deviation in their own tables.

Rule	External ID	Display name
Charset	Lowercase letters, digits, and underscore (`a-z`, `0-9`, `_`) only. Variable-name-safe.	English words, space-separated.
Casing	`snake_case`, unless a resource family specifies otherwise (PascalCase for data models and views; `camelCase` for properties; kebab-case for CDF projects).	Sentence case (for example, Valhall SAP maintenance extraction).
Language	English, singular nouns, present-tense verbs.	Same as external ID.
Separators	Underscore (`_`) between tokens. The `_to_` connector is the only allowed connector word, used only in source-to-target resources.	Spaces between words. Do not use colons, hyphens, or underscores as token separators.
Length	Maximum 255 characters (external IDs). No null bytes.	Same 255-character limit where the API enforces it.
Environment	Environment-agnostic. Use the same ID in dev, test, and prod.	Same.
Tokens	Use approved values from your list. Reuse the same token for the same thing everywhere (`description`, never `desc` then `comment`).	Expand tokens into readable phrases.

Environment tokens (dev, test, prod) appear only in CDF project resource names (am-long-eu-no-oslo-dev) and persona-led names (producer_pp_dev). Never embed them in building-block external IDs such as extraction pipelines, transformations, or data sets.

Anti-patterns to avoid

Ambiguity and generic naming: Avoid my_script, test_pipeline, data_loader, or transformation_1. These provide no context for scope, ownership, or function.
Casing chaos: Identifiers that differ only by capitalization are prohibited (pumpid vs PumpID causes silent data loss).
Unsafe characters: Do not use spaces, special characters (&, %, /, \), or non-ASCII characters in external IDs.
Implicit context: Do not omit location or source because “everyone knows the deployment.” A pipeline named sap_assets collides when a second SAP site is added.
Hardcoded randomness: Random GUIDs sacrifice debuggability. Prefer semantic naming for operational awareness.
Environment in resource IDs: Do not embed dev, test, or prod in building-block external IDs.
Type prefix on instances: Data instances take no type prefix. A pump instance is P-101, not asset_P-101.
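
The casing and unsafe-character anti-patterns can be detected mechanically. The sketch below is a minimal illustration, assuming external IDs are available as a plain list of strings; the function names `find_case_collisions` and `has_unsafe_characters` are hypothetical helpers, not part of any Cognite SDK. The charset check applies only to resource families that use snake_case.

```python
import re
from collections import defaultdict

# Charset from the Rules table: lowercase letters, digits, underscore.
SAFE_EXTERNAL_ID = re.compile(r"^[a-z0-9_]+$")


def find_case_collisions(external_ids):
    """Group identifiers that differ only by capitalization."""
    groups = defaultdict(set)
    for external_id in external_ids:
        groups[external_id.lower()].add(external_id)
    # Keep only the keys that more than one spelling maps to.
    return {key: sorted(ids) for key, ids in groups.items() if len(ids) > 1}


def has_unsafe_characters(external_id):
    """Return True when the ID falls outside the snake_case charset."""
    return SAFE_EXTERNAL_ID.fullmatch(external_id) is None
```

Running `find_case_collisions` over all external IDs in a deployment before a merge would surface pairs such as pumpid and PumpID early.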

If your deployment already uses a documented convention that differs from this guide, follow that convention and record the deviation in project governance documentation. Treat deviations as informational, not errors.

CDF building blocks

Building blocks are operational resources you configure to ingest, transform, and orchestrate data. They use type prefixes on external IDs for filtering and sorting in the CDF portal application and API. Access groups and service principals are exceptions — they use the persona-led pattern. Display names do not include the type prefix.

Building block	Prefix	Token grammar	Example ID	Example name	Casing	Regex
CDF project	None	`{enterprise}-{segment}-{region}-{country}-{site}-{env}`	`am-long-eu-no-oslo-dev`	am-long-eu-no-oslo-dev	kebab-case	`^[a-z0-9]+(-[a-z0-9]+)*-(dev|test|prod)$`
CDF organization	None	`{enterprise}`	`cognite`	Cognite	kebab-case	`^[a-z0-9]+(-[a-z0-9]+)*$`
Extraction config	`ec`	`ec_{source}_{intent}`	`ec_sap_maintenance`	SAP maintenance extraction config	snake_case	`^ec_[a-z0-9]+_[a-z0-9_]+$`
Extraction pipeline	`ep`	`ep_{data_type}_{location}_{source}`	`ep_files_valhall_sharepoint`	Valhall SharePoint file extraction	snake_case	`^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Hosted extractor	`he`	`he_{data_type}_{location}_{messaging}_{source}_{suffix}`	`he_timeseries_valhall_kafka_pi_source`	Valhall Kafka PI time series source	snake_case	`^he_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_(source|job|destination|mapping)$`
Raw database	`raw`	`raw_{data_type}_{location}_{source}`	`raw_asset_oid_workmate`	Asset OID workmate raw database	snake_case	`^raw_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Transformation	`tr`	`tr_{source}_{location}_to_{target}`	`tr_sap_valhall_to_cdm`	SAP Valhall to CDM	snake_case	`^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$`
Function	`fn`	`fn_{source}_{location}_{intent}_{target}`	`fn_timeseries_valhall_anomaly_alerts`	Valhall time series anomaly alerts	snake_case	`^fn_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Data workflow	`wf`	Follows the pattern that fits: `wf_{source}_{location}_to_{target}` (mapping) or `wf_{location}_{intent}` (action)	`wf_valhall_calculate_asset_downtime`	Valhall calculate asset downtime	snake_case	`^wf_[a-z0-9]+_[a-z0-9_]+$`
Data set	`ds`	`ds_{data_type}_{location}`	`ds_files_valhall`	Valhall files data set	snake_case	`^ds_[a-z0-9]+_[a-z0-9_]+$`
Entity matching	`em`	`em_{source}_{location}`	`em_sap_pi_valhall`	SAP PI Valhall entity matching	snake_case	`^em_[a-z0-9_]+_[a-z0-9_]+$`
Location filter	`loc`	`loc_{location}`	`loc_valhall`	Valhall location filter	snake_case	`^loc_[a-z0-9_]+$`
Atlas AI agent	`agent`	`agent_{domain}_{intent}`	`agent_maintenance_anomaly_detection`	Maintenance anomaly detection agent	snake_case	`^agent_[a-z0-9]+_[a-z0-9_]+$`
Processing pipeline (access scope)	None (access-scope token)	`pp` is a scope token used inside access group and service principal names, not a resource external ID prefix	`producer_pp_dev`	Producer processing pipeline dev	N/A	N/A
Access group	None	Persona-led with required `type`; see Access management	`producer_pp_dev`	Producer processing pipeline dev	snake_case	`^(producer|consumer|admin)(_[a-z0-9]+)+_(dev|test|prod)$`
Service principal	None	Persona-led with required `type`; see Access management	`producer_valhall_ep_pi_dev`	Producer Valhall extraction pipeline PI dev	snake_case	`^(producer|consumer|admin)(_[a-z0-9]+)+_(dev|test|prod)$`
Charts / Canvases / Flows / Streamlit	None	Name only	N/A	Maintenance overview dashboard	Sentence case	N/A

Core resource types

Core resources store and expose industrial data — often ingested from source systems. The naming strategy prioritizes traceability.

Resource type	Token grammar	Example ID	Example name	Casing	Regex (charset gate)
Time series	`{source}_{location}_{original_id}` or preserve source ID	`pi_valhall_2342`	PI Valhall 2342	snake_case	`^[a-z0-9_]+$`
Stream	`{source}_{location}_{concept}`	`kafka_valhall_pi_pressure`	Kafka Valhall PI pressure	snake_case	`^[a-z0-9_]+$`
Record	`{source}_{location}_{machine}_{id}`	`sap_valhall_rec_001`	SAP Valhall rec 001	snake_case	`^[a-z0-9_]+$`
3D model	N/A (system-generated)	N/A	`ValhallModel`	PascalCase	N/A
3D scene	`{tokens}`	`scene_deck_a`	Scene deck A	snake_case	`^[a-z0-9_]+$`
Location filter	`loc_{location}`	`loc_valhall`	Valhall location filter	snake_case	`^loc_[a-z0-9_]+$`
File	Preserve source ID or `{source}_{location}_{original_id}`	Source-defined	Source-defined	snake_case when generated	`^[a-z0-9_]+$` when generated

Source ID preservation (time series and files). Time series and files often arrive with IDs from source systems. Decide as follows:

Preserve as-is when the source ID is already variable-name-safe (alphanumeric and underscore only) and unique within the deployment.
Prefix for uniqueness when the bare source ID could collide across sites or systems: {source}_{location}_{original_id} (for example, pi_valhall_2342).
Sanitize only when required — replace unsafe characters (spaces, /, %, non-ASCII) with underscore, trim leading and trailing underscores, and record the mapping in the resource metadata when traceability matters.

Prefer traceability over strict conformance when sanitization would break a known source-system reference.
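
The three-step decision above can be sketched as a small helper. This is an illustration under stated assumptions, not a Cognite API: the function name `resolve_external_id` and the `unique_in_deployment` flag are hypothetical, and the caller is assumed to know whether the bare source ID could collide across sites or systems.

```python
import re


def resolve_external_id(original_id, source, location, unique_in_deployment):
    """Return (external_id, was_sanitized) for a source-system ID."""
    # Sanitize: replace every character outside A-Z, a-z, 0-9, and "_"
    # with an underscore, then trim leading and trailing underscores.
    sanitized = re.sub(r"[^A-Za-z0-9_]", "_", original_id).strip("_")
    was_sanitized = sanitized != original_id
    # Preserve: a safe ID that is already unique is returned unchanged.
    if unique_in_deployment:
        return sanitized, was_sanitized
    # Prefix: add source and location tokens when collisions are possible.
    return f"{source}_{location}_{sanitized}", was_sanitized
```

When `was_sanitized` is true, the caller can store the original ID in the resource metadata so the mapping back to the source system stays traceable.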

Data modeling resources

Data modeling resources define the schema and instances in your knowledge graph. They use strict capitalization and layer suffixes to distinguish source, domain, and solution models. See Designing scalable data models for the layered architecture these abbreviations map to.

Resource type	Token grammar	Example ID	Example name	Casing	Regex (charset gate)
Data model	`{PascalCaseTokens}_{SRC|DOM|SOL}`	`MaintenanceManagement_DOM`	Maintenance management DOM	PascalCase + layer suffix	`^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*_(SRC|DOM|SOL)$`
DM space	`dm_{src|dom|sol}_{tokens}`	`dm_dom_maintenance_management`	Maintenance management DM DOM	snake_case	`^dm_(src|dom|sol)_[a-z0-9_]+$`
Instance space	`inst_{tokens}`	`inst_oid`	OID instance space	snake_case	`^inst_[a-z0-9_]+$`
View	`{PascalCaseTokens}`	`FailureCause`	Failure cause	PascalCase	`^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*$`
Container	`{PascalCaseTokens}`	`PumpContainer`	Pump container	PascalCase	Same as view
Edge	`{PascalCaseTokens}`	`PumpEdge`	Pump edge	PascalCase	Same as view
Instance	`{tokens}` (no type prefix)	`pump_101`	N/A	snake_case	`^[a-z0-9_]+$`
Property	`{camelCaseTokens}`	`flowRate`	flowRate	camelCase	`^[a-z][a-zA-Z0-9]*$`

Property descriptions should follow AI-friendly description practices. Layer abbreviations: src (source), dom (domain/enterprise), sol (solution), inst (instances). The dm_ prefix identifies a space; the data model itself carries the layer as a _SRC/_DOM/_SOL suffix, not a prefix. See spaces and instances.
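
The casing rules for data modeling resources can be exercised with a short check. The sketch below is illustrative only: the PascalCase patterns assume a `*` quantifier after each character class so that multi-character segments such as MaintenanceManagement match, and the `PATTERNS` keys and the `is_valid` function are hypothetical names.

```python
import re

# Patterns per resource type; alternation is written as a plain "|".
PATTERNS = {
    "data_model": re.compile(
        r"^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*_(SRC|DOM|SOL)$"
    ),
    "view": re.compile(r"^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*$"),
    "dm_space": re.compile(r"^dm_(src|dom|sol)_[a-z0-9_]+$"),
    "instance_space": re.compile(r"^inst_[a-z0-9_]+$"),
    "property": re.compile(r"^[a-z][a-zA-Z0-9]*$"),
}


def is_valid(resource_type, external_id):
    """Return True when the external ID matches its family's pattern."""
    return PATTERNS[resource_type].fullmatch(external_id) is not None
```

For example, `is_valid("data_model", "MaintenanceManagement_DOM")` passes, while the same name without a layer suffix fails.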

Access management

Access groups and service principals use the persona-led pattern.

`{persona}_[{segment}_][{region}_][{country}_][{site}_]{type}_{environment}`

Token	Required	Values	Example
`persona`	Yes	`producer`, `consumer`, `admin`	`producer`
`segment`	No	Project-specific	`long`
`region`	No	Geographic region	`eu`
`country`	No	ISO 3166 code	`no`
`site`	No	Site or asset code	`valhall`
`type`	Yes	Building-block or scope token; use `all` for broad grants	`ep`, `pp`, `ep_pi`, `all`
`environment`	Yes	`dev`, `test`, `prod`	`dev`

The examples below show access group and service principal names built from these tokens.

Example	Resource	Meaning
`producer_pp_dev`	Access group	Producer access to processing pipelines in dev
`consumer_valhall_all_prod`	Access group	Consumer access to everything at Valhall in production
`admin_all_prod`	Access group	Admin access to everything in production
`producer_eu_no_valhall_ep_dev`	Access group	Producer access to extraction pipelines at Valhall, Norway, EU region, dev
`producer_valhall_ep_pi_dev`	Service principal	Service principal for a PI extraction pipeline at Valhall
`producer_oid_pp_dev`	Service principal	Service principal for processing pipelines at OID in dev
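
A persona-led name can be assembled from the tokens in the table above. The following is a minimal sketch, assuming the persona and environment values listed there; `build_access_name` is a hypothetical helper, not a Cognite SDK function.

```python
PERSONAS = ("producer", "consumer", "admin")
ENVIRONMENTS = ("dev", "test", "prod")


def build_access_name(persona, type_token, environment,
                      segment=None, region=None, country=None, site=None):
    """Join required and optional tokens in the persona-led order."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona}")
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    if not type_token:
        raise ValueError("type token is required; use 'all' for broad grants")
    # Optional tokens keep their fixed order; missing ones are skipped.
    optional = [token for token in (segment, region, country, site) if token]
    return "_".join([persona, *optional, type_token, environment])
```

Calling `build_access_name("producer", "ep", "dev", region="eu", country="no", site="valhall")` yields `producer_eu_no_valhall_ep_dev`, matching the access group example in the table.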

Apply the same pattern to identity provider (IdP) access groups and CDF access groups (CDF groups mirror the IdP name).

Alternative schemes — ACL-based, data-domain-based, or source-system-based — are valid when they better match your identity provider. Document the chosen scheme in project governance docs.

Security categories: If you use security categories for fine-grained access on time series and files, name them with an sc_ prefix mirroring the persona-led pattern (for example, sc_producer_pp_dev). Security categories apply only to time series and files — not to data modeling instances linked via instanceId. See Security categories.

Complete naming example (Open Industrial Data)

The example below shows a complete naming layout for a CDF deployment based on Open Industrial Data (OID), using oid as the location token.

CDF project resource: oid-dev
  Data set: ds_asset_oid — name: Asset OID
    Extraction pipeline: ep_asset_oid_workmate — name: Asset OID workmate extraction
    CDF RAW database: raw_asset_oid_workmate — table: assets
    Transformation: tr_workmate_oid_to_asset_hierarchy — name: Workmate OID to asset hierarchy
    Access groups:
      producer_oid_ep_dev
      producer_oid_pp_dev
      consumer_oid_all_dev

  Data set: ds_files_oid — name: Files OID
    Extraction pipeline: ep_files_oid_fileshare — name: Files OID fileshare extraction
    Transformation: tr_fileshare_oid_to_file_metadata — name: Fileshare OID to file metadata
    Function: fn_fileshare_oid_annotation_alerts — name: Fileshare OID annotation alerts
    Access groups:
      producer_oid_ep_dev
      producer_oid_pp_dev
      consumer_oid_all_dev
    Service principal: producer_oid_ep_fileshare_dev

  Data set: ds_timeseries_oid — name: Time series OID
    ...

  Instance space: inst_oid — name: OID instance space

Key conventions in this example:

Building-block external IDs use type prefixes (ep_, tr_, fn_, ds_) and the _to_ connector for transformations.
Access groups use the persona-led pattern with a required type token (ep, pp, or all for broad grants).
Service principals use the same pattern with a more specific type token (ep_fileshare).
Display names use sentence case with spaces — no colon separators.

Validation

Naming compliance works in four complementary layers — one generates correct names, two catch incorrect ones, and one applies human judgment:

Layer	When it acts	What it does	Catches
1: Cognite Toolkit / scripted generation	Author time	Assembles external IDs from shared token variables, so names are correct by construction	Manual typos, separator drift, and inconsistent tokens, before they reach a commit
2a: CI/CD shape regex (hard gate)	Merge time	Scans `externalId` fields and rejects anything failing the prefix/charset/casing regex	Wrong prefix, illegal characters, bad casing
2b: CI/CD token check (hard gate)	Merge time	Tokenizes each `externalId` by its resource grammar and asserts every token exists in the approved-list enum, in the right slot and count	Unapproved tokens (`newyork` vs `nyc`), wrong token order, wrong token count
3: Approved-list review (soft gate)	Code review	Reviewer judges display-name quality and whether a new token deserves to be added to the enum	Naming that is valid but unclear; proposals for new approved tokens

Generate names with the Cognite Toolkit (layer 1)

If you manage resources as code, the Cognite Toolkit produces correct external IDs by construction through variable substitution. Define shared tokens once in config.yaml and reference them in resource YAML files to assemble every name from the same approved values:

# config.yaml
variables:
  location: valhall
  source: sap
  prefix_ep: ep
  prefix_db: raw

The same approach works with SDK scripts or API calls — centralize token values and assemble external IDs programmatically. This eliminates most manual errors, but names that bypass the Cognite Toolkit (hand-edited YAML, ad-hoc API calls) still need the gates below.
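
For the scripted route, a minimal sketch of centralizing token values might look like the following. It mirrors the variables from the config.yaml snippet above as a plain dictionary; the helper names `extraction_pipeline_id` and `raw_database_id` are hypothetical, and the token order follows the grammar column in the building-block table.

```python
# Shared tokens, defined once and reused by every name builder.
TOKENS = {
    "location": "valhall",
    "source": "sap",
    "prefix_ep": "ep",
    "prefix_db": "raw",
}


def extraction_pipeline_id(data_type, tokens=TOKENS):
    """Assemble ep_{data_type}_{location}_{source}."""
    return "_".join(
        [tokens["prefix_ep"], data_type, tokens["location"], tokens["source"]]
    )


def raw_database_id(data_type, tokens=TOKENS):
    """Assemble raw_{data_type}_{location}_{source}."""
    return "_".join(
        [tokens["prefix_db"], data_type, tokens["location"], tokens["source"]]
    )
```

Because both builders read from the same dictionary, changing the location token in one place updates every generated external ID consistently.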

Shape regex in CI/CD (layer 2a)

The regex column in each resource table validates prefix, charset, and casing only. It deliberately does not validate token values, order, or count — a trailing [a-z0-9_]+ group accepts any underscore-separated tokens. Treat regex as a fast structural filter, not a correctness guarantee. Implement it in CI by scanning resource configuration files for externalId fields and rejecting pull requests that fail the hard-gate regex.

Resource family	Shape regex (hard gate)
Extraction pipeline	`^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Hosted extractor	`^he_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_(source|job|destination|mapping)$`
Transformation	`^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$`
Function	`^fn_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Data workflow	`^wf_[a-z0-9]+_[a-z0-9_]+$`
Data set	`^ds_[a-z0-9]+_[a-z0-9_]+$`
Entity matching	`^em_[a-z0-9_]+_[a-z0-9_]+$`
Raw database	`^raw_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$`
Access group / service principal	`^(producer|consumer|admin)(_[a-z0-9]+)+_(dev|test|prod)$`
DM space	`^dm_(src|dom|sol)_[a-z0-9_]+$`
Instance space	`^inst_[a-z0-9_]+$`
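
A shape gate of this kind could be sketched as follows. The example assumes an earlier CI step has already extracted `(family, externalId)` pairs from the resource configuration files, so file scanning and YAML parsing are left out. The family keys and the `find_shape_violations` function are hypothetical names, and only a subset of the table is included, with alternation written as a plain `|`.

```python
import re

SHAPE_REGEX = {
    "extraction_pipeline": re.compile(r"^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$"),
    "transformation": re.compile(r"^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$"),
    "data_set": re.compile(r"^ds_[a-z0-9]+_[a-z0-9_]+$"),
    "access_group": re.compile(
        r"^(producer|consumer|admin)(_[a-z0-9]+)+_(dev|test|prod)$"
    ),
}


def find_shape_violations(resources):
    """Return the (family, external_id) pairs that fail their shape regex."""
    violations = []
    for family, external_id in resources:
        pattern = SHAPE_REGEX.get(family)
        # Families without a registered regex are skipped, not failed.
        if pattern is not None and pattern.fullmatch(external_id) is None:
            violations.append((family, external_id))
    return violations
```

A CI job would fail the pull request whenever the returned list is non-empty. Note that an ID such as ep_a_b_c_d still passes this gate, which is why the token check exists as a separate layer.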

Token check in CI/CD (layer 2b)

This is a recommended capability. The rules below describe what a token-validation step should assert; your team implements it as a small CI script. Until then, these checks fall back to review (layer 3).

Shape regex alone lets ep_files_newyork_sharepoint (unapproved location), ep_sharepoint_files_valhall (wrong order), and ep_a_b_c_d (wrong count) pass. The token check closes that gap by validating against the approved-list enum rather than a pattern. Maintain the approved list as machine-readable enums, one entry per token type, as the single source of truth for authors and CI:

# approved_tokens.yaml
location:  [valhall, oid, all]
source:    [sap, pi, sharepoint, fileshare, workmate]
persona:   [producer, consumer, admin]
type:      [ep, pp, ep_pi, ep_fileshare, all]
segment:   [long, short]
data_type: [files, asset, timeseries]

A token check validates each externalId against its resource grammar: every token must exist in its type’s enum, and the token count must match the grammar. Any miss fails the build. New tokens are added only by a reviewed change to approved_tokens.yaml.
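
One possible shape for such a token check is sketched below. It hardcodes a subset of the approved list rather than loading approved_tokens.yaml, and it assumes every token in these grammars is a single word with no internal underscore, so splitting on `_` is enough. The `GRAMMARS` mapping and the `check_tokens` function are hypothetical.

```python
# Subset of the approved list, one set per token type.
APPROVED = {
    "location": {"valhall", "oid", "all"},
    "source": {"sap", "pi", "sharepoint", "fileshare", "workmate"},
    "data_type": {"files", "asset", "timeseries"},
}

# Token slots per prefix, in the order given by the resource tables.
GRAMMARS = {
    "ep": ["data_type", "location", "source"],
    "raw": ["data_type", "location", "source"],
    "ds": ["data_type", "location"],
}


def check_tokens(external_id):
    """Return a list of problems; an empty list means the ID passes."""
    prefix, *tokens = external_id.split("_")
    slots = GRAMMARS.get(prefix)
    if slots is None:
        return [f"no grammar registered for prefix '{prefix}'"]
    if len(tokens) != len(slots):
        return [f"expected {len(slots)} tokens after '{prefix}', found {len(tokens)}"]
    # Each token must be approved for the slot it occupies.
    return [
        f"'{token}' is not an approved {slot}"
        for token, slot in zip(tokens, slots)
        if token not in APPROVED[slot]
    ]
```

Because each token is validated against the enum for its own slot, a misordered ID such as ep_sharepoint_files_valhall fails even though every token is individually approved somewhere in the list.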

Approved-list review (layer 3)

With token membership, order, and count enforced at layer 2b, review focuses on what machines cannot judge: is the display name clear, and does a proposed new token belong in the enum? Use this checklist for the remaining human-judgment items.

Category	Checklist item
Names	Are display names sentence case, space-separated, English, singular, and present tense?
Names	Is the name clear and unambiguous to someone outside the authoring team?
Source IDs	For time series and files, was the preserve/sanitize decision documented when non-conformant?
New tokens	Does each new token proposed for `approved_tokens.yaml` deserve to be added, and is it distinct from existing tokens?

Handling exceptions

Exceptions should be rare and documented. If a legacy or brownfield source system requires a naming format that violates the standard, document the deviation in the resource description or metadata field in CDF.

External standards

Align tokens with established international standards where possible:

Standard	Scope	Requirement
CFIHOS	Disciplines, document type	Use specific codes from CFIHOS standards
ISO 3166	Country codes	Use ISO 3166-1 alpha-2 two-letter codes, lowercased (`no`, `us`)
ISO 639-1/2	Language codes	Use standard language codes (`en`, `no`)
ISA-95	Manufacturing hierarchy	Use levels (site, area, line, work center) for equipment and location tokens

