Consistent naming makes Cognite Data Fusion (CDF) resources easier to find, govern, and automate. This reference defines Cognite’s recommended conventions for external IDs (machine-readable identifiers) and display names (human-readable labels) across building blocks, core resource types, and data modeling resources. Use it when you create resources through the CDF API, SDKs, the CDF portal application, or deployment tooling such as the Cognite Toolkit. This reference starts with the three patterns that cover every resource, then gives you resource-specific tables to use as a lookup.
These are Cognite’s recommended naming conventions and provide a consistent baseline. Actual conventions can vary based on customer-specific requirements and existing standards. Where a customer or project already has documented naming conventions, follow those and record any deviations in project governance documentation.

Scope and prerequisites

This reference governs naming for artifacts created during a CDF deployment — both resource definitions (schema) and instances (data). It covers three resource families:
  • CDF building blocks — extraction pipelines, hosted extractors, transformations, functions, data workflows, CDF RAW databases, data sets, entity matching, access groups, service principals, and related deployment resources
  • Core resource types — time series, streams, records, 3D models and revisions, 3D scenes, location filters, and files
  • Data modeling resources — data models, spaces, views, containers, edges, instances, and properties
Out of scope:
  • Legacy CDF asset-centric data model types unless mapped to the new framework
  • Code repository directory structure and Cognite Toolkit module names (governed by separate standards)
You need administrative or write access to your CDF deployment to create and manage resources. The automation example later in this reference assumes familiarity with Cognite Toolkit configuration and variable substitution. A Git-based workflow is recommended when you manage resource definitions as code.

The three naming patterns

Every CDF external ID follows one of three patterns. Identify which pattern your resource uses, then look up its exact token order in the resource tables further down.
| Pattern | Shape | Example | Used by |
| --- | --- | --- | --- |
| Prefixed | prefix_token_token_token | ep_files_valhall_sharepoint | Most building blocks, core resources |
| Source-to-target | prefix_source_location_to_target | tr_sap_valhall_to_cdm | Transformations, mapping workflows |
| Persona-led | persona_[scope]_environment | producer_pp_dev | Access groups, service principals |
A name is an ordered sequence of tokens — short codes such as valhall, sap, or files — running from general to specific. The prefix tells you the resource type (ep = extraction pipeline); the remaining tokens add location, source, and intent. To name a resource:
  1. Find its row in the relevant resource table.
  2. Assemble the external ID from approved tokens, in the order the grammar column specifies.
  3. Expand those tokens into a sentence-case display name (ep_files_valhall_sharepoint → Valhall SharePoint file extraction).
Tokens come from your approved list — a maintained, machine-readable set of valid codes for locations, sources, personas, types, and segments (one enum per token type; see layer 2b). Agreeing on this list up front prevents variation sprawl (ny, newyork, nyc all meaning the same site). If an artifact applies to all locations, use all (for example, ep_all_sap_assets) to keep parsing consistent.
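Step 3 can be sketched in Python for the extraction pipeline grammar. This is a minimal illustration, not part of any Cognite tooling: the TOKEN_WORDS lookup, the ep_display_name function, and the word order of the resulting phrase are assumptions chosen to reproduce the example above.

```python
# Hypothetical lookup from approved tokens to display words (an assumption,
# kept alongside the approved token list in a real project).
TOKEN_WORDS = {
    "files": "file",
    "valhall": "Valhall",
    "sharepoint": "SharePoint",
    "sap": "SAP",
}


def ep_display_name(external_id: str) -> str:
    """Expand ep_{data_type}_{location}_{source} into a display name."""
    parts = external_id.split("_")
    if len(parts) != 4 or parts[0] != "ep":
        raise ValueError(f"not a four-token extraction pipeline ID: {external_id!r}")
    _, data_type, location, source = parts
    try:
        # Display order (location, source, data type) is an illustrative choice.
        words = [TOKEN_WORDS[token] for token in (location, source, data_type)]
    except KeyError as missing:
        raise ValueError(f"no display word for token {missing}") from None
    return " ".join(words) + " extraction"


print(ep_display_name("ep_files_valhall_sharepoint"))
# Valhall SharePoint file extraction
```

Keeping the display words next to the approved tokens means a new token arrives with its readable form, so display names stay consistent across resources.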

Rules

These rules apply across all three patterns. Resource families that deviate (data models, properties) note the deviation in their own tables.
| Rule | External ID | Display name |
| --- | --- | --- |
| Charset | Lowercase letters, digits, and underscore (a-z, 0-9, _) only. Variable-name-safe. | English words, space-separated. |
| Casing | snake_case, unless a resource family specifies otherwise (PascalCase for data models and views; camelCase for properties; kebab-case for CDF projects). | Sentence case (for example, Valhall SAP maintenance extraction). |
| Language | English, singular nouns, present-tense verbs. | Same as external ID. |
| Separators | Underscore (_) between tokens. The _to_ connector is the only allowed connector word, used only in source-to-target resources. | Spaces between words. Do not use colons, hyphens, or underscores as token separators. |
| Length | Maximum 255 characters (external IDs). No null bytes. | Same 255-character limit where the API enforces it. |
| Environment | Environment-agnostic. Use the same ID in dev, test, and prod. | Same. |
| Tokens | Use approved values from your list. Reuse the same token for the same thing everywhere (description, never desc then comment). | Expand tokens into readable phrases. |
Environment tokens (dev, test, prod) appear only in CDF project resource names (am-long-eu-no-oslo-dev) and persona-led names (producer_pp_dev). Never embed them in building-block external IDs such as extraction pipelines, transformations, or data sets.
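The charset, length, and environment rules for snake_case external IDs can be checked mechanically. The sketch below is one possible implementation, not an official validator; the rule_violations function and its persona_led flag are hypothetical names, and the check does not cover the PascalCase, camelCase, or kebab-case families.

```python
import re

CHARSET = re.compile(r"[a-z0-9_]+")
ENVIRONMENT_TOKENS = {"dev", "test", "prod"}
MAX_LENGTH = 255


def rule_violations(external_id: str, persona_led: bool = False) -> list[str]:
    """Return the names of the shared rules that a snake_case external ID breaks."""
    violations = []
    if not CHARSET.fullmatch(external_id):
        violations.append("charset")
    if len(external_id) > MAX_LENGTH:
        violations.append("length")
    # Environment tokens are allowed only in persona-led names.
    if not persona_led and ENVIRONMENT_TOKENS & set(external_id.split("_")):
        violations.append("environment")
    return violations


print(rule_violations("ep_files_valhall_sharepoint"))        # []
print(rule_violations("ep_files_valhall_dev"))               # ['environment']
print(rule_violations("producer_pp_dev", persona_led=True))  # []
```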

Anti-patterns to avoid

  • Ambiguity and generic naming: Avoid my_script, test_pipeline, data_loader, or transformation_1. These provide no context for scope, ownership, or function.
  • Casing chaos: Identifiers that differ only by capitalization are prohibited (pumpid vs PumpID causes silent data loss).
  • Unsafe characters: Do not use spaces, special characters (&, %, /, \), or non-ASCII characters in external IDs.
  • Implicit context: Do not omit location or source because “everyone knows the deployment.” A pipeline named sap_assets collides when a second SAP site is added.
  • Hardcoded randomness: Random GUIDs sacrifice debuggability. Prefer semantic naming for operational awareness.
  • Environment in resource IDs: Do not embed dev, test, or prod in building-block external IDs.
  • Type prefix on instances: Data instances take no type prefix. A pump instance is P-101, not asset_P-101.
If your deployment already uses a documented convention that differs from this guide, follow that convention and record the deviation in project governance documentation. Treat deviations as informational, not errors.
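The casing anti-pattern above is easy to detect before deployment. As a minimal sketch (the find_case_collisions function is a hypothetical helper, not an SDK call), group identifiers by their case-folded form and report any group with more than one spelling:

```python
from collections import defaultdict


def find_case_collisions(identifiers):
    """Return groups of identifiers that differ only by capitalization."""
    groups = defaultdict(set)
    for identifier in identifiers:
        groups[identifier.casefold()].add(identifier)
    # Keep only the folded forms that were written in more than one way.
    return [sorted(names) for _, names in sorted(groups.items()) if len(names) > 1]


print(find_case_collisions(["pumpid", "PumpID", "flowRate"]))
# [['PumpID', 'pumpid']]
```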

CDF building blocks

Building blocks are operational resources you configure to ingest, transform, and orchestrate data. They use type prefixes on external IDs for filtering and sorting in the CDF portal application and API. Access groups and service principals are exceptions — they use the persona-led pattern. Display names do not include the type prefix.
| Building block | Prefix | Token grammar | Example ID | Example name | Casing | Regex |
| --- | --- | --- | --- | --- | --- | --- |
| CDF project | None | {enterprise}-{segment}-{region}-{country}-{site}-{env} | am-long-eu-no-oslo-dev | am-long-eu-no-oslo-dev | kebab-case | `^[a-z0-9]+(-[a-z0-9]+)*-(dev\|test\|prod)$` |
| CDF organization | None | {enterprise} | cognite | Cognite | kebab-case | `^[a-z0-9]+(-[a-z0-9]+)*$` |
| Extraction config | ec | ec_{source}_{intent} | ec_sap_maintenance | SAP maintenance extraction config | snake_case | `^ec_[a-z0-9]+_[a-z0-9_]+$` |
| Extraction pipeline | ep | ep_{data_type}_{location}_{source} | ep_files_valhall_sharepoint | Valhall SharePoint file extraction | snake_case | `^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Hosted extractor | he | he_{data_type}_{location}_{messaging}_{source}_{suffix} | he_timeseries_valhall_kafka_pi_source | Valhall Kafka PI time series source | snake_case | `^he_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_(source\|job\|destination\|mapping)$` |
| Raw database | raw | raw_{data_type}_{location}_{source} | raw_asset_oid_workmate | Asset OID workmate raw database | snake_case | `^raw_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Transformation | tr | tr_{source}_{location}_to_{target} | tr_sap_valhall_to_cdm | SAP valhall to CDM | snake_case | `^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$` |
| Function | fn | fn_{source}_{location}_{intent}_{target} | fn_timeseries_valhall_anomaly_alerts | Valhall time series anomaly alerts | snake_case | `^fn_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Data workflow | wf | Follows the pattern that fits: wf_{source}_{location}_to_{target} (mapping) or wf_{location}_{intent} (action) | wf_valhall_calculate_asset_downtime | Valhall calculate asset downtime | snake_case | `^wf_[a-z0-9]+_[a-z0-9_]+$` |
| Data set | ds | ds_{data_type}_{location} | ds_files_valhall | Valhall files data set | snake_case | `^ds_[a-z0-9]+_[a-z0-9_]+$` |
| Entity matching | em | em_{source}_{location} | em_sap_pi_valhall | SAP PI valhall entity matching | snake_case | `^em_[a-z0-9_]+_[a-z0-9_]+$` |
| Location filter | loc | loc_{location} | loc_valhall | Valhall location filter | snake_case | `^loc_[a-z0-9_]+$` |
| Atlas AI agent | agent | agent_{domain}_{intent} | agent_maintenance_anomaly_detection | Maintenance anomaly detection agent | snake_case | `^agent_[a-z0-9]+_[a-z0-9_]+$` |
| Processing pipeline (access scope) | None (access-scope token) | pp is a scope token used inside access group and service principal names, not a resource external ID prefix | producer_pp_dev | Producer Processing Pipeline dev | N/A | N/A |
| Access group | None | Persona-led with required type; see Access management | producer_pp_dev | Producer Processing Pipeline dev | snake_case | `^(producer\|consumer\|admin)(_[a-z0-9]+)+_(dev\|test\|prod)$` |
| Service principal | None | Persona-led with required type; see Access management | producer_valhall_ep_pi_dev | Producer Valhall Extraction Pipeline PI dev | snake_case | `^(producer\|consumer\|admin)(_[a-z0-9]+)+_(dev\|test\|prod)$` |
| Charts / Canvases / Flows / Streamlit | None | Name only | N/A | Maintenance overview dashboard | Sentence case | N/A |

Core resource types

Core resources store and expose industrial data — often ingested from source systems. The naming strategy prioritizes traceability.
| Resource type | Token grammar | Example ID | Example name | Casing | Regex (charset gate) |
| --- | --- | --- | --- | --- | --- |
| Time series | {source}_{location}_{original_id} or preserve source ID | pi_valhall_2342 | PI valhall 2342 | snake_case | `^[a-z0-9_]+$` |
| Stream | {source}_{location}_{concept} | kafka_valhall_pi_pressure | Kafka valhall PI pressure | snake_case | `^[a-z0-9_]+$` |
| Record | {source}_{location}_{machine}_{id} | sap_valhall_rec_001 | SAP valhall rec 001 | snake_case | `^[a-z0-9_]+$` |
| 3D model | N/A (system-generated) | N/A | ValhallModel | PascalCase | N/A |
| 3D scene | {tokens} | scene_deck_a | Scene deck A | snake_case | `^[a-z0-9_]+$` |
| Location filter | loc_{location} | loc_valhall | Valhall location filter | snake_case | `^loc_[a-z0-9_]+$` |
| File | Preserve source ID or {source}_{location}_{original_id} | Source-defined | Source-defined | snake_case when generated | `^[a-z0-9_]+$` when generated |
Source ID preservation (time series and files). Time series and files often arrive with IDs from source systems. Decide as follows:
  1. Preserve as-is when the source ID is already variable-name-safe (alphanumeric and underscore only) and unique within the deployment.
  2. Prefix for uniqueness when the bare source ID could collide across sites or systems: {source}_{location}_{original_id} (for example, pi_valhall_2342).
  3. Sanitize only when required — replace unsafe characters (spaces, /, %, non-ASCII) with underscore, trim leading and trailing underscores, and record the mapping in the resource metadata when traceability matters.
Prefer traceability over strict conformance when sanitization would break a known source-system reference.
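The three-step decision above can be sketched as a small Python helper. The resolve_source_id function, its parameters, and its return shape are illustrative assumptions; in particular, the caller is assumed to know whether the bare ID is unique within the deployment, and the sketch keeps the original letter case to favor traceability.

```python
import re

SAFE = re.compile(r"[A-Za-z0-9_]+")
UNSAFE_CHAR = re.compile(r"[^A-Za-z0-9_]")


def resolve_source_id(source_id, *, source, location, unique_in_deployment):
    """Return (external_id, original_to_record); the second item is None if unchanged."""
    if SAFE.fullmatch(source_id):
        clean, original = source_id, None  # step 1: preserve as-is
    else:
        # Step 3: replace each unsafe character, then trim stray underscores.
        clean = UNSAFE_CHAR.sub("_", source_id).strip("_")
        original = source_id  # record this mapping in the resource metadata
        if not clean:
            raise ValueError(f"nothing left after sanitizing {source_id!r}")
    if not unique_in_deployment:
        clean = f"{source}_{location}_{clean}"  # step 2: prefix for uniqueness
    return clean, original


print(resolve_source_id("2342", source="pi", location="valhall", unique_in_deployment=False))
# ('pi_valhall_2342', None)
print(resolve_source_id("21/PT 1019%", source="pi", location="valhall", unique_in_deployment=True))
# ('21_PT_1019', '21/PT 1019%')
```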

Data modeling resources

Data modeling resources define the schema and instances in your knowledge graph. They use strict capitalization and layer suffixes to distinguish source, domain, and solution models. See Designing scalable data models for the layered architecture these abbreviations map to.
| Resource type | Token grammar | Example ID | Example name | Casing | Regex (charset gate) |
| --- | --- | --- | --- | --- | --- |
| Data model | `{PascalCaseTokens}_{SRC\|DOM\|SOL}` | MaintenanceManagement_DOM | Maintenance management DOM | PascalCase + layer suffix | `^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*_(SRC\|DOM\|SOL)$` |
| DM space | `dm_{src\|dom\|sol}_{tokens}` | dm_dom_maintenance_management | Maintenance management DM DOM | snake_case | `^dm_(src\|dom\|sol)_[a-z0-9_]+$` |
| Instance space | inst_{tokens} | inst_oid | OID instance space | snake_case | `^inst_[a-z0-9_]+$` |
| View | {PascalCaseTokens} | FailureCause | Failure cause | PascalCase | `^[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*$` |
| Container | {PascalCaseTokens} | PumpContainer | Pump container | PascalCase | Same as view |
| Edge | {PascalCaseTokens} | PumpEdge | Pump edge | PascalCase | Same as view |
| Instance | {tokens} (no type prefix) | pump_101 | N/A | snake_case | `^[a-z0-9_]+$` |
| Property | {camelCaseTokens} | flowRate | flowRate | camelCase | `^[a-z][a-zA-Z0-9]*$` |
Property descriptions should follow AI-friendly description practices. Layer abbreviations: src (source), dom (domain/enterprise), sol (solution), inst (instances). The dm_ prefix identifies a space; the data model itself carries the layer as a _SRC/_DOM/_SOL suffix, not a prefix. See spaces and instances.
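Because data modeling names change casing by resource type, it can help to derive them from the same lowercase tokens. The following Python sketch shows one way to do that; the helper names (pascal_case, camel_case, data_model_id, dm_space_id) are hypothetical and are not part of the Cognite SDK.

```python
LAYERS = ("src", "dom", "sol")


def pascal_case(tokens):
    """['failure', 'cause'] -> 'FailureCause' (views, containers, edges)."""
    return "".join(token.capitalize() for token in tokens)


def camel_case(tokens):
    """['flow', 'rate'] -> 'flowRate' (properties)."""
    first, *rest = tokens
    return first.lower() + pascal_case(rest)


def data_model_id(tokens, layer):
    """Data models carry the layer as an uppercase suffix."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return f"{pascal_case(tokens)}_{layer.upper()}"


def dm_space_id(tokens, layer):
    """Spaces carry the dm_ prefix and the layer in lowercase."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return "_".join(["dm", layer, *tokens])


tokens = ["maintenance", "management"]
print(data_model_id(tokens, "dom"))  # MaintenanceManagement_DOM
print(dm_space_id(tokens, "dom"))    # dm_dom_maintenance_management
```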

Access management

Access groups and service principals use the persona-led pattern.
<persona>_[{segment}_][{region}_][{country}_][{site}_]<type>_<environment>
| Token | Required | Values | Example |
| --- | --- | --- | --- |
| persona | Yes | producer, consumer, admin | producer |
| segment | No | Project-specific | long |
| region | No | Geographic region | eu |
| country | No | ISO 3166 code | no |
| site | No | Site or asset code | valhall |
| type | Yes | Building-block or scope token; use all for broad grants | ep, pp, ep_pi, all |
| environment | Yes | dev, test, prod | dev |
The examples below show access group and service principal names built from these tokens.
| Example | Resource | Meaning |
| --- | --- | --- |
| producer_pp_dev | Access group | Producer access to processing pipelines in dev |
| consumer_valhall_all_prod | Access group | Consumer access to everything at Valhall in production |
| admin_all_prod | Access group | Admin access to everything in production |
| producer_eu_no_valhall_ep_dev | Access group | Producer access to extraction pipelines at Valhall, Norway, EU region, dev |
| producer_valhall_ep_pi_dev | Service principal | Service principal for a PI extraction pipeline at Valhall |
| producer_oid_pp_dev | Service principal | Service principal for processing pipelines at OID in dev |
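The persona-led grammar lends itself to a small builder that places optional tokens in a fixed order. The sketch below is illustrative only; access_name and its keyword arguments are hypothetical names, and the final check reuses the access group regex from the building-block table.

```python
import re

PERSONAS = {"producer", "consumer", "admin"}
ENVIRONMENTS = {"dev", "test", "prod"}
SHAPE = re.compile(r"^(producer|consumer|admin)(_[a-z0-9]+)+_(dev|test|prod)$")


def access_name(persona, type_token, environment, *,
                segment=None, region=None, country=None, site=None):
    """Build a persona-led name; optional tokens are skipped when not given."""
    if persona not in PERSONAS:
        raise ValueError(f"unknown persona: {persona!r}")
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    parts = [persona, segment, region, country, site, type_token, environment]
    name = "_".join(part for part in parts if part)
    if not SHAPE.fullmatch(name):
        raise ValueError(f"name fails the shape regex: {name!r}")
    return name


print(access_name("producer", "pp", "dev"))
# producer_pp_dev
print(access_name("producer", "ep", "dev", region="eu", country="no", site="valhall"))
# producer_eu_no_valhall_ep_dev
```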
Apply the same pattern to identity provider (IdP) access groups and CDF access groups (CDF groups mirror the IdP name).
Alternative schemes — ACL-based, data-domain-based, or source-system-based — are valid when they better match your identity provider. Document the chosen scheme in project governance docs.
Security categories: If you use security categories for fine-grained access on time series and files, name them with an sc_ prefix mirroring the persona-led pattern (for example, sc_producer_pp_dev). Security categories apply only to time series and files — not to data modeling instances linked via instanceId. See Security categories.

Complete naming example (Open Industrial Data)

The example below shows a complete naming layout for a CDF deployment based on Open Industrial Data (OID), using oid as the location token.
CDF project resource: oid-dev
  Data set: ds_asset_oid — name: Asset OID
    Extraction pipeline: ep_asset_oid_workmate — name: Asset OID workmate extraction
    CDF RAW database: raw_asset_oid_workmate — table: assets
    Transformation: tr_workmate_oid_to_asset_hierarchy — name: Workmate OID to asset hierarchy
    Access groups:
      producer_oid_ep_dev
      producer_oid_pp_dev
      consumer_oid_all_dev

  Data set: ds_files_oid — name: Files OID
    Extraction pipeline: ep_files_oid_fileshare — name: Files OID fileshare extraction
    Transformation: tr_fileshare_oid_to_file_metadata — name: Fileshare OID to file metadata
    Function: fn_fileshare_oid_annotation_alerts — name: Fileshare OID annotation alerts
    Access groups:
      producer_oid_ep_dev
      producer_oid_pp_dev
      consumer_oid_all_dev
    Service principal: producer_oid_ep_fileshare_dev

  Data set: ds_timeseries_oid — name: Time series OID
    ...

  Instance space: inst_oid — name: OID instance space
Key conventions in this example:
  • Building-block external IDs use type prefixes (ep_, tr_, fn_, ds_) and the _to_ connector for transformations.
  • Access groups use the persona-led pattern with a required type token (ep, pp, or all for broad grants).
  • Service principals use the same pattern with a more specific type token (ep_fileshare).
  • Display names use sentence case with spaces — no colon separators.

Validation

Naming compliance works in four complementary layers — one generates correct names, two catch incorrect ones, and one applies human judgment:
| Layer | When it acts | What it does | Catches |
| --- | --- | --- | --- |
| Cognite Toolkit / scripted generation | Author time | Assembles external IDs from shared token variables for correct names by construction | Manual typos, separator drift, and inconsistent tokens, before they reach a commit |
| CI/CD shape regex (hard gate) | Merge time | Scans externalId fields and rejects anything failing the prefix/charset/casing regex | Wrong prefix, illegal characters, bad casing |
| CI/CD token check (hard gate) | Merge time | Tokenizes each externalId by its resource grammar and asserts every token exists in the approved-list enum, in the right slot and count | Unapproved tokens (newyork vs nyc), wrong token order, wrong token count |
| Approved-list review (soft gate) | Code review | Reviewer judges display-name quality and whether a new token deserves to be added to the enum | Naming that is valid but unclear; proposals for new approved tokens |

Generate names with the Cognite Toolkit (layer 1)

If you manage resources as code, the Cognite Toolkit produces correct external IDs by construction through variable substitution. Define shared tokens once in config.yaml and reference them in resource YAML files to assemble every name from the same approved values:
# config.yaml
variables:
  location: valhall
  source: sap
  prefix_ep: ep
  prefix_db: raw
The same approach works with SDK scripts or API calls — centralize token values and assemble external IDs programmatically. This eliminates most manual errors, but names that bypass the Cognite Toolkit (hand-edited YAML, ad-hoc API calls) still need the gates below.
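A scripted equivalent might look like the following Python sketch. It is not Toolkit syntax: the TOKENS dictionary mirrors the config.yaml variables above, while the TEMPLATES dictionary, the data_type argument, and the external_id function are hypothetical names introduced for illustration.

```python
# Shared tokens, defined once (mirrors the config.yaml variables above).
TOKENS = {
    "location": "valhall",
    "source": "sap",
    "prefix_ep": "ep",
    "prefix_db": "raw",
}

# Token grammars from the building-block table, written as format templates.
TEMPLATES = {
    "extraction_pipeline": "{prefix_ep}_{data_type}_{location}_{source}",
    "raw_database": "{prefix_db}_{data_type}_{location}_{source}",
}


def external_id(resource: str, **extra_tokens: str) -> str:
    """Assemble an external ID from the shared tokens plus per-resource tokens."""
    return TEMPLATES[resource].format(**TOKENS, **extra_tokens)


print(external_id("extraction_pipeline", data_type="asset"))  # ep_asset_valhall_sap
print(external_id("raw_database", data_type="asset"))         # raw_asset_valhall_sap
```

Changing a token in one place then updates every derived name, which is the same property the Toolkit variables provide.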

Shape regex in CI/CD (layer 2a)

The regex column in each resource table validates prefix, charset, and casing only. It deliberately does not validate token values, order, or count — a trailing [a-z0-9_]+ group accepts any underscore-separated tokens. Treat regex as a fast structural filter, not a correctness guarantee. Implement it in CI by scanning resource configuration files for externalId fields and rejecting pull requests that fail the hard-gate regex.
| Resource family | Shape regex (hard gate) |
| --- | --- |
| Extraction pipeline | `^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Hosted extractor | `^he_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_(source\|job\|destination\|mapping)$` |
| Transformation | `^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$` |
| Function | `^fn_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Data workflow | `^wf_[a-z0-9]+_[a-z0-9_]+$` |
| Data set | `^ds_[a-z0-9]+_[a-z0-9_]+$` |
| Entity matching | `^em_[a-z0-9_]+_[a-z0-9_]+$` |
| Raw database | `^raw_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$` |
| Access group / service principal | `^(producer\|consumer\|admin)(_[a-z0-9]+)+_(dev\|test\|prod)$` |
| DM space | `^dm_(src\|dom\|sol)_[a-z0-9_]+$` |
| Instance space | `^inst_[a-z0-9_]+$` |
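A shape gate of this kind might be implemented along the following lines. This Python sketch makes several assumptions: it covers only a few prefixed families, it finds externalId fields with a simple line-based pattern rather than a YAML parser, and it treats an unknown prefix as a failure, which only suits files that contain building-block resources. The shape_failures function is a hypothetical name.

```python
import re

# Shape regexes for a subset of families, keyed by type prefix.
SHAPES = {
    "ep": r"^ep_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$",
    "tr": r"^tr_[a-z0-9]+_[a-z0-9]+_to_[a-z0-9_]+$",
    "fn": r"^fn_[a-z0-9]+_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$",
    "ds": r"^ds_[a-z0-9]+_[a-z0-9_]+$",
    "raw": r"^raw_[a-z0-9]+_[a-z0-9]+_[a-z0-9_]+$",
}

# Matches "externalId: value" lines, with an optional YAML list dash and quotes.
EXTERNAL_ID_LINE = re.compile(
    r"^\s*(?:-\s*)?externalId:\s*['\"]?([^'\"\s#]+)", re.MULTILINE
)


def shape_failures(config_text: str) -> list[str]:
    """Return every externalId in the text that fails its family's shape regex."""
    failures = []
    for external_id in EXTERNAL_ID_LINE.findall(config_text):
        prefix = external_id.split("_", 1)[0]
        pattern = SHAPES.get(prefix)
        if pattern is None or not re.fullmatch(pattern, external_id):
            failures.append(external_id)
    return failures


sample = """
externalId: ep_files_valhall_sharepoint
- externalId: tr_sap_valhall_to_cdm
externalId: EP_Files
externalId: ep_a_b_c_d
"""
print(shape_failures(sample))
# ['EP_Files']
```

Note that ep_a_b_c_d passes, which illustrates why the shape regex is only a structural filter.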

Token check in CI/CD (layer 2b)

This is a recommended capability, not a built-in feature. The rules below describe what a token-validation step should assert; your team implements it as a small CI script. Until that script exists, these checks fall back to review (layer 3).
Shape regex alone lets ep_files_newyork_sharepoint (unapproved location), ep_sharepoint_files_valhall (wrong order), and ep_a_b_c_d (wrong count) pass. The token check closes that gap by validating against the approved-list enum rather than a pattern. Maintain the approved list as machine-readable enums, one entry per token type, as the single source of truth for authors and CI:
# approved_tokens.yaml
location:  [valhall, oid, all]
source:    [sap, pi, sharepoint, fileshare, workmate]
persona:   [producer, consumer, admin]
type:      [ep, pp, ep_pi, ep_fileshare, all]
segment:   [long, short]
data_type: [files, asset, timeseries]
A token check validates each externalId against its resource grammar: every token must exist in its type’s enum, and the token count must match the grammar. Any miss fails the build. New tokens are added only by a reviewed change to approved_tokens.yaml.
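As a starting point, the token check could look like the Python sketch below. It is an assumption-laden illustration rather than a reference implementation: the enums are inlined instead of being loaded from approved_tokens.yaml, only three grammars are covered, and tokens that themselves contain underscores (such as ep_pi) would need a smarter tokenizer. The token_errors function is a hypothetical name.

```python
# Approved values per token type (inlined here; a real check would load the file).
APPROVED = {
    "location": {"valhall", "oid", "all"},
    "source": {"sap", "pi", "sharepoint", "fileshare", "workmate"},
    "data_type": {"files", "asset", "timeseries"},
}

# Slot order per prefix, taken from the building-block table.
GRAMMARS = {
    "ep": ("data_type", "location", "source"),
    "raw": ("data_type", "location", "source"),
    "ds": ("data_type", "location"),
}


def token_errors(external_id: str) -> list[str]:
    """Return a list of problems; an empty list means the ID passes."""
    prefix, *tokens = external_id.split("_")
    grammar = GRAMMARS.get(prefix)
    if grammar is None:
        return [f"no grammar for prefix {prefix!r}"]
    if len(tokens) != len(grammar):
        return [f"expected {len(grammar)} tokens, got {len(tokens)}"]
    return [
        f"{token!r} is not an approved {slot}"
        for slot, token in zip(grammar, tokens)
        if token not in APPROVED[slot]
    ]


print(token_errors("ep_files_valhall_sharepoint"))  # []
print(token_errors("ep_files_newyork_sharepoint"))  # ["'newyork' is not an approved location"]
print(token_errors("ep_a_b_c_d"))                   # ['expected 3 tokens, got 4']
```

Checking each token against the enum for its slot catches wrong order as well as unapproved values, because a valid source in the location slot still fails.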

Approved-list review (layer 3)

With token membership, order, and count enforced at layer 2b, review focuses on what machines cannot judge: is the display name clear, and does a proposed new token belong in the enum? Use this checklist for the remaining human-judgment items.
| Category | Checklist item |
| --- | --- |
| Names | Are display names sentence case, space-separated, English, singular, and present tense? |
| Names | Is the name clear and unambiguous to someone outside the authoring team? |
| Source IDs | For time series and files, was the preserve/sanitize decision documented when non-conformant? |
| New tokens | Does each new token proposed for approved_tokens.yaml deserve to be added, and is it distinct from existing tokens? |

Handling exceptions

Exceptions should be rare and documented. If a legacy or brownfield source system requires a naming format that violates the standard, document the deviation in the resource description or metadata field in CDF.

External standards

Align tokens with established international standards where possible:
| Standard | Scope | Requirement |
| --- | --- | --- |
| CFIHOS | Disciplines, document type | Use specific codes from CFIHOS standards |
| ISO 3166 | Country codes | Use standard 2-letter codes (no, us) |
| ISO 639-1/2 | Language codes | Use standard language codes (en, no) |
| ISA-95 | Manufacturing hierarchy | Use levels (site, area, line, work center) for equipment and location tokens |

Last modified on June 26, 2026