Search capabilities in data models

Use the data modeling search endpoint to search for text and property values in the knowledge graph. This article explains tokenization, matching rules, and endpoint differences to help you build efficient search capabilities. For example, you can use the search endpoint to implement full-text queries, search across multiple fields, and rank results by relevance.

The search endpoint supports:

Full-text search across text fields.
Prefix-based word matching.
Boosting exact phrases.
Filtering on properties.
Converting search results to a different unit.

Info

See the Query features article for information about filtering.

Queries are eventually consistent. Because of indexing delays, it may take a few seconds for new or updated data to become searchable.

Example search query

The search query below filters the results to Equipment with temperatures between 15 and 25 degrees Celsius, and a name or description that contains the word "temperature", or a word that starts with "sensor".

{
  "view": {
    "type": "view",
    "space": "testSpace",
    "externalId": "Equipment",
    "version": "v1"
  },
  "query": "temperature sensor",
  "instanceType": "node",
  "properties": ["name", "description"],
  "targetUnits": [],
  "filter": {
    "range": {
      "property": "temperature",
      "lt": 25,
      "gt": 15
    }
  },
  "includeTyping": false,
  "sort": [
    {
      "property": ["externalId"],
      "direction": "ascending"
    }
  ],
  "limit": 100
}

To perform the same query with the GraphQL endpoint:

Example search query in GraphQL
searchEquipment(
  query: "temperature sensor",
  fields: ["name", "description"],
  filter: {
    range: {
      temperature: {
        lt: 25,
        gt: 15
      }
    }
  },
  sort: { externalId: ASC }
) {
  items {
    externalId
    name
    description
    # Include additional fields here
  }
}
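
As a rough sketch, the JSON request body from the first example can also be assembled in Python before you send it with an HTTP client of your choice. The helper name `build_search_body` and its parameters are hypothetical and not part of any SDK; the field names only mirror the JSON example above.

```python
import json


def build_search_body(query, properties, temperature_range, limit=100):
    """Assemble a search request body shaped like the JSON example.

    Hypothetical helper for illustration; it does not call any API.
    """
    low, high = temperature_range
    return {
        "view": {
            "type": "view",
            "space": "testSpace",
            "externalId": "Equipment",
            "version": "v1",
        },
        "query": query,
        "instanceType": "node",
        "properties": list(properties),
        # "gt" and "lt" are strict comparisons, as in the example filter.
        "filter": {"range": {"property": "temperature", "gt": low, "lt": high}},
        "sort": [{"property": ["externalId"], "direction": "ascending"}],
        "limit": limit,
    }


body = build_search_body("temperature sensor", ["name", "description"], (15, 25))
print(json.dumps(body, indent=2))
```

Building the body in one place makes it easier to reuse the same view and sort settings while varying only the query text and filter bounds.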

Text analysis and tokenization

The data you ingest is indexed automatically and asynchronously. Indexing includes a text analysis step that breaks the text into smaller components called tokens, which allow for efficient searching and matching of terms. The tokenization process includes:

Splitting text into tokens: the text is broken into words based on whitespace and punctuation.
Lower-casing tokens: all text is converted to lowercase to make searches case-insensitive.

Tokenization example

Input	Generated tokens
`Pump_123-ABC`	`pump_123`, `abc`
`Temperature Sensor`	`temperature`, `sensor`
`Temperature_Sensor`	`temperature_sensor` (single token)
`John's pump`	`john's`, `pump`
`example.com`	`example.com` (period preserved)
`system32.exe`	`system32`, `exe` (period splits)
`123.45`	`123.45` (period preserved)

During tokenization, the following rules apply:

Underscores (_) and apostrophes (') don't split tokens. They're preserved as part of the token.
Most other punctuation, including hyphens (-), commas (,), and colons (:), splits tokens, as do spaces.
Periods (.) are a special case:
- A period between two letters doesn't split the token (example.com remains one token).
- A period between two numbers doesn't split the token (123.45 remains one token).
- A period splits the token in most other cases, for example between a number and a letter (system32.exe → system32, exe).

For example, searching for sensor finds Temperature Sensor (split into two tokens) but not Temperature_Sensor (remains a single token).
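
The tokenization rules above can be approximated in a few lines of Python. This is a simplified sketch that covers only the rules listed in this article, not the actual analyzer used by the service; the function names `tokenize` and `_period_joins` are made up for illustration.

```python
def _period_joins(text, i):
    """Return True if the period at index i sits between two letters or two digits."""
    if i == 0 or i == len(text) - 1:
        return False
    prev, nxt = text[i - 1], text[i + 1]
    return (prev.isalpha() and nxt.isalpha()) or (prev.isdigit() and nxt.isdigit())


def tokenize(text):
    """Approximate the documented rules: lowercase, then split on separators."""
    text = text.lower()
    tokens, current = [], []
    for i, ch in enumerate(text):
        if ch.isalnum() or ch in "_'":
            # Letters, digits, underscores, and apostrophes stay in the token.
            current.append(ch)
        elif ch == "." and _period_joins(text, i):
            # Periods are kept between two letters or between two digits.
            current.append(ch)
        elif current:
            # Any other character (space, hyphen, comma, ...) ends the token.
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


for sample in ["Pump_123-ABC", "Temperature_Sensor", "system32.exe", "123.45"]:
    print(sample, "->", tokenize(sample))
```

Running the sketch against the inputs in the tokenization table is a quick way to predict whether a search term will line up with a token in your data.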

Bool prefix matching

The search endpoint uses this matching approach when you enter a multi-word search query:

Convert each word except the last one to an exact match.
Convert the last word to a prefix match, allowing for partial word matching.

This approach enables precise matching for complete words while supporting partial matching on the final word.

For example, if you search for "pressure valve ma", the system creates:

An exact match for "pressure".
An exact match for "valve".
A prefix match for "ma" (which could match "main", "maintenance", "manual", etc.)

For a query to match a document, at least one of the conditions must be met. Documents matching several conditions will rank higher in the results.
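
The clause construction described above can be sketched as follows. This is an illustration of the documented behavior, not the service's implementation: the function names are hypothetical, and the query is split on whitespace only, which is a simplification of the real text analysis.

```python
def build_clauses(query):
    """Turn a query into exact clauses plus one prefix clause for the last word."""
    terms = query.lower().split()
    if not terms:
        return []
    clauses = [("exact", term) for term in terms[:-1]]
    clauses.append(("prefix", terms[-1]))
    return clauses


def clause_matches(clause, tokens):
    """Check a single clause against the tokens of a document."""
    kind, term = clause
    if kind == "exact":
        return term in tokens
    return any(token.startswith(term) for token in tokens)


def matches(query, tokens):
    """A document matches if at least one clause matches (OR logic)."""
    return any(clause_matches(clause, tokens) for clause in build_clauses(query))


print(build_clauses("pressure valve ma"))
print(matches("pressure valve ma", ["main", "pump"]))
print(matches("pressure valve ma", ["pressured", "equipment"]))
```

In this sketch, a document containing only "main" still matches through the prefix clause, while "pressured" doesn't match because "pressure" isn't the last word and is therefore compared as a complete token.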

Matching details

Exact term matching: complete words are matched exactly (case-insensitive).
Prefix matching: the last term of the query matches the beginning of words in the document.
OR logic by default: any term match contributes to the document's relevance score.

For example, searching for pump fail matches items such as:

pump failure (exact match on "pump", prefix match on "fail")
Pump Station (exact match on "pump" only)
Failure detection (prefix match on "fail" only)

Items matching both terms rank higher in the results.

Phrase matching (exact sequences)

Exact phrase matches boost relevance significantly.

For example, searching for heat exchanger:

Ranks Heat Exchanger higher (exact phrase match).
Ranks Exchanger for heat lower (individual term matches only).
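
To illustrate how a phrase boost changes the ordering, the toy scorer below adds one point per matching term and a bonus when the query terms appear as a consecutive sequence. The weights, the whitespace-only tokenization, and the function name `score` are assumptions made for this sketch; the actual relevance scoring of the search endpoint isn't described in this article and will differ.

```python
def score(query, text, phrase_boost=2.0):
    """Toy relevance score: term matches plus a bonus for an exact phrase."""
    terms = query.lower().split()
    tokens = text.lower().split()
    if not terms:
        return 0.0

    total = 0.0
    # All words except the last must match a complete token.
    for term in terms[:-1]:
        if term in tokens:
            total += 1.0
    # The last word matches as a prefix.
    if any(token.startswith(terms[-1]) for token in tokens):
        total += 1.0

    # Phrase bonus: the query terms occur as one consecutive token sequence.
    n = len(terms)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == terms:
            total += phrase_boost
            break
    return total


documents = ["Exchanger for heat", "Heat Exchanger"]
ranked = sorted(documents, key=lambda doc: score("heat exchanger", doc), reverse=True)
print(ranked)
```

Both documents match both terms, so only the phrase bonus separates them in this sketch.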

Limitations on matching

Matching has these limitations:

No fuzzy or typo matching: queries require correct spelling and matching prefixes.
No synonym expansion: queries are matched literally. Synonyms or abbreviations must appear explicitly in the data.

Example query matches

`valve`

Matches: Valve control unit, Safety valve unit
Matches: Ball-valve (tokenized as ball and valve)
Does not match: Valvoline (different token) or Valve's (tokenized as valve's)

`pressure sensor`

Best matches: documents containing both "pressure" and "sensor"
Lower relevance: documents with either "pressure" or "sensor" alone
Example matches:
- High pressure sensor calibration (matches both terms)
- Pressure transmitter (matches only "pressure")
- Temperature sensor (matches only "sensor")
Does not match:
- Pressured equipment (pressure is not a prefix query)

`compressor fail`

Matches:
- Compressor failure log ("fail" is a prefix of "failure")
- Compressor failing to start ("fail" is a prefix of "failing")

`oil temp`

Matches:
- Oil temperature readings ("temp" is a prefix of "temperature")
- Oil temporary storage ("temp" is a prefix of "temporary")

`flow meter calibra`

Matches:
- Flow meter calibration procedure (highest rank - all terms match)
- Flow meter maintenance (medium rank - two exact terms match)
- Calibration of temperature meters (lower rank - only the prefix "calibra" matches, because "meters" is a different token from "meter")

`server1.example.com/v2.0`

Matches:
- Connect to server1.example.com using v2.0 protocol (highest rank - all terms match)
- server2.example.com documentation (matches example.com)
- API v2.0 reference (matches version number)
- server1 is down after v2.0.1 upgrade (matches "server1" and prefix on "v2.0")
Does not match:
- example.net ("example.net" is preserved as a single token)
- v2 (v2.0 is preserved as a single token)

Filtering differences between the `query` and `search` endpoints

Filters work mostly the same for both the query and search endpoints, but there are a few differences in the handling of empty arrays and prefix arrays.

Exists filter with empty array

Endpoint	Behavior	Example
Query	Empty array counted as existing	`exists([])` → true
Search/Aggregate	Empty array counted as non-existing	`exists([])` → false
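
The difference in the table above can be modeled with two small predicates. This is only a mental model of the documented behavior, with hypothetical function names; a missing property is represented here as `None`.

```python
def exists_in_query(value):
    """Query endpoint: an empty array still counts as an existing value."""
    return value is not None


def exists_in_search(value):
    """Search/Aggregate endpoints: an empty array counts as non-existing."""
    return value is not None and value != []


for value in (["pump"], [], None):
    print(value, exists_in_query(value), exists_in_search(value))
```

If you rely on `exists` to find instances with an empty array property, the sketch shows why the same filter returns different results depending on the endpoint.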

Prefix filter on arrays

Endpoint	Behavior
Query	Checks array prefix sequence (ordered matching). Supports `text[]` and `int[]` arrays.
Search/Aggregate	Checks each array item separately. The prefix value must be a single text value; array prefixes aren't supported.

Examples of prefix filter behavior

Prefix Condition	Query API	Search API	Note
`"pump"` prefix of `["pump", "valve"]`	✅	✅	Both APIs match single elements.
`"pump"` prefix of `["pumping", "valve"]`	❌	✅	Only Search matches `"pumping"` (element prefix exists).
`["pump","valve"]` prefix of `["pump","valve","sensor"]`	✅	❌	Search doesn't support array prefix.
`"pump"` prefix of `["valve","pump"]`	❌	✅	Query checks only the start of the array. Search matches any element.
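
The two prefix semantics in the table can be reproduced with the sketch below. It's an interpretation of the rows above rather than the endpoints' implementation: the function names are hypothetical, a single prefix value on the Query side is treated as a one-element sequence, and an unsupported array prefix on the Search side is modeled as "no match" even though the real endpoint may reject such a filter instead.

```python
def query_prefix(values, prefix):
    """Query endpoint: the array must start with the given element sequence."""
    sequence = prefix if isinstance(prefix, list) else [prefix]
    return values[:len(sequence)] == sequence


def search_prefix(values, prefix):
    """Search endpoint: any single element may start with the text prefix."""
    if isinstance(prefix, list):
        # Array prefixes aren't supported by Search; modeled as no match.
        return False
    return any(item.startswith(prefix) for item in values)


cases = [
    ("pump", ["pump", "valve"]),
    ("pump", ["pumping", "valve"]),
    (["pump", "valve"], ["pump", "valve", "sensor"]),
    ("pump", ["valve", "pump"]),
]
for prefix, values in cases:
    print(prefix, values, query_prefix(values, prefix), search_prefix(values, prefix))
```

The four cases correspond to the four table rows, in the same order.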

Nested filters are only supported for core data model assets

Nested filters aren't supported in the search and aggregation endpoints of Cognite data models, except when filtering direct relations to core data model assets.

If you need to apply nested filters on properties that are not directly related to core data model assets, use the Query API.

Supported nested filters

The following types and properties support nested filtering in the search and aggregation endpoints:

Core data model type	Core data model property
CogniteActivity	assets
CogniteFile	assets, category
CogniteTimeSeries	assets, unit
CogniteEquipment	asset
CogniteMaintenance	asset
CogniteNotification	asset
CogniteOperation	asset
