- Full-text search across text fields.
- Configurable matching logic with `AND`/`OR` operators.
- Prefix-based word matching.
- Boosting exact phrases.
- Filtering on properties.
- Converting search results to a different unit.
See the Query features article for information about filtering.
Search process overview
The search query is executed in three distinct steps:
- Text analysis and tokenization: the search query text is analyzed and split into tokens.
- Instance matching: the tokens are matched against pre-indexed instance data in your Cognite Data Fusion (CDF) project.
- Instance ranking: matched instances are sorted by their relevance to the query.
Text analysis and tokenization
The instances you ingest are automatically indexed asynchronously. This indexing process includes a text analysis step that breaks down the text into smaller components called tokens. The system compares the tokens extracted from your search query against the tokens indexed from your instances to determine matches and ranking. The tokenization process includes:
- Splitting text into tokens: the text is broken into words based on whitespace and punctuation.
- Lower-casing tokens: all text is converted to lowercase to make searches case-insensitive.
Tokenization rules
The smarter search results feature (beta) introduces additional tokenization for selected fields if enabled for your CDF project. For more information, see About smarter search results.
- Letters, numbers, and underscores (`_`) don’t split tokens. For example, `user_name`, `File1`, and `AH12` each remain a single token.
- Periods (`.`) and apostrophes (`'`):
  - Don’t split tokens between letters (for example, `e.g.` and `don't` remain one token).
  - Don’t split tokens between numbers (`3.14` remains one token).
  - Will split tokens in other cases (for example, in `end.`, or between a letter and a number).
- Commas (`,`):
  - Don’t split tokens between two numbers (`1,000` remains one token).
  - Will split tokens between letters.
- Colons (`:`):
  - Don’t split tokens between two letters (`Scale:Linear` remains one token).
  - Will split tokens between numbers.
- Other non-standard characters, such as the double quotation mark (`"`), hyphen (`-`), whitespace, etc., will split tokens.
Refer to the formal specification outlined in Unicode Standard Annex #29 for implementation details.
Tokenization example
| Input | Generated tokens | Explanation |
|---|---|---|
| `Pump_123-ABC` | `pump_123`, `abc` | Hyphen (non-standard character) splits |
| `Temperature Sensor` | `temperature`, `sensor` | Whitespace (non-standard character) splits |
| `Temperature_Sensor_1` | `temperature_sensor_1` | Underscores (standard characters) don’t split |
| `John's pump` | `john's`, `pump` | Apostrophe doesn’t split the sequence of letters |
| `example.com` | `example.com` | Period doesn’t split the letter-to-letter sequence |
| `system32.exe` | `system32`, `exe` | Period splits the number-to-letter sequence |
| `first.last 5.10` | `first.last`, `5.10` | Period doesn’t split the letter or number sequences |
| `first,last 5,10` | `first`, `last`, `5,10` | Comma splits the letter sequence but not the number sequence |
| `first:last 5:10` | `first:last`, `5`, `10` | Colon splits the number sequence but not the letter sequence |
| `John's 1st account has 1,000.5$ dollars` | `john's`, `1st`, `account`, `has`, `1,000.5`, `$`, `dollars` | Combined rules |
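The rules and examples above can be approximated in a few lines of Python. This is a minimal sketch for experimenting with how text is split, not the actual CDF analyzer. The function names `tokenize` and `_joins` are hypothetical, underscores are not treated as letters or numbers when checking the characters around a connector, and symbols such as `$` are dropped instead of being emitted as tokens.

```python
def _joins(ch: str, prev: str, nxt: str) -> bool:
    """Return True if the connector `ch` keeps its neighbours in one token."""
    letters = prev.isalpha() and nxt.isalpha()
    numbers = prev.isdigit() and nxt.isdigit()
    if ch in ".'":
        return letters or numbers  # e.g. example.com, don't, 3.14
    if ch == ",":
        return numbers  # e.g. 1,000
    if ch == ":":
        return letters  # e.g. Scale:Linear
    return False  # every other character splits


def tokenize(text: str) -> list[str]:
    """Split `text` into lower-cased tokens (simplified approximation)."""
    tokens: list[str] = []
    current: list[str] = []
    for i, ch in enumerate(text):
        prev = text[i - 1] if i > 0 else ""
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch.isalnum() or ch == "_" or _joins(ch, prev, nxt):
            current.append(ch)
        elif current:
            # A splitting character ends the current token.
            tokens.append("".join(current).lower())
            current = []
    if current:
        tokens.append("".join(current).lower())
    return tokens


print(tokenize("first:last 5:10"))  # ['first:last', '5', '10']
```

The sketch only looks at the characters immediately before and after a connector, which is enough to reproduce most rows of the table. For exact behavior, rely on Unicode Standard Annex #29 rather than on this approximation.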
Instance matching
After tokenization, the search compares your query tokens against indexed tokens from instances in your CDF project. The matching behavior depends on the token’s position in your query:
- Standard tokens (exact match): all tokens in the query except the last one require an exact match (case-insensitive) with a token in the instance data.
- Final token (prefix match): the last token is treated as a prefix, allowing for search-as-you-type functionality. It matches any word that starts with those characters.
Example scenario
If you search for `pressure valve ma`, the system matches instances based on the following criteria:
- `pressure` and `valve`: require an exact match.
- `ma`: requires a prefix match (matching `main`, `manifold`, `manual`, etc.).
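The position-dependent matching described above can be sketched as follows. This is an illustration only: `token_hits` is a hypothetical helper, and it assumes both the query and the instance text have already been tokenized and lower-cased.

```python
def token_hits(query_tokens: list[str], instance_tokens: list[str]) -> list[bool]:
    """For each query token, report whether it matches any instance token.

    All tokens except the last need an exact match; the last token is
    treated as a prefix (search-as-you-type).
    """
    hits = []
    last_index = len(query_tokens) - 1
    for i, query_token in enumerate(query_tokens):
        if i == last_index:
            # Final token: prefix match against any indexed token.
            hits.append(any(t.startswith(query_token) for t in instance_tokens))
        else:
            # Standard token: exact match required.
            hits.append(query_token in instance_tokens)
    return hits


query = ["pressure", "valve", "ma"]
print(token_hits(query, ["main", "pressure", "valve"]))  # [True, True, True]
print(token_hits(query, ["pressured", "valve", "manifold"]))  # [False, True, True]
```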
Search operators
The token matching rules above determine whether individual tokens match a given instance. The search operator determines which instances qualify as a match for the entire query.
| Operator | Behavior | Example (query: “pressure sensor”) |
|---|---|---|
| OR (default) | Returns instances matching at least one token. | Matches instances containing “pressure” only, “sensor” only, or both. |
| AND | Returns only instances matching all tokens. | Matches only instances containing both “pressure” AND “sensor”. |
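As a rough illustration of the table above, the operator only decides how the per-token match results are combined. The helper name `qualifies` and the list-of-booleans input are assumptions made for this sketch.

```python
def qualifies(hits: list[bool], operator: str = "OR") -> bool:
    """Combine per-token match results into a single match decision."""
    if operator == "OR":
        return any(hits)  # at least one token matched
    if operator == "AND":
        return all(hits)  # every token matched
    raise ValueError(f"Unsupported operator: {operator!r}")


# Query "pressure sensor" against an instance that only contains "pressure".
hits = [True, False]
print(qualifies(hits, "OR"))   # True
print(qualifies(hits, "AND"))  # False
```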
Effective November 2026, the default search operator will change from `OR` to `AND`. To maintain your current search behavior, we recommend explicitly setting the operator in your search queries.
Limitations on matching
Consider these limitations when designing your search:
- No fuzzy or typo matching: queries require correct spelling and matching prefixes.
- No synonym expansion: queries are matched literally. Synonyms or abbreviations must appear explicitly in the data.
Instance ranking
When multiple instances match a query, they’re ordered by relevance. The following factors determine the ranking:
Number of matching tokens
Instances that match more tokens are ranked higher. This is mainly relevant when using the `OR` operator since the `AND` operator requires all tokens to match.
For example, searching for `Heat Exchanger 243 alpha` using the `OR` operator:
- Ranks `Heat Exchanger 243` higher (matches three of four tokens).
- Ranks `Heat Exchanger` lower (matches two of four tokens).
- Ranks `Heat` lowest (matches one of four tokens).
Phrase matching (exact sequences)
Exact phrase matches boost relevance significantly. For example, searching for `heat exchanger`:
- Ranks `Heat Exchanger` higher (exact phrase match).
- Ranks `Heat For Exchanger` lower (individual token matches only).
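The two ranking factors can be pictured with a toy scoring function. This is not the actual relevance algorithm, which this article doesn't specify. The function `relevance`, the `phrase_boost` weight, and the use of exact matching for every token (ignoring prefix matching on the last token) are all assumptions made to keep the sketch short.

```python
def relevance(query_tokens: list[str], instance_tokens: list[str],
              phrase_boost: float = 10.0) -> float:
    """Toy score: number of matching tokens plus a bonus for an exact phrase."""
    matched = sum(1 for token in query_tokens if token in instance_tokens)
    n = len(query_tokens)
    # An exact phrase means the query tokens appear consecutively, in order.
    has_phrase = n > 1 and any(
        instance_tokens[i:i + n] == query_tokens
        for i in range(len(instance_tokens) - n + 1)
    )
    return matched + (phrase_boost if has_phrase else 0.0)


query = ["heat", "exchanger"]
candidates = [["heat", "for", "exchanger"], ["heat", "exchanger"]]
ranked = sorted(candidates, key=lambda tokens: relevance(query, tokens), reverse=True)
print(ranked[0])  # ['heat', 'exchanger']
```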
Example query matches
valve
With `operator: "OR"`:
- Matches: `Valve control unit`, `Safety valve unit`
- Matches: `Ball-valve` (tokenized as `ball` and `valve`)
- Does not match: `Valvoline` (different token) or `Valve's` (tokenized as `valve's`)
pressure sensor
With `operator: "OR"`:
- Best matches: instances containing both “pressure” and “sensor”
- Lower relevance: instances with either “pressure” or “sensor” alone
- Example matches:
  - `High pressure sensor calibration` (matches both tokens)
  - `Pressure transmitter` (matches only “pressure”)
  - `Temperature sensor` (matches only “sensor”)
- Does not match: `Pressured equipment` (“pressure” isn’t the last query token, so it isn’t matched as a prefix)
With `operator: "AND"`:
- Requires that both “pressure” and “sensor” are present as exact matches.
- Matches: `High pressure sensor calibration`.
- Does not match: `Pressure transmitter` (missing “sensor”) or `Temperature sensor` (missing “pressure”).
compressor fail
With `operator: "OR"`:
- Matches:
  - `Compressor failure log` (“fail” is a prefix of “failure”)
  - `Compressor failing to start` (“fail” is a prefix of “failing”)
With `operator: "AND"`:
- Requires an exact match for both “compressor” and “fail”.
- Does not match: `Compressor failure log`, because `fail` is not an exact match for the token `failure`.
- Does not match: `Compressor failing to start`, because `fail` is not an exact match for the token `failing`.
oil temp
With `operator: "OR"`:
- Matches:
  - `Oil temperature readings` (“temp” is a prefix of “temperature”)
  - `Oil temporary storage` (“temp” is a prefix of “temporary”)
flow meter calibra
With `operator: "OR"`:
- Matches:
  - `Flow meter calibration procedure` (highest rank - all tokens match)
  - `Flow meter maintenance` (medium rank - two exact tokens match)
  - `Calibration of temperature meters` (lower rank - only “meter” and “calibra” match)
server1.example.com/v2.0
With `operator: "OR"`:
- Matches:
  - `Connect to server1.example.com using v2.0 protocol` (highest rank - all tokens match)
  - `server2.example.com documentation` (matches `example.com`)
  - `API v2.0 reference` (matches version number)
  - `server1 is down after v2.0.1 upgrade` (matches “server1” and prefix on “v2.0”)
- Does not match:
  - `example.net` (`example.net` is preserved as a single token)
  - `v2` (`v2.0` is preserved as a single token)
Example search query
The example search query filters the results to `Equipment` with temperatures between 15 and 25 degrees Celsius and a name or description that contains the word “temperature” or a word that starts with “sensor”.
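A request body along these lines could express that query. Treat it as a hedged sketch rather than a verified request: the space `my_space`, the view `Equipment` with version `v1`, and the property names are hypothetical. The field names (`view`, `query`, `properties`, `operator`, `filter`) reflect one reading of the instances search endpoint and should be checked against the API reference. The inclusive bounds (`gte`/`lte`) are an assumption, and unit conversion is left out.

```python
import json

SPACE = "my_space"  # hypothetical space name
VIEW = {"type": "view", "space": SPACE, "externalId": "Equipment", "version": "v1"}

search_request = {
    "view": VIEW,
    # "temperature" needs an exact match; "sensor" is the last token (prefix match).
    "query": "temperature sensor",
    "instanceType": "node",
    # Text properties to search in (hypothetical property names).
    "properties": ["name", "description"],
    # Set explicitly so the behavior doesn't depend on the default operator.
    "operator": "OR",
    # Keep only equipment whose temperature is within the range.
    "filter": {
        "range": {
            "property": [SPACE, "Equipment/v1", "temperature"],
            "gte": 15,
            "lte": 25,
        }
    },
    "limit": 100,
}

print(json.dumps(search_request, indent=2))
```

Setting `operator` explicitly follows the recommendation earlier in this article about the upcoming change of the default operator.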
Example search query in GraphQL
Filtering differences between the query and search endpoints
Filters work mostly the same for both the query and search endpoints, but the endpoints differ in how they handle empty arrays and prefix filters on arrays.
Exists filter with empty array
| Endpoint | Behavior | Example |
|---|---|---|
| Query | Empty array counted as existing | exists([]) → true |
| Search/Aggregate | Empty array counted as non-existing | exists([]) → false |
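The difference in the table can be modeled with a small function. It's a sketch of the documented semantics only: the `exists` helper and the `endpoint` argument are hypothetical and don't correspond to an SDK call. A missing value is represented as `None`.

```python
def exists(value, endpoint: str) -> bool:
    """Model how each endpoint evaluates an `exists` filter for a property value."""
    if value is None:
        return False  # no value at all: non-existing on every endpoint
    if endpoint == "query":
        return True  # an empty array still counts as existing
    if endpoint in ("search", "aggregate"):
        return value != []  # an empty array counts as non-existing
    raise ValueError(f"Unknown endpoint: {endpoint!r}")


print(exists([], "query"))   # True
print(exists([], "search"))  # False
```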
Prefix filter on arrays
| Endpoint | Behavior |
|---|---|
| Query | Checks array prefix sequence (ordered matching). Supports text[] and int[] arrays. |
| Search/Aggregate | Checks each array item separately. Supports only single-value text field prefix filters (no arrays). |
Examples of prefix filter behavior
| Prefix Condition | Query API | Search API | Note |
|---|---|---|---|
| `"pump"` prefix of `["pump", "valve"]` | ✅ | ✅ | Both APIs match single elements. |
| `"pump"` prefix of `["pumping", "valve"]` | ❌ | ✅ | Only Search matches `"pumping"` (element prefix exists). |
| `["pump","valve"]` prefix of `["pump","valve","sensor"]` | ✅ | ❌ | Search doesn’t support array prefix. |
| `"pump"` prefix of `["valve","pump"]` | ❌ | ✅ | Query API checks the start sequence. Search checks any element. |
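The four rows above can be reproduced with two small functions. This is a sketch of the documented behavior, not of the implementation: `query_prefix` and `search_prefix` are hypothetical names, and returning `False` for an array prefix on the search side is an assumption (the real endpoint may reject such a filter instead).

```python
def query_prefix(prefix, values: list) -> bool:
    """Query endpoint: the array must start with the given sequence, in order."""
    sequence = prefix if isinstance(prefix, list) else [prefix]
    return values[:len(sequence)] == sequence


def search_prefix(prefix, values: list[str]) -> bool:
    """Search endpoint: any single element may start with the given text."""
    if isinstance(prefix, list):
        return False  # array prefixes aren't supported (assumed to never match)
    return any(value.startswith(prefix) for value in values)


print(query_prefix("pump", ["valve", "pump"]))   # False
print(search_prefix("pump", ["valve", "pump"]))  # True
```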
Nested filters are only supported for core data model assets
Nested filters aren’t supported in the search and aggregation endpoints of Cognite data models, except when filtering direct relations to core data model assets. If you need to apply nested filters on properties that are not directly related to core data model assets, use the Query API.
Supported nested filters
The following types and properties support nested filtering in the search and aggregation endpoints:
| Core data model type | Core data model property |
|---|---|
| CogniteActivity | assets |
| CogniteFile | assets, category |
| CogniteTimeSeries | assets, unit |
| CogniteEquipment | asset |
| CogniteMaintenance | asset |
| CogniteNotification | asset |
| CogniteOperation | asset |
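If you build filters programmatically, the table above can be encoded as a lookup to decide whether a nested filter can go to the search or aggregation endpoints or needs the Query API. The names `NESTED_FILTER_SUPPORT` and `supports_nested_filter` are hypothetical, and the mapping simply copies the table.

```python
# Core data model types and the properties that support nested filtering
# in the search and aggregation endpoints (copied from the table above).
NESTED_FILTER_SUPPORT = {
    "CogniteActivity": {"assets"},
    "CogniteFile": {"assets", "category"},
    "CogniteTimeSeries": {"assets", "unit"},
    "CogniteEquipment": {"asset"},
    "CogniteMaintenance": {"asset"},
    "CogniteNotification": {"asset"},
    "CogniteOperation": {"asset"},
}


def supports_nested_filter(type_name: str, property_name: str) -> bool:
    """Return True if search/aggregation accept a nested filter on this property."""
    return property_name in NESTED_FILTER_SUPPORT.get(type_name, set())


print(supports_nested_filter("CogniteFile", "category"))   # True
print(supports_nested_filter("CogniteEquipment", "unit"))  # False
```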