Configuration settings

To configure the OPC UA extractor, you must edit the configuration file. The file is in YAML format, and the sample configuration file contains all valid options with default values.

You can leave many fields empty to let the extractor use the default values. The configuration file separates the settings by component, and you can remove an entire component to disable it or use the default values.

Sample configuration files

In the extractor installation folder, the /config subfolder contains sample complete and minimal configuration files. The values wrapped in ${} are replaced with environment variables with that name. For example ${COGNITE_PROJECT} will be replaced with the value of the environment variable called COGNITE_PROJECT.

The configuration file also contains the global parameter version, which holds the version of the configuration schema used in the configuration file. This document describes version 1 of the configuration schema.

Note that we do not recommend using config.example.yml as a basis for configuration files. This file contains every configuration option, which makes it hard to read and may cause issues. It is intended as a reference showing how each option is configured. Use config.minimal.yml as your starting point instead.

Tip

You can set up extraction pipelines to use versioned extractor configuration files stored in the cloud.

Minimal YAML configuration file

version: 1

source:
    # The URL of the OPC-UA server to connect to
    endpoint-url: 'opc.tcp://localhost:4840'

cognite:
    # The project to connect to in the API, uses the environment variable COGNITE_PROJECT.
    project: '${COGNITE_PROJECT}'
    # Cognite authentication
    # This is for Microsoft as IdP. To use a different provider,
    # set implementation: Basic, and use token-url instead of tenant.
    # See the example config for the full list of options.
    idp-authentication:
        # Directory tenant
        tenant: ${COGNITE_TENANT_ID}
        # Application Id
        client-id: ${COGNITE_CLIENT_ID}
        # Client secret
        secret: ${COGNITE_CLIENT_SECRET}
        # List of resource scopes, ex:
        # scopes:
        #   - scopeA
        #   - scopeB
        scopes:
            - ${COGNITE_SCOPE}

extraction:
    # Global prefix for externalId in destinations. Should be unique to prevent name conflicts.
    id-prefix: 'gp:'
    # Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used.
    # Saves space compared to using the full URL. Using the ns index is not safe as the order can change on the server.
    # It is recommended to set this before extracting the node hierarchy.
    # For example:
    # NamespaceMap:
    #   "urn:cognite:net:server": cns
    #   "urn:freeopcua:python:server": fps
    #   "http://examples.freeopcua.github.io": efg

ProtoNodeId

You can provide an OPC UA NodeId in several places in the configuration file. Each one is a YAML object with the following structure:

node:
    node-id: i=123
    namespace-uri: opc.tcp://test.test

To find the node IDs we recommend using the UAExpert tool.

Locate the node you need the ID of in the hierarchy, then find the node ID on the right side under Attribute > NodeId. Find the Namespace Uri by matching the NamespaceIndex on the right to the dropdown on the left, in the node hierarchy view. The default value of this dropdown is No highlight.

Timestamps and intervals

In most places where time intervals are required, you can use a CDF-like syntax of [N][timeunit], for example 10m for 10 minutes or 1h for 1 hour. timeunit is one of d, h, m, s, ms. You can also use a cron expression in some places.

For history start and end times you can use a similar syntax: [N][timeunit] and [N][timeunit]-ago. 1d-ago means 1 day in the past from the time history starts, and 1h means 1 hour in the future. For instance, you can use this syntax to configure the extractor to read only recent history.
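
As a minimal sketch of the interval syntax, the fragment below sets a few interval-typed parameters from the source section described later in this document. The values are illustrative only, not recommendations.

```yaml
source:
    redundancy:
        # Look for a better server every 5 minutes while the service level is below the threshold
        reconnect-interval: 5m
    retries:
        # Stop creating new retries after 2 minutes
        timeout: 2m
        # First backoff delay of 500 milliseconds
        initial-delay: 500ms
        # Never wait more than 30 seconds between attempts
        max-delay: 30s
```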

Using values from Azure Key Vault

The OPC UA extractor also supports loading values from Azure Key Vault. To load a configuration value from Azure Key Vault, use the !keyvault tag followed by the name of the secret you want to load. For example, to load the value of the my-secret-name secret in Key Vault into a password parameter, configure your extractor like this:

password: !keyvault my-secret-name

To use Key Vault, you also need to include the azure-keyvault section in your configuration, with the following parameters:

| Parameter | Description |
| --- | --- |
| keyvault-name | Name of Key Vault to load secrets from |
| authentication-method | How to authenticate to Azure. Either default or client-secret. For default, the extractor will look at the user running the extractor, and look for pre-configured Azure logins from tools like the Azure CLI. For client-secret, the extractor will authenticate with a configured client ID/secret pair. |
| client-id | Required for using the client-secret authentication method. The client ID to use when authenticating to Azure. |
| secret | Required for using the client-secret authentication method. The client secret to use when authenticating to Azure. |
| tenant-id | Required for using the client-secret authentication method. The tenant ID of the Key Vault in Azure. |

Example:

azure-keyvault:
    keyvault-name: my-keyvault-name
    authentication-method: client-secret
    tenant-id: 6f3f324e-5bfc-4f12-9abe-22ac56e2e648
    client-id: 6b4cc73e-ee58-4b61-ba43-83c4ba639be6
    secret: 1234abcd
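
The same section with the default authentication method might look like the sketch below, which also shows the !keyvault tag applied to the OPC UA server password. The secret name opcua-password and the user name my-user are hypothetical placeholders.

```yaml
azure-keyvault:
    keyvault-name: my-keyvault-name
    # Use a pre-configured Azure login of the user running the extractor,
    # for example one created with the Azure CLI
    authentication-method: default

source:
    endpoint-url: 'opc.tcp://localhost:4840'
    username: my-user
    # Loaded from the Key Vault secret named opcua-password (hypothetical name)
    password: !keyvault opcua-password
```

With default authentication, client-id, secret, and tenant-id are left out because they are only required for the client-secret method.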

Configure the OPC UA extractor

OPC-UA extractor configuration

| Parameter | Type | Description |
| --- | --- | --- |
| version | integer | Version of the config file. Each version of the extractor specifies which config file versions it accepts. |
| dry-run | boolean | Set this to true to prevent the extractor from writing anything to CDF. This is useful for debugging the extractor configuration. |
| source | object | |
| logger | object | Configuration for logging to console or file. Log entries are either Fatal, Error, Warning, Information, Debug, or Verbose, in order of decreasing priority. The extractor will log any messages at an equal or higher log level than the configured level for each sink. |
| metrics | object | Configuration for prometheus metrics destinations. |
| cognite | object | Configure connection to Cognite Data Fusion (CDF) |
| mqtt | object | Push data to CDF one-way over MQTT. This requires that the MQTT-CDF Bridge application is running somewhere with access to CDF. |
| influx | object | Configuration for pushing to an InfluxDB database. Data points and events will be pushed, but no context or metadata. |
| extraction | object | Configuration for general extraction options, such as data types, mapping, and filters. |
| events | object | Configuration for extracting OPC UA events and alarms as CDF events or litedb time series |
| failure-buffer | object | If the connection to CDF goes down, the OPC UA extractor supports buffering data points and events in a local file or InfluxDB. This is helpful if the connection is unstable, and the server does not provide its own historical data. |
| history | object | Configuration for reading historical datapoints and events from the server |
| state-storage | object | Use a local LiteDb database or a set of tables in CDF RAW to store persistent information between runs. This can be used to avoid loading large volumes of data from CDF on startup, which can greatly speed up the extractor. |
| subscriptions | object | A few options for subscriptions to events and data points. Subscriptions in OPC UA consist of Subscription objects on the server, which contain a list of MonitoredItems. By default, the extractor produces a maximum of four subscriptions: DataChangeListener (handles data point subscriptions), EventListener (handles event subscriptions), AuditListener (handles audit events), and NodeMetrics (handles subscriptions for use as metrics). Each of these can contain a number of MonitoredItems. |
| pub-sub | object | Configure the extractor to read from MQTT using OPC-UA pubsub. This requires the server pubsub configuration to be exposed through the Server object. You should consider setting subscriptions: data-points: false to avoid duplicate datapoints if this is enabled. |
| high-availability | object | Configuration to allow you to run multiple redundant extractors. Each extractor needs a unique index. |

source

Global parameter.

| Parameter | Type | Description |
| --- | --- | --- |
| endpoint-url | string | The URL of the OPC UA server to connect to. In practice, this is the URL of the discovery server, where multiple levels of security may be provided. The OPC UA extractor attempts to use the highest security possible based on the configuration. Example: opc.tcp://some-host:1883 |
| alt-endpoint-urls | list | List of alternative endpoint URLs the extractor can attempt when connecting to the server. Use this for non-transparent redundancy. See the OPC UA standard part 4, section 6.6.2. We recommend setting force-restart to true. Otherwise, the extractor will reconnect to the same server each time. |
| endpoint-details | object | Details used to override default endpoint behavior. This is used to make the client connect directly to an OPC UA endpoint, for example if the server is behind NAT (Network Address Translation), circumventing server discovery. |
| redundancy | object | Additional configuration options related to redundant servers. The OPC UA extractor supports Cold redundancy, as described in the OPC UA standard part 4, section 6.6.2. |
| reverse-connect-url | string | The local URL used for reverse connect, which means that the server is responsible for initiating connections, not the extractor. This lets the server be behind a firewall, forbidding incoming connections. You must also specify an endpoint-url, to indicate to the extractor where it should accept connections from. |
| auto-accept | boolean | Set to true to automatically accept server certificates. If this is disabled, received server certificates will be placed in the rejected certificates folder (by default application_dir/certificates/pki/rejected), and you can manually move them to the accepted certificates folder (application_dir/certificates/pki/accepted). Setting this to true makes the extractor move certificates automatically. A simple solution would be to set this to true for the first connection, then change it to false. Warning: This should be disabled if the extractor is running on an untrusted network. Default value is True. |
| username | string | OPC UA server username, leave empty to disable username/password authentication. |
| password | string | OPC UA server password. |
| x509-certificate | object | Specifies the configuration for using a signed x509 certificate to connect to the server. Note that this is highly server specific. The extractor uses the provided certificate to sign requests sent to the server. The server must have a mechanism to validate this signature. Typically the certificate must be provided by the server. |
| secure | boolean | Set this to true to make the extractor try to connect to an endpoint with security above None. If this is enabled, the extractor will try to pick the most secure endpoint, meaning the endpoint with the longest of the most modern cipher types. |
| ignore-certificate-issues | boolean | Ignore all suppressible certificate errors on the server certificate. You can use this setting if you receive errors such as Certificate use not allowed. CAUTION: This is potentially a security risk. Bad certificates can open the extractor to man-in-the-middle attacks or similar. If the server is secured in other ways (it is running locally, over a secure VPN, or similar), it is most likely fairly harmless. Some errors are not suppressible and must be remedied on the server. Note that enabling this is always a workaround for the server violating the OPC UA standard in some way. |
| publishing-interval | integer | Sets the interval (in milliseconds) between publish requests to the server, which is when the extractor asks the server for updates to any active subscriptions. This limits the maximum frequency of points pushed to CDF, but not the maximum frequency of points on the server. In most cases, this can be set to the same as extraction.data-push-delay. If set to 0 the server chooses the interval to be as low as it supports. Be aware that some servers set this lower limit very low, which may create considerable load on the server. Default value is 500. |
| force-restart | boolean | If true, the extractor will not attempt to reconnect using the OPC UA reconnect protocol if connection is lost, but instead always create a new connection. Only enable this if reconnect is causing issues with the server. Even if this is disabled, the extractor will generally fall back on regular reconnects if the server produces unexpected errors on reconnect. |
| exit-on-failure | boolean | If true, the OPC UA extractor will be restarted completely on reconnect. Enable this if the server is expected to change dramatically while running, and the extractor cannot keep using state from previous runs. |
| keep-alive-interval | integer | Specifies the interval in milliseconds between each keep-alive request to the server. The connection times out if a keep-alive request fails twice (2 * interval + 100ms). This typically happens if the server is down, or if it is hanging on a heavy operation and doesn't manage to respond to keep alive requests. Set this higher if keep alives often time out without the server being down. Default value is 5000. |
| restart-on-reconnect | boolean | If true, the OPC UA extractor will be restarted after reconnecting to the server. This may not be required if the server is expected to not change much and handles reconnects well. |
| node-set-source | object | Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for certain smaller servers, where the full node hierarchy is known before-hand. In general, it can be used to lower the load on the server. |
| alt-source-background-browse | boolean | If true, browses the OPC UA node hierarchy in the background when obtaining nodes from an alternative source, such as CDF Raw or NodeSet2 files. |
| limit-to-server-config | boolean | Uses the Server/ServerCapabilities node in the OPC UA server to limit chunk sizes. Set this to false only if you know the server reports incorrect limits and you want to set them higher. If the real server limits are exceeded, the extractor will typically crash. Default value is True. |
| browse-nodes-chunk | integer | Sets the maximum number of nodes per call to the Browse service. Large numbers are likely to exceed the server's tolerance. Lower numbers greatly increase startup time. Default value is 1000. |
| browse-chunk | integer | Sets the maximum requested results per node for each call to the Browse service. The server may decide to return fewer. Setting this lower increases startup times. Setting it to 0 leaves the decision up to the server. Default value is 1000. |
| attributes-chunk | integer | Specifies the maximum number of attributes to fetch per call to the Read service. If the server fails with TooManyOperations during attribute read, it may help to lower this value. This should be set as high as possible for large servers. Default value is 10000. |
| subscription-chunk | integer | Sets the maximum number of new MonitoredItems to create per operation. If the server fails with TooManyOperations when creating monitored items, try lowering this value. Default value is 1000. |
| browse-throttling | object | Settings for throttling browse operations. |
| certificate-expiry | integer | Specifies the default application certificate expiration time in months. You can also replace the certificate manually by modifying the opc.ua.net.extractor.Config.xml configuration file. Note that the default value was changed as of version 2.5.3. Default value is 60. |
| retries | object | Configuration for retries towards the source. |
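
To illustrate how several of these parameters combine, here is a minimal sketch of a source section for a server that requires a secured endpoint and username/password authentication. The environment variable names OPCUA_USERNAME and OPCUA_PASSWORD are hypothetical, and the numeric values are examples rather than recommendations.

```yaml
source:
    endpoint-url: 'opc.tcp://localhost:4840'
    # Try to connect to an endpoint with security above None
    secure: true
    # Do not accept unknown server certificates automatically; move them from
    # the rejected folder to the accepted folder by hand instead
    auto-accept: false
    username: ${OPCUA_USERNAME}
    password: ${OPCUA_PASSWORD}
    # Ask the server for subscription updates once per second (milliseconds)
    publishing-interval: 1000
    # Send a keep-alive request every 10 seconds (milliseconds); per the formula
    # above the connection times out after 2 * 10000 + 100 = 20100 ms
    keep-alive-interval: 10000
```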

alt-endpoint-urls

Part of source configuration.

List of alternative endpoint URLs the extractor can attempt when connecting to the server. Use this for non-transparent redundancy. See the OPC UA standard part 4, section 6.6.2.

We recommend setting force-restart to true. Otherwise, the extractor will reconnect to the same server each time.

Each element of this list should be a string.
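
A minimal sketch of this list, combined with the force-restart recommendation above. The host names are hypothetical.

```yaml
source:
    endpoint-url: 'opc.tcp://primary-host:4840'
    alt-endpoint-urls:
        - 'opc.tcp://backup-host-1:4840'
        - 'opc.tcp://backup-host-2:4840'
    # Always create a new connection, so that an alternative server can be chosen
    force-restart: true
```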

endpoint-details

Part of source configuration.

Details used to override default endpoint behavior. This is used to make the client connect directly to an OPC UA endpoint, for example if the server is behind NAT (Network Address Translation), circumventing server discovery.

| Parameter | Type | Description |
| --- | --- | --- |
| override-endpoint-url | string | Endpoint URL to override URLs returned from discovery. This can be used if the server is behind NAT, or similar URL rewrites. |
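
For a server behind NAT, a configuration along the following lines could be used. The address 203.0.113.10 is a documentation placeholder, and using the same URL for both parameters is an assumption made for this sketch.

```yaml
source:
    # Externally reachable address of the server
    endpoint-url: 'opc.tcp://203.0.113.10:4840'
    endpoint-details:
        # Use this URL instead of the internal URLs returned from discovery
        override-endpoint-url: 'opc.tcp://203.0.113.10:4840'
```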

redundancy

Part of source configuration.

Additional configuration options related to redundant servers. The OPC UA extractor supports Cold redundancy, as described in the OPC UA standard part 4, section 6.6.2.

| Parameter | Type | Description |
| --- | --- | --- |
| service-level-threshold | integer | Servers above this threshold are considered live. If the server drops below this level, the extractor will switch, provided monitor-service-level is set to true. Default value is 200. |
| reconnect-interval | string | If using redundancy, the extractor will attempt to find a better server with this interval if service level is below threshold. Format is as given in Timestamps and intervals. Default value is 10m. |
| monitor-service-level | boolean | If true, the extractor will subscribe to changes in ServiceLevel and attempt to change server once it drops below service-level-threshold. This also prevents the extractor from updating states while service level is below the threshold, letting servers inform the extractor that they are not receiving data from all sources, and history should not be trusted. Once the service level goes back above the threshold, history will be read to fill any gaps. |
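
A minimal sketch of a redundancy setup that monitors the service level. The threshold and interval shown are the documented defaults, written out explicitly, and the host names are hypothetical.

```yaml
source:
    endpoint-url: 'opc.tcp://primary-host:4840'
    alt-endpoint-urls:
        - 'opc.tcp://backup-host:4840'
    redundancy:
        # Servers above this service level are considered live
        service-level-threshold: 200
        # While below the threshold, look for a better server every 10 minutes
        reconnect-interval: 10m
        # Subscribe to ServiceLevel and switch server when it drops below the threshold
        monitor-service-level: true
```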

x509-certificate

Part of source configuration.

Specifies the configuration for using a signed x509 certificate to connect to the server. Note that this is highly server specific. The extractor uses the provided certificate to sign requests sent to the server. The server must have a mechanism to validate this signature. Typically the certificate must be provided by the server.

| Parameter | Type | Description |
| --- | --- | --- |
| file-name | string | Path to local x509-certificate |
| password | string | Password for local x509-certificate file |
| store | either None, Local or User | Local certificate store to use. One of None (to use a file), Local (for LocalMachine) or User for the User store. Default value is None. |
| cert-name | string | Name of certificate in store. Required to use store. Example: CN=MyCertificate |
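
The two ways of supplying a certificate can be sketched as follows. The file path and the environment variable X509_CERT_PASSWORD are hypothetical.

```yaml
source:
    x509-certificate:
        # Option 1: certificate file on disk (store defaults to None)
        file-name: certificates/client.pfx
        password: ${X509_CERT_PASSWORD}
        # Option 2: certificate from the LocalMachine store.
        # Remove file-name and password above and uncomment these lines.
        # store: Local
        # cert-name: CN=MyCertificate
```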

node-set-source

Part of source configuration.

Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for certain smaller servers, where the full node hierarchy is known before-hand. In general, it can be used to lower the load on the server.

| Parameter | Type | Description |
| --- | --- | --- |
| node-sets | list | Required. List of nodesets to read. Specified by URL, file name, or both. If no name is specified, the last segment of the URL is used as file name. File name is where downloaded files are saved, and where the extractor looks for existing files. Note that typically, you will need to define all schemas your server schema depends on. All servers should depend on the base OPC UA schema, so you should always include https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml Example: [{'file-name': 'Server.NodeSet2.xml'}, {'url': 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'}] |
| instance | boolean | If true, the instance hierarchy is not obtained from the server, but instead read from the NodeSet2 files. |
| types | boolean | If true, event types, reference types, object types, and variable types are obtained from NodeSet2 files instead of the server. |

node-sets

Part of node-set-source configuration.

List of nodesets to read. Specified by URL, file name, or both. If no name is specified, the last segment of the URL is used as file name. File name is where downloaded files are saved, and where the extractor looks for existing files.

Note that typically, you will need to define all schemas your server schema depends on. All servers should depend on the base OPC UA schema, so you should always include https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml

Each element of this list should be a configuration specifying a node set file.

| Parameter | Type | Description |
| --- | --- | --- |
| file-name | string | Path to nodeset file. This is either the place where the downloaded file is saved, or a previously downloaded file. |
| url | string | URL of publicly available nodeset file. |

Example:

[{'file-name': 'Server.NodeSet2.xml'}, {'url': 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'}]
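
Written in block style inside a full node-set-source section, the same list might look like this sketch. Enabling both instance and types is an illustrative choice.

```yaml
source:
    node-set-source:
        node-sets:
            # Server-specific nodeset that already exists on disk
            - file-name: 'Server.NodeSet2.xml'
            # Base OPC UA schema; with no file-name given, it is saved as Opc.Ua.NodeSet2.xml
            - url: 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'
        # Read the instance hierarchy from the files instead of the server
        instance: true
        # Read event, reference, object, and variable types from the files
        types: true
```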

browse-throttling

Part of source configuration.

Settings for throttling browse operations.

| Parameter | Type | Description |
| --- | --- | --- |
| max-per-minute | integer | Maximum number of requests per minute, approximately. |
| max-parallelism | integer | Maximum number of parallel requests. |
| max-node-parallelism | integer | Maximum number of concurrent nodes across all parallel requests. |
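
A minimal sketch of throttling for a server that struggles under browse load. The numbers are hypothetical and should be tuned against the actual server.

```yaml
source:
    browse-throttling:
        # Roughly ten browse requests per second at most
        max-per-minute: 600
        # At most four browse requests in flight at once
        max-parallelism: 4
        # At most 1000 nodes being browsed across all parallel requests
        max-node-parallelism: 1000
```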

retries

Part of source configuration.

Configuration for retries towards the source.

| Parameter | Type | Description |
| --- | --- | --- |
| timeout | either string or integer | Global timeout. After this much time has passed, new retries will not be created. Set this to zero for no timeout. Syntax is N[timeUnit] where timeUnit is d, h, m, s or ms. Default value is 0s. |
| max-tries | integer | Maximum number of attempts. 1 means that only the initial attempt will be made, 0 or less retries forever. Default value is 5. |
| max-delay | either string or integer | Maximum delay between attempts, incremented using exponential backoff. Set this to 0 for no upper limit. Syntax is N[timeUnit] where timeUnit is d, h, m, s or ms. Default value is 0s. |
| initial-delay | either string or integer | Initial delay used for exponential backoff. Time between each retry is calculated as min(max-delay, initial-delay * 2 ^ retry), where 0 is treated as infinite for max-delay. The maximum delay is about 10 minutes (13 retries). Syntax is N[timeUnit] where timeUnit is d, h, m, s or ms. Default value is 500ms. |
| retry-status-codes | list | List of additional OPC-UA status codes to retry on, in addition to the defaults. Should be integer values from http://www.opcfoundation.org/UA/schemas/StatusCode.csv, or symbolic names as shown in the same .csv file. |

retry-status-codes

Part of retries configuration.

List of additional OPC-UA status codes to retry on, in addition to the defaults. Should be integer values from http://www.opcfoundation.org/UA/schemas/StatusCode.csv, or symbolic names as shown in the same .csv file.

Each element of this list should be a string.
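
The sketch below puts the retry parameters together. The comments work through the backoff formula min(max-delay, initial-delay * 2 ^ retry) under the assumption that the retry counter starts at 0. The status code BadServerHalted is just one symbolic name from the StatusCode.csv list, chosen for illustration.

```yaml
source:
    retries:
        # Give up creating new retries after 5 minutes in total
        timeout: 5m
        # One initial attempt plus up to five retries
        max-tries: 6
        # Delays grow as 500ms, 1s, 2s, 4s, 8s for retries 0 to 4
        initial-delay: 500ms
        # A further retry would compute 16s and be capped to 10s
        max-delay: 10s
        retry-status-codes:
            - BadServerHalted
```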

logger

Global parameter.

Configuration for logging to console or file. Log entries are either Fatal, Error, Warning, Information, Debug, or Verbose, in order of decreasing priority. The extractor will log any messages at an equal or higher log level than the configured level for each sink.

| Parameter | Type | Description |
| --- | --- | --- |
| console | object | Configuration for logging to the console. |
| file | object | Configuration for logging to a rotating log file. |
| trace-listener | object | Adds a listener that uses the configured logger to output messages from System.Diagnostics.Trace |
| ua-trace-level | either verbose, debug, information, warning, error or fatal | Capture OPC UA tracing at this level or above. |
| ua-session-tracing | string | Log data sent to and received from the OPC-UA server. WARNING: This produces an enormous amount of logs, only use this when running against a small number of nodes, producing a limited number of datapoints, and make sure it is not turned on in production. |

console

Part of logger configuration.

Configuration for logging to the console.

| Parameter | Type | Description |
| --- | --- | --- |
| level | either verbose, debug, information, warning, error or fatal | Required. Minimum level of log events to write to the console. If not present, or invalid, logging to console is disabled. |
| stderr-level | either verbose, debug, information, warning, error or fatal | Log events at this level or above are redirected to standard error. |

file

Part of logger configuration.

Configuration for logging to a rotating log file.

| Parameter | Type | Description |
| --- | --- | --- |
| level | either verbose, debug, information, warning, error or fatal | Required. Minimum level of log events to write to file. |
| path | string | Required. Path to the files to be logged. If this is set to logs/log.txt, logs of the form logs/log[date].txt will be created, depending on rolling-interval. |
| retention-limit | integer | Maximum number of log files that are kept in the log folder. Default value is 31. |
| rolling-interval | either day or hour | Rolling interval for log files. Default value is day. |

trace-listener

Part of logger configuration.

Adds a listener that uses the configured logger to output messages from System.Diagnostics.Trace

| Parameter | Type | Description |
| --- | --- | --- |
| level | either verbose, debug, information, warning, error or fatal | Required. Level to output trace messages at |
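
A minimal sketch of a logger section that writes informational messages to the console and more detailed messages to a daily rotating file. The levels chosen are illustrative.

```yaml
logger:
    console:
        # Write Information and higher-priority messages to the console
        level: information
        # Send Error and Fatal messages to standard error instead
        stderr-level: error
    file:
        # Write Debug and higher-priority messages to file
        level: debug
        # Produces files of the form logs/log[date].txt
        path: logs/log.txt
        # Keep at most 31 log files (the default)
        retention-limit: 31
        rolling-interval: day
```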

metrics

Global parameter.

Configuration for prometheus metrics destinations.

| Parameter | Type | Description |
| --- | --- | --- |
| server | object | Configuration for a prometheus scrape server. |
| push-gateways | list | A list of push gateway destinations to push metrics to |
| nodes | object | Configuration to treat OPC-UA nodes as metrics. Values will be mapped to opcua_nodes_[NODE-DISPLAY-NAME] in prometheus |

server

Part of metrics configuration.

Configuration for a prometheus scrape server.

| Parameter | Type | Description |
| --- | --- | --- |
| host | string | Required. Host name for the server. Example: localhost |
| port | integer | Required. Port to host the prometheus scrape server on |

push-gateways

Part of metrics configuration.

A list of push gateway destinations to push metrics to

| Parameter | Type | Description |
| --- | --- | --- |
| host | string | Required. URI of the pushgateway host. Example: http://localhost:9091 |
| job | string | Required. Name of the job |
| username | string | Username for basic authentication |
| password | string | Password for basic authentication |
| push-interval | integer | Interval in seconds between each push to the gateway. Default value is 1. |

nodes

Part of metrics configuration.

Configuration to treat OPC-UA nodes as metrics. Values will be mapped to opcua_nodes_[NODE-DISPLAY-NAME] in prometheus

| Parameter | Type | Description |
| --- | --- | --- |
| server-metrics | boolean | Map a few relevant static diagnostics contained in the Server/ServerDiagnosticsSummary node to prometheus metrics. |
| other-metrics | list | List of additional nodes to read as metrics. |

other-metrics

Part of nodes configuration.

List of additional nodes to read as metrics.

| Parameter | Type | Description |
| --- | --- | --- |
| namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace |
| node-id | string | Identifier of the node, in the form i=123, s=string, etc. See the OPC-UA standard |
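
Putting the metrics subsections together, a configuration could look like the following sketch. The port 9000, the job name opcua-extractor, and the node shown under other-metrics are hypothetical.

```yaml
metrics:
    # Expose a scrape endpoint for prometheus
    server:
        host: localhost
        port: 9000
    # Also push metrics to a gateway every 10 seconds
    push-gateways:
        - host: 'http://localhost:9091'
          job: opcua-extractor
          push-interval: 10
    nodes:
        # Map diagnostics from Server/ServerDiagnosticsSummary to metrics
        server-metrics: true
        other-metrics:
            - namespace-uri: 'urn:cognite:net:server'
              node-id: i=123
```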

cognite

Global parameter.

Configure connection to Cognite Data Fusion (CDF)

| Parameter | Type | Description |
| --- | --- | --- |
| project | string | CDF project to connect to. |
| idp-authentication | object | The idp-authentication section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory). See OAuth 2.0 client credentials flow |
| host | string | Insert the base URL of the CDF project. Default value is https://api.cognitedata.com. |
| cdf-retries | object | Configure automatic retries on requests to CDF. |
| cdf-chunking | object | Configure chunking of data on requests to CDF. Note that increasing these may cause requests to fail due to limits in the API itself |
| cdf-throttling | object | Configure the maximum number of parallel requests for different CDF resources. |
| sdk-logging | object | Configure logging of requests from the SDK |
| nan-replacement | either number or null | Replacement for NaN values when writing to CDF. If left out, NaN values are skipped. |
| extraction-pipeline | object | Configure an associated extraction pipeline |
| certificates | object | Configure special handling of SSL certificates. This should never be considered a permanent solution to certificate problems |
| data-set | object | Data set used for new time series, assets, events, and relationships. Existing objects will not be updated |
| read-extracted-ranges | boolean | Specifies whether to read start/end-points for datapoints on startup, where possible. It is generally recommended to use the state-storage instead of this. Default value is True. |
| metadata-targets | object | Configure targets for node metadata. This configures which resources other than time series datapoints are created. By default, if this is left out, data is written to assets and time series metadata. Note that this behavior is deprecated, in the future leaving this section out will result in no metadata being written at all. |
| metadata-mapping | object | Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits, which ideally should be mapped to unit in CDF. This property lets you do that. Example: {'timeseries': {'EngineeringUnits': 'unit', 'EURange': 'description'}, 'assets': {'Name': 'name'}} |
| raw-node-buffer | object | Read from CDF instead of OPC-UA when starting, to speed up start on slow servers. Requires extraction.data-types.expand-node-ids and extraction.data-types.append-internal-values to be set to true. This should generally be enabled along with metadata-targets.raw or with no metadata targets at all. If browse-on-empty is set to true and metadata-targets.raw is configured with the same database and tables the extractor will read into raw on first run, then use raw as source for later runs. The Raw database can be deleted, and it will be re-created on extractor restart. |
| browse-callback | object | Specify a CDF function that is called after nodes are pushed to CDF, reporting the number of changed and created nodes. The function is called with a JSON object. The counts for assets, time series, and relationships also include rows created in CDF Raw tables if raw is enabled as a metadata target. |
| delete-relationships | boolean | If this is set to true, relationships deleted from the source will be hard-deleted in CDF. Relationships do not have metadata, so soft-deleting them is not possible. |
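
Written in block style, the metadata-mapping example from the table above would look like this:

```yaml
cognite:
    metadata-mapping:
        timeseries:
            # Map the OPC UA EngineeringUnits property to the unit attribute in CDF
            EngineeringUnits: unit
            EURange: description
        assets:
            Name: name
```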

idp-authentication

Part of cognite configuration.

The idp-authentication section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory). See OAuth 2.0 client credentials flow

| Parameter | Type | Description |
| --- | --- | --- |
| authority | string | Insert the authority together with tenant to authenticate against Azure tenants. Default value is https://login.microsoftonline.com/. |
| client-id | string | Required. Enter the service principal client id from the IdP. |
| tenant | string | Enter the Azure tenant. |
| token-url | string | Insert the URL to fetch tokens from. |
| secret | string | Enter the service principal client secret from the IdP. |
| resource | string | Resource parameter passed along with token requests. |
| audience | string | Audience parameter passed along with token requests. |
| scopes | configuration for either list or string | |
| min-ttl | integer | Insert the minimum time in seconds a token will be valid. If the cached token expires in less than min-ttl seconds, it will be refreshed even if it is still valid. Default value is 30. |
| certificate | object | Authenticate with a client certificate |
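
For an identity provider other than Microsoft, a configuration based on token-url instead of tenant might look like the sketch below. The environment variable COGNITE_TOKEN_URL is hypothetical. The comments in the minimal configuration file also mention setting implementation: Basic for this case; that parameter is not listed in the table above, so it is left out here and should be checked against the example configuration file.

```yaml
cognite:
    project: '${COGNITE_PROJECT}'
    host: 'https://api.cognitedata.com'
    idp-authentication:
        # URL to fetch tokens from, used instead of tenant
        token-url: ${COGNITE_TOKEN_URL}
        client-id: ${COGNITE_CLIENT_ID}
        secret: ${COGNITE_CLIENT_SECRET}
        scopes:
            - ${COGNITE_SCOPE}
        # Refresh the cached token when it has less than 30 seconds left
        min-ttl: 30
```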

certificate

Part of idp-authentication configuration.

Authenticate with a client certificate

| Parameter | Type | Description |
| --- | --- | --- |
| authority-url | string | Authentication authority URL |
| path | string | Required. Enter the path to the .pem or .pfx certificate to be used for authentication |
| password | string | Enter the password for the key file, if it is encrypted. |
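
A minimal sketch of certificate-based authentication, assuming the certificate takes the place of the client secret. The file path and the environment variable CERT_PASSWORD are hypothetical.

```yaml
cognite:
    idp-authentication:
        tenant: ${COGNITE_TENANT_ID}
        client-id: ${COGNITE_CLIENT_ID}
        scopes:
            - ${COGNITE_SCOPE}
        certificate:
            # Path to the .pem or .pfx certificate
            path: certificates/client.pem
            # Only needed if the key file is encrypted
            password: ${CERT_PASSWORD}
```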

cdf-retries

Part of cognite configuration.

Configure automatic retries on requests to CDF.

| Parameter | Type | Description |
| --- | --- | --- |
| timeout | integer | Timeout in milliseconds for each individual request to CDF. Default value is 80000. |
| max-retries | integer | Maximum number of retries on requests to CDF. If this is less than 0, retry forever. Default value is 5. |
| max-delay | integer | Max delay in milliseconds between each retry. Base delay is calculated according to 125*2^retry milliseconds. If less than 0, there is no maximum. Default value is 5000. |
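
The defaults written out explicitly, as a sketch. The comment works through the base delay formula 125*2^retry under the assumption that the retry counter starts at 0.

```yaml
cognite:
    cdf-retries:
        # Each individual request times out after 80 seconds
        timeout: 80000
        max-retries: 5
        # Base delays are 125, 250, 500, 1000, and 2000 ms for retries 0 to 4;
        # any computed delay above 5000 ms is capped at 5000 ms
        max-delay: 5000
```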

cdf-chunking

Part of cognite configuration.

Configure chunking of data on requests to CDF. Note that increasing these values may cause requests to fail due to limits in the API itself.

Parameter | Type | Description
time-series | integer | Maximum number of timeseries per get/create timeseries request. Default value is 1000.
assets | integer | Maximum number of assets per get/create assets request. Default value is 1000.
data-point-time-series | integer | Maximum number of timeseries per datapoint create request. Default value is 10000.
data-point-delete | integer | Maximum number of ranges per delete datapoints request. Default value is 10000.
data-point-list | integer | Maximum number of timeseries per datapoint read request. Used when getting the first point in a timeseries. Default value is 100.
data-points | integer | Maximum number of datapoints per datapoints create request. Default value is 100000.
data-points-gzip-limit | integer | Minimum number of datapoints in a request to switch to using gzip. Set to -1 to disable, and 0 to always enable (not recommended). The minimum HTTP packet size is generally 1500 bytes, so this should never be set below 100 for numeric datapoints. Even for larger packages gzip is efficient enough that packages are compressed below 1500 bytes. At 5000 it is always a performance gain. It can be set lower if bandwidth is a major issue. Default value is 5000.
raw-rows | integer | Maximum number of rows per request to CDF Raw. Default value is 10000.
raw-rows-delete | integer | Maximum number of row keys per delete request to Raw. Default value is 1000.
data-point-latest | integer | Maximum number of timeseries per datapoint read latest request. Default value is 100.
events | integer | Maximum number of events per get/create events request. Default value is 1000.
sequences | integer | Maximum number of sequences per get/create sequences request. Default value is 1000.
sequence-row-sequences | integer | Maximum number of sequences per create sequence rows request. Default value is 1000.
sequence-rows | integer | Maximum number of sequence rows per sequence when creating rows. Default value is 10000.
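
For illustration, a configuration that sends smaller datapoint requests than the defaults might look like this. The specific numbers are assumptions chosen for the example, not recommendations.

```yaml
cognite:
  cdf-chunking:
    # Send at most 50000 datapoints per create request (default is 100000)
    data-points: 50000
    # Spread each datapoint request over at most 5000 time series (default is 10000)
    data-point-time-series: 5000
    # Keep the default gzip threshold, stated explicitly
    data-points-gzip-limit: 5000
```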

cdf-throttling

Part of cognite configuration.

Configure the maximum number of parallel requests for different CDF resources.

Parameter | Type | Description
time-series | integer | Maximum number of parallel requests per timeseries operation. Default value is 20.
assets | integer | Maximum number of parallel requests per assets operation. Default value is 20.
data-points | integer | Maximum number of parallel requests per datapoints operation. Default value is 10.
raw | integer | Maximum number of parallel requests per raw operation. Default value is 10.
ranges | integer | Maximum number of parallel requests per get first/last datapoint operation. Default value is 20.
events | integer | Maximum number of parallel requests per events operation. Default value is 20.
sequences | integer | Maximum number of parallel requests per sequences operation. Default value is 10.

sdk-logging

Part of cognite configuration.

Configure logging of requests from the SDK.

Parameter | Type | Description
disable | boolean | Set to true to disable logging from the SDK. It is enabled by default.
level | either trace, debug, information, warning, error, critical or none | Log level to log messages from the SDK at. Default value is debug.
format | string | Format of the log message. Default value is CDF ({Message}): {HttpMethod} {Url} {ResponseHeader[X-Request-ID]} - {Elapsed} ms.

extraction-pipeline

Part of cognite configuration.

Configure an associated extraction pipeline.

Parameter | Type | Description
external-id | string | External ID of the extraction pipeline.
frequency | integer | Frequency to report Seen to the extraction pipeline, in seconds. If this is less than or equal to zero, the extractor will not report automatically. Default value is 600.

certificates

Part of cognite configuration.

Configure special handling of SSL certificates. This should never be considered a permanent solution to certificate problems.

Parameter | Type | Description
accept-all | boolean | Accept all remote SSL certificates. This introduces a severe risk of man-in-the-middle attacks.
allow-list | list | List of certificate thumbprints to automatically accept. This is a much smaller risk than accepting all certificates.

allow-list

Part of certificates configuration.

List of certificate thumbprints to automatically accept. This is a much smaller risk than accepting all certificates.

Each element of this list should be a string.
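
A minimal sketch of trusting one specific certificate instead of accepting all of them. The thumbprint below is a placeholder, not a real certificate.

```yaml
cognite:
  certificates:
    # Leave accept-all off, since it allows man-in-the-middle attacks
    accept-all: false
    allow-list:
      # Placeholder thumbprint; replace it with the thumbprint of the certificate to trust
      - '0123456789ABCDEF0123456789ABCDEF01234567'
```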

data-set

Part of cognite configuration.

Data set used for new time series, assets, events, and relationships. Existing objects will not be updated.

Parameter | Type | Description
id | integer | Data set internal ID.
external-id | string | Data set external ID.

metadata-targets

Part of cognite configuration.

Configure targets for node metadata. This configures which resources other than time series datapoints are created. By default, if this section is left out, data is written to assets and time series metadata. Note that this default is deprecated: in the future, leaving this section out will result in no metadata being written at all.

Parameter | Type | Description
raw | object | Write metadata to the CDF staging area (Raw).
clean | object | Write metadata to CDF clean: assets, time series, and relationships.
data-models | object | ALPHA: Write metadata to CDF Data Models. This will create CDF data models based on the OPC UA type hierarchy, then populate them with data from the OPC UA node hierarchy. Note that this requires extraction.relationships.enabled and extraction.relationships.hierarchical to be set to true, and there must be exactly one root node with ID i=84. Note that this feature is in alpha: there may be changes that require you to delete the data model from CDF, and breaking changes to the configuration schema. These changes will not be considered breaking changes to the extractor.

raw

Part of metadata-targets configuration.

Write metadata to the CDF staging area (Raw).

Parameter | Type | Description
database | string | Required. The CDF Raw database to write to.
assets-table | string | Name of the Raw table to write asset metadata to.
timeseries-table | string | Name of the Raw table to write time series metadata to.
relationships-table | string | Name of the Raw table to write relationships metadata to.

clean

Part of metadata-targets configuration.

Write metadata to CDF clean: assets, time series, and relationships.

Parameter | Type | Description
assets | boolean | Set to true to enable creating CDF assets from OPC UA nodes.
timeseries | boolean | Set to true to enable adding metadata to CDF time series based on OPC UA properties.
relationships | boolean | Set to true to enable creating relationships from OPC UA references. Requires extraction.relationships to be enabled.
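
To illustrate how the raw and clean targets fit together, this sketch writes assets and time series metadata to CDF clean and keeps a copy of the same metadata in Raw. The database and table names are made up for the example.

```yaml
cognite:
  metadata-targets:
    clean:
      assets: true
      timeseries: true
      # Left off here, so extraction.relationships is not needed
      relationships: false
    raw:
      # Hypothetical Raw database and table names
      database: opcua-metadata
      assets-table: assets
      timeseries-table: timeseries
```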

data-models

Part of metadata-targets configuration.

ALPHA: Write metadata to CDF Data Models.

This will create CDF data models based on the OPC UA type hierarchy, then populate them with data from the OPC UA node hierarchy. Note that this requires extraction.relationships.enabled and extraction.relationships.hierarchical to be set to true, and there must be exactly one root node with ID i=84.

Note that this feature is in alpha: there may be changes that require you to delete the data model from CDF, and breaking changes to the configuration schema. These changes will not be considered breaking changes to the extractor.

Parameter | Type | Description
enabled | boolean | Required. Set this to true to enable writing to CDF Data Models.
model-space | string | Required. Set the space to create data models in. The space will be created if it does not exist.
instance-space | string | Required. Set the space instances will be created in. The space will be created if it does not exist. May be the same as model-space.
model-version | string | Required. Version used for created data model and all created views.
types-to-map | either Referenced, Custom or All | Configure which types to map to Data Models. Referenced means that only types that are referenced by instance nodes will be created. Custom means that all types not in the base namespace will be created. All means that all types will be created. Note: Setting this to All is rarely useful, and may produce impractically large models. Default value is Custom.
skip-simple-types | boolean | Do not create views without their own connections or properties. Simplifies the model greatly, but reduces the number of distinct types in your model.
ignore-mandatory | boolean | Let mandatory options be nullable. Many servers do not obey Mandatory requirements in their own models, which breaks when they are ingested into CDF, where nullable constraints are enforced.
connection-target-map | object | Target connections in the form "Type.Property": "Target". This is useful for certain schemas. This overrides the expected type of specific CDF Connections, letting you override incorrect schemas. For example, the published nodeset file for ISA-95 incorrectly states that the EquipmentClass reference for EquipmentType is an Object, while it should be an ObjectType. Example: {'EquipmentType.EquipmentClass': 'ObjectType'}
enable-deletes | boolean |

connection-target-map

Part of data-models configuration.

Target connections in the form "Type.Property": "Target". This is useful for certain schemas. This overrides the expected type of specific CDF Connections, letting you override incorrect schemas. For example, the published nodeset file for ISA-95 incorrectly states that the EquipmentClass reference for EquipmentType is an Object, while it should be an ObjectType.

Example:

EquipmentType.EquipmentClass: ObjectType

Parameter | Type | Description
Any string matching [A-z0-9-_.]+ | either ObjectType, Object, VariableType, Variable, ReferenceType, DataType, View or Method | NodeClass to override connection with.
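
Putting the data-models requirements together, a sketch of a full setup might look like this. The space names and version are hypothetical. The extraction section reflects the stated requirements: hierarchical relationships enabled and a single root node with ID i=84.

```yaml
cognite:
  metadata-targets:
    data-models:
      enabled: true
      # Hypothetical space names and version
      model-space: opcua_model
      instance-space: opcua_instances
      model-version: '1'
      types-to-map: Custom
      connection-target-map:
        EquipmentType.EquipmentClass: ObjectType

extraction:
  # Data models require exactly one root node, with ID i=84
  root-node:
    node-id: 'i=84'
  relationships:
    enabled: true
    hierarchical: true
```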

metadata-mapping

Part of cognite configuration.

Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits, which ideally should be mapped to unit in CDF. This property lets you do that.

Example:

timeseries:
  EngineeringUnits: unit
  EURange: description
assets:
  Name: name

Parameter | Type | Description
assets | object | Map metadata for assets.
timeseries | object | Map metadata for time series.

assets

Part of metadata-mapping configuration.

Map metadata for assets.

Parameter | Type | Description
Any string | either description, name or parentId | Target asset attribute.

timeseries

Part of metadata-mapping configuration.

Map metadata for time series.

Parameter | Type | Description
Any string | either description, name, parentId or unit | Target time series attribute.

raw-node-buffer

Part of cognite configuration.

Read from CDF instead of OPC-UA when starting, to speed up start on slow servers. Requires extraction.data-types.expand-node-ids and extraction.data-types.append-internal-values to be set to true.

This should generally be enabled along with metadata-targets.raw or with no metadata targets at all.

If browse-on-empty is set to true and metadata-targets.raw is configured with the same database and tables, the extractor will read into Raw on the first run, then use Raw as the source for later runs. The Raw database can be deleted, and it will be re-created on extractor restart.

Parameter | Type | Description
enable | boolean | Required. Set to true to enable loading nodes from CDF Raw.
database | string | Required. CDF Raw database to read from.
assets-table | string | CDF Raw table to read assets from, used for events. This is not useful if there are no custom nodes generating events in the server.
timeseries-table | string | CDF Raw table to read time series from.
browse-on-empty | boolean | Run normal browse if nothing is found when reading from CDF, either because the tables are empty, or because they do not exist. Note that nodes may be present in the CDF Raw tables. Browse will run if no valid nodes are found, even if there are nodes present in Raw.
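
The combination described above (browse on the first run, then read from Raw on later runs) could be sketched like this. The database and table names are hypothetical; what matters is that metadata-targets.raw and raw-node-buffer point at the same ones.

```yaml
cognite:
  metadata-targets:
    raw:
      # Hypothetical names, shared with raw-node-buffer
      database: opcua-nodes
      assets-table: assets
      timeseries-table: timeseries
  raw-node-buffer:
    enable: true
    database: opcua-nodes
    assets-table: assets
    timeseries-table: timeseries
    # Fall back to a normal browse when Raw holds no valid nodes
    browse-on-empty: true

extraction:
  data-types:
    # Both options are required by raw-node-buffer
    expand-node-ids: true
    append-internal-values: true
```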

browse-callback

Part of cognite configuration.

Specify a CDF function that is called after nodes are pushed to CDF, reporting the number of changed and created nodes. The counts for assets, time series, and relationships also include rows created in CDF Raw tables if raw is enabled as a metadata target.

The function will be called with the following JSON object:

{
  "idPrefix": "Configured ID prefix",
  "assetsCreated": "The number of assets reported to CDF",
  "assetsUpdated": "The number of assets updated in CDF",
  "timeSeriesCreated": "The number of time series created in CDF",
  "timeSeriesUpdated": "The number of time series updated in CDF",
  "minimalTimeSeriesCreated": "The number of time series created with no metadata. Only used if time series metadata is not written to CDF clean",
  "relationshipsCreated": "The number of new relationships created in CDF",
  "rawDatabase": "Name of the configured CDF RAW database",
  "assetsTable": "Name of the configured CDF RAW table for assets",
  "timeSeriesTable": "Name of the configured CDF RAW table for time series",
  "relationshipsTable": "Name of the configured CDF RAW table for relationships."
}

Parameter | Type | Description
id | integer | Internal ID of the function to call.
external-id | string | External ID of the function to call.
report-on-empty | boolean | Call the callback even if zero items are created or updated.
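
As a small sketch, a callback identified by external ID might be configured like this. The function external ID is a made-up name.

```yaml
cognite:
  browse-callback:
    # Hypothetical external ID of a CDF function
    external-id: opcua-browse-report
    # Only call the function when something was created or updated
    report-on-empty: false
```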

mqtt

Global parameter.

Push data to CDF one-way over MQTT. This requires that the MQTT-CDF Bridge application is running somewhere with access to CDF.

Parameter | Type | Description
host | string | Required. The address of the MQTT broker. Example: localhost
port | integer | Required. Port to connect to on the MQTT broker. Example: 1883
username | string | The MQTT broker username. Leave empty to connect without authentication.
password | string | The MQTT broker password. Leave empty to connect without authentication.
use-tls | boolean | Set this to true to enable Transport Layer Security when communicating with the broker.
allow-untrusted-certificates | boolean | Set this to true to allow untrusted SSL certificates when communicating with the broker. This is a security risk; we recommend using custom-certificate-authority instead.
custom-certificate-authority | string | Path to the certificate file for a certificate authority that the broker SSL certificate will be verified against.
client-id | string | MQTT client ID. Should be unique for a given broker. Default value is cognite-opcua-extractor.
data-set-id | integer | Data set to use for new assets, relationships, events, and time series. Existing objects will not be updated.
asset-topic | string | Topic to publish assets on. Default value is cognite/opcua/assets.
ts-topic | string | Topic to publish timeseries on. Default value is cognite/opcua/timeseries.
event-topic | string | Topic to publish events on. Default value is cognite/opcua/events.
datapoint-topic | string | Topic to publish datapoints on. Default value is cognite/opcua/datapoints.
raw-topic | string | Topic to publish raw rows on. Default value is cognite/opcua/raw.
relationship-topic | string | Topic to publish relationships on. Default value is cognite/opcua/relationships.
local-state | string | Set to enable storing a list of created assets/timeseries to local litedb. Requires state-storage.location to be set. If this is left empty, metadata will have to be read each time the extractor restarts.
invalidate-before | integer | Timestamp in ms since epoch to invalidate stored MQTT states. On extractor restart, assets/timeseries created before this will be re-created in CDF. They will not be deleted or updated. Requires state-storage to be enabled.
skip-metadata | boolean | Do not push any metadata at all. If this is true, plain time series without metadata will be created, like when using raw-metadata, and datapoints will be pushed. Nothing will be written to Raw and no assets will be created. Events will be created, but without asset context.
raw-metadata | object | Store assets/timeseries metadata and relationships in Raw. Assets will not be created at all, and timeseries will be created with just externalId, isStep, and isString. Both timeseries and assets will be persisted in their entirety to CDF Raw. Datapoints are not affected. Events will be created but without being contextualized to assets. The external ID of the source node is added to metadata if applicable.
metadata-mapping | object | Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits, which ideally should be mapped to unit in CDF. This property lets you do that. Example: {'timeseries': {'EngineeringUnits': 'unit', 'EURange': 'description'}, 'assets': {'Name': 'name'}}
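
A minimal sketch of an MQTT section using TLS with a custom certificate authority. The file path, client ID, data set ID, and environment variable names are assumptions for the example. Port 8883 is the conventional MQTT-over-TLS port, but the broker may use another.

```yaml
mqtt:
  host: localhost
  # 8883 is the conventional port for MQTT over TLS
  port: 8883
  use-tls: true
  # Hypothetical path to the CA certificate used to verify the broker
  custom-certificate-authority: /etc/opcua-extractor/broker-ca.crt
  # Should be unique for the broker; this name is made up
  client-id: cognite-opcua-extractor-site-a
  # Hypothetical environment variable names
  username: ${MQTT_USERNAME}
  password: ${MQTT_PASSWORD}
  # Placeholder data set ID
  data-set-id: 123456789
```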

raw-metadata

Part of mqtt configuration.

Store assets/timeseries metadata and relationships in Raw. Assets will not be created at all, and timeseries will be created with just externalId, isStep, and isString. Both timeseries and assets will be persisted in their entirety to CDF Raw. Datapoints are not affected. Events will be created but without being contextualized to assets. The external ID of the source node is added to metadata if applicable.

Parameter | Type | Description
database | string | Required. Raw database to write metadata to.
assets-table | string | Raw table to use for assets.
timeseries-table | string | Raw table to use for timeseries.
relationships-table | string | Raw table to use for relationships.

metadata-mapping

Part of mqtt configuration.

Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits, which ideally should be mapped to unit in CDF. This property lets you do that.

Example:

timeseries:
  EngineeringUnits: unit
  EURange: description
assets:
  Name: name

Parameter | Type | Description
assets | object | Map metadata for assets.
timeseries | object | Map metadata for time series.

assets

Part of metadata-mapping configuration.

Map metadata for assets.

Parameter | Type | Description
Any string | either description, name or parentId | Target asset attribute.

timeseries

Part of metadata-mapping configuration.

Map metadata for time series.

Parameter | Type | Description
Any string | either description, name, parentId or unit | Target time series attribute.

influx

Global parameter.

Configuration for pushing to an InfluxDB database. Data points and events will be pushed, but no context or metadata.

Parameter | Type | Description
host | string | Required. URL of the host InfluxDB server.
username | string | The username for connecting to the InfluxDB database.
password | string | The password for connecting to the InfluxDB database.
database | string | Required. The database to connect to on the InfluxDB server. The database will not be created automatically.
point-chunk-size | integer | Maximum number of points to send in each request to InfluxDB. Default value is 100000.
read-extracted-ranges | boolean | Whether to read start/end points on startup, where possible. It is recommended that you use state-storage instead.
read-extracted-event-ranges | boolean | Whether to read start/end points for events on startup, where possible. It is recommended that you use state-storage instead.
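
For illustration, an InfluxDB destination might be configured as shown here. The URL, database name, and environment variable names are assumptions, and the database must already exist on the server.

```yaml
influx:
  # Hypothetical server URL and database name
  host: 'http://localhost:8086'
  database: opcua
  # Hypothetical environment variable names
  username: ${INFLUX_USERNAME}
  password: ${INFLUX_PASSWORD}
  # Send at most 50000 points per request (default is 100000)
  point-chunk-size: 50000
```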

extraction

Global parameter.

Configuration for general extraction options, such as data types, mapping, and filters.

External ID generation

IDs used in OPC UA are special nodeId objects with an identifier and a namespace that must be converted to a string before they are written to CDF. A direct conversion, however, has several potential problems.

  • The namespace index is by default part of the node, but it may change between server restarts. Only the namespace itself is fixed.
  • The namespace table may be modified, in which case all old node IDs are invalidated.
  • Node IDs are not unique between different OPC UA servers.
  • Node identifiers can be duplicated across namespaces.

The solution is to create a node ID in the following form: [id-prefix][namespace][identifierType]=[identifier as string]([optional array index]). For example, the node with node ID ns=1;i=123 with ID prefix gp: would be mapped to gp:http://my.namespace.url:i=123.

You can optionally override this behavior for individual nodes by using node-map.

Parameter | Type | Description
id-prefix | string | Global prefix for externalIds in destinations. Should be unique for each extractor deployment to prevent name conflicts.
root-node | object | Root node. Defaults to the Objects node. Default value is {'node-id': 'i=86'}.
root-nodes | list | List of root nodes. The extractor will start exploring from these. Specifying nodes connected with hierarchical references can result in some strange behavior and should be avoided.
node-map | object | Map from external IDs to OPC UA node IDs. This can, for example, be used to place the node hierarchy as a child of a specific asset in the asset hierarchy.
namespace-map | object | Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used. This saves space compared to using the full URI, and might make IDs more readable.
auto-rebrowse-period | string | Time in minutes between each automatic re-browse of the node hierarchy. Format is as given in Timestamps and intervals; this option accepts cron expressions. Set this to 0 to disable automatic re-browsing of the server. Default value is 0m.
enable-audit-discovery | boolean | Enable this to make the extractor listen to AuditAddNodes and AuditAddReferences events from the server, and use that to identify when new nodes are added to the server. This is more efficient than browsing the node hierarchy, but does not work with data models and requires that the server supports auditing.
data-push-delay | string | Time between each push to destinations. Format is as given in Timestamps and intervals. Default value is 1s.
update | object | Update data in destinations on re-browse or restart. Set auto-rebrowse-period to do this periodically.
data-types | object | Configuration related to how data types and arrays should be handled by the OPC UA extractor.
relationships | object | Map OPC UA references to relationships in CDF, or edges in CDF data models. Generated relationships will have external IDs in the form [prefix][reference type];[source][target]. Only relationships between mapped nodes are extracted.
node-types | object | Configuration for mapping OPC UA types to CDF in some way.
map-variable-children | boolean | Set to true to make the extractor read children of variables and potentially map those to timeseries as well.
transformations | list | A list of transformations to be applied to the source nodes before pushing. Transformations are applied sequentially, so it can help performance to put Ignore filters first, and TimeSeries filters can undo Property transformations.
deletes | object | Configure soft deletes. When this is enabled, all read nodes are written to a state store after browse, and nodes that are missing on subsequent browses are marked as deleted from CDF, with a configurable marker. A notable exception is relationships in CDF, which have no metadata, so these are hard deleted if cognite.delete-relationships is enabled.

root-node

Part of extraction configuration.

Root node. Defaults to the Objects node.

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.

root-nodes

Part of extraction configuration.

List of root nodes. The extractor will start exploring from these. Specifying nodes connected with hierarchical references can result in some strange behavior and should be avoided.

Each element of this list should be a root node.

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.
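
A short sketch of exploring from two root nodes instead of the default Objects node. The namespace URI and node identifiers are hypothetical, and the two nodes are assumed not to be connected by hierarchical references.

```yaml
extraction:
  id-prefix: 'gp:'
  root-nodes:
    # Hypothetical nodes in a custom namespace
    - namespace-uri: 'http://my.namespace.url'
      node-id: 'i=123'
    - namespace-uri: 'http://my.namespace.url'
      node-id: 's=Plant2'
```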

node-map

Part of extraction configuration.

Map from external IDs to OPC UA node IDs. This can, for example, be used to place the node hierarchy as a child of a specific asset in the asset hierarchy.

Parameter | Type | Description
Any string | object | Target node ID for mapping external ID. Default value is {'node-id': 'i=86'}.

proto_node_id

Part of node-map configuration.

Target node ID for mapping external ID.

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.
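
As an illustration of node-map, the sketch below gives one OPC UA node a fixed external ID. Both the external ID and the node are made up for the example.

```yaml
extraction:
  node-map:
    # Key: the external ID to use in CDF (hypothetical)
    'plant-root-asset':
      # Value: the OPC UA node that should get that external ID (hypothetical)
      namespace-uri: 'http://my.namespace.url'
      node-id: 'i=123'
```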

namespace-map

Part of extraction configuration.

Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used. This saves space compared to using the full URI, and might make IDs more readable.

Parameter | Type | Description
Any string | string | Namespace URI to map.
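
A minimal sketch of shortening external IDs with namespace-map, reusing the namespace and prefix from the external ID example. The mns: prefix is an arbitrary choice. Under the ID format described earlier, the node ns=1;i=123 would then be expected to map to something like gp:mns:i=123 instead of gp:http://my.namespace.url:i=123, though the exact result should be checked against a running extractor.

```yaml
extraction:
  id-prefix: 'gp:'
  namespace-map:
    # Namespace URI on the left, the (arbitrary) CDF prefix on the right
    'http://my.namespace.url': 'mns:'
```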

update

Part of extraction configuration.

Update data in destinations on re-browse or restart. Set auto-rebrowse-period to do this periodically.

Parameter | Type | Description
objects | object | Configuration for updating objects and object types.
variables | object | Configuration for updating variables and variable types.

objects

Part of update configuration.

Configuration for updating objects and object types.

Parameter | Type | Description
description | boolean | Set to true to update description.
context | boolean | Set to true to update context, i.e. the position of the node in the node hierarchy.
metadata | boolean | Set to true to update metadata, including fields like unit.
name | boolean | Set to true to update name.

variables

Part of update configuration.

Configuration for updating variables and variable types.

Parameter | Type | Description
description | boolean | Set to true to update description.
context | boolean | Set to true to update context, i.e. the position of the node in the node hierarchy.
metadata | boolean | Set to true to update metadata, including fields like unit.
name | boolean | Set to true to update name.
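
To show how update combines with periodic re-browsing, this sketch refreshes names and descriptions for all nodes, plus metadata for variables. The 60m period is an illustrative value that assumes the same interval format as the 0m default.

```yaml
extraction:
  # Re-browse periodically so that updates are picked up without a restart
  auto-rebrowse-period: 60m
  update:
    objects:
      name: true
      description: true
    variables:
      name: true
      description: true
      # Includes fields like unit
      metadata: true
```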

data-types

Part of extraction configuration.

Configuration related to how data types and arrays should be handled by the OPC UA extractor.

Parameter | Type | Description
custom-numeric-types | list | Set a list of manually defined numeric OPC UA data types. This can be used to make custom types be treated as numbers. Conversion from the actual returned values is done using the C# Convert functionality. If no valid conversion exists, this will fail.
ignore-data-types | list | List of node IDs of data types that should not be extracted to CDF. Time series with one of these data types are not extracted.
unknown-as-scalar | boolean | Assume non-specific ValueRanks in OPC UA (ScalarOrOneDimension or Any) are scalar if they do not have an ArrayDimensions attribute. If such a variable produces an array, only the first element will be extracted.
max-array-size | integer | Maximum length of arrays mapped to CDF. If this is set to 0, only scalar variables are mapped. Set this to -1 to indicate that there is no upper limit. WARNING: If you set this to -1, and encounter a very large array, such as an image, the extractor might create hundreds of thousands of time series in CDF. In general, extracting arrays requires variables to have an ArrayDimensions attribute with length equal to 1. See estimate-array-sizes for workarounds.
allow-string-variables | boolean | Set to true to allow fetching string-typed variables. This means that all variables with a non-numeric type are converted to string.
auto-identify-types | boolean | Map out the data type hierarchy before starting. This is useful if there are custom numeric types, or any enum types are used. This must be enabled for enum metadata and enums-as-strings to work. If this is false, any custom numeric types have to be added manually.
enums-as-strings | boolean | If this is false and auto-identify-types is true, or there are manually added enums in custom-numeric-types, enums will be mapped to numeric time series, and labels are added as metadata fields. If this is true, labels are not mapped to metadata, and enums will be mapped to string time series with values equal to mapped label values.
data-type-metadata | boolean | Add a metadata property dataType, which contains the node ID of the OPC-UA data type.
null-as-numeric | boolean | Set this to true to treat null node IDs as numeric instead of ignoring them. Note that null node IDs violate the OPC UA standard.
expand-node-ids | boolean | Add attributes such as NodeId, ParentNodeId, and TypeDefinitionId to nodes written to CDF Raw as full node IDs with reversible encoding.
append-internal-values | boolean | Add attributes generally used internally like AccessLevel, Historizing, ArrayDimensions, and ValueRank to data pushed to CDF Raw.
estimate-array-sizes | boolean | If max-array-size is set, this looks for the MaxArraySize property on each node with one-dimensional ValueRank. If it is not found, the extractor tries to read the value of the node and look at the current array length. ArrayDimensions is still the preferred way to identify array sizes; this is not guaranteed to generate reasonable or useful values.

custom-numeric-types

Part of data-types configuration.

Set a list of manually defined numeric OPC UA data types. This can be used to make custom types be treated as numbers. Conversion from the actual returned values is done using the C# Convert functionality. If no valid conversion exists, this will fail.

Each element of this list should be a definition of an OPC UA data type.

Parameter | Type | Description
node-id | object | Node ID of the data type. Default value is {'node-id': 'i=86'}.
is-step | boolean | Set to true if is-step should be set on timeseries in CDF.
enum | boolean | Set to true if this data type is an enumeration.

node-id

Part of custom-numeric-types configuration.

Node ID of the data type.

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.

ignore-data-types

Part of data-types configuration.

List of node IDs of data types that should not be extracted to CDF. Time series with one of these data types are not extracted.

Each element of this list should be a node ID of the data type.

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.
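
A sketch of a data-types section that combines several of the options above. The namespace URI and the two data type node IDs are hypothetical, and the array limit is an arbitrary example value.

```yaml
extraction:
  data-types:
    # Needed for enum metadata and enums-as-strings to work
    auto-identify-types: true
    enums-as-strings: true
    allow-string-variables: true
    # Map arrays of up to 10 elements; larger arrays are not mapped
    max-array-size: 10
    custom-numeric-types:
      # Hypothetical custom type that should be treated as a number
      - node-id:
          namespace-uri: 'http://my.namespace.url'
          node-id: 'i=2001'
        is-step: true
    ignore-data-types:
      # Hypothetical data type whose variables should not be extracted
      - namespace-uri: 'http://my.namespace.url'
        node-id: 'i=2002'
```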

relationships

Part of extraction configuration.

Map OPC UA references to relationships in CDF, or edges in CDF data models. Generated relationships will have external IDs in the form [prefix][reference type];[source][target].

Only relationships between mapped nodes are extracted.

Parameter | Type | Description
enabled | boolean | Required. Set to true to enable mapping OPC-UA references to relationships in CDF.
hierarchical | boolean | Set to true to also map the hierarchical references. These are normally mapped to assetId/parentId relations in CDF, but when doing that the type of the reference is lost. Requires enabled to be true.
inverse-hierarchical | boolean | Set to true to create inverse relationships for each of the hierarchical references. For efficiency these are not read, but inferred from forward references. Does nothing if hierarchical is false.
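
As a sketch, extracting all references, including hierarchical ones in both directions, and writing them to CDF clean might look like this. The cognite section is included because relationships are only written if a metadata target accepts them.

```yaml
extraction:
  relationships:
    enabled: true
    # Keep the reference type of hierarchical references
    hierarchical: true
    # Infer inverse relationships from the forward references
    inverse-hierarchical: true

cognite:
  metadata-targets:
    clean:
      relationships: true
```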

node-types

Part of extraction configuration.

Configuration for mapping OPC UA types to CDF in some way.

Parameter | Type | Description
metadata | boolean | Add the TypeDefinition as a metadata field to all objects and variables.
as-nodes | boolean | Allow reading object and variable types as normal nodes and map them to assets or data model nodes in CDF. They will need to be in the mapped hierarchy, so you will need to add a root node that they are descended from.

transformations

Part of extraction configuration.

A list of transformations to be applied to the source nodes before pushing. Transformations are applied sequentially, so it can help performance to put Ignore filters first, and TimeSeries filters can undo Property transformations.

Each element of this list should be a single transformation.

Parameter | Type | Description
type | either Ignore, DropSubscriptions, Property or TimeSeries | Transformation type. Ignore ignores the node and all descendants, Property converts time series and their descendants into metadata, TimeSeries converts metadata into time series, and DropSubscriptions prevents the extractor from subscribing to events or data points on matching nodes.
filter | object | Filter to match. All non-null filters must match each node for the transformation to be applied.

filter

Part of transformations configuration.

Filter to match. All non-null filters must match each node for the transformation to be applied.

Parameter | Type | Description
name | string | Regex on node DisplayName.
description | string | Regex on node Description.
id | string | Regex on node ID. IDs of the form i=123 or s=string are matched.
is-array | boolean | Match on whether a node is an array. If this is set to true or false, the filter will only match variables.
namespace | string | Regex on the full namespace of the node ID.
type-definition | string | Regex on the node ID of the type definition, of the form i=123 or s=string.
historizing | boolean | Whether the node is historizing. If this is set, the filter will only match variables.
node-class | either Object, ObjectType, Variable, VariableType, DataType or ReferenceType | The OPC UA node class, exact match.
parent | object | Another filter instance which is applied to the parent node.
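
The ordering advice and the nested parent filter can be sketched as follows. The regex patterns and node names are hypothetical and only illustrate the shape of the configuration.

```yaml
extraction:
  transformations:
    # Ignore filters go first, so later transformations never see these nodes.
    # Hypothetical pattern: skip every node whose DisplayName starts with "Test".
    - type: Ignore
      filter:
        name: '^Test'
    # Convert array variables under a parent named "Settings" (hypothetical) into metadata.
    # All non-null filter fields must match for the transformation to apply.
    - type: Property
      filter:
        node-class: Variable
        is-array: true
        parent:
          name: '^Settings$'
```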

deletes

Part of extraction configuration.

Configure soft deletes. When this is enabled, all read nodes are written to a state store after browse, and nodes that are missing on subsequent browses are marked as deleted in CDF, with a configurable marker. A notable exception is relationships in CDF, which have no metadata, so these are hard deleted if cognite.delete-relationships is enabled.

Parameter | Type | Description
enabled | boolean | Set this to true to enable deletes. This requires state-storage to be configured.
delete-marker | string | Name of marker used to indicate that a node is deleted. Added to metadata, or as a column in Raw. Default value is deleted.
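
Because deletes depend on a state store, a working setup needs both sections. The fragment below is a sketch under that assumption. The file name state.db and the 10s interval are illustrative values, not recommendations.

```yaml
state-storage:
  # Local LiteDb file used to remember which nodes were seen on earlier browses
  database: LiteDb
  location: 'state.db'
  interval: '10s'

extraction:
  deletes:
    enabled: true
    # Written to metadata (or as a Raw column) on nodes that have disappeared
    delete-marker: 'deleted'
```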

events

Global parameter.

Configuration for extracting OPC UA events and alarms as CDF events or litedb time series.

Parameter | Type | Description
enabled | boolean | Required. Set to true to enable reading events from the server.
history | boolean | Set to true to enable reading event history.
all-events | boolean | Set to true to read base OPC UA events, in addition to custom events. Default value is True.
read-server | boolean | Set to true to enable checking the server node when looking for event emitters. Default value is True.
destination-name-map | object | Map source browse names to other values in the destination. For CDF, internal properties may be overwritten. By default Message is mapped to description, SourceNode is used to set the assetIds, and EventType is mapped to type. These may be replaced by overrides in destination-name-map.

If StartTime, EndTime, or SubType are specified, either on the event in OPC-UA itself, or translated through this map, they are set directly on the CDF event. StartTime and EndTime should be either DateTime, or a number corresponding to the number of milliseconds since 01/01/1970. If no StartTime or EndTime are specified, both are set to the Time property.

Example:
{'MyProperty': 'SubType', 'MyEndTime': 'EndTime'}
event-ids | list | Allow-list of event type IDs to map. If this is included, only the events with type equal to one of these node IDs will be included.
exclude-event-filter | string | Regex filter on event type DisplayName, matches will not be extracted.
emitter-ids | list | List of event emitters to extract from. The default behavior is to extract events from nodes based on the EventNotifier attribute. This option ignores EventNotifier and extracts events from the given list of nodes.
historizing-emitter-ids | list | List of node IDs that must be a subset of the emitter-ids property. These emitters will have their event history read. The server must support historical events. The events.history property must be true for this to work.
discover-emitters | boolean | Automatically treat nodes with suitable EventNotifier as event emitters when they are discovered in the node hierarchy. Default value is True.
exclude-properties | list | List of BrowseNames of event properties that are to be excluded from automatic mapping to destination metadata. By default, LocalTime and ReceiveTime are excluded. Be aware that a maximum of 16 metadata entries are allowed in CDF.

destination-name-map

Part of events configuration.

Map source browse names to other values in the destination. For CDF, internal properties may be overwritten. By default Message is mapped to description, SourceNode is used to set the assetIds, and EventType is mapped to type. These may be replaced by overrides in destination-name-map.

If StartTime, EndTime, or SubType are specified, either on the event in OPC-UA itself, or translated through this map, they are set directly on the CDF event. StartTime and EndTime should be either DateTime, or a number corresponding to the number of milliseconds since 01/01/1970. If no StartTime or EndTime are specified, both are set to the Time property.

Example:

MyProperty: SubType
MyEndTime: EndTime

Parameter | Type | Description
Any string | either StartTime, EndTime, SubType or Type | Target field to map to.

event-ids

Part of events configuration.

Allow-list of event type IDs to map. If this is included, only the events with type equal to one of these node IDs will be included.

Each element of this list should be a node ID of the data type

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.

emitter-ids

Part of events configuration.

List of event emitters to extract from. The default behavior is to extract events from nodes based on the EventNotifier attribute. This option ignores EventNotifier and extracts events from the given list of nodes.

Each element of this list should be a node ID of the data type

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.

historizing-emitter-ids

Part of events configuration.

List of node IDs that must be a subset of the emitter-ids property. These emitters will have their event history read. The server must support historical events. The events.history property must be true for this to work.

Each element of this list should be a node ID of the data type

Parameter | Type | Description
namespace-uri | string | Full URI of the node namespace. If left out, it is assumed to be the base namespace.
node-id | string | Identifier of the node ID, in the form i=123, s=string, etc. See the OPC-UA standard.

exclude-properties

Part of events configuration.

List of BrowseNames of event properties that are to be excluded from automatic mapping to destination metadata. By default, LocalTime and ReceiveTime are excluded. Be aware that a maximum of 16 metadata entries are allowed in CDF.

Each element of this list should be a string.
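
To show how the events options fit together, here is a sketch of an events section. The namespace URI, node IDs, regex and the property name InternalDebugInfo are hypothetical placeholders. Only the parameter names come from the tables above.

```yaml
events:
  enabled: true
  # Also read historical events; needed for historizing-emitter-ids to have any effect
  history: true
  destination-name-map:
    MyProperty: SubType
    MyEndTime: EndTime
  # Only map events of this type (hypothetical node ID in a hypothetical namespace)
  event-ids:
    - namespace-uri: 'http://example.com/my-namespace'
      node-id: 'i=123'
  # Skip event types whose DisplayName starts with "Audit" (hypothetical pattern)
  exclude-event-filter: '^Audit'
  # Extract from this node regardless of its EventNotifier attribute
  emitter-ids:
    - namespace-uri: 'http://example.com/my-namespace'
      node-id: 's=MyEmitter'
  # Must be a subset of emitter-ids
  historizing-emitter-ids:
    - namespace-uri: 'http://example.com/my-namespace'
      node-id: 's=MyEmitter'
  # Keep this property out of the destination metadata
  exclude-properties:
    - InternalDebugInfo
```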

failure-buffer

Global parameter.

If the connection to CDF goes down, the OPC UA extractor supports buffering data points and events in a local file or InfluxDB. This is helpful if the connection is unstable, and the server does not provide its own historical data.

Parameter | Type | Description
enabled | boolean | Set to true to enable the failure buffer.
datapoint-path | string | Path to a binary file where data points are buffered. Buffering to a file is very fast, and can handle a large amount of data. The file will be created for you if it does not already exist.

Example:
buffer.bin
event-path | string | Path to a binary file for storing events. Buffering to a file is very fast and can handle a large amount of data. The file will be created for you if it does not already exist. Must be different from datapoint-path.
max-buffer-size | integer | Set the maximum size in bytes of the buffer file. If the size of the file exceeds this number, no new datapoints or events will be written to their respective buffer files, and any further ephemeral data is lost. Note that if both datapoint and event buffers are enabled, the potential disk usage is twice this number.
influx-state-store | boolean | Set to true to enable storing the state of the InfluxDB buffer. This makes the InfluxDB buffer persistent even if the OPC UA extractor stops before it is emptied. Requires state-storage to be configured.
influx | boolean | Set to true to enable buffering data in InfluxDB. This requires a configured influx pusher. See influx. This serves as an alternative to a local file, but should only be used if pushing to InfluxDB is already a requirement.
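
A file-based failure buffer could be configured along these lines. The event file name and the size limit are illustrative assumptions. With both files enabled, the worst-case disk usage in this sketch is 2 * 104857600 bytes, about 200 MiB.

```yaml
failure-buffer:
  enabled: true
  # Binary file for buffered data points
  datapoint-path: 'buffer.bin'
  # Must differ from datapoint-path (file name is an example)
  event-path: 'event-buffer.bin'
  # 100 MiB per buffer file (100 * 1024 * 1024 bytes)
  max-buffer-size: 104857600
```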

history

Global parameter.

Configuration for reading historical datapoints and events from the server.

Parameter | Type | Description
enabled | boolean | Set to true to enable reading history from the server. If this is false no history will be read, overriding configuration set elsewhere.
data | boolean | Set to false to disable reading history for data points. Default value is True.
backfill | boolean | Enable backfill, meaning that data is read backward and forward through history. This makes it so that the extractor will prioritize reading recent values, then keep reading old data in the background while loading live data. This is potentially useful if there is a lot of history.
require-historizing | boolean | Set this to true to require the Historizing attribute to be true on OPC UA variables in order to read history from them.
restart-period | string | Time in seconds between restarts of history. Setting this too low may impact performance. Leave at 0 to disable periodic restarts. Format is as given in Timestamps and intervals. This option also allows using cron expressions.
data-chunk | integer | Maximum number of datapoints per variable for each HistoryRead service call. Generally, this is limited by the server, so it can safely be set to 0, which means the server decides the number of points to return. Default value is 1000.
data-nodes-chunk | integer | Maximum number of variables to query in each HistoryRead service call. If granularity is set, this is applied after calculating chunks based on history granularity. Default value is 100.
event-chunk | integer | Maximum number of events per node for each HistoryRead service call for events. Generally, this is limited by the server, so it can safely be set to 0, which means the server decides the number of events to return. Default value is 1000.
event-nodes-chunk | integer | Maximum number of nodes to query in each HistoryRead service call for events. Default value is 100.
granularity | string | Granularity in seconds for chunking history read operations. Variables with latest timestamp within the same granularity chunk will have their history read together. Reading more variables per operation is more efficient, but if the granularity is set too high, then the same history may be read multiple times. A good choice for this value is 60 * average_update_frequency.

Format is as given in Timestamps and intervals. Default value is 600s.
start-time | string | Earliest timestamp to read history from in milliseconds since January 1, 1970. Format is as given in Timestamps and intervals, -ago can be added to make a relative timestamp in the past. Default value is 0.

Example:
3d-ago
end-time | string | The latest timestamp history will be read from. In milliseconds since 01/01/1970. Alternatively use syntax N[timeunit](-ago) where timeunit is one of w, d, h, m, s, or ms. -ago indicates that this is set in the past, if left out it will be that duration in the future.
max-read-length | string | Maximum length in time of each history read. If this is greater than zero, history will be read in chunks until the end. This is a workaround for server issues, do not use this unless you have concrete issues with continuation points. Format is as given in Timestamps and intervals. Default value is 0.
ignore-continuation-points | boolean | Set to true to read history without using ContinuationPoints, instead using the Time field on events, and SourceTimestamp on datapoints to incrementally increase the start time of the request until no data is returned. This is a workaround for server issues, do not use this unless you have concrete issues with continuation points.
throttling | object | Configuration for throttling history.
log-bad-values | boolean | Log the number of bad datapoints per HistoryRead call at Debug level, and each individual bad datapoint at Verbose level. Default value is True.
error-threshold | integer | Threshold in percent for a history run to be considered failed. Example: 10.0 -> History read operation would be considered failed if more than 10% of nodes fail to read at some point. Retries still apply, this only applies to nodes that fail even after retries.

This is safe in terms of data loss. A node that has failed during history will not receive state updates from live values. The next time history is read, the extractor continues from where it last successfully read history. Default value is 10.

throttling

Part of history configuration.

Configuration for throttling history

Parameter | Type | Description
max-per-minute | integer | Maximum number of requests per minute, approximately.
max-parallelism | integer | Maximum number of parallel requests.
max-node-parallelism | integer | Maximum number of concurrent nodes across all parallel requests.
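
The history options and their throttling sub-section can be sketched together as below. The throttling numbers are arbitrary illustrations and should be tuned to what the server tolerates. The chunk sizes and granularity repeat the documented defaults.

```yaml
history:
  enabled: true
  # Read recent values first, then work backward through older data
  backfill: true
  # Only read history from variables with Historizing set to true
  require-historizing: true
  # Do not read data older than three days
  start-time: '3d-ago'
  # Documented default; variables whose latest timestamps fall in the same chunk are read together
  granularity: '600s'
  data-chunk: 1000
  data-nodes-chunk: 100
  throttling:
    # Illustrative limits, not recommendations
    max-per-minute: 60
    max-parallelism: 4
    max-node-parallelism: 200
```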

state-storage

Global parameter.

Use a local LiteDb database or a set of tables in CDF RAW to store persistent information between runs. This can be used to avoid loading large volumes of data from CDF on startup, which can greatly speed up the extractor.

Parameter | Type | Description
location | string | Required. Path to .db file used for storage, or name of a CDF RAW database.
database | either None, LiteDb or Raw | Which type of database to use. Default value is None.
interval | string | Interval between each write to the buffer file, in seconds. 0 or less disables the state store. Format is as given in Timestamps and intervals. Default value is 0s.
variable-store | string | Name of raw table or litedb collection to store information about extracted OPC UA variables. Default value is variable_states.
event-store | string | Name of raw table or litedb collection to store information about extracted events. Default value is event_states.
influx-variable-store | string | Name of raw table or litedb collection to store information about variable ranges in the InfluxDB failure buffer. Default value is influx_variable_states.
influx-event-store | string | Name of raw table or litedb collection to store information about events in the InfluxDB failure buffer. Default value is influx_event_states.
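
As a sketch, a state store backed by CDF RAW might look like this. The database name opcua-state and the 30s interval are hypothetical. The store names are the documented defaults, written out only to make them visible.

```yaml
state-storage:
  # Use a set of tables in CDF RAW instead of a local LiteDb file
  database: Raw
  # Name of the RAW database (hypothetical)
  location: 'opcua-state'
  # Must be greater than 0, otherwise the state store is disabled
  interval: '30s'
  variable-store: 'variable_states'
  event-store: 'event_states'
```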

subscriptions

Global parameter.

A few options for subscriptions to events and data points. Subscriptions in OPC UA consist of Subscription objects on the server, which contain a list of MonitoredItems. By default, the extractor produces a maximum of four subscriptions:

  • DataChangeListener - handles data point subscriptions.
  • EventListener - handles event subscriptions.
  • AuditListener - handles audit events.
  • NodeMetrics - handles subscriptions for use as metrics.

Each of these can contain a number of MonitoredItems.

Parameter | Type | Description
data-change-filter | object | Modify the DataChangeFilter used for datapoint subscriptions. See the OPC UA reference part 4 7.17.2 for details. These are just passed to the server, they have no further effect on extractor behavior. Filters are applied to all nodes, but deadband should only affect some, according to the standard.
sampling-interval | integer | Requested sample interval per variable on the server. This is how often the extractor requests the server sample changes to values. The server has no obligation to use this value, or to use any form of sampling at all, but according to the standard this should limit the lowest allowed rate of updates. 0 tells the server to sample as fast as possible. Default value is 100.
queue-length | integer | Requested length of queue for each variable on the server. This is how many data points the server will buffer. It should in general be set to at least 2 * publishing-interval / sampling-interval.

This setting generally sets the maximum rate of points from the server (in milliseconds). On many servers, sampling is an internal operation, but on some, this may access external systems. Setting this very low can increase the load on the server significantly. Default value is 100.
data-points | boolean | Enable subscriptions on datapoints. Default value is True.
events | boolean | Enable subscriptions on events. Requires events.enabled to be set to true. Default value is True.
ignore-access-level | boolean | Ignore the access level parameter for history and datapoints. This means using the Historizing parameter for history, and subscribing to all timeseries.
log-bad-values | boolean | Log bad subscription datapoints. Default value is True.
keep-alive-count | integer | The number of publish requests without a response before the server should send a keep alive message. Default value is 10.
lifetime-count | integer | The number of publish requests without a response before the server should close the subscription. Must be at least 3 * keep-alive-count. Default value is 1000.
recreate-stopped-subscriptions | boolean | Recreate subscriptions that have stopped publishing. Default value is True.
recreate-subscription-grace-period | string | Grace period for recreating stopped subscriptions. If this is negative, default to 8 * publishing-interval. Format is as given in Timestamps and intervals. Default value is -1.
alternative-configs | list | List of alternative subscription configurations. The first entry with a matching filter will be used for each node.

data-change-filter

Part of subscriptions configuration.

Modify the DataChangeFilter used for datapoint subscriptions. See the OPC UA reference part 4 7.17.2 for details. These are just passed to the server, they have no further effect on extractor behavior. Filters are applied to all nodes, but deadband should only affect some, according to the standard.

Parameter | Type | Description
trigger | either Status, StatusValue or StatusValueTimestamp | What changes to a variable trigger an update. One of Status, StatusValue or StatusValueTimestamp. Default value is StatusValue.
deadband-type | either None, Absolute or Percent | Enable deadband for numeric nodes. One of None, Absolute or Percent. Default value is None.
deadband-value | integer | Deadband value, effect depends on deadband type.
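
The queue-length guidance and the data change filter can be illustrated with a short sketch. It assumes a publishing interval of 500 ms, which is a made-up value for this example, so the suggested minimum queue length is 2 * 500 / 100 = 10. The deadband value is also illustrative.

```yaml
subscriptions:
  # Ask the server to sample each variable every 100 ms
  sampling-interval: 100
  # At least 2 * publishing-interval / sampling-interval = 2 * 500 / 100 = 10
  # (assumes a publishing interval of 500 ms)
  queue-length: 10
  data-change-filter:
    # Report changes to status or value, but not timestamp-only changes
    trigger: StatusValue
    # Only report numeric changes larger than the deadband value
    deadband-type: Absolute
    deadband-value: 1
  keep-alive-count: 10
  # Must be at least 3 * keep-alive-count
  lifetime-count: 1000
```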

alternative-configs

Part of subscriptions configuration.

List of alternative subscription configurations. The first entry with a matching filter will be used for each node.

Each element of this list should be an alternative subscription configuration

Parameter | Type | Description
filter | object | Filter on node, if this matches or is null, the config will be applied.
data-change-filter | object | Modify the DataChangeFilter used for datapoint subscriptions. See the OPC UA reference part 4 7.17.2 for details. These are just passed to the server, they have no further effect on extractor behavior. Filters are applied to all nodes, but deadband should only affect some, according to the standard.
sampling-interval | integer | Requested sample interval per variable on the server. This is how often the extractor requests the server sample changes to values. The server has no obligation to use this value, or to use any form of sampling at all, but according to the standard this should limit the lowest allowed rate of updates. 0 tells the server to sample as fast as possible. Default value is 100.
queue-length | integer | Requested length of queue for each variable on the server. This is how many data points the server will buffer. It should in general be set to at least 2 * publishing-interval / sampling-interval.

This setting generally sets the maximum rate of points from the server (in milliseconds). On many servers, sampling is an internal operation, but on some, this may access external systems. Setting this very low can increase the load on the server significantly. Default value is 100.

filter

Part of alternative-configs configuration.

Filter on node, if this matches or is null, the config will be applied.

Parameter | Type | Description
id | string |
data-type | string | Regex match on node data type, if it is a variable.
is-event-state | either boolean or null | Match on whether this subscription is for data points or events.

data-change-filter

Part of alternative-configs configuration.

Modify the DataChangeFilter used for datapoint subscriptions. See the OPC UA reference part 4 7.17.2 for details. These are just passed to the server, they have no further effect on extractor behavior. Filters are applied to all nodes, but deadband should only affect some, according to the standard.

Parameter | Type | Description
trigger | either Status, StatusValue or StatusValueTimestamp | What changes to a variable trigger an update. One of Status, StatusValue or StatusValueTimestamp. Default value is StatusValue.
deadband-type | either None, Absolute or Percent | Enable deadband for numeric nodes. One of None, Absolute or Percent. Default value is None.
deadband-value | integer | Deadband value, effect depends on deadband type.
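
An alternative subscription configuration could be sketched as follows. The data-type pattern is a hypothetical regex, since this page does not state exactly what the pattern is matched against, and the numeric values are illustrative only.

```yaml
subscriptions:
  alternative-configs:
    # The first entry whose filter matches a node is used for that node
    - filter:
        # Hypothetical regex on the node data type
        data-type: 'Double'
        # Apply to data point subscriptions, not events
        is-event-state: false
      # Slower sampling for the matching variables
      sampling-interval: 1000
      queue-length: 10
      data-change-filter:
        deadband-type: Percent
        deadband-value: 5
```

Nodes that match no entry fall back to the top-level subscriptions settings.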

pub-sub

Global parameter.

Configure the extractor to read from MQTT using OPC-UA pubsub. This requires the server pubsub configuration to be exposed through the Server object. You should consider setting subscriptions: data-points: false to avoid duplicate datapoints if this is enabled.

Parameter | Type | Description
enabled | boolean | Enable PubSub.
prefer-uadp | boolean | Prefer using the UADP binary format. If false JSON is preferred. Default value is True.
file-name | string | Save or read configuration from a file. If the file does not exist, it will be created from server configuration. If this is pre-created manually, the server does not need to expose pubsub configuration.
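
A sketch of enabling PubSub while avoiding duplicate data points is shown below. The file name is a hypothetical example.

```yaml
pub-sub:
  enabled: true
  # Prefer the UADP binary format over JSON
  prefer-uadp: true
  # Created from the server configuration if it does not exist (hypothetical name)
  file-name: 'pubsub-config.bin'

subscriptions:
  # Turn off regular data point subscriptions so values do not arrive twice
  data-points: false
```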

high-availability

Global parameter.

Configuration to allow you to run multiple redundant extractors. Each extractor needs a unique index.

Parameter | Type | Description
index | integer | Required. Unique index of this extractor. Each redundant extractor must have a unique index.
raw | object | Configuration to use Raw as backend for high availability.
redis | object | Configuration to use a Redis store as backend for high availability.

raw

Part of high-availability configuration.

Configuration to use Raw as backend for high availability

Parameter | Type | Description
database-name | string | Required. Raw database to store high availability states in.
table-name | string | Required. Raw table to store high availability states in.

redis

Part of high-availability configuration.

Configuration to use a Redis store as backend for high availability

Parameter | Type | Description
connection-string | string | Required. Connection string to connect to redis instance.
table-name | string | Required. Redis table name to store high availability states in.
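
For two redundant extractors sharing state through Raw, the first extractor's configuration might look like the sketch below. The database and table names are hypothetical. The second extractor would use the same raw settings with index: 1.

```yaml
high-availability:
  # Must be unique for each redundant extractor
  index: 0
  raw:
    # Hypothetical Raw database and table shared by all redundant extractors
    database-name: 'opcua-ha'
    table-name: 'extractor-states'
```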