Configuration settings
To configure the OPC UA extractor, you must edit the configuration file. The file is in YAML format, and the sample configuration file contains all valid options with default values.
You can leave many fields empty to let the extractor use the default values. The configuration file separates the settings by component, and you can remove an entire component to disable it or use the default values.
Sample configuration files
In the extractor installation folder, the /config subfolder contains sample complete and minimal configuration files. The values wrapped in ${} are replaced with environment variables with that name. For example, ${COGNITE_PROJECT} will be replaced with the value of the environment variable called COGNITE_PROJECT.
The configuration file also contains the global parameter version, which holds the version of the configuration schema used in the configuration file. This document describes version 1 of the configuration schema.
Note that it is not recommended to use config.example.yml as a basis for configuration files. This file contains all configuration options, which makes it hard to read and may cause issues. It is intended as a reference showing how each option is configured, not as a basis. Use config.minimal.yml instead.
You can set up extraction pipelines to use versioned extractor configuration files stored in the cloud.
Minimal YAML configuration file
version: 1
source:
  # The URL of the OPC-UA server to connect to
  endpoint-url: 'opc.tcp://localhost:4840'
cognite:
  # The project to connect to in the API, uses the environment variable COGNITE_PROJECT.
  project: '${COGNITE_PROJECT}'
  # Cognite authentication
  # This is for Microsoft as IdP. To use a different provider,
  # set implementation: Basic, and use token-url instead of tenant.
  # See the example config for the full list of options.
  idp-authentication:
    # Directory tenant
    tenant: ${COGNITE_TENANT_ID}
    # Application Id
    client-id: ${COGNITE_CLIENT_ID}
    # Client secret
    secret: ${COGNITE_CLIENT_SECRET}
    # List of resource scopes, ex:
    # scopes:
    #   - scopeA
    #   - scopeB
    scopes:
      - ${COGNITE_SCOPE}
extraction:
  # Global prefix for externalId in destinations. Should be unique to prevent name conflicts.
  id-prefix: 'gp:'
  # Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used.
  # Saves space compared to using the full URL. Using the ns index is not safe as the order can change on the server.
  # It is recommended to set this before extracting the node hierarchy.
  # For example:
  # NamespaceMap:
  #   "urn:cognite:net:server": cns
  #   "urn:freeopcua:python:server": fps
  #   "http://examples.freeopcua.github.io": efg
ProtoNodeId
You can provide an OPC UA NodeId in several places in the configuration file. These are YAML objects with the following structure:
node:
  node-id: i=123
  namespace-uri: opc.tcp://test.test
To find the node IDs, we recommend using the UAExpert tool.
Locate the node whose ID you need in the hierarchy, then find the node ID on the right side under Attribute > NodeId. Find the Namespace Uri by matching the NamespaceIndex on the right to the dropdown on the left, in the node hierarchy view. The default value of this dropdown is No highlight.
Timestamps and intervals
In most places where time intervals are required, you can use a CDF-like syntax of [N][timeunit], for example 10m for 10 minutes or 1h for 1 hour. timeunit is one of d, h, m, s, ms. You can also use a cron expression in some places.
For history start and end times you can use a similar syntax: [N][timeunit] and [N][timeunit]-ago. 1d-ago means 1 day in the past from the time history starts, and 1h means 1 hour in the future. For instance, you can use this syntax to configure the extractor to read only recent history.
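As an illustration of the interval syntax, the fragment below sets three interval-valued parameters that are documented later on this page. The values are arbitrary examples chosen for the sketch, not recommendations.

```yaml
source:
  redundancy:
    # Look for a better server every 5 minutes while the
    # service level is below the threshold
    reconnect-interval: 5m
  retries:
    # Stop creating new retries after 2 minutes in total
    timeout: 2m
    # Start the exponential backoff at 250 milliseconds
    initial-delay: 250ms
```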
Using values from Azure Key Vault
The OPC UA extractor also supports loading values from Azure Key Vault. To load a configuration value from Azure Key Vault, use the !keyvault tag followed by the name of the secret you want to load. For example, to load the value of the my-secret-name secret in Key Vault into a password parameter, configure your extractor like this:
password: !keyvault my-secret-name
To use Key Vault, you also need to include the key-vault section in your configuration, with the following parameters:
Parameter | Description |
---|---|
keyvault-name | Name of Key Vault to load secrets from |
authentication-method | How to authenticate to Azure. Either default or client-secret . For default , the extractor will look at the user running the extractor, and look for pre-configured Azure logins from tools like the Azure CLI. For client-secret , the extractor will authenticate with a configured client ID/secret pair. |
client-id | Required for using the client-secret authentication method. The client ID to use when authenticating to Azure. |
secret | Required for using the client-secret authentication method. The client secret to use when authenticating to Azure. |
tenant-id | Required for using the client-secret authentication method. The tenant ID of the Key Vault in Azure. |
Example:
azure-keyvault:
  keyvault-name: my-keyvault-name
  authentication-method: client-secret
  tenant-id: 6f3f324e-5bfc-4f12-9abe-22ac56e2e648
  client-id: 6b4cc73e-ee58-4b61-ba43-83c4ba639be6
  secret: 1234abcd
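For comparison, a minimal sketch of the default authentication method combined with a !keyvault reference might look like the following. The top-level section name is copied from the example above, and the user name and endpoint are placeholder assumptions.

```yaml
azure-keyvault:
  keyvault-name: my-keyvault-name
  # Rely on pre-configured Azure logins for the user running
  # the extractor, for example from the Azure CLI
  authentication-method: default

source:
  endpoint-url: 'opc.tcp://localhost:4840'
  # Placeholder user name, not a real account
  username: my-opcua-user
  # Loaded from the Key Vault secret named my-secret-name
  password: !keyvault my-secret-name
```

With the default method, no client-id, secret, or tenant-id is needed, since those are only required for client-secret.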
Configure the OPC UA extractor
OPC-UA extractor configuration
Parameter | Type | Description |
---|---|---|
version | integer | Version of the config file. The extractor specifies which config file versions are accepted in each version of the extractor. |
dry-run | boolean | Set this to true to prevent the extractor from writing anything to CDF. This is useful for debugging the extractor configuration. |
source | object | |
logger | object | Configure logging to console or file |
metrics | object | Configure Prometheus metrics: a scrape server, push gateways, and OPC UA nodes exposed as metrics |
cognite | object | Configuration for pushing data to Cognite Data Fusion (CDF) |
mqtt | object | Push data to CDF one-way over MQTT. This requires that the MQTT-CDF Bridge application is running somewhere with access to CDF. |
influx | object | Configuration for pushing to an InfluxDB database. Data points and events will be pushed, but no context or metadata. |
extraction | object | Configuration for general extraction options, such as data types, mapping, and filters. |
events | object | Configuration for extracting OPC UA events and alarms as CDF events or litedb time series |
failure-buffer | object | If the connection to CDF goes down, the OPC UA extractor supports buffering data points and events in a local file or InfluxDB. This is helpful if the connection is unstable, and the server does not provide its own historical data. |
history | object | Configuration for reading historical datapoints and events from the server |
state-storage | object | A local LiteDB database or a database in CDF RAW that stores various persistent information between extractor runs. This is used to replace reading first/last data points from CDF, and also allows storing first/last times for events. Enabling this is highly recommended, and will be required in a future version of the extractor. |
subscriptions | object | A few options for subscriptions to events and data points. Subscriptions in OPC UA consist of Subscription objects on the server, which contain a list of MonitoredItems. By default, the extractor produces a maximum of four subscriptions: * DataChangeListener - handles data point subscriptions. * EventListener - handles event subscriptions. * AuditListener - handles audit events. * NodeMetrics - handles subscriptions for use as metrics. Each of these can contain a number of MonitoredItems. |
pub-sub | object | Configure the extractor to read from MQTT using OPC-UA pubsub. This requires the server pubsub configuration to be exposed through the Server object. You should consider setting subscriptions: data-points: false to avoid duplicate datapoints if this is enabled. |
high-availability | object | Configuration to allow you to run multiple redundant extractors. Each extractor needs a unique index. |
source
Global parameter.
Parameter | Type | Description |
---|---|---|
endpoint-url | string | The URL of the OPC UA server to connect to. In practice, this is the URL of the discovery server, where multiple levels of security may be provided. The OPC UA extractor attempts to use the highest security possible based on the configuration. Example: opc.tcp://some-host:1883 |
alt-endpoint-urls | list | List of alternative endpoint URLs the extractor can attempt when connecting to the server. Use this for non-transparent redundancy. See the OPC UA standard part 4, section 6.6.2. We recommend setting force-restart to true . Otherwise, the extractor will reconnect to the same server each time. |
endpoint-details | object | Details used to override default endpoint behavior. This is used to make the client connect directly to an OPC UA endpoint, for example if the server is behind NAT (Network Address Translation), circumventing server discovery. |
redundancy | object | Additional configuration options related to redundant servers. The OPC UA extractor supports Cold redundancy, as described in the OPC UA standard part 4, section 6.6.2. |
reverse-connect-url | string | The local URL used for reverse connect, which means that the server is responsible for initiating connections, not the extractor. This lets the server be behind a firewall, forbidding incoming connections. You must also specify an endpoint-url , to indicate to the extractor where it should accept connections from. |
auto-accept | boolean | Set to true to automatically accept server certificates. If this is disabled, received server certificates will be placed in the rejected certificates folder (by default application_dir/certificates/pki/rejected ), and you can manually move them to the accepted certificates folder (application_dir/certificates/pki/accepted ). Setting this to true makes the extractor move certificates automatically. A simple solution would be to set this to true for the first connection, then change it to false . Warning: This should be disabled if the extractor is running on an untrusted network. Default value is True . |
username | string | OPC UA server username, leave empty to disable username/password authentication. |
password | string | OPC UA server password. |
x509-certificate | object | Specifies the configuration for using a signed x509 certificate to connect to the server. Note that this is highly server specific. The extractor uses the provided certificate to sign requests sent to the server. The server must have a mechanism to validate this signature. Typically the certificate must be provided by the server. |
secure | boolean | Set this to true to make the extractor try to connect to an endpoint with security above None . If this is enabled, the extractor will try to pick the most secure endpoint, meaning the endpoint with the longest of the most modern cipher types. |
ignore-certificate-issues | boolean | Ignore all suppressible certificate errors on the server certificate. You can use this setting if you receive errors such as Certificate use not allowed. CAUTION: This is potentially a security risk. Bad certificates can open the extractor to man-in-the-middle attacks or similar. If the server is secured in other ways (it is running locally, over a secure VPN, or similar), it is most likely fairly harmless. Some errors are not suppressible and must be remedied on the server. Note that enabling this is always a workaround for the server violating the OPC UA standard in some way. |
publishing-interval | integer | Sets the interval (in milliseconds) between publish requests to the server, which is when the extractor asks the server for updates to any active subscriptions. This limits the maximum frequency of points pushed to CDF, but not the maximum frequency of points on the server. In most cases, this can be set to the same as extraction.data-push-delay . If set to 0 the server chooses the interval to be as low as it supports. Be aware that some servers set this lower limit very low, which may create considerable load on the server. Default value is 500 . |
force-restart | boolean | If true , the extractor will not attempt to reconnect using the OPC UA reconnect protocol if connection is lost, but instead always create a new connection. Only enable this if reconnect is causing issues with the server. Even if this is disabled, the extractor will generally fall back on regular reconnects if the server produces unexpected errors on reconnect. |
exit-on-failure | boolean | If true , the OPC UA extractor will be restarted completely on reconnect. Enable this if the server is expected to change dramatically while running, and the extractor cannot keep using state from previous runs. |
keep-alive-interval | integer | Specifies the interval in milliseconds between each keep-alive request to the server. The connection times out if a keep-alive request fails twice (2 * interval + 100ms ). This typically happens if the server is down, or if it is hanging on a heavy operation and doesn't manage to respond to keep alive requests. Set this higher if keep alives often time out without the server being down. Default value is 5000 . |
restart-on-reconnect | boolean | If true , the OPC UA extractor will be restarted after reconnecting to the server. This may not be required if the server is expected to not change much, and it handles reconnects well. |
node-set-source | object | Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for certain smaller servers, where the full node hierarchy is known beforehand. In general, it can be used to lower the load on the server. |
alt-source-background-browse | boolean | If true , browses the OPC UA node hierarchy in the background when obtaining nodes from an alternative source, such as CDF Raw or NodeSet2 files. |
limit-to-server-config | boolean | Uses the Server/ServerCapabilities node in the OPC UA server to limit chunk sizes. Set this to false only if you know the server reports incorrect limits and you want to set them higher. If the real server limits are exceeded, the extractor will typically crash. Default value is True . |
browse-nodes-chunk | integer | Sets the maximum number of nodes per call to the Browse service. Large numbers are likely to exceed the server's tolerance. Lower numbers greatly increase startup time. Default value is 1000 . |
browse-chunk | integer | Sets the maximum requested results per node for each call to the Browse service. The server may decide to return fewer. Setting this lower increases startup times. Setting it to 0 leaves the decision up to the server. Default value is 1000 . |
attributes-chunk | integer | Specifies the maximum number of attributes to fetch per call to the Read service. If the server fails with TooManyOperations during attribute read, it may help to lower this value. This should be set as high as possible for large servers. Default value is 10000 . |
subscription-chunk | integer | Sets the maximum number of new MonitoredItems to create per operation. If the server fails with TooManyOperations when creating monitored items, try lowering this value. Default value is 1000 . |
browse-throttling | object | Settings for throttling browse operations. |
certificate-expiry | integer | Specifies the default application certificate expiration time in months. You can also replace the certificate manually by modifying the opc.ua.net.extractor.Config.xml configuration file. Note that the default value was changed as of version 2.5.3. Default value is 60 . |
retries | object | Specify the retry policy for requests to the OPC UA server. |
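To show how several of these parameters fit together, here is a hedged sketch of a source section for a server that requires a secured endpoint and username/password authentication. The host name and the OPCUA_USERNAME and OPCUA_PASSWORD environment variable names are assumptions made for the example.

```yaml
source:
  # Placeholder host; replace with your server or discovery URL
  endpoint-url: 'opc.tcp://some-host:4840'
  # Try to connect to an endpoint with security above None
  secure: true
  # Do not accept server certificates automatically; move them from
  # the rejected folder to the accepted folder manually instead
  auto-accept: false
  # Hypothetical environment variables holding the credentials
  username: ${OPCUA_USERNAME}
  password: ${OPCUA_PASSWORD}
  # Ask the server for subscription updates once per second
  publishing-interval: 1000
  # Send a keep-alive request every 10 seconds
  keep-alive-interval: 10000
```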
alt-endpoint-urls
Part of source configuration.
List of alternative endpoint URLs the extractor can attempt when connecting to the server. Use this for non-transparent redundancy. See the OPC UA standard part 4, section 6.6.2.
We recommend setting force-restart to true. Otherwise, the extractor will reconnect to the same server each time.
Each element of this list should be a string.
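A minimal sketch of such a list, following the recommendation above to enable force-restart. The three host names are placeholders.

```yaml
source:
  endpoint-url: 'opc.tcp://primary-host:4840'
  # Alternative servers to try when connecting
  alt-endpoint-urls:
    - 'opc.tcp://secondary-host:4840'
    - 'opc.tcp://tertiary-host:4840'
  # Always create a new connection instead of reconnecting
  # to the same server
  force-restart: true
```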
endpoint-details
Part of source configuration.
Details used to override default endpoint behavior. This is used to make the client connect directly to an OPC UA endpoint, for example if the server is behind NAT (Network Address Translation), circumventing server discovery.
Parameter | Type | Description |
---|---|---|
override-endpoint-url | string | Endpoint URL to override URLs returned from discovery. This can be used if the server is behind NAT, or similar URL rewrites. |
redundancy
Part of source configuration.
Additional configuration options related to redundant servers. The OPC UA extractor supports Cold redundancy, as described in the OPC UA standard part 4, section 6.6.2.
Parameter | Type | Description |
---|---|---|
service-level-threshold | integer | Servers above this threshold are considered live. If the server drops below this level, the extractor will switch, provided monitor-service-level is set to true . Default value is 200 . |
reconnect-interval | string | If using redundancy, the extractor will attempt to find a better server with this interval if service level is below threshold. Format is as given in Timestamps and intervals. Default value is 10m . |
monitor-service-level | boolean | If true , the extractor will subscribe to changes in ServiceLevel and attempt to change server once it drops below service-level-threshold . This also prevents the extractor from updating states while service level is below the threshold, letting servers inform the extractor that they are not receiving data from all sources, and history should not be trusted. Once the service level goes back above the threshold, history will be read to fill any gaps. |
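The following sketch enables service level monitoring while keeping the documented default values for the other two parameters.

```yaml
source:
  redundancy:
    # Servers above this service level are considered live (default)
    service-level-threshold: 200
    # How often to look for a better server while below the threshold (default)
    reconnect-interval: 10m
    # Subscribe to ServiceLevel and switch server when it drops
    # below the threshold
    monitor-service-level: true
```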
x509-certificate
Part of source configuration.
Specifies the configuration for using a signed x509 certificate to connect to the server. Note that this is highly server specific. The extractor uses the provided certificate to sign requests sent to the server. The server must have a mechanism to validate this signature. Typically the certificate must be provided by the server.
Parameter | Type | Description |
---|---|---|
file-name | string | Path to local x509-certificate |
password | string | Password for local x509-certificate file |
store | either None , Local or User | Local certificate store to use. One of None (to use a file), Local (for LocalMachine) or User for the User store. Default value is None . |
cert-name | string | Name of certificate in store. Required to use store. Example: CN=MyCertificate |
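The two ways of locating the certificate can be sketched as follows. The file path and the X509_PASSWORD environment variable name are assumptions for the example, and the store-based variant is left commented out.

```yaml
source:
  x509-certificate:
    # Variant 1: read the certificate from a local file
    # (hypothetical path and environment variable)
    file-name: certificates/client.pfx
    password: ${X509_PASSWORD}
    store: None

    # Variant 2: read the certificate from the LocalMachine store
    # store: Local
    # cert-name: CN=MyCertificate
```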
node-set-source
Part of source configuration.
Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for certain smaller servers, where the full node hierarchy is known beforehand. In general, it can be used to lower the load on the server.
Parameter | Type | Description |
---|---|---|
node-sets | list | Required. List of nodesets to read. Specified by URL, file name, or both. If no name is specified, the last segment of the URL is used as file name. File name is where downloaded files are saved, and where the extractor looks for existing files. Note that typically, you will need to define all schemas your server schema depends on. All servers should depend on the base OPC UA schema, so you should always include https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml. Example: [{'file-name': 'Server.NodeSet2.xml'}, {'url': 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'}] |
instance | boolean | If true , the instance hierarchy is not obtained from the server, but instead read from the NodeSet2 files. |
types | boolean | If true , event types, reference types, object types, and variable types are obtained from NodeSet2 files instead of the server. |
node-sets
Part of node-set-source configuration.
List of nodesets to read. Specified by URL, file name, or both. If no name is specified, the last segment of the URL is used as file name. File name is where downloaded files are saved, and where the extractor looks for existing files.
Note that typically, you will need to define all schemas your server schema depends on. All servers should depend on the base OPC UA schema, so you should always include https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml.
Each element of this list should be a configuration specifying a node set file.
Parameter | Type | Description |
---|---|---|
file-name | string | Path to nodeset file. This is either the place where the downloaded file is saved, or a previously downloaded file. |
url | string | URL of publicly available nodeset file. |
Example:
[{'file-name': 'Server.NodeSet2.xml'}, {'url': 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'}]
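Written in block style and placed in context, the same list might look like the sketch below. It assumes that both the instance hierarchy and the types should come from the files, and it lists the base schema first on the assumption that dependencies should precede the files that depend on them.

```yaml
source:
  node-set-source:
    node-sets:
      # Base OPC UA schema, downloaded from the URL
      - url: 'https://files.opcfoundation.org/schemas/UA/1.04/Opc.Ua.NodeSet2.xml'
      # Server-specific nodeset, read from a local file
      - file-name: 'Server.NodeSet2.xml'
    # Read the instance hierarchy from the files instead of the server
    instance: true
    # Read event, reference, object, and variable types from the files
    types: true
```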
browse-throttling
Part of source configuration.
Settings for throttling browse operations.
Parameter | Type | Description |
---|---|---|
max-per-minute | integer | Maximum number of requests per minute, approximately. |
max-parallelism | integer | Maximum number of parallel requests. |
max-node-parallelism | integer | Maximum number of concurrent nodes across all parallel requests. |
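As a sketch, a server that struggles during browsing could be protected with limits such as these. The numbers are illustrative assumptions, not documented defaults or recommendations.

```yaml
source:
  browse-throttling:
    # Roughly two browse requests per second
    max-per-minute: 120
    # At most four browse requests in flight at once
    max-parallelism: 4
    # At most 500 nodes in total across the parallel requests
    max-node-parallelism: 500
```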
retries
Part of source configuration.
Specify the retry policy for requests to the OPC UA server.
Parameter | Type | Description |
---|---|---|
timeout | either string or integer | Global timeout. After this much time has passed, new retries will not be created. Set this to zero for no timeout. Syntax is N[timeUnit] where timeUnit is d , h , m , s or ms . Default value is 0s . |
max-tries | integer | Maximum number of attempts. 1 means that only the initial attempt will be made, 0 or less retries forever. Default value is 5 . |
max-delay | either string or integer | Maximum delay between attempts, incremented using exponential backoff. Set this to 0 for no upper limit. Syntax is N[timeUnit] where timeUnit is d , h , m , s or ms . Default value is 0s . |
initial-delay | either string or integer | Initial delay used for exponential backoff. Time between each retry is calculated as min(max-delay, initial-delay * 2 ^ retry) , where 0 is treated as infinite for max-delay . The maximum delay is about 10 minutes (13 retries). Syntax is N[timeUnit] where timeUnit is d , h , m , s or ms . Default value is 500ms . |
retry-status-codes | list | List of additional OPC-UA status codes to retry on, in addition to the defaults. Should be integer values from http://www.opcfoundation.org/UA/schemas/StatusCode.csv, or symbolic names as shown in the same .csv file. |
retry-status-codes
Part of retries configuration.
List of additional OPC-UA status codes to retry on, in addition to the defaults. Should be integer values from http://www.opcfoundation.org/UA/schemas/StatusCode.csv, or symbolic names as shown in the same .csv file.
Each element of this list should be a string.
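A possible retry policy is sketched below. The chosen values and the extra status code are illustrative assumptions. The delay figures in the comment come from applying the documented formula min(max-delay, initial-delay * 2 ^ retry) and assume that the retry counter starts at 0.

```yaml
source:
  retries:
    # One initial attempt plus up to four retries
    max-tries: 5
    # Assuming retry starts at 0, the waits between attempts
    # would be 500ms, 1s, 2s and 4s, all below max-delay
    initial-delay: 500ms
    max-delay: 10s
    # Do not create new retries after one minute in total
    timeout: 1m
    # Also retry on this status code, in addition to the defaults
    retry-status-codes:
      - BadServerHalted
```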
logger
Global parameter.
Configure logging to console or file
Parameter | Type | Description |
---|---|---|
console | object | Configuration for logging to the console. |
file | object | Configuration for logging to a rotating log file. |
trace-listener | object | Adds a listener that uses the configured logger to output messages from System.Diagnostics.Trace |
ua-trace-level | either verbose , debug , information , warning , error or fatal | Capture OPC UA tracing at this level or above. |
ua-session-tracing | boolean | Log data sent to and received from the OPC-UA server. WARNING: This produces an enormous amount of logs, only use this when running against a small number of nodes, producing a limited number of datapoints, and make sure it is not turned on in production. |
console
Part of logger configuration.
Configuration for logging to the console.
Parameter | Type | Description |
---|---|---|
level | either verbose , debug , information , warning , error or fatal | Required. Minimum level of log events to write to the console. If not present, or invalid, logging to console is disabled. |
stderr-level | either verbose , debug , information , warning , error or fatal | Log events at this level or above are redirected to standard error. |
file
Part of logger configuration.
Configuration for logging to a rotating log file.
Parameter | Type | Description |
---|---|---|
level | either verbose , debug , information , warning , error or fatal | Required. Minimum level of log events to write to file. |
path | string | Required. Path to the files to be logged. If this is set to logs/log.txt , logs of the form logs/log[date].txt will be created, depending on rolling-interval . |
retention-limit | integer | Maximum number of log files that are kept in the log folder. Default value is 31 . |
rolling-interval | either day or hour | Rolling interval for log files. Default value is day . |
trace-listener
Part of logger configuration.
Adds a listener that uses the configured logger to output messages from System.Diagnostics.Trace
Parameter | Type | Description |
---|---|---|
level | either verbose , debug , information , warning , error or fatal | Required. Level to output trace messages at |
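Putting the logger sections together, a sketch that logs informational messages to the console and more detailed messages to daily log files could look like this. The levels and the path are example choices.

```yaml
logger:
  console:
    # Minimum level written to the console
    level: information
    # Errors and above go to standard error
    stderr-level: error
  file:
    # Keep more detail in the log files
    level: debug
    # Produces files of the form logs/log[date].txt
    path: logs/log.txt
    # Documented defaults, written out for clarity
    retention-limit: 31
    rolling-interval: day
```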
metrics
Global parameter.
Configure Prometheus metrics: a scrape server, push gateways, and OPC UA nodes exposed as metrics.
Parameter | Type | Description |
---|---|---|
server | object | Configuration for a prometheus scrape server. |
push-gateways | list | A list of push gateway destinations to push metrics to |
nodes | object | Configuration to treat OPC-UA nodes as metrics. Values will be mapped to opcua_nodes_[NODE-DISPLAY-NAME] in prometheus |
server
Part of metrics configuration.
Configuration for a prometheus scrape server.
Parameter | Type | Description |
---|---|---|
host | string | Required. Host name for the server. Example: localhost |
port | integer | Required. Port to host the prometheus scrape server on |
push-gateways
Part of metrics configuration.
A list of push gateway destinations to push metrics to
Parameter | Type | Description |
---|---|---|
host | string | Required. URI of the pushgateway host. Example: http://localhost:9091 |
job | string | Required. Name of the job |
username | string | Username for basic authentication |
password | string | Password for basic authentication |
push-interval | integer | Interval in seconds between each push to the gateway. Default value is 1 . |
nodes
Part of metrics configuration.
Configuration to treat OPC-UA nodes as metrics. Values will be mapped to opcua_nodes_[NODE-DISPLAY-NAME] in prometheus.
Parameter | Type | Description |
---|---|---|
server-metrics | boolean | Map a few relevant static diagnostics contained in the Server/ServerDiagnosticsSummary node to prometheus metrics. |
other-metrics | list | List of additional nodes to read as metrics. |
other-metrics
Part of nodes configuration.
List of additional nodes to read as metrics.
Parameter | Type | Description |
---|---|---|
namespace-uri | string | Full URI of the node namespace. If left out it is assumed to be the base namespace |
node-id | string | Identifier of the node id, on the form i=123, s=string, etc. See the OPC-UA standard |
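The metrics sections described above could be combined as in the following sketch. The port, job name, push interval, and node identifiers are assumptions made for the example.

```yaml
metrics:
  # Expose a scrape endpoint for Prometheus
  server:
    host: localhost
    port: 9000
  # Also push metrics to a push gateway every 10 seconds
  push-gateways:
    - host: 'http://localhost:9091'
      job: opcua-extractor
      push-interval: 10
  nodes:
    # Map diagnostics from Server/ServerDiagnosticsSummary to metrics
    server-metrics: true
    # Additional nodes to read as metrics
    other-metrics:
      - namespace-uri: 'opc.tcp://test.test'
        node-id: i=123
```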
cognite
Global parameter.
Configuration for pushing data to Cognite Data Fusion (CDF)
Parameter | Type | Description |
---|---|---|
project | string | CDF project to connect to. |
idp-authentication | object | The idp-authentication section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory). See OAuth 2.0 client credentials flow |
host | string | Insert the base URL of the CDF project. Default value is https://api.cognitedata.com . |
cdf-retries | object | Configure automatic retries on requests to CDF. |
cdf-chunking | object | Configure chunking of data on requests to CDF. Note that increasing these may cause requests to fail due to limits in the API itself |
cdf-throttling | object | Configure the maximum number of parallel requests for different CDF resources. |
sdk-logging | object | Configure logging of requests from the SDK |
nan-replacement | either number or null | Replacement for NaN values when writing to CDF. If left out, NaN values are skipped. |
extraction-pipeline | object | Configure an associated extraction pipeline |
certificates | object | Configure special handling of SSL certificates. This should never be considered a permanent solution to certificate problems |
data-set | object | Data set used for new time series, assets, events, and relationships. Existing objects will not be updated |
read-extracted-ranges | boolean | Specifies whether to read start/end-points for datapoints on startup, where possible. It is generally recommended to use the state-storage instead of this. Default value is True . |
metadata-targets | object | Configure targets for node metadata. This configures which resources other than time series datapoints are created. By default, if this is left out, data is written to assets and time series metadata. Note that this behavior is deprecated, in the future leaving this section out will result in no metadata being written at all. |
metadata-mapping | object | Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits , which ideally should be mapped to unit in CDF. This property lets you do that. Example: {'timeseries': {'EngineeringUnits': 'unit', 'EURange': 'description'}, 'assets': {'Name': 'name'}} |
raw-node-buffer | object | Read from CDF instead of OPC-UA when starting, to speed up start on slow servers. Requires extraction.data-types.expand-node-ids and extraction.data-types.append-internal-values to be set to true . This should generally be enabled along with metadata-targets.raw or with no metadata targets at all. If browse-on-empty is set to true and metadata-targets.raw is configured with the same database and tables, the extractor will read into raw on first run, then use raw as source for later runs. The Raw database can be deleted, and it will be re-created on extractor restart. |
delete-relationships | boolean | If this is set to true , relationships deleted from the source will be hard-deleted in CDF. Relationships do not have metadata, so soft-deleting them is not possible. |
idp-authentication
Part of cognite configuration.
The idp-authentication section enables the extractor to authenticate to CDF using an external identity provider (IdP), such as Microsoft Entra ID (formerly Azure Active Directory).
See OAuth 2.0 client credentials flow
Parameter | Type | Description |
---|---|---|
authority | string | Insert the authority together with tenant to authenticate against Azure tenants. Default value is https://login.microsoftonline.com/ . |
client-id | string | Required. Enter the service principal client id from the IdP. |
tenant | string | Enter the Azure tenant. |
token-url | string | Insert the URL to fetch tokens from. |
secret | string | Enter the service principal client secret from the IdP. |
resource | string | Resource parameter passed along with token requests. |
audience | string | Audience parameter passed along with token requests. |
scopes | configuration for either list or string | |
min-ttl | integer | Insert the minimum time in seconds a token will be valid. If the cached token expires in less than min-ttl seconds, it will be refreshed even if it is still valid. Default value is 30 . |
certificate | object | Authenticate with a client certificate |
certificate
Part of idp-authentication configuration.
Authenticate with a client certificate
Parameter | Type | Description |
---|---|---|
authority-url | string | Authentication authority URL |
path | string | Required. Enter the path to the .pem or .pfx certificate to be used for authentication |
password | string | Enter the password for the key file, if it is encrypted. |
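For an identity provider other than Microsoft, the comments in the minimal configuration file suggest setting implementation: Basic and using token-url instead of tenant. A sketch along those lines follows. The implementation key is taken from that comment rather than from the table above, and COGNITE_TOKEN_URL is a hypothetical environment variable name.

```yaml
cognite:
  project: '${COGNITE_PROJECT}'
  idp-authentication:
    # From the comment in the minimal configuration file
    implementation: Basic
    # Hypothetical environment variable holding the token endpoint
    token-url: ${COGNITE_TOKEN_URL}
    client-id: ${COGNITE_CLIENT_ID}
    secret: ${COGNITE_CLIENT_SECRET}
    scopes:
      - ${COGNITE_SCOPE}
    # Refresh cached tokens that expire in less than 30 seconds (default)
    min-ttl: 30
```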
cdf-retries
Part of cognite configuration.
Configure automatic retries on requests to CDF.
Parameter | Type | Description |
---|---|---|
timeout | integer | Timeout in milliseconds for each individual request to CDF. Default value is 80000 . |
max-retries | integer | Maximum number of retries on requests to CDF. If this is less than 0, retry forever. Default value is 5 . |
max-delay | integer | Max delay in milliseconds between each retry. Base delay is calculated according to 125*2^retry milliseconds. If less than 0, there is no maximum. Default value is 5000 . |
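The fragment below sketches how these three parameters interact. The values are arbitrary examples, and the delay figures in the comments assume that the retry counter in the formula starts at 0, which the table does not state.

```yaml
cognite:
  cdf-retries:
    # 30 seconds per request instead of the default 80000 ms
    timeout: 30000
    # Retry up to 10 times; a value below 0 would retry forever
    max-retries: 10
    # Cap the delay between retries at 10 seconds
    max-delay: 10000
    # Base delay is 125*2^retry ms. Assuming retry counts from 0:
    #   retry 0 -> 125 ms, 1 -> 250 ms, 2 -> 500 ms, 3 -> 1000 ms,
    #   4 -> 2000 ms, 5 -> 4000 ms, 6 -> 8000 ms,
    #   7 -> 16000 ms, which max-delay caps to 10000 ms
```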
cdf-chunking
Part of cognite
configuration.
Configure chunking of data on requests to CDF. Note that increasing these values may cause requests to fail due to limits in the API itself.
Parameter | Type | Description |
---|---|---|
time-series | integer | Maximum number of timeseries per get/create timeseries request. Default value is 1000 . |
assets | integer | Maximum number of assets per get/create assets request. Default value is 1000 . |
data-point-time-series | integer | Maximum number of timeseries per datapoint create request. Default value is 10000 . |
data-point-delete | integer | Maximum number of ranges per delete datapoints request. Default value is 10000 . |
data-point-list | integer | Maximum number of timeseries per datapoint read request. Used when getting the first point in a timeseries. Default value is 100 . |
data-points | integer | Maximum number of datapoints per datapoints create request. Default value is 100000 . |
data-points-gzip-limit | integer | Minimum number of datapoints in request to switch to using gzip. Set to -1 to disable, and 0 to always enable (not recommended). The minimum HTTP packet size is generally 1500 bytes, so this should never be set below 100 for numeric datapoints. Even for larger packets, gzip is efficient enough that they are compressed below 1500 bytes. At 5000 it is always a performance gain. It can be set lower if bandwidth is a major issue. Default value is 5000 . |
raw-rows | integer | Maximum number of rows per request to CDF Raw. Default value is 10000 . |
raw-rows-delete | integer | Maximum number of row keys per delete request to raw. Default value is 1000 . |
data-point-latest | integer | Maximum number of timeseries per datapoint read latest request. Default value is 100 . |
events | integer | Maximum number of events per get/create events request. Default value is 1000 . |
sequences | integer | Maximum number of sequences per get/create sequences request. Default value is 1000 . |
sequence-row-sequences | integer | Maximum number of sequences per create sequence rows request. Default value is 1000 . |
sequence-rows | integer | Maximum number of sequence rows per sequence when creating rows. Default value is 10000 . |
instances | integer | Maximum number of data modeling instances per get/create instance request. Default value is 1000 . |
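For illustration, a sketch that lowers a few chunk sizes below their defaults, for example on a constrained network. The specific values are arbitrary and only show where the settings go.

```yaml
cognite:
  cdf-chunking:
    # Smaller datapoint requests than the default 100000
    data-points: 50000
    # Fewer time series per datapoint create request than the default 10000
    data-point-time-series: 5000
    # Switch to gzip earlier than the default 5000 when bandwidth is limited
    data-points-gzip-limit: 1000
```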
cdf-throttling
Part of cognite
configuration.
Configure the maximum number of parallel requests for different CDF resources.
Parameter | Type | Description |
---|---|---|
time-series | integer | Maximum number of parallel requests per timeseries operation. Default value is 20 . |
assets | integer | Maximum number of parallel requests per assets operation. Default value is 20 . |
data-points | integer | Maximum number of parallel requests per datapoints operation. Default value is 10 . |
raw | integer | Maximum number of parallel requests per raw operation. Default value is 10 . |
ranges | integer | Maximum number of parallel requests per get first/last datapoint operation. Default value is 20 . |
events | integer | Maximum number of parallel requests per events operation. Default value is 20 . |
sequences | integer | Maximum number of parallel requests per sequences operation. Default value is 10 . |
instances | integer | Maximum number of parallel requests per data modeling instances operation. Default value is 4 . |
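A small sketch of reducing parallelism for some resource types. The numbers are example values, not recommendations; any key left out keeps its default.

```yaml
cognite:
  cdf-throttling:
    # Half the default of 10 parallel datapoint requests
    data-points: 5
    # Half the default of 10 parallel Raw requests
    raw: 5
    # Half the default of 4 parallel data modeling instance requests
    instances: 2
```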
sdk-logging
Part of cognite
configuration.
Configure logging of requests from the SDK.
Parameter | Type | Description |
---|---|---|
disable | boolean | Set to true to disable logging from the SDK. Logging is enabled by default. |
level | either trace , debug , information , warning , error , critical or none | Log level to log messages from the SDK at. Default value is debug . |
format | string | Format of the log message. Default value is CDF ({Message}): {HttpMethod} {Url} {ResponseHeader[X-Request-ID]} - {Elapsed} ms . |
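As a sketch, the fragment below raises the SDK log level and shortens the message format. The format string only reuses placeholders that appear in the default value; the choice of level is an arbitrary example.

```yaml
cognite:
  sdk-logging:
    # Keep SDK logging on (this is the default)
    disable: false
    # Log SDK requests at information instead of the default debug
    level: information
    # Shortened variant of the default format, without the request ID
    format: 'CDF ({Message}): {HttpMethod} {Url} - {Elapsed} ms'
```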
extraction-pipeline
Part of cognite
configuration.
Configure an associated extraction pipeline.
Parameter | Type | Description |
---|---|---|
external-id | string | External ID of the extraction pipeline |
frequency | integer | Frequency to report Seen to the extraction pipeline in seconds. Less than or equal to zero will not report automatically. Default value is 600 . |
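A minimal sketch of tying the extractor to an extraction pipeline. The external ID is a hypothetical placeholder, and the frequency is an example value.

```yaml
cognite:
  extraction-pipeline:
    # External ID of an existing extraction pipeline (hypothetical name)
    external-id: 'opcua-extractor-pipeline'
    # Report Seen every 5 minutes instead of the default 600 seconds
    frequency: 300
```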
certificates
Part of cognite
configuration.
Configure special handling of SSL certificates. This should never be considered a permanent solution to certificate problems.
Parameter | Type | Description |
---|---|---|
accept-all | boolean | Accept all remote SSL certificates. This introduces a severe risk of man-in-the-middle attacks. |
allow-list | list | List of certificate thumbprints to automatically accept. This is a much smaller risk than accepting all certificates. |
allow-list
Part of certificates
configuration.
List of certificate thumbprints to automatically accept. This is a much smaller risk than accepting all certificates.
Each element of this list should be a string.
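The fragment below sketches an allow-list with a single entry. The thumbprint is a made-up placeholder and must be replaced with the thumbprint of the certificate you actually intend to trust.

```yaml
cognite:
  certificates:
    # Leave accept-all off; it exposes the connection to man-in-the-middle attacks
    accept-all: false
    allow-list:
      # Placeholder thumbprint, not a real certificate
      - '0123456789ABCDEF0123456789ABCDEF01234567'
```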
data-set
Part of cognite
configuration.
Data set used for new time series, assets, events, and relationships. Existing objects will not be updated.
Parameter | Type | Description |
---|---|---|
id | integer | Data set internal ID |
external-id | string | Data set external ID |
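A short sketch that selects the data set by external ID. The name is hypothetical. The table does not say how id and external-id interact when both are set, so only one is shown.

```yaml
cognite:
  data-set:
    # External ID of an existing data set (hypothetical name)
    external-id: 'opcua-data'
```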
metadata-targets
Part of cognite
configuration.
Configure targets for node metadata. This configures which resources other than time series datapoints are created. By default, if this is left out, data is written to assets and time series metadata. Note that this behavior is deprecated. In the future, leaving this section out will result in no metadata being written at all.
Parameter | Type | Description |
---|---|---|
raw | object | Write metadata to the CDF staging area (Raw). |
clean | object | Write metadata to CDF clean: assets, time series, and relationships. |
data-models | object | ALPHA: Write metadata to CDF Data Models. This will create CDF data models based on the OPC UA type hierarchy, then populate them with data from the OPC UA node hierarchy. Note that this requires extraction.relationships.enabled and extraction.relationships.hierarchical to be set to true , and there must be exactly one root node with ID i=84 . Note that this feature is in alpha, so there may be changes that require you to delete the data model from CDF, as well as breaking changes to the configuration schema. These changes will not be considered breaking changes to the extractor. |
raw
Part of metadata-targets
configuration.
Write metadata to the CDF staging area (Raw).
Parameter | Type | Description |
---|---|---|
database | string | Required. The CDF Raw database to write to. |
assets-table | string | Name of the Raw table to write assets metadata to. |
timeseries-table | string | Name of the Raw table to write time series metadata to. |
relationships-table | string | Name of the Raw table to write relationships metadata to. |
clean
Part of metadata-targets
configuration.
Write metadata to CDF clean: assets, time series, and relationships.
Parameter | Type | Description |
---|---|---|
assets | boolean | Set to true to enable creating CDF assets from OPC UA nodes. |
timeseries | boolean | Set to true to enable adding metadata to CDF time series based on OPC UA properties. |
relationships | boolean | Set to true to enable creating relationships from OPC UA references. Requires extraction.relationships to be enabled. |
space | string | Data modeling space to write to. If this is set, metadata is written to the core data models instead of to CDF Clean. Note that only timeseries are currently supported. Nodes will be created in the CogniteExtractorTimeSeries view in the industrial data models. |
source | string | External ID of the source that created core data model time series will be tied to. If this is not specified, it defaults to either OPC_UA:[source.endpoint-url] or OPC_UA_NODESET:[source.node-set-source.nodesets[LAST].[file-name or url]] . |
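For illustration, a sketch that writes metadata both to CDF Raw and to CDF clean. The database and table names are hypothetical, and which targets to enable depends on your use case.

```yaml
cognite:
  metadata-targets:
    raw:
      # Hypothetical Raw database and table names
      database: 'opcua-metadata'
      assets-table: 'assets'
      timeseries-table: 'timeseries'
      relationships-table: 'relationships'
    clean:
      # Create CDF assets from OPC UA nodes
      assets: true
      # Add metadata to CDF time series
      timeseries: true
      # Relationships would also require extraction.relationships to be enabled
      relationships: false
```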
data-models
Part of metadata-targets
configuration.
ALPHA: Write metadata to CDF Data Models.
This will create CDF data models based on the OPC UA type hierarchy, then populate them with data from the OPC UA node hierarchy. Note that this requires extraction.relationships.enabled
and extraction.relationships.hierarchical
to be set to true
, and there must be exactly one root node with ID i=84
.
Note that this feature is in alpha, so there may be changes that require you to delete the data model from CDF, as well as breaking changes to the configuration schema. These changes will not be considered breaking changes to the extractor.
Parameter | Type | Description |
---|---|---|
enabled | boolean | Required. Set this to true to enable writing to CDF Data Models. |
model-space | string | Required. Set the space to create data models in. The space will be created if it does not exist. |
instance-space | string | Required. Set the space instances will be created in. The space will be created if it does not exist. May be the same as model-space . |
model-version | string | Required. Version used for created data model and all created views. |
types-to-map | either Referenced , Custom or All | Configure which types to map to Data Models. Referenced means that only types that are referenced by instance nodes will be created. Custom means that all types not in the base namespace will be created. All means that all types will be created. Note: Setting this to All is rarely useful, and may produce impractically large models. Default value is Custom . |
skip-simple-types | boolean | Do not create views without their own connections or properties. Simplifies the model greatly, but reduces the number of distinct types in your model. |
ignore-mandatory | boolean | Let mandatory options be nullable. Many servers do not obey Mandatory requirements in their own models, which breaks when they are ingested into CDF, where nullable constraints are enforced. |
connection-target-map | object | Target connections of the form "Type.Property": "Target" . This is useful for certain schemas. This overrides the expected type of specific CDF Connections, letting you override incorrect schemas. For example, the published nodeset file for ISA-95 incorrectly states that the EquipmentClass reference for EquipmentType is an Object , while it should be an ObjectType . Example: {'EquipmentType.EquipmentClass': 'ObjectType'} |
enable-deletes | boolean | |
connection-target-map
Part of data-models
configuration.
Target connections on the form "Type.Property": "Target"
. This is useful for certain schemas. This overrides the expected type of specific CDF Connections, letting you override incorrect schemas. For example, the published nodeset file for ISA-95 incorrectly states that the EquipmentClass
reference for EquipmentType
is an Object
, while it should be an ObjectType
.
Example:
EquipmentType.EquipmentClass: ObjectType
Parameter | Type | Description |
---|---|---|
Any string matching [A-z0-9-_.]+ | either ObjectType , Object , VariableType , Variable , ReferenceType , DataType , View or Method | NodeClass to override connection with. |
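Putting the data-models requirements together, a configuration could look roughly like the sketch below. The space names and the model version are hypothetical placeholders. The extraction settings reflect the stated requirements: hierarchical relationships enabled and a single root node with ID i=84.

```yaml
extraction:
  # Data models require exactly one root node with ID i=84
  root-node:
    node-id: 'i=84'
  relationships:
    enabled: true
    hierarchical: true
cognite:
  metadata-targets:
    data-models:
      enabled: true
      # Hypothetical space names; both are created if they do not exist
      model-space: 'opcua-models'
      instance-space: 'opcua-instances'
      # Hypothetical version string
      model-version: '1'
      types-to-map: Custom
      # Let mandatory properties be nullable for servers that ignore Mandatory
      ignore-mandatory: true
      connection-target-map:
        # Override from the ISA-95 example above
        EquipmentType.EquipmentClass: ObjectType
```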
metadata-mapping
Part of cognite
configuration.
Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits
, which ideally should be mapped to unit
in CDF. This property lets you do that.
Example:
timeseries:
EngineeringUnits: unit
EURange: description
assets:
Name: name
Parameter | Type | Description |
---|---|---|
assets | object | Map metadata for assets. |
timeseries | object | Map metadata for time series. |
assets
Part of metadata-mapping
configuration.
Map metadata for assets.
Parameter | Type | Description |
---|---|---|
Any string | either description , name or parentId | Target asset attribute |
timeseries
Part of metadata-mapping
configuration.
Map metadata for time series.
Parameter | Type | Description |
---|---|---|
Any string | either description , name , parentId or unit | Target time series attribute |
raw-node-buffer
Part of cognite
configuration.
Read from CDF instead of OPC-UA when starting, to speed up start on slow servers. Requires extraction.data-types.expand-node-ids
and extraction.data-types.append-internal-values
to be set to true
.
This should generally be enabled along with metadata-targets.raw
or with no metadata targets at all.
If browse-on-empty
is set to true
and metadata-targets.raw
is configured with the same database and tables, the extractor will read the nodes into Raw on the first run, then use Raw as the source for later runs. The Raw database can be deleted, and it will be re-created on extractor restart.
Parameter | Type | Description |
---|---|---|
enable | boolean | Required. Set to true to enable loading nodes from CDF Raw. |
database | string | Required. CDF RAW database to read from. |
assets-table | string | CDF RAW table to read assets from, used for events. This is not useful if there are no custom nodes generating events in the server. |
timeseries-table | string | CDF RAW table to read time series from. |
browse-on-empty | boolean | Run normal browse if nothing is found when reading from CDF, either because the tables are empty, or because they do not exist. Note that nodes may be present in the CDF RAW tables. Browse will run if no valid nodes are found, even if there are nodes present in RAW. |
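The combination described above, where the extractor fills Raw on the first run and reads from it afterwards, might be configured along these lines. The database and table names are hypothetical; the point of the sketch is that metadata-targets.raw and raw-node-buffer use the same ones.

```yaml
extraction:
  data-types:
    # Both are required by raw-node-buffer
    expand-node-ids: true
    append-internal-values: true
cognite:
  metadata-targets:
    raw:
      # Hypothetical names, repeated under raw-node-buffer below
      database: 'opcua-metadata'
      assets-table: 'assets'
      timeseries-table: 'timeseries'
  raw-node-buffer:
    enable: true
    database: 'opcua-metadata'
    assets-table: 'assets'
    timeseries-table: 'timeseries'
    # Fall back to a normal browse when no valid nodes are found in Raw
    browse-on-empty: true
```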
mqtt
Global parameter.
Push data to CDF one-way over MQTT. This requires that the MQTT-CDF Bridge application is running somewhere with access to CDF.
Parameter | Type | Description |
---|---|---|
host | string | Required. The address of the MQTT broker. Example: localhost |
port | integer | Required. Port to connect to on the MQTT broker. Example: 1883 |
username | string | The MQTT broker username. Leave empty to connect without authentication. |
password | string | The MQTT broker password. Leave empty to connect without authentication. |
use-tls | boolean | Set this to true to enable Transport Layer Security (TLS) when communicating with the broker. |
allow-untrusted-certificates | boolean | Set this to true to allow untrusted SSL certificates when communicating with the broker. This is a security risk. We recommend using custom-certificate-authority instead. |
custom-certificate-authority | string | Path to certificate file for a certificate authority the broker SSL certificate will be verified against. |
client-id | string | MQTT client id. Should be unique for a given broker. Default value is cognite-opcua-extractor . |
data-set-id | integer | Data set to use for new assets, relationships, events, and time series. Existing objects will not be updated. |
asset-topic | string | Topic to publish assets on. Default value is cognite/opcua/assets . |
ts-topic | string | Topic to publish timeseries on. Default value is cognite/opcua/timeseries . |
event-topic | string | Topic to publish events on. Default value is cognite/opcua/events . |
datapoint-topic | string | Topic to publish datapoints on. Default value is cognite/opcua/datapoints . |
raw-topic | string | Topic to publish raw rows on. Default value is cognite/opcua/raw . |
relationship-topic | string | Topic to publish relationships on. Default value is cognite/opcua/relationships . |
local-state | string | Set to enable storing a list of created assets/timeseries to local litedb. Requires state-storage.location to be set. If this is left empty, metadata will have to be read each time the extractor restarts. |
invalidate-before | integer | Timestamp in ms since epoch to invalidate stored mqtt states. On extractor restart, assets/timeseries created before this will be re-created in CDF. They will not be deleted or updated. Requires the state-storage to be enabled. |
skip-metadata | boolean | Do not push any metadata at all. If this is true, plain timeseries without metadata will be created, like when using raw-metadata , and datapoints will be pushed. Nothing will be written to raw and no assets will be created. Events will be created, but without asset context. |
raw-metadata | object | Store assets/timeseries metadata and relationships in raw. Assets will not be created at all, timeseries will be created with just externalId , isStep , and isString . Both timeseries and assets will be persisted in their entirety to CDF Raw. Datapoints are not affected. Events will be created but without being contextualized to assets. The external ID of the source node is added to metadata if applicable. |
metadata-mapping | object | Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits , which ideally should be mapped to unit in CDF. This property lets you do that. Example: {'timeseries': {'EngineeringUnits': 'unit', 'EURange': 'description'}, 'assets': {'Name': 'name'}} |
raw-metadata
Part of mqtt
configuration.
Store assets/timeseries metadata and relationships in raw. Assets will not be created at all, timeseries will be created with just externalId
, isStep
, and isString
. Both timeseries and assets will be persisted in their entirety to CDF Raw. Datapoints are not affected.
Events will be created but without being contextualized to assets. The external ID of the source node is added to metadata if applicable.
Parameter | Type | Description |
---|---|---|
database | string | Required. Raw database to write metadata to. |
assets-table | string | Raw table to use for assets. |
timeseries-table | string | Raw table to use for timeseries. |
relationships-table | string | Raw table to use for relationships. |
metadata-mapping
Part of mqtt
configuration.
Define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have a property named EngineeringUnits
, which ideally should be mapped to unit
in CDF. This property lets you do that.
Example:
timeseries:
EngineeringUnits: unit
EURange: description
assets:
Name: name
Parameter | Type | Description |
---|---|---|
assets | object | Map metadata for assets. |
timeseries | object | Map metadata for time series. |
assets
Part of metadata-mapping
configuration.
Map metadata for assets.
Parameter | Type | Description |
---|---|---|
Any string | either description , name or parentId | Target asset attribute |
timeseries
Part of metadata-mapping
configuration.
Map metadata for time series.
Parameter | Type | Description |
---|---|---|
Any string | either description , name , parentId or unit | Target time series attribute |
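As a sketch of the mqtt section as a whole, the fragment below connects to a broker over TLS and stores metadata in Raw. The host name, certificate path, client id suffix, environment variable names, and Raw names are all hypothetical. Port 8883 is the conventional MQTT-over-TLS port, but your broker may use another.

```yaml
mqtt:
  # Hypothetical broker address
  host: 'mqtt.example.com'
  port: 8883
  # Assumed environment variable names for the broker credentials
  username: '${MQTT_USERNAME}'
  password: '${MQTT_PASSWORD}'
  use-tls: true
  # Preferred over allow-untrusted-certificates (hypothetical path)
  custom-certificate-authority: 'certs/broker-ca.crt'
  # Should be unique for a given broker
  client-id: 'cognite-opcua-extractor-site-a'
  raw-metadata:
    # Hypothetical Raw database and table names
    database: 'opcua-metadata'
    assets-table: 'assets'
    timeseries-table: 'timeseries'
  metadata-mapping:
    timeseries:
      EngineeringUnits: unit
```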
influx
Global parameter.
Configuration for pushing to an InfluxDB database. Data points and events will be pushed, but no context or metadata.
Parameter | Type | Description |
---|---|---|
host | string | Required. URL of the host InfluxDB server |
username | string | The username for connecting to the InfluxDB database. |
password | string | The password for connecting to the InfluxDB database. |
database | string | Required. The database to connect to on the InfluxDB server. The database will not be created automatically. |
point-chunk-size | integer | Maximum number of points to send in each request to InfluxDB. Default value is 100000 . |
read-extracted-ranges | boolean | Whether to read start/end points on startup, where possible. It is recommended that you use state-storage instead. |
read-extracted-event-ranges | boolean | Whether to read start/end points for events on startup, where possible. It is recommended that you use state-storage instead. |
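A minimal sketch of the influx section. The host URL assumes a local InfluxDB instance on its default port 8086, and the database name and environment variable names are hypothetical. The database must already exist, since it is not created automatically.

```yaml
influx:
  # Assumes a local InfluxDB server on the default port
  host: 'http://localhost:8086'
  # Assumed environment variable names
  username: '${INFLUX_USERNAME}'
  password: '${INFLUX_PASSWORD}'
  # Hypothetical database name; create it before starting the extractor
  database: 'opcua'
  # Smaller requests than the default 100000 points
  point-chunk-size: 50000
```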
extraction
Global parameter.
Configuration for general extraction options, such as data types, mapping, and filters.
External ID generation
IDs used in OPC UA are special nodeId
objects with an identifier and a namespace that must be converted to a string before they are written to CDF. A direct conversion, however, has several potential problems.
- The namespace index is by default part of the node, but it may change between server restarts. Only the namespace itself is fixed.
- The namespace table may be modified, in which case all old node IDs are invalidated.
- Node IDs are not unique between different OPC UA servers.
- Node identifiers can be duplicated across namespaces.
The solution is to create a node ID of the following form:
[id-prefix][namespace][identifierType]=[identifier as string]([optional array index])
.
For example, the node with node ID ns=1;i=123
with ID prefix gp:
would be mapped to gp:http://my.namespace.url:i=123
.
You can optionally override this behavior for individual nodes by using node-map
.
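To connect the format to the configuration, the sketch below sets the prefix used in the example above and annotates how the pieces of the generated ID line up. The commented-out namespace-map entry is a hypothetical shorthand; its exact effect on the generated ID is not shown, since the text above does not spell it out.

```yaml
extraction:
  # [id-prefix]: prepended to every generated external ID
  id-prefix: 'gp:'
  # With this prefix, the node ns=1;i=123 in namespace http://my.namespace.url
  # is mapped as described above:
  #   gp:                      <- id-prefix
  #   http://my.namespace.url: <- namespace
  #   i=123                    <- identifierType=identifier
  # giving gp:http://my.namespace.url:i=123
  #
  # Optional: replace the full namespace URI with a shorter prefix
  # (hypothetical mapping, uncomment to use)
  # namespace-map:
  #   'http://my.namespace.url': 'mns:'
```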
Parameter | Type | Description |
---|---|---|
id-prefix | string | Global prefix for externalId values in destinations. Should be unique for each extractor deployment to prevent name conflicts. |
root-node | object | Root node. Defaults to the Objects node. Default value is {'node-id': 'i=86'} . |
root-nodes | list | List of root nodes. The extractor will start exploring from these. Avoid specifying nodes that are connected to each other by hierarchical references, since this can result in unexpected behavior. |
node-map | object | Map from external IDs to OPC UA node IDs. This can, for example, be used to place the node hierarchy as a child of a specific asset in the asset hierarchy, or to manually rename certain nodes. |
namespace-map | object | Map OPC-UA namespaces to prefixes in CDF. If not mapped, the full namespace URI is used. This saves space compared to using the full URI, and might make IDs more readable. |
auto-rebrowse-period | string | Time in minutes between each automatic re-browse of the node hierarchy. Format is as given in Timestamps and intervals. This option also accepts cron expressions. Set this to 0 to disable automatic re-browsing of the server. Default value is 0m . |
enable-audit-discovery | boolean | Enable this to make the extractor listen to AuditAddNodes and AuditAddReferences events from the server, and use that to identify when new nodes are added to the server. This is more efficient than browsing the node hierarchy, but does not work with data models and requires that the server supports auditing. |
data-push-delay | string | Time between each push to destinations. Format is as given in Timestamps and intervals. Default value is 1s . |
update | object | Update data in destinations on re-browse or restart. Set auto-rebrowse-period to do this periodically. |
data-types | object | Configuration related to how data types and arrays should be handled by the OPC UA extractor. |
relationships | object | Map OPC UA references to relationships in CDF, or edges in CDF data models. Generated relationships will have external IDs of the form [prefix][reference type];[source][target] . Only relationships between mapped nodes are extracted. |
node-types | object | Configuration for mapping OPC UA types to CDF in some way. |
map-variable-children | boolean | Set to true to make the extractor read children of variables and potentially map those to timeseries as well. |
transformations | list | A list of transformations to be applied to the source nodes before pushing. Transformations are applied sequentially, so it can help performance to put Ignore filters first, and TimeSeries filters can undo Property transformations. |
deletes | object | Configure soft deletes. When this is enabled, all read nodes are written to a state store after browse, and nodes that are missing on subsequent browses are marked as deleted from CDF, with a configurable marker. A notable exception is relationships in CDF, which have no metadata, so these are hard-deleted if cognite.delete-relationships is enabled. |
status-codes | object | Configuration for ingesting status codes to CDF timeseries. |
rebrowse-triggers | object | Configuration for triggering rebrowse based on changes to specific nodes. |
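A few of the top-level options from this table could be combined as in the sketch below. The interval values are arbitrary examples written in the same style as the documented defaults ( 0m , 1s ).

```yaml
extraction:
  # Should be unique for each extractor deployment
  id-prefix: 'gp:'
  # Re-browse the node hierarchy every 60 minutes (default 0m disables this)
  auto-rebrowse-period: '60m'
  # Push to destinations every 5 seconds instead of the default 1s
  data-push-delay: '5s'
  # Also read children of variables and potentially map them to timeseries
  map-variable-children: true
```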
root-node
Part of extraction
configuration.
Root node. Defaults to the Objects node.
Parameter | Type | Description |
---|---|---|
namespace-uri | string | Full URI of the node namespace. If left out it is assumed to be the base namespace |
node-id | string | Identifier of the node id, on the form i=123, s=string, etc. See the OPC-UA standard |
root-nodes
Part of extraction
configuration.
List of root nodes. The extractor will start exploring from these. Avoid specifying nodes that are connected to each other by hierarchical references, since this can result in unexpected behavior.
Each element of this list should be a root node.
Parameter | Type | Description |
---|---|---|
namespace-uri | string | Full URI of the node namespace. If left out it is assumed to be the base namespace |
node-id | string | Identifier of the node id, on the form i=123, s=string, etc. See the OPC-UA standard |
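For illustration, a sketch with two root nodes. The namespace URI and the string identifiers are hypothetical, and the two nodes are assumed not to be connected to each other by hierarchical references.

```yaml
extraction:
  root-nodes:
    # Hypothetical nodes in a custom namespace
    - namespace-uri: 'urn:mynamespace'
      node-id: 's=LineA'
    - namespace-uri: 'urn:mynamespace'
      node-id: 's=LineB'
```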
node-map
Part of extraction
configuration.
Map from external IDs to OPC UA node IDs. This can, for example, be used to place the node hierarchy as a child of a specific asset in the asset hierarchy, or to manually rename certain nodes.
Example:
myCustomExternalId:
node-id: i=15
namespace-uri: urn:mynamespace
Parameter | Type | Description |
---|---|---|
Any string | object | Target node ID for mapping external ID. Default value is {'node-id': 'i=86'} . |
proto_node_id
Part of node-map
configuration.
Target node ID for mapping external ID.
Parameter | Type | Description |
---|---|---|
namespace-uri | string | Full URI of the node namespace. If left out it is assumed to be the base namespace |
node-id | string | Identifier of the node id, on the form i=123, s=string, etc. See the OPC-UA standard |