
Configure the OPC UA extractor

To configure the OPC UA extractor, you must edit the configuration file. The file is in YAML format, and the sample configuration file contains all valid options with default values.

You can leave many of the fields empty to let the extractor use the default values. The configuration file separates the settings by component, and you can remove an entire component to disable it or use the default values.

ProtoNodeId

You can provide an OPC UA node ID in several places in the configuration file, as a YAML object with the following structure:

  node:
    node-id: i=123
    namespace-uri: opc.tcp://test.test/

To find the node IDs, we recommend using the uaexpert tool.

Locate the data type, event type, or node in the hierarchy, then find the node ID on the right side under Attribute > NodeId. Find the Namespace Uri by matching the NamespaceIndex on the right to the namespace table on the left.

If either part is left empty, it is converted to a different node ID based on context. This happens automatically for events if you use the configuration tool released with version 1.1. If a mapping is specified in namespace-map, you can use the mapped value in place of namespace-uri.
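
For example, a root node (one of the places accepting a ProtoNodeId, see the Extraction section) might be specified like this sketch, where the identifier and namespace are illustrative:

  root-node:
    node-id: s=MyDevice                      # illustrative identifier
    namespace-uri: http://my.namespace.url   # illustrative namespace
  # If extraction/namespace-map maps http://my.namespace.url to a shorter
  # value, that mapped value may be used in place of namespace-uri.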

Source

This part of the configuration file concerns the extraction from the OPC UA server.

endpoint-url - The URL of the OPC UA server to connect to. In practice, this is the URL of the discovery server, where multiple levels of security may be provided. The OPC UA extractor attempts to use the highest security possible based on the configuration. Required.
reverse-connect-url - The local URL used for reverse connect. This is the URL the server should connect to. You should also specify an endpoint-url. With reverse connect, the server is entirely responsible for initiating connections, so it can be placed entirely behind a firewall. Leave empty to use direct connections.
auto-accept - Set to true to automatically accept server certificates. If you set this to false and try to connect to a server with higher security than None, the connection fails. A certificate is placed in the rejected certificates folder (by default application_dir/pki/rejected/), but you can manually move it to the accepted certificates folder (application_dir/pki/accepted). A simple solution is to set this to true once on the first connection, then change it to false.
username/password - Used for server login. Leave username empty to use no authentication.
x509-certificate - Specifies the configuration for using a signed x509 certificate to connect to the server. Options:
  • file-name - the location of the x509 certificate.
  • password - the password to the x509 certificate file.
  • store - the local store to use, either None (to use file), Local (for LocalMachine), or User.
  • cert-name - the name of the certificate in the store.
secure - Try to connect to an endpoint with security above None.
ignore-certificate-issues - Ignore all suppressible certificate errors on the server certificate. You can use this setting if you receive errors such as "Certificate use not allowed".

CAUTION: This is potentially a security risk. Bad certificates can open the extractor to man-in-the-middle attacks from the server, or similar. If security is handled elsewhere (for example, the server runs locally or over a secure VPN), this is most likely fairly safe.

Some errors are not suppressible, and must be remedied on the server.
publishing-interval - Sets the interval (in milliseconds) between publishing requests to the server. This limits the maximum frequency of points pushed to CDF, but not the maximum frequency of points on the server. In most cases, this can be set to the same value as Extraction.DataPushDelay. If you set it to 0, the server chooses the interval according to the specification.
sampling-interval - Sets the sample rate of subscriptions on the server. The server usually defines a set of permitted sample rates and picks the one closest to what you specify here. Many servers do not support more than a single sample rate. Set the interval to 0 to use the server default.

This setting generally sets the maximum rate of points from the server (in milliseconds). On many servers, sampling is an internal operation, but on some it may access external systems, and setting this very low can increase the load on the server significantly. It typically limits the density of the points from the server, but not always.

queue-length - Specifies the length of the internal server queue for points and events. Normally, this can be set to the same as publishing-interval/sampling-interval. Higher numbers increase the strain on the server. Many servers have a very limited maximum queue size, or just ignore this parameter entirely and use a fixed size for everything.
force-restart - If true, the OPC UA extractor will not attempt to reconnect using the OPC UA reconnect protocol on a disconnect from the server, but restart completely. Use this option for servers that do not support reconnecting.
exit-on-failure - If true, the OPC UA Extractor will not automatically restart after a crash, but defer to some external mechanism.
restart-on-reconnect - If true, the OPC UA Extractor will be restarted on reconnect. This may not be required if the server is expected to be static, and if it handles reconnects well. Setting this to true lowers restart times.
keep-alive-interval - Specifies the interval in milliseconds between each keep-alive request to the server. The connection times out if a keep-alive request fails twice (2 * interval + 100 ms). This typically happens if the server hangs on a heavy operation and does not manage to respond to keep-alive requests, or if the server goes down. In the first case, waiting can be a good option. In the second case, it is better to time out quickly.
node-set-source - Read from NodeSet2 files instead of browsing the OPC UA node hierarchy. This is useful for smaller servers, where the full node hierarchy is defined. In general, it can be used to lower the load on the server if parts of the hierarchy are known beforehand. Options:
  • node-sets - a list of objects with either file-name or url, pointing to a NodeSet2.xml file.
  • instance - Boolean. If true, the instance hierarchy is not browsed from the server, but obtained from the NodeSet files instead.
  • types - Boolean. If true, event types, reference types, and object types are obtained from the NodeSet2 files.
limit-to-server-config - The default value true uses the Server_ServerCapabilities object to limit chunk sizes. Set this to false only if you want to set the limits higher and are certain that the server is reporting the wrong limits. If the real server limits are exceeded, the extractor will typically crash.
alt-source-background-browse - If true, browses the OPC UA node hierarchy in the background when reading nodes from NodeSet files or from CDF Raw. This setup does not reduce load on the server, but can speed up startup.
browse-chunk - Sets the maximum desired number of results from each call of the Browse service to OPC UA. Most servers have some limits, but the default of 1000 is usually reasonable. The server should also usually limit this on its own.
browse-nodes-chunk - Sets the maximum number of nodes to browse per Browse service call. If set too high, the browse operation may fail. Most servers have an upper limit to the number of operations per service call, and this value may also affect the speed. We do not recommend setting this to 1, but it may be necessary for some servers.
attributes-chunk - Specifies the maximum number of attributes to fetch per operation. If the server fails with a TooManyOperations exception during attribute read, it may help to lower this value. 1000 should be fine for most servers, and may even be set higher for higher-spec servers. For very large servers, 1000 will take a very long time, so this should be set as high as possible, even if that requires increasing the keep-alive-interval.
subscription-chunk - Sets the maximum number of new MonitoredItems to create per operation. If the server fails with TooManyOperations, try to lower this value. Unless there are a large number of nodes on the server, 1000 per chunk is generally fine.
browse-throttling - Configuration object for throttling browses.
  • max-per-minute - Maximum number of browse requests per minute.
  • max-parallelism - Maximum number of parallel browse requests, if supported by the server.
  • max-node-parallelism - Maximum number of nodes to read in parallel. This can be used to limit the number of continuation points used by the extractor.
certificate-expiry - Specifies the default certificate expiration in months. You can also replace the certificate with your own by modifying the .xml config file. Defaults to 5 years as of v2.5.3.
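
A minimal source section could look like the following sketch. All values are illustrative, not recommendations, and the endpoint URL must match your server:

  source:
    endpoint-url: opc.tcp://localhost:4840   # illustrative discovery server URL
    auto-accept: true            # accept the server certificate on first connect
    secure: false
    publishing-interval: 500     # ms between publish requests
    sampling-interval: 500       # requested sample rate, server picks the closest
    keep-alive-interval: 5000    # ms between keep-alive requests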

History

The OPC UA Extractor supports reading from data and event history in OPC UA. For data, the Historizing attribute must be set on the nodes to be read. For events, the node IDs of the emitters must be specified explicitly in the configuration.

enabled - Set to false to disable history read. Default: true. This overrides all other history configuration and disables history entirely, for both events and datapoints.
data - Set to false to disable history for datapoints. Default: true. This makes it possible to only enable history for events.
backfill - Enable backfill, meaning that data is read backwards and forwards through history. If there is a lot of history, the extractor can start reading live values without completing the history read first. If set to false (default), the behavior matches versions before 1.1, meaning that the data is read from the beginning of history to the end before any live streaming begins.
require-historizing - Set to true to require Historizing to be set on time series in order to read history.
restart-period - Time in seconds to wait between each restart of history. Setting this too low may impact performance. Leave at 0 to disable periodic restarts. Optionally, use N[timeunit] where timeunit is w, d, h, m, s or ms. You may also use a cron expression of the form [minute] [hour] [day of month] [month] [day of week].
data-chunk - Maximum number of results to request per HistoryRead call when reading variables. Generally, this is limited by the server, so it can safely be set to 0.
data-nodes-chunk - Maximum number of nodes to query per HistoryRead call when reading variables. If granularity is set, this is applied afterward.
event-chunk - Maximum number of results to request per HistoryRead call when reading events. Generally, this is limited by the server, so it can safely be set to 0.
event-nodes-chunk - Maximum number of nodes to query per HistoryRead call when reading events.
granularity - Granularity in seconds for chunking history read operations. Variables with their latest timestamp within the same chunk have their history read together. Reading more variables per operation is more efficient, but if the granularity is set too high, a large number of duplicates are fetched, which can be inefficient. The best choice for this value is a few times the expected update frequency of your variables. Optionally, use N[timeunit] where timeunit is w, d, h, m, s or ms.
start-time - Earliest timestamp to read from, in milliseconds since January 1, 1970. Optionally, use the syntax N[timeunit](-ago) where timeunit is w, d, h, m, s or ms. The time is in the past if -ago is added, in the future if not.
end-time - Timestamp to be considered the end of forward history. Only relevant if max-read-length is set. In milliseconds since January 1, 1970. If this is 0, the default is the current time. Optionally, use the syntax N[timeunit](-ago) where timeunit is w, d, h, m, s or ms. The time is in the past if -ago is added, in the future if not.
ignore-continuation-points - Set to true to attempt to read history without using ContinuationPoints, instead using the Time of events and the SourceTimestamp of datapoints to incrementally advance the start time of the request until no points are returned.
max-read-length - Maximum length of each history read, in seconds. If this is greater than zero, history will be read in chunks of at most this size until the end. This can potentially take a very long time if end-time is much larger than start-time. Optionally, use N[timeunit] where timeunit is w, d, h, m, s or ms.
throttling - Configuration object for throttling history reads.
  • max-per-minute - The maximum number of history requests per minute.
  • max-parallelism - The maximum number of parallel history requests, if supported by the server.
  • max-node-parallelism - Maximum number of nodes to read in parallel. This can be used to limit the number of continuation points used by the extractor.
log-bad-values - Default: true. Log bad history datapoints, counts per read at debug level and each datapoint at verbose level.
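
As a sketch, a history section that reads the last 30 days of datapoint history with backfill enabled might look like this, with illustrative values:

  history:
    enabled: true
    backfill: true
    start-time: 30d-ago    # N[timeunit](-ago) syntax, 30 days in the past
    granularity: 600       # group variables with latest timestamps within 600 s
    data-chunk: 0          # let the server limit results per HistoryRead
    data-nodes-chunk: 100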

Cognite - CDF API

Configuration for pushing directly to the CDF API.

project - The CDF project. Required. Can be left out if the OPC UA Extractor is set to debug mode.
api-key - The CDF API key. Required. Can be left out if the OPC UA Extractor is set to debug mode.
host - The CDF service URL.
debug - Set to true to prevent the pusher from pushing to CDF or connecting at all. Used for testing.
read-extracted-ranges - Specifies whether to read start/end points on startup, where possible. At least one pusher should be able to do this, otherwise the back/frontfill will run over the entire history on every restart. The CDF pusher is not able to read start/end points for events, so if reading historical events is enabled, another pusher able to do this should be enabled as well. If the server has a lot of variables, this can be extremely slow, and we recommend using the state-store instead.
data-set-id - The internal ID of the CDF data set to be used for all new time series, assets, and events. Already created items will not be affected.
data-set-external-id - The data set to use for new objects, overridden by data-set-id. Requires the capability datasets:read for the given data set.
nan-replacement - Replacement value for values that are non-finite, e.g. NaN, +Infinity and -Infinity. If this is left empty, these points are simply ignored.
raw-metadata - Configuration for using Cognite Raw to store asset and time series metadata.
raw-metadata/database - The Cognite Raw database to store metadata in. Required for this feature to be enabled.
raw-metadata/assets-table - The Cognite Raw table to store assets in. If this is set along with database, assets are not pushed to the asset hierarchy but instead written to Cognite Raw. Time series will not be contextualized in this case, but if timeseries-table is set, the asset external ID will be stored there. The assets are simply pushed as full asset JSON objects with all the data available from extraction.
raw-metadata/timeseries-table - The Cognite Raw table to store time series in. If this is set along with database, time series are pushed with only minimal information (isStep, isString, externalId). Everything else is stored in Cognite Raw as full time series JSON objects.
raw-node-buffer - Read from CDF instead of OPC UA when starting the extractor, to speed up startup on slow servers. This requires extraction.expand-node-ids and extraction.append-internal-values to be set to true. Generally, this is enabled along with skip-metadata or raw-metadata. Reading from Raw into clean using this is generally not supported.

If browse-on-empty is set to true, and raw-metadata is configured with the same database and tables, the extractor will read from the server on first startup only, then use Raw for all further reads.

With this enabled, rebrowse/updates are generally pointless.

  • enabled - Set to true to enable this feature.
  • database - Raw database to read from.
  • assets-table - Raw table to read assets from, for events.
  • timeseries-table - Raw table to read time series from, for events and datapoints.
  • browse-on-empty - Run a normal browse if nothing is found when reading from CDF. Note that nodes may be present in the raw table; browse will still run if none of them are variables and none of them have a valid EventNotifier.
metadata-mapping - Contains two string/string maps, named assets and timeseries. It lets you define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have an EngineeringUnits field, which ideally should be mapped to unit in CDF. This can be done with:

  timeseries:
    "EngineeringUnits": "unit"

Legal attributes are name, description, and parentId, as well as unit for timeseries. parentId must be the externalId of the parent of the time series, and that parent must be an asset mapped by the OPC UA Extractor. It may be a string ID directly, or a node ID.
skip-metadata - If true, assets will not be written to CDF at all, and only basic time series will be created. This is the same as when raw-metadata is enabled, except that nothing will be pushed to Raw either.
idp-authentication - Configuration for authentication using a bearer access token.

See OAuth 2.0 client credentials flow.

Required fields are client-id, tenant, secret, scopes.

min-ttl is optional minimum time-to-live in seconds for the token. The default value is 30.

implementation is the implementation used. The default is MSAL (Microsoft Authentication Library), which is usually the best choice. Optionally, you can use a custom implementation by setting it to Basic.

authority is the identity provider endpoint. The default is https://login.microsoftonline.com/.
cdf-retries - Configure automatic retries on requests to CDF. Fields:
  • timeout - The maximum timeout for each individual try.
  • max-retries - The maximum number of retries, less than 0 retries forever.
  • max-delay - The maximum delay in milliseconds between each try. Base delay is calculated according to 125*2^retry ms. If less than 0, there is no maximum (0 would mean no delay).
If the connection to CDF is very poor, you may need to change this setting. Lowering the maximum number of retries can also reduce the time before failure-buffering starts, which may be necessary if there is a lot of data.
cdf-chunking - Configure chunking of data on requests to CDF. Note that some of these reflect actual limits in the API, and increasing them may cause requests to fail. See https://docs.cognite.com/api/v1/.
  • time-series - The maximum number of time series per get/create time series request.
  • assets - The maximum number of assets per get/create asset request.
  • data-point-time-series - The maximum number of time series per datapoint create request.
  • data-points - The maximum number of datapoints per datapoint create request.
  • data-point-list - The maximum number of time series per datapoint read request, used when getting first point in a time series.
  • data-point-latest - The maximum number of time series per datapoint read latest request.
  • raw-rows - The maximum number of rows per request to CDF Raw. Used with raw state-store and for raw asset/time series metadata.
  • events - The maximum number of events per get/create events request.
cdf-throttling - Configure how requests to CDF should be throttled. Each entry is the maximum allowed number of parallel requests to CDF. Fields: time series, assets, datapoints, raw, ranges (first/last datapoint), and events.
sdk-logging - Configuration for logging using the .NET SDK. This provides additional debug information about requests, showing in detail which requests fail and how long they take.
  • disable - Set to true to disable logging from the SDK. The default value is false.
  • level - The level of logging, one of trace, debug, information, warning, error, critical, none.
  • format - The formatting of the log message.
extraction-pipeline - Configure an extraction pipeline manager. The pipeline must be created beforehand.
  • pipeline-id - The externalId of the extraction pipeline in CDF.
  • frequency - The frequency to report "Seen", in seconds. Less than or equal to zero will not report automatically.
browse-callback - Call a Cognite Function with the number of assets, time series, and relationships created and updated after each browse and rebrowse operation. The function is called with a JSON object containing the following fields:
  • idPrefix - The configured extraction.id-prefix.
  • assetsCreated - The number of new assets or raw rows in the assets table created.
  • assetsUpdated - The number of assets updated, or raw rows in the asset table modified.
  • timeSeriesCreated - The number of new time series or raw rows in the time series table.
  • timeSeriesUpdated - The number of time series updated, or raw rows in the time series table modified.
  • minimalTimeSeriesCreated - The number of time series created with no metadata, only used if time series are written to raw.
  • relationshipsCreated - The number of new relationships or raw rows in the relationships table.
  • rawDatabase - Name of the configured raw database.
  • assetsTable - Name of the configured raw table for assets.
  • timeSeriesTable - Name of the configured raw table for time series.
  • relationshipsTable - Name of the configured raw table for relationships.
"Minimal time series" here refers to time series that are created with no metadata when time series are written to raw. This option requires functions:WRITE scoped to the function given by external id or id, and functions:READ if external-id is used. It is a YAML object with fields:
  • external-id - function external id. If this is used, functions:READ is required.
  • id - function internal id.
  • report-on-empty - default false, set to true to always report, even if nothing was modified in CDF.
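
Putting some of this together, a sketch of a cognite section using OAuth client credentials might look like this. The project, IDs, secret, and scope are placeholders for your own values:

  cognite:
    project: my-project                  # placeholder
    host: https://api.cognitedata.com
    data-set-external-id: opc-ua-data    # placeholder
    idp-authentication:
      client-id: my-client-id            # placeholder
      tenant: my-tenant                  # placeholder
      secret: my-client-secret           # placeholder, keep out of source control
      scopes:
        - https://api.cognitedata.com/.default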

Influx

Configuration for pushing to an InfluxDB database. Datapoints and events will be pushed, but no context or metadata.

host - The URL of the InfluxDB server.
username - The username for connecting to the database.
password - The password for connecting to the database.
database - The database to connect to on the server. The database will not be created automatically.
debug - If true, the pusher will not push to its target. Used for testing.
read-extracted-ranges - Whether to read start/end points on startup, where possible. At least one pusher should be able to do this, otherwise back/frontfill will run over the entire history on every restart.
point-chunk-size - Maximum number of points per push. Try increasing this if pushing seems slow.
non-finite-replacement - Replacement value for values that are non-finite, e.g. NaN, +Infinity and -Infinity. Leave empty to ignore these points.
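
A minimal influx section might look like this sketch, where the host and credentials are placeholders:

  influx:
    host: http://localhost:8086    # placeholder
    username: user                 # placeholder
    password: pass                 # placeholder
    database: opcua                # must already exist
    point-chunk-size: 100000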

MQTT

The MQTT pusher pushes to CDF one-way over MQTT. It requires that the MQTTCDFBridge application is running somewhere with access to CDF.

host - The address of the TCP MQTT broker. This needs to be running for the pusher to function.
port - The port on the TCP MQTT broker.
username - The MQTT broker username. Leave empty to connect without authentication.
password - The MQTT broker password. Leave empty to connect without authentication.
client-id - The MQTT client ID. This needs to be unique for each broker.
data-set-id - The internal ID of the CDF data set to be used for all new time series, assets, and events. Already created items will not be affected.
asset-topic - The topic to use for assets. Needs to match the configuration of MQTTCDFBridge (it does by default).
ts-topic - The topic to use for time series.
event-topic - The topic to use for events.
datapoint-topic - The topic to use for datapoints.
raw-topic - The topic to use for raw rows.
local-state - Set to enable storing a list of created assets/time series in a local database. Requires the StateStorage.Location property to be set. The value of this option is the table name. The default value is empty. Using this with raw state-storage does not make sense.
invalidate-before - Timestamp in ms since epoch used to invalidate stored states. Any objects created before this will be replaced the next time the OPC UA Extractor is restarted.
debug - If true, the pusher will not push to its target. Used for testing.
non-finite-replacement - The replacement value for values that are non-finite, e.g. NaN, +Infinity and -Infinity, or not between -10^100 and 10^100. If this is left empty, these points are simply ignored.
raw-metadata - Configuration for using Cognite Raw to store asset and time series metadata.
raw-metadata/database - The Cognite Raw database to store metadata in. Required for this feature to be enabled.
raw-metadata/assets-table - The Cognite Raw table to store assets in. If this is set along with database, assets are not pushed to the asset hierarchy but instead written to Raw. Time series will not be contextualized in this case, but if timeseries-table is set, the asset external ID will be stored there. The assets are simply pushed as full asset JSON objects with all the data available from extraction.
raw-metadata/timeseries-table - The Cognite Raw table to store time series in. If this is set along with database, time series are pushed with minimal information (isStep, isString, externalId). Everything else is stored in Cognite Raw as full time series JSON objects.
metadata-mapping - Contains two string/string maps, named assets and timeseries. It lets you define mappings between properties in OPC UA and CDF attributes. For example, it is quite common for variables in OPC UA to have an EngineeringUnits field, which ideally should be mapped to unit in CDF. This can be done with:

  timeseries:
    "EngineeringUnits": "unit"

Legal attributes are name, description, and parentId, as well as unit for timeseries. parentId must be the externalId of the parent of the time series, and that parent must be an asset mapped by the OPC UA Extractor. It may be a string ID directly, or a node ID.
skip-metadata - If true, assets will not be written to CDF at all, and only basic time series will be created. This is the same as when raw-metadata is enabled, except that nothing will be pushed to Raw either.
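
A minimal mqtt section might look like this sketch. The host and client ID are placeholders; topics are omitted here to use the defaults, which match the MQTTCDFBridge defaults:

  mqtt:
    host: localhost                # placeholder broker address
    port: 1883
    client-id: opcua-extractor-1   # placeholder; must be unique per broker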

Logger

Log entries have one of the levels Fatal, Error, Warning, Information, Debug, or Verbose, in order of decreasing importance. Each level includes the levels of higher importance.

console/level - The level of messages to write to console. If not present, or invalid, logging to console is disabled.
file/level - The level of messages to write to file. If not present, or invalid, logging to file is disabled.
file/path - The path to a log file. Logs are rotated.
file/retention-limit - The maximum number of log files to keep in the log folder. The oldest are deleted.
file/rolling-interval - The rolling interval for log files. Either day or hour. The default value is day.
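
For example, a logger section writing information-level messages to console and debug-level messages to rotated files might look like this, with an illustrative path:

  logger:
    console:
      level: information
    file:
      level: debug
      path: logs/opcua.log   # illustrative path
      retention-limit: 31
      rolling-interval: day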

StateStorage

A local LiteDb database or a table in CDF Raw that stores various persistent information between runs. This can replace reading first/last datapoints from CDF on startup, and also makes it possible to store first/last times for events.

interval - The time in seconds between each time the current extraction state is persisted. 0 disables this feature.
database - Which type of database to use. One of None, Raw, LiteDb.
event-store - The name of the table or LiteDb collection to store information about extracted events.
influx-event-store - The name of the table or LiteDb collection to store information about event ranges in the influxdb failure buffer.
influx-variable-store - The name of the table or LiteDb collection to store information about variable ranges in the influxdb failure buffer.
location - The path to the .db file used for storage, or the name of the Cognite Raw database.
variable-store - The name of the table or LiteDb collection to store information about extracted OPC UA variables.
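
As a sketch, a state storage configuration using a local LiteDb file might look like this. The path and table names are illustrative, and the section key is assumed to be state-storage:

  state-storage:
    interval: 10
    database: LiteDb
    location: state/extractor.db    # .db file for LiteDb, or a Raw database name
    variable-store: variable_states
    event-store: event_states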

FailureBuffer

If the connection to a destination goes down, the OPC UA Extractor supports buffering datapoints and events in influxdb or a local file. This is helpful if the connection is unstable.

datapoint-path - The path to the binary file where datapoints are buffered. Leave empty to disable buffering datapoints to file. Buffering to file is very fast, and is generally hardware bound.
enabled - Set to true to enable the FailureBuffer for all pushers.
event-path - The path to the binary file where events are buffered. Leave empty to disable buffering events to file.
influx - Set to true to enable buffering in influxdb. Requires influxdb to be running. This serves as an alternative to a local file, but only really makes sense if pushing to influxdb is required for other reasons.
influx-state-store - Set to true to enable storing the state of the influxdb buffer in a local database. This makes the influxdb buffer persistent even if the OPC UA Extractor stops before it is emptied. Requires the StateStorage.Location option to be set.
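
A sketch of a file-based failure buffer, where the paths are illustrative and the section key is assumed to be failure-buffer:

  failure-buffer:
    enabled: true
    datapoint-path: buffer/datapoints.bin   # illustrative
    event-path: buffer/events.bin           # illustrative
    influx: false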

Metrics

The OPC UA Extractor can push some metrics about usage to a prometheus pushgateway server.

server/host - The hostname for a locally hosted prometheus server, used for scraping.
server/port - The port used for the locally hosted prometheus server.
push-gateways - A list of pushgateway configurations. The OPC UA Extractor will periodically push to each of these in turn.
push-gateways/host - The pushgateway URL root. For example, the host my.prometheus.server with the job myjob gives the final endpoint my.prometheus.server/metrics/jobs/myjob.
push-gateways/job - The job to use in the destination.
push-gateways/username - The username for the prometheus target.
push-gateways/password - The password for the prometheus target.
nodes - Use to treat certain OPC UA nodes as metrics.
  • server-metrics - If true, a couple of relevant diagnostics from ServerDiagnosticsSummary are mapped.
  • other-metrics - List of ProtoNodeId describing nodes that should be treated as metrics.
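
For example, a metrics section exposing a local scrape endpoint and pushing to a single pushgateway might look like this sketch, where hosts, job, and credentials are placeholders:

  metrics:
    server:
      host: localhost
      port: 9000                             # illustrative scrape port
    push-gateways:
      - host: https://my.prometheus.server   # placeholder
        job: myjob
        username: user                       # placeholder
        password: pass                       # placeholder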

Extraction

Contains configuration for most extraction options: mapping, data types, filters, etc.

External ID generation

IDs used in OPC UA are special nodeId objects, with an identifier and a namespace, that need to be converted to strings for destination systems. However, a direct conversion has several problems:

  • It will use the namespaceIndex, which is not necessarily preserved between server restarts.
  • The namespace table may be modified, in which case all old nodeIds are invalidated.
    • NodeIds are also not unique between OPC UA servers, and frequently just count from 1, which would make reading from multiple OPC UA servers impossible.
  • Node identifiers can be duplicated on different namespaces.

The solution is a nodeId of the following form:

  IdPrefix + namespace + identifier type (i, s, g, etc.) + = + identifier value as string (+ [index in array if applicable])

For example, the node with nodeId ("SomeId", "http://my.namespace.url"), using the id-prefix "gp:", would be mapped to gp:http://my.namespace.url:s=SomeId. You can specify a namespace mapping in extraction/namespace-map to, for example, turn this into gp:mnu:s=SomeId.

If the node is an array, it is turned into an object with the above ID, plus several time series with IDs like gp:mnu:s=SomeId[1].

Alternatively, you can manually override the ID of each node.
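
As a sketch, the example above corresponds to an extraction section like the following. Note that the mapped value here includes the trailing separator, so that the generated ID becomes gp:mnu:s=SomeId:

  extraction:
    id-prefix: "gp:"
    namespace-map:
      "http://my.namespace.url": "mnu:"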

id-prefix - Prefix used when generating externalIds.
ignore-name-prefix - DEPRECATED, use transformations. List of strings used to filter out prefixes on the DisplayName of nodes during browsing. This means that children of these nodes are also filtered out.
ignore-name - DEPRECATED, use transformations. List of full DisplayNames to ignore instead of just a prefix.
data-push-delay - Time between each push to destinations, in ms. Optionally, use N[timeunit] where timeunit is w, d, h, m, s or ms.
root-node - A single ProtoNodeId (as described above) used as the origin of the browse. An empty ProtoNodeId (no identifier or no namespace) is treated as the Objects folder. Combined with root-nodes, if specified. If neither root-node nor root-nodes is specified, this defaults to the Objects folder.
root-nodes - A list of ProtoNodeIds to use as root nodes when browsing. These will generally be created as root assets in CDF. If a node set as root node is discovered as a descendant of another root node, it will be ignored, but it is best to avoid this situation entirely.
node-map - Map from strings, representing externalIds, to ProtoNodeIds. This can be used to override externalIds, for example to place the hierarchy as children of an asset in CDF.

For example, if UaRoot is set to the same value as the root node, all the nodes in the tree will be placed as children of the node with externalId UaRoot.
namespace-map - Used as described above to map namespaces to shortened identifiers.
data-types - Sub-object containing configuration for how data types and arrays should be handled by the OPC UA Extractor.
data-types/custom-numeric-types - Used to manually set types in OPC UA as numeric. This can be used to make custom types be treated as numbers, etc. The conversion is done with the C# "Convert" functionality. If no valid conversion exists, this will fail.
data-types/ignore-data-types - List of ProtoNodeIds (as described above), describing data types on variables to filter out.
data-types/unknown-as-scalar - Assume variables with non-specific ValueRanks in OPC UA (ScalarOrOneDimension and Any) are scalar if they do not have ArrayDimensions set. If such a variable produces an array, only the first element will be mapped to CDF. To properly extract arrays to CDF, ArrayDimensions must be set.
data-types/max-array-size - Maximum length of arrays to be mapped to destinations. If this is set to 0, only scalar values are mapped. Each array-type variable in the source system is converted to an object in the destination system, and each entry in the array is added as a child variable of that object. (In CDF, this means you get an asset with the externalId corresponding to the original variable, with time series for each entry in the array.)

This requires the ArrayDimensions property to be set and be of length 1.
data-types/allow-string-variables - Set to true to map variables of non-numeric types to strings in destination systems.
data-types/auto-identify-types - Map out the data type hierarchy before starting. This is useful if there are custom or enum types, and necessary for enum metadata and for enums-as-strings to work. If set to false, any custom numeric types must be added manually.

This causes some extra work on startup.
data-types/enums-as-strings - If set to false, and auto-identify-types is set to true or there are manually added enums in custom-numeric-types, enums will be mapped to numeric time series, and labels are added as metadata fields. If set to true, labels are not mapped to metadata, and enums will be mapped to string time series with values equal to the mapped label values.
data-types/data-type-metadata - Add a metadata property dataType, which contains the name or ID of the OPC UA data type. Built-in types can always be mapped to a name; custom types require auto-identify-types to be set to true.
data-types/null-as-numeric - Treat null data types as numeric. This can be useful on servers without string variables and with faulty data types.
data-types/expand-node-ids - Add attributes such as NodeId, ParentNodeId and TypeDefinitionId to nodes in Raw, as full NodeIds encoded reversibly.
data-types/append-internal-values - Add internal attributes like ValueRank, ArrayDimensions, AccessLevel and Historizing to nodes in Raw.
data-types/estimate-array-sizes - If max-array-size is set, this looks for the MaxArraySize property on each node with one-dimensional ValueRank. If it is not found, the extractor also tries to read the value and look at its current size. ArrayDimensions is still the preferred way to identify array sizes; this is not guaranteed to generate reasonable or useful values.
auto-rebrowse-period - Time in minutes between each automatic re-browse of the node hierarchy. Since only new nodes are pushed to destinations, this is usually quite fast. Optionally, use N[timeunit] where timeunit is w, d, h, m, s or ms. You may also use a cron expression of the form [minute] [hour] [day of month] [month] [day of week].
enable-audit-discovery - The OPC UA extractor listens to AuditAddNodes and AuditAddReferences events on the server node, then uses the information in these events to browse the hierarchy. This is much more efficient than browsing periodically, but requires server support for auditing.
map-variable-children - By default, children of variables are treated as properties. If this is set to true, they can be treated as objects or variables instead. This will cause some variables to be mapped to both time series and assets, to allow time series to have time series children.
update - Update data in destinations on re-browse or restart. Set auto-rebrowse-period to some value to do this periodically. Consists of two objects, objects and variables, controlling updates of assets and time series respectively. For each, name, description, context, and metadata can be configured separately.

context refers to the structure of the node graph in OPC UA (assetId and parentId in CDF). Metadata refers to any information obtained from OPC UA properties (metadata in CDF).

Enabling any of these will increase the startup- and rebrowse-time of the OPC UA Extractor. Enabling metadata will increase it more.
relationships - Map OPC UA non-hierarchical references to relationships in CDF. The generated relationships will have external IDs on the form [prefix][reference type name (or inverse-name)];[namespace source][id source];[namespace target][id target].

Only relationships between mapped nodes will be added. This may be relevant if the server contains functional relationships, like connected components, a non-hierarchical reference based system for location, etc.
relationships/enabled - Enable mapping non-hierarchical references to relationships in CDF. This is required for any kind of relationship mapping to occur at all.
relationships/hierarchical - Map hierarchical references to relationships in CDF.
relationships/inverse-hierarchical - Create inverse relationships for each hierarchical reference. For efficiency, these are inferred, not read.
node-types - Configuration related to mapping object and variable types to destinations.
node-types/metadata - Add the TypeDefinition as a metadata field to all nodes.
node-types/as-nodes - Allow discovered types to be treated as nodes and mapped to CDF assets. Requires the types to be inside the mapped hierarchy; a solution may be to specify the “Types” folder as a root node.
transformations - A list of transformations to be applied to the source nodes before pushing (see the example after this list). The possible transformations are:
  • Ignore - ignore the node. This will ignore all descendants of the node. If the filter does not use "is-array", "description" or "parent", this is done while reading, and so children will not be read at all. Otherwise, the filtering happens later.
  • Property - turn the node into a property, which is treated as metadata. This also applies to descendants. Nested metadata is given a name like grandparent_parent_variable, for each variable in the tree. There is some overhead associated with the filters.
  • DropSubscriptions - do not subscribe to this node with events or data-points.
  • TimeSeries - make the variable not a property, so that it is treated as a timeseries instead. Requires parents to be non-properties as well.
Note that transformations are applied sequentially, so it can help performance to put Ignore filters first, and that TimeSeries transformations can undo Property transformations.

It is possible to have multiples of each transformation type. Each transformation consists of a type field, containing one of the transformation types listed above, and a filter field. The filter has the following fields:
  • name - regex filter on node DisplayName.
  • description - regex filter on node Description.
  • id - regex filter on the string representation of the node ID, on the form "i=123", "s=string", etc.
  • is-array - true/false on whether the node is an array. If this is set to some value, the filter will only match variables that satisfy the requirement.
  • namespace - regex filter on full namespace of the node ID.
  • type-definition - regex filter on the string representation of the TypeDefinition NodeId, on the form "i=123", "s=string", etc.
  • node-class - filter on the NodeClass of the node, one of Object, Variable, ObjectType, VariableType.
  • historizing - true/false on the Historizing attribute on variables. If this is set to some value, the filter will only match variables.
  • parent - another instance of this filter which will be applied to the parent node, if it exists. For nodes without registered parents, this will always miss.
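
A sketch of a transformations list, where the regexes are illustrative. The Ignore transformation is placed first, following the performance note above:

  transformations:
    - type: Ignore
      filter:
        name: "^Internal"        # ignore nodes whose DisplayName starts with Internal
    - type: DropSubscriptions
      filter:
        node-class: Variable
        historizing: false       # do not subscribe to non-historizing variables
    - type: TimeSeries
      filter:
        parent:
          name: "^Measurements$" # treat variables under this node as time series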

Subscriptions

A few options for subscriptions to events and datapoints. Subscriptions in OPC UA consist of “Subscription” objects on the server, which contain a list of MonitoredItems. By default, the extractor produces a maximum of four subscriptions:

  • DataChangeListener - handles datapoint subscriptions.
  • EventListener - handles event subscriptions.
  • AuditListener - handles audit events.
  • NodeMetrics - handles subscriptions for nodes used as metrics.

Each of these can contain a number of MonitoredItems.

data-points - Default true. Enables subscriptions on data points.
events - Default true. Enables subscriptions for events.
data-change-filter - Modify the DataChangeFilter used for datapoint subscriptions. See OPC UA reference part 4, 7.17.2 for details. These are passed to the server in the DataChangeListener.
  • trigger - One of Status, StatusValue, StatusValueTimestamp. Default is StatusValue.
  • deadband-type - Default None. One of None, Absolute or Percent.
  • deadband-value - Default 0. Its meaning depends on deadband-type.
ignore-access-level - Ignore the AccessLevel attribute and subscribe to all variables, reading history from all nodes with Historizing set to true. This is the pre-2.3 behavior.
log-bad-values - Log bad subscription datapoints.
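
For example, a subscriptions section applying an absolute deadband might look like this, with an illustrative deadband value:

  subscriptions:
    data-points: true
    events: true
    data-change-filter:
      trigger: StatusValue
      deadband-type: Absolute
      deadband-value: 0.1    # illustrative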

Events

Events in OPC UA are usually custom when used on a server, and servers that support events often have a large number active. In OPC UA, any node may specify the EventNotifier attribute, which indicates whether it emits events and optionally stores historical events.

By default, all events will be read. If all-events is set to false, only events that do not belong to the base namespace will be read.

The attributes of each event are automatically mapped out, and a few general properties are filtered out. Others may be used as metadata in CDF or other destination systems, or in some cases be mapped directly to event properties.

If the event has a SourceNode that refers to a node in the mapped hierarchy, it will be used to set the assetId property on the event in CDF.

The old options event-ids, emitter-ids, and historizing-emitter-ids are in practice deprecated, but will still work and may be used as a workaround for servers that are not fully compliant with the OPC UA standard.

enabled - Set to true to enable reading events from the server. If this is false, no events will be read.
history - Set to true to enable reading historical events.
all-events - Set to true to read all events, not just custom events. Default value is true.
read-server - Set to true to also check the server node when looking for event emitters. Default true.
exclude-event-filter - Regex filter on event type DisplayName; matches will not be extracted.
exclude-properties - List of BrowseNames for properties of events to be excluded from metadata or other consideration. By default, only Time and Severity are used from the BaseEventType; all properties of subtypes are included.
destination-name-map - Map source browse names to other values in the destination. For CDF, internal properties may be overwritten; by default, Message is mapped to description, SourceNode is used for context, and EventType is used for type. These may also be excluded or replaced by overrides in destination-name-map. If multiple properties are mapped to the same value, the first non-null one is used.

If StartTime, EndTime, or SubType are specified, either directly or through the map, these are used as event properties instead of metadata. StartTime and EndTime should be either DateTime, or a number corresponding to the number of milliseconds since January 1, 1970. If no StartTime or EndTime are specified, both are set to the Time property of BaseEventType. Type may be overridden case-by-case using node-map in the extraction configuration, or in a dynamic way here. If no Type is specified, it is generated from the event NodeId in the same way externalIds are generated for normal nodes.
event-ids (deprecated) - List of ProtoNodeIds (as described above) of event types to be mapped to destinations. Events must be ObjectTypes and subtypes of BaseEventType in the OPC UA hierarchy. An empty ProtoNodeId defaults to the BaseEventType. This serves as an allowlist. If not specified, all events will be extracted.
emitter-ids (deprecated) - List of ProtoNodeIds used as emitters. An empty ProtoNodeId defaults to the server node. This allows specifying additional event emitters, and is used to add extra emitters that are not in the extracted node hierarchy, or that do not correctly specify the EventNotifier attribute.
historizing-emitter-ids (deprecated) - List of ProtoNodeIds that must be a subset of emitter-ids. These emitters will have their event history read. The server must support this, and the events.history option must be set for it to work. This is used to supplement the EventNotifier attribute, so that emitters that do not have the EventNotifier attribute set may still have their events read. Note that attempting to read historical events from non-historizing emitters may cause issues.
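
As a sketch, an events section that reads all events and maps a hypothetical BeginTime property to the event start time:

  events:
    enabled: true
    history: false
    all-events: true
    exclude-event-filter: "^AuditEvent"   # illustrative regex on type DisplayName
    destination-name-map:
      "BeginTime": "StartTime"            # hypothetical source property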

Pub-Sub

This is an experimental feature that allows subscribing to OPC UA PubSub for datapoints, instead of using OPC UA subscriptions. This requires the OPC UA server to be available and to expose its full PubSub configuration, as described in Part 14 of the OPC UA standard. It currently only supports MQTT.

Note that this does not disable subscriptions; you may want to set subscriptions/data-points to false to avoid getting duplicate datapoints.

Timeseries are not created from OPC UA pubsub configuration, but must be discovered in the OPC UA node hierarchy.

enabled - Default false. Enables pub-sub discovery.
prefer-uadp - Default true. If true, the extractor will prefer UADP encoding if the same datasets are exposed through multiple DataSetWriters.
file-name - Save or read the PubSub configuration from a file. If the file does not exist, it will be created from the server configuration. If this file is pre-created manually, the server does not need to expose its PubSub configuration.
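
A minimal pub-sub section might look like this sketch, where the file path is illustrative and the section key is assumed to be pub-sub:

  pub-sub:
    enabled: true
    prefer-uadp: true
    file-name: config/pubsub-config.xml   # illustrative cache location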