PI replace utility metrics

You can configure the PI replace utility to check the quality of the data points in Cognite Data Fusion (CDF) against the data points in the PI Data Archive. If you include the data-quality section in the configuration, the differences between PI and CDF are logged to the configured logger, sent as metrics to a Prometheus pushgateway, and written on exit to the CSV file set by the report-file-name configuration parameter.

Logger output

PI replace logs the differences between the PI Data Archive and CDF at the warning (WRN) level:


[2020-07-30 08:15:01.938 INF] Pushing metrics to http://localhost:9091 with job name replace-pi-job
[2020-07-30 08:15:02.445 INF] Connecting to PI server
[2020-07-30 08:15:19.484 INF] Found 16549 time series in PI. The replace range is ("2020-07-28T00:00:00.0000000Z", "2020-07-29T00:00:00.0000000Z")
[2020-07-30 08:15:19.532 INF] 16549/16549 time series have not been processed.
[2020-07-30 08:15:19.551 INF] Progress will be stored with an interval of 10 secs.
[2020-07-30 08:15:20.061 WRN] TimeSeries1 (Numeric) - datapoint number difference: -2/9
[2020-07-30 08:15:20.074 WRN] TimeSeries2 (Numeric) - datapoint number difference: -11/31
[2020-07-30 08:15:22.960 WRN] TimeSeries3 - datapoint number difference: -6065/12042
[2020-07-30 08:18:02.830 INF] Completed replacing 1000/16549 time series in range (2020-07-28 00:00:00.000, 2020-07-29 00:00:00.000) (163.2766988 secs)
[2020-07-30 08:24:53.899 WRN] TimeSeries4 (Numeric) - datapoints with different values 3/7
[2020-07-30 08:26:28.255 WRN] TimeSeries5 has no datapoints in CDF. Should have 6
[2020-07-30 08:27:05.487 WRN] TimeSeries6 has no datapoints in CDF. Should have 6
[2020-07-30 08:41:00.468 INF] Completed replacing 2000/16549 time series in range (2020-07-28 00:00:00.000, 2020-07-29 00:00:00.000) (177.6383223 secs)
[2020-07-30 09:06:44.673 WRN] TimeSeries7 (Numeric) - datapoints with different values 3/8

where:

  • TimeSeries1 (Numeric) - datapoint number difference: -2/9 indicates that the PI Data Archive has 9 data points for TimeSeries1 and that 2 of them are missing from CDF.

  • TimeSeries4 (Numeric) - datapoints with different values 3/7 indicates that both CDF and the PI Data Archive have 7 data points for TimeSeries4, but 3 of them have different values in CDF than in the PI Data Archive.

  • TimeSeries5 has no datapoints in CDF. Should have 6 indicates that the time series TimeSeries5 exists in CDF but has no data points, while the same time series has 6 data points in the PI Data Archive.
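
The three warning shapes above can be pulled out of a log file with a few regular expressions. The following is a minimal Python sketch, not part of the PI replace utility: the function name parse_warning and the issue labels "count", "value", and "missing" are made up for illustration, and the patterns assume the log lines look exactly like the sample output above.

```python
import re

# Patterns mirror the three warning shapes in the sample log output.
# The "(Numeric)" suffix is optional because TimeSeries3 is logged without it.
COUNT_DIFF = re.compile(
    r"WRN\] (?P<name>.+?)(?: \((?P<kind>\w+)\))? - datapoint number difference: "
    r"(?P<diff>-?\d+)/(?P<total>\d+)"
)
VALUE_DIFF = re.compile(
    r"WRN\] (?P<name>.+?)(?: \((?P<kind>\w+)\))? - datapoints with different values "
    r"(?P<diff>\d+)/(?P<total>\d+)"
)
NO_DATA = re.compile(
    r"WRN\] (?P<name>.+?) has no datapoints in CDF\. Should have (?P<total>\d+)"
)


def parse_warning(line):
    """Return (name, issue, affected, total_in_pi), or None for any other line."""
    m = COUNT_DIFF.search(line)
    if m:
        # The log prints the count difference as a negative number; keep its size.
        return m["name"], "count", abs(int(m["diff"])), int(m["total"])
    m = VALUE_DIFF.search(line)
    if m:
        return m["name"], "value", int(m["diff"]), int(m["total"])
    m = NO_DATA.search(line)
    if m:
        total = int(m["total"])
        return m["name"], "missing", total, total
    return None
```

Lines at the INF level, such as the progress messages, return None and can be skipped.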

CSV report

The PI replace utility writes a report with the differences between the PI Data Archive and CDF to a CSV file on exit:

[Figure: CSV report]

where:

  • TimeSeries: Name is the name of the time series. This is the PI Point name.

  • RangeStart indicates that data points with timestamps later than or equal to this date and time were replaced.

  • RangeEnd indicates that data points with timestamps earlier than or equal to this date and time were replaced.

  • TotalDataPoints is the total number of data points in the PI Data Archive.

  • CountDifference is the number of data points in CDF subtracted from the number of points in the PI Data Archive.

  • ValueDifference is the number of data points in CDF that have different values than the data points in the PI Data Archive.

  • MaxDelta is the maximum difference between the numeric data point values in the PI Data Archive and in CDF. The delta of an individual data point is calculated as the fraction (Value in PI - Value in CDF) / Value in PI and expressed as a percentage.

  • AvgDelta is the average data point value difference.
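
To make the delta columns concrete, here is a minimal Python sketch of the calculation. It is an illustration under stated assumptions, not the utility's implementation: the function name delta_stats is hypothetical, the absolute value is taken from the ABS(...) wording used in the data quality metrics description, and skipping points whose PI value is zero is an assumption made only to avoid dividing by zero.

```python
def delta_stats(pairs):
    """Compute (max_delta, avg_delta) for pairs of (value_in_pi, value_in_cdf).

    The results are fractions; multiply by 100 to get a percentage.
    """
    deltas = []
    for pi_value, cdf_value in pairs:
        if pi_value == 0:
            # Assumption: the fraction is undefined for a zero PI value, so skip it.
            continue
        deltas.append(abs(pi_value - cdf_value) / abs(pi_value))
    if not deltas:
        return 0.0, 0.0
    return max(deltas), sum(deltas) / len(deltas)


# Three data points: identical, 25% off, and 100% off.
max_delta, avg_delta = delta_stats([(10.0, 10.0), (20.0, 15.0), (4.0, 8.0)])
```

In this example, the third point (4.0 in PI, 8.0 in CDF) differs by as much as the PI value itself, so the maximum delta is 1.0, that is 100%.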

Metrics

PI replace can send metrics to a Prometheus pushgateway. The metrics can be displayed in a Grafana dashboard and monitored by CDF.

[Figure: Metrics_prometheus]

Name | Description
--- | ---
Heartbeat | The time since the last metrics push from the PI replace process.
Estimated time left | The estimated time left until all time series are processed, based on the average iteration duration.
# Time series | The total number of time series being processed.
# Data points | The total number of data points processed until now.
Time range left to cover | The time range left to query PI for data points. When it reaches 0, all time series have been replaced.
Avg duration per 1000 | The average time it takes to replace 1000 time series in one iteration. The step-hours configuration parameter defines the time span of an iteration.
Data points with different values | The percentage of the processed data points with different values in CDF compared to the PI Data Archive.
Data point count difference | The percentage of the data points in the PI Data Archive that are missing from CDF.
# Extractor incidents | The number and type (data loss or reconnection) of extractor incidents being handled.
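
As a rough illustration of how an estimate such as Estimated time left can be derived from Avg duration per 1000, the Python sketch below extrapolates the remaining time within a single iteration. The function name and the formula are assumptions for illustration. The source does not document how the utility computes the estimate, and the real value also depends on how many iterations (step-hours) remain.

```python
def estimate_seconds_left(total_series, processed_series, durations_per_1000):
    """Extrapolate the seconds left in the current iteration.

    durations_per_1000 holds the seconds each completed batch of 1000 time
    series took, as reported by the "Completed replacing ..." log lines.
    """
    if not durations_per_1000:
        return None  # No completed batch yet, so there is nothing to extrapolate.
    avg_per_1000 = sum(durations_per_1000) / len(durations_per_1000)
    remaining = max(total_series - processed_series, 0)
    return remaining / 1000 * avg_per_1000


# Values taken from the sample log output: 2000 of 16549 time series done.
seconds_left = estimate_seconds_left(16549, 2000, [163.2766988, 177.6383223])
```

With the two batch durations from the sample log, the estimate comes to roughly 41 minutes for the remaining 14549 time series in that iteration.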

Replace progress

[Figure: Replace progress]

Name | Description
--- | ---
Replace progress | The overall progress. Replace to is the goal timestamp (start-time configuration parameter) and Replace progress is the current iteration timestamp (step). The replace process is completed when the yellow line meets the green line.
Iteration - # of time series replaced | The number of time series replaced in the current iteration (step-hours) and the replacing rate (time series/sec).
Iteration - Duration per 1000 | The duration of replacing 1000 time series in one iteration (step). The dashed line is the estimated time left until all time series are replaced.

Performance

[Figure: Performance metrics]

Name | Description
--- | ---
Process | The CPU usage and memory consumption of the PI replace process on the host machine.
CDF Requests | The total number and latency of requests made to CDF.
CDF Uploaded data points | The total number of data points uploaded to CDF and the upload rate (data points/sec).

Data quality

[Figure: Data quality metrics]

Name | Description
--- | ---
Data points with different values | The total number of data points with different values and their rate (points/sec).
Data point count difference | The total number of data points in the PI Data Archive that are missing from CDF and their rate (points/sec).
Numeric data point delta | The average and maximum difference between numeric data points in the PI Data Archive and CDF, as a percentage. A MaxDelta of 100% means that one time series had at least one data point where ABS(Value in PI - Value in CDF) is greater than or equal to Value in PI.

PI server

[Figure: PI server metrics]

Name | Description
--- | ---
# PI connections | The number of times PI replace establishes a connection with the PI server. If this value is greater than 1, an error caused PI replace to reconnect to PI.
PI query rate | The rate of queries (queries/sec) to the PI server and the query latency.
PI AF Values | The total number of good and bad AF Values obtained from PI and their rate (values/sec). Typically, the PI extractor converts 'good' AF Values to data points and uploads these to CDF.