PI replace utility metrics
You can configure the PI replace utility to check the quality of the data points in Cognite Data Fusion (CDF) against the data points in the PI Data Archive. If you include the dataquality section in the configuration, the utility logs the differences between PI and CDF to the configured logger, pushes them as metrics to a Prometheus pushgateway, and writes them to a CSV file on exit. The reportfilename configuration parameter specifies the CSV file.
Logger output
PI replace logs the differences between the PI Data Archive and CDF at the warning (WRN) level:
[20200730 08:15:01.938 INF] Pushing metrics to http://localhost:9091 with job name replacepijob
[20200730 08:15:02.445 INF] Connecting to PI server
[20200730 08:15:19.484 INF] Found 16549 time series in PI. The replace range is ("20200728T00:00:00.0000000Z", "20200729T00:00:00.0000000Z")
[20200730 08:15:19.532 INF] 16549/16549 time series have not been processed.
[20200730 08:15:19.551 INF] Progress will be stored with an interval of 10 secs.
[20200730 08:15:20.061 WRN] TimeSeries1 (Numeric)  datapoint number difference: 2/9
[20200730 08:15:20.074 WRN] TimeSeries2 (Numeric)  datapoint number difference: 11/31
[20200730 08:15:22.960 WRN] TimeSeries3  datapoint number difference: 6065/12042
[20200730 08:18:02.830 INF] Completed replacing 1000/16549 time series in range (20200728 00:00:00.000, 20200729 00:00:00.000) (163.2766988 secs)
[20200730 08:24:53.899 WRN] TimeSeries4 (Numeric)  datapoints with different values 3/7
[20200730 08:26:28.255 WRN] TimeSeries5 has no datapoints in CDF. Should have 6
[20200730 08:27:05.487 WRN] TimeSeries6 has no datapoints in CDF. Should have 6
[20200730 08:41:00.468 INF] Completed replacing 2000/16549 time series in range (20200728 00:00:00.000, 20200729 00:00:00.000) (177.6383223 secs)
[20200730 09:06:44.673 WRN] TimeSeries7 (Numeric)  datapoints with different values 3/8
where:

TimeSeries1 (Numeric)  datapoint number difference: 2/9
indicates that the PI Data Archive has 9 data points for the time series, and 2 of them are missing from CDF.
TimeSeries4 (Numeric)  datapoints with different values 3/7
indicates that both CDF and PI have 7 data points, but 3 of them have different values in CDF than in the PI Data Archive.
TimeSeries5 has no datapoints in CDF. Should have 6
indicates that the time series TimeSeries5 exists in CDF but has no data points, while the same time series has 6 data points in the PI Data Archive.
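The three warning formats can be summarized with a short script. The following is a minimal Python sketch and is not part of the PI replace utility. The regular expressions are assumptions inferred only from the sample log lines above, not from a documented log format, and the function name parse_warning is hypothetical.

```python
import re

# Patterns inferred from the sample log output (an assumption, not a documented format).
COUNT_DIFF = re.compile(
    r"\[.*? WRN\] (?P<name>.+?)(?: \(Numeric\))?\s+"
    r"datapoint number difference: (?P<diff>\d+)/(?P<total>\d+)"
)
VALUE_DIFF = re.compile(
    r"\[.*? WRN\] (?P<name>.+?)(?: \(Numeric\))?\s+"
    r"datapoints with different values (?P<diff>\d+)/(?P<total>\d+)"
)
NO_DATA = re.compile(
    r"\[.*? WRN\] (?P<name>.+?) has no datapoints in CDF\. Should have (?P<total>\d+)"
)


def parse_warning(line):
    """Return (name, kind, affected data points, data points in PI), or None.

    kind is "count" (missing from CDF), "value" (different values),
    or "empty" (no data points in CDF at all).
    """
    m = COUNT_DIFF.search(line)
    if m:
        return m["name"], "count", int(m["diff"]), int(m["total"])
    m = VALUE_DIFF.search(line)
    if m:
        return m["name"], "value", int(m["diff"]), int(m["total"])
    m = NO_DATA.search(line)
    if m:
        total = int(m["total"])
        # Every data point in PI is missing from CDF in this case.
        return m["name"], "empty", total, total
    return None  # INF lines and anything else are ignored
```

Applied to each line of a log file, this yields one tuple per warning, which can then be counted or grouped by kind.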
CSV report
The PI replace utility writes a report with the differences between the PI Data Archive and CDF to a CSV file on exit. The report contains the following fields:

TimeSeries: Name
is the name of the time series. This is the PI Point name. 
RangeStart
indicates that data points with timestamps later than or equal to this date and time were replaced. 
RangeEnd
indicates that data points with timestamps earlier than or equal to this date and time were replaced. 
TotalDataPoints
is the total number of data points in the PI Data Archive. 
CountDifference
is the number of data points in the PI Data Archive minus the number of data points in CDF. 
ValueDifference
is the number of data points in CDF that have different values than the data points in the PI Data Archive. 
MaxDelta
is the maximum difference of the numeric data point values. The individual data point delta percentage is calculated as the fraction: (Value in PI - Value in CDF) / Value in PI 
AvgDelta
is the average data point value difference.
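To make the relationship between these fields concrete, here is a minimal Python sketch that derives them for a single time series. It is an illustration under stated assumptions, not the utility's implementation: data points are modeled as dictionaries keyed by timestamp, the delta uses absolute values so that it is never negative, data points whose PI value is 0 are skipped, and AvgDelta is taken as the mean delta over all data points present in both systems. The function name compare_series is hypothetical.

```python
def compare_series(pi_points, cdf_points):
    """Compare two {timestamp: numeric value} dicts and return report-style figures."""
    total = len(pi_points)
    # Number of data points in PI minus the number of data points in CDF.
    count_difference = total - len(cdf_points)
    # Only timestamps present in both systems can be compared by value.
    shared = [ts for ts in pi_points if ts in cdf_points]
    value_difference = sum(1 for ts in shared if pi_points[ts] != cdf_points[ts])
    # Per-point delta as a fraction of the PI value. Skipping zero PI values
    # and taking absolute values are assumptions of this sketch.
    deltas = [
        abs(pi_points[ts] - cdf_points[ts]) / abs(pi_points[ts])
        for ts in shared
        if pi_points[ts] != 0
    ]
    return {
        "TotalDataPoints": total,
        "CountDifference": count_difference,
        "ValueDifference": value_difference,
        "MaxDelta": max(deltas, default=0.0),
        "AvgDelta": sum(deltas) / len(deltas) if deltas else 0.0,
    }


# Made-up sample values: one data point is missing from CDF, two have different values.
pi = {1: 10.0, 2: 20.0, 3: 40.0, 4: 50.0, 5: 60.0}
cdf = {1: 10.0, 2: 25.0, 3: 30.0, 4: 50.0}
print(compare_series(pi, cdf))
# {'TotalDataPoints': 5, 'CountDifference': 1, 'ValueDifference': 2, 'MaxDelta': 0.25, 'AvgDelta': 0.125}
```

Multiply MaxDelta and AvgDelta by 100 to express them as percentages.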
Metrics
PI replace can send metrics to a Prometheus pushgateway. The metrics can be displayed in a Grafana dashboard and monitored by CDF.
Name  Description 

Heartbeat  The time since the last metrics push from the PI replace process. 
Estimated time left  The estimated time left until all time series are processed, based on the average iteration duration. 
# Time series  The total number of time series being processed. 
# Data points  The total number of data points processed until now. 
Time range left to cover  The time range left to query PI for data points. When it reaches 0, all time series have been replaced. 
Avg duration per 1000  The average time it takes to replace 1000 time series in one iteration. The stephours configuration parameter defines the time span of an iteration. 
Data points with different values  The percentage of the processed data points with different values in CDF compared to the PI Data Archive. 
Data point count difference  The percentage of the data points in the PI Data Archive that are missing from CDF. 
# Extractor incidents  The number and type (data loss or reconnection) of extractor incidents being handled. 
Replace progress
Name  Description 

Replace progress  The overall progress. Replace to is the goal timestamp (the starttime configuration parameter), and Replace progress is the timestamp of the current iteration (step). The replace process is complete when the yellow line meets the green line. 
Iteration - # of time series replaced  The number of time series replaced in the current iteration (stephours) and the replacing rate (time series/sec). 
Iteration - Duration per 1000  The duration of replacing 1000 time series in one iteration (step). The dashed line is the estimated time left until all time series are replaced. 
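As a rough illustration of how a time estimate can follow from the duration per 1000 time series, the Python sketch below scales the average duration by the number of time series still to be replaced. This is one plausible model and not the utility's actual calculation. It covers only the time series remaining in a single iteration, while the real estimate also has to account for the time range left to cover. The function name estimate_seconds_left and the sample numbers are hypothetical.

```python
from statistics import mean


def estimate_seconds_left(remaining_series, secs_per_1000_history):
    """Estimate the remaining seconds from past durations per 1000 time series."""
    if not secs_per_1000_history:
        return None  # no completed batches yet, so there is no basis for an estimate
    avg_secs_per_1000 = mean(secs_per_1000_history)
    return remaining_series / 1000 * avg_secs_per_1000


# Illustrative numbers: two batches of 1000 took 160 s and 180 s,
# and 14000 time series remain in this iteration.
print(estimate_seconds_left(14000, [160.0, 180.0]))  # 2380.0
```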
Performance
Name  Description 

Process  The CPU usage and memory consumption of the PI replace process on the host machine. 
CDF Requests  The total number and latency of requests made to CDF. 
CDF Uploaded data points  The total number of data points uploaded to CDF and the upload rate (data points/sec). 
Data quality
Name  Description 

Data points with different values  The total number of data points with different values and their rate (points/sec). 
Data point count difference  The total number of data points in the PI Data Archive that are missing from CDF and their rate (points/sec). 
Numeric data point delta  The average and maximum difference between numeric data points in the PI Data Archive and CDF, as a percentage. A MaxDelta of 100% means that one time series had at least one data point where the fraction ABS(Value in PI - Value in CDF) / Value in PI is 1 or greater, that is, the absolute difference is at least as large as the value in PI. 
PI server
Name  Description 

# PI connections  The number of times PI replace establishes a connection with the PI server. If this value is greater than 1, an error caused PI replace to reconnect to PI. 
PI query rate  The rate of queries (queries/sec) to the PI server and the query latency. 
PI AF Values  The total number of good and bad AF Values obtained from PI and their rate (values/sec). Typically, the PI extractor converts 'good' AF Values to data points and uploads these to CDF. 