Extractor metrics
The PI extractor can send performance metrics for remote monitoring and debugging. You can configure the PI extractor to use Prometheus. If you've enabled the metrics section, the extractor uploads metrics to the configured pushgateway or server. If you're using a pushgateway, all the metrics with descriptions will be listed under the configured job name.
Prefix metric names
The metric names are prefixed with the metric source:
- dotnet is for metrics on .NET GC and memory.
- process is for metrics about the running process, such as the number of threads, CPU time, and memory.
- push is for metrics related to the pushgateway, such as the last time pushed and the last time failed.
- cognite_sdk is for metrics produced by the Cognite .NET SDK. This is mostly related to the calls to the CDF API.
- extractor_utils is for metrics produced by the Extractor Utils library. This is related to the number and duration of calls to individual endpoints in CDF and metrics on state storage.
- pi_extractor is for metrics specific to the operation of the PI extractor.
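When browsing a long list of scraped metrics, it can help to group them by source. The following is a minimal sketch that assumes metric names are available as plain strings; the helper names `metric_source` and `group_by_source` are hypothetical and not part of the extractor or any Cognite library.

```python
# Metric-source prefixes, taken from the list above.
KNOWN_SOURCES = (
    "extractor_utils",
    "pi_extractor",
    "cognite_sdk",
    "process",
    "dotnet",
    "push",
)


def metric_source(name: str) -> str:
    """Return the source prefix of a metric name, or "unknown" if none matches."""
    for prefix in KNOWN_SOURCES:
        # Require the underscore so that only whole prefixes match.
        if name.startswith(prefix + "_"):
            return prefix
    return "unknown"


def group_by_source(names):
    """Group metric names into a dict keyed by their source prefix."""
    groups = {}
    for name in names:
        groups.setdefault(metric_source(name), []).append(name)
    return groups
```

For example, `group_by_source(["pi_extractor_timeseries", "push_time_seconds"])` places the first name under `pi_extractor` and the second under `push`.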
Metrics
Name | Type | Purpose |
---|---|---|
pi_extractor_details | Gauge | Information about the version of the running extractor. The metric has a version label that holds the version of the running extractor. The metric value itself can be ignored. Example: version="2.1.0.0" |
pi_extractor_start | Gauge | The UNIX timestamp, in milliseconds, when the extractor was last started. Example: 1599495797230 (7 September 2020 16:23:17.230 UTC) |
pi_extractor_timeseries | Gauge | The total number of time series being processed by the extractor. |
pi_extractor_timeseries_states | Gauge | The total number of time series per extraction state. The state label indicates whether the state is frontfill, backfill, or streaming. Example: state="frontfill" value=13378 (about 13,000 time series are currently being frontfilled) |
pi_extractor_pi_connection_start | Gauge | The UNIX timestamp of the last connection attempt to the PI server. Example: 1599495797590 |
pi_extractor_pi_connection_time | Gauge | The elapsed time in milliseconds for the current connection to the PI server. Example: 7030683.7172 (~1.95 hours) |
pi_extractor_pi_connections | Gauge | The number of times the extractor has connected to the PI server. Internal errors on the extractor or on the PI server will cause the extractor to reconnect to the server, increasing the value of this gauge. If the PI extractor has been running since startup without reconnecting to the PI server, the value should be 1. |
pi_extractor_pi_queries | Summary | The number and duration of queries to the PI server. This includes queries to the PI Data Archive (frontfill, backfill). Sample count: number of queries. Sample sum: sum of the duration of all queries. |
pi_extractor_pi_query_points | Counter | The total number of data points (AF values) returned in queries to the PI Data Archive (frontfill, backfill). There are two labels: mode and state. mode is either frontfill or backfill. state is either good or bad. Example: mode="frontfill" state="good" value=179070022 (~180 million good data points received from frontfill queries) |
pi_extractor_pi_data_pipe_events | Counter | The total number of events received from the PI Data Pipe (stream), counted per combination of action, status, and type. There are three labels: action, status, and type. action is either Add, Insert, Delete, Refresh, or Update. status is either good or bad. type is either AFDataPipeEvent or AFDataPipeRangeDeletedEvent (the C# class type of the event). Example: action="Update" status="good" type="AFDataPipeEvent" value=410045 (about 410,000 update events containing good data point updates) |
pi_extractor_pi_data_loss | Summary | The number and duration of data loss incidents observed in the PI Data Pipe (stream). Sample count: number of data loss incidents observed. Sample sum: sum of the duration of all incidents. |
pi_extractor_event_queue_size | Gauge | The number of events fetched from the PI Data Pipe and put in queues for processing. The name label indicates the type of the queue: double or string. The double queue contains data pipe events related to numeric data points. The string queue contains data pipe events related to string data points. Example: name="double" value=100000 (there are 100k numeric events in the queue to be processed by the extractor) |
pi_extractor_frontfill | Gauge | The maximum and minimum timestamps for the time series undergoing frontfill. The aggregate label indicates if it's the max or the min timestamp. Example: aggregate="min" value=1546077427120 and aggregate="max" value=1600256700000 (the minimum timestamp of all time series being frontfilled is 29 December 2018 09:57:07.120 and the maximum is 16 September 2020 11:45:00, so there are about two years of data to be frontfilled) |
pi_extractor_backfill_goal_time | Gauge | The timestamp of the current backfill goal. The backfiller works in steps. This metric indicates how far back in time the backfiller will fetch data points for the current step. This can be used as a measure of backfiller progress. Example: value=1599652198896 (this timestamp is 9 September 2020 UTC; if the current UTC date is 16 September 2020, the current backfill step is fetching data from one week back) |
pi_extractor_pusher_data_points | Counter | The total number of data points per action handled by the extractor before inserting into CDF. The action label specifies how the data point is being handled. added are points to be uploaded to CDF. bad are points discarded because they're marked as bad in PI. frontfill-skipped are data points skipped during frontfill. backfill-skipped are points skipped during backfill. Example: action="added" value=100000 (there are 100k points to be added in CDF). To know the number of points that were actually uploaded to CDF, use extractor_utils_cdf_datapoints. |
pi_extractor_stream_iterations | Counter | The number of stream iterations. During one iteration, the extractor streamer will dequeue events from the extractor event queues, parse the events, and upload the data points to CDF. Example: value=150 (150 iterations since the extractor started). |
pi_extractor_stream_iteration_star | Gauge | The start timestamp of the current stream iteration. This can be used to calculate the streaming latency. Example: value=1600265491419 (16 September 2020 14:11:31). Comparing it to UTC now (or even push_time_seconds) provides an indication of how late the current iteration is. |
pi_extractor_streamer_data_points | Counter | The number of data points per type being handled by the streamer. The type label indicates the type of event in the PI server that generated the data point. new are real-time data points being added to the time series (newer than the newest data point in a time series). historical are updates or insertions of data points in the past. ahead are new data points that are ahead of the newest data point in CDF because the frontfiller hasn't caught up with the real-time data points. Example: type="new" value=100000 (100k new data points handled by the streamer so far) |
pi_extractor_streamer_oldest_data_point | Gauge | The age/time in seconds of the oldest (historical) data point received in the current stream iteration. This could be used as an indication of how far back the historical updates occurred, in case some updates are lost and a refill is required. Example: value=120 (Oldest historical data point in the current streaming iteration is 2 minutes old.) |
extractor_utils_cdf_timeseries_requests | Summary | A summary describing the number and duration of time series requests to CDF. |
extractor_utils_cdf_datapoint_requests | Summary | A summary describing the number and duration of data point requests to CDF. |
extractor_utils_cdf_datapoints | Counter | The total number of data points pushed to CDF by the utils. |
extractor_utils_cdf_invalid_data_points | Counter | The total number of data points skipped due to bad timestamp or value. CDF requires data points to be within a certain time and value range. |
cognite_sdk_fetch_inc | Counter | The total number of POST/GET/… actions performed. |
cognite_sdk_fetch_error_inc | Counter | The total number of errors on actions. |
cognite_sdk_fetch_retry_inc | Counter | The total number of retries on actions. |
cognite_sdk_decode_error_inc | Counter | The total number of data errors, meaning invalid data was received from CDF. |
cognite_sdk_fetch_latency_update | Gauge | The measured latency in milliseconds on actions performed. |
dotnet_total_memory_bytes | Gauge | The number of bytes of allocated memory. |
dotnet_collection_count_total | Counter | The number of garbage collection (GC) calls, grouped by generation. |
process_cpu_seconds_total | Counter | The total number of CPU seconds used by the process. |
process_num_threads | Gauge | The number of current threads in use by the process. |
process_open_handles | Gauge | The number of currently open handles. |
process_open_fds | Gauge | The number of currently open file descriptors. |
process_private_memory_bytes | Gauge | The number of current private memory bytes. |
process_resident_memory_bytes | Gauge | The number of current resident memory bytes. |
process_virtual_memory_bytes | Gauge | The current virtual memory size in bytes. |
process_working_set_bytes | Gauge | The number of memory bytes for the current working set. |
process_start_time_seconds | Gauge | The start time in UNIX time seconds. |
push_time_seconds | Gauge | The UNIX time, in seconds, when data was last pushed to the Prometheus pushgateway. |
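
Several metrics in the table need a small amount of arithmetic before they are readable: the pi_extractor timestamp examples are in milliseconds, push_time_seconds is in seconds, and Summary metrics expose a sample count and a sample sum rather than an average. The sketch below shows one way to do these conversions on values you have already read from Prometheus. The function names are hypothetical, and the unit of a summary's duration is not stated in the table, so the mean is returned in whatever unit the sum uses.

```python
from datetime import datetime, timedelta, timezone

# Start of the UNIX epoch as a timezone-aware datetime.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def ms_to_utc(timestamp_ms: int) -> datetime:
    """Convert a millisecond UNIX timestamp (e.g. pi_extractor_start) to UTC."""
    return EPOCH + timedelta(milliseconds=timestamp_ms)


def summary_mean(sample_sum: float, sample_count: int) -> float:
    """Average duration for a Summary metric, in the unit of sample_sum."""
    if sample_count == 0:
        return 0.0  # No samples yet, so there is no meaningful average.
    return sample_sum / sample_count


def lag_seconds(start_ms: int, reference_seconds: float) -> float:
    """Seconds between a millisecond timestamp and a reference time in seconds.

    The reference can be the current UTC time or push_time_seconds.
    """
    return reference_seconds - start_ms / 1000


# The pi_extractor_start example value from the table.
print(ms_to_utc(1599495797230).isoformat())
```

Comparing the stream iteration start timestamp with push_time_seconds through `lag_seconds` gives a rough indication of streaming latency, as described in the table.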