Monitor data quality

When you rely on data to make operational decisions, it is critical that you know when the data is reliable, and that end users know when they can trust it.

Follow these steps to continuously monitor the data quality of time series:

Step 1. Create a monitor and rule sets

To create a monitor:

  1. In the Console left-hand menu, select Quality monitoring.

  2. Select + Create.

  3. Enter a name for your monitor and a description to indicate what the monitor will be used for, or what kind of data the monitor includes.

  4. Select whether everyone can view the monitor, or view and edit it.

  5. Select Create to create the monitor.

  6. Group time series with similar data quality requirements into rule sets, and add as many rule sets as you need.

    To create a rule set and start adding data to your monitor, select + Add new rule set.

  7. Search and filter to find the time series you want to add to your rule set.

  8. Select the time series that you want to add to your rule set.

  9. Select Add to monitor.

  10. Select Edit time series to add or remove time series, or Go to monitor to view the data in your monitor.

Note: Each monitor can contain a maximum of 500 time series and each rule set a maximum of 150 time series.

Step 2. Add data quality monitoring to rule sets

Add data quality monitoring to a rule set to specify the data quality requirements for a group of time series.

  1. In the Cognite Console, open an existing monitor and then a rule set.

  2. Select + Add rules to specify the quality requirements for the time series in the rule set.

  3. Define the time window that matches the requirements of your data science model or application.

    The time window specifies how far back the data quality monitor should check that the selected time series meet the data quality requirements.

    Note: The time window has to be between 1 minute and 3 hours. We recommend that you specify a time window between 1 minute and 30 minutes.

  4. Select the data quality rules.

    Different models and apps have different data quality requirements and will need to be monitored for different aspects of data quality. Select which data quality rules to apply to the time series in each rule set:

    • Max age of the last data point - checks that the latency is acceptable for each of the time series in the rule set.
    • Max distance between data points - monitors the gap between any two data points in each of the time series in the rule set.
    • Min number of data points - checks that the number of data points in the defined time window is high enough for each of the time series in the rule set.
    • Max value - checks that the data points are below a max value for each of the time series in the rule set.
    • Min value - checks that the data points are above a minimum value for each of the time series in the rule set.
  5. Set up notifications.

    Specify the email addresses or a webhook URL to notify if the data quality fails to meet the requirements, and when data quality returns to normal.

    A webhook lets an app push real-time information to other applications. You can use a tool like Opsgenie to receive the notification and pass it on to the relevant recipients through email, Slack, or other channels.

  6. Select Apply.
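The rules in step 4 above can be thought of as predicates evaluated over the data points that fall inside the chosen time window. The sketch below is only a conceptual illustration of that idea (the point format, function name, and thresholds are invented for this example, not the monitor's actual implementation):

```python
import time

def check_rules(points, *, max_age_s, max_gap_s, min_count,
                max_value, min_value, now=None):
    """Evaluate the five data quality rules over sorted (timestamp_s, value) pairs.

    Returns a dict mapping each rule name to True (passed) or False (broken).
    The rule names mirror the event sub-types listed in Step 3.
    """
    now = now if now is not None else time.time()
    timestamps = [t for t, _ in points]
    values = [v for _, v in points]
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    return {
        # Latency: the newest point must be recent enough.
        "max_age_of_last_data_point": bool(points) and now - timestamps[-1] <= max_age_s,
        # No two consecutive points may be further apart than the max gap.
        "max_distance_between_points": all(g <= max_gap_s for g in gaps),
        # Enough points must exist in the time window.
        "min_count": len(points) >= min_count,
        # All values must stay within the allowed range.
        "max_value": all(v <= max_value for v in values),
        "min_value": all(v >= min_value for v in values),
    }

status = check_rules(
    [(1000.0, 3.2), (1030.0, 3.5), (1090.0, 9.9)],
    max_age_s=60, max_gap_s=45, min_count=5,
    max_value=5.0, min_value=0.0, now=1100.0,
)
# min_count is broken (3 < 5), max_value is broken (9.9 > 5.0),
# and max_distance_between_points is broken (gap of 60 s > 45 s).
```

When a predicate flips from True to False, the monitor opens an alert event; when it flips back, the event is closed, as described in Step 3.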

Step 3. Report the quality status in apps and models

If the data quality does not meet the requirements, the data quality monitor writes an event to Cognite Data Fusion (CDF). The event is of the type Data Quality Monitoring Alert, and includes a sub-type corresponding to the type of rule that was broken:

  • min_count
  • max_age_of_last_data_point
  • max_value
  • min_value
  • max_distance_between_points

The metadata fields contain information about which monitor and rule set the event belongs to, and which time series initially broke the rule. This allows you to report the data quality status in other apps and models.

To get the latest data quality status for a particular monitor, you can query events from CDF and filter on the event type. See the Events resource in the Cognite API documentation to learn more.

When you query for events:

  • "dataKitId" corresponds to the ID of the monitor
  • "subKitId" corresponds to the ID of the rule set

For example, the request below returns all currently broken rules for a monitor with 322 as its dataKitId. You can see the IDs for a monitor and its rule sets by opening the monitor in the Console.

URL: https://api.cognitedata.com/api/v1/projects/{project}/events/list

Body:

{
    "filter": {
        "type": "Data Quality Monitoring Alert",
        "metadata": {
            "dataKitId": "322",
            "isOpen": "true"
        }
    }
}

Response example:

{
    "startTime": 1575900039980,
    "type": "Data Quality Monitoring Alert",
    "subtype": "min_count",
    "description": "",
    "metadata": {
        "dataKitId": "322",
        "dimension": "min_count",
        "isOpen": "true",
        "subKitId": "312",
        "subKitName": "Untitled rule set"
    },
    "source": "Data Quality Monitoring in the Cognite Console",
    "id": 6740001776635791,
    "lastUpdatedTime": 1575900044311,
    "createdTime": 1575900044311
}

When the data quality is restored and meets the requirements, the event is updated with an end time, and the isOpen metadata field is set to false.

The response also includes the subKitId and subKitName, so you can easily see which group of time series has broken rules.
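As a sketch, the query above can be issued from Python with the standard library. The project name and the api-key header value are placeholders you must replace with your own credentials (depending on your project's setup, authentication may use a different mechanism than an API key):

```python
import json
import urllib.request

PROJECT = "your-project"   # placeholder: your CDF project name
API_KEY = "your-api-key"   # placeholder: your authentication credential

def alert_filter(monitor_id: int) -> dict:
    """Build the events/list request body for the open alerts of one monitor.

    Metadata values in CDF events are strings, so the monitor ID is
    converted to a string before filtering on dataKitId.
    """
    return {
        "filter": {
            "type": "Data Quality Monitoring Alert",
            "metadata": {"dataKitId": str(monitor_id), "isOpen": "true"},
        }
    }

def broken_rules(monitor_id: int) -> list:
    """POST the filter to the events/list endpoint and return matching events."""
    url = f"https://api.cognitedata.com/api/v1/projects/{PROJECT}/events/list"
    request = urllib.request.Request(
        url,
        data=json.dumps(alert_filter(monitor_id)).encode(),
        headers={"api-key": API_KEY, "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["items"]

# Each returned event carries subKitId, subKitName, and subtype in its
# metadata, so you can report which rule set broke which rule, e.g.:
# for event in broken_rules(322):
#     print(event["metadata"]["subKitName"], event["subtype"])
```

Because closed events keep their end time and have isOpen set to "false", filtering on "isOpen": "true" returns only the rules that are currently broken.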

Last Updated: 3/19/2020, 8:57:22 AM