About data governance

Solid data governance is essential for scaling digitalization in industry.

Cognite Data Fusion (CDF) provides:

  • Secure access management. Learn more about access management.
  • Data sets, which let you document and track data lineage, ensure data integrity, and collaborate with third parties who write their insights back to your Cognite Data Fusion project.
  • Data kits, which let you track which apps and models are running on Cognite Data Fusion and monitor the data quality of the time series those apps and models use.

Data sets and data kits

A data set is a grouping of data based on origin. You will typically create one data set for each data pipeline ingesting data into CDF. For example, you may create one data set for all work orders from SAP. You use data sets to know where data in your CDF project comes from and to ensure the integrity of that data.

A data kit is a grouping of data based on utilization. For example, one data kit may contain all the data needed for a particular data science model that monitors the health of a well. You use data kits to document which apps and models are running on Cognite Data Fusion. Learn more about how to manage data kits.
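The difference between grouping by origin and grouping by utilization can be illustrated with a small, self-contained sketch. This is not the Cognite Data Fusion API: the `Resource` and `DataKit` classes, the `origins_of` helper, and all names and identifiers below are hypothetical and exist only to show how the two groupings relate.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for an item in a project (work order, time series, ...)."""
    external_id: str
    data_set: str  # the data set, i.e. the origin, this resource was ingested through


@dataclass
class DataKit:
    """Hypothetical stand-in for a data kit: resources grouped by what consumes them."""
    name: str
    used_by: str
    resource_ids: Set[str] = field(default_factory=set)


def origins_of(kit: DataKit, resources: Iterable[Resource]) -> Set[str]:
    """Return the data sets (origins) that feed the given data kit."""
    by_id = {r.external_id: r for r in resources}
    return {by_id[rid].data_set for rid in kit.resource_ids if rid in by_id}


# Illustrative data: one data set per ingestion pipeline.
resources = [
    Resource("wo-1001", "sap-work-orders"),
    Resource("ts-well-7-pressure", "historian-time-series"),
    Resource("ts-well-7-temperature", "historian-time-series"),
]

# One data kit per consumer, here a model that monitors the health of a well.
kit = DataKit(
    name="well-7-health",
    used_by="well health model",
    resource_ids={"wo-1001", "ts-well-7-pressure"},
)

print(sorted(origins_of(kit, resources)))
```

In this sketch a resource belongs to exactly one data set but can be referenced by any number of data kits, so you can trace from a model back to the pipelines that supply it.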

Time series data quality monitoring

When you rely on data to make operational decisions, both you and your end users need to know when that data is reliable. Define your data quality requirements so that you can tell when data meets them.

  • Set min/max limits for time series data, and requirements for data point density, data latency, or other data quality dimensions.
  • Get alerted when data quality is low, and display data quality status in other apps and dashboards.
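The kinds of requirements listed above can be sketched as a simple rule check. This is a minimal illustration in plain Python, not the monitoring feature itself: the `QualityRule` class, the `check_quality` function, the thresholds, and the sample data points are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Point = Tuple[float, float]  # (timestamp in seconds, value)


@dataclass
class QualityRule:
    """Hypothetical data quality requirements for one time series."""
    min_value: float
    max_value: float
    min_points_per_hour: float
    max_latency_s: float


def check_quality(points: Sequence[Point], rule: QualityRule,
                  window_start: float, now: float) -> List[str]:
    """Return the names of the requirements violated in [window_start, now]."""
    violations = []

    # Limits: every value must lie within [min_value, max_value].
    if any(v < rule.min_value or v > rule.max_value for _, v in points):
        violations.append("limits")

    # Density: average number of data points per hour over the window.
    hours = (now - window_start) / 3600
    if hours > 0 and len(points) / hours < rule.min_points_per_hour:
        violations.append("density")

    # Latency: age of the newest data point relative to "now".
    latest = max((t for t, _ in points), default=None)
    if latest is None or now - latest > rule.max_latency_s:
        violations.append("latency")

    return violations


rule = QualityRule(min_value=0, max_value=100,
                   min_points_per_hour=4, max_latency_s=600)

healthy = [(600, 10.0), (1500, 20.0), (2400, 30.0), (3300, 40.0)]
degraded = [(0, 10.0), (900, 20.0), (1800, 150.0), (2700, 30.0)]

print(check_quality(healthy, rule, window_start=0, now=3600))
print(check_quality(degraded, rule, window_start=0, now=3600))
```

A non-empty result is the point at which an alert could be raised or a quality status shown in a dashboard; how that is delivered is outside the scope of this sketch.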

Learn more about time series data quality monitoring.

Last Updated: 2/2/2020, 5:26:31 PM