# Introducing data sets

Many of you love the current data sets feature in the Cognite Console that lets you manually document metadata for your data sources. Now we are taking data sets to the next level and making them a full-stack, integrated solution.

Data sets let you document and track data lineage, ensure data integrity, and allow 3rd parties to write their insights securely back to your Cognite Data Fusion (CDF) project.

Data sets group and track data by its source. For example, a data set can contain all work orders originating from SAP, or the output data from a 3rd party partner's machine learning model. Typically, an organization has one data set for each of its data ingestion pipelines in CDF.

To get started with data sets:

  1. Check out our in-depth article to learn more.
  2. Navigate to fusion.cognite.com, select Data sets, and follow the steps in our guide to create your first data set.

# Trace data lineage

  • IT managers need to know which data is currently available in their CDF project.
  • Data scientists need to know if they can rely on the input data for the use cases they're solving, and who to contact if they need more information.
  • IT support staff need to know how the data is integrated to help troubleshoot any issues.

# Ensure data integrity

  • When data engineers and IT managers have designed, implemented, and approved the data ingestion pipelines, they need to protect the pipelines from accidental changes to keep the data valid and accurate.

# Let 3rd parties write data securely back to CDF

  • When you want 3rd parties to write their insights back to your CDF project, you need to provide a safe container to hold their data and, at the same time, safeguard data from your other data pipelines.
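
As a rough illustration of the idea of one container per source with protected writes, here is a minimal Python sketch. It is a toy model and not the CDF API: the `DataSet` class, the `write` helper, the `owners` field, and all names such as `sap-extractor` and `partner-app` are hypothetical and exist only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class DataSet:
    """Hypothetical stand-in for a data set: a named container for one source."""
    name: str
    write_protected: bool = False
    owners: set = field(default_factory=set)   # writers allowed when protected
    items: list = field(default_factory=list)


def write(data_set: DataSet, principal: str, item: dict) -> None:
    """Append an item, refusing writers who do not own a protected data set."""
    if data_set.write_protected and principal not in data_set.owners:
        raise PermissionError(f"{principal} may not write to {data_set.name}")
    # Tag each item with its data set so its source stays traceable.
    data_set.items.append({**item, "data_set": data_set.name})


# One data set per ingestion pipeline (illustrative names only).
sap_work_orders = DataSet(
    "sap-work-orders", write_protected=True, owners={"sap-extractor"}
)
partner_insights = DataSet(
    "partner-ml-output", write_protected=True, owners={"partner-app"}
)

write(sap_work_orders, "sap-extractor", {"work_order": "WO-1"})
write(partner_insights, "partner-app", {"prediction": 0.87})

try:
    # The partner can write to its own container, but not to the SAP data.
    write(sap_work_orders, "partner-app", {"work_order": "tampered"})
    blocked = False
except PermissionError:
    blocked = True
```

In this sketch the 3rd party's insights land in their own container, while the attempt to write into the SAP data set is rejected, which mirrors the separation described above.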

# Resources

Data sets are available through: