# Create extraction pipelines
This article explains how to create extraction pipelines for the extractors you want to monitor on the Extraction pipelines page. On the Create extraction pipeline page, you must enter a Data set, Name, and External ID. You can add or edit additional information later on the Extraction pipeline overview page.
# Before you start
A data set must exist for the data you want to add to an extraction pipeline.
Navigate to Manage & Configure > Manage access and set any of these capabilities for users and extractors:
| Capability type | Action | Description | Scope |
|---|---|---|---|
| extractionpipelines | | Gives access to list and view metadata of the pipeline. | Align access with users that have |
| extractionpipelines | | Gives access to create and edit individual pipelines and edit notification settings. | Align access with users that have |
| extractionruns | | Gives access to view extractor run history reported by the individual extraction pipeline runs. | Align access with users that have |
| extractionruns | | Give this capability to extractors to allow for state and heartbeat reporting back to Cognite Data Fusion. | Scope access to the specific extraction pipeline ID which represents a particular extractor, ensuring that statuses and errors can be reported only by that specific extractor. |
# Create extraction pipelines
1. Sign in to Cognite Data Fusion.
2. In the top menu, do one of the following:
   - Navigate to Integrate > Monitor extraction pipelines.
   - Navigate to Manage & Configure > Create, view, and manage data sets. Then select a data set and open the Lineage tab to add a pipeline to the selected data set.
3. Click Create extraction pipeline and fill in the mandatory fields: Data set, Name, and External ID.
4. Click Create to open the Extraction pipeline overview page. On this page, you can add more information to provide context and insight about the pipeline.
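
If you script pipeline creation instead of using the UI, the same three mandatory fields apply. The following is a minimal sketch that only builds and validates a request body; it does not call CDF. The helper name `build_create_request`, the field names `dataSetId`, `name`, and `externalId`, and the `items` wrapper are assumptions based on common CDF API conventions and are not confirmed by this article. Check the API reference before relying on them.

```python
import json

# Assumed field names for the three mandatory inputs (Data set, Name, External ID).
REQUIRED_FIELDS = ("dataSetId", "name", "externalId")


def build_create_request(data_set_id, name, external_id):
    """Build a request body holding the three mandatory fields.

    Hypothetical helper: it validates the inputs and returns a dict,
    but sends nothing to CDF.
    """
    item = {"dataSetId": data_set_id, "name": name, "externalId": external_id}
    missing = [key for key in REQUIRED_FIELDS if item[key] in (None, "")]
    if missing:
        raise ValueError("Missing mandatory fields: " + ", ".join(missing))
    return {"items": [item]}


if __name__ == "__main__":
    # Illustrative values only.
    body = build_create_request(123456789, "SAP work orders", "sap-work-orders")
    print(json.dumps(body, indent=2))
```

Validating before sending mirrors the UI, which does not let you create a pipeline until all three fields are filled in.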
# Enable email notifications
Extraction pipelines enable owners and other stakeholders to receive email notifications. The notifications are triggered when the extraction pipeline reports a failed run to CDF.
To specify who is notified, add the owner and other contacts, with their email addresses, in the Contacts section of the Extraction pipeline page.
Email notifications are triggered only when the status of an extraction pipeline changes. This prevents repeated emails during an ongoing incident. For a new incident, an email is sent for the first reported failed run and again when the incident is resolved. Further failures reported in succession do not trigger additional emails.
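
The state-change rule can be illustrated with a short sketch. This is not how CDF implements notifications; it only models the behavior described above. The function name `notification_events` and the status strings `"success"` and `"failure"` are assumptions made for the example.

```python
def notification_events(run_statuses):
    """Return (run index, event) pairs for runs that would trigger an email.

    Models the rule: notify on the first failed run of an incident and
    again when the incident is resolved; ignore repeated failures.
    """
    events = []
    incident_open = False  # True while an incident is ongoing
    for index, status in enumerate(run_statuses):
        if status == "failure" and not incident_open:
            events.append((index, "incident started"))
            incident_open = True
        elif status == "success" and incident_open:
            events.append((index, "incident resolved"))
            incident_open = False
    return events


if __name__ == "__main__":
    runs = ["success", "failure", "failure", "failure", "success", "success", "failure"]
    for index, event in notification_events(runs):
        print(index, event)
```

For the run history in the example, only runs 1, 4, and 6 produce an email: the first failure, the recovery, and the start of a new incident.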
# Best practice for documenting extraction pipelines
It is good practice to enter comprehensive information about a pipeline to simplify troubleshooting and administration of pipelines. The minimum information you need to record is Data set, Name and External ID.
To monitor the pipeline status, set up email notifications for failed pipelines and add contact details for the pipeline owner and other stakeholders who need to be alerted about a failed pipeline. The switch for activating email notifications appears when you add a contact.
Select the schedule that is set up in the extractor to document how often the extractor is expected to update the data in CDF. Record the source system name and the Raw database tables to keep track of where your data is extracted from and ingested into.
Information specific to your organization can be added using the metadata fields with key/value pairs.
It is also useful to record context and other insights about the pipeline to speed up troubleshooting. Enter this as free text in the Documentation field. The text is displayed as a ReadMe section on the Extraction pipeline overview page, so keep it up to date. You can format the text using Markdown.
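
As a rough aid for reviewing how completely a pipeline is documented, the sketch below checks a pipeline description for the recommended information listed in this section. The helper `missing_recommended_fields` is hypothetical, and the field names (`contacts`, `schedule`, `source`, `rawTables`, `metadata`, `documentation`, `sendNotification`) are assumptions about how these values might be represented; all values are illustrative.

```python
# Assumed names for the recommended (non-mandatory) information.
RECOMMENDED_FIELDS = (
    "contacts",
    "schedule",
    "source",
    "rawTables",
    "metadata",
    "documentation",
)


def missing_recommended_fields(pipeline):
    """Return the recommended fields that are absent or empty."""
    return [field for field in RECOMMENDED_FIELDS if not pipeline.get(field)]


# Illustrative pipeline description; every value here is made up.
pipeline = {
    "dataSetId": 123456789,
    "name": "SAP work orders",
    "externalId": "sap-work-orders",
    "schedule": "Continuous",
    "source": "SAP",
    "rawTables": [{"dbName": "sap", "tableName": "work_orders"}],
    "contacts": [
        {
            "name": "Jane Doe",
            "email": "jane.doe@example.com",
            "role": "Owner",
            "sendNotification": True,
        }
    ],
    "metadata": {"costCenter": "1234"},
}

print(missing_recommended_fields(pipeline))  # ['documentation']
```

Here the check reports that only the free-text documentation is still missing.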