About data extraction

Extractors connect to source systems and push data in its original format to Cognite Data Fusion (CDF) as part of the data integration workflow. You can extract data with prebuilt Cognite extractors or create a custom extractor using the Python and .NET utilities packages and SDKs.

The prebuilt extractors can stream or extract data in batches to the CDF staging area or directly to a CDF service with little or no data transformation.

Note

The Cognite extractors only need read access to the source systems and never change the original data.
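As a rough illustration of read-only batch extraction, the sketch below opens a source database in read-only mode and yields query results in batches without transforming them. It is not how the Cognite extractors are implemented: SQLite stands in for a real source system, and the `readings` table, `make_demo_source`, and `extract_batches` are hypothetical names used only for this example.

```python
import os
import sqlite3
import tempfile
from typing import Dict, Iterator, List


def make_demo_source(path: str) -> None:
    """Create a small demo database that stands in for a real source system."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE readings (tag TEXT, ts INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [("pump-1", 1000 + i, float(i)) for i in range(5)],
    )
    conn.commit()
    conn.close()


def extract_batches(
    path: str, query: str, batch_size: int
) -> Iterator[List[Dict[str, object]]]:
    """Yield query results in batches, keeping column names and values as-is."""
    # mode=ro opens the database read-only, so the extraction cannot modify it.
    conn = sqlite3.connect(f"file:{path}?mode=ro", uri=True)
    try:
        cursor = conn.execute(query)
        columns = [col[0] for col in cursor.description]
        while True:
            rows = cursor.fetchmany(batch_size)
            if not rows:
                break
            yield [dict(zip(columns, row)) for row in rows]
    finally:
        conn.close()


demo_path = os.path.join(tempfile.mkdtemp(), "source.db")
make_demo_source(demo_path)
batches = list(
    extract_batches(demo_path, "SELECT tag, ts, value FROM readings ORDER BY ts", 2)
)
```

Each batch is a list of plain dictionaries, which a later step could write to a staging table unchanged.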

To integrate data into a data model, you can:

  • Use Cognite’s extractor and transformation components.
  • Use third-party extractor and transformation components.
  • Develop custom solutions.

You can also ingest data into CDF using the Cognite API or integrate Extract, Transform, Load (ETL) tools with the PostgreSQL gateway if your data is already in the cloud.
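A minimal sketch of what ingesting through the API might look like: the function below only builds an HTTP request that writes rows to a staging table and does not send it. The endpoint path and the `key`/`columns` body shape are assumptions to verify against the Cognite API reference, and the base URL, project, database, table, and token values are placeholders.

```python
import json
import urllib.request
from typing import Dict, List


def build_staging_request(
    base_url: str,
    project: str,
    database: str,
    table: str,
    rows: List[Dict[str, object]],
    key_column: str,
    token: str,
) -> urllib.request.Request:
    """Build (but do not send) a POST request that writes rows to a staging table."""
    # Assumed body shape: each row has a unique key and a free-form columns object.
    items = [{"key": str(row[key_column]), "columns": row} for row in rows]
    # Assumed endpoint path; check it against the API reference before use.
    url = f"{base_url}/api/v1/projects/{project}/raw/dbs/{database}/tables/{table}/rows"
    return urllib.request.Request(
        url,
        data=json.dumps({"items": items}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values for illustration only.
request = build_staging_request(
    base_url="https://example.invalid",
    project="my-project",
    database="source_db",
    table="work_orders",
    rows=[{"order_id": 42, "status": "open"}],
    key_column="order_id",
    token="PLACEHOLDER_TOKEN",
)
```

Sending the request (for example with `urllib.request.urlopen`) is left out on purpose, since it needs a real cluster URL and valid credentials.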

Integration architecture

Cognite extractors

| Extractor | Description |
|---|---|
| Cognite DB extractor | Connects to databases using ODBC, runs queries, and batch extracts data into the CDF staging area or directly to the CDF time series service. Make sure you have ODBC drivers for the databases you're connecting to. This extractor is available as both a Windows service and a standalone executable. |
| Cognite OPC UA extractor | Connects to sources over the open OPC UA protocol and streams time series into the CDF time series service and events into the CDF events service. It batch extracts the OPC UA node hierarchy into the CDF staging area or as CDF assets and relationships. |
| Cognite PI extractor | Connects to the PI Data Archive and streams time series into the CDF time series service. In parallel, the extractor ingests historical data (backfilling). |
| Cognite PI AF extractor | Connects to the PI Asset Framework and batch extracts Asset Framework elements and all their attributes. The extractor then builds a tree for each element in the CDF staging area. |
| Cognite PI replace utility | Re-ingests time series to CDF. Use this utility if the PI extractor can't sync PI content due to invalid or missing data points. |
| Cognite Studio For Petrel extractor | Connects to SLB Studio For Petrel and extracts the records to CDF as protobuf objects. |
| Cognite EDM extractor | Connects to the Landmark Engineer's Data Model (EDM) database and exports the data to the CDF staging area. |
| Cognite WITSML extractor | Connects to WITSML data sources and sends the data to CDF. |
| Cognite OSDU extractor | Connects to OSDU and sends the data to the CDF staging area. |
| Cognite Documentum extractor | Connects to OpenText Documentum and Documentum D2 and batch extracts documents via the Documentum Foundation Classes (DFC) library or the D2 REST API (recommended) into the CDF files service. It extracts metadata into the CDF staging area. |
| Cognite SAP extractor | Connects to SAP through OData endpoints and batch extracts data into the CDF staging area. |
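The backfilling mentioned for the PI extractor can be pictured as splitting history into bounded time windows that are read while live streaming continues. The sketch below is only an illustration of that idea, not the PI extractor's actual logic: the newest-to-oldest order, the millisecond timestamps, and the `backfill_windows` name are all assumptions.

```python
from typing import Iterator, Tuple


def backfill_windows(
    oldest_ms: int, newest_ms: int, window_ms: int
) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) windows covering [oldest_ms, newest_ms), newest first."""
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    end = newest_ms
    while end > oldest_ms:
        # Clamp the final window so it never reaches past the oldest timestamp.
        start = max(oldest_ms, end - window_ms)
        yield (start, end)
        end = start


windows = list(backfill_windows(oldest_ms=0, newest_ms=100, window_ms=40))
```

Working backwards from the point where streaming started means recent history becomes available first, and each window can be retried on its own if a read fails.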

Download installation files

  1. Navigate to Data management > Integrate > Extractors.
  2. Select the extractor best suited to your source system.
  3. Download the installation file and, optionally, the PDF documentation.
Extractor download page

Custom extractors

You can create a custom extractor if none of the prebuilt extractors supports your source system. Select Create a custom extractor to find the Python and .NET utilities packages and SDKs to get started.
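To give a feel for the shape of a custom extractor, here is a minimal sketch of an extract-and-upload loop that buffers records and hands them to an upload function in batches. It uses only the Python standard library and does not use the extractor-utils or SDK APIs: `UploadBuffer`, `run_extractor`, and the `upload` callable are hypothetical stand-ins for whatever the real packages provide.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class UploadBuffer:
    """Collect rows and pass them to `upload` in batches of at most `max_size`."""

    def __init__(self, upload: Callable[[List[Row]], None], max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._upload = upload
        self._max_size = max_size
        self._rows: List[Row] = []

    def add(self, row: Row) -> None:
        self._rows.append(row)
        if len(self._rows) >= self._max_size:
            self.flush()

    def flush(self) -> None:
        if self._rows:
            self._upload(self._rows)
            self._rows = []  # start a fresh list; the uploaded batch is not reused

    def __enter__(self) -> "UploadBuffer":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        # Flush leftover rows only when the loop finished without an error.
        if exc_type is None:
            self.flush()
        return False


def run_extractor(
    source: Iterable[Row], upload: Callable[[List[Row]], None], batch_size: int
) -> int:
    """Read every record from `source`, upload in batches, and return the count."""
    count = 0
    with UploadBuffer(upload, batch_size) as buffer:
        for record in source:
            buffer.add(dict(record))  # copy, so the source record is never modified
            count += 1
    return count


uploaded: List[List[Row]] = []
total = run_extractor(({"id": i} for i in range(5)), uploaded.append, batch_size=2)
```

In a real extractor, `upload` would be replaced by a call that writes to CDF, and `source` by a reader for the source system.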

Cognite Python extractor-utils SDK

Use the Python extractor-utils SDK, an extension of the Cognite Python SDK, to develop extractors in Python. For more information, see the Python extractor-utils SDK documentation.

Cognite .NET extractor utilities code

Use the .NET extractor utilities to develop extractors in C#. For more information, see the .NET utils documentation.