About data extraction
Extractors connect to source systems and push data in its original format to Cognite Data Fusion (CDF) as part of the data integration workflow. You can extract data with prebuilt Cognite extractors or create a custom extractor using the Python and .NET utility packages and SDKs.
The prebuilt extractors can stream or extract data in batches to the CDF staging area or directly to a CDF service with little or no data transformation.
The Cognite extractors only need read access to the source systems and never change the original data.
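The batch pattern described above can be illustrated with a minimal sketch. It is not the implementation of any Cognite extractor: an in-memory SQLite database stands in for the source system, and `read_batches`, `extract`, and the `push` callable are hypothetical names. In a real extractor, `push` would write to the CDF staging area. The sketch only runs a `SELECT` against the source and switches the connection to query-only mode, mirroring the read-only access requirement.

```python
import sqlite3
from typing import Callable, Iterator

Row = dict  # one extracted record, kept in its source shape


def read_batches(conn: sqlite3.Connection, query: str, batch_size: int) -> Iterator[list[Row]]:
    """Run a query and yield its rows in fixed-size batches."""
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    while True:
        chunk = cursor.fetchmany(batch_size)
        if not chunk:
            break
        yield [dict(zip(columns, values)) for values in chunk]


def extract(conn: sqlite3.Connection, query: str,
            push: Callable[[list[Row]], None], batch_size: int = 2) -> int:
    """Push every batch to the sink untransformed; return the number of rows sent."""
    total = 0
    for batch in read_batches(conn, query, batch_size):
        push(batch)  # hypothetical sink, for example a staging-table writer
        total += len(batch)
    return total


def demo() -> tuple[int, list[list[Row]]]:
    # Stand-in source system with three rows.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE pumps (id INTEGER, name TEXT)")
    conn.executemany("INSERT INTO pumps VALUES (?, ?)",
                     [(1, "P-101"), (2, "P-102"), (3, "P-103")])
    conn.commit()
    # From here on the connection rejects writes: the extractor only reads.
    conn.execute("PRAGMA query_only = ON")

    staged: list[list[Row]] = []  # stand-in for the staging area
    total = extract(conn, "SELECT id, name FROM pumps ORDER BY id", staged.append)
    conn.close()
    return total, staged
```

Calling `demo()` extracts the three rows in two batches and leaves the source table unchanged.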
To integrate data into a data model, you can:
- Use Cognite’s extractor and transformation components.
- Use third-party extractor and transformation components.
- Develop custom solutions.
You can also ingest data into CDF using the Cognite API or integrate Extract, Transform, Load (ETL) tools with the PostgreSQL gateway if your data is already in the cloud.
Cognite extractors
Extractor | Description |
---|---|
Cognite DB extractor | Connects to databases using ODBC, runs queries, and batch extracts data into the CDF staging area or directly to the CDF time series service. Make sure you have ODBC drivers for the databases you're connecting to. This extractor is available as both a Windows service and a standalone executable. |
Cognite OPC UA extractor | Connects to OPC UA servers over the open OPC UA protocol and streams time series into the CDF time series service and events into the CDF events service. It batch extracts the OPC UA node hierarchy into the CDF staging area or as CDF assets and relationships. |
Cognite PI extractor | Connects to the PI Data archive and streams time series into the CDF time series service. In parallel, the extractor ingests historical data (backfilling). |
Cognite PI AF extractor | Connects to the PI Asset Framework and batch extracts Asset Framework elements and all their attributes. The extractor then builds a tree for each element in the CDF staging area. |
Cognite PI replace utility | Re-ingests time series to CDF. Use this utility if the PI extractor can't sync PI content due to invalid or missing data points. |
Cognite Studio For Petrel extractor | Connects to SLB Studio For Petrel and extracts the records to CDF as protobuf objects. |
Cognite EDM extractor | Connects to the Landmark Engineer's Data Model (EDM) database and exports the data to the CDF staging area. |
Cognite WITSML extractor | Connects to WITSML data sources and sends the data to CDF. |
Cognite OSDU extractor | Connects to OSDU and sends the data to the CDF staging area. |
Cognite Documentum extractor | Connects to OpenText Documentum and Documentum D2 and batch extracts documents via the Documentum Foundation Classes (DFC) library or the D2 REST API (recommended) into the CDF files service. It extracts metadata into the CDF staging area. |
Cognite SAP extractor | Connects to SAP through OData endpoints and batch extracts data into the CDF staging area. |
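The PI extractor row above mentions streaming new data while backfilling history in parallel. As a rough illustration of the backfilling idea only, the sketch below plans backfill windows that walk backward in time from the moment streaming started. This is an assumption about how such planning could look, not the PI extractor's actual algorithm; `backfill_windows` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterator


def backfill_windows(earliest: datetime, start: datetime,
                     step: timedelta) -> Iterator[tuple[datetime, datetime]]:
    """Yield (window_start, window_end) pairs from `start` back to `earliest`.

    Streaming is assumed to cover everything after `start`; these windows
    cover the history before it, newest first.
    """
    if step <= timedelta(0):
        raise ValueError("step must be positive")
    end = start
    while end > earliest:
        begin = max(earliest, end - step)  # clamp the last window to `earliest`
        yield begin, end
        end = begin


def demo() -> list[tuple[datetime, datetime]]:
    start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)    # streaming began here
    earliest = datetime(2024, 1, 1, 7, 0, tzinfo=timezone.utc)  # oldest data wanted
    return list(backfill_windows(earliest, start, timedelta(hours=2)))
```

With the values in `demo()`, the five hours of history are split into three contiguous windows, the last one shortened so that it ends exactly at `earliest`.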
Download installation files
- Navigate to Data management > Integrate > Extractors.
- Select the extractor best suited to your source system.
- Download the Installation file and, optionally, the PDF documentation.
Custom extractors
You can create a custom extractor if none of the prebuilt extractors supports your source system. Select Create a custom extractor to find the Python and .NET utils packages and SDKs you need to get started.
Cognite Python extractor-utils SDK
Use the Python extractor-utils SDK, an extension of the Cognite Python SDK, to develop extractors in Python. For more information, see the Python extractor-utils SDK documentation.
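A custom extractor typically buffers the records it reads and uploads them in batches rather than one at a time. The sketch below illustrates that pattern with the standard library only. It does not use or reproduce the extractor-utils API: `BatchingQueue` and the `upload` callable are hypothetical, and in a real extractor `upload` would call the Cognite Python SDK.

```python
from typing import Callable


class BatchingQueue:
    """Buffers rows and hands them to `upload` in batches (hypothetical helper)."""

    def __init__(self, upload: Callable[[list[dict]], None], max_size: int) -> None:
        if max_size < 1:
            raise ValueError("max_size must be at least 1")
        self._upload = upload
        self._max_size = max_size
        self._buffer: list[dict] = []

    def add(self, row: dict) -> None:
        """Queue one row and upload automatically when the buffer is full."""
        self._buffer.append(row)
        if len(self._buffer) >= self._max_size:
            self.flush()

    def flush(self) -> None:
        """Upload whatever is buffered; do nothing if the buffer is empty."""
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._upload(batch)

    def __enter__(self) -> "BatchingQueue":
        return self

    def __exit__(self, *exc_info) -> None:
        # Flush the remainder on exit so no buffered rows are left behind.
        self.flush()


def demo() -> list[list[dict]]:
    uploaded: list[list[dict]] = []  # stand-in for the upload target
    with BatchingQueue(uploaded.append, max_size=2) as queue:
        for key in range(5):
            queue.add({"key": key})
    return uploaded
```

In `demo()`, five rows with a batch size of two produce three uploads: two full batches and a final partial batch flushed when the `with` block exits.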
Cognite .NET extractor utilities code
Use the .NET extractor utils to develop extractors in C#. For more information, see the .NET utils documentation.