Extract data

Extractors connect to source systems and push data in its original format to a staging area as part of the data integration workflow. From the staging area, the data is transformed to the CDF data model and stored in Cognite Data Fusion (CDF). Our extractors require only read access to the source systems and never modify the original data.

Data extractors operate in different modes: they can stream data or extract it in batches to the staging area. They can also extract data directly to the CDF data model with little or no transformation.
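
To make the difference between the two modes concrete, here is a minimal sketch in plain Python. It is not taken from any Cognite extractor: the function names, the `staging` list, and the sample rows are all hypothetical stand-ins, and a real staging area would be a remote service rather than a list.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]


def extract_in_batches(source: Iterable[Row], batch_size: int) -> Iterator[List[Row]]:
    """Batch mode: group source rows into lists of at most batch_size rows."""
    iterator = iter(source)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def extract_as_stream(source: Iterable[Row]) -> Iterator[Row]:
    """Streaming mode: forward each row as soon as it is read."""
    for row in source:
        yield row


# Hypothetical staging area: a plain list standing in for a staging table.
staging: List[Row] = []
source_rows = [{"tag": "pump-1", "value": v} for v in range(5)]

# Batch mode: the five rows arrive in three writes (2 + 2 + 1 rows).
batch_sizes = []
for batch in extract_in_batches(source_rows, batch_size=2):
    staging.extend(batch)
    batch_sizes.append(len(batch))

# Streaming mode: the same rows arrive one at a time.
streamed = list(extract_as_stream(source_rows))
```

Both functions only read from the source iterable, so the source rows are left untouched in either mode.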

Integration architecture

We divide source systems into two main types:

  • OT source systems - for example, industrial control systems with time series data. Getting OT data into CDF can be time-critical (within a few seconds), and the data often needs to be extracted continuously.

  • IT source systems - for example, ERP systems, file servers, databases, and engineering systems (3D CAD models). IT data typically changes less frequently (every few minutes or hours) than OT data and can often be extracted in batch jobs.
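
One way to express this distinction in code is to pick the extraction mode from how often the source data changes. The sketch below is illustrative only: the `SourceSystem` class, the `choose_mode` function, and the 60-second threshold are assumptions made for this example, not settings of any Cognite extractor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceSystem:
    name: str
    change_interval_seconds: float  # how often the source data typically changes


def choose_mode(source: SourceSystem, streaming_threshold_seconds: float = 60.0) -> str:
    """Return "stream" for fast-changing sources, otherwise "batch".

    The default threshold is an arbitrary value chosen for illustration.
    """
    if source.change_interval_seconds < streaming_threshold_seconds:
        return "stream"
    return "batch"


# An OT source that changes every second and an IT source that changes hourly.
control_system_mode = choose_mode(SourceSystem("control-system", 1.0))
erp_mode = choose_mode(SourceSystem("erp", 3600.0))
```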

Standard extractors

We provide extractors for both OT and IT source systems. You can also create your own custom extractor.

Extractor | Capabilities
Cognite DB Extractor | The DB Extractor can connect to most databases that support SQL, for example Oracle, MySQL, or PostgreSQL.
Cognite Documentum Extractor | OpenText Documentum is a document management system that is widely used in the oil & gas industry. The Documentum Extractor supports both "DFC" and "D2" extraction modes.
Cognite PI Extractor | OSIsoft PI is widely used in the oil & gas industry and in other industries. The extractor connects to a PI Server to extract time series.
Cognite OPC-UA Extractor | OPC-UA is an open protocol that is widely used in the industrial and manufacturing segments.

Custom extractors

If none of the extractors above supports your source system, or there is no standard way to extract data from the system, you need a custom extractor. Cognite has extensive experience in building custom extractors, and we provide SDKs to help you get started.
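
The overall shape of a custom extractor can be sketched with the standard library alone. The example below is a rough outline under stated assumptions, not a Cognite implementation: `StagingSink`, `InMemorySink`, `read_source`, and `run_extractor` are hypothetical names, the source is CSV text, and a real extractor would replace the in-memory sink with calls to an SDK client.

```python
import csv
import io
from typing import Dict, Iterator, List, Protocol


class StagingSink(Protocol):
    """Hypothetical destination interface; a real extractor would wrap an SDK client."""

    def write_rows(self, table: str, rows: List[Dict[str, str]]) -> None: ...


class InMemorySink:
    """Stand-in sink that keeps staged rows in a dictionary keyed by table name."""

    def __init__(self) -> None:
        self.tables: Dict[str, List[Dict[str, str]]] = {}

    def write_rows(self, table: str, rows: List[Dict[str, str]]) -> None:
        self.tables.setdefault(table, []).extend(rows)


def read_source(text: str) -> Iterator[Dict[str, str]]:
    """Read rows from the source without altering it (here, CSV text)."""
    yield from csv.DictReader(io.StringIO(text))


def run_extractor(source_text: str, sink: StagingSink, table: str, batch_size: int = 100) -> int:
    """Push source rows to the sink in batches and return the number of rows sent."""
    batch: List[Dict[str, str]] = []
    total = 0
    for row in read_source(source_text):
        batch.append(row)
        if len(batch) >= batch_size:
            sink.write_rows(table, batch)
            total += len(batch)
            batch = []
    if batch:  # flush the final, partially filled batch
        sink.write_rows(table, batch)
        total += len(batch)
    return total


sink = InMemorySink()
csv_text = "tag,value\npump-1,4.2\npump-2,3.9\npump-3,5.1\n"
sent = run_extractor(csv_text, sink, "sensor_readings", batch_size=2)
```

Keeping the sink behind a small interface means the read and batching logic can be tested without a connection to CDF.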

The Python extractor-utils SDK

The Python extractor-utils SDK is an extension of the Cognite Python SDK, and makes it easier to develop custom data extractors for CDF.

For detailed information, see the Python extractor-utils SDK documentation.

To download and install Python, visit Python.org.

Last Updated: 4/3/2020, 12:39:41 PM