
Product tour

This product tour provides a high-level overview of the Cognite Data Fusion (CDF) architecture and the main steps to fast-track your implementation.

Cognite Data Fusion (CDF) is a platform for contextualization and DataOps:

  • Contextualization combines machine learning, artificial intelligence, and domain knowledge to map resources from different source systems to each other in your industrial knowledge graph.

  • DataOps is a set of tools and practices to manage your data lifecycle through collaboration and automation.

Architecture

Cognite Data Fusion (CDF) runs in the cloud and has a modular design.

[Image: The CDF architecture]

You can interact with your data through dedicated workspaces in the CDF web application, or with our APIs and SDKs.
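As a quick, hedged illustration of the SDK path, the sketch below instantiates a client with the Cognite Python SDK. The project name, cluster URL, and identity provider values are placeholders you would replace with your own configuration.

```python
# A minimal sketch of connecting to CDF with the Cognite Python SDK
# (pip install cognite-sdk). All project, cluster, and IdP values below
# are placeholders -- substitute your own configuration.
from cognite.client import CogniteClient, ClientConfig
from cognite.client.credentials import OAuthClientCredentials

credentials = OAuthClientCredentials(
    token_url="https://login.microsoftonline.com/<tenant-id>/oauth2/v2.0/token",  # your IdP token endpoint
    client_id="<app-client-id>",
    client_secret="<app-client-secret>",
    scopes=["https://api.cognitedata.com/.default"],
)

client = CogniteClient(
    ClientConfig(
        client_name="product-tour-example",
        project="<cdf-project>",
        base_url="https://api.cognitedata.com",
        credentials=credentials,
    )
)

print(client.iam.token.inspect())  # verify that the client can authenticate
```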

The following sections introduce the main steps of a CDF implementation and how they relate to the different CDF modules.

Step 1: Set up data management

When making decisions, you need to be confident that the data you rely on is trustworthy.

Before integrating and contextualizing data in CDF, you must define and implement your data governance policies. We recommend appointing a CDF admin to work with the IT department to ensure that CDF follows your organization's security practices. Connect CDF to your identity provider (IdP), and use the existing user identities to manage access to the CDF tools and data.
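As a sketch of what identity-based access management can look like in practice, the example below maps an IdP group to a CDF access group with the Python SDK, reusing the `client` from the connection sketch above. The group name, source ID, and capability scope are hypothetical, and the exact data classes and capability format may vary between SDK versions.

```python
# A hedged sketch of mapping an IdP group to a CDF access group. The
# source ID and capability scope are hypothetical; align them with your
# own IdP and governance policies.
from cognite.client.data_classes import Group

readers = Group(
    name="timeseries-readers",
    source_id="<idp-group-object-id>",  # the group's ID in your identity provider
    capabilities=[
        {"timeSeriesAcl": {"actions": ["READ"], "scope": {"all": {}}}},
    ],
)
client.iam.groups.create(readers)
```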

To build applications on top of the data in CDF, you depend on a well-defined data model to make assumptions about the data structure. CDF has out-of-the-box data models to build a structured, flexible, contextualized knowledge graph.
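For illustration, the minimal sketch below populates a tiny hierarchy using the built-in asset resource type, again with the Python SDK client from above. The external IDs and names are invented.

```python
# A minimal, hypothetical example of populating a structured hierarchy
# using the SDK's built-in Asset resource type. External IDs and names
# are invented for illustration.
from cognite.client.data_classes import Asset

assets = [
    Asset(external_id="plant-01", name="Plant 01"),
    Asset(external_id="pump-23", name="Pump 23", parent_external_id="plant-01"),
]
client.assets.create(assets)
```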

Step 2: Integrate data

With data governance policies in place, you can add data from your IT, OT, and engineering technology (ET) sources into CDF. These sources include industrial control systems supplying sensor data, ERP systems, and engineering systems with massive 3D CAD models.

Extract data

With read access to the data sources, you can set up the system integration to stream data into the CDF staging area, where it can be normalized and enriched. We support standard protocols and interfaces like PostgreSQL and OPC-UA to facilitate data integration with your existing ETL tools and data warehouse solutions.

We offer extractors tailored to specific source systems, and standard ETL tools work with most databases. This approach lets us keep the logic in the extractors to a minimum and instead run and re-run transformations on the data in the cloud.
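As an illustration of the staging step, the hedged sketch below lands two extracted rows in a CDF RAW staging table with the Python SDK. The database name, table name, and row contents are placeholders for your own source data.

```python
# A hedged sketch of landing extracted source data in the CDF staging
# area (RAW) in its original shape. Database, table, and row keys are
# placeholders; reuses the `client` from the connection sketch above.
rows = {
    "WMT-001": {"tag": "WMT-001", "description": "Wellhead master valve", "source": "sap"},
    "WMT-002": {"tag": "WMT-002", "description": "Wellhead wing valve", "source": "sap"},
}
client.raw.rows.insert(
    db_name="staging_sap",
    table_name="equipment",
    row=rows,
    ensure_parent=True,  # create the database and table if they don't exist
)
```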

Transform data

The data is stored in its original format in the CDF staging area. You can run and re-run transformations on your data in the cloud and reshape it to fit the CDF data model.

Decoupling the extraction and transformation steps makes it easier to maintain the data pipelines and reduces the load on the source systems. We recommend transforming the data using your existing ETL tools. We also offer the CDF Transformation tool as an alternative for lightweight transformation jobs.
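The conceptual sketch below illustrates the reshape step with the Python SDK: it reads the staged rows from the earlier placeholder table and turns them into assets. In practice you would express this logic as a CDF Transformation or in your ETL tool rather than in application code; the database, table, and field names are the placeholders used above.

```python
# A conceptual sketch of the transform step: read staged rows from RAW
# and reshape them into a CDF resource type. The database, table, and
# column names match the hypothetical staging example above.
from cognite.client.data_classes import Asset

staged = client.raw.rows.list(db_name="staging_sap", table_name="equipment", limit=None)
assets = [
    Asset(
        external_id=row.key,
        name=row.columns["tag"],
        description=row.columns["description"],
        source=row.columns["source"],
    )
    for row in staged
]
client.assets.create(assets)
```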

Enhance data

The automatic and interactive contextualization tools in CDF let you combine artificial intelligence, machine learning, a powerful rules engine, and domain expertise to map resources from different source systems to each other in the CDF data model. Start with automated contextualization, then let domain experts validate and fine-tune the results.
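As a hedged sketch of the automated part of this workflow, the example below uses the entity matching API to suggest links from time series to assets and flags low-confidence suggestions for expert review. The match fields, feature type, and 0.8 threshold are illustrative choices, and exact call signatures may vary between SDK versions.

```python
# A hedged sketch of machine-learning-based contextualization with the
# entity matching API: suggest links from time series to assets, then
# leave the low-confidence suggestions for domain experts to review.
sources = client.time_series.list(limit=1000).dump()
targets = client.assets.list(limit=1000).dump()

model = client.entity_matching.fit(
    sources=sources,
    targets=targets,
    match_fields=[("name", "name")],  # compare time series names against asset names
    feature_type="bigram",
)
job = model.predict()
matches = job.result["items"]

# Treat scores above an illustrative threshold as confident; route the rest to experts.
confident = [m for m in matches if m["matches"] and m["matches"][0]["score"] > 0.8]
print(f"{len(confident)} of {len(matches)} suggestions above threshold; review the rest manually")
```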

Step 3: Consume data and build solutions

With complete and contextualized data in your industrial knowledge graph, you can use the built-in industrial tools and build powerful apps and AI agents to meet your business needs.

All the information stored in CDF is available through our REST-based API. Cognite also provides connectors and SDKs for common programming languages and analytics tools, like Python, JavaScript, Spark, OData (Excel, Power BI), and Grafana. We also offer community SDKs for Java, Scala, Rust, and .NET.
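For example, a hedged sketch of consuming data through the Python SDK: retrieve a month of datapoints for a (hypothetical) time series and hand them to pandas for analysis.

```python
# A small, hypothetical example of consuming contextualized data through
# the Python SDK. The time series external ID is a placeholder.
datapoints = client.time_series.data.retrieve(
    external_id="pump-23-discharge-pressure",
    start="30d-ago",
    end="now",
)
df = datapoints.to_pandas()
print(df.describe())
```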

The Functions service provides a scalable, secure, and automated way to host and run Python code.
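As a hedged sketch, the example below defines a handler following the Functions convention of receiving an authenticated client and a call payload, then deploys it with the Python SDK. The external ID, payload field, and handler logic are illustrative, and deployment details may vary between SDK versions.

```python
# A hedged sketch of a Cognite Function: Python code hosted and run in CDF.
# The handler receives an authenticated client and the input payload.
def handle(client, data):
    """Summarize a (hypothetical) asset passed in the call payload."""
    asset = client.assets.retrieve(external_id=data["asset_external_id"])
    linked = client.time_series.list(asset_external_ids=[asset.external_id], limit=None)
    return {"asset": asset.name, "time_series_count": len(linked)}


# Deploy the handler defined above; `client` is the SDK client from the earlier sketch.
client.functions.create(
    name="asset-summary",
    external_id="asset-summary",
    function_handle=handle,
)
```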