Data modeling

Beta

The features described in this section are currently in beta testing and are subject to change. We recommend that you don't use the features in production systems.

For more information, request to join the data modeling group on the Cognite Hub.

info

If you want to go directly to the specifications of the data modeling capabilities, go to the data modeling specifications.

Data modeling makes it easier for developers, data architects, business analysts, and other stakeholders to view and understand relationships between data objects.

Resource types

A data model organizes data objects and standardizes the properties of real-world entities and how they relate to one another. Data models are the core of an ontology, a knowledge graph, or an industry standard and are crucial in building solutions like data science models, mobile apps, and web apps.

In Cognite Data Fusion (CDF), data models organize industrial data into resource types that let you define the data elements, specify their properties, and model the relationships between them. The resource types are used both to store and to organize data.
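
As a rough illustration of defining data elements, their properties, and a relationship between them, the sketch below uses plain Python dataclasses. It is not the CDF API, and the names `Facility`, `Pump`, and `PumpStatus` are hypothetical, chosen only for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class PumpStatus(Enum):
    """Hypothetical enum-valued property."""
    RUNNING = "running"
    STOPPED = "stopped"


@dataclass
class Facility:
    """Hypothetical data element describing a site."""
    name: str


@dataclass
class Pump:
    """Hypothetical data element with text, number, enum, and list properties."""
    tag: str
    max_pressure_bar: float
    status: PumpStatus
    labels: List[str] = field(default_factory=list)
    # Relationship: a pump may be located in a facility.
    located_in: Optional[Facility] = None


site = Facility(name="Plant A")
pump = Pump(
    tag="P-101",
    max_pressure_bar=16.0,
    status=PumpStatus.RUNNING,
    labels=["critical"],
    located_in=site,
)

# Following the relationship from the pump to its facility.
print(pump.located_in.name)
```

Here the class definitions play the role of the model (what elements exist and which properties they carry), while `site` and `pump` are the data stored according to it.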

tip

Visit the Getting started with data modeling course to quickly learn how to create, populate, and query a data model in CDF with a sample set of data.

The list below outlines the core characteristics you should aim for when you model your data:

  • Explicitness - An explicit data model provides a contract and interface between data providers and data consumers. Data consumers can understand and use the underlying data correctly, and both groups can use the data model as a shared context and vocabulary to communicate about the data.

  • Flexibility - Data models must be flexible and customizable. When you're building a solution, like an application or a data science model, the complexity will change over time, forcing the data model to scale with it.

    Larger data models, often ontologies or information models, require governed versions of the data model to allow you to create and test different models from different perspectives at the same time.

    The level of customization must also support different ways of expressing the data: not only data types built from text, number, enum, and list fields, but also complex relationships between the data types.

  • Governance - Different versions and models may depend on the same underlying data as you iterate on your data models. To prevent the versions and models from breaking, you must govern any structural changes to the underlying data.

    Also, data models themselves require governance. When building applications, you often need to test the data models before you publish a new version. Larger data models mapping ontologies can have various levels of business processes attached. For example, modeling industry standards could involve an in-depth review before you iterate on a model.

  • Accessibility - Data modeling does not provide any value unless data within the data model is easily accessible for ingestion and consumption.

    Data ingestion into the data model must be efficient and include quality assurance through monitoring and alerts. This improves error handling and makes it easier to work with ETL tools and move data.
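
The explicitness and ingestion quality-assurance points above can be sketched as a validation step that checks incoming rows against a declared contract and reports rejections. This is a minimal sketch in plain Python, not CDF functionality: `PUMP_SCHEMA`, `validate_row`, and `ingest` are hypothetical names, and the standard `logging` module stands in for a real monitoring and alerting system.

```python
import logging
from typing import Any, Dict, List, Tuple

logger = logging.getLogger("ingestion")

# Hypothetical contract: property name -> expected Python type.
PUMP_SCHEMA: Dict[str, type] = {"tag": str, "max_pressure_bar": float}

Row = Dict[str, Any]


def validate_row(row: Row, schema: Dict[str, type]) -> List[str]:
    """Return a list of problems; an empty list means the row matches the schema."""
    problems: List[str] = []
    for name, expected in schema.items():
        if name not in row:
            problems.append(f"missing property '{name}'")
        elif not isinstance(row[name], expected):
            problems.append(f"'{name}' should be {expected.__name__}")
    return problems


def ingest(
    rows: List[Row], schema: Dict[str, type]
) -> Tuple[List[Row], List[Tuple[Row, List[str]]]]:
    """Split rows into accepted and rejected, logging a warning per rejection."""
    accepted: List[Row] = []
    rejected: List[Tuple[Row, List[str]]] = []
    for row in rows:
        problems = validate_row(row, schema)
        if problems:
            # Stand-in for an alert raised by a monitoring system.
            logger.warning("rejected %r: %s", row, "; ".join(problems))
            rejected.append((row, problems))
        else:
            accepted.append(row)
    return accepted, rejected


accepted, rejected = ingest(
    [{"tag": "P-101", "max_pressure_bar": 16.0}, {"tag": "P-102"}],
    PUMP_SCHEMA,
)
```

Keeping rejected rows together with their reasons, rather than silently dropping them, is what lets data providers see why a row failed and correct it at the source.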