Prerequisites: This document assumes that the reader is familiar with the Cognite Data Fusion (CDF) data modeling concepts Space, Data Model, and View. See Containers, views, and data models for details.
Principles
The following principles guide data modeling decisions. They emphasize purpose, collaboration, and simplicity.
Design data models for real use cases
Data models should be designed for real use cases, not for modeling alone. Data modeling can become an academic exercise focused on naming and relationships rather than practical value. Extremely detailed models that mirror the real world one-to-one often perform poorly and are understood only by their authors. Real use cases ground the effort in reality and ensure the model serves the organization. Start with three business questions and drive the modeling process from them.
Emphasize cooperation in data modeling
The essence of data modeling is cooperation: people with different backgrounds and perspectives working together to build a shared understanding of the business and its data. Expect friction as the process surfaces the views held in different parts of the organization and makes them concrete. Working through this friction creates the shared understanding that unlocks collaboration and helps answer critical business questions.
Keep data models parsimonious
Data models should be as simple as possible, but not simpler. This parsimony is a fundamental principle in science and engineering. For an Enterprise Data Model, include all concepts shared across the organization and exclude those relevant to only one business area.
Best practices
The following best practices apply the principles above to organizational structure and data model design.
Establish a data governance team
Establish a data governance team to ensure implementation of the best practices. This follows from the principle of cooperation. The team should be cross-functional with representatives from all relevant parts of the organization. This anchors the Enterprise Data Model in the organization so it will be used and maintained.
Create one Enterprise Data Model
Create one Enterprise Data Model shared across the organization. This follows from the principle of cooperation. One main data model forces the organization to cooperate and serves as the foundation for collaboration. It encodes implicit knowledge explicitly, making it easier to discuss, onboard new employees and partners, and make decisions. The Enterprise Data Model becomes a common language across business units and subdomains.
Create solution data models per business area
Each business area can create one or more Solution Data Models based on the Enterprise Data Model. This follows from cooperation and parsimony. The Enterprise Data Model develops slowly and can become large and hard to use for practical applications. A Solution Data Model uses a subset of the Enterprise Data Model and can extend it with additional concepts. This lets business areas move faster, adapt to their needs, and test new concepts before integrating them into the Enterprise Data Model. Solution models can have higher fidelity for their subdomain. At the Enterprise level, all parties must agree on naming conventions; at the Solution level, a business unit can use subdomain-specific naming and has more freedom.
Reference only the Enterprise Data Model from solution models
Solution Data Models should always reference the Enterprise Data Model, not another Solution Data Model. This follows from cooperation and parsimony. The trade-off is between cooperation and speed: a single data model would slow development, while fully separate models per business area would create silos that isolate data and knowledge. Allowing Solution Data Models to build on other Solution Data Models increases complexity and can lead to siloing through obfuscation. Using the Enterprise Data Model for shared concepts enforces discussion when new concepts are introduced, anchoring them in the organization so other business areas can build on them.
Keep each data model in its own Space
Keep each data model in its own Space; do not share Spaces across data models.
This follows from parsimony. A dedicated Space clarifies which concepts belong to the data model and simplifies access control for the owning team. It also establishes a clear portfolio of data products the organization owns.
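A governance team could check the one-Space-per-data-model rule automatically. The following is a minimal sketch, assuming the list of data models has already been exported into plain Python objects; `DataModelRef` and `find_shared_spaces` are hypothetical names used for illustration and are not part of the CDF SDK.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DataModelRef:
    """Hypothetical, simplified reference to a data model (not an SDK class)."""
    space: str
    external_id: str


def find_shared_spaces(models):
    """Return {space: sorted external IDs} for every Space that holds more than one data model."""
    by_space = defaultdict(set)
    for model in models:
        # A set ignores repeated entries, so several versions of the
        # same data model in one Space are not reported as a conflict.
        by_space[model.space].add(model.external_id)
    return {
        space: sorted(ids)
        for space, ids in by_space.items()
        if len(ids) > 1
    }


models = [
    DataModelRef("enterprise", "EnterpriseModel"),
    DataModelRef("maintenance", "MaintenanceSolution"),
    DataModelRef("maintenance", "InspectionSolution"),  # shares a Space: flagged
]
shared = find_shared_spaces(models)
print(shared)
```

An empty result means every data model has a dedicated Space. In this example the `maintenance` Space is reported because it holds two different data models.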
Keep Views in the same version and Space as the data model
Keep all Views in the same version and Space as the data model.
This follows from parsimony. Although a data model can contain Views from different Spaces and versions, mixing versions causes confusion. All Views should be in the same Space and controlled by the team that owns the data model. To reuse a View from the Enterprise Data Model in a solution model, use the implements option. This lets you extend the View in the solution without affecting the Enterprise Data Model.
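The Space, version, and implements rules can be expressed as a simple consistency check. The sketch below is illustrative only: `ViewRef`, `View`, `DataModel`, and `check_model` are hypothetical plain-Python stand-ins, not SDK classes, and it assumes that a View may implement Views from its own Space or from the Enterprise Data Model's Space, but not from another solution's Space.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ViewRef:
    """Hypothetical view identifier: Space, external ID, and version."""
    space: str
    external_id: str
    version: str


@dataclass
class View:
    ref: ViewRef
    implements: list = field(default_factory=list)  # list of ViewRef


@dataclass
class DataModel:
    space: str
    external_id: str
    version: str
    views: list = field(default_factory=list)  # list of View


def check_model(model, enterprise_space):
    """Return human-readable violations of the Space, version, and implements rules."""
    problems = []
    for view in model.views:
        ref = view.ref
        if ref.space != model.space:
            problems.append(f"{ref.external_id}: in Space {ref.space!r}, expected {model.space!r}")
        if ref.version != model.version:
            problems.append(f"{ref.external_id}: version {ref.version!r}, expected {model.version!r}")
        for parent in view.implements:
            # Assumption: only the model's own Space and the enterprise Space are valid sources.
            if parent.space not in (model.space, enterprise_space):
                problems.append(
                    f"{ref.external_id}: implements {parent.external_id} from Space {parent.space!r}"
                )
    return problems


solution = DataModel(
    space="maintenance",
    external_id="MaintenanceSolution",
    version="v1",
    views=[
        # Extends an enterprise View through implements: allowed.
        View(ViewRef("maintenance", "Pump", "v1"),
             implements=[ViewRef("enterprise", "Equipment", "v1")]),
        # Version differs from the data model version: flagged.
        View(ViewRef("maintenance", "WorkOrder", "v2")),
        # Builds on another solution's View: flagged.
        View(ViewRef("maintenance", "Valve", "v1"),
             implements=[ViewRef("operations", "Valve", "v1")]),
    ],
)
problems = check_model(solution, enterprise_space="enterprise")
for problem in problems:
    print(problem)
```

The implemented enterprise View is not required to share the solution model's version; only the Views that the solution model itself contains are checked against its Space and version.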
Keep the Enterprise Data Model as small as possible, but not smaller
Keep the Enterprise Data Model as small as possible, but not smaller. This follows from cooperation and parsimony. Each concept has a cost: it must be discussed and anchored in the organization. Exclude concepts that are not shared; they can elicit strong opinions without grounding and waste time. As the organization evolves, avoid changes that cascade to all Solution Data Models. Introducing a concept too early reduces future flexibility; changing it later is much more costly.
Mature organizations typically maintain three Enterprise Data Model versions: legacy (being phased out), current (used by most solution models), and future (in development). Aim for forward-compatible models to enable smooth migration.
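To plan migrations, it can help to track which Enterprise Data Model version each solution model builds on. The following is a rough sketch under stated assumptions: the version labels, the solution model names, and the `models_to_migrate` helper are all hypothetical and exist only for illustration.

```python
# Hypothetical lifecycle stage of each Enterprise Data Model version.
STAGES = {"v1": "legacy", "v2": "current", "v3": "future"}

# Hypothetical mapping: solution model name -> Enterprise Data Model version it references.
SOLUTION_MODELS = {
    "MaintenanceSolution": "v1",
    "ProductionSolution": "v2",
    "DrillingSolution": "v3",
}


def models_to_migrate(solution_models, stages):
    """Return the sorted names of solution models that still reference a legacy version."""
    return sorted(
        name
        for name, enterprise_version in solution_models.items()
        if stages.get(enterprise_version) == "legacy"
    )


pending = models_to_migrate(SOLUTION_MODELS, STAGES)
print(pending)
```

Solution models in the returned list are candidates for migration to the current version before the legacy version is retired.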