Diagram parsing for data modeling

Beta

The features described in this section are currently in beta testing and are only available to selected customers.

Diagram parsing analyzes high-quality vectorized and rasterized engineering diagrams to detect symbols and tags and their connections. It automatically maps and stores the information to resources in the Cognite Data Fusion (CDF) knowledge graph. You can review and verify the automatic mappings as part of the process.

In vectorized diagrams, the files contain SVG elements, whereas in rasterized diagrams, the files are scanned PDFs.

Access capabilities

See access capabilities to add the necessary capabilities for diagram parsing workflows.

Concepts

Library

A library is a collection of symbols for automatic and manual detection of objects in an engineering diagram. Users create and modify libraries that are available only in their CDF project. The user must specify a library before running a parsing job.

Tip

Make sure to give meaningful and unique names when you create libraries.

Template

A template is also a collection of symbols. Cognite provides templates, which are available by default for every project. They're read-only but can be cloned as project-specific libraries.

We recommend starting with a template, since it gives you a ready-made set of symbols to work with instead of building a library from scratch.

Symbol

A symbol is a blueprint to detect a particular type of equipment, such as a valve, instrument, or pipe. A symbol always exists within a library and contains one or more variants. Symbols can be mapped to assets and files.

Variant

A variant defines one of the many possible geometric compositions of a symbol. A specific symbol, such as a valve, can be drawn with different combinations of SVG paths, and each combination is saved as a variant.

Adding multiple variants to each symbol makes automated detection more accurate by providing the algorithm with more examples. This approach also accounts for situations where a single symbol has multiple visual representations or minor visual differences exist across different files.
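The relationship between libraries, templates, symbols, and variants described above can be pictured as a small data model. The following is only an illustrative sketch: the class names, fields, and the `clone` method are hypothetical and don't reflect the actual CDF data model or any Cognite API.

```python
import copy
from dataclasses import dataclass, field
from typing import List


@dataclass
class Variant:
    """One geometric composition of a symbol (hypothetical: a list of SVG path strings)."""
    svg_paths: List[str]


@dataclass
class Symbol:
    """A blueprint for one type of equipment; it contains one or more variants."""
    name: str
    variants: List[Variant]

    def __post_init__(self) -> None:
        if not self.variants:
            raise ValueError("a symbol contains one or more variants")


@dataclass
class Library:
    """A collection of symbols. Templates are modeled as read-only libraries."""
    name: str
    symbols: List[Symbol] = field(default_factory=list)
    read_only: bool = False

    def add_symbol(self, symbol: Symbol) -> None:
        if self.read_only:
            raise PermissionError("templates are read-only; clone one first")
        self.symbols.append(symbol)

    def clone(self, new_name: str) -> "Library":
        # Deep copy so that edits to the clone never touch the template.
        return Library(new_name, copy.deepcopy(self.symbols), read_only=False)


# Clone a template into a project-specific library, then add a second variant.
template = Library(
    "Template", [Symbol("Valve", [Variant(["M0 0 L10 10"])])], read_only=True
)
project_library = template.clone("Plant A symbols")
project_library.symbols[0].variants.append(Variant(["M0 0 L10 0 L5 8 Z"]))
```

In this sketch the template keeps its single variant while the cloned library gains a second one, which mirrors the idea that templates stay unchanged and project libraries evolve.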

Diagram

A diagram is a single page of a parsed file that contains all the detected symbols and their connections. Each diagram is created with a particular library.

In the engineering diagram, colors indicate whether content has been parsed, detected as a symbol, or mapped to an asset:

  • Black: No parsing
  • Purple: Symbols detected
  • Blue: Symbols mapped to assets
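The color legend above amounts to a simple precedence rule. As a minimal sketch, assuming that a mapped element is always also a detected symbol (the function name and parameters are hypothetical, not part of CDF):

```python
def highlight_color(detected: bool, mapped: bool) -> str:
    """Return the legend color for one element of a diagram."""
    if mapped:
        # Assumption: mapping takes precedence over detection.
        return "blue"
    if detected:
        return "purple"
    # Nothing was parsed or detected for this element.
    return "black"
```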

During the parsing process, the diagram is initially placed in a queue with the in queue status. Once a user picks it up, the status changes to parsing. After the parsing is completed, the status updates to parsed. If an error occurs, the status changes to failed, and the diagram will include a message about the error details.
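The status lifecycle described above can be read as a small state machine. The sketch below is illustrative only: the class and field names are hypothetical, and it assumes that a failure happens while parsing and that rerunning a diagram starts a new job rather than reviving an old one.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ParsingStatus(Enum):
    IN_QUEUE = "in queue"
    PARSING = "parsing"
    PARSED = "parsed"
    FAILED = "failed"


# Assumed transitions: a queued diagram starts parsing, then either
# completes or fails. Parsed and failed are treated as final states.
ALLOWED_TRANSITIONS = {
    ParsingStatus.IN_QUEUE: {ParsingStatus.PARSING},
    ParsingStatus.PARSING: {ParsingStatus.PARSED, ParsingStatus.FAILED},
    ParsingStatus.PARSED: set(),
    ParsingStatus.FAILED: set(),
}


@dataclass
class DiagramJob:
    """Hypothetical record of one diagram moving through parsing."""
    file_name: str
    status: ParsingStatus = ParsingStatus.IN_QUEUE
    error_message: Optional[str] = None

    def transition(self, new_status: ParsingStatus,
                   error_message: Optional[str] = None) -> None:
        if new_status not in ALLOWED_TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value!r} to {new_status.value!r}"
            )
        if new_status is ParsingStatus.FAILED and not error_message:
            raise ValueError("a failed diagram must carry an error message")
        self.status = new_status
        self.error_message = error_message


job = DiagramJob("pid-001.pdf")
job.transition(ParsingStatus.PARSING)
job.transition(ParsingStatus.PARSED)
```

Requiring an error message on the failed transition reflects the statement that a failed diagram includes details about the error.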

Run diagram parsing

To parse a diagram:

  1. Navigate to CDF > Data management > Contextualize > Diagram parsing.

  2. Select the symbol library you want to use to detect symbols in the diagram.

  3. Select the file you want to parse, and then select Run parsing.

  4. Wait for the parsing to complete, and in the Actions column, select 👁 to view the parsed file.

  5. Use the tabs to verify different aspects of the automatic mappings:

    1. On the Symbol tab, make sure that the relevant symbols are mapped and that the mapping is correct:

      1. To add newly detected symbols to the library, select the symbol and select which Asset class and Asset type the symbol represents. You can also add the symbol as a variant of an existing symbol.

      2. If a symbol maps to the wrong asset, select the symbol and then Detach from symbol. Only this particular instance of the symbol in that specific file is deleted.

    2. On the Mapping tab, map the detected symbols to the correct assets:

      1. Select any unmapped symbol, then select Add asset mapping + and select the asset to map it to.

      2. If a symbol is mapped to the wrong asset, select the symbol and select Change mapping.

      3. Select 🗑 to remove incorrect mappings.

    3. On the Connection tab, check that the symbols are correctly connected. You can hover over a symbol to view the entire connection group and select a symbol to view or edit the closest connection.

      1. To delete incorrect connections, select the Remove connection icon.

      2. To add a missing connection, select the two symbols and then select Create connection.

  6. Rerun the parsing and verify that your changes have been recognized.

Note

  • Select the Edit layer visibility icon, and then toggle Show detected symbols to show or hide detected symbols in the diagram. When you toggle it off, you see the basic PDF file.

  • Toggle Show background document to show and hide the diagram's background document details (SVG paths).

  • Select Highlight undetected items to show the undetected symbols in the diagram.

Manage symbol libraries

Libraries contain the symbols that diagram parsing uses to detect objects in engineering diagrams. You can create or modify libraries to add new symbols or variants of an existing symbol.

To add a symbol to a library:

  1. Navigate to CDF > Data management > Contextualize > Diagram parsing.

  2. Select Libraries and the library to which you want to add the symbol. If you don't have a symbol library, create one:

    1. To make a copy of a library template, select > Duplicate template.

    2. Select + to create a new library.

  3. Select Open file and open the file containing the symbols you want to add. We recommend using legend files containing a set of symbols.

  4. Select the symbol you want to add in one of these ways:

    • Drag the cursor to select the symbol.

    • Hold Shift while you drag to select the symbol.

  5. Specify which Asset class (for example, Pump) and Asset type (for example, Centrifugal) the symbol represents.

  6. To add a variant of an existing symbol, select the symbol and Add as variant.

Important

When you delete a symbol from the library page, the symbol is deleted from every file that uses it. When you delete a variant from the library page, shapes matching that variant are no longer identified as the symbol in future parsings.