Entity matching uses machine learning to find matches between entities from different data sources. You train a model on example matches, then use it to predict matches across your data.

How it works

The entity matching service learns from labeled examples you provide. You create a match configuration with training data, then run the model to predict matches for new entities. The service scores each potential match and returns candidates above a confidence threshold.
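The score-and-threshold step can be illustrated with a minimal sketch. It does not call the service or use its trained model: a plain string similarity from Python's standard library stands in for the model score, and the names `score`, `predict_matches`, and the threshold value are hypothetical choices made for this example.

```python
from difflib import SequenceMatcher


def score(source: str, target: str) -> float:
    """Stand-in confidence score in [0, 1].

    Assumption: the real service scores pairs with a model trained on
    labeled examples; string similarity is used here for illustration only.
    """
    return SequenceMatcher(None, source.lower(), target.lower()).ratio()


def predict_matches(sources, targets, threshold=0.6):
    """For each source, return (target, score) candidates at or above
    the threshold, highest score first."""
    results = {}
    for s in sources:
        scored = [(t, score(s, t)) for t in targets]
        kept = [(t, sc) for t, sc in scored if sc >= threshold]
        results[s] = sorted(kept, key=lambda pair: pair[1], reverse=True)
    return results


# Hypothetical names: one sensor-style name matched against two asset names.
matches = predict_matches(["pump-101"], ["PUMP-101", "valve-7"], threshold=0.9)
```

Here only `PUMP-101` clears the threshold, so it is the single candidate returned for `pump-101`; lowering the threshold would admit weaker candidates as well.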

Common use cases

Entity matching is useful for linking sensor data to equipment, matching P&ID tags to assets, consolidating duplicate records from multiple systems, and enriching data with cross-references. You can match entities across different schemas and naming conventions.
Entity matching requires training data. Provide representative examples of correct matches and non-matches for best results.

Key capabilities

  • Train models on your labeled match examples
  • Predict matches for new entity pairs or batches
  • Configure thresholds to control precision and recall
  • Review suggestions when matches need human verification
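To see how a threshold trades precision against recall, and how a middle band of scores might be routed to human review, consider the following sketch. It is independent of the service: the scored pairs are made-up illustration data, and `precision_recall`, `triage`, and the cut-off values are hypothetical.

```python
def precision_recall(scored_pairs, threshold):
    """Compute (precision, recall) when every pair scoring at or above
    the threshold is predicted to be a match.

    scored_pairs: list of (score, is_true_match) tuples.
    """
    tp = sum(1 for s, y in scored_pairs if s >= threshold and y)
    fp = sum(1 for s, y in scored_pairs if s >= threshold and not y)
    fn = sum(1 for s, y in scored_pairs if s < threshold and y)
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 1.0
    return precision, recall


def triage(score, accept_at=0.9, review_at=0.6):
    """Route a scored suggestion: accept, send to human review, or reject."""
    if score >= accept_at:
        return "accept"
    if score >= review_at:
        return "review"
    return "reject"


# Made-up labeled scores, for illustration only.
pairs = [(0.95, True), (0.80, True), (0.70, False), (0.55, True), (0.30, False)]

low = precision_recall(pairs, 0.5)    # more matches found, one false positive
high = precision_recall(pairs, 0.75)  # no false positives, one match missed
```

With this data, the lower threshold finds every true match but admits one false positive, while the higher threshold returns only correct matches and misses one.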

Entity matching pipelines (preview)

Pipelines let you define repeatable entity matching flows as pipeline resources. Each pipeline has runs that you can inspect, including its latest run. You can create, update, list, and delete pipelines, start runs, and read run results by pipeline and run ID. Pipelines are available in the beta and alpha API references; they are not yet part of the stable 20230101 OpenAPI bundle.

Advanced joins (alpha)

Advanced joins extend contextualization with join resources and related jobs. For example, you can create and list advanced joins, estimate contextualization quality, manage matches, measure the mapped percentage, and get suggested improvements. Long-running steps often run as asynchronous jobs that you poll for completion. These endpoints are published under the alpha API reference only (/advancedjoins/...).
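Polling an asynchronous job generally follows the pattern sketched below. This is a generic helper, not the advanced joins API: `wait_for_job`, the `fetch_status` callable, and the status names are all assumptions, and in practice `fetch_status` would wrap whatever request returns the job's current state.

```python
import time
from typing import Callable

# Assumed status names; check the API reference for the real values.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[], str],
    interval_s: float = 1.0,
    timeout_s: float = 60.0,
) -> str:
    """Call fetch_status until it returns a terminal state, then return it.

    Raises TimeoutError if no terminal state is seen within timeout_s.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job still '{status}' after {timeout_s} s")
        time.sleep(interval_s)


# Usage with a simulated job that finishes on the third status check.
simulated = iter(["queued", "running", "completed"])
final_status = wait_for_job(lambda: next(simulated), interval_s=0.0)
```

A fixed interval keeps the sketch short; for long-running jobs, a growing interval between checks reduces the number of requests.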
Last modified on April 23, 2026