Aggregation capabilities in data models

Use the aggregation endpoint to summarize and analyze data by computing metrics such as averages and counts, or bucketed distributions (histograms). Typical uses include feeding dashboards, producing numeric summaries, and computing statistics per group.

The aggregation endpoint supports:

Group by: partition or group data based on specified properties. For instance, to see the average temperature for equipment, you can group by the equipment property.
Aggregation functions: compute metrics on the grouped data. This includes calculating sums, averages, minimums, maximums, counts, and histograms.
Filters and sorting: like the search endpoint, the aggregation endpoint supports filtering and sorting. This allows you to refine your data set before performing aggregations.

Recommendations

Combine filters and aggregations: to minimize the volume of data to process, include a filter in the aggregation request whenever one applies, so that only the relevant instances are aggregated.
Use groupBy efficiently: restrict grouping to properties that are meaningful for your analysis. Excessive grouping can make results difficult to interpret and reduce performance.
Understand tokenization for text queries: when an aggregation request includes a text query, refer to the search endpoint's tokenization rules to understand which instances will match.
Plan around indexing delays: because of indexing delays, it may take a few seconds for new or updated data to be available for aggregation.

Example aggregation request

The aggregation request below groups by equipment and computes an average of the temperature property.

{
  "view": {
    "type": "view",
    "space": "testSpace",
    "externalId": "Equipment",
    "version": "v1"
  },
  "query": "temperature sensor",
  "groupBy": ["equipment"],
  "aggregates": [
    {
      "avg": {
        "property": "temperature"
      }
    }
  ],
  "instanceType": "node",
  "properties": ["name", "description"],
  "targetUnits": [],
  "filter": {
    "range": {
      "property": "temperature",
      "lt": 25,
      "gt": 15
    }
  },
  "includeTyping": false,
  "limit": 100
}

Where:

view: defines which view the query will read from. Required.
aggregates: lists one or more aggregation operations. A single request can contain at most five.
groupBy: an array of property names used for grouping.
query and properties: combine to perform text-based matching on the specified properties (here, name and description).
filter: restricts data before aggregation. Here, only instances with a temperature greater than 15 and less than 25 are included (gt and lt exclude the bounds themselves).
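
The field descriptions above can be sketched as a small Python helper that assembles the request body and enforces the five-aggregate limit before anything is sent. This is only an illustration: the function name build_aggregation_request and its parameters are hypothetical and not part of any SDK, and sending the body (URL, authentication) is deliberately left out.

```python
import json

MAX_AGGREGATES = 5  # limit stated in the field descriptions above


def build_aggregation_request(view, aggregates, group_by=None, filter_=None,
                              query=None, properties=None, limit=100):
    """Assemble an aggregation request body as a plain dict (hypothetical helper)."""
    if not view:
        raise ValueError("view is required")
    if not 1 <= len(aggregates) <= MAX_AGGREGATES:
        raise ValueError(
            f"expected 1 to {MAX_AGGREGATES} aggregates, got {len(aggregates)}"
        )
    body = {
        "view": view,
        "aggregates": list(aggregates),
        "instanceType": "node",
        "limit": limit,
    }
    # Optional fields are only added when the caller supplies them.
    if group_by:
        body["groupBy"] = list(group_by)
    if filter_:
        body["filter"] = filter_
    if query:
        body["query"] = query
        if properties:
            body["properties"] = list(properties)
    return body


view = {"type": "view", "space": "testSpace", "externalId": "Equipment", "version": "v1"}
request_body = build_aggregation_request(
    view,
    aggregates=[{"avg": {"property": "temperature"}}],
    group_by=["equipment"],
    filter_={"range": {"property": "temperature", "gt": 15, "lt": 25}},
    query="temperature sensor",
    properties=["name", "description"],
)
print(json.dumps(request_body, indent=2))
```

Validating the aggregate count on the client side is a design choice made for this sketch; the endpoint itself remains the authority on what it accepts.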

Aggregating a single type

The examples below show different aggregations for a type called Company. The table lists its properties with a few sample rows:

id	name	industry	revenue	head_count
2001c18...	Ryan and Friesen and Sons	Internet	80734000	5
fe8bf42...	Pfannerstill Inc	Biotechnology	37038000	50
6dc069...	Boyer and Kohler Group	Public Relations and Communications	21598000	1
f23a8a...	Mohr Group	Fund-Raising	7440000	56
d346e0...	Veum LLC	Real Estate	5224000	10

Average `head_count` across all companies

Calculate the average head_count across every company in the dataset.

Request

{
  "aggregates": [
    { "avg": { "property": "head_count" } }
  ],
  "instanceType": "node",
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}

Response

[
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 32.29912284331168
      }
    ]
  }
]

Average `head_count` grouped by industry

Group companies by industry and compute the average head_count for each distinct industry value.

Request

{
  "aggregates": [{ "avg": { "property": "head_count" } }],
  "groupBy": ["industry"],
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}

Response

[
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 310.941333333333333
      }
    ],
    "group": {
      "industry": "Automotive"
    }
  },
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 39.10427807486631
      }
    ],
    "group": {
      "industry": "Writing and Editing"
    }
  }
]
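
A grouped response like the one above is often easier to work with as a simple lookup table. The following Python sketch flattens the response into a dictionary keyed by group value. The function name averages_by_group is hypothetical, and the response literal copies the two groups shown above.

```python
def averages_by_group(response_items, group_key, property_name):
    """Map each group value to the avg computed for property_name."""
    result = {}
    for item in response_items:
        group_value = item.get("group", {}).get(group_key)
        for agg in item.get("aggregates", []):
            # Keep only the avg aggregate for the requested property.
            if agg.get("aggregate") == "avg" and agg.get("property") == property_name:
                result[group_value] = agg["value"]
    return result


# The two groups from the response above.
response = [
    {
        "instanceType": "node",
        "aggregates": [
            {"aggregate": "avg", "property": "head_count", "value": 310.941333333333333}
        ],
        "group": {"industry": "Automotive"},
    },
    {
        "instanceType": "node",
        "aggregates": [
            {"aggregate": "avg", "property": "head_count", "value": 39.10427807486631}
        ],
        "group": {"industry": "Writing and Editing"},
    },
]

by_industry = averages_by_group(response, "industry", "head_count")
for industry, average in by_industry.items():
    print(f"{industry}: {average:.2f}")
```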

Histogram aggregation

Use the histogram function to divide a numeric property (head_count) into buckets. Below, each bucket has a width of 10. The result shows how many companies fall into each bucket, grouped by industry.

Request

{
  "aggregates": [
    { "histogram": { "property": "head_count", "interval": 10.0 } }
  ],
  "groupBy": ["industry"],
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}

Response (abbreviated)

{
  "instanceType": "node",
  "group": { "industry": "Automotive" },
  "aggregates": [
    {
      "aggregate": "histogram",
      "interval": 10.0,
      "property": "head_count",
      "buckets": [
        { "start": 0.0, "count": 52 },
        { "start": 10.0, "count": 65 },
        { "start": 20.0, "count": 55 },
        ...
      ]
    }
  ]
},
{
  "instanceType": "node",
  "group": { "industry": "Writing and Editing" },
  "aggregates": [
    {
      "aggregate": "histogram",
      "interval": 10.0,
      "property": "head_count",
      "buckets": [
        { "start": 0.0, "count": 58 },
        { "start": 10.0, "count": 55 },
        ...
      ]
    }
  ]
}
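
To make the bucketing concrete, here is a minimal Python sketch that reproduces the idea locally on the five sample head_count values from the Company table. It assumes each bucket covers the half-open range from start up to, but not including, start plus interval, and it omits empty buckets. Both are assumptions of this sketch, not confirmed behavior of the endpoint, and the function name histogram_buckets is hypothetical.

```python
import math
from collections import Counter


def histogram_buckets(values, interval):
    """Count values per bucket, where a bucket spans [start, start + interval)."""
    counts = Counter(math.floor(v / interval) * interval for v in values)
    # Only non-empty buckets are returned, sorted by their start value.
    return [{"start": float(start), "count": counts[start]} for start in sorted(counts)]


# head_count values of the five sample rows in the Company table.
head_counts = [5, 50, 1, 56, 10]
buckets = histogram_buckets(head_counts, 10.0)
print(buckets)
# → [{'start': 0.0, 'count': 2}, {'start': 10.0, 'count': 1}, {'start': 50.0, 'count': 2}]
```

With an interval of 10, the values 5 and 1 land in the bucket starting at 0, the value 10 starts a new bucket, and 50 and 56 share the bucket starting at 50.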

Text search within an aggregation

As with the search endpoint, you can combine text-based queries with aggregations. This example computes the average head_count, grouped by industry, for companies whose name matches the query "Inc".

Request

{
  "query": "Inc",
  "properties": ["name"],
  "aggregates": [{ "avg": { "property": "head_count" } }],
  "groupBy": ["industry"],
  "instanceType": "node",
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}

Response

[
  {
    "instanceType": "node",
    "group": { "industry": "Farming" },
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 31.4811320754717
      }
    ]
  },
  ...
]

GraphQL examples

You can also run the examples above through the GraphQL API:

# Average head_count for all companies:
aggregateCompany {
  items {
    avg {
      head_count
    }
  }
}

# Average head_count grouped by industry:
aggregateCompany(groupBy: industry) {
  items {
    avg {
      head_count
    }
  }
}

# Histogram of head_count (interval of 10):
aggregateCompany {
  items {
    histogram(interval: 10) {
      head_count {
        count
        start
      }
    }
  }
}

# Search by string ("Inc") combined with aggregation:
aggregateCompany(query: "Inc") {
  items {
    avg {
      head_count
    }
  }
}
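
The snippets above are selection fragments. To send one over HTTP, it is typically wrapped in a query operation and placed in a JSON body under the "query" key, which is the common GraphQL-over-HTTP convention. The Python sketch below only builds that body. The function name graphql_body is hypothetical, and the endpoint URL and authentication are not covered here.

```python
import json


def graphql_body(selection):
    """Wrap a selection fragment in a query operation and encode it as a JSON body."""
    document = "query {\n" + selection.strip() + "\n}"
    return json.dumps({"query": document})


# The grouped-average fragment from the examples above.
selection = """
aggregateCompany(groupBy: industry) {
  items {
    avg {
      head_count
    }
  }
}
"""

body = graphql_body(selection)
print(body)
```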
