Aggregation capabilities in data models

Use the aggregation endpoint to summarize and analyze data by computing metrics and bucketed distributions. For example, you can use it to generate data for dashboards, create numeric summaries, and obtain grouped statistics.

The aggregation endpoint supports:

  • Group by: partition data based on specified properties. For instance, to see the average temperature for each piece of equipment, group by the equipment property.

  • Aggregation functions: compute metrics on the grouped data. This includes calculating sums, averages, minimums, maximums, counts, and histograms.

  • Filters and sorting: like the search endpoint, the aggregation endpoint supports filtering and sorting. This allows you to refine your data set before performing aggregations.

Recommendations

  • Combine filters and aggregations: to minimize the volume of data to process, always apply filters before running aggregations.
  • Use groupBy efficiently: restrict grouping to properties that are meaningful for your analysis. Excessive grouping can make results difficult to interpret and reduce performance.
  • Understand tokenization for text queries: when searching for text, refer to the search endpoint's tokenization rules.
  • Plan around indexing delays: because of indexing delays, it may take a few seconds for new or updated data to be available for aggregation.

Example aggregation request

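The aggregation request below searches the name and description properties for "temperature sensor", keeps only instances whose temperature is between 15 and 25, groups the matches by equipment, and computes the average of the temperature property for each group.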

{
  "view": {
    "type": "view",
    "space": "testSpace",
    "externalId": "Equipment",
    "version": "v1"
  },
  "query": "temperature sensor",
  "groupBy": ["equipment"],
  "aggregates": [
    {
      "avg": {
        "property": "temperature"
      }
    }
  ],
  "instanceType": "node",
  "properties": ["name", "description"],
  "targetUnits": [],
  "filter": {
    "range": {
      "property": "temperature",
      "lt": 25,
      "gt": 15
    }
  },
  "includeTyping": false,
  "limit": 100
}

Where:

  • view: defines which view the query reads from. Required.
  • aggregates: lists one or more aggregation operations. A single request accepts at most five operations.
  • groupBy: an array of property names used for grouping.
  • query and properties: combine to perform text-based matching on the specified properties (name, description).
  • filter: restricts data before aggregation. Here, only instances with a temperature greater than 15 (gt) and less than 25 (lt) are included; both bounds are exclusive.
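
The parameter rules above can be checked on the client before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name build_aggregate_request and its arguments are hypothetical, and it only enforces the two constraints stated above (view is required, and aggregates holds one to five operations).

```python
import json
from typing import Any, Optional

MAX_AGGREGATES = 5  # limit stated in the parameter list above


def build_aggregate_request(
    view: dict[str, str],
    aggregates: list[dict[str, Any]],
    group_by: Optional[list[str]] = None,
    filter_: Optional[dict[str, Any]] = None,
    limit: Optional[int] = None,
) -> dict[str, Any]:
    """Assemble a request body (hypothetical helper) and check the stated limits."""
    if not view:
        raise ValueError("view is required")
    if not 1 <= len(aggregates) <= MAX_AGGREGATES:
        raise ValueError(
            f"expected 1 to {MAX_AGGREGATES} aggregates, got {len(aggregates)}"
        )
    body: dict[str, Any] = {
        "view": view,
        "aggregates": aggregates,
        "instanceType": "node",
    }
    # Optional parts are only added when the caller supplies them.
    if group_by:
        body["groupBy"] = group_by
    if filter_:
        body["filter"] = filter_
    if limit is not None:
        body["limit"] = limit
    return body


# Rebuild the core of the example request shown above.
body = build_aggregate_request(
    view={"type": "view", "space": "testSpace", "externalId": "Equipment", "version": "v1"},
    aggregates=[{"avg": {"property": "temperature"}}],
    group_by=["equipment"],
    filter_={"range": {"property": "temperature", "gt": 15, "lt": 25}},
    limit=100,
)
print(json.dumps(body, indent=2))
```

Validating locally like this gives a clear error message before any network call is made; the endpoint's own validation remains authoritative.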

Aggregating a single type

The examples below show different aggregations for a type called Company with these properties:

id          name                       industry                             revenue   head_count
2001c18...  Ryan and Friesen and Sons  Internet                             80734000  5
fe8bf42...  Pfannerstill Inc           Biotechnology                        37038000  50
6dc069...   Boyer and Kohler Group     Public Relations and Communications  21598000  1
f23a8a...   Mohr Group                 Fund-Raising                         7440000   56
d346e0...   Veum LLC                   Real Estate                          5224000   10

Average head_count across all companies

Calculate the average head_count across every company in the dataset.

Request
{
  "aggregates": [
    { "avg": { "property": "head_count" } }
  ],
  "instanceType": "node",
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}
Response
[
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 32.29912284331168
      }
    ]
  }
]
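
To use the result in code, the value has to be pulled out of the nested response. The sketch below assumes the response has already been parsed from JSON into Python lists and dictionaries with the shape shown above; the helper name aggregate_value is hypothetical.

```python
from typing import Any


def aggregate_value(response: list[dict[str, Any]], aggregate: str, prop: str) -> float:
    """Return the value of the first aggregate matching the given name and property."""
    for item in response:
        for agg in item.get("aggregates", []):
            if agg.get("aggregate") == aggregate and agg.get("property") == prop:
                return agg["value"]
    raise KeyError(f"no {aggregate!r} aggregate found for property {prop!r}")


# Response copied from the example above.
response = [
    {
        "instanceType": "node",
        "aggregates": [
            {"aggregate": "avg", "property": "head_count", "value": 32.29912284331168}
        ],
    }
]

print(aggregate_value(response, "avg", "head_count"))
```

Matching on both the aggregate name and the property keeps the helper usable when a request asks for several aggregates at once.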

Average head_count grouped by industry

Group companies by industry and compute the average head_count for each distinct industry value.

Request
{
  "aggregates": [{ "avg": { "property": "head_count" } }],
  "groupBy": ["industry"],
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}
Response
[
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 310.941333333333333
      }
    ],
    "group": {
      "industry": "Automotive"
    }
  },
  {
    "instanceType": "node",
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 39.10427807486631
      }
    ],
    "group": {
      "industry": "Writing and Editing"
    }
  }
]
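
Grouped responses return one item per distinct group value, so a common next step is to flatten them into (group, value) pairs, for example to rank industries. This is a minimal sketch that assumes the parsed response shape shown above; the helper name rank_groups is hypothetical.

```python
from typing import Any


def rank_groups(
    response: list[dict[str, Any]], group_key: str, aggregate: str, prop: str
) -> list[tuple[str, float]]:
    """Return (group value, aggregate value) pairs, largest aggregate first."""
    pairs = []
    for item in response:
        group_value = item.get("group", {}).get(group_key)
        for agg in item.get("aggregates", []):
            if agg.get("aggregate") == aggregate and agg.get("property") == prop:
                pairs.append((group_value, agg["value"]))
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


# Response copied from the example above.
response = [
    {
        "instanceType": "node",
        "aggregates": [
            {"aggregate": "avg", "property": "head_count", "value": 310.941333333333333}
        ],
        "group": {"industry": "Automotive"},
    },
    {
        "instanceType": "node",
        "aggregates": [
            {"aggregate": "avg", "property": "head_count", "value": 39.10427807486631}
        ],
        "group": {"industry": "Writing and Editing"},
    },
]

for industry, average in rank_groups(response, "industry", "avg", "head_count"):
    print(f"{industry}: {average:.2f}")
```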

Histogram aggregation

Use the histogram function to divide a numeric property (head_count) into buckets. Below, each bucket has a width of 10. The result shows how many companies fall into each bucket, grouped by industry.

Request
{
  "aggregates": [
    { "histogram": { "property": "head_count", "interval": 10.0 } }
  ],
  "groupBy": ["industry"],
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}
Response (abbreviated)
[
  {
    "instanceType": "node",
    "group": { "industry": "Automotive" },
    "aggregates": [
      {
        "aggregate": "histogram",
        "interval": 10.0,
        "property": "head_count",
        "buckets": [
          { "start": 0.0, "count": 52 },
          { "start": 10.0, "count": 65 },
          { "start": 20.0, "count": 55 },
          ...
        ]
      }
    ]
  },
  {
    "instanceType": "node",
    "group": { "industry": "Writing and Editing" },
    "aggregates": [
      {
        "aggregate": "histogram",
        "interval": 10.0,
        "property": "head_count",
        "buckets": [
          { "start": 0.0, "count": 58 },
          { "start": 10.0, "count": 55 },
          ...
        ]
      }
    ]
  }
]
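
To build intuition for the bucket values, the following sketch reproduces fixed-width bucketing locally. It is an illustration under stated assumptions, not the endpoint's implementation: it assumes each bucket covers the half-open range from start up to (but not including) start + interval, that bucket starts are multiples of the interval, and that empty buckets are omitted. The head_count values are hypothetical.

```python
import math
from collections import Counter


def histogram(values: list[float], interval: float) -> list[dict[str, float]]:
    """Count values per fixed-width bucket, keyed by each bucket's lower bound."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    # A value v falls in the bucket starting at floor(v / interval) * interval.
    counts = Counter(math.floor(v / interval) * interval for v in values)
    return [
        {"start": float(start), "count": counts[start]} for start in sorted(counts)
    ]


# Hypothetical head_count values, used only for illustration.
head_counts = [5, 50, 1, 56, 10]

for bucket in histogram(head_counts, 10.0):
    print(bucket)
```

With an interval of 10, the values 5 and 1 share the bucket starting at 0, the value 10 opens the bucket starting at 10, and 50 and 56 share the bucket starting at 50.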

Text search within an aggregation

As with the search endpoint, you can combine text-based queries with aggregations. This example computes the average head_count, grouped by industry, for companies whose name matches the query "Inc".

Request
{
  "query": "Inc",
  "properties": ["name"],
  "aggregates": [{ "avg": { "property": "head_count" } }],
  "groupBy": ["industry"],
  "instanceType": "node",
  "view": {
    "type": "view",
    "space": "Companies",
    "externalId": "Company",
    "version": "1"
  }
}
Response
[
  {
    "instanceType": "node",
    "group": { "industry": "Farming" },
    "aggregates": [
      {
        "aggregate": "avg",
        "property": "head_count",
        "value": 31.4811320754717
      }
    ]
  },
  ...
]
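
Any of the request bodies above can be sent from a script. The sketch below prepares such a request with Python's standard library but does not send it. Several details are assumptions not confirmed by this page: that the endpoint accepts an HTTP POST with a JSON body, that it uses bearer-token authentication, and the placeholder URL and token, which you need to replace with values from your own environment. The helper name make_aggregate_request is hypothetical.

```python
import json
import urllib.request
from typing import Any


def make_aggregate_request(
    endpoint_url: str, token: str, body: dict[str, Any]
) -> urllib.request.Request:
    """Prepare (but do not send) a JSON POST request carrying an aggregation body."""
    return urllib.request.Request(
        endpoint_url,
        data=json.dumps(body).encode("utf-8"),
        method="POST",  # assumption: the endpoint expects POST
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
        },
    )


# Text-search aggregation body copied from the request above.
body = {
    "query": "Inc",
    "properties": ["name"],
    "aggregates": [{"avg": {"property": "head_count"}}],
    "groupBy": ["industry"],
    "instanceType": "node",
    "view": {"type": "view", "space": "Companies", "externalId": "Company", "version": "1"},
}

# Placeholder URL and token: replace both before sending.
request = make_aggregate_request("https://example.invalid/aggregate", "YOUR_TOKEN", body)
print(request.get_method(), request.full_url)

# To send it once the URL and token are real:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```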

GraphQL examples

You can also run the examples above through the GraphQL API:

# Average head_count for all companies:
aggregateCompany {
  items {
    avg {
      head_count
    }
  }
}

# Average head_count grouped by industry:
aggregateCompany(groupBy: industry) {
  items {
    avg {
      head_count
    }
  }
}

# Histogram of head_count (interval of 10):
aggregateCompany {
  items {
    histogram(interval: 10) {
      head_count {
        count
        start
      }
    }
  }
}

# Search by string ("Inc") combined with aggregation:
aggregateCompany(query: "Inc") {
  items {
    avg {
      head_count
    }
  }
}