Aggregations

Hyperspace aggregations support data analytics by summarizing a set of documents in a structured manner, producing summary statistics or other analytical results.

The result of each aggregation is stored under its own key in the query results object.

Metric Aggregations

Metric Aggregations compute numeric measures over a set of documents. Each aggregation operates on one field of the documents and produces a numeric result for a specific metric.

To create a metric aggregation, use the following structure:

{
  "aggs": {
    "agg_name": {
      "metric_type": {
        "field": "numeric_field"
      }
    }
  }
}

Where:

  • agg_name – Specifies the name (key) under which the aggregation result is saved.

  • metric_type – Specifies the type of aggregation. Possible values are:

    • sum – Returns the sum of the field of the relevant candidates.

    • min – Returns the minimum value of the field of the relevant candidates.

    • max – Returns the maximum value of the field of the relevant candidates.

    • avg – Returns the average of the field of the relevant candidates.

    • count – Returns the total number of valid field entries of the relevant candidates.

    • cardinality – Returns the number of unique field values of the relevant candidates.

    • percentiles – Returns the percentiles of the field of the relevant candidates.

  • field – Specifies the name of the field to be used in the aggregation ("numeric_field" in the structure above).
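
To make the meaning of these metric types concrete, the following Python sketch computes most of them locally over a small in-memory list of documents. It is an illustration only and does not call Hyperspace: the helper local_metric and the sample documents are hypothetical, percentiles is left out, and a "valid" entry is assumed to be a document in which the field is present and not null.

```python
from statistics import mean


def local_metric(docs, metric_type, field):
    """Compute one metric over the non-missing values of `field` (illustration only)."""
    # Assumption: a "valid" entry is a document where the field exists and is not None.
    values = [d[field] for d in docs if d.get(field) is not None]
    if metric_type == "sum":
        return sum(values)
    if metric_type == "min":
        return min(values)
    if metric_type == "max":
        return max(values)
    if metric_type == "avg":
        return mean(values)
    if metric_type == "count":
        return len(values)       # number of valid entries, duplicates included
    if metric_type == "cardinality":
        return len(set(values))  # number of distinct values
    raise ValueError(f"unsupported metric type: {metric_type}")


# Hypothetical sample documents; the last one has no "sales" field.
docs = [{"sales": 10}, {"sales": 30}, {"sales": 10}, {"other": 1}]
for metric in ("sum", "min", "max", "avg", "count", "cardinality"):
    print(metric, local_metric(docs, metric, "sales"))
```

In this sample, count and cardinality differ because the value 10 appears twice: count includes both entries, while cardinality counts the value once.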

Example 1

In the following example, the sum of the field "sales" is calculated and stored under a key named "total_sales".

{
  "aggs": {
    "total_sales": {
      "sum": {
        "field": "sales"
      }
    }
  }
}

Example 2

In the following example, the number of unique values of the field "user_id" is calculated and stored under a key named "unique_users".

{
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}
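
When many such request bodies are needed, they can be generated rather than written by hand. The sketch below builds the same structure as Examples 1 and 2 in Python. The helper name metric_agg is hypothetical, and how the resulting body is sent to Hyperspace depends on the client in use, which is not shown here.

```python
import json


def metric_agg(agg_name, metric_type, field):
    """Build a request body containing a single metric aggregation."""
    return {"aggs": {agg_name: {metric_type: {"field": field}}}}


# Same body as Example 2: distinct values of "user_id" under the key "unique_users".
body = metric_agg("unique_users", "cardinality", "user_id")
print(json.dumps(body, indent=2))
```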

Date Histogram Aggregations

Date Histogram Aggregation groups documents into time intervals, creating buckets based on date or time values. The basic syntax uses the following keys:

  • "field" – Specifies the field to be aggregated.

  • "interval" – Specifies the time interval for creating buckets. Common intervals include "year", "quarter", "month", "week", "day", "hour", "minute", or "second".

Example

In the following example, the result includes a set of buckets, each representing a specific day.

{
  "aggs": {
    "daily_histogram": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day"
      }
    }
  }
}
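
The grouping that a daily histogram performs can be sketched locally in Python. This is an illustration under assumptions, not the Hyperspace implementation: the "timestamp" field is assumed to hold epoch seconds, days are taken in UTC, and the bucket keys (ISO dates) and the helper name daily_buckets are hypothetical choices for the sketch.

```python
from collections import Counter
from datetime import datetime, timezone


def daily_buckets(docs, field="timestamp"):
    """Count documents per calendar day (UTC) of an epoch-seconds field."""
    counts = Counter()
    for doc in docs:
        day = datetime.fromtimestamp(doc[field], tz=timezone.utc).date()
        counts[day.isoformat()] += 1
    # Return the buckets in chronological order.
    return dict(sorted(counts.items()))


# Hypothetical sample documents.
docs = [
    {"timestamp": 1700000000},  # 2023-11-14 22:13:20 UTC
    {"timestamp": 1700003600},  # 2023-11-14 23:13:20 UTC
    {"timestamp": 1700010000},  # 2023-11-15 01:00:00 UTC
]
print(daily_buckets(docs))
```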

Combining Aggregations and Candidate Filtering

Aggregations can be combined with candidate filtering, so that a single query performs different aggregations on different subsets of candidates.

Example

{
  "aggs": {
    "avg_rating_electronics": {
      "avg": {
        "field": "rating"
      },
      "filter": {
        "term": {
          "category": "electronics"
        }
      }
    },
    "avg_rating_high_price": {
      "avg": {
        "field": "rating"
      },
      "filter": {
        "range": {
          "price": {
            "gte": 50
          }
        }
      }
    }
  }
}
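
The effect of the two filtered averages in the example above can be sketched locally in Python. The helper filtered_avg and the sample documents are hypothetical and only illustrate the intended semantics: each aggregation sees just the candidates that pass its own filter.

```python
from statistics import mean


def filtered_avg(docs, field, keep):
    """Average `field` over the documents for which keep(doc) is true."""
    values = [d[field] for d in docs if keep(d)]
    return mean(values) if values else None


# Hypothetical sample documents.
docs = [
    {"category": "electronics", "price": 80, "rating": 4.0},
    {"category": "electronics", "price": 20, "rating": 3.0},
    {"category": "books", "price": 60, "rating": 5.0},
]

results = {
    # Mirrors the "term" filter on category.
    "avg_rating_electronics": filtered_avg(
        docs, "rating", lambda d: d["category"] == "electronics"
    ),
    # Mirrors the "range" filter with "gte": 50 on price.
    "avg_rating_high_price": filtered_avg(
        docs, "rating", lambda d: d["price"] >= 50
    ),
}
print(results)
```

The first document passes both filters, so it contributes to both averages. The two aggregations are evaluated independently of each other.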

Terms Aggregation

Terms Aggregation is a type of bucket aggregation that groups documents based on the values of a particular field.

Example

In the following example, the aggregation named "unique_products" is applied to all documents that meet the query criteria. The aggregation results are stored in buckets according to the values of the field "prod_name".

{
  "query": {
    "bool": {
      "must": {
        "term": {"product_type_name": "Dress"}
      }
    }
  },
  "aggs": {
    "unique_products": {
      "terms": {
        "field": "prod_name"
      }
    }
  }
}
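
As a rough local analogue, the Python sketch below first keeps the documents that match the query and then groups them by field value. It is an illustration only: the helper terms_buckets and the sample documents are hypothetical, and returning a document count per bucket, sorted by count, is an assumption about the bucket contents rather than something stated above.

```python
from collections import Counter


def terms_buckets(docs, field, keep=lambda d: True):
    """Group matching documents by the value of `field` and count each group."""
    counts = Counter(d[field] for d in docs if keep(d) and field in d)
    # Largest buckets first.
    return counts.most_common()


# Hypothetical sample documents.
docs = [
    {"product_type_name": "Dress", "prod_name": "Maxi"},
    {"product_type_name": "Dress", "prod_name": "Wrap"},
    {"product_type_name": "Dress", "prod_name": "Maxi"},
    {"product_type_name": "Shirt", "prod_name": "Oxford"},
]

buckets = terms_buckets(
    docs, "prod_name", keep=lambda d: d["product_type_name"] == "Dress"
)
print(buckets)
```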

Cardinality Aggregation

Cardinality Aggregation estimates the number of unique field values within a collection, which is also called the count of distinct values or unique values.

Example

In the following example, the aggregation named "unique_products" is applied to all documents that meet the query criteria. The aggregation result is the number of unique values of the field "prod_name".

{
  "query": {
    "bool": {
      "must": {
        "term": {"product_type_name": "Dress"}
      }
    }
  },
  "aggs": {
    "unique_products": {
      "cardinality": {
        "field": "prod_name"
      }
    }
  }
}
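
For comparison, an exact distinct count over the matching documents can be sketched in Python as follows. The helper distinct_count and the sample documents are hypothetical. Because the cardinality aggregation is described as an estimate, its result on large collections may differ slightly from an exact count like this one.

```python
def distinct_count(docs, field, keep=lambda d: True):
    """Exact number of distinct values of `field` among matching documents."""
    return len({d[field] for d in docs if keep(d) and field in d})


# Hypothetical sample documents.
docs = [
    {"product_type_name": "Dress", "prod_name": "Maxi"},
    {"product_type_name": "Dress", "prod_name": "Wrap"},
    {"product_type_name": "Dress", "prod_name": "Maxi"},
    {"product_type_name": "Shirt", "prod_name": "Oxford"},
]

unique_products = distinct_count(
    docs, "prod_name", keep=lambda d: d["product_type_name"] == "Dress"
)
print(unique_products)
```

The exact distinct count equals the number of buckets a grouping by the same field would produce over the same documents, which is a convenient way to sanity-check one against the other on small data sets.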
