Aggregations

Hyperspace aggregations allow you to perform analytics on your data, gather insights, and summarize information in a structured way. Aggregations operate on a set of documents and produce summary statistics or analysis results.

The aggregation result is stored in the query results object, under a separate key.

Metric Aggregations

Metric Aggregations allow you to compute and analyze numeric measures on sets of documents. Metric aggregations operate on a numeric field in documents and produce a single numeric result for the specified metric.

To create a metric aggregation, use the following template -

{
  "aggs": {
    "agg_name": {
      "metric_type": {
        "field": "fieldname"
      }
    }
  }
}

Where -

  • agg_name - aggregation results will be saved under this key

  • metric_type - the type of aggregation. Possible values are:

    • sum - Returns the sum of the field over the relevant candidates

    • min - Returns the min of the field over the relevant candidates

    • max - Returns the max of the field over the relevant candidates

    • avg - Returns the average of the field over the relevant candidates

    • count - Returns the total number of valid field entries in the relevant candidates

    • cardinality - Returns the number of unique field values in the relevant candidates

    • percentiles - Returns the percentiles of the field over the relevant candidates

  • fieldname - the name of the field to be used in the aggregation
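The template above can also be assembled programmatically. The following is a minimal Python sketch; the helper name build_metric_agg is hypothetical and is not part of any Hyperspace client. It only builds the request body as a dictionary and does not send it anywhere.

```python
import json

# Metric types listed in this section.
METRIC_TYPES = {"sum", "min", "max", "avg", "count", "cardinality", "percentiles"}


def build_metric_agg(agg_name, metric_type, field):
    """Return a request body containing a single metric aggregation.

    Hypothetical helper for illustration; it only builds a dictionary.
    """
    if metric_type not in METRIC_TYPES:
        raise ValueError(f"unsupported metric type: {metric_type!r}")
    return {"aggs": {agg_name: {metric_type: {"field": field}}}}


body = build_metric_agg("total_sales", "sum", "sales")
print(json.dumps(body, indent=2))
```

Rejecting unknown metric types up front is a design choice of this sketch, so that a typo in the metric name fails before the query is sent.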

Example 1 -

{
  "aggs": {
    "total_sales": {
      "sum": {
        "field": "sales"
      }
    }
  }
}

In the above example, the sum of the field "sales" will be calculated and stored under a key named "total_sales".

Example 2 -

{
  "aggs": {
    "unique_users": {
      "cardinality": {
        "field": "user_id"
      }
    }
  }
}

In the above example, the number of unique values of the field "user_id" will be calculated and stored under a key named "unique_users".
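To make the difference between the metric types concrete, the following Python sketch computes sum, count, and cardinality locally over a small in-memory list. The sample documents are invented for illustration, and skipping documents that lack the field is an assumption of this sketch; it does not describe how Hyperspace computes the values internally.

```python
# Hypothetical sample documents; the field names follow the two examples above.
docs = [
    {"user_id": "u1", "sales": 10.0},
    {"user_id": "u2", "sales": 5.5},
    {"user_id": "u1", "sales": 4.5},
    {"user_id": "u3"},  # no "sales" value, so it is skipped by sum and count
]

# Collect only the documents that actually carry a "sales" value.
sales = [d["sales"] for d in docs if "sales" in d]

total_sales = sum(sales)                          # sum
sales_count = len(sales)                          # count: entries that have a value
unique_users = len({d["user_id"] for d in docs})  # cardinality: distinct values

print(total_sales, sales_count, unique_users)
# 20.0 3 3
```

Note that "u1" appears twice but contributes only once to the cardinality, while both of its "sales" entries contribute to the count.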

Date Histogram Aggregations

Date Histogram Aggregation groups documents into time intervals, forming buckets based on date or time values. The basic syntax includes the following terms:

  • "field": specifies the field over which the aggregation will be performed

  • "interval": defines the time interval for creating buckets. Common intervals include "year", "quarter", "month", "week", "day", "hour", "minute", and "second".
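The bucketing idea can be sketched locally in Python for the "day" interval. The sample documents and the function name bucket_by_day are hypothetical, and the result here is a plain mapping from day to document count; the shape of the actual Hyperspace response may differ.

```python
from collections import Counter
from datetime import datetime

# Hypothetical documents with ISO 8601 timestamps.
docs = [
    {"timestamp": "2024-03-01T09:15:00"},
    {"timestamp": "2024-03-01T17:40:00"},
    {"timestamp": "2024-03-02T08:05:00"},
]


def bucket_by_day(documents, field="timestamp"):
    """Count documents per calendar day, keyed by the day's date string."""
    days = Counter()
    for doc in documents:
        # Truncate the timestamp to its calendar day to find the bucket.
        day = datetime.fromisoformat(doc[field]).date()
        days[day.isoformat()] += 1
    return dict(sorted(days.items()))


print(bucket_by_day(docs))
# {'2024-03-01': 2, '2024-03-02': 1}
```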

Example -

{
  "aggs": {
    "daily_histogram": {
      "date_histogram": {
        "field": "timestamp",
        "interval": "day"
      }
    }
  }
}

In the above example, the result will include a set of buckets, each representing a specific day. Each bucket contains:

Combining Aggregations and Candidate Filtering

You can combine aggregations with candidate filtering, so that different aggregations in the same query are performed on different sets of candidates.

Example -

{
  "aggs": {
    "avg_rating_electronics": {
      "avg": {
        "field": "rating"
      },
      "filter": {
        "term": {
          "category": "electronics"
        }
      }
    },
    "avg_rating_high_price": {
      "avg": {
        "field": "rating"
      },
      "filter": {
        "range": {
          "price": {
            "gte": 50
          }
        }
      }
    }
  }
}

In the above example, the aggregation "avg_rating_electronics" will be performed on documents whose "category" value equals "electronics", while the aggregation "avg_rating_high_price" will be performed on documents whose "price" value is greater than or equal to 50.
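The effect of per-aggregation filters can be illustrated with a local Python sketch. The sample documents and the helper filtered_avg are hypothetical; the sketch only mirrors the semantics described above (each average is taken over its own filtered subset) and is not Hyperspace's implementation.

```python
from statistics import mean

# Hypothetical sample documents using the fields from the example above.
docs = [
    {"category": "electronics", "price": 30, "rating": 4.0},
    {"category": "electronics", "price": 80, "rating": 5.0},
    {"category": "books", "price": 60, "rating": 3.0},
    {"category": "books", "price": 10, "rating": 2.0},
]


def filtered_avg(documents, field, keep):
    """Average `field` over the documents for which `keep(doc)` is true."""
    values = [d[field] for d in documents if keep(d)]
    return mean(values) if values else None


results = {
    # Equivalent of the "term" filter on category.
    "avg_rating_electronics": filtered_avg(
        docs, "rating", lambda d: d["category"] == "electronics"
    ),
    # Equivalent of the "range" filter with "gte": 50 on price.
    "avg_rating_high_price": filtered_avg(
        docs, "rating", lambda d: d["price"] >= 50
    ),
}
print(results)
# {'avg_rating_electronics': 4.5, 'avg_rating_high_price': 4.0}
```

The document with category "electronics" and price 80 is counted in both averages, since each filter is applied independently.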

Terms Aggregation

Terms Aggregation is a bucket aggregation that allows you to group documents into buckets based on the values of a specific field.

Example -

{
  "query": {
    "bool": {
      "must": {
        "term": {"product_type_name": "Dress"}
      }
    }
  },
  "aggs": {
    "unique_products": {
      "terms": {
        "field": "prod_name"
      }
    }
  }
}

In the above example, the aggregation "unique_products" will be performed over the field "prod_name" for all documents that pass the query. The aggregation results will be stored in buckets, one for each value of the field "prod_name".
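As a rough local illustration of the two steps in the example above (query first, then bucketing), the following Python sketch groups matching documents by "prod_name". The sample documents are invented, and representing each bucket as a document count is an assumption of this sketch rather than a description of the Hyperspace response format.

```python
from collections import Counter

# Hypothetical sample documents using the fields from the example above.
docs = [
    {"product_type_name": "Dress", "prod_name": "Summer Dress"},
    {"product_type_name": "Dress", "prod_name": "Summer Dress"},
    {"product_type_name": "Dress", "prod_name": "Wrap Dress"},
    {"product_type_name": "Trousers", "prod_name": "Slim Chinos"},
]

# Query step: keep only documents whose product_type_name is "Dress".
matched = [d for d in docs if d["product_type_name"] == "Dress"]

# Aggregation step: one bucket per distinct prod_name, holding a document count.
buckets = Counter(d["prod_name"] for d in matched)

print(dict(buckets))
# {'Summer Dress': 2, 'Wrap Dress': 1}
```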

Cardinality Aggregation

Cardinality Aggregation allows you to estimate the number of unique values of a field in a collection, often referred to as the distinct count.

Example -

{
  "query": {
    "bool": {
      "must": {
        "term": {"product_type_name": "Dress"}
      }
    }
  },
  "aggs": {
    "unique_products": {
      "cardinality": {
        "field": "prod_name"
      }
    }
  }
}

In the above example, the aggregation "unique_products" will be performed over the field "prod_name" for all documents that pass the query. The aggregation will return the number of distinct values of the field "prod_name".
