Aggregations
Hyperspace supports aggregations of numerical fields over selected documents. Aggregation functions are called inside the score function, and each aggregation is performed over the candidates that passed the filtering up to that position in the code.
The aggregation result is stored as a key under the query results object.
For example:
In the above example, the query results will include a key named "aggregations", with the following sub keys:
a key named "max_rating", whose value is the maximum of the field "rating" over all candidates that passed the filter over genres.
a key named "sum_budget", whose value is the sum of the field "budget" over all candidates that passed the filters over genres and languages.
a key named "percentile_budget", which holds the 10th, 15th, 32nd and 75th percentiles of the field "budget" over all candidates that passed the filters over genres and languages.
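The position-dependent behaviour described above can be imitated in plain Python. The following is a minimal sketch, not the Hyperspace API: the sample documents, the equality-based filters, the nearest-rank percentile method and the shape of the percentile result are all assumptions made for illustration.

```python
# Plain-Python illustration only; this is not the Hyperspace API.
import math

# Hypothetical documents carrying the fields named in the description above.
docs = [
    {"genres": "Drama", "languages": "en", "rating": 7.5, "budget": 100},
    {"genres": "Drama", "languages": "fr", "rating": 9.0, "budget": 300},
    {"genres": "Comedy", "languages": "en", "rating": 9.9, "budget": 500},
    {"genres": "Drama", "languages": "en", "rating": 6.0, "budget": 200},
]
query = {"genres": "Drama", "languages": "en"}


def nearest_rank_percentile(values, p):
    """Nearest-rank percentile (assumed method; the source does not state one)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]


aggregations = {}

# First filter: only candidates that match on genres continue.
passed_genres = [d for d in docs if d["genres"] == query["genres"]]

# An aggregation placed here sees the candidates that passed the genres filter only.
aggregations["max_rating"] = max(d["rating"] for d in passed_genres)

# Second filter: languages.
passed_both = [d for d in passed_genres if d["languages"] == query["languages"]]

# Aggregations placed here see the candidates that passed both filters.
budgets = [d["budget"] for d in passed_both]
aggregations["sum_budget"] = sum(budgets)
aggregations["percentile_budget"] = {
    p: nearest_rank_percentile(budgets, p) for p in (10, 15, 32, 75)
}

results = {"aggregations": aggregations}
```

Note that the Comedy document has the highest rating but does not contribute to "max_rating", because it never passes the genres filter.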
The following aggregation types are supported:
aggregate_sum(str agg_name, str fieldname)
- Returns the sum of the field over the relevant candidates
aggregate_min(str agg_name, str fieldname)
- Returns the min of the field over the relevant candidates
aggregate_max(str agg_name, str fieldname)
- Returns the max of the field over the relevant candidates
aggregate_avg(str agg_name, str fieldname)
- Returns the average of the field over the relevant candidates
aggregate_count(str agg_name)
- Returns the total number of valid field entries in the relevant candidates
aggregate_cardinality(str agg_name, str fieldname)
- Returns the total number of valid field values in the relevant candidates
aggregate_percentile(str agg_name, str fieldname, list[float] percentiles)
- Returns the percentiles of the field over the relevant candidates
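As a rough reference for what each aggregation type computes, the sketch below reproduces the same quantities in plain Python over a small hypothetical sample. It is not the Hyperspace implementation. Several points are assumptions: "valid" is taken to mean that the field is present and not None, count is taken over the valid entries of the given field, cardinality is interpreted as the number of distinct values, and percentiles use the nearest-rank method.

```python
# Plain-Python reference for the semantics only; not the Hyperspace implementation.
import math
from statistics import mean


def nearest_rank(ordered, p):
    # Nearest-rank percentile on an already sorted list (assumed method).
    return ordered[max(1, math.ceil(p / 100 * len(ordered))) - 1]


def aggregate_all(candidates, fieldname, percentiles):
    # Keep only candidates where the field is present (assumed meaning of "valid").
    # This sketch assumes at least one valid value exists.
    values = sorted(c[fieldname] for c in candidates if c.get(fieldname) is not None)
    return {
        "sum": sum(values),
        "min": values[0],
        "max": values[-1],
        "avg": mean(values),
        "count": len(values),
        "cardinality": len(set(values)),  # assumed: number of distinct values
        "percentile": {p: nearest_rank(values, p) for p in percentiles},
    }


# Hypothetical candidates; one of them has no budget value.
sample = [
    {"budget": 100},
    {"budget": 200},
    {"budget": 200},
    {"budget": None},
    {"budget": 500},
]
summary = aggregate_all(sample, "budget", [50])
```

In this sample the candidate without a budget is ignored, so the count is 4 while the cardinality, under the distinct-values assumption, is 3.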
Date Histogram
You can store the aggregation results as a histogram by date, using the function date_histogram(str agg_name, str fieldname, str time_interval). The aggregation result will be stored under the key given by agg_name. Results will be binned into a histogram with a resolution determined by time_interval. The available units for time_interval are s/m/h/d.
Example:
In this example, the aggregation results are binned into a histogram where the width of each bin is 1d.
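The binning itself can be pictured with a short plain-Python sketch. This is not the Hyperspace implementation: it assumes that the date field holds Unix timestamps in seconds, that time_interval is written as an integer followed by one unit letter (as in "1d"), and that each bin holds a count of candidates. The helper names parse_interval and date_histogram_sketch are hypothetical.

```python
# Plain-Python illustration of date binning; not the Hyperspace implementation.
import re
from collections import Counter

# Seconds per supported unit (s/m/h/d).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(time_interval):
    """Convert an interval such as '1d' or '30m' into a bin width in seconds."""
    match = re.fullmatch(r"(\d+)([smhd])", time_interval)
    if match is None:
        raise ValueError(f"unsupported time_interval: {time_interval!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]


def date_histogram_sketch(timestamps, time_interval):
    """Count timestamps per bin; each bin is keyed by its start time in seconds."""
    width = parse_interval(time_interval)
    counts = Counter((ts // width) * width for ts in timestamps)
    return dict(sorted(counts.items()))


# Hypothetical timestamps (seconds since the epoch) spread over three days.
timestamps = [0, 3600, 86399, 86400, 200000]
histogram = date_histogram_sketch(timestamps, "1d")
```

With a bin width of 1d, the first three timestamps fall into the first bin and the remaining two fall into one bin each.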
Aggregation functions include all data points that reached the relevant part of the code, regardless of whether they are included in the candidate list.
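The difference between the aggregated population and the returned candidate list can be sketched in plain Python. This is an illustration under assumptions, not Hyperspace code: the documents, the rating threshold used as a filter, and the top_k limit on the candidate list are all hypothetical.

```python
# Plain-Python illustration only; not the Hyperspace API.

# Hypothetical documents.
docs = [
    {"id": 1, "rating": 9.0},
    {"id": 2, "rating": 8.0},
    {"id": 3, "rating": 6.0},
    {"id": 4, "rating": 4.0},
]
top_k = 2  # hypothetical size of the returned candidate list

# Documents that reach the aggregation point (hypothetical filter: rating >= 5).
reached = [d for d in docs if d["rating"] >= 5.0]

# The aggregation covers every document that reached this point.
aggregations = {
    "count_reached": len(reached),
    "sum_rating": sum(d["rating"] for d in reached),
}

# The candidate list keeps only the top_k highest-rated documents.
candidates = sorted(reached, key=lambda d: d["rating"], reverse=True)[:top_k]
```

Here three documents contribute to the aggregations although only two appear in the candidate list.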