Aggregations
Hyperspace supports aggregations of numerical fields over candidate lists. Aggregations are implemented inside the score function, and each aggregation is computed over the candidates that have passed all filters preceding its position in the code.
The aggregation result is returned in the query results object, under a separate key. Example:
In the above example, the query results will include:
- a key named "max_rating", whose value is the maximum rating over all candidates that passed the filter on genres.
- a key named "sum_budget", whose value is the sum of the field "budget" over all candidates that passed the filters on both genres and languages.
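The position-dependent behavior described above can be emulated in plain Python. This is only a sketch of the semantics, not Hyperspace code: the candidate documents, the query, and the match helper (simple equality on one field) are all hypothetical stand-ins.

```python
# Plain-Python emulation of position-dependent aggregations.
# NOT the Hyperspace engine: the data and the match() helper are made up.

candidates = [
    {"genres": "drama",  "languages": "en", "rating": 7.5, "budget": 100},
    {"genres": "drama",  "languages": "fr", "rating": 8.2, "budget": 40},
    {"genres": "comedy", "languages": "en", "rating": 9.0, "budget": 60},
]
query = {"genres": "drama", "languages": "en"}


def match(field, doc, params):
    """Hypothetical stand-in for a filter: exact equality on one field."""
    return doc[field] == params[field]


results = {}

# Aggregation placed after the genres filter: sees every genre match.
passed_genres = [d for d in candidates if match("genres", d, query)]
results["max_rating"] = max(d["rating"] for d in passed_genres)

# Aggregation placed after the languages filter as well: sees fewer candidates.
passed_both = [d for d in passed_genres if match("languages", d, query)]
results["sum_budget"] = sum(d["budget"] for d in passed_both)

print(results)
```

Here "max_rating" is taken over the two drama candidates, while "sum_budget" only covers the single candidate that is both drama and English.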
The following aggregation types are supported:
aggregate_sum(str agg_name, str fieldname)
- Returns the sum of the field over the relevant candidates
aggregate_min(str agg_name, str fieldname)
- Returns the min of the field over the relevant candidates
aggregate_max(str agg_name, str fieldname)
- Returns the max of the field over the relevant candidates
aggregate_avg(str agg_name, str fieldname)
- Returns the average of the field over the relevant candidates
aggregate_median(str agg_name, str fieldname)
- Returns the median of the field over the relevant candidates
aggregate_count(str agg_name)
- Returns the total number of valid field entries in the relevant candidates
aggregate_cardinality(str agg_name, str fieldname)
- Returns the total number of valid field values in the relevant candidates
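As a rough reference for what these aggregations compute, the sketch below reproduces sum, min, max, avg, median, and count in plain Python using the standard statistics module. It is not Hyperspace code: the aggregate helper is hypothetical, and treating None as an invalid (missing) entry is an assumption. aggregate_cardinality is left out of the sketch.

```python
import statistics


def aggregate(kind, values):
    """Reference semantics only; None marks a missing entry (an assumption)."""
    valid = [v for v in values if v is not None]
    ops = {
        "sum": sum,
        "min": min,
        "max": max,
        "avg": statistics.mean,
        "median": statistics.median,
        "count": len,
    }
    return ops[kind](valid)


# Hypothetical "budget" values of the candidates that passed the filters.
budgets = [100, 40, None, 70]

summary = {
    kind: aggregate(kind, budgets)
    for kind in ("sum", "min", "max", "avg", "median", "count")
}
print(summary)
```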
Date Histogram
Hyperspace can segment aggregation results into a histogram by date, using the function date_histogram(str agg_name, str fieldname, str time_interval).
The aggregation result will be saved under the same key as a standard aggregation. However, the results will be segmented as a histogram with resolution determined by time_interval. The available units for time_interval are s/m/h/d.
Example:
In this example, the aggregation results will be binned into a histogram where the width of each bin is 1d.
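The binning itself can be illustrated in plain Python. The sketch below is not the Hyperspace implementation; it assumes that the date field holds epoch seconds, that the units s/m/h/d stand for seconds, minutes, hours, and days, and that each bin simply counts its candidates. The helper names parse_interval, date_histogram_counts, and epoch are hypothetical.

```python
from collections import Counter
from datetime import datetime, timezone

# Assumed meaning of the interval units (seconds per unit).
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_interval(time_interval):
    """Turn a string such as '1d' or '12h' into a bin width in seconds."""
    count, unit = int(time_interval[:-1]), time_interval[-1]
    return count * UNIT_SECONDS[unit]


def date_histogram_counts(timestamps, time_interval):
    """Count timestamps (epoch seconds) per bin; keys are bin start times."""
    width = parse_interval(time_interval)
    bins = Counter((ts // width) * width for ts in timestamps)
    return dict(sorted(bins.items()))


def epoch(year, month, day, hour=0):
    """Epoch seconds for a UTC date and hour."""
    return int(datetime(year, month, day, hour, tzinfo=timezone.utc).timestamp())


# Hypothetical date values of three candidates.
release_times = [epoch(2024, 1, 1, 3), epoch(2024, 1, 1, 22), epoch(2024, 1, 2, 9)]

histogram = date_histogram_counts(release_times, "1d")
print(histogram)
```

With a width of 1d, the two candidates dated 1 January fall into one bin and the candidate dated 2 January into the next.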
In score functions, only the outer "if" condition generates candidates (if match('genres') in the example), while the inner "if" conditions only change their score. By contrast, when using aggregations, all "if" conditions have the same effect: each one, outer or inner, narrows the set of candidates over which the following aggregations are computed.