# Candidate Score

Hyperspace supports various methods of scoring and score arithmetic, based on the rarity of keywords in the collection.

The rarity `score` can be calculated for matched keywords. Hyperspace calculates this score over keywords or lists of keywords, using the TF-IDF formula. Two usages are currently supported:

- `rarity_max(str fieldname)` returns the maximum rarity out of all the keywords in the list,
- `rarity_sum(str fieldname)` returns the sum of the rarities of all the keywords in the list.

For plain keyword fields (not lists), the two functions return the same result.

**Example:**

    score = rarity_max("cities") + rarity_sum("streets")
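
To make the difference between the two rarity functions concrete, the sketch below computes a rarity per keyword on a toy collection. It assumes that rarity is the inverse document frequency, log(N / df); the exact TF-IDF variant Hyperspace uses is not stated here. `COLLECTION`, `rarity`, and the local `rarity_max` / `rarity_sum` helpers are hypothetical stand-ins that take the matched keywords explicitly, unlike the real functions, which take only a field name.

```python
import math

# Toy collection: each document holds a list of keywords in its "cities" field.
COLLECTION = [
    {"cities": ["paris", "london"]},
    {"cities": ["paris"]},
    {"cities": ["paris", "tokyo"]},
    {"cities": ["london"]},
]

def rarity(keyword, field):
    """Assumed rarity: inverse document frequency, log(N / df)."""
    n_docs = len(COLLECTION)
    doc_freq = sum(1 for doc in COLLECTION if keyword in doc[field])
    return math.log(n_docs / doc_freq) if doc_freq else 0.0

def rarity_max(matched_keywords, field):
    """Largest rarity among the matched keywords (hypothetical stand-in)."""
    return max((rarity(k, field) for k in matched_keywords), default=0.0)

def rarity_sum(matched_keywords, field):
    """Sum of the rarities of the matched keywords (hypothetical stand-in)."""
    return sum(rarity(k, field) for k in matched_keywords)

matched = ["paris", "tokyo"]
print(rarity_max(matched, "cities"))  # rarity of "tokyo", the rarer keyword
print(rarity_sum(matched, "cities"))  # rarity of "paris" plus rarity of "tokyo"

# With a single keyword, both functions agree.
print(rarity_max(["tokyo"], "cities") == rarity_sum(["tokyo"], "cities"))
```

In this toy data "tokyo" appears in one of four documents and "paris" in three, so `rarity_max` is driven entirely by "tokyo" while `rarity_sum` also credits the more common "paris".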

Hyperspace supports several methods of score arithmetic, described below:

- Sum
- Max
- Arithmetic operations

The `sum` function receives n scores (results of score functions) and returns their sum.

**Syntax**

`sum(float score1, float score2, ...)`

**Example**

    score1 = rarity_max("city")
    score2 = rarity_max("Country")
    score3 = rarity_max("Continent")
    score4 = .....
    score_sum = sum(score1, score2, score3...)

**Where**

- score1, score2, score3 are the results of score functions.
- score_sum is the sum of these scores.

The `max` function receives n scores (results of score functions) and returns the maximum of their values.

**Syntax**

`max(float score1, float score2, ...)`

**Example**

    score1 = rarity_max("city")
    score2 = rarity_max("Country")
    score3 = rarity_max("Continent")
    score4 = .....
    score_max = max(score1, score2, score3...)

**Where**

- score1, score2, score3 are the results of score functions.
- score_max is the maximum of these scores.
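
The two aggregation functions above can be mimicked in plain Python for experimentation. Note that Python's built-in `sum` takes a single iterable, so a call written as `sum(score1, score2, score3)` will not run outside Hyperspace. The helper names `sum_scores` and `max_scores` and the score values are hypothetical, chosen only for illustration.

```python
def sum_scores(*scores):
    """Variadic sum, mirroring the documented sum(score1, score2, ...)."""
    return float(sum(scores))

def max_scores(*scores):
    """Variadic max, mirroring the documented max(score1, score2, ...)."""
    return float(max(scores))

# Made-up values standing in for the results of rarity_max(...) calls.
score1, score2, score3 = 1.5, 0.25, 3.0

score_sum = sum_scores(score1, score2, score3)
score_max = max_scores(score1, score2, score3)
print(score_sum, score_max)  # 4.75 3.0
```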

`rarity_sum` and `rarity_max` can return different scores only for fields of type list[keyword]. In particular, when used to match fields of type keyword, they always return the same score.

Hyperspace allows arithmetic operations between scores, using the operators `+`, `-`, `*` and `/`. These operators can be combined with the operator `=` to form compound assignments such as `+=` and `-=`.

**Example**

    score0 = 0.0
    if (match("field 1") or match("field2") or match("Expiration date")):
        score0 += rarity_max("visit_times_in_personal care")
        score0 -= rarity_sum("Credit card")
    score1 = 2 * score0

**Where**

- score0 accumulates the results of score functions.
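
The arithmetic in the example above can be traced with a small self-contained sketch. The functions `match`, `rarity_max` and `rarity_sum` are replaced here by hypothetical stubs backed by made-up dictionaries, so the numbers only illustrate how the operators combine and do not reflect real Hyperspace scores.

```python
# Hypothetical stand-ins: which fields matched, and made-up rarities per field.
MATCHED_FIELDS = {"field2"}
RARITY_MAX = {"visit_times_in_personal care": 2.0}
RARITY_SUM = {"Credit card": 0.5}

def match(fieldname):
    """Stub: True if the field is in the set of matched fields."""
    return fieldname in MATCHED_FIELDS

def rarity_max(fieldname):
    """Stub: look up a precomputed maximum rarity for the field."""
    return RARITY_MAX.get(fieldname, 0.0)

def rarity_sum(fieldname):
    """Stub: look up a precomputed rarity sum for the field."""
    return RARITY_SUM.get(fieldname, 0.0)

score0 = 0.0
if match("field 1") or match("field2") or match("Expiration date"):
    score0 += rarity_max("visit_times_in_personal care")  # score0 is now 2.0
    score0 -= rarity_sum("Credit card")                   # score0 is now 1.5
score1 = 2 * score0
print(score1)  # 3.0
```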

Hyperspace allows the KNN vector score to be included in the lexical score function, by using the function `distance(str vector_fieldname1, str vector_fieldname2, r32 min_score)`. The `distance()` function calculates the KNN score based on the metric defined in the data configuration schema file. It then returns the score if it is above the threshold `min_score`, or 0 otherwise. `min_score` can be a dynamic value, provided as part of the query params. By default, `vector_fieldname2 = vector_fieldname1` and `min_score = 0`.
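
The thresholding behaviour described above can be sketched in plain Python. This is only an illustration under stated assumptions: cosine similarity is assumed as the metric (the real metric comes from the data configuration schema file), and the explicit `params` and `doc` arguments are hypothetical, since the real `distance()` takes only field names and a threshold.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def distance(params, doc, vector_fieldname1, vector_fieldname2=None, min_score=0.0):
    """Return the KNN score if it is above min_score, otherwise 0."""
    if vector_fieldname2 is None:
        # Documented default: compare the same field in the query and the document.
        vector_fieldname2 = vector_fieldname1
    knn_score = cosine_similarity(params[vector_fieldname1], doc[vector_fieldname2])
    return knn_score if knn_score > min_score else 0.0

# Made-up query and document vectors.
params = {"tagline_embedding": [1.0, 0.0]}
doc = {"tagline_embedding": [1.0, 0.0], "overview_embedding": [0.0, 1.0]}

print(distance(params, doc, "tagline_embedding", min_score=0.2))              # 1.0
print(distance(params, doc, "tagline_embedding", "overview_embedding", 0.2))  # 0.0
```

The second call compares orthogonal vectors, so the similarity falls below the threshold and the function returns 0 instead of the raw score.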

The `distance` function can only be used as part of the last `return` statement. In addition, all other `return` statements must return `0`, `False` or `None`. For example:

**Example 1**:

    def score_function(params, doc):
        # Early exits must return 0, False or None.
        if match("genre"):
            return
        elif match("countries"):
            return False
        score1 = rarity_max("tags")
        if score1 < 1:
            return 0
        # distance() may appear only in the last return statement.
        return score1 + 0.3 * distance("tagline_embedding", 0.2)

In the above example, `distance` calculates the KNN score between `params["tagline_embedding"]` and `doc["tagline_embedding"]`. If the score is above 0.2, the function returns score1 + 0.3 * knn_score. Otherwise it returns score1.

**Example 2**:

    def score_function(params, doc):
        score1 = rarity_max("tags")
        return score1 + distance("tagline_embedding", "overview_embedding", params["min_score"])

In the above example, `distance` calculates the KNN score between `params["tagline_embedding"]` and `doc["overview_embedding"]`. If the score is above `params["min_score"]`, the function returns score1 + distance. Otherwise it returns score1.