Candidate Score

Hyperspace supports several methods for scoring and score arithmetic, including scores based on the rarity of keywords in the collection.

Constant

You can assign constant scores using standard assignment in the score function.

Example:

if match("cities"):
    score = 5

Rarity Score (TF-IDF)

You can calculate a rarity score for the keywords matched between two lists of keywords. Hyperspace uses the TF-IDF formula for the score.

Two functions are currently available:

  • rarity_max(str fieldname) returns the maximum rarity of all the keywords in the list.

  • rarity_sum(str fieldname) returns the sum of the rarities of all the keywords in the list.

For keyword fields (not lists), the two functions return the same result.

Example:

score = rarity_max("cities") + rarity_sum("streets")

rarity_sum and rarity_max return different scores only for fields of type list[keywords]. For fields of type keyword, they always return the same score.
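The difference between the two functions can be tried out in plain Python, outside Hyperspace. The sketch below is an illustration only and not Hyperspace's implementation: the text above says only that TF-IDF is used, so the textbook inverse document frequency, log(N / df), is assumed as the rarity of a keyword. The helper names (idf, matched_rarities, rarity_max_sketch, rarity_sum_sketch) and the toy collection are hypothetical.

```python
import math

# Toy collection: each entry is the keyword list of one document.
# Assumption: rarity is modelled as the textbook IDF, log(N / df).
COLLECTION = [
    ["paris", "london"],
    ["paris", "berlin"],
    ["paris"],
    ["london", "oslo"],
]


def idf(keyword, collection):
    """Inverse document frequency of a keyword (0.0 if it never occurs)."""
    df = sum(1 for doc in collection if keyword in doc)
    return math.log(len(collection) / df) if df else 0.0


def matched_rarities(query_keywords, doc_keywords, collection):
    """Rarity of every keyword present in both the query and the document."""
    matched = set(query_keywords) & set(doc_keywords)
    return [idf(keyword, collection) for keyword in matched]


def rarity_max_sketch(query_keywords, doc_keywords, collection):
    """Largest rarity among the matched keywords (0.0 if nothing matched)."""
    return max(matched_rarities(query_keywords, doc_keywords, collection), default=0.0)


def rarity_sum_sketch(query_keywords, doc_keywords, collection):
    """Sum of the rarities of all matched keywords."""
    return sum(matched_rarities(query_keywords, doc_keywords, collection))


query = ["paris", "oslo"]
doc = ["paris", "oslo", "london"]

# Two keywords match, so the two functions differ.
print(rarity_max_sketch(query, doc, COLLECTION))
print(rarity_sum_sketch(query, doc, COLLECTION))

# With a single keyword on each side, both functions agree.
print(rarity_max_sketch(["oslo"], ["oslo"], COLLECTION))
print(rarity_sum_sketch(["oslo"], ["oslo"], COLLECTION))
```

In this toy collection "oslo" occurs in one document out of four and "paris" in three, so "oslo" is the rarer keyword and determines the maximum, while the sum adds the smaller contribution of "paris" on top.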

Score Operations

Hyperspace allows multiple methods for score arithmetic:

  • Sum

  • Max

  • Arithmetic operations

Sum of Scores

The function receives n scores (outputs of functions such as rarity_sum()) and returns their sum.

Syntax

sum(float score1, float score2, ...)

Example

score1 = rarity_max("city")
score2 = rarity_max("Country")
score3 = rarity_sum("Continent")
score4 = .....
score_sum = sum(score1, score2, score3...)

Where:

  • score1, score2, score3 are the results of a score function.

  • score_sum is the sum of score1, score2, score3...

Max of Scores

The function receives n scores (outputs of functions such as rarity_sum()) and returns the maximum of their values.

Syntax

max(float score1, float score2, ...)

Example

score1 = rarity_max("city")
score2 = rarity_max("Country")
score3 = rarity_sum("Continent")
score4 = .....
score_max = max(score1, score2, score3...)

Where:

  • score1, score2, score3 are the results of a score function.

  • score_max is the maximum of score1, score2, score3...
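To compare the two operations on the same inputs outside Hyperspace, the following plain-Python sketch may help. It assumes the variadic call form shown in the syntax lines above. Python's built-in sum() takes a single iterable rather than separate arguments, so the hypothetical helpers score_sum and score_max wrap the built-ins. The score values are made up and stand in for the outputs of rarity functions.

```python
def score_sum(*scores: float) -> float:
    """Variadic stand-in for sum(score1, score2, ...)."""
    return float(sum(scores))


def score_max(*scores: float) -> float:
    """Variadic stand-in for max(score1, score2, ...)."""
    return float(max(scores))


# Made-up values standing in for rarity_max("city"),
# rarity_max("Country") and rarity_sum("Continent").
score1, score2, score3 = 1.5, 0.25, 3.0

total = score_sum(score1, score2, score3)  # every field contributes
best = score_max(score1, score2, score3)   # only the strongest field counts

print(total)  # 4.75
print(best)   # 3.0
```

Summing rewards candidates that match on many fields, while taking the maximum ranks a candidate by its single strongest match.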

Arithmetic Operators

Hyperspace allows arithmetic operations between scores using the operators +, -, * and /. These operators can be combined with the assignment operator =, as in += and -=.

Example

score0 = 0.0
if match("field 1") or match("field2") or match("Expiration date"):
    score0 += rarity_max("visit_times_in_personal care")
    score0 -= rarity_sum("Credit card")
score1 = 2 * score0

Where:

  • score0 starts at 0.0 and accumulates the results of the score functions.

  • score1 is score0 multiplied by 2.

Vector Distance

You can include the vector search score in the score function, by using the function distance(str vector_fieldname1, str vector_fieldname2, float min_score).

distance() returns the KNN score if the score is above min_score, or 0 otherwise. The score is computed according to the metric defined in the data configuration schema file. min_score can be a dynamic value defined in the query params.

The function operates on params[vector_fieldname1] and doc[vector_fieldname2].

By default, vector_fieldname2 = vector_fieldname1 and min_score = 0.
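The thresholding and default behavior described above can be sketched in plain Python. This is an illustration under stated assumptions and not Hyperspace's implementation: cosine similarity is assumed as the metric (the real metric comes from the data configuration schema file), params and doc are passed explicitly as dictionaries, and the names cosine_similarity and distance_sketch are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def distance_sketch(params, doc, vector_fieldname1, vector_fieldname2=None, min_score=0.0):
    """Return the similarity if it is above min_score, otherwise 0.0."""
    # Default described above: compare the same field on both sides.
    if vector_fieldname2 is None:
        vector_fieldname2 = vector_fieldname1
    score = cosine_similarity(params[vector_fieldname1], doc[vector_fieldname2])
    # Assumption: "above" is read as strictly greater than min_score.
    return score if score > min_score else 0.0


params = {"tagline_embedding": [1.0, 0.0]}
doc = {"tagline_embedding": [1.0, 0.0], "overview_embedding": [1.0, 1.0]}

# Same field on both sides, default threshold of 0.
print(distance_sketch(params, doc, "tagline_embedding"))

# Different document field with a threshold the similarity does not reach.
print(distance_sketch(params, doc, "tagline_embedding", "overview_embedding", 0.9))
```

The first call compares identical vectors and returns their similarity. The second compares vectors at a 45 degree angle, whose cosine similarity is below 0.9, so the sketch returns 0.0.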

Limitations

  • The distance function can only be used as part of the last return statement.

  • All other return statements must return 0, False or None (a bare return also returns None).

Example 1:

def score_function(params, doc):
    if match("genre"):
        return  # a bare return yields None
    elif match("countries"):
        return False
    score1 = rarity_max("tags")
    if score1 < 1:
        return 0
    # distance() appears only in the last return statement
    return score1 + 0.3 * distance("tagline_embedding", 0.2)

In the above example, distance() calculates the KNN score between params["tagline_embedding"] and doc["tagline_embedding"]. If the score is above 0.2, the function returns score1 + 0.3 * knn_score. Otherwise, it returns score1.

Example 2:

def score_function(params, doc):
    score1 = rarity_max("tags")
    return score1 + distance("tagline_embedding", "overview_embedding", params["min_score"])

In the above example, the distance() function returns the KNN score between params["tagline_embedding"] and doc["overview_embedding"]. If the score is below params["min_score"], it returns 0.
