# Candidate Generation

The Hyperspace score function generates candidates by filtering the database. Hyperspace supports several filtering methods, both range-match based and keyword-match based.

Filtering is performed only at the outermost (external) level of the condition stack; that is, only the outermost `if` conditions affect the candidate list. The final candidate list can then be refined using additional filters and scores.

As an example, only the `if match('genres')` condition in the following query creates candidates, while the nested `if match('countries')` condition can modify a candidate's score but does not change the overall candidate list.

```python
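def score_function_recommendation(params, doc):
    score = 0.0
    # Outermost condition: determines which documents become candidates
    if match('genres'):
        score += rarity_sum('genres')
        # Nested condition: only adjusts the score of existing candidates
        if match('countries'):
            score += rarity_sum('countries')
    return score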
```
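To make the distinction concrete, the following is a minimal pure-Python sketch that mimics this behavior outside Hyperspace. The helpers `keywords_overlap` and `simulate` are hypothetical, documents are plain dictionaries, and the rarity scores are replaced by a constant placeholder of 1.0; none of this is the Hyperspace engine or its API.

```python
def keywords_overlap(query, doc, field):
    """Hypothetical stand-in for match(): True if the keyword lists share a value."""
    return bool(set(query.get(field, [])) & set(doc.get(field, [])))


def simulate(query, docs):
    """Return {doc_id: score} for documents that pass the outermost condition."""
    candidates = {}
    for doc in docs:
        # Outermost condition: decides whether the document is a candidate at all.
        if not keywords_overlap(query, doc, "genres"):
            continue
        score = 1.0  # placeholder for rarity_sum('genres')
        # Nested condition: only adjusts the score of an existing candidate.
        if keywords_overlap(query, doc, "countries"):
            score += 1.0  # placeholder for rarity_sum('countries')
        candidates[doc["id"]] = score
    return candidates


query = {"genres": ["drama"], "countries": ["FR"]}
docs = [
    {"id": 1, "genres": ["drama"], "countries": ["FR"]},
    {"id": 2, "genres": ["drama"], "countries": ["US"]},
    {"id": 3, "genres": ["comedy"], "countries": ["FR"]},
]
result = simulate(query, docs)
print(result)  # {1: 2.0, 2: 1.0}
```

Document 3 shares a country with the query but not a genre, so it never enters the candidate list; document 2 is a candidate with a lower score because only the outer condition holds.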

Hyperspace candidate filtering can be performed using several methods:

* Exact Match Between Keywords
* Window Match Between Dates
* Match Between Geo Coordinates

## Exact Match Between Keywords

Exact keyword matching can be performed using the function `match(str fieldname)`. The function operates on either keywords or lists of keywords. For single keywords, it returns True when the two keywords match exactly; for lists of keywords, it returns True when any keyword in one list exactly matches any keyword in the other. Hyperspace allows two forms of matching:

* Match between a field in the query and the same field in the database documents
* Match between a field in the query and a different field in the database documents

**Example**:

```python
if match("city", "shipping_city") and match("street"):
    pass
```

In the above example:

* The field '<mark style="color:purple;">street</mark>' is compared between the query and each document. If the field includes a matching value, the corresponding match function will return true.
* The field '<mark style="color:purple;">city</mark>' in the query is compared with the field '<mark style="color:purple;">shipping\_city</mark>' in the database documents. If there is a matching value, the corresponding match function will return true.
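The two forms of matching can be illustrated with a small pure-Python emulation. This is only a sketch of the semantics described above: `match_sketch` and `as_keyword_set` are hypothetical names, `params` and `doc` are plain dictionaries, and the treatment of missing fields is an assumption rather than documented Hyperspace behavior.

```python
def as_keyword_set(value):
    """Treat a single keyword as a one-element set; lists become sets."""
    if isinstance(value, (list, tuple, set)):
        return set(value)
    return {value}


def match_sketch(params, doc, query_field, doc_field=None):
    """Hypothetical emulation of match(): True if any keyword is shared."""
    # With one field name, the same field is used on both sides.
    doc_field = query_field if doc_field is None else doc_field
    if query_field not in params or doc_field not in doc:
        return False  # assumption: a missing field never matches
    return bool(as_keyword_set(params[query_field]) & as_keyword_set(doc[doc_field]))


params = {"city": "Paris", "street": ["Rue Cler", "Rue de Rivoli"]}
doc = {"shipping_city": "Paris", "street": ["Rue de Rivoli"]}

same_field = match_sketch(params, doc, "street")                  # True
cross_field = match_sketch(params, doc, "city", "shipping_city")  # True
no_match = match_sketch(params, {"street": "Avenue Foch"}, "street")  # False
```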

## Window Match Between Dates

Window matching between dates can be performed using the function `window_match(str fieldname, unsigned int dt0, unsigned int dt1)`.

The function compares the dates `V[fieldname] - dt0` and `V[fieldname] - dt1` to `Q[fieldname]`.

In other words, the function operates on integer fields and returns **True** if `V[fieldname] - dt0 < Q[fieldname] < V[fieldname] - dt1`, and **False** otherwise.

**Where:**

* <mark style="color:purple;">Q</mark> is the query document value
* <mark style="color:purple;">V</mark> is the candidate vector
* <mark style="color:purple;">dt0</mark> and <mark style="color:purple;">dt1</mark> state the range of the window to match. Both must include units (s/m/h/d), and <mark style="color:purple;">dt0</mark> must be larger than <mark style="color:purple;">dt1</mark> for the window to be non-empty.

**Example**:

```python
if window_match("Arrival_times", "3d", "2d"):
    pass
```

The <mark style="color:purple;">window\_match</mark> condition will return True if

`doc[fieldname] - 3d < params[fieldname] < doc[fieldname] - 2d`

For example, if `params[fieldname] = 1698225495`, which is equivalent to GMT October 25, 2023 9:18:15 AM, and `doc[fieldname] = 1698311895`, which is equivalent to GMT October 26, 2023 9:18:15 AM, then `params[fieldname] > doc[fieldname] - 2d` and <mark style="color:purple;">window\_match</mark> will return **False**.
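The inequality and the worked example above can be reproduced with a short pure-Python sketch. The names `to_seconds` and `window_match_sketch` are hypothetical, the timestamps are assumed to be Unix epoch seconds, and the unit parsing is a simplification; this illustrates the arithmetic only and is not the Hyperspace implementation.

```python
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def to_seconds(offset):
    """Convert an offset such as '3d' or '12h' to seconds."""
    return int(offset[:-1]) * UNIT_SECONDS[offset[-1]]


def window_match_sketch(params, doc, fieldname, dt0, dt1):
    """Hypothetical emulation: doc - dt0 < params < doc - dt1 (strict bounds)."""
    lower = doc[fieldname] - to_seconds(dt0)
    upper = doc[fieldname] - to_seconds(dt1)
    return lower < params[fieldname] < upper


doc = {"Arrival_times": 1698311895}  # GMT October 26, 2023 9:18:15 AM

# One day before the document value: outside the 3d..2d window.
one_day_before = {"Arrival_times": 1698225495}
# Two and a half days (60 hours) before the document value: inside the window.
inside_window = {"Arrival_times": 1698311895 - to_seconds("60h")}

print(window_match_sketch(one_day_before, doc, "Arrival_times", "3d", "2d"))  # False
print(window_match_sketch(inside_window, doc, "Arrival_times", "3d", "2d"))   # True
```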

## Match Between Geo Coordinates

Geographical coordinates can be compared using the function `geo_dist_match(str fieldname, float thresh)`.

The function returns **True** if the distance between the coordinates is below the threshold, and **False** otherwise.

**Example:**

```python
if geo_dist_match("geolocation", 45.02):
    pass
```
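The threshold comparison can be pictured with the sketch below. It rests on several assumptions that this page does not confirm: coordinates are stored as `(latitude, longitude)` pairs in degrees, the distance is the great-circle (haversine) distance, and the threshold is in kilometers. The function names are hypothetical and the city coordinates are approximate.

```python
import math


def haversine_km(a, b):
    """Great-circle distance in kilometers between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geo_dist_match_sketch(params, doc, fieldname, thresh):
    """Hypothetical emulation: True if the distance is below the threshold."""
    return haversine_km(params[fieldname], doc[fieldname]) < thresh


params = {"geolocation": (48.8566, 2.3522)}  # Paris
near = {"geolocation": (48.8049, 2.1204)}    # Versailles, well under 45 km away
far = {"geolocation": (51.5074, -0.1278)}    # London, several hundred km away

print(geo_dist_match_sketch(params, near, "geolocation", 45.02))  # True
print(geo_dist_match_sketch(params, far, "geolocation", 45.02))   # False
```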

## Filtering Based on Document Fields Values

The input query values and database document values can be accessed using the syntax `params[fieldname]` and `doc[fieldname]`, respectively. The retrieved values can then be used as part of the score function.

**Example:**

```python
def score_function_recommendation(params, doc):
    score = 0.0
    if match('genres') and doc['budget'] > 10000000:
        score += rarity_sum('genres')
    return score
```

## Filtering Based on KNN Score

Filtering based on the distance between vectors can be performed using the function `knn_filter(str vector_fieldname1, str vector_fieldname2, r32 min_score)`.

`knn_filter()` operates on one or two vector fields and calculates the KNN score based on the metric defined in the data configuration schema file. It then returns 1 if the score is above `min_score`, or 0 otherwise. `min_score` can be a dynamic value included in the query params.

By default, `vector_fieldname2 = vector_fieldname1` and `min_score = 0`.
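The thresholding and default behavior described above can be sketched in plain Python. Cosine similarity is used here purely as an assumed metric (the actual metric comes from the data configuration schema file), and `knn_filter_sketch` and `cosine_similarity` are hypothetical names. Unlike the real function, the sketch takes `min_score` as a keyword argument when only one field is given.

```python
import math


def cosine_similarity(u, v):
    """Assumed metric for this sketch; the real metric comes from the schema."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def knn_filter_sketch(params, doc, field1, field2=None, min_score=0.0):
    """Hypothetical emulation: 1 if the similarity exceeds min_score, else 0."""
    field2 = field1 if field2 is None else field2  # default: same field on both sides
    return 1 if cosine_similarity(params[field1], doc[field2]) > min_score else 0


params = {"tagline_embedding": [1.0, 0.0], "min_score": 0.9}
doc = {"tagline_embedding": [0.6, 0.8], "overview_embedding": [1.0, 0.0]}

# Same field on both sides, fixed threshold: similarity 0.6 > 0.2
print(knn_filter_sketch(params, doc, "tagline_embedding", min_score=0.2))  # 1
# Different document field, threshold taken from the query params: 1.0 > 0.9
print(knn_filter_sketch(params, doc, "tagline_embedding",
                        "overview_embedding", params["min_score"]))        # 1
# Same field, threshold from the query params: 0.6 is not above 0.9
print(knn_filter_sketch(params, doc, "tagline_embedding",
                        min_score=params["min_score"]))                    # 0
```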

### Limitations

`knn_filter()` can only be used in the last `return` statement.

All other `return` statements must return `0`, `False`, or `None`. For example:

**Example 1**:

```python
def score_function(params, doc):
    if match("genre"):
        return  # returns None
    elif match("countries"):
        return False
    score1 = rarity_max("tags")
    if score1 < 1:
        return 0
    # knn_filter() appears only in the last return statement
    return score1 + 0.3 * knn_filter("tagline_embedding", 0.2)
```

In the above example, `knn_filter` calculates the KNN score between `params["tagline_embedding"]` and `doc["tagline_embedding"]`. If the score is above 0.2, the function will return `score1 + 0.3`. Otherwise it will return `score1`.

**Example 2**:

```python
def score_function(params, doc):
    score1 = rarity_max("tags")
    return score1 * knn_filter("tagline_embedding", "overview_embedding", params["min_score"])
```

In the above example, `knn_filter` calculates the KNN score between `params["tagline_embedding"]` and `doc["overview_embedding"]`. If the score is above `params["min_score"]`, it will return `score1`. Otherwise it will return 0.

