Building a Hybrid Search Query
The following describes how to build and run a Hybrid Search query. A Hybrid Search performs both a Classic Search and a Vector Search. It then applies a multiplier (weight) to the score of each resulting match and retrieves the documents with the highest overall scores.
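The weighting idea can be illustrated with a small standalone sketch. It is not the product's implementation: the function name hybrid_top_k, the document ids, and the choice to combine the two weighted scores by summing them are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def hybrid_top_k(
    lexical_scores: Dict[str, float],
    vector_scores: Dict[str, float],
    lexical_weight: float = 1.0,
    vector_weight: float = 1.0,
    size: int = 10,
) -> List[Tuple[str, float]]:
    """Combine per-document scores with weights and return the best `size` hits.

    Assumption: the overall score is the weighted sum of the two components,
    and a document missing from one search contributes 0.0 for that component.
    """
    doc_ids = set(lexical_scores) | set(vector_scores)
    combined = {
        doc_id: lexical_weight * lexical_scores.get(doc_id, 0.0)
        + vector_weight * vector_scores.get(doc_id, 0.0)
        for doc_id in doc_ids
    }
    # Sort by descending score; break ties by document id for determinism.
    ranked = sorted(combined.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:size]


# Hypothetical scores returned by the two searches.
lexical = {"doc_a": 2.0, "doc_b": 1.0}
vector = {"doc_b": 0.5, "doc_c": 1.0}

top = hybrid_top_k(lexical, vector, lexical_weight=0.5, vector_weight=2.0, size=2)
print(top)
```

With these weights, a document found only by the vector search (doc_c) can outrank one found only by the lexical search (doc_a), which is the effect the multipliers are meant to control.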
To build a hybrid search query –
Define the Hybrid Search query schema by specifying the following –
Running the Hybrid Search Query
Copy the following code snippet to run the Hybrid Search query –
Where–
lexical_query_schema – Specifies the document for similarity search and the multiplier of the return score, as described in Step 3, Defining the Classic Query Schema.
size – Specifies the number of results to return.
function_name – Specifies the scoring function to be used in the Classic Search query as described in Step 1, Creating the Scoring Function.
collection_name – Specifies the Collection in which to search.
Assigning Score Weights
By default, all query components, both vector and lexical, are assigned a weight of 1.0. To change the weights, add a key named "knn" that contains a "query" key for the lexical search and a key named after each vector field ('vector_field_1' in the example) for the vector search.
In the above example, the vector_field_1 score will be multiplied by 0.6 in the overall score and the lexical query score by 0.05.
All fields of type dense_vector in data_point are included in the vector search. Unless a 'boost' key is specified for a field under the 'knn' key, that field's weight takes the default value of 1.0.
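The default-weight rule can be sketched as follows. This is a standalone illustration under stated assumptions, not the product's API: the exact nesting of the "knn" dictionary is inferred from the description above, resolve_weights is a hypothetical helper, 'vector_field_2' is a hypothetical second dense_vector field, and the component scores are made-up values combined by a weighted sum.

```python
from typing import Dict, List

DEFAULT_WEIGHT = 1.0  # weight used when no 'boost' key is given


def resolve_weights(
    knn_config: Dict[str, Dict[str, float]], vector_fields: List[str]
) -> Dict[str, float]:
    """Return the effective weight of the lexical query and of every vector field."""
    weights = {}
    for component in ["query", *vector_fields]:
        entry = knn_config.get(component, {})
        weights[component] = entry.get("boost", DEFAULT_WEIGHT)
    return weights


# Assumed layout of the "knn" key: one entry per component, each with a 'boost'.
knn_config = {
    "query": {"boost": 0.05},           # lexical search weight
    "vector_field_1": {"boost": 0.6},   # vector search weight
}

# 'vector_field_2' has no 'boost' entry, so it falls back to 1.0.
weights = resolve_weights(knn_config, ["vector_field_1", "vector_field_2"])

# Hypothetical per-component scores for a single document.
component_scores = {"query": 10.0, "vector_field_1": 0.5, "vector_field_2": 0.25}
overall = sum(weights[name] * score for name, score in component_scores.items())
print(weights, overall)
```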
Building a Hybrid Score Function
The KNN score can be included in the lexical score function. To do that, use the function distance('vector_field_1') or knn_filter('vector_field_1', min_score=params['min_score']). The distance function returns the KNN distance, while the knn_filter function returns 1 if the KNN score is above min_score and 0 otherwise. Both functions can only be used in the last return statement.
In the above example, the score function will return score0 if the KNN score of 'vector_field_1' is above 0.3, and zero otherwise.
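The gating behavior described here can be simulated in plain Python. The knn_filter below is a local stand-in written for this sketch, not the built-in function available inside a score function: it takes the KNN scores as an explicit dictionary, which is an assumption, and the values of score0 and the KNN scores are made up.

```python
from typing import Dict


def knn_filter(field: str, min_score: float, knn_scores: Dict[str, float]) -> int:
    """Stand-in for the built-in: 1 if the field's KNN score is above min_score, else 0."""
    return 1 if knn_scores[field] > min_score else 0


def score_function(
    score0: float, params: Dict[str, float], knn_scores: Dict[str, float]
) -> float:
    """Return the lexical score only when the KNN score passes the threshold."""
    # The filter is applied in the final return statement, mirroring the rule above.
    return score0 * knn_filter(
        "vector_field_1", min_score=params["min_score"], knn_scores=knn_scores
    )


params = {"min_score": 0.3}

kept = score_function(2.5, params, {"vector_field_1": 0.45})     # KNN score above 0.3
dropped = score_function(2.5, params, {"vector_field_1": 0.2})   # KNN score below 0.3
print(kept, dropped)
```

Multiplying by the filter keeps the lexical score unchanged for documents that pass the threshold and zeroes it for the rest, so the KNN score acts as a gate rather than as a contribution to the ranking.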