Building a Vector Search Query

The following describes how to build and run a Vector Search query. A Vector Search finds the documents in a Collection that are most similar to a document that you provide, by measuring the proximity between vectors rather than relying on traditional keyword matching or exact matches. Each document is assigned a numerical score according to the vector scoring method that you specified in the metric field, such as Hamming. The search returns the identifiers of the documents with the highest scores. Optionally, a multiplier (weight) can be assigned to the score values.

Note – When data is uploaded into a Collection, a 'dense_vector' value assigned to the 'type' attribute of a field signifies that the field is suitable for a Vector Search. Only data that has been uploaded in this manner is matched in a Vector Search, as described in Creating a Database Schema Configuration File.

Defining the Vector Query Schema

Define the Vector Search query schema by specifying the following –

vector_query_schema = {
    'params': data_point
}

Where

'params' – Specifies the document for which the query searches for matches.

All fields of type 'dense_vector' in 'params' are included in the search. For example, if 'params' includes two fields of type 'dense_vector', the query is performed on both fields and the score is the sum of the two KNN scores.
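As an illustration, a query document with two vector fields might be built as follows. This is only a minimal sketch: the field names 'title_embedding' and 'image_embedding' and the vector values are hypothetical placeholders, and they would need to match fields declared as 'dense_vector' in your own schema.

```python
# Hypothetical query document: field names and values are placeholders.
data_point = {
    'title_embedding': [0.12, 0.48, 0.91],   # assumed 'dense_vector' field
    'image_embedding': [0.33, 0.07, 0.65],   # assumed 'dense_vector' field
}

vector_query_schema = {
    'params': data_point
}

# List the fields that carry vectors in this example document.
# Both would take part in the search, and their KNN scores would be summed.
vector_fields = [name for name, value in data_point.items()
                 if isinstance(value, list)]
```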

Assigning Score Weights

Define the weighted Vector Search query schema by specifying the following –

vector_query_schema = {
    'params': data_point,
    'knn': {
        'vector_field_name_1': {'boost': 0.5},
        'vector_field_name_2': {'boost': 0.5}
    }
}

Where

  • 'vector_field_name_1' – The name of the first vector field to be given a weight.

  • 'vector_field_name_2' – The name of the second vector field to be given a weight.

  • 'boost' – The weight applied to the score of that vector field, which is 0.5 in the example. The default value is 1.0.

In the above example, the query performs two KNN calculations and returns the weighted sum of the scores: score = 0.5 * score(vector_field_name_1) + 0.5 * score(vector_field_name_2)
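The weighted sum can be reproduced by hand to check what a given set of 'boost' values does to the final score. The sketch below is plain Python, not part of the client library: the helper weighted_score and the per-field scores are hypothetical and only mirror the formula given here, including the default weight of 1.0.

```python
def weighted_score(field_scores, knn):
    """Combine per-field KNN scores using the 'boost' weights from 'knn'."""
    total = 0.0
    for field, score in field_scores.items():
        # A field without an explicit 'boost' falls back to the default of 1.0.
        boost = knn.get(field, {}).get('boost', 1.0)
        total += boost * score
    return total


knn = {
    'vector_field_name_1': {'boost': 0.5},
    'vector_field_name_2': {'boost': 0.5},
}

# Hypothetical per-field KNN scores for one candidate document.
field_scores = {'vector_field_name_1': 0.8, 'vector_field_name_2': 0.4}

combined = weighted_score(field_scores, knn)  # 0.5 * 0.8 + 0.5 * 0.4
```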

Using multiple vectors in the same query requires all of the included vector fields to be indexed in the same manner in the database schema configuration file, for example, HNSW for all fields.

A multi-vector query can be integrated with lexical search to create a multi-hybrid query.

Running the Vector Query

Copy the following code snippet to run the Vector Search query –

results = hyperspace_client.search(vector_query_schema, 
                                   size=5,                 
                                   collection_name=collection_name)

Where –

  • vector_query_schema – Specifies the document for the similarity search and the multiplier of the returned score, as described in Step 2, Defining the Vector Query Schema.

  • size – Specifies the quantity of results to retrieve.

  • collection_name – Specifies the Collection in which to search.
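If the query is run from several places, the call can be wrapped in a small helper that checks its arguments first. The following is a sketch under stated assumptions: run_vector_query and StubClient are hypothetical names introduced here, the stub only stands in for a connected hyperspace_client so that the example is self-contained, and the structure of the real response is not shown.

```python
def run_vector_query(client, query_schema, collection_name, size=5):
    """Validate the arguments, then run the Vector Search query."""
    if 'params' not in query_schema:
        raise ValueError("query schema must contain a 'params' entry")
    if size < 1:
        raise ValueError("size must be at least 1")
    # Same call shape as the snippet above.
    return client.search(query_schema,
                         size=size,
                         collection_name=collection_name)


class StubClient:
    """Hypothetical stand-in for hyperspace_client; it echoes its arguments."""

    def search(self, query_schema, size, collection_name):
        return {'size': size, 'collection_name': collection_name}


response = run_vector_query(StubClient(), {'params': {}}, 'my_collection', size=5)
```

With a real client, pass hyperspace_client in place of StubClient().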
