Building a Hybrid Search Query

Last updated 10 months ago

This page describes how to build and run a Hybrid Search query. A Hybrid Search performs both a Lexical Search and a Vector Search. It then assigns a multiplier (weight) to each set of matches and returns the documents with the highest combined scores.

Running the Hybrid Search Query

If you are using a score function, copy the following code snippet to run the hybrid search query:

results = hyperspace_client.search(query,
                                   size=5,
                                   function_name='score_function',
                                   collection_name=collection_name)

Object results = hyperspace_client.search(query,
                                          size=5,
                                          'score_function',
                                          collection_name);

let results = hyperspaceClient.search(query,
                                      size=5,
                                      'score_function',
                                      collection_name);

Where:

  • query – Specifies the query document for the similarity search and the multiplier of the returned score, as described in Step 3, Defining the Lexical Query Schema.

  • size – Specifies the number of results to return.

  • function_name – Specifies the scoring function to be used in the Lexical Search query, as described in Step 1, Creating the Scoring Function.

  • collection_name – Specifies the Collection in which to search.
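
As a rough illustration of how these four parameters fit together, the sketch below wraps the call shown above in a small helper. The names run_hybrid_search and StubClient are hypothetical and exist only so the example runs without a server, and the stub's return value is a placeholder rather than the real response format. The only client call used is search, with the arguments shown in this section.

```python
class StubClient:
    """Hypothetical stand-in for the Hyperspace client, used only in this sketch."""

    def __init__(self):
        self.calls = []

    def search(self, query, size, function_name, collection_name):
        # Record the arguments instead of contacting a server.
        self.calls.append((query, size, function_name, collection_name))
        # Placeholder return value; the real response format is not shown here.
        return {"echo": {"size": size, "collection": collection_name}}


def run_hybrid_search(client, query, collection_name,
                      function_name="score_function", size=5):
    """Call client.search with the parameters described above (hypothetical helper)."""
    if size <= 0:
        raise ValueError("size must be a positive number of results")
    return client.search(query,
                         size=size,
                         function_name=function_name,
                         collection_name=collection_name)


client = StubClient()
query = {"params": {"name": "John", "Age": 30}}
results = run_hybrid_search(client, query, collection_name="my_collection")
```

With a real client, the same helper would be called with the connected client object in place of StubClient.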

Alternatively, if you use DSL syntax, copy the following code snippet:

results = hyperspace_client.search_dsl(query,
                                       size=5,
                                       collection_name=collection_name)

let results = hyperspaceClient.searchDSL(query,
                                         size=5,
                                         collection_name=collection_name);

const results = hyperspaceClient.searchDSL(query,
                                           size=5,
                                           collection_name=collection_name);

Where:

  • query – Specifies your query logic; see the examples below.

Creating the Hybrid Search Query

You can create Hybrid Search queries in two ways:

  1. Linear combination of vector and lexical search (default)

  2. Hybrid score function

Linear Combination of Scores

The query is a hybrid search query whose score is a linear combination of the lexical and vector scores. By default, each query component is assigned a weight of 1.0.

Assigning Weights

To change the weights, add a key named "knn" to the query. Under it, use the field name 'query' for the lexical search and the name of each vector field ('vector_field_1' in the example) for the vector search, and give each one its own 'boost' value.

hybrid_query_schema = {
    'params': {"name": "John", "Age": 30},
    'knn': [{'field': 'query', 'boost': 0.05},
            {'field': 'vector_field_1', 'boost': 0.6}]
}

JsonObject params = new JsonObject();
params.add("name", new JsonPrimitive("John"));
params.add("age", new JsonPrimitive(30));
params.add("vector_field", new Gson().toJsonTree(vector).getAsJsonArray());

JsonObject knn_vector = new JsonObject();
knn_vector.add("boost", new JsonPrimitive(1));

JsonObject knn_query = new JsonObject();
knn_query.add("boost", new JsonPrimitive(2));

JsonObject knn = new JsonObject();
knn.add("query", knn_query);
knn.add("vector", knn_vector);

JsonObject hybrid_search = new JsonObject();
hybrid_search.add("params", params);
hybrid_search.add("knn", knn);

const hybridQuerySchema = {
    'params': {"name": "John", "Age": 30},
    'knn': [{'field': 'query', 'boost': 0.05},
            {'field': 'vector_field_1', 'boost': 0.6}]
};

In the above example, the vector_field_1 score will be multiplied by 0.6 in the overall score and the lexical query score by 0.05.
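
To make the arithmetic concrete, the following minimal sketch computes a weighted sum in plain Python. The function combined_score and the raw scores (10.0 and 0.5) are illustrative assumptions, not output from Hyperspace; the engine's actual score normalization is not described here.

```python
def combined_score(lexical_score, vector_scores, boosts):
    """Weighted sum of the lexical score and each vector-field score (illustrative)."""
    # Components without an explicit boost fall back to the default weight of 1.0.
    total = boosts.get("query", 1.0) * lexical_score
    for field, score in vector_scores.items():
        total += boosts.get(field, 1.0) * score
    return total


boosts = {"query": 0.05, "vector_field_1": 0.6}
# Made-up raw scores, chosen only to show the calculation.
score = combined_score(lexical_score=10.0,
                       vector_scores={"vector_field_1": 0.5},
                       boosts=boosts)
# score is 0.05 * 10.0 + 0.6 * 0.5, which is approximately 0.8
```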

All fields of type dense_vector under 'params' are included in the vector search, unless the corresponding 'boost' key is set to 0. A field with no 'boost' key is assigned the default weight of 1.0.
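
The inclusion rule above can be sketched as a small helper that works out which vector fields take part and with what weight. The function vector_field_weights, the field_types mapping, and the 'keyword' type name are hypothetical; this mirrors the rule as stated and is not the engine's implementation.

```python
def vector_field_weights(field_types, params, knn):
    """Return {field: weight} for the vector fields that take part in the search."""
    boosts = {entry["field"]: entry["boost"] for entry in knn}
    weights = {}
    for field in params:
        if field_types.get(field) != "dense_vector":
            continue  # only dense_vector fields are part of the vector search
        weight = boosts.get(field, 1.0)  # default weight when no boost is given
        if weight != 0:
            weights[field] = weight  # a boost of 0 excludes the field
    return weights


# Hypothetical schema types and query, for illustration only.
field_types = {"name": "keyword",
               "vector_field_1": "dense_vector",
               "vector_field_2": "dense_vector",
               "vector_field_3": "dense_vector"}
params = {"name": "John",
          "vector_field_1": [0.1, 0.2],
          "vector_field_2": [0.3, 0.4],
          "vector_field_3": [0.5, 0.6]}
knn = [{"field": "query", "boost": 0.05},
       {"field": "vector_field_1", "boost": 0.6},
       {"field": "vector_field_3", "boost": 0}]
weights = vector_field_weights(field_types, params, knn)
```

Here vector_field_1 keeps its explicit weight of 0.6, vector_field_2 falls back to 1.0, and vector_field_3 is left out because its boost is 0.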

Building a Hybrid Score Function

You can also perform Hybrid Search using a hybrid score function, by including a KNN search function in the lexical score function. In this case the score is determined by the score function logic, and any weights assigned in the query schema are ignored.

Both KNN functions, distance and knn_filter, can only be used in the last return statement. All other return statements must return 0, None, or False.

Example 1

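def score_function_hybrid_1(params, doc):
    # Score only documents that match on 'production_companies'
    if match('production_companies'):
        # 2 if the KNN score of 'vector_field_1' is above 0.3, otherwise 0
        return 2 * knn_filter('vector_field_1', min_score=0.3)
    return 0.0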

In the above example, for documents that match on 'production_companies', the score function returns 2 if the KNN score of 'vector_field_1' is above 0.3, and 0 otherwise. Documents that do not match receive a score of 0.
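
The behavior of Example 1 can be checked locally with stand-ins for the engine functions. In the sketch below, match and knn_filter are hypothetical local stubs, and it is assumed that knn_filter returns 1 when the KNN score is above min_score and 0 otherwise; inside Hyperspace these functions are provided by the engine.

```python
def simulate_example_1(field_matches, knn_score):
    """Run the Example 1 logic against stubbed match and knn_filter functions."""

    def match(field):
        # Stub: pretend the lexical match outcome is already known.
        return field_matches

    def knn_filter(field, min_score=1):
        # Assumption: 1 if the KNN score is above the threshold, else 0.
        return 1 if knn_score > min_score else 0

    if match('production_companies'):
        return 2 * knn_filter('vector_field_1', min_score=0.3)
    return 0.0


high = simulate_example_1(field_matches=True, knn_score=0.8)      # 2
low = simulate_example_1(field_matches=True, knn_score=0.1)       # 0
no_match = simulate_example_1(field_matches=False, knn_score=0.8)  # 0.0
```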

Example 2

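def score_function_hybrid_2(params, doc):
    # Score only documents that match on 'production_companies'
    if match('production_companies'):
        # Base score of 2, plus half the KNN score when it is above params['min_score']
        return 2 + 0.5 * distance('vector_field_1', min_score=params['min_score'])
    return 0.0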

In the above example, for documents that match on 'production_companies', the score function returns 2 + 0.5 * KNN('vector_field_1') if the KNN score of 'vector_field_1' is above params['min_score'], and 2 otherwise. Documents that do not match receive a score of 0. You must provide the key "min_score" under the "params" key of the query.
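
Example 2 can be traced the same way with stubbed functions. Here match and distance are hypothetical local stand-ins, and it is assumed that distance returns the KNN score when it is above min_score and 0 otherwise, which is consistent with the "2 otherwise" case described above.

```python
def simulate_example_2(field_matches, knn_score, params):
    """Run the Example 2 logic against stubbed match and distance functions."""

    def match(field):
        # Stub: pretend the lexical match outcome is already known.
        return field_matches

    def distance(field, min_score=1):
        # Assumption: the KNN score if it is above the threshold, else 0.
        return knn_score if knn_score > min_score else 0.0

    if match('production_companies'):
        return 2 + 0.5 * distance('vector_field_1', min_score=params['min_score'])
    return 0.0


params = {"min_score": 0.5}  # the query must supply "min_score" under "params"
above = simulate_example_2(True, knn_score=0.8, params=params)   # about 2.4
below = simulate_example_2(True, knn_score=0.2, params=params)   # 2.0
missed = simulate_example_2(False, knn_score=0.8, params=params)  # 0.0
```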

You can access the KNN score by using the function distance('vector_field', min_score), which returns the KNN distance, and the function knn_filter('vector_field', min_score), which filters according to distance. The default value of min_score is 1 for both functions.