Overview
An introduction to the Hyperspace hybrid search database.
Hyperspace is a search database for AI-driven applications, combining high performance with maximum search relevancy. This cloud-native, hybrid search database is managed for you, eliminating infrastructure complexities. Hyperspace features an easy-to-use API and excels in delivering query results with minimal latency, even when dealing with billions of documents.
Fast queries
Using designated processing units in the cloud, Hyperspace answers queries 10 to 100 times faster than industry benchmarks, at a reduced cost. Its query syntax is native Python and supports advanced filtering and scoring functionality in keyword searches, vector searches, and hybrid searches.
Hyperspace is built to power a wide variety of AI applications, including real-time recommendations, search, generation, fraud prevention, and threat detection.
Relevant search results
Hyperspace's hybrid search combines vector-based search and keyword search, allowing for a versatile and efficient approach to information retrieval. While vector search tends to excel at capturing semantic relationships, it can return unexpected results in many cases. Keyword search can pinpoint explicit matches and retrieve documents based on specific terms, improving relevancy where vector search falls short. By combining the two methods, hybrid search produces comprehensive results with high relevancy.
Hyperspace stores vectors and metadata
Hyperspace collections hold documents that contain fields with designated types. The list of supported types is available under data types. Most field types can also be stored as lists. The vector field type is used for vector and hybrid search, while the other types (metadata) are used in lexical and hybrid search.
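To illustrate what "fields with designated types" means in practice, here is a minimal sketch in plain Python. It does not use the Hyperspace API: the field names, the type labels ("keyword", "integer", "vector"), and the `is_list` and `dim` options are assumptions made for this example, not Hyperspace's actual data types.

```python
# Hypothetical schema: field names, type labels, and options are illustrative
# only and are NOT taken from the Hyperspace data types reference.
SCHEMA = {
    "title": {"type": "keyword"},
    "genres": {"type": "keyword", "is_list": True},   # a list-typed metadata field
    "year": {"type": "integer"},
    "embedding": {"type": "vector", "dim": 4},        # used for vector search
}

PYTHON_TYPES = {"keyword": str, "integer": int}


def validate(document, schema=SCHEMA):
    """Return the names of fields whose values do not match the schema."""
    problems = []
    for name, spec in schema.items():
        if name not in document:
            continue  # fields are treated as optional in this sketch
        value = document[name]
        if spec["type"] == "vector":
            ok = (isinstance(value, list)
                  and len(value) == spec["dim"]
                  and all(isinstance(x, (int, float)) for x in value))
        elif spec.get("is_list"):
            ok = isinstance(value, list) and all(
                isinstance(x, PYTHON_TYPES[spec["type"]]) for x in value)
        else:
            ok = isinstance(value, PYTHON_TYPES[spec["type"]])
        if not ok:
            problems.append(name)
    return problems


doc = {"title": "Alien", "genres": ["horror", "sci-fi"], "year": 1979,
       "embedding": [0.1, 0.2, 0.3, 0.4]}
print(validate(doc))  # an empty list means every field matches its type
```

In this sketch the vector field carries the data for vector search, while the remaining fields play the role of metadata for lexical search.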
Creating collections and ingesting documents
Through the API, you can create and manage collections, and upload and modify data.
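The lifecycle described here can be sketched with a small in-memory stand-in. This is not the Hyperspace client: the class `ToyDatabase` and all of its method names are hypothetical and exist only to show the order of operations (create a collection, upload documents, modify a document).

```python
class ToyDatabase:
    """In-memory stand-in for a document database (illustrative only)."""

    def __init__(self):
        self._collections = {}

    def create_collection(self, name):
        if name in self._collections:
            raise ValueError(f"collection {name!r} already exists")
        self._collections[name] = {}

    def delete_collection(self, name):
        del self._collections[name]

    def collection_names(self):
        return sorted(self._collections)

    def add_documents(self, name, documents):
        """Store documents under their 'id'; an existing id is overwritten."""
        store = self._collections[name]
        for doc in documents:
            store[doc["id"]] = dict(doc)

    def update_document(self, name, doc_id, changes):
        """Modify selected fields of one stored document."""
        self._collections[name][doc_id].update(changes)

    def get_document(self, name, doc_id):
        return dict(self._collections[name][doc_id])


db = ToyDatabase()
db.create_collection("movies")
db.add_documents("movies", [
    {"id": "1", "title": "Alien", "year": 1979},
    {"id": "2", "title": "Aliens", "year": 1985},
])
db.update_document("movies", "2", {"year": 1986})  # correct one field in place
```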
Creating hybrid queries
The score function is how you specify the keyword search behavior. It allows filtering and scoring (including TF/IDF). A simple score function, for example, filters the results based on two fields and then manipulates the score. Score functions are covered in more detail in the following chapters.
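To make the idea concrete, here is a minimal sketch of such a score function in plain Python. It is not Hyperspace's score function syntax: the fields (`genres`, `country`, `year`), the `idf` helper, and the convention of returning `None` to filter a document out are all assumptions made for this example.

```python
import math


def idf(term, field, corpus):
    """Inverse document frequency of `term` in list field `field` (smoothed)."""
    containing = sum(1 for d in corpus if term in d.get(field, []))
    return math.log((1 + len(corpus)) / (1 + containing)) + 1.0


def score_function(query, doc, corpus):
    """Return None to filter `doc` out, otherwise its keyword score."""
    # Filter on field 1: at least one genre must be shared with the query.
    shared = set(query["genres"]) & set(doc["genres"])
    if not shared:
        return None
    # Filter on field 2: the country must match exactly.
    if doc["country"] != query["country"]:
        return None
    # Scoring: rarer shared genres contribute more (IDF-style weighting).
    score = sum(idf(genre, "genres", corpus) for genre in shared)
    # Score manipulation: favor documents from 2000 onwards.
    if doc["year"] >= 2000:
        score *= 1.5
    return score


corpus = [
    {"id": "1", "genres": ["drama"], "country": "US", "year": 1995},
    {"id": "2", "genres": ["drama", "noir"], "country": "US", "year": 2005},
    {"id": "3", "genres": ["comedy"], "country": "US", "year": 2010},
    {"id": "4", "genres": ["noir"], "country": "FR", "year": 2001},
]
query = corpus[1]
scores = {d["id"]: score_function(query, d, corpus) for d in corpus}
```

Documents 3 and 4 are filtered out (no shared genre, and a different country, respectively), while documents 1 and 2 receive a score.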
Next, we specify the hybrid search behavior. In this example, a simple keyword search is followed by a vector search on the resulting documents. The boost parameter specifies how the two scores are merged. Here, the keyword search is given twice the weight of the vector search. Document 2 is used as the query document.
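The two-stage behavior and the 2:1 weighting can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the Hyperspace query syntax: the merge is modeled as a weighted sum, and the names `keyword_boost`, `vector_boost`, `tags`, and `embedding` are hypothetical. Skipping the query document itself is a choice made for this sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query, doc):
    """Number of tags shared with the query; 0 means no keyword match."""
    return float(len(set(query["tags"]) & set(doc["tags"])))


def hybrid_scores(query, docs, keyword_boost=2.0, vector_boost=1.0):
    """Keyword search first, then vector search on the surviving documents."""
    results = {}
    for doc in docs:
        if doc["id"] == query["id"]:
            continue  # sketch-only choice: do not return the query document
        kw = keyword_score(query, doc)
        if kw == 0:
            continue  # stage 1: documents without a keyword match are dropped
        vec = cosine(query["embedding"], doc["embedding"])  # stage 2
        # Assumed merge rule: weighted sum, keyword counted twice as much.
        results[doc["id"]] = keyword_boost * kw + vector_boost * vec
    return results


docs = [
    {"id": "1", "tags": ["python", "search"], "embedding": [1.0, 0.0]},
    {"id": "2", "tags": ["python", "vectors"], "embedding": [0.0, 1.0]},
    {"id": "3", "tags": ["vectors"], "embedding": [0.0, 2.0]},
    {"id": "4", "tags": ["cooking"], "embedding": [0.0, 1.0]},
]
merged = hybrid_scores(docs[1], docs)  # document 2 is the query document
```

Document 4 has an embedding identical to the query's, yet it never reaches the vector stage because it has no keyword match. That is the practical effect of running the keyword search first.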
Running the query
Lastly, we call the search API. In this call we specify the name of the hybrid query, the number of documents to return, the score function name, and the collection name. The result of this call is stored in the 'result' dictionary, which contains the top document IDs along with their scores.
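The shape of such a call can be sketched with a toy `search` function in plain Python. It is not the Hyperspace search API: the registries `COLLECTIONS` and `FUNCTIONS`, the parameter names, and the layout of the returned dictionary are assumptions made for this example.

```python
import heapq

# Hypothetical in-memory registries standing in for server-side state.
COLLECTIONS = {
    "movies": [
        {"id": "1", "genres": ["drama"]},
        {"id": "2", "genres": ["drama", "noir", "crime"]},
        {"id": "3", "genres": ["noir", "crime"]},
        {"id": "4", "genres": ["comedy"]},
    ]
}
FUNCTIONS = {}


def genre_overlap(query, doc):
    """Score by number of shared genres; None filters the document out."""
    shared = len(set(query["genres"]) & set(doc["genres"]))
    return float(shared) if shared else None


FUNCTIONS["genre_overlap"] = genre_overlap


def search(query, size, function_name, collection_name):
    """Score every document in the collection and return the top `size`."""
    score = FUNCTIONS[function_name]
    scored = []
    for doc in COLLECTIONS[collection_name]:
        value = score(query, doc)
        if value is not None:
            scored.append((value, doc["id"]))
    top = heapq.nlargest(size, scored)  # highest score first
    return {"results": [{"id": doc_id, "score": value} for value, doc_id in top]}


query = COLLECTIONS["movies"][1]  # document 2 is the query document
result = search(query, size=2, function_name="genre_overlap",
                collection_name="movies")
```

Here `result` holds the two best-scoring document IDs with their scores, ordered from highest to lowest.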
That's our first query!
In the following chapters, we'll discuss these features in greater detail.