Overview
An introduction to the Hyperspace hybrid database
Hyperspace is a fully managed hybrid search engine, designed to make search faster, more cost-effective, and scalable. It offers a solution for organizations that were previously constrained, enabling them to handle larger data sizes without compromising business logic.
Unprecedented Latency
Using designated processing units in the cloud, Hyperspace delivers very low latency, often 10 to 100 times faster than industry benchmarks, at a reduced cost. Its query syntax is native Python and supports advanced candidate generation and scoring in keyword similarity searches, vector searches, and hybrid searches.
Fully Managed Service
Delivered as a fully managed service in the cloud, Hyperspace simplifies the complexity of executing search algorithms. Its domain-specific computer architecture and purpose-built engine optimize it for speed and scale. By fusing Elasticsearch with vector databases, it significantly accelerates the processing of similarity search algorithms at a reduced cost.
Built to power a wide variety of search applications, Hyperspace is suitable for real-time prediction, recommendation, search, fraud detection, and cybersecurity.
Hyperspace hybrid search combines vector-based search and keyword matching, allowing a versatile and efficient approach to information retrieval. While vector search tends to excel at capturing semantic relationships, it may behave unexpectedly in certain cases. Keyword matching can pinpoint explicit matches and retrieve documents based on specific terms, thus assisting in handling the scenarios in which vector search underperforms. By combining these two methods, hybrid search allows comprehensive results with high accuracy.
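To make the distinction concrete, here is a minimal, self-contained sketch in plain Python. It is not Hyperspace code: the toy documents and the `cosine_similarity` and `keyword_match` helpers are hypothetical, and the unweighted sum is only one simple way to combine the two signals. It shows a case where two documents are nearly indistinguishable by vector similarity, while an exact keyword match separates them.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def keyword_match(query_terms, doc_terms):
    """Return 1.0 if the document contains every query term, else 0.0."""
    return 1.0 if set(query_terms) <= set(doc_terms) else 0.0

# Hypothetical toy corpus: near-identical embeddings, different explicit terms.
docs = {
    "a": {"terms": ["model", "x200"], "vector": [0.9, 0.1, 0.10]},
    "b": {"terms": ["model", "x300"], "vector": [0.9, 0.1, 0.12]},
}
query = {"terms": ["x300"], "vector": [0.9, 0.1, 0.11]}

# Vector similarity alone barely separates the two documents.
vector_scores = {name: cosine_similarity(query["vector"], doc["vector"])
                 for name, doc in docs.items()}

# Keyword matching pinpoints the document that contains the exact term.
keyword_scores = {name: keyword_match(query["terms"], doc["terms"])
                  for name, doc in docs.items()}

# A simple hybrid score: the sum of both signals (an illustrative choice).
hybrid_scores = {name: vector_scores[name] + keyword_scores[name] for name in docs}
ranked = sorted(docs, key=hybrid_scores.get, reverse=True)
```

In this sketch the two vector scores differ by far less than 0.01, so the keyword signal is what places document `b` first.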
The primitive data points in Hyperspace are documents: objects that combine metadata and vectors and can hold a variety of data types.
Create and manage collections, and upload and modify data, in native Python syntax.
import hyperspace

# `hyperspace_client` is assumed to be an already initialized Hyperspace client.
collection_name = 'collection_name'
hyperspace_client.create_collection('schema.json', collection_name)

documents = [
    {'document_id': '1',
     'meta data field 1': 'value 1',
     'meta data field 2': 'value 2',
     'dense_vector 1': [0.85, 0.2, 0.2, 0.1]
    },
    {'document_id': '2',
     'meta data field 1': 'value 4',
     'dense_vector 1': [0.2, 0.1, 0.2, 0.85],
     'dense_vector 2': [0.9, 0.3, 0.3, 0.1]
    },
]

# Wrap each dictionary in a Hyperspace document, keyed by its position in the list.
batch = [hyperspace.document(str(i), data_point)
         for i, data_point in enumerate(documents)]

hyperspace_client.add_batch(batch, collection_name)
Implement complex query logic as score functions.
def score_function(Q, V):
    score = 0.0
    if match('metadata 1'):
        score = 1.0
    if match('metadata 2'):
        score = 2.0
    else:
        score = score + 1.0
    return score
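A score function like this can be reasoned about offline by stubbing the `match` helper, which Hyperspace supplies at query time. The sketch below is only an illustration under stated assumptions: `make_match` is a hypothetical stand-in, it assumes `match(field)` is true when the query document `Q` and the candidate `V` hold equal values for that field, and it assumes the `if`/`else` pair sits at the top level of the function body.

```python
def make_match(Q, V):
    """Hypothetical stand-in for the engine's `match`: True when both
    documents carry the field and their values are equal (assumed semantics)."""
    def match(field):
        return field in Q and field in V and Q[field] == V[field]
    return match

def score_function(Q, V):
    match = make_match(Q, V)  # in Hyperspace, `match` is provided by the engine
    score = 0.0
    if match('metadata 1'):
        score = 1.0
    if match('metadata 2'):
        score = 2.0
    else:
        score = score + 1.0
    return score

# Hypothetical query document and candidates.
query_doc = {'metadata 1': 'red', 'metadata 2': 'large'}
both = score_function(query_doc, {'metadata 1': 'red', 'metadata 2': 'large'})
first_only = score_function(query_doc, {'metadata 1': 'red', 'metadata 2': 'small'})
neither = score_function(query_doc, {'metadata 1': 'blue', 'metadata 2': 'small'})
```

Tracing a few candidates by hand like this is a quick way to check that the branches produce the ordering you intend before registering the function with a collection.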
Create hybrid search queries with controlled weights, to obtain optimal results.
# Fetch a stored document to use as the query input.
data_point = hyperspace_client.get_document(document_id='2',
                                            collection_name=collection_name)

# Hybrid query with a higher boost on 'query' than on 'vector'.
hybrid_query = {
    'params': data_point,
    'knn': {
        'query': {'boost': 2},
        'vector': {'boost': 1}
    }
}

# Alternative weighting with a higher boost on 'vector'.
input_document = data_point
query = {
    'params': input_document,
    'knn': {
        'query': {'boost': 1.0},
        'vector': {'boost': 10}
    }
}

# Run the search with one of the queries defined above.
results = hyperspace_client.search(hybrid_query,
                                   size=15,
                                   function_name='score_function',
                                   collection_name=collection_name)
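To see why the boost values matter, the following standalone sketch ranks two hypothetical candidates under the two weightings used above. It rests on an assumption that is not confirmed here: that the `query` boost scales the score-function result, the `vector` boost scales the vector similarity, and the two are added. The engine's actual combination rule may differ, and `combined_score`, `rank`, and the candidate scores are illustrative only.

```python
def combined_score(keyword_score, vector_score, query_boost, vector_boost):
    """Assumed linear combination; the real engine may combine scores differently."""
    return query_boost * keyword_score + vector_boost * vector_score

# Hypothetical per-candidate scores: (score-function result, vector similarity).
candidates = {
    "doc_keyword_heavy": (2.0, 0.30),
    "doc_vector_heavy": (1.0, 0.95),
}

def rank(query_boost, vector_boost):
    """Order candidate names from highest to lowest combined score."""
    return sorted(
        candidates,
        key=lambda name: combined_score(*candidates[name], query_boost, vector_boost),
        reverse=True,
    )

# Same boosts as the two query dictionaries above.
keyword_weighted = rank(query_boost=2, vector_boost=1)
vector_weighted = rank(query_boost=1.0, vector_boost=10)
```

Under this assumption, the first weighting ranks the keyword-heavy candidate on top and the second ranks the vector-heavy candidate on top, which is the kind of trade-off the boosts are meant to control.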
Go to the quickstart guide to get a production-ready hybrid search service up and running in minutes.