Quick Start

This guide explains how to set up the Hyperspace database in minutes.


Last updated 8 months ago

To start using Hyperspace, follow these steps:

1. Install the Hyperspace API Client

Run one of the following commands in your terminal, depending on the client language:

Python:
pip install hyperspace-py

JavaScript:
npm install https://github.com/hyper-space-io/hyperspace-js

For more information, see the Installing the Hyperspace API Client section.

2. Create a local instance of the Hyperspace client

Once you have received your credentials and host address, use the following code to connect to the database through the Hyperspace API.

Python:
import hyperspace

hyperspace_client = hyperspace.HyperspaceClientApi(host=host_address,
                                                   username=username,
                                                   password=password)

Java:
import io.hyperspace.client.HyperspaceClient;

HyperspaceClient client = new HyperspaceClient(host, username, password);

JavaScript:
const hs = require('hyperspace-js')
const hyperspaceClient = new hs.HyperspaceClient(host, username, password)
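
One way to keep credentials out of source code is to read them from environment variables before creating the client. The sketch below is only an illustration: the three variable names (HYPERSPACE_HOST, HYPERSPACE_USERNAME, HYPERSPACE_PASSWORD) and the helper load_connection_settings are hypothetical and are not defined by Hyperspace.

```python
import os

# Hypothetical environment variable names, chosen for this example only.
ENV_KEYS = {
    "host": "HYPERSPACE_HOST",
    "username": "HYPERSPACE_USERNAME",
    "password": "HYPERSPACE_PASSWORD",
}


def load_connection_settings(env=None):
    """Return host, username and password as a dict; raise if any is missing."""
    env = os.environ if env is None else env
    missing = [name for name in ENV_KEYS.values() if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return {arg: env[name] for arg, name in ENV_KEYS.items()}
```

The returned dictionary uses the keyword names host, username and password, so it could presumably be unpacked into the Python client constructor, for example hyperspace.HyperspaceClientApi(**load_connection_settings()).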

3. Run Hyperspace queries

Create a schema file

The schema file outlines the data structure, the index and metric types, and similar configurations. More information can be found in the Creating a Database Schema Configuration File section.
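
As a minimal sketch, a schema file can be produced from Python with the standard json module. The two fields below are copied from the inline schema example given later in this guide; the helper name write_schema is hypothetical, and a real schema would normally declare more fields.

```python
import json

# Two keyword fields, copied from the inline schema example in this guide.
# The "id": True flag is taken from that example as-is.
schema = {
    "configuration": {
        "name": {"type": "keyword"},
        "id": {"type": "keyword", "id": True},
    }
}


def write_schema(path="schema.json"):
    """Serialize the schema dictionary to a JSON file and return its path."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(schema, f, indent=2)
    return path
```

Calling write_schema() creates schema.json in the working directory, which is the path the collection-creation snippet expects.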

Create a collection

Copy the following code snippet to create a collection:

Python:
collection_name = 'new_collection'
hyperspace_client.create_collection('schema.json', collection_name)

Java:
JsonObject schema = (JsonObject) JsonParser.parseReader(new FileReader("schema.json"));
client.createCollection(collectionName, schema);

JavaScript:
const collection_name = 'new_collection'
await hyperspaceClient.createCollection('schema.json', collection_name)

Where –

  • 'schema.json' – Specifies the path to the configuration file that you created locally on your machine.

  • collection_name – Specifies the name of the collection to be created in the Hyperspace database.

Alternatively, you can define the database configuration schema as a local object in your code:

Python:
schema = {
    "configuration": {
        "name": {
            "type": "keyword"
        },
        "id": {
            "type": "keyword",
            "id": True,
        }
    }
}
hyperspace_client.create_collection(schema, 'collection_name')

Java:
// The schema is written as a JSON string; note the commas between entries
String schema = "{" +
                "  \"configuration\": {" +
                "    \"name\": {" +
                "      \"type\": \"keyword\"" +
                "    }," +
                "    \"id\": {" +
                "      \"type\": \"keyword\"," +
                "      \"id\": true" +
                "    }" +
                "  }" +
                "}";

client.createCollection(collectionName, schema);

JavaScript:
const schema = {
    "configuration": {
        "name": {
            "type": "keyword"
        },
        "id": {
            "type": "keyword",
            "id": true,
        }
    }
};

await hyperspaceClient.createCollection(collectionName, schema);

Where –

  • schema – Specifies the object (a Python dictionary in the Python example) that outlines the configuration schema.

  • 'collection_name' – Specifies the name of the collection to be created in the Hyperspace database.

Upload Data

Data can be uploaded in batches. Copy the following code snippet to upload data:

Python:
batch_size = 250
batch = []

for i, data_point in enumerate(documents):
    batch.append(data_point)
    if (i + 1) % batch_size == 0:
        response = hyperspace_client.add_batch(batch, collection_name)
        batch.clear()

# Upload the remaining documents that did not fill a whole batch
if batch:
    response = hyperspace_client.add_batch(batch, collection_name)

hyperspace_client.commit(collection_name)

Java:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final int batchSize = 250;
// Assumption: documents is a List<DataPoint> and addBatch returns a CompletableFuture
List<DataPoint> batch = new ArrayList<>();
List<CompletableFuture<?>> futures = new ArrayList<>();

for (int i = 0; i < documents.size(); i++) {
    batch.add(documents.get(i));
    if ((i + 1) % batchSize == 0) {
        // Copy the batch so that clearing it does not affect the pending request
        List<DataPoint> batchCopy = new ArrayList<>(batch);
        futures.add(client.addBatch(batchCopy, collectionName));
        batch.clear();
    }
}

// Upload the remaining documents that did not fill a whole batch
if (!batch.isEmpty()) {
    futures.add(client.addBatch(new ArrayList<>(batch), collectionName));
}
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
client.commit(collectionName).join();

JavaScript:
const BATCH_SIZE = 250;
let batch = [];
const collectionName = "new_collection";

for (const [i, document] of documents.entries()) {
    batch.push(document);
    if ((i + 1) % BATCH_SIZE === 0) {
        await hyperspaceClient.addBatch(collectionName, batch);
        batch = [];
    }
}

// Upload the remaining documents that did not fill a whole batch
if (batch.length !== 0) {
    await hyperspaceClient.addBatch(collectionName, batch);
}
await hyperspaceClient.commit(collectionName)

Where –

  • data_point – Represents the document to upload. Each document must have a dictionary-like structure, with keys that match the database schema configuration file.

  • batch_size – Specifies the number of documents in a batch.

  • commit – Required for vector search only.
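
The batching loop can also be packaged as a reusable helper. The sketch below is one possible arrangement, not part of the Hyperspace client: iter_batches and upload_in_batches are hypothetical names, and the only client calls used are add_batch(batch, collection_name) and commit(collection_name), in the same form as the Python snippet above.

```python
def iter_batches(documents, batch_size=250):
    """Yield consecutive lists of at most batch_size documents."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    batch = []
    for document in documents:
        batch.append(document)
        if len(batch) == batch_size:
            yield batch
            batch = []
    # The last batch may be smaller than batch_size
    if batch:
        yield batch


def upload_in_batches(client, documents, collection_name, batch_size=250):
    """Send every batch with add_batch, then commit once; return the batch count."""
    sent = 0
    for batch in iter_batches(documents, batch_size):
        client.add_batch(batch, collection_name)
        sent += 1
    client.commit(collection_name)
    return sent
```

Because each yielded batch is a fresh list, the helper does not need to copy or clear a shared buffer between requests.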

Build and run a query (Python only)

Hyperspace queries can be of one of the following types –

  • Lexical Search

  • Vector Search

  • Hybrid Search

A score function has the following general form:

def score_function(params, doc):
    score = 0.0
    # Each successive metadata match raises the score
    if match('metadata field 1'):
        score = 1.0
        if match('metadata field 2'):
            score = 2.0
    return score

To set a hybrid or lexical search query, specify that the score function file is to be used for the search, as follows:

Python:
# score_function_name is assumed to hold the path of the score function file
hyperspace_client.set_function(score_function_name,
                               collection_name=collection_name,
                               function_name='score_function')

Java:
String function = Files.readString(Paths.get("score_function.py"));
client.setFunction(collectionName, "score_function", function);

JavaScript:
// Arguments are passed positionally, in the same order as the Python call
await hyperspaceClient.setFunction(score_function_name,
                                   collection_name,
                                   'score_function')

To run a hybrid or lexical search query, define the query and run it:

Python:
params = {
    "name": "John"
}
results = hyperspace_client.search(params,
                                   size=10,
                                   collection_name=collection_name,
                                   function_name='score_function')

Java:
JsonObject params = new JsonObject();
params.add("name", new JsonPrimitive("John"));
JsonObject query = new JsonObject();
query.add("query", params);

Object response = client.search(collectionName, 10, query, "score_function");

JavaScript:
const size = 10;
let params = {
    "name": "John"
}
let functionName = 'score_function';
await hyperspaceClient.search(collectionName, size, params, functionName)

query_body is the query in DSL syntax. query_body must have a similar structure to the database documents, according to the query schema config file. If query_body includes fields of type

To run a lexical search query in DSL syntax, define the query and run it:

Python:
results = hyperspace_client.dsl_search({'params': query_body},
                                       size=10,
                                       collection_name=collection_name)

Java:
String queryJson =  "{" +
                    "  \"query\": {" +
                    "    \"bool\": {" +
                    "      \"must\": [" +
                    "        {" +
                    "          \"term\": {" +
                    "            \"name\": \"John\"" +
                    "          }" +
                    "        }" +
                    "      ]" +
                    "    }" +
                    "  }" +
                    "}";
JsonObject query = JsonParser.parseString(queryJson).getAsJsonObject();
Object response = client.dslSearch(collectionName, 10, query);
JsonObject queryResponse = new Gson().toJsonTree(response).getAsJsonObject();
System.out.println(queryResponse);

JavaScript:
const size = 10;
const query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"name": "John"}}
            ]
        }
    }
}
await hyperspaceClient.dslSearch(collectionName, size, query)

query_body is the query in DSL syntax.
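
For illustration, query_body can be assembled programmatically. The helper below, build_term_query, is a hypothetical name and not part of the Hyperspace client; it only builds the same bool/must/term structure that the Java and JavaScript snippets above spell out by hand.

```python
def build_term_query(**terms):
    """Return a bool/must query with one term clause per keyword argument."""
    return {
        "query": {
            "bool": {
                "must": [{"term": {field: value}} for field, value in terms.items()]
            }
        }
    }


# Same query as in the snippets above: documents whose "name" is "John"
query_body = build_term_query(name="John")
```

Passing several keyword arguments adds one term clause per field, all of which must match.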

results is a dictionary with two keys: {'similarity': [...], 'took_ms': ...}

  • took_ms – A float value that specifies how long the query took to run, in milliseconds, such as 8.73.

  • similarity – Returns a list. Each element of the list represents a matching document. For each document, it specifies the score and the vector_id that you can use to retrieve the document from the Collection.

Here is an example of what results might look like if they were printed on the screen –

print(results['similarity']) 

[{'score': 513.7000122070312, 'vector_id': '78254'}, {'score': 512.5500126784442, 'vector_id': '23091'}, {'score': 485.5471220787652, 'vector_id': '85432'}]
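
A short sketch of how such a result might be post-processed follows. The sample dictionary reuses the values printed above; the helper top_matches and its min_score parameter are hypothetical and exist only for this example.

```python
# Sample result with the structure described above (values from the printed example)
results = {
    "similarity": [
        {"score": 513.7000122070312, "vector_id": "78254"},
        {"score": 512.5500126784442, "vector_id": "23091"},
        {"score": 485.5471220787652, "vector_id": "85432"},
    ],
    "took_ms": 8.73,
}


def top_matches(results, min_score=0.0):
    """Return (vector_id, score) pairs with score >= min_score, best first."""
    hits = [
        (hit["vector_id"], hit["score"])
        for hit in results["similarity"]
        if hit["score"] >= min_score
    ]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

With min_score=500.0, only the first two documents of the sample are kept; the returned vector_id values are the ones to use when retrieving the full documents from the collection.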

You can retrieve additional document fields in the query, using the "fields" keyword.

To run a lexical search query in DSL syntax and retrieve additional fields, define the query and run it:

Python:
query_body = {
    "query": {
        "bool": {
            "must": [
                {"term": {"name": "John"}}
            ]
        }
    }
}
results = hyperspace_client.dsl_search({'params': query_body},
                                       size=10,
                                       collection_name=collection_name,
                                       fields=["title", "date"])

Java:
String queryJson =  "{" +
                    "  \"query\": {" +
                    "    \"bool\": {" +
                    "      \"must\": [" +
                    "        {" +
                    "          \"term\": {" +
                    "            \"name\": \"John\"" +
                    "          }" +
                    "        }" +
                    "      ]" +
                    "    }" +
                    "  }" +
                    "}";
JsonObject query = JsonParser.parseString(queryJson).getAsJsonObject();
Object response = client.dslSearch(collectionName, 10, query);

JavaScript:
const size = 10;
const query = {
    "query": {
        "bool": {
            "must": [
                {"term": {"name": "John"}}
            ]
        }
    }
}
// The list of fields to return is passed as the last argument
await hyperspaceClient.dslSearch(collectionName, size, query,
                                 ["title", "date"])

query_body is the query in DSL syntax.

In this scenario, each entry in results['similarity'] includes a key named "fields", that includes the fields "title" and "date" per retrieved document.
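
The sketch below shows one way to flatten such a result into rows. It assumes that the value under "fields" is a dictionary mapping each requested field name to its value; the sample titles and dates are invented for illustration, and rows_with_fields is a hypothetical helper.

```python
# Hypothetical response; the titles, dates and scores are invented for illustration.
results = {
    "similarity": [
        {"score": 2.0, "vector_id": "78254",
         "fields": {"title": "First title", "date": "2024-01-01"}},
        {"score": 1.0, "vector_id": "23091",
         "fields": {"title": "Second title", "date": "2024-02-01"}},
    ],
    "took_ms": 1.0,
}


def rows_with_fields(results, field_names):
    """Return one flat dict per hit: vector_id, score and the requested fields."""
    rows = []
    for hit in results["similarity"]:
        fields = hit.get("fields", {})
        row = {"vector_id": hit["vector_id"], "score": hit["score"]}
        # Missing fields are reported as None instead of raising
        row.update({name: fields.get(name) for name in field_names})
        rows.append(row)
    return rows
```

Calling rows_with_fields(results, ["title", "date"]) yields a list that is convenient to print or to load into a table.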

Lexical search can be performed either in DSL syntax or by using a score function, such as the score_function example earlier in this section.

A more detailed guide is available in the Hyperspace documentation.
