# Uploading Data

Data can be uploaded in batches or as single documents.

## Uploading a Single Document

**To upload a document into a Collection –**

Each Hyperspace document must be a dictionary.

Use the following to upload a single document –

```python
hyperspace_client.add_document(document, collection_name)
```

{% tabs %}
{% tab title="Python" %}
{% code lineNumbers="true" %}

```python
document = {"title": "star wars",
            "genres": ["Action", "Adventure", "Sci-Fi"],
            "embedded_description": [0.01, 0.003, ...]}

hyperspace_client.add_document(document, collection_name)
```

{% endcode %}
{% endtab %}

{% tab title="Java" %}
{% code lineNumbers="true" %}

```java
import java.util.Arrays;

Document document = new Document();
document.putAdditionalProperty("title", "star wars");
document.putAdditionalProperty("genres", Arrays.asList("Action", "Adventure", "Sci-Fi"));
document.putAdditionalProperty("embedded_description", Arrays.asList(0.01, 0.003, ...));

hyperspaceClient.updateDocument(collectionName, document, true, false);
```

{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}
{% code lineNumbers="true" %}

```javascript
const document = {"title": "star wars",
                "genres": ["Action", "Adventure", "Sci-Fi"],
                "embedded_description": [0.01, 0.003, ...]};

hyperspaceClient.addDocument(document, collectionName);
```

{% endcode %}
{% endtab %}
{% endtabs %}

**Where –**

* <mark style="color:purple;">document</mark> – Contains the data to be uploaded in the structure specified in the data schema configuration file.
* <mark style="color:purple;">collection\_name</mark> – Specifies the name of the Collection into which to load the document.
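Because a document must be a dictionary, it can help to check its type before calling `add_document`. The following is a minimal sketch, not part of the Hyperspace client: `validate_document` is a hypothetical helper, the requirement that field names be strings is an assumption, and the embedding vector is shortened for illustration.

```python
from typing import Any


def validate_document(document: Any) -> dict:
    """Return the document unchanged if it is a dictionary with string field names."""
    if not isinstance(document, dict):
        raise TypeError(f"expected a dict, got {type(document).__name__}")
    # Assumption: field names are strings, as in the examples on this page.
    bad_keys = [key for key in document if not isinstance(key, str)]
    if bad_keys:
        raise ValueError(f"field names must be strings, got {bad_keys!r}")
    return document


document = {
    "title": "star wars",
    "genres": ["Action", "Adventure", "Sci-Fi"],
    "embedded_description": [0.01, 0.003],  # shortened vector, for illustration only
}
validate_document(document)
# The validated document can then be uploaded:
# hyperspace_client.add_document(document, collection_name)
```

A check like this catches mistakes such as passing a list or a JSON string before the request is sent.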

## Uploading a Batch of Documents

Data can be uploaded in batches by converting the documents to document objects before uploading. The basic data point object for the Hyperspace database is a structure of Python dictionaries.

**To upload a batch of documents into a Collection –**

We recommend uploading data to a Collection in batches of many documents, each of which has the structure specified in the data schema configuration file.

Build a list of documents in a temporary variable named `batch`, and then upload each batch using –

{% tabs %}
{% tab title="Python" %}
{% code lineNumbers="true" %}

```python
hyperspace_client.add_batch(batch, collection_name)
```

{% endcode %}
{% endtab %}

{% tab title="Java" %}
{% code lineNumbers="true" %}

```java
hyperspaceClient.addBatch(batch, collectionName);
```

{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}

<pre class="language-javascript" data-line-numbers><code class="lang-javascript"><strong>hyperspaceClient.addBatch(batch, collectionName);
</strong></code></pre>

{% endtab %}
{% endtabs %}
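The `batch` argument in the calls above is a list of documents. As a minimal sketch of one way to split any sequence of documents into such lists, the hypothetical helper `iter_batches` below uses only the Python standard library; it is not part of the Hyperspace client.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def iter_batches(documents: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield consecutive lists of at most batch_size documents."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(documents)
    while True:
        # Take up to batch_size documents; an empty list means the input is exhausted.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


docs = [{"title": f"movie {i}"} for i in range(5)]
batches = list(iter_batches(docs, 2))
print([len(batch) for batch in batches])  # prints [2, 2, 1]

# Each batch could then be uploaded:
# for batch in iter_batches(documents, 250):
#     hyperspace_client.add_batch(batch, collection_name)
```

Because the helper is a generator, it also works with documents that are read lazily, for example from a file, without holding the full dataset in memory.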

The following example uploads documents for Hybrid Search in batches of 250. Each document is appended to a batch, and once the batch reaches 250 documents, it is uploaded to the Hyperspace Collection. Any remaining documents are uploaded after the loop, and the Collection is then committed. Replace `documents` with code that retrieves the documents to be uploaded.

**Copy the following code snippet –**

{% tabs %}
{% tab title="Python" %}
{% code lineNumbers="true" %}

```python
BATCH_SIZE = 250
batch = []

for i, data_point in enumerate(documents):
    batch.append(data_point)
    if (i + 1) % BATCH_SIZE == 0:
        # Upload a full batch and start a new one
        response = hyperspace_client.add_batch(batch, collection_name)
        print(i + 1, response)
        batch.clear()

# Upload any remaining documents
if batch:
    response = hyperspace_client.add_batch(batch, collection_name)

# Commit once, after all batches have been uploaded
hyperspace_client.commit(collection_name)
```

{% endcode %}
{% endtab %}

{% tab title="Java" %}
{% code lineNumbers="true" %}

```java
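import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

final int batchSize = 250;
List<DataPoint> batch = new ArrayList<>();
List<CompletableFuture<?>> futures = new ArrayList<>();

for (int i = 0; i < documents.size(); i++) {
    batch.add(documents.get(i));
    if ((i + 1) % batchSize == 0) {
        // Copy the batch so that clearing it does not affect the pending upload
        List<DataPoint> batchCopy = new ArrayList<>(batch);
        futures.add(hyperspaceClient.addBatch(batchCopy, collectionName));
        batch.clear();
    }
}

// Upload any remaining documents
if (!batch.isEmpty()) {
    futures.add(hyperspaceClient.addBatch(new ArrayList<>(batch), collectionName));
}
// Wait for all uploads to finish, then commit
CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
hyperspaceClient.commit(collectionName).join();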
```

{% endcode %}
{% endtab %}

{% tab title="JavaScript" %}
{% code lineNumbers="true" %}

```javascript
const batchSize = 250;
let batch = [];

// A for...of loop is used so that each upload can be awaited in order
for (const [index, dataPoint] of documents.entries()) {
    batch.push(dataPoint);
    if ((index + 1) % batchSize === 0) {
        await hyperspaceClient.addBatch(batch, collectionName);
        batch = [];
    }
}

// Upload any remaining documents
if (batch.length > 0) {
    await hyperspaceClient.addBatch(batch, collectionName);
}
await hyperspaceClient.commit(collectionName);
```

{% endcode %}
{% endtab %}
{% endtabs %}

In the above example, Hyperspace assigns each document a random id. To assign ids manually, each document must include an id type field, as explained in [Database schema config file](https://docs.hyper-space.io/hyperspace-docs/projects/setting-up/creating-a-database-schema-configuration-file). The id must be of type keyword/string.
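As a minimal sketch of manual id assignment, the hypothetical helper `with_ids` below adds a string id to every document before upload. The field name `"id"` is an assumption; use whichever field is declared with the id type in your schema configuration file.

```python
import uuid
from typing import Iterable, List


def with_ids(documents: Iterable[dict], id_field: str = "id") -> List[dict]:
    """Return copies of the documents, each with a string id stored in id_field."""
    result = []
    for document in documents:
        copy = dict(document)  # shallow copy, so the caller's dictionaries are not modified
        if id_field in copy:
            # Keep an existing id, but make sure it is a string.
            copy[id_field] = str(copy[id_field])
        else:
            # Otherwise generate a random UUID string.
            copy[id_field] = str(uuid.uuid4())
        result.append(copy)
    return result


documents_with_ids = with_ids([{"title": "star wars"}, {"title": "alien", "id": 7}])
print([type(doc["id"]).__name__ for doc in documents_with_ids])  # prints ['str', 'str']
```

Converting existing ids with `str` keeps numeric ids from upstream systems while satisfying the keyword/string requirement.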

{% hint style="info" %}
Currently, it is not possible to upload additional documents to a Collection after it has been committed. This will change in a future version.
{% endhint %}


