Bool Query

The Hyperspace 'bool' query enables you to construct complex queries by combining multiple sub-queries and conditions. The 'bool' query supports the following sub-clauses: 'must', 'must_not', 'should', 'should_not' and 'filter'.

'must' Clause

The 'must' clause specifies conditions that must be satisfied for a document to be considered a match. In terms of logical operators, it corresponds to the 'and' operator.

'must' Clause Score

Under the 'must' clause, each condition is assigned a probabilistic rarity score. Unless specified otherwise, this score is based on the TF-IDF scoring model. Because each score represents a probability, the total score for the 'must' clause is the product of the individual scores.

Score = score1 * score2 * score3...

Note – This scoring method assumes that 'must' clauses are independent of one another.
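The product rule above can be illustrated with a minimal Python sketch. The function name combine_must_scores and the numeric scores are hypothetical and chosen only for illustration; they are not part of the Hyperspace API and are not real TF-IDF values.

```python
import math


def combine_must_scores(scores):
    """Multiply per-condition scores, as in Score = score1 * score2 * ..."""
    return math.prod(scores)


# Hypothetical scores for two 'must' conditions (not real TF-IDF output).
bird_score = 0.5
in_stock_score = 0.25

print(combine_must_scores([bird_score, in_stock_score]))  # 0.125
```

Because the scores are multiplied, a single low-scoring condition pulls the whole 'must' score down, which is consistent with treating the conditions as independent probabilities.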

Example –

{
  "query": {
    "bool": {
      "must": [
        { "term": { "Bird": "Asian Koel" } },
        { "range": { "price": { "gte": 10} } },
        { "term": { "In Stock": "true" } }
      ]
    }
  }
}

In the example above, all candidates must satisfy all three conditions –

Exact match over the 'Bird' field, with a score that equals the TF-IDF score for { "Bird": "Asian Koel" }.
The 'price' field must be greater than or equal to 10.
A match with the "In Stock" field, with a score that equals the TF-IDF score for { "In Stock": "true" }.

The overall score will be a product of the individual scores.

'must_not' Clause

In Hyperspace, the 'must_not' clause specifies conditions that must not be satisfied for a document to be considered a match. In terms of logical operators, it corresponds to 'not' or 'and not' operators. The must_not clause is used when you do not want any of the specified conditions to be satisfied for a document to be considered a match.

Example –

In the example below, all candidates must satisfy the following condition –

Exact match over the 'Color' field

All candidates must also satisfy none of the following three conditions –

Exact match of the 'Bird' field
The 'price' field is greater than or equal to 10
Exact match of the 'In Stock' field

{
  "query": {
    "bool": {
      "must": { "term": { "Color": "Black" } },
      "must_not": [
        { "term": { "Bird": "Asian Koel" } },
        { "range": { "price": { "gte": 10 } } },
        { "term": { "In Stock": "true" } }
      ]
    }
  }
}
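The matching logic of 'must' combined with 'must_not' can be sketched in Python as follows. This is only an illustration of the boolean semantics described above, not Hyperspace's implementation: documents are modeled as flat dicts, only 'term' and the 'gte' bound of 'range' are handled, and the helper names and the sample documents (including the "House Crow" value) are hypothetical.

```python
def matches_condition(doc, condition):
    """Return True if a flat dict document satisfies one 'term' or 'range' condition."""
    if "term" in condition:
        ((field, value),) = condition["term"].items()
        return doc.get(field) == value
    if "range" in condition:
        ((field, bounds),) = condition["range"].items()
        # Only 'gte' is modeled, since it is the only bound used in the examples.
        return field in doc and doc[field] >= bounds["gte"]
    raise ValueError(f"unsupported condition: {condition!r}")


def as_list(clause):
    """Accept either a single condition or a list of conditions."""
    return clause if isinstance(clause, list) else [clause]


def bool_matches(doc, bool_query):
    """True when every 'must' condition holds and no 'must_not' condition holds."""
    must = as_list(bool_query.get("must", []))
    must_not = as_list(bool_query.get("must_not", []))
    return (all(matches_condition(doc, c) for c in must)
            and not any(matches_condition(doc, c) for c in must_not))


query = {
    "must": {"term": {"Color": "Black"}},
    "must_not": [
        {"term": {"Bird": "Asian Koel"}},
        {"range": {"price": {"gte": 10}}},
        {"term": {"In Stock": "true"}},
    ],
}

# Hypothetical documents used only to exercise the logic.
crow = {"Color": "Black", "Bird": "House Crow", "price": 5, "In Stock": "false"}
koel = {"Color": "Black", "Bird": "Asian Koel", "price": 5, "In Stock": "false"}

print(bool_matches(crow, query))  # True
print(bool_matches(koel, query))  # False
```

The first document passes because it matches the 'must' condition and none of the 'must_not' conditions; the second is rejected because a single 'must_not' hit is enough to exclude it.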

'should' Clause

In Hyperspace, the 'should' clause in a bool query specifies conditions that are optional for a document to be considered a match. Unlike the 'must' clause, which imposes mandatory conditions, the 'should' clause only modifies the document score: a document that satisfies any of the specified conditions receives a higher score. The 'should' clause is often used to express optional or desirable conditions.

'should' Clause Score

Within the 'should' clause, each condition is associated with a designated score, and the overall score for the 'should' clause is determined by combining these individual scores. If not explicitly specified otherwise, scoring follows the TF-IDF scoring model. The 'should' clause is employed when you desire flexibility, as it allows for documents to be considered a match if they satisfy any of the specified conditions.

The combined score is the sum of the individual scores.

Score = score1 + score2 + score3...

Example -

{
  "query": {
    "bool": {
      "must": { "term": { "Bird": "Asian Koel" } },
      "should": [
        { "term": { "Country": "India" } },
        { "term": { "Color": "Black" } }
      ]
    }
  }
}

In the example above, all candidates must satisfy the following conditions –

Exact match of the 'Bird' field, with a score that equals the TF-IDF score for { "Bird": "Asian Koel" }

In addition, any documents that satisfy the following conditions are assigned a higher score. The overall score is the sum of the individual scores.

Exact match of the 'Country' field, with a score that equals the TF-IDF score for { "Country": "India" }
Exact match of the 'Color' field, with a score that equals the TF-IDF score for { "Color": "Black" }
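As a worked illustration of the additive rule, the sketch below adds the scores of the matched 'should' conditions to the 'must' score. It assumes, as the text above suggests, that the 'must' score and the 'should' scores are simply summed; the function name and all numeric values are hypothetical.

```python
def score_with_should(must_score, should_scores, matched):
    """Add the score of every matched 'should' condition to the 'must' score."""
    bonus = sum(score for field, score in should_scores.items() if field in matched)
    return must_score + bonus


# Hypothetical per-condition scores (not real TF-IDF output).
must_score = 2.0
should_scores = {"Country": 0.5, "Color": 0.25}

print(score_with_should(must_score, should_scores, set()))                # 2.0
print(score_with_should(must_score, should_scores, {"Country"}))          # 2.5
print(score_with_should(must_score, should_scores, {"Country", "Color"}))  # 2.75
```

A document that matches neither optional condition still matches the query; it just keeps the score earned from the 'must' clause.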

'should_not' Clause

The 'should_not' clause in a 'bool' query specifies conditions that a document should not meet, reducing its score if these conditions are met. Conversely, the 'should' clause increases the score for documents meeting optional or preferred conditions, enhancing their relevance.

'should_not' Clause Score

In the 'should_not' clause, each condition is associated with a specific score, and the clause's overall score is determined by subtracting these individual scores. Unless explicitly specified otherwise, scoring is based on the TF-IDF scoring model. The 'should_not' clause reduces a document's score if it meets any of the specified conditions.

The combined score is calculated by subtracting the individual scores –

Score = -score1 - score2 - score3...

Example -

{
  "query": {
    "bool": {
      "must": { "term": { "Bird": "Asian Koel" } },
      "should_not": [
        { "term": { "Country": "India" } },
        { "term": { "Color": "Black" } }
      ]
    }
  }
}

In the example above, all candidates must satisfy the following condition –

Exact match of the 'Bird' field, with a score that equals the TF-IDF score for { "Bird": "Asian Koel" }

In addition, any documents that satisfy the following conditions are assigned a lower score –

Exact match of the 'Country' field, with a penalty that equals the TF-IDF score for { "Country": "India" }
Exact match of the 'Color' field, with a penalty that equals the TF-IDF score for { "Color": "Black" }

The overall score is the score of the 'must' clause minus the scores of the matched 'should_not' conditions.
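To show how 'should_not' affects ranking, the following sketch subtracts the scores of matched 'should_not' conditions from a shared 'must' score and then sorts the candidates. The document names, the function name and the numeric scores are all hypothetical, and the sketch assumes the penalties are subtracted directly from the 'must' score.

```python
def penalized_score(must_score, matched_penalties):
    """Subtract the score of each matched 'should_not' condition."""
    return must_score - sum(matched_penalties)


must_score = 2.0  # hypothetical score of the 'must' clause

# Hypothetical candidates mapped to the scores of the 'should_not' conditions they match.
candidates = {
    "doc_a": [],           # matches neither 'should_not' condition
    "doc_b": [0.5],        # matches { "Country": "India" }
    "doc_c": [0.5, 0.25],  # matches both conditions
}

ranked = sorted(
    candidates,
    key=lambda name: penalized_score(must_score, candidates[name]),
    reverse=True,
)
print(ranked)  # ['doc_a', 'doc_b', 'doc_c']
```

All three documents remain matches; the ones that hit more 'should_not' conditions simply rank lower.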

'filter' Clause

The 'filter' clause in a bool query specifies mandatory conditions for a document to be considered a match, in a similar manner to the 'must' clause. However, unlike the 'must' clause, the 'filter' clause does not affect the document score.

Example -

{
  "query": {
    "bool": {
      "must": { "term": { "Bird": "Asian Koel" } },
      "filter": [
        { "term": { "Country": "India" } },
        { "term": { "Color": "Black" } }
      ]
    }
  }
}

In the above example, all candidates must satisfy the following conditions –

Exact match of the 'Bird' field, with a score that equals the TF-IDF score for { "Bird": "Asian Koel" }

In addition, any matched documents must satisfy the following conditions –

Exact match of the 'Country' field ({ "Country": "India" }), with no effect on the score
Exact match of the 'Color' field ({ "Color": "Black" }), with no effect on the score

The overall score will be the rarity score, determined by the "must": { "term": { "Bird": "Asian Koel" } } clause.
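The difference between 'filter' and 'must' can be sketched as a gate that either passes the 'must' score through unchanged or rejects the document. This is an illustration under simplifying assumptions (flat dict documents, 'term' conditions only); the function names, the sample documents and the score value are hypothetical.

```python
def term_matches(doc, condition):
    """Return True if the document's field equals the value in a 'term' condition."""
    ((field, value),) = condition["term"].items()
    return doc.get(field) == value


def score_with_filter(doc, must_score, filters):
    """Return must_score unchanged if every filter matches, else None (no match)."""
    if all(term_matches(doc, f) for f in filters):
        return must_score
    return None


filters = [
    {"term": {"Country": "India"}},
    {"term": {"Color": "Black"}},
]

# Hypothetical documents and a hypothetical 'must' score.
in_india = {"Bird": "Asian Koel", "Country": "India", "Color": "Black"}
elsewhere = {"Bird": "Asian Koel", "Country": "Sri Lanka", "Color": "Black"}

print(score_with_filter(in_india, 0.5, filters))   # 0.5
print(score_with_filter(elsewhere, 0.5, filters))  # None
```

The filter conditions decide only whether a document is kept; the returned score is exactly the one produced by the 'must' clause.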

