Free Text Search

Hyperspace queries support free text search using the standard analyzer. This analyzer performs the following steps:

  • Tokenization - Splits text into individual terms based on whitespace and punctuation, using the "standard tokenizer."

  • Lowercasing - Converts all tokens to lowercase.

  • Removing Punctuation - Strips most punctuation from the tokens.

  • Removing Accents - Strips accents from characters (e.g., é becomes e).
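
The steps above can be approximated in a few lines of Python. This is a minimal sketch for illustration only, not Hyperspace's implementation: the function name `analyze` is hypothetical, the regular expression is a simplified stand-in for the standard tokenizer (for instance, it splits on apostrophes), and stemming is not applied.

```python
import re
import unicodedata


def analyze(text: str) -> list[str]:
    """Approximate the analysis steps: strip accents, lowercase, tokenize."""
    # Removing accents: decompose each character, then drop combining marks,
    # so that "é" becomes "e" and "ï" becomes "i".
    decomposed = unicodedata.normalize("NFKD", text)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    # Lowercasing: convert the whole string before splitting it.
    lowered = unaccented.lower()

    # Tokenization and punctuation removal: keep runs of letters and digits;
    # whitespace and punctuation act as separators and are discarded.
    return re.findall(r"[^\W_]+", lowered)


if __name__ == "__main__":
    print(analyze("The naïve approach, in many cases!"))
```

Under these assumptions, the sample call prints `['the', 'naive', 'approach', 'in', 'many', 'cases']`.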

Currently, only the English stemmer is supported.

Example

{
  "query": {
    "bool": {
      "match": {
        "content": "The naïve approach to text search can be sufficient in many cases"
      }
    }
  }
}

In the above example, the text will be converted to the following list of keywords:

["the", "naive", "approach", "to", "text", "search", "can", "be", "sufficient", "in", "many", "cases"]

These keywords are then matched against the "content" field of the documents in the database.
