Vector Search Metrics
Last updated
Last updated
The Metric defines the distance measure between vectors during the vector search. From a mathematical point of view, the K nearest neighbors are those with the smallest distance from the search query vector. By contrast, in vector search higher score reflects greater similarity. Hyperspace bridges this gap using algebraic relations between the distance and score, as described below.
l2
ip
hamming
The metric represents the distance between 2 points in a Euclidean space. The metric is derived from the Pythagorean theorem and represents the length of the shortest path between the two points.
The metric is a special case of the metric, given by
In Hyperspace, we use the squared metric.
The inner product metric, denoted as , is a measure of similarity between two vectors. The mathematical operation calculates the projection of one vector on the other, which is 1 if the vectors are parallel and 0 if they are perpendicular. In a Euclidean distance, the inner product is given by
The Hamming distance quantifies the difference between two strings of equal length by measuring the number of positions at which the corresponding symbols differ. For binary strings, it counts the bits that are different between two binary vectors. The lower the Hamming distance, the more similar the strings are considered to be, and vice versa.
The hamming distance is calculated as
where is the Kronecker delta. The above formula counts the number of characters that are different between and .