AppearMore by Taptwice Media
Manhattan Distance (L1 Norm)

Manhattan Distance (also known as the L1 Norm, $\ell_1$ distance, or Taxicab geometry) is a Distance Metric used in mathematics and machine learning to calculate the distance between two points in a grid-like space. It is defined as the sum of the absolute differences of their Cartesian coordinates.

Unlike Euclidean Distance (L2 Norm), which measures the shortest straight-line path (as the crow flies), Manhattan Distance measures the distance if one were restricted to traveling only along the grid lines (like a taxi driving along city blocks in Manhattan).


Context: Relation to LLMs and Sparse Vectors

While Vector Embeddings in modern Large Language Models (LLMs) typically use Cosine Similarity or Euclidean Distance for Neural Search, Manhattan Distance remains an important metric for specific types of data, particularly those that are sparse or where feature interpretability is prioritized.

  • Sparsity and Feature Selection: In older or simpler Natural Language Processing (NLP) models (e.g., those using Bag-of-Words representations where most features are zero), Manhattan Distance can be more robust than Euclidean Distance. Because it sums absolute differences, dimensions where both vectors are zero contribute nothing to the distance; for binary features, the L1 distance reduces to the number of positions where the vectors disagree (i.e., the Hamming distance).
  • Feature Interpretability: Manhattan Distance provides a result that is easier to interpret in terms of feature differences. If the features represent individual word counts or specific attributes, the L1 distance is simply the total accumulated difference across all attributes.
  • Robustness to Outliers: Compared to Mean Squared Error (MSE) or Euclidean Distance (which squares the differences), Manhattan Distance only takes the absolute difference. This makes it less sensitive to extreme outliers in the data, which can sometimes be advantageous in scenarios with noisy Training Set features.
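The outlier-robustness point above can be seen directly in a small sketch (the vectors here are illustrative, not from any real dataset): two vectors with the same total L1 deviation from the origin can have very different L2 distances, because squaring amplifies a single large difference.

```python
import math

def manhattan(a, b):
    """L1 distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))

def euclidean(a, b):
    """L2 distance: square root of the sum of squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

base = [0, 0, 0, 0]
spread = [3, 3, 3, 3]    # four small differences
outlier = [12, 0, 0, 0]  # one large (outlier) difference

print(manhattan(base, spread))   # 12
print(manhattan(base, outlier))  # 12  (same under L1)
print(euclidean(base, spread))   # 6.0
print(euclidean(base, outlier))  # 12.0 (twice as far under L2)
```

Under L1 the two vectors are equally distant from the origin, while L2 penalizes the single extreme difference far more heavily, which is exactly why squared-error metrics are more sensitive to noisy features.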

The Manhattan Distance Formula

Given two vectors (points) $A$ and $B$, each with $n$ dimensions:

$$A = (a_1, a_2, \dots, a_n)$$

$$B = (b_1, b_2, \dots, b_n)$$

The Manhattan Distance $D_{M}$ is calculated as the sum of the absolute differences of their coordinates:

$$D_{M}(A, B) = \sum_{i=1}^{n} |a_i - b_i|$$
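The formula translates directly into a few lines of plain Python; this is a minimal sketch (the function name is ours, not from any particular library):

```python
def manhattan_distance(a, b):
    """L1 (Manhattan) distance: the sum of absolute coordinate differences."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    return sum(abs(a_i - b_i) for a_i, b_i in zip(a, b))

print(manhattan_distance((1, 5), (4, 1)))  # 7
```

With NumPy arrays, the equivalent would be `np.sum(np.abs(a - b))`; SciPy also exposes this metric as `scipy.spatial.distance.cityblock`.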

Example in 2D Space

Consider a point $A=(1, 5)$ and a point $B=(4, 1)$.

  1. Change in x-axis (horizontal): $|4 - 1| = 3$
  2. Change in y-axis (vertical): $|1 - 5| = 4$
  3. Manhattan Distance: $3 + 4 = 7$

In a city grid, to get from (1, 5) to (4, 1), you would walk 3 blocks East and 4 blocks South, for a total distance of 7 blocks.
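The same walk can be checked in a couple of lines of Python, computing each axis contribution separately:

```python
A = (1, 5)
B = (4, 1)

dx = abs(B[0] - A[0])  # 3 blocks East
dy = abs(B[1] - A[1])  # 4 blocks South

print(dx + dy)  # 7 blocks total
```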


Related Terms

  • Distance Metric: The general category of functions to which Manhattan Distance belongs.
  • Euclidean Distance (L2 Norm): The most common straight-line distance metric, contrasted with Manhattan Distance.
  • Vector Embedding: The numerical representation of text, where distance metrics are applied.
