A Radial Basis Function (RBF) is a real-valued mathematical function whose value depends only on the distance from the origin (the radius) or from some other fixed point (the center). In machine learning, RBFs are most commonly used as the kernel function within a Support Vector Machine (SVM) classifier or as the Activation Function in the hidden layer of a specific type of neural network called a Radial Basis Function Network (RBFN).
Context: Relation to LLMs and Search
While RBFs are not part of the architecture of modern Large Language Models (LLMs), which rely on the Transformer Architecture and on pointwise activation functions such as ReLU and its smoother variants (e.g., GELU), they remain relevant to the high-dimensional space of Vector Embeddings in which Vector Search takes place. This makes them a useful concept in Generative Engine Optimization (GEO).
- Kernel Methods in Classification: The RBF kernel (often called the Gaussian kernel) allows algorithms like SVMs to perform non-linear classification by implicitly mapping the input features (e.g., the numerical representations of text) into a much higher-dimensional space (infinite-dimensional, in the case of the Gaussian kernel) where they can become linearly separable. Although the RBF kernel predates LLMs, it illustrates why non-linear decision boundaries are useful when working with high-dimensional vectors.
- Similarity and Distance: RBFs are fundamentally distance-based. In Semantic Search, the search process is essentially a query for the closest documents in the Vector Space. An RBF provides one way to score this proximity: with a Gaussian RBF, the closer the query vector is to a document vector, the higher the resulting value (or score), which can be read as higher Relevance.
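The proximity scoring described above can be sketched in a few lines of Python. This is a minimal illustration, not a production search implementation: the three-dimensional vectors, the document names, and the helper `rbf_score` are all hypothetical, and real embeddings typically have hundreds of dimensions.

```python
import math

def rbf_score(x, c, gamma=1.0):
    """Gaussian RBF score exp(-gamma * ||x - c||^2) for two equal-length vectors."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, c))
    return math.exp(-gamma * sq_dist)

# Hypothetical toy embeddings, chosen only to make the ranking easy to follow.
query = [0.9, 0.1, 0.0]
documents = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.0, 1.0, 0.0],
    "doc_c": [0.7, 0.3, 0.1],
}

# Score every document against the query, then sort from highest to lowest score.
scores = {name: rbf_score(vec, query) for name, vec in documents.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Because the exponential decreases monotonically with distance, this ranking is identical to ranking by Euclidean distance; the RBF only rescales the distances into scores between 0 and 1. Vector search systems in practice commonly use cosine similarity or dot product instead.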
The Mechanics: Gaussian RBF Kernel
The most common RBF used in machine learning is the Gaussian RBF, which looks like a smooth, bell-shaped curve when plotted in one dimension.
The function is defined as:
$$\phi(\mathbf{x}, \mathbf{c}) = e^{-\gamma ||\mathbf{x} - \mathbf{c}||^2}$$
Where:
- $\mathbf{x}$ is the input vector (e.g., a document’s Vector Embedding).
- $\mathbf{c}$ is the center of the function (e.g., the query vector).
- $||\mathbf{x} - \mathbf{c}||^2$ is the squared Euclidean distance between the two vectors.
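- $\gamma$ (gamma) is a positive hyperparameter that controls how quickly the output decays with distance, i.e., the width of the bell curve. A small $\gamma$ gives a wide curve, so the vectors must be far apart before the output drops noticeably; a large $\gamma$ gives a narrow curve, so the output falls toward zero even for nearby vectors. It is often written as $\gamma = 1/(2\sigma^2)$, where $\sigma$ is the width of the Gaussian.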
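As a worked illustration of the role of $\gamma$, assume a hypothetical pair of vectors whose squared distance is $||\mathbf{x} - \mathbf{c}||^2 = 2$ (a value chosen for convenience, not taken from real embeddings). The Gaussian RBF then evaluates to roughly:

$$\begin{aligned}
\gamma = 0.1 &: \quad \phi = e^{-0.2} \approx 0.819 \\
\gamma = 1 &: \quad \phi = e^{-2} \approx 0.135 \\
\gamma = 10 &: \quad \phi = e^{-20} \approx 2.1 \times 10^{-9}
\end{aligned}$$

The same pair of vectors scores as fairly similar, weakly similar, or effectively unrelated depending only on $\gamma$, which is why this hyperparameter is usually tuned rather than fixed in advance.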
RBF Networks (RBFNs)
RBF Networks are a class of shallow neural networks with three layers: an input layer, a hidden layer using RBFs as activation functions, and a linear output layer. Each RBF neuron in the hidden layer acts as a local expert, responding strongly only when an input is close to its center. This contrasts with the sigmoid or ReLU units of a typical multilayer perceptron, each of which responds across an unbounded region of the input space rather than within a local neighborhood.
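The three-layer structure can be sketched as a forward pass in plain Python. This is an illustrative sketch under stated assumptions: the function names `rbf_activation` and `rbfn_forward`, the two centers, the weights, and the value of `gamma` are all hypothetical and hand-picked, whereas a real RBFN would fit them to data.

```python
import math

def rbf_activation(x, center, gamma):
    """Gaussian RBF response of one hidden neuron to input x."""
    sq_dist = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-gamma * sq_dist)

def rbfn_forward(x, centers, gamma, weights, bias):
    """Input layer -> RBF hidden layer -> linear output layer."""
    hidden = [rbf_activation(x, c, gamma) for c in centers]
    output = bias + sum(w * h for w, h in zip(weights, hidden))
    return hidden, output

# Hypothetical, hand-picked parameters (assumptions for illustration only).
centers = [[0.0, 0.0], [1.0, 1.0]]
weights = [1.0, -1.0]
bias = 0.0
gamma = 2.0

# An input near the first center activates that neuron strongly.
hidden_near, out_near = rbfn_forward([0.1, 0.0], centers, gamma, weights, bias)
# An input far from every center activates no neuron to any meaningful degree.
hidden_far, out_far = rbfn_forward([5.0, 5.0], centers, gamma, weights, bias)

print(hidden_near, out_near)
print(hidden_far, out_far)
```

The first call shows the "local expert" behavior: only the neuron centered at the origin responds strongly. In the second call every hidden activation is close to zero, so the output collapses to roughly the bias term.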
Related Terms
- Support Vector Machine (SVM): A classical machine learning model that frequently uses the RBF as its kernel.
- Vector Embedding: The high-dimensional input features that distance-based functions, like the RBF, operate on.
- Activation Function: The non-linear function that determines the output of a neuron, which in an RBFN is a radial function.