Query Expansion is a technique used in information retrieval and search systems to improve Recall by automatically adding terms to, or reformulating, the user’s original query before the search is executed. The goal is to bridge the gap between the user’s words and the language used in the documents (the lexical mismatch problem), ensuring that all conceptually relevant documents are retrieved.
Context: Relation to LLMs and Search
Query Expansion is a crucial optimization step in the Query Processing pipeline for Retrieval-Augmented Generation (RAG) systems, making it a key strategy in Generative Engine Optimization (GEO).
- Boosting Recall: Query expansion directly addresses the Recall problem in Vector Search. If a user searches for “car fix,” but the knowledge base primarily uses the term “auto maintenance,” expanding the query to include synonyms ensures the necessary documents are retrieved.
- LLM-Powered Reformulation: Modern Large Language Models (LLMs) are used for advanced query expansion. An LLM can be prompted to generate several alternative versions of the user’s question, which capture the same Semantics but use different phrasing. These multiple expanded queries are then embedded and searched in parallel, dramatically increasing the chance of finding the most relevant document chunks for the Context Window.
- GEO Strategy: In enterprise RAG, query expansion is essential for handling complex, verbose, or ambiguous user queries. It maximizes the quality of the initial retrieval, which, in turn, minimizes the risk of Hallucination during the Generation phase of the RAG pipeline.
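The LLM-powered reformulation step described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the `generate` callable is a hypothetical stand-in for any LLM completion client, and the prompt wording is an assumption.

```python
# Sketch of LLM-based multi-query expansion. `generate` is a hypothetical
# placeholder for an LLM completion function (swap in your provider's client).
def expand_query(query, generate, n_variants=3):
    prompt = (
        f"Rewrite the search query '{query}' in {n_variants} different ways, "
        "one per line, preserving its meaning."
    )
    variants = [line.strip() for line in generate(prompt).splitlines() if line.strip()]
    # Keep the original query alongside every variant so lexical
    # mismatches with the knowledge base are covered from both sides.
    return [query] + variants

# Canned paraphrases standing in for a real model's output.
def fake_llm(prompt):
    return "auto maintenance guide\nhow to repair a vehicle\ncar servicing tips"

queries = expand_query("car fix", fake_llm)
```

Each string in `queries` would then be embedded and searched in parallel, as described above.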
Types of Query Expansion
Query expansion techniques can be either manually defined or automatically generated:
1. Manual/Lexical Expansion
- Mechanism: Relies on predefined resources like a thesaurus, synonym lists, or a Knowledge Graph (e.g., if the user searches for a product nickname, the system automatically adds the official product name).
- Application: Used primarily in Sparse Retrieval (keyword-based) systems, but can complement Vector Search with known authoritative terms.
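A thesaurus-driven expansion of this kind reduces to a dictionary lookup. The synonym table below is illustrative only, not a real lexical resource:

```python
# Minimal lexical expansion via a hand-maintained synonym table.
# The entries here are illustrative assumptions, not a real thesaurus.
SYNONYMS = {
    "car": ["auto", "vehicle"],
    "fix": ["repair", "maintenance"],
}

def lexical_expand(query):
    terms = query.lower().split()
    expanded = list(terms)
    for term in terms:
        # Append every known synonym for each query term.
        expanded.extend(SYNONYMS.get(term, []))
    return " ".join(expanded)

lexical_expand("car fix")  # -> "car fix auto vehicle repair maintenance"
```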
2. Relevance Feedback
- Mechanism: After an initial search, the system asks the user to identify which retrieved documents are relevant. The terms from these relevant documents are then added to the original query to refine the search.
- Application: Less common in search engines but used in iterative research tools.
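In vector-space retrieval, the classical formulation of this feedback loop is the Rocchio algorithm: the query vector is nudged toward the centroid of documents the user marked relevant and away from non-relevant ones. The sketch below assumes plain Python lists as vectors; the `alpha`/`beta`/`gamma` weights are conventional defaults, not fixed values.

```python
# Rocchio-style relevance feedback: move the query vector toward the
# relevant documents' centroid and away from the non-relevant one.
# Weights are conventional defaults (assumptions, tune per system).
def rocchio(query_vec, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    dim = len(query_vec)
    new_vec = [alpha * q for q in query_vec]
    for doc in relevant:
        for i in range(dim):
            new_vec[i] += beta * doc[i] / len(relevant)
    for doc in non_relevant:
        for i in range(dim):
            new_vec[i] -= gamma * doc[i] / len(non_relevant)
    return new_vec
```

The adjusted vector is then used for a second retrieval pass.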
3. Semantic/Neural Expansion (LLM-Based)
- Mechanism: A model built on the Transformer Architecture is prompted to predict alternative phrasings or related concepts.
- Paraphrasing: Generating sentences that mean the same as the query (e.g., Query: “how to change a flat tire” $\rightarrow$ Expansion: “steps for replacing a punctured tire”).
- Concept Expansion: Adding semantically related terms (e.g., Query: “fast car” $\rightarrow$ Expansion: “sports vehicle,” “high performance auto”).
- Application: The gold standard in modern RAG systems, as it leverages the LLM’s deep understanding of Semantics to maximize retrieval effectiveness.
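Once a set of expanded queries exists, a RAG pipeline typically embeds each one, searches the vector index, and merges the results into a single candidate pool. The sketch below assumes hypothetical `embed` and `search` callables standing in for an embedding model and a vector store; deduplication keeps the first score seen per document, a deliberate simplification.

```python
# Multi-query retrieval: search the index once per expanded query and
# merge the hits by document id. `embed` and `search` are hypothetical
# placeholders for an embedding model and a vector-store query function.
def multi_query_search(queries, embed, search, top_k=5):
    seen, merged = set(), []
    for q in queries:
        for doc_id, score in search(embed(q), top_k):
            if doc_id not in seen:  # keep first score per document (simplification)
                seen.add(doc_id)
                merged.append((doc_id, score))
    # Sort the union by similarity so the best chunks enter the context window first.
    return sorted(merged, key=lambda pair: pair[1], reverse=True)
```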
Drawbacks
The main risk of query expansion is adding terms that are irrelevant to the user’s true intent, which can lower Precision by retrieving “noisy” or non-authoritative documents. This is why the subsequent Reranking step is crucial to filter out the false positives introduced by a highly aggressive expansion strategy.
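That filtering step can be sketched as re-scoring each candidate against the *original* query. The `cross_score` callable below is a hypothetical stand-in for a cross-encoder or other relevance model:

```python
# Rerank candidates from an aggressive expansion against the original
# query. `cross_score` is a hypothetical placeholder for a cross-encoder
# model that scores (query, document) pairs.
def rerank(original_query, candidates, cross_score, keep=3):
    scored = [(doc, cross_score(original_query, doc)) for doc in candidates]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    # Keep only the top-scoring documents; expansion-induced false
    # positives tend to score low against the original intent.
    return [doc for doc, _ in scored[:keep]]
```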
Related Terms
- Recall: The primary metric that query expansion is designed to improve.
- Query Processing: The overall workflow in which query expansion is an early step.
- Vector Embedding: The numerical representation that expanded queries are converted into for searching.