AppearMore by Taptwice Media
Recall

Recall (also known as sensitivity or completeness) is a critical metric in information retrieval and machine learning used to measure the proportion of all truly relevant items (documents, data points, or entities) that a system successfully retrieved or identified. In essence, it measures how complete the system’s search results are.


Context: Relation to LLMs and Search

Recall is one of the two core metrics (along with Precision) used to evaluate the performance of the Retrieval phase in a Retrieval-Augmented Generation (RAG) system, making it essential for Generative Engine Optimization (GEO).

  • Completeness of Context: In a RAG pipeline, high recall means the system successfully found and retrieved all the document chunks that contain the facts necessary to answer the user’s query. If a key fact is missing, the LLM will not have it in its Context Window and may fail to generate a complete or accurate Generative Snippet.
  • Mitigating Hallucination: While Precision focuses on filtering out garbage, recall focuses on ensuring the model has enough relevant signal. Low recall can lead to Hallucination because the LLM will fall back on its internal, potentially outdated, knowledge when the required information is not present in the retrieved context.
  • Semantic Search Effectiveness: Recall is often improved by moving from keyword-based search to Semantic Search (Vector Search). By using Vector Embeddings to match the meaning of the query, the system can find relevant documents even if they use different vocabulary, thereby increasing the number of relevant items retrieved and boosting recall.
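As a minimal sketch of how retrieval recall is measured in practice, the function below computes recall@k for a single query: the fraction of gold-standard relevant chunks that appear in the top-k retrieved results. The chunk IDs and the function name are illustrative, not from any particular library.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant chunks that appear in the top-k retrieved results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(retrieved_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)

# Hypothetical query: 2 of the 3 gold chunks appear in the top 5 results.
retrieved = ["c7", "c2", "c9", "c1", "c4", "c3"]
relevant = ["c1", "c2", "c3"]
print(recall_at_k(retrieved, relevant, k=5))  # 2/3
```

In a real evaluation this would be averaged over a labeled query set, since recall on a single query is noisy.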

The Mechanics: The Formula

Recall is defined mathematically as:

$$\text{Recall} = \frac{\text{Number of Relevant Documents Retrieved}}{\text{Total Number of Relevant Documents in Corpus}}$$

This can also be expressed in terms of True Positives (TP) and False Negatives (FN):

$$\text{Recall} = \frac{\text{TP}}{\text{TP} + \text{FN}}$$

  • True Positive (TP): A relevant document that was correctly retrieved.
  • False Negative (FN): A relevant document that the system failed to retrieve (a miss).
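The TP/FN form of the formula translates directly into set operations over document IDs. This is a generic sketch (the document IDs are made up for illustration):

```python
def recall(retrieved: set, relevant: set) -> float:
    """Recall = TP / (TP + FN), computed over sets of document IDs."""
    if not relevant:
        return 0.0
    tp = len(retrieved & relevant)   # relevant documents we actually retrieved
    fn = len(relevant - retrieved)   # relevant documents we missed
    return tp / (tp + fn)

# 4 of the 5 relevant documents were retrieved ("d9" is an irrelevant extra),
# so recall = 4 / (4 + 1) = 0.8.
print(recall({"d1", "d2", "d3", "d4", "d9"}, {"d1", "d2", "d3", "d4", "d5"}))
```

Note that the irrelevant document `d9` does not affect recall at all; it would only lower precision.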

Recall vs. Precision

Recall and Precision are typically in a trade-off relationship, and GEO efforts involve finding the ideal balance for a given application.

| Feature | Recall | Precision |
| --- | --- | --- |
| Focus | Completeness (minimizing misses / False Negatives). | Accuracy (minimizing noise / False Positives). |
| Question | Out of all the relevant answers that exist, how many did the system find? | Out of the answers the system found, how many were correct? |
| Strategy | Broaden the search scope (e.g., retrieve more vectors). | Narrow the search scope (e.g., use Reranking). |

A perfect recall score of 1.0 means the system found every single relevant document in the entire corpus, regardless of how many irrelevant documents were also retrieved (which would impact precision).
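The trade-off is easiest to see by sweeping the retrieval cutoff k over a ranked result list. In this hedged sketch (the ranking and relevance labels are invented), recall climbs toward 1.0 as k grows, while precision tends to fall because more irrelevant documents slip in:

```python
def precision_recall(retrieved, relevant):
    """Precision and recall for a retrieved list against a relevant set."""
    tp = len(set(retrieved) & relevant)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    return precision, recall

# Hypothetical ranked results; d1–d5 are the truly relevant documents.
ranked = ["d1", "d8", "d2", "d9", "d3", "d7", "d4", "d6", "d5"]
relevant = {"d1", "d2", "d3", "d4", "d5"}

for k in (3, 6, 9):
    p, r = precision_recall(ranked[:k], relevant)
    print(f"k={k}: precision={p:.2f} recall={r:.2f}")
```

At k=9 the system achieves perfect recall (1.0) but only ~0.56 precision, which is exactly the situation the paragraph above describes.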


Related Terms

  • Precision: The sister metric to recall, measuring the quality of the retrieved set.
  • Ground Truth: The human-labeled, correct set of relevant documents; its size serves as the denominator when calculating recall.
  • F1-Score: The harmonic mean of precision and recall, often used as a single, combined metric to evaluate overall retrieval performance.
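Since the F1-Score combines the two metrics discussed above, a one-line implementation of the harmonic mean may be a useful reference:

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# A system with perfect recall but mediocre precision is penalized:
print(f1_score(0.5, 1.0))  # 0.666...
```

The harmonic mean punishes imbalance, so a system cannot reach a high F1 by maximizing recall alone.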
