
RAG Performance

The RAG Performance report evaluates how well the retrieval-augmented generation pipeline serves agents. It covers answer quality, response speed, and per-knowledge-base effectiveness.

Figure: RAG Performance report overview. The dashboard shows an acceptance rate chart, response time trends, a source relevance histogram, and a KB effectiveness table.

Answer Quality Metrics

Metric | Description
Acceptance Rate | Percentage of RAG-generated suggestions that agents accepted.
Rejection Reasons | Breakdown of why agents rejected suggestions (irrelevant, inaccurate, incomplete, other).

The acceptance rate is the primary indicator of RAG quality. A declining rate signals that knowledge base content may need review.
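As a rough illustration of how these two metrics relate, the sketch below computes an acceptance rate and a rejection-reason breakdown from a list of feedback records. The record shape (`accepted`, `reason`) and the sample data are assumptions made for this example, not the product's actual schema, and it assumes every suggestion was either accepted or rejected.

```python
from collections import Counter

# Hypothetical feedback records: (accepted, rejection_reason or None).
feedback = [
    (True, None),
    (False, "irrelevant"),
    (True, None),
    (False, "incomplete"),
    (False, "irrelevant"),
]

def acceptance_rate(records):
    """Return the share of accepted suggestions as a percentage (0.0 if empty)."""
    if not records:
        return 0.0
    accepted = sum(1 for ok, _ in records if ok)
    return 100.0 * accepted / len(records)

def rejection_breakdown(records):
    """Count rejected suggestions by their stated reason."""
    return Counter(reason for ok, reason in records if not ok)

rate = acceptance_rate(feedback)
reasons = rejection_breakdown(feedback)
print(f"Acceptance rate: {rate:.1f}%")
print(dict(reasons))
```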

Response Time Metrics

Two timing metrics are tracked over the selected date range:

Metric | Description
TTFT (Time to First Token) | Elapsed time from query submission to the first token of the generated answer.
Total Generation Time | Elapsed time from query submission to complete answer delivery.

Trend charts display these values over time, making it easy to spot latency regressions.
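To make the two definitions concrete, here is a minimal sketch that measures both values while consuming a streamed answer. The `timed_stream` helper and the `fake_stream` generator are hypothetical stand-ins for illustration. How the product instruments its pipeline is not described here.

```python
import time

def timed_stream(token_iter, clock=time.perf_counter):
    """Consume a token iterator; return (answer, ttft_seconds, total_seconds).

    ttft_seconds is None if the stream yields no tokens.
    """
    start = clock()  # query submission
    ttft = None
    tokens = []
    for token in token_iter:
        if ttft is None:
            ttft = clock() - start  # first token arrived
        tokens.append(token)
    total = clock() - start  # complete answer delivered
    return "".join(tokens), ttft, total

def fake_stream():
    """Hypothetical token source that simulates generation latency."""
    time.sleep(0.05)
    yield "Hello"
    time.sleep(0.05)
    yield ", world"

answer, ttft, total = timed_stream(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms")
```

Both values share the same start point, so TTFT can never exceed total generation time for the same query.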

TIP

A sudden increase in TTFT often points to infrastructure issues (database latency, LLM provider slowdowns), while a gradual increase in total generation time may indicate growing answer complexity.

Source Relevance Distribution

A histogram shows the distribution of relevance scores for retrieved document chunks. This helps you understand whether the vector search is returning high-quality matches or marginal content.
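The binning behind such a histogram can be sketched as follows. This example assumes relevance scores are normalized to the range 0 to 1 and uses five equal-width bins. Both choices are assumptions for illustration, since the report's actual score scale and bin widths are not specified here.

```python
def relevance_histogram(scores, bins=5):
    """Count scores in equal-width bins over [0, 1]; 1.0 falls in the last bin."""
    counts = [0] * bins
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        idx = min(int(s * bins), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical relevance scores for retrieved chunks.
scores = [0.12, 0.35, 0.41, 0.78, 0.81, 0.93, 1.0]
hist = relevance_histogram(scores)

for i, count in enumerate(hist):
    low, high = i / len(hist), (i + 1) / len(hist)
    print(f"{low:.1f}-{high:.1f} | {'#' * count}")
```

A distribution weighted toward the upper bins suggests strong matches, while a cluster in the lower bins suggests the search is returning marginal content.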

KB Effectiveness

The per-knowledge-base breakdown compares accepted and rejected suggestions for each knowledge base (KB):

Column | Description
Knowledge Base | Name of the knowledge base.
Queries | Number of RAG queries that retrieved content from this KB.
Accepted | Number of accepted suggestions sourced from this KB.
Rejected | Number of rejected suggestions sourced from this KB.
Acceptance Rate | Accepted / (Accepted + Rejected) as a percentage.
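The Acceptance Rate column can be reproduced from the other columns. The sketch below uses a hypothetical `KBStats` record and made-up sample rows. It returns `None` when a KB has no accepted or rejected suggestions, which is an assumption about how an undefined rate might be handled.

```python
from dataclasses import dataclass

@dataclass
class KBStats:
    """Hypothetical row of the KB effectiveness table."""
    name: str
    queries: int
    accepted: int
    rejected: int

    @property
    def acceptance_rate(self):
        """Accepted / (Accepted + Rejected) as a percentage, or None if unrated."""
        rated = self.accepted + self.rejected
        return 100.0 * self.accepted / rated if rated else None

rows = [
    KBStats("Billing FAQ", 120, 80, 20),
    KBStats("Returns Policy", 60, 15, 35),
    KBStats("Legacy Docs", 5, 0, 0),
]

for row in rows:
    rate = row.acceptance_rate
    shown = f"{rate:.1f}%" if rate is not None else "n/a"
    print(f"{row.name}: {shown}")
```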

Top and Underperforming KBs

The report highlights:

  • Top performing -- Knowledge bases with the highest acceptance rates and query volumes.
  • Underperforming -- Knowledge bases with acceptance rates below the tenant average.
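One way to picture these two groupings is sketched below. It assumes the tenant average is the pooled acceptance rate across all KBs and that "top performing" ranks by acceptance rate with query volume as a tie-breaker. Neither rule is confirmed by the report, and the sample counts are invented.

```python
# Hypothetical per-KB counts: name -> (queries, accepted, rejected).
# Assumes every KB has at least one accepted or rejected suggestion.
kb_counts = {
    "Billing FAQ": (120, 80, 20),
    "Returns Policy": (60, 15, 35),
    "Shipping Guide": (90, 45, 30),
}

def rate(accepted, rejected):
    """Acceptance rate as a percentage."""
    return 100.0 * accepted / (accepted + rejected)

# Assumed definition: tenant average = pooled rate over all KBs.
total_accepted = sum(a for _, a, _ in kb_counts.values())
total_rejected = sum(r for _, _, r in kb_counts.values())
tenant_average = rate(total_accepted, total_rejected)

# KBs whose own rate falls below the tenant average.
underperforming = sorted(
    name for name, (_, a, r) in kb_counts.items() if rate(a, r) < tenant_average
)

# Assumed ranking: acceptance rate first, query volume as tie-breaker.
ranked = sorted(
    kb_counts,
    key=lambda n: (rate(kb_counts[n][1], kb_counts[n][2]), kb_counts[n][0]),
    reverse=True,
)

print(f"Tenant average: {tenant_average:.1f}%")
print("Underperforming:", underperforming)
print("Ranked:", ranked)
```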

WARNING

An underperforming knowledge base does not necessarily contain bad content. It may be matched against queries outside its intended scope. Review the Gap Analysis report to determine whether content updates or scope adjustments are needed.

Feedback-Based Reranking Impact

If feedback-based reranking is enabled, the RAG pipeline adjusts chunk rankings based on historical agent feedback. Over time, this should improve the acceptance rate as consistently unhelpful chunks are deprioritized. Compare acceptance rates before and after enabling reranking to measure the impact.
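A before-and-after comparison might look like the sketch below. The daily counts and the enable date are hypothetical, and the pooled-rate calculation is an assumption about how the two periods could be summarized.

```python
from datetime import date

# Hypothetical daily counts: (day, accepted, rejected).
daily = [
    (date(2024, 3, 1), 40, 60),
    (date(2024, 3, 2), 45, 55),
    (date(2024, 3, 3), 55, 45),
    (date(2024, 3, 4), 60, 40),
]
reranking_enabled_on = date(2024, 3, 3)  # hypothetical enable date

def pooled_rate(rows):
    """Acceptance rate (%) over all rows, or None if nothing was rated."""
    accepted = sum(a for _, a, _ in rows)
    rejected = sum(r for _, _, r in rows)
    rated = accepted + rejected
    return 100.0 * accepted / rated if rated else None

before = pooled_rate([r for r in daily if r[0] < reranking_enabled_on])
after = pooled_rate([r for r in daily if r[0] >= reranking_enabled_on])

print(f"Before: {before:.1f}%  After: {after:.1f}%  Change: {after - before:+.1f} pts")
```

A raw difference like this does not account for other changes made in the same period, such as content updates, so it is best read as a rough signal and not as proof of impact.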

OmniBots Agent Assist