RAG Performance

The RAG Performance report evaluates how well the retrieval-augmented generation pipeline serves agents. It covers answer quality, response speed, and per-knowledge-base effectiveness.

Figure: RAG Performance report overview (dashboard with acceptance rate chart, response time trends, source relevance histogram, and KB effectiveness table)

Answer Quality Metrics

Metric	Description
Acceptance Rate	Percentage of RAG-generated suggestions that agents accepted.
Rejection Reasons	Breakdown of why agents rejected suggestions (irrelevant, inaccurate, incomplete, other).

The acceptance rate is the primary indicator of RAG quality. A declining rate signals that knowledge base content may need review.
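As a rough illustration of how these two metrics relate, the sketch below derives an acceptance rate and a rejection-reason breakdown from a list of feedback records. The record shape, the sample data, and the function names are hypothetical and not part of the product. It also assumes every suggestion in the list received an explicit accept or reject decision.

```python
from collections import Counter

# Hypothetical feedback records: (accepted?, rejection reason or None).
feedback = [
    (True, None),
    (False, "irrelevant"),
    (True, None),
    (False, "incomplete"),
    (True, None),
    (False, "irrelevant"),
]

def acceptance_rate(records):
    """Percentage of suggestions that agents accepted."""
    if not records:
        return 0.0
    accepted = sum(1 for ok, _ in records if ok)
    return 100.0 * accepted / len(records)

def rejection_breakdown(records):
    """Count rejected suggestions per reason; a missing reason counts as 'other'."""
    return Counter(reason or "other" for ok, reason in records if not ok)

rate = acceptance_rate(feedback)
reasons = rejection_breakdown(feedback)
print(f"Acceptance rate: {rate:.1f}%")
print(dict(reasons))
```

If one reason dominates the breakdown, that is usually a more actionable signal than the overall rate alone.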

Response Time Trends

Two timing metrics are tracked over the selected date range:

Metric	Description
TTFT (Time to First Token)	Elapsed time from query submission to the first token of the generated answer.
Total Generation Time	Elapsed time from query submission to complete answer delivery.

Trend charts display these values over time, making it easy to spot latency regressions.
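To make the difference between the two timings concrete, here is a minimal sketch that measures both while consuming a token stream. It is not the product's instrumentation: `timed_generation` and `fake_stream` are hypothetical names, and the stream is simulated with `time.sleep` rather than a real LLM call.

```python
import time
from typing import Iterable, Iterator, Tuple

def timed_generation(tokens: Iterable[str]) -> Tuple[str, float, float]:
    """Consume a token stream; return (answer, ttft_seconds, total_seconds)."""
    start = time.perf_counter()  # stands in for query submission
    ttft = None
    parts = []
    for token in tokens:
        if ttft is None:
            # First token arrived: record time to first token.
            ttft = time.perf_counter() - start
        parts.append(token)
    total = time.perf_counter() - start
    if ttft is None:
        ttft = total  # empty stream: no token ever arrived
    return "".join(parts), ttft, total

def fake_stream() -> Iterator[str]:
    """Simulated generator: retrieval and prompt delay, then two tokens."""
    time.sleep(0.05)
    yield "Hello"
    time.sleep(0.02)
    yield ", world"

answer, ttft, total = timed_generation(fake_stream())
print(f"TTFT: {ttft * 1000:.0f} ms, total: {total * 1000:.0f} ms")
```

TTFT can never exceed total generation time, so a chart where the two lines converge suggests that most of the latency occurs before generation starts.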

TIP

A sudden increase in TTFT often points to infrastructure issues (database latency, LLM provider slowdowns), while a gradual increase in total generation time may indicate growing answer complexity.

Source Relevance Distribution

A histogram shows the distribution of relevance scores for retrieved document chunks. This helps you understand whether the vector search is returning high-quality matches or marginal content.
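The following sketch shows one way such a histogram could be built from raw scores. It assumes relevance scores are normalized to the range 0.0 to 1.0, which this page does not state. The bucket count and sample scores are illustrative only.

```python
def relevance_histogram(scores, bins=5):
    """Count scores in `bins` equal-width buckets over [0.0, 1.0]."""
    counts = [0] * bins
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        # A score of exactly 1.0 belongs in the last bucket.
        idx = min(int(s * bins), bins - 1)
        counts[idx] += 1
    return counts

# Hypothetical relevance scores for retrieved chunks.
scores = [0.91, 0.88, 0.42, 0.79, 0.15, 0.95, 0.63, 1.0]
hist = relevance_histogram(scores)

for i, count in enumerate(hist):
    low, high = i / len(hist), (i + 1) / len(hist)
    print(f"{low:.1f}-{high:.1f} | {'#' * count}")
```

A distribution concentrated in the upper buckets suggests strong matches, while a large share in the lower buckets suggests the search is returning marginal content.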

KB Effectiveness

The per-knowledge-base breakdown compares accepted and rejected suggestion counts for each KB:

Column	Description
Knowledge Base	Name of the knowledge base.
Queries	Number of RAG queries that retrieved content from this KB.
Accepted	Number of accepted suggestions sourced from this KB.
Rejected	Number of rejected suggestions sourced from this KB.
Acceptance Rate	Accepted / (Accepted + Rejected) as a percentage.
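The Acceptance Rate column can be reproduced from the other columns. The sketch below applies the formula from the table; the `KBStats` class and the sample numbers are hypothetical. It returns `None` when a KB has no accepted or rejected suggestions yet, which is an assumption about how an empty row might be handled.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class KBStats:
    name: str
    queries: int
    accepted: int
    rejected: int

    @property
    def acceptance_rate(self) -> Optional[float]:
        """Accepted / (Accepted + Rejected) as a percentage."""
        decided = self.accepted + self.rejected
        if decided == 0:
            return None  # no feedback yet, so the rate is undefined
        return 100.0 * self.accepted / decided

billing = KBStats("Billing FAQ", queries=120, accepted=45, rejected=15)
unused = KBStats("New KB", queries=3, accepted=0, rejected=0)

print(billing.name, billing.acceptance_rate)
print(unused.name, unused.acceptance_rate)
```

Note that the denominator is the number of suggestions with a decision, not the Queries column.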

Top and Underperforming KBs

The report highlights:

Top performing -- Knowledge bases with the highest acceptance rates and query volumes.
Underperforming -- Knowledge bases with acceptance rates below the tenant average.
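The two groups can be approximated as follows. This is a sketch under stated assumptions, not the report's actual logic: the tenant average is assumed to be the pooled acceptance rate across all KBs, and "top performing" is assumed to mean sorting by acceptance rate and then by query volume. The sample data is hypothetical.

```python
# Hypothetical sample data: (name, queries, accepted, rejected).
kbs = [
    ("Billing FAQ", 120, 45, 15),
    ("Returns Policy", 80, 12, 28),
    ("Product Manuals", 200, 90, 30),
    ("Legacy Wiki", 40, 5, 15),
]

def rate(accepted, rejected):
    """Acceptance rate as a percentage; 0.0 when there is no feedback."""
    decided = accepted + rejected
    return 100.0 * accepted / decided if decided else 0.0

# Assumed definition: pooled rate over every KB in the tenant.
tenant_average = rate(
    sum(a for _, _, a, _ in kbs),
    sum(r for _, _, _, r in kbs),
)

# Underperforming: acceptance rate below the tenant average.
underperforming = [name for name, _, a, r in kbs if rate(a, r) < tenant_average]

# Top performing (assumed ranking): highest rate first, ties broken by queries.
ranked = sorted(kbs, key=lambda kb: (rate(kb[2], kb[3]), kb[1]), reverse=True)
top_performing = [name for name, *_ in ranked[:2]]

print(f"Tenant average: {tenant_average:.1f}%")
print("Top performing:", top_performing)
print("Underperforming:", underperforming)
```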

WARNING

An underperforming knowledge base does not necessarily contain bad content. It may be matched against queries outside its intended scope. Review the Gap Analysis report to determine whether content updates or scope adjustments are needed.

Feedback-Based Reranking Impact

If feedback-based reranking is enabled, the RAG pipeline adjusts chunk rankings based on historical agent feedback. Over time, this should improve the acceptance rate as consistently unhelpful chunks are deprioritized. Compare acceptance rates before and after enabling reranking to measure the impact.
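A simple before-and-after comparison might look like the sketch below. The daily counts and the enablement date are made up for illustration, and `pooled_rate` is a hypothetical helper. A comparison like this does not control for other changes in the same period, such as content updates or shifts in query mix.

```python
from datetime import date

# Hypothetical daily counts: (day, accepted, rejected).
daily = [
    (date(2024, 5, 1), 30, 20),
    (date(2024, 5, 2), 28, 22),
    (date(2024, 5, 3), 36, 14),
    (date(2024, 5, 4), 39, 11),
]
enabled_on = date(2024, 5, 3)  # assumed date reranking was switched on

def pooled_rate(rows):
    """Acceptance rate (%) over all rows, or None if there is no feedback."""
    accepted = sum(a for _, a, _ in rows)
    rejected = sum(r for _, _, r in rows)
    decided = accepted + rejected
    return 100.0 * accepted / decided if decided else None

before = pooled_rate([row for row in daily if row[0] < enabled_on])
after = pooled_rate([row for row in daily if row[0] >= enabled_on])
change = after - before

print(f"Before: {before:.1f}%  After: {after:.1f}%  Change: {change:+.1f} points")
```

Using equal-length windows on either side of the enablement date keeps the two rates comparable.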
