Reranking & Feedback Scoring

Agent Assist uses a multi-stage ranking pipeline to ensure the most relevant knowledge base content surfaces for each query. Agent feedback (thumbs up/down on sources) directly influences future search results through a feedback scoring system.

Search Pipeline Overview

Customer utterance
  │
  ▼
Classifier (is this a meaningful query?)
  │                          ┌──────────────────────┐
  ▼                          │ Runs in PARALLEL      │
Vector Search (pgvector)     │ to save ~150ms        │
  │                          └──────────────────────┘
  ▼
Min Score Filter (drop low-relevance chunks)
  │
  ▼
Feedback Boost (re-sort by feedback-adjusted score)
  │
  ▼
Top K Results → Context → LLM → Answer
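
As a rough illustration of the parallel step, the sketch below starts the classifier and the vector search together, reading the diagram as meaning that these two stages run concurrently. The functions `classify`, `vector_search`, and `retrieve` are hypothetical stand-ins, not the product's actual API, and the sleeps only simulate latency.

```python
import asyncio

# Hypothetical stand-ins for the real classifier and pgvector query.
async def classify(utterance: str) -> bool:
    await asyncio.sleep(0.01)  # simulated model latency
    return len(utterance.split()) >= 3  # toy rule: very short utterances are not queries

async def vector_search(utterance: str) -> list[dict]:
    await asyncio.sleep(0.01)  # simulated database latency
    return [{"chunk_id": "c1", "score": 0.82}, {"chunk_id": "c2", "score": 0.41}]

async def retrieve(utterance: str) -> list[dict]:
    # Start both stages together; the total wait is roughly the slower of the two
    # rather than the sum of both.
    meaningful, candidates = await asyncio.gather(
        classify(utterance), vector_search(utterance)
    )
    # If the classifier rejects the utterance, the search results are thrown away.
    return candidates if meaningful else []
```

The trade-off in this design is that the vector search is sometimes wasted on utterances the classifier rejects, in exchange for lower latency on real queries.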

Stage 1: Vector Similarity Search

Vector similarity search is the foundation of every search. Each query is converted to a vector embedding and compared, using cosine similarity, against the embeddings of the tenant's chunks in the selected knowledge bases.

sql

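-- <=> is pgvector's cosine distance operator, so 1 - distance = cosine similarity
SELECT content, 1 - (embedding <=> :query_embedding) AS score
FROM chunks
WHERE tenant_id = :tenant_id
  AND knowledge_base_id = ANY(:kb_ids)
  AND embedding_status = 'completed'  -- also excludes suppressed chunks
ORDER BY embedding <=> :query_embedding  -- smallest distance first
LIMIT :top_k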

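Score range: typically 0.0 (unrelated) to 1.0 (identical). Cosine similarity can in principle be negative (down to -1.0), but such chunks fall below any positive min score threshold.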
Min score threshold: Configurable per tenant (default 0.5). Chunks below this score are discarded.
Embedding model: Vertex AI gemini-embedding-001 (3072 dimensions)
Suppressed chunks excluded: Chunks with embedding_status = 'suppressed' are filtered out at the SQL level and never appear in results.
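
The min score filter can be pictured as a simple post-processing step over the rows returned by the query. This is a minimal sketch, assuming rows arrive as dictionaries with a `score` key; the function name `apply_min_score` and the row shape are illustrative only.

```python
DEFAULT_MIN_SCORE = 0.5  # default from the text; tenants may override it

def apply_min_score(rows, tenant_min_score=None):
    """Keep rows whose similarity score meets the tenant's threshold."""
    threshold = DEFAULT_MIN_SCORE if tenant_min_score is None else tenant_min_score
    # Only chunks *below* the threshold are discarded, so a score equal to it is kept.
    return [row for row in rows if row["score"] >= threshold]

rows = [
    {"content": "Refund policy", "score": 0.82},
    {"content": "Office hours", "score": 0.47},
]
kept = apply_min_score(rows)                          # default threshold 0.5
strict = apply_min_score(rows, tenant_min_score=0.9)  # stricter tenant setting
```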

Stage 2: Feedback-Boosted Reranking

When enabled, agent feedback adjusts the ranking of search results. Chunks that agents consistently find helpful are boosted higher; chunks they reject sink lower.

The Formula

combined_score = vector_score × (1 + weight × feedback_score × confidence)

Where:

vector_score: Original cosine similarity (0.0 to 1.0)
weight: How much influence feedback has (default: 0.15 = 15%)
feedback_score: Running average of agent votes (-1.0 to +1.0)
confidence: How much to trust the feedback score, based on vote count

confidence = min(feedback_count, max_influence) / max_influence

The max_influence cap (default: 20 votes) sets how many votes a chunk needs before its feedback counts at full strength, so a chunk with only a handful of votes cannot swing the ranking much. After 20 votes, additional votes still update the running average but don't increase confidence further.
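
The two formulas translate directly into code. The sketch below uses the documented defaults; the function names are chosen for illustration and are not taken from the product's codebase.

```python
def confidence(feedback_count: int, max_influence: int = 20) -> float:
    """Fraction of full trust earned so far; saturates at 1.0 once the cap is reached."""
    return min(feedback_count, max_influence) / max_influence

def combined_score(
    vector_score: float,
    feedback_score: float,
    feedback_count: int,
    weight: float = 0.15,
    max_influence: int = 20,
) -> float:
    """Scale the vector score up or down by the confidence-weighted feedback."""
    boost = weight * feedback_score * confidence(feedback_count, max_influence)
    return vector_score * (1 + boost)
```

A chunk with no votes has a confidence of 0, so its combined score equals its vector score.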

Example

Chunk	Vector Score	Feedback Score	Vote Count	Confidence	Combined Score
A	0.82	+0.6	15	0.75	0.82 × (1 + 0.15 × 0.6 × 0.75) = 0.875
B	0.85	-0.3	8	0.40	0.85 × (1 + 0.15 × -0.3 × 0.40) = 0.835
C	0.80	+0.8	25	1.00	0.80 × (1 + 0.15 × 0.8 × 1.0) = 0.896

Result: Chunk C (lower vector score but strong positive feedback) ranks first. Chunk B (highest vector score but negative feedback) drops to last.
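
The table can be reproduced with a few lines of code. This sketch assumes a hypothetical `Candidate` record and `rerank` helper, and applies the formula with the default weight and max influence.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    chunk_id: str
    vector_score: float
    feedback_score: float
    feedback_count: int

def rerank(candidates, weight=0.15, max_influence=20):
    """Sort candidates by feedback-adjusted score, highest first."""
    def score(c):
        conf = min(c.feedback_count, max_influence) / max_influence
        return c.vector_score * (1 + weight * c.feedback_score * conf)
    return sorted(candidates, key=score, reverse=True)

example = [
    Candidate("A", 0.82, 0.6, 15),
    Candidate("B", 0.85, -0.3, 8),
    Candidate("C", 0.80, 0.8, 25),
]
order = [c.chunk_id for c in rerank(example)]                   # feedback applied
baseline = [c.chunk_id for c in rerank(example, weight=0.0)]    # pure vector order
```

With a weight of 0.0 the order falls back to the plain vector ranking (B, A, C), which is a convenient way to compare results with and without feedback.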

Configuration

Configure in Settings > Feedback Rerank in the Agent Assist Portal:

Setting	Default	Range	Description
Feedback Rerank Enabled	Off	on/off	Toggle feedback-adjusted scoring
Feedback Weight	0.15	0.0 – 1.0	How much feedback influences ranking. 0.0 = no influence, 1.0 = feedback dominates.
Max Influence	20	1 – 100	Vote count cap for confidence calculation. Higher = requires more votes to reach full confidence.

TIP

Start with the defaults (15% weight, 20 max influence). Only increase the weight if you have high feedback volume and trust your agents' judgment. Setting it too high can cause feedback bias — popular but generic content may outrank specific, relevant content.

Feedback Score Calculation

Every time an agent clicks thumbs up or thumbs down on a source citation, the chunk's feedback score is updated.

Running Average

new_score = (old_score × old_count + vote_value) / (old_count + 1)

Where vote_value = +1.0 for thumbs up, -1.0 for thumbs down.

The score is clamped to the range [-1.0, +1.0].
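
A minimal sketch of the update step, assuming the score and count are stored per chunk; `apply_vote` is an illustrative name.

```python
def apply_vote(old_score: float, old_count: int, thumbs_up: bool):
    """Return the (new_score, new_count) pair after one vote."""
    vote_value = 1.0 if thumbs_up else -1.0
    new_score = (old_score * old_count + vote_value) / (old_count + 1)
    # An average of +1/-1 votes already lies in [-1, 1]; the clamp is a safeguard
    # against rounding drift or bad stored data.
    new_score = max(-1.0, min(1.0, new_score))
    return new_score, old_count + 1
```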

Example Progression

Action	Score	Count
Initial state	0.0	0
Agent A: thumbs up	1.0	1
Agent B: thumbs down	0.0	2
Agent C: thumbs up	0.333	3
Agent D: thumbs up	0.5	4
Agent E: thumbs down	0.2	5

The running average naturally smooths out individual opinions. A chunk needs sustained negative feedback to drop significantly.
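
To get a feel for how much negative feedback "sustained" means, the sketch below counts how many consecutive thumbs-down votes it takes for the running average to fall to a given level. It uses exact fractions to avoid rounding effects at the boundary; the helper name `downvotes_until` is illustrative.

```python
from fractions import Fraction

def downvotes_until(score: str, count: int, threshold: str) -> int:
    """Count consecutive thumbs-down votes needed for the average to reach threshold."""
    current = Fraction(score)
    target = Fraction(threshold)
    if target <= -1:
        raise ValueError("threshold must be above -1.0 to be reachable in general")
    votes = 0
    while current > target:
        # Same running-average update as above, with vote_value = -1.
        current = (current * count - 1) / (count + 1)
        count += 1
        votes += 1
    return votes

# Starting from the end of the example progression (score 0.2 after 5 votes),
# 15 consecutive thumbs-down votes are needed to reach -0.7.
needed = downvotes_until("0.2", 5, "-0.7")
```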

Automatic Chunk Suppression

Chunks with consistently negative feedback are automatically suppressed — removed from search results entirely.

Suppression Rules

Condition	Threshold	Action
Suppress	feedback_score ≤ -0.7 AND feedback_count ≥ 5	Set `embedding_status = 'suppressed'`
Restore	feedback_score > -0.3 (after being suppressed)	Set `embedding_status = 'completed'`

Why two thresholds? The suppress threshold (-0.7) is much stricter than the restore threshold (-0.3) to prevent chunks from flipping between states. A chunk must be significantly rehabilitated by positive votes before it returns to search results.
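
The two-threshold rule is a form of hysteresis, and can be sketched as a small state transition function. The constants mirror the table above; the function name `next_status` is an assumption made for illustration.

```python
SUPPRESS_SCORE = -0.7     # suppress at or below this score...
SUPPRESS_MIN_VOTES = 5    # ...but only once enough votes exist
RESTORE_SCORE = -0.3      # restore only above this score

def next_status(status: str, feedback_score: float, feedback_count: int) -> str:
    """Return the chunk's embedding_status after its feedback has been updated."""
    if status == "completed":
        if feedback_score <= SUPPRESS_SCORE and feedback_count >= SUPPRESS_MIN_VOTES:
            return "suppressed"
    elif status == "suppressed":
        if feedback_score > RESTORE_SCORE:
            return "completed"
    return status  # scores between the thresholds leave the status unchanged
```

A score of -0.5, for example, keeps a suppressed chunk suppressed and a visible chunk visible, which is what stops chunks from flipping back and forth.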

What Suppression Means

The chunk is not deleted: its content, embedding, and metadata remain in the database
It is invisible to search: the search query only selects chunks with embedding_status = 'completed', so suppressed chunks never match
It can be restored if feedback improves (e.g., new agents find it useful)
It appears in the knowledge base document list (operations portal) with a "suppressed" indicator

When Suppression Helps

An outdated policy document that agents keep rejecting
A chunk that's technically accurate but never answers the actual question agents need
Duplicate or low-quality content that dilutes search results

The Complete Feedback Loop

1. Agent sees suggestion with source citations
   │
   ├─ Thumbs UP on source
   │  └─ Socket event: "rag-source-feedback" (vote: "up")
   │
   └─ Thumbs DOWN on source
      └─ Socket event: "rag-source-feedback" (vote: "down")
      └─ Optional: reason code (irrelevant, incorrect, too generic, misleading)
      └─ Optional: comment
         │
         ▼
2. Connector service processes feedback
   ├─ Insert conversation_event (source_accepted / source_rejected)
   │  └─ Stored with query, chunk_id, doc_id, vote, reason, comment
   │
   └─ Update chunk in database
      ├─ feedback_count += 1
      ├─ feedback_score = running_average(old, vote)
      ├─ Check suppression: score ≤ -0.7 AND count ≥ 5 → suppress
      └─ Check restoration: score > -0.3 AND was suppressed → restore
         │
         ▼
3. Next search for same query
   ├─ Vector search returns candidates
   ├─ Suppressed chunks excluded (SQL WHERE)
   ├─ Feedback boost applied (if enabled):
   │  combined = vector_score × (1 + weight × feedback_score × confidence)
   ├─ Results re-sorted by combined_score
   └─ Top K returned to agent
      │
      ▼
4. Agent sees improved results
   ├─ Previously helpful chunks ranked higher
   ├─ Previously unhelpful chunks ranked lower or suppressed
   └─ Cycle continues — each vote makes future results more relevant
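
Step 2 of the loop can be sketched as a single handler that records the event and updates the chunk. This is an in-memory illustration only: the `Chunk` and `FeedbackStore` classes and the event dictionary shape are assumptions, and a real service would write to the database instead.

```python
from dataclasses import dataclass, field

@dataclass
class Chunk:
    chunk_id: str
    feedback_score: float = 0.0
    feedback_count: int = 0
    embedding_status: str = "completed"

@dataclass
class FeedbackStore:
    chunks: dict = field(default_factory=dict)
    events: list = field(default_factory=list)

    def handle_vote(self, chunk_id, query, vote, reason=None, comment=None):
        if vote not in ("up", "down"):
            raise ValueError("vote must be 'up' or 'down'")
        chunk = self.chunks[chunk_id]
        # 2a. Record the analytics event.
        self.events.append({
            "type": "source_accepted" if vote == "up" else "source_rejected",
            "query": query, "chunk_id": chunk_id, "vote": vote,
            "reason": reason, "comment": comment,
        })
        # 2b. Update the running average and the vote count.
        value = 1.0 if vote == "up" else -1.0
        total = chunk.feedback_score * chunk.feedback_count + value
        chunk.feedback_count += 1
        chunk.feedback_score = max(-1.0, min(1.0, total / chunk.feedback_count))
        # 2c. Apply the suppression and restoration rules.
        if chunk.feedback_score <= -0.7 and chunk.feedback_count >= 5:
            chunk.embedding_status = "suppressed"
        elif chunk.embedding_status == "suppressed" and chunk.feedback_score > -0.3:
            chunk.embedding_status = "completed"
        return chunk
```

In this sketch, five thumbs-down votes on a fresh chunk suppress it, and three subsequent thumbs-up votes bring its score to -0.25 and restore it.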

Analytics & Monitoring

Feedback data is tracked in two places:

1. Conversation Events (Analytics)

Every vote is recorded as a source_accepted or source_rejected event with full metadata. This powers:

Feedback Analytics view: Acceptance rates, rejection reasons, per-agent breakdown
Gap Analysis: Queries with consistently rejected sources are flagged as knowledge gaps
Source Performance: Which documents/chunks are most/least helpful
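
As an illustration of how such views could be derived from the event stream, the sketch below computes an acceptance rate and a tally of rejection reasons. The event dictionaries are a hypothetical shape based on the fields listed earlier, not the actual storage schema.

```python
from collections import Counter

def summarize(events):
    """Compute the acceptance rate and rejection-reason counts from feedback events."""
    accepted = sum(1 for e in events if e["type"] == "source_accepted")
    rejected = [e for e in events if e["type"] == "source_rejected"]
    total = accepted + len(rejected)
    return {
        # None signals "no data yet" rather than a misleading 0% rate.
        "acceptance_rate": accepted / total if total else None,
        "rejection_reasons": Counter(e.get("reason") or "unspecified" for e in rejected),
    }

sample_events = [
    {"type": "source_accepted", "chunk_id": "c1"},
    {"type": "source_accepted", "chunk_id": "c2"},
    {"type": "source_rejected", "chunk_id": "c3", "reason": "too generic"},
    {"type": "source_rejected", "chunk_id": "c3"},
]
summary = summarize(sample_events)
```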

2. Chunk Model (Scoring)

The feedback_score and feedback_count on each chunk enable:

Feedback-boosted reranking: Real-time score adjustment in search
Auto-suppression: Removing consistently unhelpful content
KB health monitoring: Identify chunks that need updating

Best Practices

Enable Feedback Reranking After 2 Weeks

Don't enable feedback reranking on day one — you need enough feedback data for it to be meaningful. Wait until you have at least 100 feedback events across your knowledge base, then enable with the default settings.

Monitor Suppressed Chunks

Regularly check the operations portal for suppressed chunks. They may indicate content that needs updating rather than removing. If a chunk about a valid topic keeps getting rejected, the content may be accurate but poorly written — rewrite it instead of leaving it suppressed.

Use Reason Codes

Encourage agents to select a reason when rejecting sources (irrelevant, incorrect, too generic, misleading). This data appears in Gap Analysis and helps you understand WHY content is failing, not just that it failed.

Don't Set Weight Too High

A feedback weight above 0.3 can cause popular-but-generic content to outrank specific, relevant content. The feedback signal is noisy — agents sometimes reject good content because they already knew the answer, not because the content was wrong.
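
The effect of the weight is easy to check numerically, because the multiplier in the formula always lies between 1 - weight and 1 + weight. The sketch below compares a hypothetical generic chunk (vector score 0.70, maximum positive feedback at full confidence) against a specific chunk with no feedback (vector score 0.90); both scores are made up for illustration.

```python
def max_boosted_score(vector_score: float, weight: float) -> float:
    """Best case for a chunk: feedback_score = +1.0 at full confidence."""
    return vector_score * (1 + weight)

generic, specific = 0.70, 0.90  # hypothetical vector scores

# At the default weight the generic chunk tops out at 0.805 and stays below 0.90.
outranks_at_default = max_boosted_score(generic, 0.15) > specific
# At a weight of 0.3 it can reach 0.91 and overtake the specific chunk.
outranks_at_high = max_boosted_score(generic, 0.30) > specific
```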
