Reranking & Feedback Scoring

Agent Assist uses a multi-stage ranking pipeline to ensure the most relevant knowledge base content surfaces for each query. Agent feedback (thumbs up/down on sources) directly influences future search results through a feedback scoring system.

Search Pipeline Overview

Customer utterance
  │
  ▼
Classifier (is this a meaningful query?)
  │                          ┌──────────────────────┐
  ▼                          │ Runs in PARALLEL     │
Vector Search (pgvector)     │ to save ~150ms       │
  │                          └──────────────────────┘
  ▼
Min Score Filter (drop low-relevance chunks)
  │
  ▼
Feedback Boost (re-sort by feedback-adjusted score)
  │
  ▼
Top K Results → Context → LLM → Answer
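The parallel step in the diagram can be sketched roughly as follows. This is a minimal illustration, not the product's implementation: `classify`, `vector_search`, and `retrieve` are hypothetical names, the stubs stand in for real model and database calls, and discarding search results when the classifier rejects the utterance is an assumption.

```python
import asyncio


async def classify(utterance: str) -> bool:
    """Hypothetical classifier stub: 3+ words counts as a meaningful query."""
    await asyncio.sleep(0.01)  # stands in for a model call
    return len(utterance.split()) >= 3


async def vector_search(utterance: str) -> list[dict]:
    """Hypothetical search stub returning chunks with a cosine score."""
    await asyncio.sleep(0.01)  # stands in for the pgvector query
    return [
        {"chunk_id": "a", "score": 0.82},
        {"chunk_id": "b", "score": 0.41},
    ]


async def retrieve(utterance: str, min_score: float = 0.5, top_k: int = 5) -> list[dict]:
    # Start both calls at once instead of waiting for the classifier first.
    is_query, candidates = await asyncio.gather(
        classify(utterance), vector_search(utterance)
    )
    if not is_query:
        return []  # assumption: results are dropped for non-queries
    # Min score filter, then keep the top K.
    kept = [c for c in candidates if c["score"] >= min_score]
    return kept[:top_k]


results = asyncio.run(retrieve("how do I reset my password"))
```

Running the two calls concurrently means the total latency is roughly that of the slower call rather than the sum of both, which is where a saving like the ~150ms in the diagram would come from.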

Stage 1: Vector Search

Vector search is the foundation of every search. Each query is converted to a vector embedding and compared against the chunk embeddings in the tenant's selected knowledge bases using cosine similarity.

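```sql
-- :query_embedding is the embedded query. <=> is pgvector's cosine distance
-- operator, so 1 - distance gives the cosine similarity score.
SELECT content, 1 - (embedding <=> :query_embedding) AS score
FROM chunks
WHERE tenant_id = :tenant_id
  AND knowledge_base_id = ANY(:kb_ids)
  AND embedding_status = 'completed'  -- excludes suppressed chunks
ORDER BY embedding <=> :query_embedding
LIMIT :top_k
```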
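  • Score range: in practice 0.0 (no match) to 1.0 (identical). Cosine similarity can technically be negative, but such chunks fall below the min score threshold anyway.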
  • Min score threshold: Configurable per tenant (default 0.5). Chunks below this score are discarded.
  • Embedding model: Vertex AI gemini-embedding-001 (3072 dimensions)
  • Suppressed chunks excluded: Chunks with embedding_status = 'suppressed' are filtered out at the SQL level and never appear in results.
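To make the scoring and filtering concrete, here is a small in-memory sketch of what the query and the min score threshold do together. It assumes toy 2-dimensional vectors instead of real 3072-dimensional embeddings, and `cosine_similarity` and `search` are illustrative names rather than part of the product.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity, the same quantity as 1 - (a <=> b) in pgvector."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def search(query_vec, chunks, min_score=0.5, top_k=3):
    scored = [
        {"id": c["id"], "score": cosine_similarity(query_vec, c["embedding"])}
        for c in chunks
        if c["embedding_status"] == "completed"  # suppressed chunks never score
    ]
    scored = [s for s in scored if s["score"] >= min_score]  # min score filter
    scored.sort(key=lambda s: s["score"], reverse=True)
    return scored[:top_k]


chunks = [
    {"id": "x", "embedding": [1.0, 0.0], "embedding_status": "completed"},
    {"id": "y", "embedding": [1.0, 1.0], "embedding_status": "completed"},
    {"id": "z", "embedding": [0.0, 1.0], "embedding_status": "completed"},
    {"id": "s", "embedding": [1.0, 0.0], "embedding_status": "suppressed"},
]
hits = search([1.0, 0.0], chunks)
```

Here `z` is orthogonal to the query (score 0.0) and is dropped by the threshold, while `s` would score 1.0 but is excluded because of its status.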

Stage 2: Feedback-Boosted Reranking

When enabled, agent feedback adjusts the ranking of search results. Chunks that agents consistently find helpful are boosted higher; chunks they reject sink lower.

The Formula

combined_score = vector_score × (1 + weight × feedback_score × confidence)

Where:

  • vector_score: Original cosine similarity (0.0 to 1.0)
  • weight: How much influence feedback has (default: 0.15, so feedback can shift a chunk's score by at most ±15%)
  • feedback_score: Running average of agent votes (-1.0 to +1.0)
  • confidence: How much to trust the feedback score, based on vote count:

confidence = min(feedback_count, max_influence) / max_influence

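The max_influence setting (default: 20 votes) is the vote count at which confidence reaches 1.0. Below that, the adjustment scales linearly with the number of votes, so a chunk with only a handful of votes cannot swing the ranking. After 20 votes, additional votes still update the running average but don't increase confidence further.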

Example

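| Chunk | Vector Score | Feedback Score | Vote Count | Confidence | Combined Score |
|-------|--------------|----------------|------------|------------|----------------|
| A | 0.82 | +0.6 | 15 | 0.75 | 0.82 × (1 + 0.15 × 0.6 × 0.75) = 0.875 |
| B | 0.85 | -0.3 | 8 | 0.40 | 0.85 × (1 + 0.15 × -0.3 × 0.40) = 0.835 |
| C | 0.80 | +0.8 | 25 | 1.00 | 0.80 × (1 + 0.15 × 0.8 × 1.0) = 0.896 |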

Result: Chunk C (lowest vector score but strong, full-confidence positive feedback) ranks first. Chunk B (highest vector score but negative feedback) drops to last.
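The worked example can be reproduced with a few lines of Python. This is a sketch of the formula as stated in this section, using the default weight and max influence; the function and field names are illustrative.

```python
def confidence(feedback_count: int, max_influence: int = 20) -> float:
    """Fraction of full trust, capped at 1.0 once max_influence votes exist."""
    return min(feedback_count, max_influence) / max_influence


def combined_score(vector_score, feedback_score, feedback_count,
                   weight=0.15, max_influence=20):
    """vector_score x (1 + weight x feedback_score x confidence)."""
    boost = weight * feedback_score * confidence(feedback_count, max_influence)
    return vector_score * (1 + boost)


chunks = [
    {"id": "A", "vector_score": 0.82, "feedback_score": 0.6, "feedback_count": 15},
    {"id": "B", "vector_score": 0.85, "feedback_score": -0.3, "feedback_count": 8},
    {"id": "C", "vector_score": 0.80, "feedback_score": 0.8, "feedback_count": 25},
]

# Attach the combined score to each chunk, then re-sort as Stage 2 does.
for c in chunks:
    c["combined"] = combined_score(
        c["vector_score"], c["feedback_score"], c["feedback_count"]
    )
ranked = sorted(chunks, key=lambda c: c["combined"], reverse=True)
order = [c["id"] for c in ranked]
```

Because the adjustment is multiplicative, a chunk with no votes (confidence 0) keeps exactly its vector score.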

Configuration

Configure in Settings > Feedback Rerank in the Agent Assist Portal:

| Setting | Default | Range | Description |
|---------|---------|-------|-------------|
| Feedback Rerank Enabled | Off | on/off | Toggle feedback-adjusted scoring |
| Feedback Weight | 0.15 | 0.0 – 1.0 | How much feedback influences ranking. 0.0 = no influence, 1.0 = feedback dominates. |
| Max Influence | 20 | 1 – 100 | Vote count cap for confidence calculation. Higher = requires more votes to reach full confidence. |

TIP

Start with the defaults (15% weight, 20 max influence). Only increase the weight if you have high feedback volume and trust your agents' judgment. Setting it too high can cause feedback bias — popular but generic content may outrank specific, relevant content.
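If you mirror these settings in your own tooling (for example, a script that audits tenant configuration), the ranges in the table could be checked like this. The class and field names are hypothetical; only the defaults and ranges come from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeedbackRerankSettings:
    """Hypothetical container mirroring the three portal settings."""
    enabled: bool = False
    weight: float = 0.15
    max_influence: int = 20

    def __post_init__(self):
        if not 0.0 <= self.weight <= 1.0:
            raise ValueError("weight must be between 0.0 and 1.0")
        if not 1 <= self.max_influence <= 100:
            raise ValueError("max_influence must be between 1 and 100")

    def max_adjustment(self) -> float:
        """Largest relative change feedback can make to a vector score."""
        return self.weight if self.enabled else 0.0


defaults = FeedbackRerankSettings()
tuned = FeedbackRerankSettings(enabled=True, weight=0.2, max_influence=30)
```

With the defaults, `max_adjustment()` is 0.0 because reranking is off; with `tuned`, feedback can move a score by up to 20% in either direction.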

Feedback Score Calculation

Every time an agent clicks thumbs up or thumbs down on a source citation, the chunk's feedback score is updated.

Running Average

new_score = (old_score × old_count + vote_value) / (old_count + 1)

Where vote_value = +1.0 for thumbs up, -1.0 for thumbs down.

The score is clamped to the range [-1.0, +1.0].

Example Progression

| Action | Score | Count |
|--------|-------|-------|
| Initial state | 0.0 | 0 |
| Agent A: thumbs up | 1.0 | 1 |
| Agent B: thumbs down | 0.0 | 2 |
| Agent C: thumbs up | 0.333 | 3 |
| Agent D: thumbs up | 0.5 | 4 |
| Agent E: thumbs down | 0.2 | 5 |

The running average naturally smooths out individual opinions. A chunk needs sustained negative feedback to drop significantly.
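The progression above follows directly from the running-average formula. A minimal sketch, with `update_feedback` as an illustrative name:

```python
def update_feedback(score: float, count: int, vote_value: float) -> tuple[float, int]:
    """Fold one vote (+1.0 or -1.0) into the running average."""
    new_score = (score * count + vote_value) / (count + 1)
    new_score = max(-1.0, min(1.0, new_score))  # clamp to [-1.0, +1.0]
    return new_score, count + 1


score, count = 0.0, 0
history = []
# Agents A to E: up, down, up, up, down
for vote in (+1.0, -1.0, +1.0, +1.0, -1.0):
    score, count = update_feedback(score, count, vote)
    history.append((round(score, 3), count))
```

Since every vote is +1.0 or -1.0, the average can never leave the [-1.0, +1.0] range on its own; the clamp is only a safeguard against rounding drift or bad input.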

Automatic Chunk Suppression

Chunks with consistently negative feedback are automatically suppressed — removed from search results entirely.

Suppression Rules

| Condition | Threshold | Action |
|-----------|-----------|--------|
| Suppress | feedback_score ≤ -0.7 AND feedback_count ≥ 5 | Set embedding_status = 'suppressed' |
| Restore | feedback_score > -0.3 (after being suppressed) | Set embedding_status = 'completed' |

Why two thresholds? The suppress threshold (-0.7) is much stricter than the restore threshold (-0.3) to prevent chunks from flipping between states. A chunk must be significantly rehabilitated by positive votes before it returns to search results.
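The two-threshold behavior (hysteresis) can be written as a small state transition. This sketch uses the thresholds from the table; the function and constant names are illustrative.

```python
SUPPRESS_SCORE = -0.7   # suppress at or below this score...
MIN_VOTES = 5           # ...once at least this many votes exist
RESTORE_SCORE = -0.3    # restore only when the score rises above this


def next_status(status: str, feedback_score: float, feedback_count: int) -> str:
    """Return the embedding_status a chunk should have after a vote."""
    if (status == "completed"
            and feedback_score <= SUPPRESS_SCORE
            and feedback_count >= MIN_VOTES):
        return "suppressed"
    if status == "suppressed" and feedback_score > RESTORE_SCORE:
        return "completed"
    return status  # anything in between keeps its current state
```

A score of -0.5 shows why the gap matters: a `completed` chunk at -0.5 stays visible, and a `suppressed` chunk at -0.5 stays hidden, so a single vote near either threshold cannot flip the state back and forth.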

What Suppression Means

  • The chunk is not deleted — its content, embedding, and metadata remain in the database
  • It is invisible to search — the SQL WHERE clause filters out embedding_status = 'suppressed'
  • It can be restored if feedback improves (e.g., new agents find it useful)
  • It appears in the knowledge base document list (operations portal) with a "suppressed" indicator

When Suppression Helps

  • An outdated policy document that agents keep rejecting
  • A chunk that's technically accurate but never answers the actual question agents need
  • Duplicate or low-quality content that dilutes search results

The Complete Feedback Loop

1. Agent sees suggestion with source citations

   ├─ Thumbs UP on source
   │  └─ Socket event: "rag-source-feedback" (vote: "up")

   └─ Thumbs DOWN on source
      ├─ Socket event: "rag-source-feedback" (vote: "down")
      ├─ Optional: reason code (irrelevant, incorrect, too generic, misleading)
      └─ Optional: comment


2. Connector service processes feedback
   ├─ Insert conversation_event (source_accepted / source_rejected)
   │  └─ Stored with query, chunk_id, doc_id, vote, reason, comment

   └─ Update chunk in database
      ├─ feedback_count += 1
      ├─ feedback_score = running_average(old, vote)
      ├─ Check suppression: score ≤ -0.7 AND count ≥ 5 → suppress
      └─ Check restoration: score > -0.3 AND was suppressed → restore


3. Next search for same query
   ├─ Vector search returns candidates
   ├─ Suppressed chunks excluded (SQL WHERE)
   ├─ Feedback boost applied (if enabled):
   │  combined = vector_score × (1 + weight × feedback_score × confidence)
   ├─ Results re-sorted by combined_score
   └─ Top K returned to agent


4. Agent sees improved results
   ├─ Previously helpful chunks ranked higher
   ├─ Previously unhelpful chunks ranked lower or suppressed
   └─ Cycle continues — each vote makes future results more relevant
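Step 2 of the loop can be sketched as a single handler that records the event and updates the chunk. This is an in-memory illustration under stated assumptions: the `Chunk` class, the payload shape, and `process_feedback` are hypothetical, and a real connector service would persist both the event and the chunk update in the database.

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str
    feedback_score: float = 0.0
    feedback_count: int = 0
    embedding_status: str = "completed"


def process_feedback(chunk: Chunk, payload: dict) -> dict:
    """Handle one "rag-source-feedback" vote (payload shape is assumed)."""
    vote_value = 1.0 if payload["vote"] == "up" else -1.0
    event = {
        "type": "source_accepted" if vote_value > 0 else "source_rejected",
        "query": payload["query"],
        "chunk_id": chunk.chunk_id,
        "doc_id": payload["doc_id"],
        "vote": payload["vote"],
        "reason": payload.get("reason"),    # optional on thumbs down
        "comment": payload.get("comment"),  # optional on thumbs down
    }
    # Running average, then the suppression / restoration checks.
    total = chunk.feedback_score * chunk.feedback_count + vote_value
    chunk.feedback_count += 1
    chunk.feedback_score = max(-1.0, min(1.0, total / chunk.feedback_count))
    if chunk.embedding_status == "suppressed":
        if chunk.feedback_score > -0.3:
            chunk.embedding_status = "completed"
    elif (chunk.embedding_status == "completed"
            and chunk.feedback_score <= -0.7
            and chunk.feedback_count >= 5):
        chunk.embedding_status = "suppressed"
    return event


chunk = Chunk(chunk_id="c1")
down = {"vote": "down", "query": "refund policy", "doc_id": "d1", "reason": "incorrect"}
up = {"vote": "up", "query": "refund policy", "doc_id": "d1"}

events = [process_feedback(chunk, down) for _ in range(5)]
status_after_downs = chunk.embedding_status   # five straight rejections
for _ in range(3):
    events.append(process_feedback(chunk, up))
status_after_ups = chunk.embedding_status     # score recovers to -0.25
```

In this run the chunk is suppressed on the fifth rejection and needs three thumbs-up votes before its score climbs above -0.3 and it returns to search.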

Analytics & Monitoring

Feedback data is tracked in two places:

1. Conversation Events (Analytics)

Every vote is recorded as a source_accepted or source_rejected event with full metadata. This powers:

  • Feedback Analytics view: Acceptance rates, rejection reasons, per-agent breakdown
  • Gap Analysis: Queries with consistently rejected sources are flagged as knowledge gaps
  • Source Performance: Which documents/chunks are most/least helpful
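As a rough illustration of how these views could be derived from the event stream, the sketch below aggregates a list of events into an acceptance rate, rejection-reason counts, and a simple knowledge-gap list. The event dictionaries, function names, and the `min_events` cutoff are assumptions for the example, not the product's actual analytics logic.

```python
from collections import Counter, defaultdict


def summarize(events: list[dict]) -> dict:
    """Acceptance rate and rejection-reason counts across all events."""
    accepted = sum(1 for e in events if e["type"] == "source_accepted")
    rejected = [e for e in events if e["type"] == "source_rejected"]
    total = accepted + len(rejected)
    reasons = Counter(e["reason"] for e in rejected if e.get("reason"))
    return {
        "acceptance_rate": accepted / total if total else None,
        "rejection_reasons": dict(reasons),
    }


def knowledge_gaps(events: list[dict], min_events: int = 3) -> list[str]:
    """Queries whose sources were rejected every time (hypothetical rule)."""
    by_query = defaultdict(list)
    for e in events:
        by_query[e["query"]].append(e["type"])
    return sorted(
        q for q, types in by_query.items()
        if len(types) >= min_events and all(t == "source_rejected" for t in types)
    )


events = [
    {"type": "source_accepted", "query": "reset password", "reason": None},
    {"type": "source_rejected", "query": "refund policy", "reason": "irrelevant"},
    {"type": "source_rejected", "query": "refund policy", "reason": "too generic"},
    {"type": "source_rejected", "query": "refund policy", "reason": "irrelevant"},
]
summary = summarize(events)
gaps = knowledge_gaps(events)
```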

2. Chunk Model (Scoring)

The feedback_score and feedback_count on each chunk enable:

  • Feedback-boosted reranking: Real-time score adjustment in search
  • Auto-suppression: Removing consistently unhelpful content
  • KB health monitoring: Identify chunks that need updating

Best Practices

Enable Feedback Reranking After 2 Weeks

Don't enable feedback reranking on day one, because the scores are not meaningful until enough votes have accumulated. Give agents about two weeks, and wait until you have at least 100 feedback events across your knowledge base, then enable it with the default settings.

Monitor Suppressed Chunks

Regularly check the operations portal for suppressed chunks. They may indicate content that needs updating rather than removing. If a chunk about a valid topic keeps getting rejected, the content may be accurate but poorly written — rewrite it instead of leaving it suppressed.

Use Reason Codes

Encourage agents to select a reason when rejecting sources (irrelevant, incorrect, too generic, misleading). This data appears in Gap Analysis and helps you understand WHY content is failing, not just that it failed.

Don't Set Weight Too High

A feedback weight above 0.3 can cause popular-but-generic content to outrank specific, relevant content. The feedback signal is noisy — agents sometimes reject good content because they already knew the answer, not because the content was wrong.
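The effect of a high weight is easy to see with two made-up chunks: a specific one that matches the query well but has no votes yet, and a generic one with a weaker match but many positive votes. The numbers below are invented for illustration; the formula is the one from the reranking section.

```python
def combined_score(vector_score, feedback_score, feedback_count,
                   weight, max_influence=20):
    confidence = min(feedback_count, max_influence) / max_influence
    return vector_score * (1 + weight * feedback_score * confidence)


# Hypothetical chunks: a close match with no votes vs. a popular generic page.
specific = {"id": "specific", "vector_score": 0.85,
            "feedback_score": 0.0, "feedback_count": 0}
generic = {"id": "generic", "vector_score": 0.68,
           "feedback_score": 0.8, "feedback_count": 40}


def top_result(weight: float) -> str:
    best = max(
        (specific, generic),
        key=lambda c: combined_score(
            c["vector_score"], c["feedback_score"], c["feedback_count"], weight
        ),
    )
    return best["id"]


winners = {w: top_result(w) for w in (0.15, 0.3, 0.5)}
```

At weights of 0.15 and 0.3 the specific chunk still ranks first (0.85 versus about 0.76 and 0.84). At 0.5 the generic chunk's score rises to about 0.95 and it takes the top spot despite the weaker match.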

OmniBots Agent Assist