
Architecture

Agent Assist is composed of several services that work together to deliver real-time AI suggestions, coaching, summaries, and analysis to contact center agents. This page explains what each component does, how they communicate, and how the two operating modes differ.

System Overview

Figure: High-level architecture of Agent Assist, showing the CCaaS platform, connector service, RAG service, widget, portal, and database.
┌─────────────────┐     Pub/Sub      ┌──────────────────────────┐       HTTP        ┌──────────────────┐
│  CCaaS Platform  │ ──────────────> │  assist-connector-service │ <──────────────── │  agent-assist-    │
│  (Genesys, Five9,│                  │  (Socket.IO hub)          │                   │  portal (admin)   │
│   8x8, SF, etc.) │                  └────────────┬─────────────┘                   └──────────────────┘
└─────────────────┘                                │
                                         Socket.IO │
                                                   │
                        ┌──────────────────────────┼──────────────────────┐
                        │                          │                      │
                        ▼                          ▼                      ▼
               ┌─────────────────┐     ┌───────────────────┐   ┌──────────────────┐
               │   rag-service    │     │ agent-assist-widget│   │  AlloyDB + Redis  │
               │ (classifier +    │     │ (Vue 3 frontend)   │   │  (persistence +   │
               │  search + LLM)   │     └───────────────────┘   │   session/cache)   │
               └─────────────────┘                              └──────────────────┘

               ┌─────────────────┐     ┌───────────────────┐             │
               │ assist-audiohook │     │ assist-middleware  │ ───────────┘
               │ (voice STT)      │     │ (CCaaS gateway)    │
               └─────────────────┘     └───────────────────┘

Services

| Service | Role | Technology |
| --- | --- | --- |
| assist-connector-service | Central hub. Receives CCaaS events via Pub/Sub, runs the RAG pipeline in omni mode, manages sessions, orchestrates coaching, and pushes suggestions to the widget over Socket.IO. | FastAPI + python-socketio |
| rag-service | Handles utterance classification, document embedding, vector search (pgvector), and LLM answer streaming. Called by the connector service for each qualifying message. | FastAPI + pgvector |
| assist-audiohook-service | Streams real-time audio from voice calls, transcribes speech via STT, and forwards transcripts to the connector service via Pub/Sub for processing. | FastAPI + WebSocket |
| assist-middleware-service | Genesys Cloud integration gateway. Manages CCaaS connections, queue-to-KB mappings, and conversation lifecycle bridging. | FastAPI |
| agent-assist-widget | Vue 3 frontend embedded in the agent desktop (Genesys, Salesforce, 8x8, Five9, Google CCaaS, or standalone). Connects to the connector service over Socket.IO. Renders suggestions, transcript, summary, analysis, and coaching panels. | Vue 3 + Vite |
| agent-assist-portal | Vue 3 admin dashboard for configuration, analytics, coaching playbook management, knowledge base administration, and service health monitoring. | Vue 3 + PrimeVue + Vite |
| Dialogflow CCAI | Google-managed suggestion engine used in google_native mode. Publishes suggestion events to Pub/Sub, which the connector service relays to the widget. | Google Cloud |

Operating Modes

Agent Assist supports two modes, configured per tenant via the assist_mode setting.

Omni Mode (omni)

In omni mode, the connector service orchestrates the full RAG pipeline itself, calling rag-service for classification, search, and answer generation:

  1. Classify -- A lightweight LLM classifier determines whether a knowledge base lookup is warranted, extracts a clean query, and optionally generates quick replies for noise messages.
  2. Search -- The RAG service performs a vector similarity search against the tenant's knowledge bases, returning the top-k relevant document chunks with feedback-based reranking.
  3. Stream -- An LLM generates a suggestion grounded in the retrieved chunks. The answer streams chunk-by-chunk over Socket.IO so the agent sees it progressively.
  4. Deliver -- Source citations, follow-up questions, and quick replies are delivered alongside the answer.
  5. Coach -- If coaching is enabled, the coaching engine evaluates the utterance against active playbooks or generates AI-based guidance.
Customer message
  → Pub/Sub event (or audiohook transcript)
    → assist-connector-service
      → Check classifier cache (Redis)
        → rag-service /retrieve (classify + vector search)
          → rag-service /stream (LLM answer via SSE)
            → Socket.IO → widget (streaming chunks)
      → Coaching engine (async, parallel)
        → Socket.IO → widget (coaching suggestion)
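
The flow above can be approximated with a short asyncio sketch. This is an illustration only, not the connector service's actual code: `classify`, `search`, `stream_answer`, `handle_customer_message`, and the `emit` callback are hypothetical names, and the stubs stand in for the real rag-service calls and the Socket.IO emit. The event names are the ones listed in the Event Flow section.

```python
import asyncio
from typing import AsyncIterator, Awaitable, Callable

# Hypothetical stand-in for a Socket.IO emit: (event_name, payload) -> None
Emit = Callable[[str, dict], Awaitable[None]]


async def classify(text: str) -> dict:
    """Stub classifier. Assumption: very short messages count as noise."""
    is_noise = len(text.split()) < 3
    return {
        "needs_lookup": not is_noise,
        "query": text.strip(),
        "quick_replies": ["Thanks!"] if is_noise else [],
    }


async def search(query: str, top_k: int = 3) -> list[dict]:
    """Stub vector search that returns placeholder chunks."""
    return [{"doc_id": f"doc-{i}", "text": f"chunk {i} for {query}"} for i in range(top_k)]


async def stream_answer(query: str, chunks: list[dict]) -> AsyncIterator[str]:
    """Stub LLM stream that yields the answer piece by piece."""
    for word in ["Here", "is", "an", "answer."]:
        yield word + " "


async def handle_customer_message(text: str, emit: Emit) -> None:
    result = await classify(text)
    if not result["needs_lookup"]:
        # Noise: send quick replies and stop.
        await emit("rag-quickreplies-event", {"quick_replies": result["quick_replies"]})
        return

    chunks = await search(result["query"])
    await emit("rag-content-event", {"sources": [c["doc_id"] for c in chunks]})

    answer = ""
    async for piece in stream_answer(result["query"], chunks):
        answer += piece
        await emit("rag-content-event", {"chunk": piece})

    await emit("rag-complete-event", {"answer": answer.strip()})
```

In a real deployment the coaching engine would run alongside this coroutine (for example via `asyncio.gather`), which the sketch leaves out for brevity.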

When to use omni mode

Use omni mode when you want full control over the knowledge sources, embedding models, and LLM used for answer synthesis. This is the default and recommended mode for most deployments.

Google Native Mode (google_native)

In google_native mode, Dialogflow CCAI handles suggestion generation. The connector service acts as a pass-through:

  1. Dialogflow processes the conversation transcript and generates suggestions using its own knowledge connectors.
  2. Pub/Sub delivers suggestion events to the connector service.
  3. Connector relays the suggestions to the widget over Socket.IO without modification.
Customer message
  → Dialogflow CCAI (processes + generates suggestion)
    → Pub/Sub event
      → assist-connector-service (relay)
        → Socket.IO → widget
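
The difference between the two modes comes down to one branch on `assist_mode`. The sketch below shows that branch under stated assumptions: `route_event` and the `run_rag` callback are hypothetical names, and only the mode values `omni` and `google_native` come from this page.

```python
from typing import Callable


def route_event(assist_mode: str, event: dict, run_rag: Callable[[dict], dict]) -> dict:
    """Return the payload to push to the widget for one Pub/Sub event."""
    if assist_mode == "google_native":
        # Pass-through: relay the Dialogflow CCAI suggestion without modification.
        return event
    if assist_mode == "omni":
        # The connector drives classification, search, and generation itself.
        return run_rag(event)
    raise ValueError(f"unknown assist_mode: {assist_mode!r}")
```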

Limitations of google_native mode

In google_native mode, you cannot use OmniBots knowledge bases or customize the RAG pipeline. All suggestion logic is managed by Google Dialogflow CCAI. Feedback analytics are still collected but do not influence suggestion quality directly.

Event Flow

All communication between the CCaaS platform and the agent widget passes through the connector service. Here is the detailed event flow for a typical interaction.

1. Connection & Session

When an agent opens the widget:

  1. The widget authenticates via embed token + provider OAuth (e.g., Genesys PKCE, Salesforce token exchange)
  2. The connector service validates credentials and issues a session JWT
  3. The widget connects via Socket.IO and emits join-conversation with the conversation name
  4. The connector service resolves session config (KB IDs, system prompt, LLM integration) from queue mappings or deployment key
  5. The widget receives conversation-joined and session-updated events
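
Step 4 can be pictured as a lookup with a fallback. The sketch below is a minimal illustration: the `SessionConfig` fields mirror the items named in that step, but the class itself, `resolve_session_config`, and the precedence (queue mapping first, then deployment key) are assumptions rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SessionConfig:
    kb_ids: tuple[str, ...]
    system_prompt: str
    llm_integration: str


def resolve_session_config(
    queue_id: Optional[str],
    deployment_key: Optional[str],
    queue_mappings: dict[str, SessionConfig],
    deployment_configs: dict[str, SessionConfig],
) -> SessionConfig:
    """Resolve config for a conversation (assumed order: queue, then deployment key)."""
    if queue_id is not None and queue_id in queue_mappings:
        return queue_mappings[queue_id]
    if deployment_key is not None and deployment_key in deployment_configs:
        return deployment_configs[deployment_key]
    raise LookupError("no session config for this queue or deployment key")
```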

2. Customer Message (Omni Mode)

When the customer sends a message or the audiohook service transcribes speech:

  1. A NEW_RECOGNITION_RESULT or NEW_MESSAGE event arrives via Pub/Sub
  2. The connector service checks the classifier cache (Redis, 10-min TTL)
  3. On cache miss: calls rag-service /retrieve, which runs classification + vector search in parallel
  4. If classified as noise: emits rag-quickreplies-event with quick replies and returns
  5. If meaningful: emits rag-content-event with sources, then streams answer chunks as rag-content-event messages
  6. On stream complete: emits rag-complete-event with the final answer, sources, and timing metrics
  7. Follow-up questions are emitted as rag-followup-event
  8. Results are cached in Redis (5-min TTL) for identical queries
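
The two caches in steps 2 and 8 behave like a key-value store with per-entry expiry. The sketch below uses an in-memory `TTLCache` as a stand-in for Redis so that it runs without a server. The class, the `cache_key` helper, and the whitespace and case normalization used to decide what counts as an "identical" query are all assumptions; only the 10-minute and 5-minute TTLs come from this page.

```python
import hashlib
import time
from typing import Any, Callable, Optional

CLASSIFIER_TTL_S = 10 * 60  # classifier cache, 10-minute TTL
RAG_RESULT_TTL_S = 5 * 60   # RAG result cache, 5-minute TTL


class TTLCache:
    """In-memory stand-in for Redis SET-with-expiry and GET."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._items: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl_s: float) -> None:
        self._items[key] = (self._clock() + ttl_s, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._items[key]
            return None
        return value


def cache_key(tenant_id: str, prefix: str, text: str) -> str:
    """Build a tenant-scoped key; normalization here is an assumption."""
    normalized = " ".join(text.lower().split())
    digest = hashlib.sha256(normalized.encode("utf-8")).hexdigest()
    return f"{prefix}:{tenant_id}:{digest}"
```

Including the tenant ID in the key keeps one tenant's cached answers from being served to another, in line with the tenant isolation described under Security.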

3. Agent Feedback

When an agent rates a suggestion or source:

  1. The widget emits rag-answer-feedback (thumbs up/down with optional reason code and comment) or rag-source-feedback (per-source rating)
  2. The connector service persists feedback to AlloyDB
  3. Source feedback updates kb_documents.feedback_score -- chunks with low scores are automatically suppressed from future results
  4. The widget receives a feedback-received acknowledgement
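
Step 3 describes a feedback loop: ratings move a score, and the score gates retrieval. The sketch below shows one way such a loop could work. The `Chunk` class, the +1/-1 scoring, and the `SUPPRESSION_THRESHOLD` value are hypothetical; this page does not state how `feedback_score` is computed or where the cutoff lies.

```python
from dataclasses import dataclass

# Hypothetical cutoff: chunks at or below this score are hidden.
SUPPRESSION_THRESHOLD = -2


@dataclass
class Chunk:
    doc_id: str
    similarity: float
    feedback_score: int = 0


def apply_source_feedback(chunk: Chunk, helpful: bool) -> None:
    """Assumed scoring rule: +1 for a helpful rating, -1 otherwise."""
    chunk.feedback_score += 1 if helpful else -1


def visible_chunks(chunks: list[Chunk]) -> list[Chunk]:
    """Drop suppressed chunks, then order the rest by similarity."""
    kept = [c for c in chunks if c.feedback_score > SUPPRESSION_THRESHOLD]
    return sorted(kept, key=lambda c: c.similarity, reverse=True)
```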

4. Coaching

When coaching is enabled for the tenant:

  1. On each customer utterance, the coaching engine evaluates against active playbooks (async, parallel to RAG)
  2. Deterministic mode: Matches playbook conditions and emits step-by-step guidance
  3. Generative mode: Uses LLM + KB context to generate situation-aware coaching
  4. Hybrid mode: Tries playbook match first, falls back to generative
  5. The widget receives coach-suggestion and coach-step-update events
  6. Agents provide coaching feedback, which is tracked to measure playbook effectiveness
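
The three coaching modes differ only in when the playbook matcher and the generator are consulted. The sketch below illustrates that control flow under stated assumptions: the keyword-based `match_playbook`, the playbook dictionary shape, and the `generate` callback are hypothetical simplifications of the real playbook conditions and LLM call.

```python
import re
from typing import Callable, Optional

MODES = ("deterministic", "generative", "hybrid")


def match_playbook(utterance: str, playbooks: list[dict]) -> Optional[dict]:
    """Return guidance from the first playbook whose keywords appear in the utterance."""
    words = set(re.findall(r"[a-z']+", utterance.lower()))
    for playbook in playbooks:
        if words & set(playbook["keywords"]):
            return {"source": "playbook", "playbook": playbook["name"], "steps": playbook["steps"]}
    return None


def coach(
    utterance: str,
    playbooks: list[dict],
    mode: str,
    generate: Callable[[str], str],
) -> Optional[dict]:
    if mode not in MODES:
        raise ValueError(f"unknown coaching mode: {mode!r}")
    if mode != "generative":
        hit = match_playbook(utterance, playbooks)
        # Deterministic mode never falls back; hybrid falls back only on a miss.
        if hit is not None or mode == "deterministic":
            return hit
    return {"source": "generative", "guidance": generate(utterance)}
```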

5. Summary & Analysis

Summaries and analyses are generated on demand or refreshed automatically:

  1. The widget or portal requests a summary via POST /conversations/{id}/summary
  2. The connector service builds a transcript from stored messages and calls an LLM to generate situation/action/next-steps
  3. Analysis (POST /conversations/{id}/analysis) generates customer sentiment, conversation quality scores, key topics, talk time breakdown, and compliance risk assessment
  4. Results are stored in assist_conversation_analyses and returned to the widget
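
Step 2 amounts to flattening stored messages into a transcript and wrapping it in an instruction. The sketch below shows the idea. The message fields (`role`, `text`, `ts`), the function name `build_summary_prompt`, and the prompt wording are hypothetical; only the situation/action/next-steps structure comes from this page.

```python
def build_summary_prompt(messages: list[dict]) -> str:
    """Order messages by timestamp and prepend a summary instruction."""
    ordered = sorted(messages, key=lambda m: m["ts"])
    transcript = "\n".join(f"{m['role'].upper()}: {m['text']}" for m in ordered)
    return (
        "Summarize the conversation below under three headings: "
        "Situation, Action, Next steps.\n\n" + transcript
    )
```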

6. Conversation End

When the conversation closes:

  1. A leave-conversation event is emitted (or disconnect detected)
  2. The connector service marks the conversation as completed in AlloyDB
  3. If applicable, triggers export to CCAI Insights for reporting
  4. Session state is cleaned up from Redis

Audio Streaming

For voice conversations, the assist-audiohook-service provides real-time transcription:

| Stage | Description |
| --- | --- |
| Audio capture | The CCaaS platform streams audio via WebSocket (typically 16 kHz PCM) |
| Transcription | The audiohook service uses a speech-to-text engine to produce interim and final transcripts |
| PII redaction | Transcripts are passed through the PII redactor before storage |
| Forwarding | Final transcripts are published to Pub/Sub and received by the connector service |
| Processing | The connector service runs the same RAG pipeline used for chat messages |

TIP

The audiohook service supports the Genesys AudioHook protocol and generic WebSocket audio streams. Configure the audio format and encoding in the integration settings.
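
When sizing buffers for the audio stream, it helps to know how many bytes a given duration of audio occupies. The sketch below does that arithmetic, assuming 16-bit mono linear PCM at 16 kHz. The sample width and channel count are assumptions (this page only says "typically 16kHz PCM"), and both helper names are hypothetical.

```python
def pcm_chunk_bytes(
    duration_ms: int,
    sample_rate_hz: int = 16_000,
    sample_width_bytes: int = 2,  # assumption: 16-bit samples
    channels: int = 1,            # assumption: mono
) -> int:
    """Bytes of raw PCM needed to hold duration_ms of audio."""
    return sample_rate_hz * sample_width_bytes * channels * duration_ms // 1000


def split_frames(buffer: bytes, frame_bytes: int) -> tuple[list[bytes], bytes]:
    """Cut a buffer into whole frames; return the frames and the leftover bytes."""
    usable = len(buffer) - len(buffer) % frame_bytes
    frames = [buffer[i:i + frame_bytes] for i in range(0, usable, frame_bytes)]
    return frames, buffer[usable:]
```

Under these assumptions a 20 ms frame is 640 bytes, so one second of audio is 32,000 bytes. Leftover bytes from `split_frames` would be carried over and prepended to the next WebSocket message.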

Data Storage

| Data | Storage | Retention |
| --- | --- | --- |
| Conversation sessions & messages | AlloyDB | Configurable per tenant (default 90 days) |
| Suggestions, feedback, and analyses | AlloyDB | Same as conversation retention |
| Document embeddings | AlloyDB with pgvector | Persisted until document is deleted |
| Real-time session state | Redis | Duration of conversation + 1 hour TTL |
| Classifier cache | Redis | 10-minute TTL |
| RAG result cache | Redis | 5-minute TTL |
| Coaching step state | Redis | Duration of conversation |
| Widget translations | Redis | 24-hour TTL |
| Audio streams | Not persisted | Processed in real-time, discarded after transcription |
| LLM token usage | AlloyDB (ai_usage_records) | Indefinite (billing records) |

Security

  • All service-to-service communication uses internal networking (no public endpoints for backend services)
  • The widget authenticates via provider-specific OAuth (Genesys PKCE, Salesforce token exchange, etc.) followed by a tenant-scoped JWT
  • Embed tokens are signed JWTs with allowed origin restrictions, validated on every bootstrap request
  • Pub/Sub subscriptions use GCP IAM service accounts
  • PII in conversation transcripts is redacted before storage using configurable redaction rules
  • Tenant isolation is enforced on every database query via tenant_id filtering
  • Feedback-based chunk suppression prevents low-quality sources from surfacing
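
To make the PII redaction item above more concrete, here is a minimal sketch of rule-based redaction with regular expressions. The rule set, the placeholder format, and the `redact` function are hypothetical; the actual redactor and its configurable rules are not described on this page, and production patterns would need to be considerably more thorough.

```python
import re

# Hypothetical rules: label -> compiled pattern.
REDACTION_RULES: dict[str, re.Pattern] = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # 13 to 16 digits, optionally separated by single spaces or hyphens.
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}


def redact(text: str, rules: dict[str, re.Pattern] = REDACTION_RULES) -> str:
    """Replace every match of every rule with a [LABEL] placeholder."""
    for label, pattern in rules.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Running redaction before anything is written to storage or published onward means that downstream consumers only ever see the placeholder text.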

Key Files

| File | Description |
| --- | --- |
| backend/services/assist-connector-service/app/socket_handlers.py | Socket.IO event registration |
| backend/services/assist-connector-service/app/handlers/rag_trigger.py | RAG pipeline: classify → retrieve → stream → deliver |
| backend/services/assist-connector-service/app/handlers/coach_engine.py | Coaching orchestration (deterministic, generative, hybrid) |
| backend/services/assist-connector-service/app/handlers/session.py | Session config resolution (queue, deployment key, provider) |
| backend/services/assist-connector-service/app/handlers/feedback.py | Answer and source feedback persistence |
| backend/services/assist-connector-service/app/services/session_manager.py | Redis-backed session and transcript history |
| backend/services/assist-connector-service/app/services/pubsub_subscriber.py | Pub/Sub event routing and PII redaction |
| backend/services/rag-service/app/routes/assist.py | /retrieve and /stream endpoints |
| backend/services/rag-service/app/services/assist_classifier.py | LLM-based utterance classifier |
| frontend/agent-assist-widget/src/views/WidgetView.vue | Main widget layout (transcript, suggestions, summary, analysis, coaching) |
| frontend/agent-assist-widget/src/services/socket.ts | Socket.IO client event handling |
| frontend/agent-assist-portal/src/views/DashboardView.vue | Admin analytics dashboard |
