Cost & Usage
The Cost & Usage report provides visibility into the resources consumed by Agent Assist. Use it to understand spending patterns, attribute costs to individual agents and features, and forecast future usage.
KPI Cards
The top-level summary cards show period totals:
| Metric | Description |
|---|---|
| Search Queries | Total number of knowledge search queries triggered |
| LLM Tokens | Total input + output tokens consumed. The card shows the input/output breakdown (e.g., "45.0K in / 4.4K out"). If prompt caching is enabled, cached tokens are shown separately with a savings indicator (e.g., "21.4K cached (90% savings)"). |
| STT Minutes | Total minutes of audio processed by the speech-to-text provider. Voice calls only — zero for chat-only deployments. |
| Real-time Events | Total messages published for real-time event delivery |
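
The compact figures on the LLM Tokens card can be reproduced from raw counts. The following is a minimal sketch: `format_tokens` and `token_card_label` are hypothetical helper names, and the rounding rule (one decimal place, a `K` suffix from 1,000 upward) is an assumption inferred from the example strings above, not documented product behavior.

```python
def format_tokens(count: int) -> str:
    """Render a raw token count compactly (assumed rule: one decimal, K/M suffix)."""
    if count >= 1_000_000:
        return f"{count / 1_000_000:.1f}M"
    if count >= 1_000:
        return f"{count / 1_000:.1f}K"
    return str(count)


def token_card_label(input_tokens: int, output_tokens: int) -> str:
    """Build an input/output breakdown string in the style shown on the card."""
    return f"{format_tokens(input_tokens)} in / {format_tokens(output_tokens)} out"


# Illustrative counts only.
label = token_card_label(45_000, 4_400)
```

With these inputs, `label` matches the example string in the table ("45.0K in / 4.4K out").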
Estimated Total Cost
A banner card shows the total estimated cost across all services (LLM, STT, events) for the selected period. If a budget threshold is configured, it appears alongside the total, with a visual indicator when spending exceeds the threshold.
Token Usage by Feature
A stacked bar chart breaks down token consumption by feature and direction (input vs output):
| Feature | Description |
|---|---|
| Classifier | Tokens used by the LLM classifier to evaluate each utterance |
| Answer Generation | Tokens used for knowledge search context assembly and answer streaming |
| Summary | Tokens used for conversation summarization |
| Analysis | Tokens used for conversation quality analysis |
| Coaching | Tokens used for coaching tip generation (only when coaching is enabled) |
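
To make the stacked-bar grouping concrete, the sketch below sums tokens per feature and direction. The record shape `(feature, direction, tokens)` and the function name `tokens_by_feature` are assumptions for illustration; the report's actual data model is not described here.

```python
from collections import defaultdict

# Hypothetical usage records: (feature, direction, tokens). Values are illustrative.
records = [
    ("Classifier", "input", 1200),
    ("Classifier", "output", 80),
    ("Answer Generation", "input", 5400),
    ("Answer Generation", "output", 900),
    ("Classifier", "input", 1100),
]


def tokens_by_feature(rows):
    """Sum tokens per feature, split by direction (one stacked bar per feature)."""
    totals = defaultdict(lambda: {"input": 0, "output": 0})
    for feature, direction, tokens in rows:
        totals[feature][direction] += tokens
    return dict(totals)


chart_data = tokens_by_feature(records)
```

Each key in `chart_data` corresponds to one bar, and the `input` and `output` values are its two stacked segments.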
Cost Breakdown
A cost breakdown by service category shows dollar amounts and percentages:
| Category | Description |
|---|---|
| LLM | Cost of all LLM token consumption |
| STT | Cost of speech-to-text processing minutes |
| RAG | Estimated cost of knowledge search queries |
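
The percentages in this breakdown are each category's share of the total. A minimal sketch, assuming per-category dollar amounts are already known (`cost_breakdown` is a hypothetical helper and the figures are illustrative, not real pricing):

```python
def cost_breakdown(costs: dict) -> dict:
    """Return each category's dollar amount and its percentage of the total."""
    total = sum(costs.values())
    return {
        category: {
            "amount": amount,
            # Guard against division by zero when nothing was spent.
            "percent": (amount / total * 100) if total else 0.0,
        }
        for category, amount in costs.items()
    }


# Illustrative figures only.
breakdown = cost_breakdown({"LLM": 60.0, "STT": 30.0, "RAG": 10.0})
```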
Token Efficiency
This panel compares accepted and rejected suggestions to identify wasted tokens:
| Metric | Description |
|---|---|
| Accepted count + avg latency | Number of suggestions the agent found helpful, with their average latency |
| Rejected count + avg latency | Number of suggestions the agent rejected, with their average latency. High-latency rejected suggestions represent wasted tokens. |
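
The two metrics above can be derived from a log of suggestion outcomes. The sketch below assumes a hypothetical log of `(outcome, latency_ms)` pairs; the field names, the millisecond unit, and `efficiency_summary` are illustrative and not taken from the product.

```python
from statistics import mean

# Hypothetical suggestion log: (outcome, latency in milliseconds).
suggestions = [
    ("accepted", 800),
    ("accepted", 1200),
    ("rejected", 2500),
    ("rejected", 3500),
    ("accepted", 1000),
]


def efficiency_summary(rows):
    """Count suggestions per outcome and average their latency."""
    summary = {}
    for outcome in ("accepted", "rejected"):
        latencies = [ms for o, ms in rows if o == outcome]
        summary[outcome] = {
            "count": len(latencies),
            # None when there were no suggestions with this outcome.
            "avg_latency_ms": mean(latencies) if latencies else None,
        }
    return summary


summary = efficiency_summary(suggestions)
```

A rejected average well above the accepted average, as in this sample data, is the pattern the panel is meant to surface.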
Burn Rate Projection
This panel projects the daily average and the estimated monthly cost from current usage patterns. The projection is highlighted in red if the projected monthly cost exceeds the configured budget threshold.
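
The exact projection method is not specified here. As one plausible reading, the sketch below uses a simple linear extrapolation: month-to-date cost divided by days elapsed, scaled to the length of the month. The function name `burn_rate`, its parameters, and the assumption that `cost_to_date` covers the month from its first day are all hypothetical.

```python
import calendar
from datetime import date
from typing import Optional


def burn_rate(cost_to_date: float, as_of: date, budget: Optional[float] = None):
    """Linear projection: average daily cost so far, scaled to the full month."""
    days_elapsed = as_of.day  # assumes cost_to_date covers the 1st through as_of
    days_in_month = calendar.monthrange(as_of.year, as_of.month)[1]
    daily_average = cost_to_date / days_elapsed
    monthly_estimate = daily_average * days_in_month
    # Flag the projection only when a budget threshold is configured.
    over_budget = budget is not None and monthly_estimate > budget
    return daily_average, monthly_estimate, over_budget


# Illustrative figures: $150 spent by June 10, with a $400 budget threshold.
daily, monthly, over = burn_rate(150.0, date(2024, 6, 10), budget=400.0)
```

Here the daily average is 15.0 and the monthly estimate is 450.0, which exceeds the 400.0 threshold, so `over` is `True`.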
Daily Usage Trend
A daily bar chart shows knowledge search query volume over the selected period. Hover over a bar to see the exact date and count. Use this to spot usage spikes tied to specific events or campaigns.
Per-Agent Usage
A table shows resource consumption by individual agent:
| Column | Description |
|---|---|
| Agent | Agent display name |
| Search Queries | Number of knowledge search queries triggered |
| LLM Tokens | Total tokens consumed |
| STT Minutes | Audio processing minutes (voice only) |
| Est. Cost | Calculated cost based on usage and pricing |
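
As an illustration of how an Est. Cost value could follow from the other columns, the sketch below multiplies each usage figure by a unit price. The unit prices, the `AgentUsage` class, and the single blended token rate are assumptions; actual estimates depend on the pricing configured for the tenant, which may price input and output tokens differently.

```python
from dataclasses import dataclass

# Hypothetical unit prices in USD, for illustration only.
PRICE_PER_1K_TOKENS = 0.002
PRICE_PER_STT_MINUTE = 0.01
PRICE_PER_SEARCH_QUERY = 0.001


@dataclass
class AgentUsage:
    agent: str
    search_queries: int
    llm_tokens: int
    stt_minutes: float

    def estimated_cost(self) -> float:
        """Usage multiplied by unit price, summed across the three resources."""
        return (
            self.llm_tokens / 1000 * PRICE_PER_1K_TOKENS
            + self.stt_minutes * PRICE_PER_STT_MINUTE
            + self.search_queries * PRICE_PER_SEARCH_QUERY
        )


row = AgentUsage("Agent A", search_queries=100, llm_tokens=50_000, stt_minutes=30.0)
est_cost = row.estimated_cost()
```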
Filters
| Filter | Description |
|---|---|
| Queue | Filter all metrics to a specific CCaaS queue |
| Date range | Custom start and end date pickers |
Budget Alert Threshold
Set a monthly spending threshold. When the estimated cost exceeds this amount, the system alerts you. Configure the threshold directly on this page or in Tenant Configuration.
WARNING
Cost data reflects estimates based on token counts and configured pricing. Refer to your LLM provider invoices for actual billed amounts.
