Skip to main content

Elevator pitch

What are you proposing to change?
Add a standardized usage_update session notification that lets agents report the current context window utilization and optional cumulative session cost. This RFD is intentionally scoped to session-level context size and cost. End-turn token accounting is covered separately in the End-Turn Token Usage RFD.

Status quo

How do things work today and what problems does this cause? Why would we change things?
ACP currently has no standardized way for agents to communicate:
  1. Context window status - How much of the model’s context window is being used
  2. Cost information - Estimated cumulative cost for API usage
This creates several problems:
  • No context management - Clients cannot warn users when a session is approaching context limits or suggest compaction
  • No cost transparency - Users cannot track spending or estimate whether a long-running session should continue
  • Inconsistent implementations - Each agent reports context and cost differently, if it reports them at all
Common AI coding tool behavior suggests users benefit from:
  • Context percentage indicators such as “19%”
  • Absolute token details such as “31.4K of 200K tokens”
  • Warning thresholds around high context utilization
  • Auto-compaction or handoff suggestions near context limits
  • Cumulative session cost tracking rather than only per-turn cost breakdowns

What we propose to do about it

What are you proposing to improve the situation?
Agents send context window and cost information via session/update notifications with sessionUpdate: "usage_update":
{
  "jsonrpc": "2.0",
  "method": "session/update",
  "params": {
    "sessionId": "sess_abc123",
    "update": {
      "sessionUpdate": "usage_update",
      "used": 53000,
      "size": 200000
    }
  }
}

Context Window Fields

  • used (number, required) - Tokens currently in context
  • size (number, required) - Total context window size in tokens
Clients can compute remaining as size - used and percentage as used / size * 100. If a model has a dynamic context window, agents should report the current effective limit and send another usage_update if that limit changes. If an agent cannot provide a meaningful context window size, it should not send usage_update; this RFD does not define a null or omitted size state.

Cost Field

  • cost (object, optional, nullable) - Cumulative session cost
    • amount (number, required) - Total cumulative cost for the session
    • currency (string, required) - ISO 4217 currency code, such as "USD" or "EUR"
Example with optional cost:
{
  "jsonrpc": "2.0",
  "method": "session/update",
  "params": {
    "sessionId": "sess_abc123",
    "update": {
      "sessionUpdate": "usage_update",
      "used": 53000,
      "size": 200000,
      "cost": {
        "amount": 0.045,
        "currency": "USD"
      }
    }
  }
}
Agents send updates when they have current data available:
  • After session/new, once the session ID exists
  • After session/load, session/resume, or session/fork for restored sessions
  • After session/prompt, when usage data becomes available
  • Anytime context window state or cumulative cost changes significantly
This approach provides flexibility for different agent implementations:
  • Agents that can query current usage without a prompt can send updates immediately after session setup
  • Agents that only receive usage data while prompting can send updates after prompt completion
  • Agents that cannot calculate cost can still report context window status

Design Principles

  1. Session state - Context size and cumulative cost describe the session, not one prompt response
  2. Agent-pushed notifications - Agents proactively send updates when data becomes available, following the same pattern as dynamic session properties such as available_commands_update, current_mode_update, and session_info_update
  3. Agent calculates, client can verify - Agents know their model and tokenizer best, but expose raw values so clients can compute derived display values
  4. Flexible cost reporting - Cost is optional because not all agents track it, and the currency is explicit because agents may bill in different currencies
  5. Optional but recommended - Context and cost reporting remains optional to preserve compatibility

Shiny future

How will things will play out once this feature exists?
For Users:
  • Visibility: Users see current context window usage with percentage indicators
  • Cost awareness: Users can track cumulative spending at any time
  • Better planning: Users know when to start a new session, compact context, or hand off work
  • Transparency: Users get a consistent view of resource consumption across agents
For Client Implementations:
  • Consistent UI: Clients can show usage through standard progress indicators, detail hovers, and warning states
  • Smart warnings: Clients can warn users as sessions approach high context utilization
  • Cost controls: Clients can implement budget alerts and session-level spend displays
  • Reactive updates: Clients update UI immediately when agents push new data
  • No polling needed: Clients do not need a separate status request to keep context displays current
For Agent Implementations:
  • Standard reporting: Agents get a clear contract for reporting context window and cost state
  • Flexibility: Agents can report only the cost information they can calculate
  • Model diversity: The shape works across providers and local models

Implementation details and plan

Tell me more about your implementation. What is your detailed implementation plan?
  1. Add a UsageUpdate variant to SessionUpdate with discriminator sessionUpdate: "usage_update".
  2. Add a Cost type with amount and currency fields.
  3. Gate this Rust crate surface behind the unstable_session_usage feature.
  4. Document the session/update notification variant in the protocol docs.
  5. Add examples showing when agents should send context window and cost updates.
This RFD does not add end-turn token usage to PromptResponse. That proposal remains in draft in the End-Turn Token Usage RFD.

Frequently asked questions

What questions have arisen over the course of authoring this document or during subsequent discussions?

Why split this from end-turn token usage?

Context window size and cumulative cost are session-level state that clients may need before, during, or after a prompt. Token accounting for a completed turn has different open questions, including whether values should be per-turn or cumulative and how cache metrics should map across providers.

How do users know when to hand off or compact context?

The usage_update notification provides the raw values clients need:
  • used and size give absolute numbers for precise tracking
  • Clients can compute remaining as size - used
  • Clients can compute percentage as used / size * 100
Recommended client behavior:
PercentageAction
< 75%Normal operation
75-90%Yellow indicator, suggest “Context filling up”
90-95%Orange indicator, recommend “Start new session or summarize”
> 95%Red indicator, warn “Next prompt may fail - handoff recommended”

Why is cost in session/update instead of PromptResponse?

Cost is cumulative session state, similar to context window utilization:
  • Users usually want total spending, not only per-turn costs
  • Both cost and context window are session-level metrics that can change when agents compact, switch models, or restore sessions
  • Cost is optional because not all agents track it

Why not assume USD for cost?

Agents may bill in different currencies:
  • European agents might bill in EUR
  • Asian agents might bill in JPY or CNY
  • Some agents might use credits or points
  • Currency conversion rates change
Agents should report the actual billing currency and let clients convert if needed.

How does this work with streaming responses?

Agents may send usage_update notifications during streaming if context state changes and they have reliable current data. They should also send an update after prompt completion when final context size or cost data becomes available.

What about models without fixed context windows?

Agents should report the current effective context window size. For models with dynamic windows, agents should report the current limit and send another update if it changes. If there is no meaningful context window size, agents should not send usage_update; size is required and non-null.

What about rate limits and quotas?

This RFD focuses on context window utilization and cumulative cost. Rate limits and quotas are separate concerns that could be addressed in a future RFD.

Should cached tokens count toward context window usage?

Yes. Cached tokens still occupy context window space even when they are cheaper to process. Context window usage should include all tokens currently in context.

Why a notification instead of a request?

Using session/update notifications instead of a session/status request provides several benefits:
  1. Consistency: Follows the same pattern as other dynamic session properties
  2. Agent flexibility: Agents can send updates when they have data available
  3. No polling: Clients receive updates reactively
  4. Real-time updates: Updates flow naturally as part of the session lifecycle

What if the client connects mid-session?

When a client connects to an existing session through session/load or session/resume, agents should send a usage_update notification if they have current context window data available. This lets the client UI immediately display accurate context and cost information. For agents that only provide usage during active prompting, the client UI may not show usage until after the first prompt is sent.

What alternative approaches did you consider, and why did you settle on this one?

Everything in PromptResponse - Simpler, but context window and cost are session state that users may want independently of prompt turns. Request/response with session/status - Requires clients to poll, and some agents do not have APIs to query current status without a prompt. Client calculates everything - Rejected because clients do not know the agent’s tokenizer, exact context window size, model routing, or pricing. Only percentage, no raw tokens - Rejected because users want absolute numbers, clients cannot verify calculations, and it is less transparent.

Revision history

  • 2025-12-07: Initial draft
  • 2025-12-13: Changed from session/status request method to session/update notification with sessionUpdate: "context_update". Made cost optional and removed remaining field (clients can compute as size - used). This approach provides more flexibility for agents and follows the same pattern as other dynamic session properties.
  • 2025-12-17: Renamed reasoning_tokens to thought_tokens for consistency with ACP terminology. Removed percentage field (clients can compute as used / size * 100).
  • 2025-12-19: Renamed sessionUpdate: "context_update" to sessionUpdate: "usage_update" to better reflect the payload semantics (includes both context window info and cumulative cost).
  • 2026-06-02: Split end-turn token accounting into a separate draft RFD.
  • 2026-06-03: Moved to Preview.