AAEP Subscribers' Guide
For assistive technology vendors and accessibility-tool engineers building AAEP-aware consumers.
This guide is for the people who build the software that announces AAEP events to users: screen readers (Narrator, NVDA, JAWS, VoiceOver, TalkBack, Orca), voice-control systems (Voice Access, Voice Control, Dragon), switch-input software, refreshable-braille displays, captioning services, and other assistive technology.
If you are an agent or framework engineer, read the Implementer's Guide instead. This guide is for the opposite side of the protocol.
Table of contents
- Your role in AAEP
- The subscriber lifecycle
- Building the subscription handshake
- Receiving and routing events
- Implementing the reply channel
- Coalescing strategies
- Cognitive-load adaptation
- Handling multilingual content
- Multiple concurrent producers
- AT-specific guidance
- Privacy and data handling
- Conformance and certification
1. Your role in AAEP
A subscriber sits between a producer (the agent) and the user. Your responsibilities, in order of importance:
-
Reliably surface critical events to the user. Confirmation requests, errors, and handoffs MUST reach the user. Other events MAY be filtered, suppressed, or coalesced based on user preference.
-
Translate events into modality-appropriate announcements. AAEP gives you structured event data; you decide whether to speak, display, vibrate, present braille, or render to a visual UI.
-
Reply on the user's behalf when prompted. Confirmations and clarifications block the producer until you respond. Latency matters: a slow reply degrades the user's experience.
-
Negotiate sensible defaults during the handshake. Your capability declaration tells the producer how to adapt to you.
-
Protect the user from event flood. Producers may emit hundreds of events per second. You decide what reaches the user and what gets summarized or dropped.
AAEP does not tell you how to render announcements. A screen reader speaks them. A braille display refreshes its cells. A switch-input system might present them as a yes/no question with a single switch press to confirm. AAEP only specifies what information is available; the modality is your choice.
2. The subscriber lifecycle
A typical subscriber lifecycle:
1. Subscriber discovers a producer (via manifest URL, transport endpoint, or platform mechanism)
2. Subscriber sends subscription.request with capability declaration
3. Producer responds:
- subscription.accepted with subscription_id and honored_capabilities → proceed
- subscription.rejected with reason_code → handle the failure
4. Producer emits events; subscriber receives, processes, and announces them
5. When a confirmation or clarification event arrives, subscriber:
a. Surfaces the request to the user
b. Awaits the user's response (in your modality)
c. Sends confirmation.reply or clarification.reply
6. (Optional) Subscriber renegotiates capabilities mid-session
7. Subscriber closes the subscription when the user disengages
For Conformance Level 1 producers, there is no handshake — events arrive immediately. You should still implement steps 4-5 correctly.
3. Building the subscription handshake
3.1 Constructing your capability declaration
Your capabilities tell the producer how to adapt. The shape is fixed by the spec; the values depend on your AT product and the user's configuration.
{
"type": "subscription.request",
"aaep_version": "1.0.0",
"subscriber_id": "your-at-product-id",
"subscriber_name": "Your AT Product Name and Version",
"capabilities": {
"max_events_per_second": 3,
"preferred_verbosity": "normal",
"languages": ["en-US"],
"supports_confirmation_reply": true,
"supports_clarification_reply": true,
"coalesce_boundaries": ["sentence", "completion"],
"supported_conformance_levels": [1, 2],
"cognitive_load": "medium",
"pace_wpm": 180
}
}
3.2 Picking sensible defaults for your AT type
| Field | Screen reader default | Voice control default | Braille display default | Captioning service default |
|---|---|---|---|---|
max_events_per_second |
2-5 | 5-10 | 1-2 (slower update cycle) | 10+ |
preferred_verbosity |
normal | terse | normal | detailed |
coalesce_boundaries |
["sentence", "completion"] | ["completion"] | ["sentence", "paragraph", "completion"] | ["word", "sentence"] |
cognitive_load |
medium | medium | low (less buffer space) | medium |
pace_wpm |
follow user's TTS rate | 200 | follow refresh rate | not applicable |
These are defaults. Always allow user override via your settings UI.
3.3 Honoring user preferences in the handshake
If your user has configured "low cognitive load" mode (often called "Quiet mode" or "Focus mode"), declare:
{
"cognitive_load": "low",
"max_events_per_second": 1,
"event_filters": {
"exclude": [
"aaep:agent.state.changed",
"aaep:agent.progress.updated"
]
}
}
The producer will reduce verbosity and suppress non-critical state transitions. Critical events still arrive.
4. Receiving and routing events
4.1 The router pattern
Most subscribers benefit from a router that dispatches each event to a type-specific handler:
class AAEPRouter:
def __init__(self):
self.handlers = {}
def handler(self, event_type):
def decorator(fn):
self.handlers[event_type] = fn
return fn
return decorator
def dispatch(self, event):
event_type = event.get("type", "<unknown>")
handler = self.handlers.get(event_type)
if handler:
handler(event)
else:
self.unknown_event_handler(event)
def unknown_event_handler(self, event):
# MUST handle gracefully — extensions exist
pass
router = AAEPRouter()
@router.handler("aaep:agent.awaiting.confirmation")
def on_confirmation(event):
surface_confirmation_to_user(event)
@router.handler("aaep:agent.output.streaming")
def on_streaming(event):
queue_for_announcement(event["chunk"])
# ... etc
4.2 Handling unknown event types
You MUST handle unknown event types gracefully. Extensions add new event types over time, and producers may emit events from extensions you have not implemented.
The safe default: if the event has a summary_normal field, announce it. If not, ignore it. Critical events that require special handling always have well-known types (agent.awaiting.confirmation, agent.session.errored, agent.handoff.requested) — these are stable.
4.3 Respecting urgency
The urgency field tells you the producer's recommended priority:
| Urgency | Recommended subscriber behavior |
|---|---|
background |
May be silently suppressed in low-cognitive-load mode |
normal |
Announce per user's preferences |
critical |
MUST be announced promptly. MUST bypass rate limits and filters |
Critical events represent the user's safety: confirmations, errors, handoffs. Suppressing them under load violates the protocol.
5. Implementing the reply channel
5.1 Confirmation replies
When you receive agent.awaiting.confirmation:
- Surface the
actionandconsequenceto the user in your modality. - Wait for the user's decision (in any way appropriate for your AT).
- Send a
confirmation.replywith the user's decision.
def handle_confirmation(event):
action = event["action"]
consequence = event["consequence"]
reply_token = event["reply_token"]
timeout = event["timeout_seconds"]
# Surface to user (modality-specific)
speak(f"Confirmation required. {action}. {consequence}.")
# Get user decision (modality-specific; here using stub)
decision = await_user_decision(timeout_seconds=timeout)
# Send reply
reply = {
"type": "confirmation.reply",
"reply_token": reply_token,
"decision": decision, # "accept" or "reject"
"subscription_id": current_subscription_id,
"timestamp": now_rfc3339(),
"decided_by": f"user:{current_user_id}",
}
transport.send(reply)
5.2 Clarification replies
Clarifications differ from confirmations: they collect free-form information rather than authorize an action. The accepted_response_kinds field tells you what input format the producer expects:
| accepted_response_kinds value | How to surface |
|---|---|
freetext |
Open-ended text input |
yes_no |
Two-button question or yes/no voice prompt |
multiple_choice |
Render choices as selectable list |
numeric |
Number entry input |
Your AT may not support all input methods. If you can't render one of the accepted kinds, escalate to the user (e.g., "The agent needs you to provide a number; please type it on your keyboard").
5.3 Preventing reply duplication
Each reply_token is single-use. Track tokens you have already replied to and avoid duplicate replies (a duplicate is harmless because the producer ignores it, but it wastes bandwidth).
5.4 Reply latency targets
| Action type | Recommended latency target |
|---|---|
| Confirmation reply (after user decides) | < 100ms transmission |
| Clarification reply | < 100ms transmission |
| User decision time | Bounded by timeout_seconds; usually 30-300 seconds |
Long user-decision time is expected. Long transmission time is not.
6. Coalescing strategies
Streaming output is where naive subscribers fail. Producers emit chunks as fast as the LLM generates them — often 30-100 tokens per second. Your AT cannot announce that fast.
6.1 Sentence-boundary coalescing (recommended default)
Buffer incoming chunks until a sentence boundary, then announce the complete sentence:
class SentenceCoalescer:
SENTENCE_ENDS = {".", "!", "?"}
def __init__(self):
self.buffer = ""
def add(self, chunk, coalesce_hint, complete):
self.buffer += chunk
if complete or coalesce_hint == "sentence":
self._flush()
elif coalesce_hint == "paragraph":
self._flush()
# Otherwise wait for next chunk
def _flush(self):
if self.buffer:
announce(self.buffer)
self.buffer = ""
6.2 Producer-hint-driven coalescing
The coalesce_hint field tells you what boundary this chunk represents:
| coalesce_hint | What it means | Recommended action |
|---|---|---|
none |
Mid-content | Buffer |
word |
End of word | Buffer or announce per user preference |
sentence |
End of sentence | Announce |
paragraph |
End of paragraph | Announce with pause |
completion |
End of entire output | Announce final segment |
6.3 Cognitive-load-driven coalescing
If user has configured low cognitive load:
if user_cognitive_load == "low":
# Wait for completion before announcing anything
if event.get("complete"):
announce(event["chunk"]) # the final, complete output
elif user_cognitive_load == "medium":
# Sentence-by-sentence
sentence_coalescer.add(...)
else: # high
# Per-chunk if requested
announce(event["chunk"])
6.4 Per-modality nuances
- Speech (TTS): sentence-based coalescing matches natural speech rhythm.
- Braille: paragraph-based or full-completion coalescing reduces refresh cycles.
- Captions: word- or phrase-based coalescing matches reading flow.
- Voice control output: completion-only often makes most sense.
7. Cognitive-load adaptation
Cognitive load is a user setting that says: how much information should reach me?
| Mode | Subscriber behavior |
|---|---|
low |
Suppress all background events. Coalesce streaming output to completion only. Announce only critical events and final outputs. |
medium (default) |
Standard announcement. Sentence-level coalescing. State changes announced briefly. |
high |
Verbose announcements. Per-chunk streaming. State changes announced with detail. |
Some users want detailed running commentary; others want only essential information. Both are legitimate. The protocol gives you a clean way to honor both.
8. Handling multilingual content
8.1 Language detection
The event's localization_hints.primary_language (envelope) and per-event language (on streaming events) tell you what language the content is in. Use this to:
- Switch TTS voice
- Route to a language-specific speech engine
- Adjust pace (some languages naturally read faster or slower)
- Apply correct grapheme/word segmentation rules
8.2 Fallback chains
If your subscriber doesn't support the producer's language, you can request a fallback:
{
"languages": ["yo-NG", "en-NG", "en-US"]
}
The producer will provide content in the first language it supports from your list.
8.3 Right-to-left text
When text_direction is "rtl" or content is in a known RTL script (Arabic, Hebrew, Persian, Urdu), apply the Unicode Bidirectional Algorithm (UAX #9) before rendering. Most modern TTS engines handle this automatically; braille displays and visual UIs may need explicit handling.
8.4 Tonal languages
For tonal languages like Yoruba (yo-NG), Igbo (ig-NG), Vietnamese (vi-VN), and Mandarin Chinese (zh-Hans, zh-Hant), use a TTS voice trained on the specific language. Standard English TTS reading Yoruba will mangle the tones; this is not just an accent issue but a comprehension one.
9. Multiple concurrent producers
A single user may have multiple AAEP-emitting agents running simultaneously (e.g., a coding assistant in their IDE, a customer service bot in their browser, a productivity agent in their email). Your AT may receive events from all of them.
9.1 Producer identification
Every event carries producer.agent_id and producer.agent_name. Use these to:
- Prefix announcements with the producer name when ambiguous
- Apply different routing rules per producer
- Allow the user to mute specific producers
9.2 Session interleaving
Don't assume events arrive in session-grouped order. A user might be running:
- Session A (their coding assistant)
- Session B (their email assistant)
- Session C (a one-shot question to a search agent)
Events from all three may interleave. Use session_id to keep them straight in your internal state.
9.3 Critical-event priority
When multiple producers want to announce critical events simultaneously, you decide the queueing order. Common heuristics:
- Most recent producer the user interacted with gets priority
- User-configured priority order
- First-come, first-served for events of equal urgency
10. AT-specific guidance
10.1 Microsoft Narrator (Windows)
Narrator supports AAEP via a UIA-bridged subscription model. Your add-on or extension subscribes via Narrator's plugin API and translates AAEP events into Narrator's announcement primitives.
Key integration points:
- Register as a UIA pattern handler for the agent's UI element
- Use Narrator's
Speak()API for normal-urgency announcements - Use
SpeakInterruptible()for critical-urgency announcements - Surface confirmation events using Narrator's confirmation dialog
A complete add-on prototype is in ../examples/subscribers/narrator-bridge-prototype/.
10.2 NVDA (Windows)
NVDA's plugin API makes AAEP integration relatively straightforward. Use a global plugin that:
- Listens on a configured transport (stdio JSON-RPC or Unix socket are typical)
- Routes events to NVDA's
speech.speak()for announcements - Handles confirmations via NVDA's dialog framework
A worked NVDA add-on is in ../examples/subscribers/nvda-addon-prototype/.
10.3 JAWS (Windows)
JAWS scripts can interface with AAEP via a Python bridge. Use a JAWS script that loads a Python process which speaks AAEP messages through JAWS's SayString function.
10.4 VoiceOver (macOS / iOS)
macOS VoiceOver supports AAEP through accessibility notifications. Send NSAccessibilityAnnouncementRequestedNotification for normal events; use NSAccessibilityPriorityHigh for critical events.
iOS VoiceOver requires app-side integration: each AAEP-aware app embeds an AAEP subscriber that posts UIAccessibility.post(notification: .announcement, argument: ...).
10.5 TalkBack (Android)
TalkBack 14.1+ supports AAEP via Android's AccessibilityService API. Implement an AccessibilityService subclass that connects to a local AAEP socket and dispatches AccessibilityEvents with TYPE_ANNOUNCEMENT.
10.6 Orca (Linux)
Orca's plugin API supports Python plugins. Implement a plugin that connects via Unix domain socket and uses Orca's speak() API.
10.7 Voice control systems
Voice control AAEP integration usually inverts the typical role: you primarily send clarification replies and receive confirmations. Streaming output is often coalesced to completion only.
10.8 Switch input
Switch users typically configure AAEP for cognitive_load: "low" and prefer default_decision: "reject" to give them more time to switch-confirm intentionally.
10.9 Refreshable braille
Braille displays have limited cell counts (often 40 or 80 cells). Aggressively coalesce: paragraph- or completion-only is usually appropriate. Use the manual review pattern: surface a summary on the braille line; the user requests detail via a button if interested.
11. Privacy and data handling
11.1 What you receive
AAEP events may contain:
- The user's original natural-language request
- Summarized tool arguments (possibly including user-supplied PII)
- Streamed model output
- Producer-supplied descriptions of actions
You may NOT receive:
- Tool secrets, API keys, or system credentials (producers MUST NOT include these)
- Full conversation history (unless the producer explicitly includes it)
- Internal model reasoning details (unless surfaced in
summary_detailed)
11.2 What you should NOT log
By default, do not log:
- Full event payloads with
summary_normalcontaining user requests - Authentication tokens (these travel separately from event payloads)
- Cross-session correlation that could reveal user behavior patterns
If you need diagnostics, log envelope-only (type, event_id, session_id, timestamp, producer.agent_id). This is enough to debug timing issues without retaining user content.
11.3 User data in replies
When you send a clarification reply, the user's response is included in the response field. This data leaves your subscriber. Consider whether your AT's policies allow this; some health and education AT may need to redact PII before transmission.
12. Conformance and certification
12.1 What "AAEP-compliant subscriber" means
A subscriber that:
- Implements at least Conformance Level 1 (the protocol's minimum)
- Passes the subscriber half of the conformance test suite at its claimed level
- Honors the negotiated capabilities throughout each subscription
- Surfaces critical events to the user
12.2 Running the conformance suite
pip install aaep-conformance
aaep-conformance subscriber --connect <your-subscriber-endpoint> --level 2
The suite acts as a synthetic producer, exercising your subscriber against ~120 test cases. It generates conformance-report.json and HTML reports.
12.3 Publishing your conformance
Include in your AT's accessibility documentation:
- Conformance level claimed (1, 2, or 3)
- Date of last conformance run
- Link to the conformance report
- AAEP version supported
Example: "VoiceOver supports AAEP v1.0 at Conformance Level 2. Last verified: 2026-09-15. Report: [link]."
12.4 Reporting issues
If you find a producer claiming AAEP conformance whose behavior is non-conforming, file an issue on their repository. If your own AT product cannot pass a conformance test, file an issue on the AAEP repository — the spec may need clarification.
Where to go from here
- For the precise normative rules, return to the specification.
- For implementing the producer side, read the Implementer's Guide.
- For domain-specific extensions, read the Extensions Guide.
- For reference subscriber implementations, see
../examples/subscribers/. - For frequently asked questions, see the FAQ.
Welcome to the AAEP subscriber community. The protocol exists because of you.