MACP End-to-End Flow
Comprehensive walkthrough of how all MACP components work together — from protocol specification through Rust runtime, NestJS control plane, and TypeScript/Python SDKs
Status: Non-normative (explanatory). In case of conflict, the referenced RFCs are authoritative.
References: RFC-MACP-0001 | RFC-MACP-0002 | RFC-MACP-0003 | RFC-MACP-0004 | RFC-MACP-0005 | RFC-MACP-0006 | RFC-MACP-0012
Imagine three AI agents sitting around a virtual table. One is an architect, another a security reviewer, and the third a cost optimizer. They need to agree on a deployment strategy for a critical Q3 release. Each has its own expertise, its own biases, and its own definition of "good enough." Left to their own devices, they might negotiate forever, contradict each other, or worse — two of them might reach one conclusion while the third reaches another.
This is the problem MACP solves. And this document is the story of how it solves it — tracing a coordination request from the moment an agent first introduces itself, through session creation, deliberation, voting, resolution, and all the way to replay and audit. Along the way, we will meet every layer of the system and understand not just what it does, but why it was designed that way.
The Four Layers: Why Architecture Matters
Think of MACP as a layered system built around a single conviction: when autonomous agents need to produce one binding outcome, the rules of engagement cannot be left to convention. They must be enforced.
At the bottom sits a protocol specification — twelve RFCs that define exactly what "coordination" means. Above that, a Rust runtime acts as the impartial referee, enforcing every rule the spec defines. The control plane adds orchestration and observability, turning raw protocol events into something a human operator can watch in real-time. And at the top, TypeScript and Python SDKs give agent developers a typed, ergonomic interface so they never have to think about envelope serialization or gRPC plumbing.
```mermaid
flowchart TB
subgraph Clients["Agent Layer"]
TS["TypeScript SDK"]
PY["Python SDK"]
UI["UI / API Consumer"]
end
subgraph CP["Control Plane — NestJS"]
API["REST API + SSE"]
Executor["Run Executor"]
Normalizer["Event Normalizer"]
Projection["Projection Engine"]
DB[(PostgreSQL)]
end
subgraph RT["MACP Runtime — Rust"]
Kernel["Coordination Kernel"]
Modes["Mode Registry"]
Policy["Policy Evaluator"]
Storage["Storage Backend\nfile / rocksdb / redis"]
end
subgraph Spec["Protocol Specification"]
RFCs["RFCs 0001–0012"]
Schemas["Protobuf + JSON Schemas"]
Registries["Mode · Policy · Error Registries"]
end
UI -->|"HTTP / SSE"| API
API --> Executor
Executor -->|"gRPC bidirectional stream"| Kernel
Normalizer --> Projection --> DB
TS -->|"gRPC"| Kernel
PY -->|"gRPC"| Kernel
Kernel --> Modes --> Policy --> Storage
Spec -.->|"defines contracts for"| RT
Spec -.->|"defines contracts for"| CP
Spec -.->|"defines contracts for"| TS
Spec -.->|"defines contracts for"| PY
```

This layered design also means there are two distinct ways to use the system, depending on how much infrastructure you want in the loop:
- SDK-direct — Agents connect straight to the runtime via gRPC and manage their own session lifecycle. This is lightweight, fast, and requires no control plane at all. It is ideal for agent-to-agent coordination where no human needs to watch what is happening.
- Control-plane-mediated — A UI or API consumer submits an `ExecutionRequest` to the control plane, which opens a runtime session on the agents' behalf, sends kickoff messages, streams every event through a normalization pipeline, and builds real-time projections for the UI. This is the path you take when observability, audit, and replay matter.
Both patterns use the same runtime, the same protocol, and the same Protobuf wire format defined in RFC-MACP-0006. The control plane is an addition, not a replacement.
Introducing Themselves: Agent Creation and Registration
Before any coordination can happen, agents need to introduce themselves. In the physical world, you would exchange business cards. In MACP, agents publish a manifest — a structured declaration of who they are, what they can do, and how to reach them.
What goes into a manifest
An agent manifest (RFC-MACP-0005) answers a handful of essential questions: What is your name? What can you do? What coordination modes do you support? What data formats do you speak?
| Field | Required | Description |
|---|---|---|
| agent_id | Yes | Unique identifier |
| title | Yes | Human-readable name |
| description | Yes | What this agent does |
| supported_modes | Yes | Array of mode identifiers the agent can participate in |
| input_content_types | Yes | MIME types the agent accepts |
| output_content_types | Yes | MIME types the agent produces |
| transport_endpoints | No | Array of { transport, uri, content_types } |
| metadata | No | Arbitrary key-value pairs |
Think of our architect agent: its manifest might declare support for macp.mode.decision.v1 and macp.mode.proposal.v1, accept application/json input, and produce application/json output. The security reviewer might support the same modes but also list macp.mode.quorum.v1 — because in its world, some decisions require a formal approval threshold.
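To make this concrete, here is a sketch of what the architect's manifest might look like, with a check for the required fields from the table above. The endpoint URI and metadata values are invented for illustration, and the check mirrors the table rather than a normative schema.

```python
# Illustrative manifest for the architect agent (values invented), plus
# a minimal check for the required fields listed in the table above.
REQUIRED_FIELDS = {
    "agent_id", "title", "description",
    "supported_modes", "input_content_types", "output_content_types",
}

architect_manifest = {
    "agent_id": "architect-agent",
    "title": "Architect Agent",
    "description": "Proposes and evaluates deployment architectures",
    "supported_modes": ["macp.mode.decision.v1", "macp.mode.proposal.v1"],
    "input_content_types": ["application/json"],
    "output_content_types": ["application/json"],
    # Optional fields: transport_endpoints, metadata
    "transport_endpoints": [
        {"transport": "grpc", "uri": "grpc://architect.internal:50051",
         "content_types": ["application/json"]},
    ],
}

def missing_required(manifest: dict) -> set:
    """Return the required manifest fields that are absent."""
    return REQUIRED_FIELDS - manifest.keys()
```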
How agents discover each other
Discovery is the moment agents learn who else is out there. MACP supports four mechanisms, ordered from simplest to most sophisticated:
- Well-known URL — `https://<host>/.well-known/macp.json` returns the manifest as JSON. Simple, cacheable, works everywhere.
- GetManifest RPC — Programmatic discovery via gRPC. Pass an empty `agent_id` to get the serving runtime's own manifest.
- ListModes RPC — Returns only standards-track modes (`macp.mode.decision.v1`, etc.), useful for capability probing.
- Registry services — Organizations can index manifests for fleet-wide discovery, letting agents find each other across deployment boundaries.
```ts
// TypeScript — discover runtime capabilities
const manifest = await client.getManifest();
console.log(manifest.supportedModes); // ['macp.mode.decision.v1', ...]
const modes = await client.listModes();
// Returns ModeDescriptor[] with name, version, message types, determinism class
```

```python
# Python — discover runtime capabilities
manifest = client.get_manifest()
print(manifest.supported_modes)
modes = client.list_modes()
```

With manifests published and discovery complete, our three agents now know about each other. The architect knows the security reviewer can participate in Decision mode. The cost optimizer knows the runtime supports the governance policies it needs. It is time to connect.
Connecting to the Runtime: SDK Initialization
Both SDKs follow a two-layer design that keeps things clean. At the bottom, a low-level MacpClient handles gRPC transport, authentication, and connection management. On top of that, high-level mode sessions (DecisionSession, ProposalSession, and so on) provide a typed, ergonomic API for each coordination mode. You never have to manually construct a Protobuf envelope if you do not want to.
Creating a client
The first step for any agent is creating a client connection to the runtime. Here is what that looks like:
```ts
// TypeScript
import { MacpClient, Auth } from '@macp/sdk';

const client = new MacpClient({
  address: '127.0.0.1:50051',
  secure: false, // TLS required in production
  auth: Auth.bearer('my-token'), // or Auth.devAgent('agent-id')
  defaultDeadlineMs: 10_000,
});
```

```python
# Python
from macp_sdk import MacpClient, AuthConfig

client = MacpClient(
    target="127.0.0.1:50051",
    secure=False,
    auth=AuthConfig.for_bearer("my-token"),  # or AuthConfig.for_dev_agent("agent-id")
    default_timeout=10.0,
)
```

The handshake: version and capability negotiation
Before anything else happens, the client and runtime perform a handshake — a version and capability negotiation defined in RFC-MACP-0001. This is not just a formality. The handshake establishes which protocol version both sides will speak, which optional features are available, and which coordination modes the runtime has loaded. Without it, neither side can make any assumptions about what the other supports.
```mermaid
sequenceDiagram
participant Agent as SDK Client
participant RT as Runtime
Agent->>RT: Initialize(client_name, client_version, capabilities)
RT->>Agent: InitializeResult(selected_version, runtime_info, supported_modes, capabilities)
Note over Agent,RT: Capabilities negotiated:<br/>sessions.stream, cancellation,<br/>progress, manifest, mode_registry,<br/>roots, policy_registry
```

The runtime responds with its supported protocol version, the list of available modes, and which optional capabilities it supports. The SDK stores these for the lifetime of the client — no need to re-negotiate on every call.
```ts
const init = await client.initialize();
// init.runtimeInfo — { name, version }
// init.supportedModes — ModeDescriptor[]
// init.capabilities — { sessions, cancellation, progress, ... }
```

At this point, our architect agent has a live connection to the runtime. It knows the runtime supports Decision mode v1, that streaming is available, and that the policy registry is loaded. Now the question becomes: who kicks off the coordination? In many real-world scenarios, the answer is the control plane.
Orchestrating the Run: Control Plane Lifecycle
When coordination is mediated through the control plane — as it often is in production deployments where humans need visibility — the process follows a managed run lifecycle with well-defined state transitions. A "run" is the control plane's concept of a single coordination episode, from the moment someone requests it to the moment it completes, fails, or is cancelled.
The run state machine
The state machine is deliberately simple. Runs can only move forward — there is no going back from failed to running, and no way to resurrect a cancelled run. This simplicity is a feature: it makes the system easy to reason about and impossible to put into an inconsistent state.
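The forward-only rule is small enough to sketch directly. This is an illustrative transition table mirroring the run states just described, not the control plane's actual implementation:

```python
# Sketch of the forward-only run state machine. Anything not listed in
# the table is an illegal transition; terminal states allow nothing.
ALLOWED = {
    "queued": {"starting"},
    "starting": {"binding_session", "failed"},
    "binding_session": {"running", "failed"},
    "running": {"completed", "failed", "cancelled"},
    "completed": set(), "failed": set(), "cancelled": set(),  # terminal
}

def can_transition(current: str, target: str) -> bool:
    return target in ALLOWED.get(current, set())
```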
```mermaid
stateDiagram-v2
[*] --> queued: POST /runs
queued --> starting: executor picks up
starting --> binding_session: runtime session opened
binding_session --> running: kickoff messages sent
running --> completed: session resolved
running --> failed: runtime error / stream lost
running --> cancelled: POST /runs/:id/cancel
starting --> failed: runtime unavailable
binding_session --> failed: session start rejected
completed --> [*]
failed --> [*]
cancelled --> [*]
```

From request to coordination: the execution flow
Let us follow what happens when a human operator (or an automated pipeline) submits a coordination request through the control plane API. The sequence is precise, and every step has a reason:
```mermaid
sequenceDiagram
participant UI as API Consumer
participant API as Control Plane API
participant Exec as Run Executor
participant Mgr as Run Manager
participant RTP as Runtime Provider
participant RT as MACP Runtime
UI->>API: POST /runs (ExecutionRequest)
API->>Exec: launch()
Exec->>Mgr: createRun() → queued
Exec->>Mgr: markStarted() → starting
Exec->>RTP: initialize()
RTP->>RT: Initialize RPC
RT-->>RTP: InitializeResult
Exec->>RTP: openSession()
RTP->>RT: StreamSession (bidirectional gRPC)
RT-->>RTP: SessionStart Ack
Exec->>Mgr: bindSession() → binding_session
Exec->>RTP: send kickoff messages
RTP->>RT: Send envelopes
Exec->>Mgr: markRunning() → running
loop Event stream
RT-->>RTP: Accepted envelopes
RTP-->>Exec: Raw events
Exec->>Exec: Normalize → Canonical events
Exec->>Exec: Update projection
end
RT-->>RTP: Session resolved
Exec->>Mgr: markCompleted() → completed
API-->>UI: SSE stream / GET /runs/:id/state
```

What goes into an ExecutionRequest
The control plane needs to know everything up front. The ExecutionRequest is a fully resolved specification of what coordination should happen, containing:
- mode — Which coordination mode to use (e.g., `macp.mode.decision.v1`)
- runtime — Runtime address and kind (`rust`)
- session — Participant list, TTL, policy version, context
- kickoff — Array of initial messages to send after session creation
- execution — Mode (`live`, `replay`, `sandbox`), tags, metadata
There is a design philosophy at work here: the control plane never makes assumptions. It does not guess which mode you want or which agents should participate. Everything is declared explicitly, making runs reproducible and auditable.
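As a sketch, the request for our deployment-strategy scenario might look like this. The exact field layout is illustrative — assembled from the five sections listed above, not copied from a normative schema:

```python
# Hypothetical ExecutionRequest for the deployment-strategy scenario.
# Field names and nesting are invented for illustration.
execution_request = {
    "mode": "macp.mode.decision.v1",
    "runtime": {"kind": "rust", "address": "127.0.0.1:50051"},
    "session": {
        "participants": ["architect-agent", "security-agent", "cost-agent"],
        "ttl_ms": 120_000,
        "policy_version": "policy.majority",
        "context": {},
    },
    "kickoff": [
        {"message_type": "Proposal",
         "payload": {"proposal_id": "p1", "option": "Blue-green deploy"}},
    ],
    "execution": {"mode": "live", "tags": ["q3-release"], "metadata": {}},
}
```

Everything the run needs is declared here — nothing is inferred later, which is what makes the run reproducible.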
Persistence: everything gets recorded
The control plane persists everything to PostgreSQL. This is not optional — it is fundamental to the system's ability to provide observability, replay, and audit. Here is what gets stored:
| Table | Purpose |
|---|---|
| runs | Run metadata, status, timing, error info |
| runtime_sessions | Bound session metadata, mode, capabilities |
| run_events_raw | Raw runtime events (append-only) |
| run_events_canonical | Normalized events for UI consumption |
| run_projections | Current state cache (built from events) |
| run_metrics | Token usage, cost estimates, event counts |
| run_artifacts | Trace bundles, logs, generated reports |
The separation between run_events_raw and run_events_canonical is worth noting. Raw events are preserved exactly as the runtime emitted them — they are the source of truth. Canonical events are a normalized, UI-friendly representation. By keeping both, the system can always re-derive canonical events from raw ones, which matters for replay and debugging.
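A minimal sketch of that re-derivation, with invented event shapes — the point is that the canonical form is just a projection of fields already present in the raw record:

```python
# Illustrative normalization step: raw runtime events (source of truth)
# are mapped into a flatter, UI-friendly canonical shape. The field
# names and nesting here are invented for the sketch.
def to_canonical(raw: dict) -> dict:
    env = raw["envelope"]
    return {
        "session_id": env["session_id"],
        "type": env["message_type"],
        "sender": env["sender"],
        "at_ms": env["timestamp_unix_ms"],
    }
```

Because the function is a pure projection, canonical rows can always be rebuilt from `run_events_raw` after the fact.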
Opening the Session: Where Coordination Begins
Now we arrive at the heart of the protocol. A coordination session begins with a SessionStart message — and this is the most validated message in the entire system. The reason is simple: everything that follows depends on the session being correctly configured. A bad session start would cascade into invalid state transitions, policy mismatches, and non-deterministic replays. So the runtime checks everything.
The validation gauntlet
When a SessionStart arrives, it passes through twelve validation steps before the session is created. Each step catches a different class of error, and the order matters — cheap checks (authentication, rate limiting) come before expensive ones (mode resolution, policy lookup).
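One of those cheap structural checks — the session_id format test — can be sketched with nothing but the standard library. This is only the format half of the rule; the runtime's actual validation is richer:

```python
import uuid

# Sketch of the session_id format check: must parse as a UUID and be
# version 4 or 7, per the spec's format requirement.
def valid_session_id(session_id: str) -> bool:
    try:
        u = uuid.UUID(session_id)
    except ValueError:
        return False
    return u.version in (4, 7)
```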
```mermaid
sequenceDiagram
participant Agent
participant RT as Runtime
Agent->>RT: Send(Envelope with SessionStart)
RT->>RT: 1. Authenticate sender (bearer / mTLS / JWT)
RT->>RT: 2. Derive sender identity from auth context
RT->>RT: 3. Rate limit check (60 SessionStart/min default)
RT->>RT: 4. Validate envelope structure (macp_version, mode, message_type)
RT->>RT: 5. Validate session_id format (UUID v4/v7, ≥128 bits entropy)
RT->>RT: 6. Check session_id not already in use
RT->>RT: 7. Validate SessionStartPayload
RT->>RT: 8. Resolve mode (must be registered)
RT->>RT: 9. Resolve policy (policy_version → registry lookup)
RT->>RT: 10. Create session: OPEN state
RT->>RT: 11. Append to storage (commit point)
RT->>RT: 12. Call mode.on_session_start()
RT->>Agent: Ack(ok=true, session_state=OPEN)
```

What goes into a SessionStart
The SessionStartPayload declares everything the session needs to function. Some fields are required — you cannot start a session without participants or a TTL. Others are optional but powerful, like binding a governance policy or freezing ambient context.
| Field | Required | Description |
|---|---|---|
| participants | Yes | Non-empty list of declared participant identifiers |
| mode_version | Yes | Semantic version of the mode (immutable for session) |
| configuration_version | Yes | Configuration profile version (immutable for session) |
| ttl_ms | Yes | Session deadline in milliseconds (1 – 86,400,000) |
| policy_version | No | Governance policy identifier; empty resolves to policy.default |
| context | No | Frozen context bound at session creation |
| roots | No | Root descriptors for ambient context |
| intent | No | Human-readable session purpose |
Version binding: the key to determinism
Here is a design decision that permeates the entire system. Three versions are immutably bound at session creation and cannot change for the session's lifetime:
- mode_version — Which semantic profile of the mode to use
- configuration_version — Voting/evaluation/acceptance profile
- policy_version — Governance rules
Why immutable? Because of deterministic replay. If you replay the same accepted history under the same bound versions, the runtime MUST produce identical state transitions. If versions could change mid-session, replay would be meaningless — you could never be sure whether a different outcome was caused by different agent behavior or different runtime configuration.
Starting our deployment decision
Let us return to our running example. The architect agent decides it is time to coordinate on the Q3 deployment strategy. Here is what that looks like through the SDKs:
```ts
// TypeScript — start a Decision session
const session = new DecisionSession(client, {
  modeVersion: '1.0.0',
  configurationVersion: '1.0.0',
  policyVersion: 'policy.majority',
  auth: Auth.bearer('coordinator-token'),
});

await session.start({
  intent: 'Choose deployment strategy for Q3 release',
  participants: ['architect-agent', 'security-agent', 'cost-agent'],
  ttlMs: 120_000, // 2 minutes
});
```

```python
# Python — start a Decision session
session = DecisionSession(client, policy_version="policy.majority")
session.start(
    intent="Choose deployment strategy for Q3 release",
    participants=["architect-agent", "security-agent", "cost-agent"],
    ttl_ms=120_000,
)
```

Notice the `policy.majority` policy version. This tells the runtime to use majority voting rules when the time comes to evaluate a commitment. The architect agent has declared that a simple majority is enough — the cost optimizer does not get veto power. This is governance embedded in the protocol, not left to ad-hoc agent logic.
The session is now OPEN. Our three agents have two minutes to reach a decision.
The Admission Pipeline: Every Message Earns Its Place
With the session open, agents can start sending messages — proposals, evaluations, votes. But not every message gets through. Every single message from a participant passes through a strict admission pipeline before it can enter the session's accepted history. This is where the runtime earns its role as an impartial referee.
The pipeline, step by step
The pipeline is a chain of checks, each one acting as a gate. Fail any gate, and the message is rejected with a structured error. Pass them all, and the message is appended to the session's authoritative log.
```mermaid
flowchart LR
A["Incoming\nEnvelope"] --> B["AuthN\nbearer / mTLS\nJWT / dev-header"]
B --> C["Sender\nDerivation"]
C --> D["Rate\nLimiting"]
D --> E["Envelope\nValidation"]
E --> F["Session\nLookup"]
F --> G["Session\nOPEN?"]
G --> H["Deduplication\nmessage_id"]
H --> I["Participant\nCheck"]
I --> J["Mode\nAuthorization"]
J --> K["Append to\nLog"]
K --> L["Mode\nDispatch"]
```

Two steps in this pipeline deserve special attention.
Authentication: you are who the runtime says you are
The runtime supports multiple authentication mechanisms (RFC-MACP-0004), from bearer tokens for typical production use to mTLS for high-security deployments:
| Mechanism | Header | Use Case |
|---|---|---|
| Bearer token | Authorization: Bearer <token> | Production — tokens issued by control plane |
| mTLS | TLS client certificate | High-security deployments |
| JWT / OIDC | Authorization: Bearer <jwt> | Federated identity |
| Dev header | x-macp-agent-id: <id> | Local development only |
Here is a design choice that matters enormously: the sender field in the Envelope is always overwritten by the runtime from the authenticated identity. Agents cannot self-assert their sender. Period. This single rule eliminates an entire class of impersonation attacks. When the security reviewer sees a proposal from "architect-agent," it knows the runtime verified that identity — not just that someone claimed to be the architect.
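That rule can be shown in miniature, with illustrative names: whatever the client put in `sender`, admission replaces it with the identity derived from authentication.

```python
# Miniature of the sender-overwrite invariant (names are illustrative,
# not the runtime's API): the client-supplied sender is never trusted.
def admit(envelope: dict, authenticated_identity: str) -> dict:
    admitted = dict(envelope)
    admitted["sender"] = authenticated_identity  # always overwritten
    return admitted

# A spoof attempt: cost-agent claims to be the architect.
spoofed = {"message_type": "Proposal", "sender": "architect-agent"}
accepted = admit(spoofed, authenticated_identity="cost-agent")
```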
Per-token authorization
Each bearer token carries authorization metadata that constrains what the agent can do:
```json
{
  "token": "abc123...",
  "sender": "architect-agent",
  "allowed_modes": ["macp.mode.decision.v1", "macp.mode.task.v1"],
  "can_start_sessions": true,
  "max_open_sessions": 10
}
```

Rate limiting: preventing runaway agents
Default limits enforced per authenticated sender keep any single agent from overwhelming the system:
| Limit | Default |
|---|---|
| SessionStart messages per minute | 60 |
| Session-scoped messages per minute | 600 |
| Maximum payload size | 1 MB |
In a world of autonomous agents, rate limiting is not just about fairness — it is about safety. An agent stuck in a retry loop should not be able to saturate the runtime and starve other sessions.
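A sliding-window counter is one simple way to enforce per-sender limits like these. This sketch is illustrative — the runtime's actual algorithm is not specified here:

```python
from collections import defaultdict, deque

# Illustrative per-sender sliding-window limiter for a default like
# "60 SessionStart messages per minute".
class SlidingWindowLimiter:
    def __init__(self, limit: int = 60, window_ms: int = 60_000):
        self.limit = limit
        self.window_ms = window_ms
        self.events = defaultdict(deque)  # sender -> recent timestamps (ms)

    def allow(self, sender: str, now_ms: int) -> bool:
        q = self.events[sender]
        while q and q[0] <= now_ms - self.window_ms:
            q.popleft()  # drop timestamps that fell out of the window
        if len(q) >= self.limit:
            return False  # sender has exhausted the window
        q.append(now_ms)
        return True
```

Because counts are tracked per authenticated sender, one agent stuck in a retry loop exhausts only its own budget.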
Mode-specific authorization: not everyone can do everything
Each coordination mode defines precisely who can send which message types. This is not a suggestion — it is enforced by the runtime at the protocol level:
| Mode | Message Type | Authorized Sender |
|---|---|---|
| Decision | Proposal, Evaluation, Objection, Vote | Any declared participant |
| Decision | Commitment | Session initiator (default) |
| Proposal | Proposal, CounterProposal, Accept, Reject | Any participant |
| Proposal | Withdraw | Author of referenced proposal only |
| Task | TaskRequest | Session initiator |
| Task | TaskUpdate, TaskComplete, TaskFail | Active assignee only |
| Handoff | HandoffOffer, HandoffContext | Current responsibility owner |
| Handoff | HandoffAccept, HandoffDecline | Target participant of offer |
| Quorum | Approve, Reject, Abstain | Any eligible declared participant |
| Quorum | ApprovalRequest, Commitment | Session initiator |
Back in our scenario: the cost optimizer cannot unilaterally commit the decision — only the architect (as session initiator) can do that. The security reviewer can propose, evaluate, and vote, but it cannot commit. These rules are not configurable per-session — they are baked into the mode definition. This is by design: mode semantics should be predictable and auditable, not customized into unpredictability.
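The Decision row of that table reduces to a single authorization function. Names and shapes here are illustrative, not the runtime's API:

```python
# Sketch of Decision-mode sender authorization, per the table above:
# any declared participant may propose/evaluate/object/vote, but only
# the session initiator may commit (the default commitment authority).
def authorize_decision(msg_type: str, sender: str,
                       participants: list, initiator: str) -> bool:
    if sender not in participants:
        return False  # non-participants are never authorized
    if msg_type == "Commitment":
        return sender == initiator
    return msg_type in {"Proposal", "Evaluation", "Objection", "Vote"}

participants = ["architect-agent", "security-agent", "cost-agent"]
```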
Mode Invocation: How Coordination Actually Unfolds
Now that messages are flowing through the admission pipeline, it is time to understand what happens when they reach the mode itself. Modes are the semantic heart of MACP — they define the structure of coordination. Is it a vote? A negotiation? A task delegation? A responsibility transfer? The mode decides.
How the runtime dispatches to modes
When an accepted envelope reaches the mode layer, the runtime looks up the mode by name in its registry and calls into it through a well-defined trait interface. The mode can authorize the sender, process the message, update its internal state, and optionally resolve the session.
```mermaid
flowchart LR
A["Accepted\nEnvelope"] --> B["Mode Registry\nlookup by name"]
B --> C["mode.authorize_sender\nenvelope, session"]
C --> D["mode.on_message\nenvelope, session state"]
D --> E{"ModeResponse"}
E -->|NoOp| F["No state change"]
E -->|PersistState| G["Update mode state"]
E -->|Resolve| H["Session → RESOLVED"]
E -->|PersistAndResolve| I["Update + Resolve"]
```

The Mode trait in the Rust runtime is deliberately minimal. Every mode must implement exactly three methods — session start handling, message handling, and sender authorization. This constraint ensures modes are predictable, testable, and composable:
```rust
trait Mode: Send + Sync {
    fn on_session_start(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn on_message(&self, session: &Session, env: &Envelope)
        -> Result<ModeResponse, MacpError>;
    fn authorize_sender(&self, session: &Session, env: &Envelope)
        -> Result<(), MacpError>;
}
```

The five standards-track modes
MACP ships with five coordination modes, each designed for a different interaction pattern. They range from structured group decision-making to simple task delegation:
| Mode | Identifier | Participant Model | Determinism |
|---|---|---|---|
| Decision | macp.mode.decision.v1 | Declared | Semantic-deterministic |
| Proposal | macp.mode.proposal.v1 | Peer | Semantic-deterministic |
| Task | macp.mode.task.v1 | Orchestrated | Structural-only |
| Handoff | macp.mode.handoff.v1 | Delegated | Context-frozen |
| Quorum | macp.mode.quorum.v1 | Quorum | Semantic-deterministic |
Let us walk through each one, because the differences matter.
Decision Mode: our deployment strategy scenario
This is the mode our three agents are using. Decision mode provides structured choice among proposals with explicit evaluation, objection, and voting phases. It is the most ceremony-heavy mode, but that ceremony exists for a reason — when the stakes are high enough to warrant three agents deliberating, you want a clear audit trail of who proposed what, who evaluated it, and how the vote went.
```mermaid
stateDiagram-v2
[*] --> Proposing: SessionStart
Proposing --> Evaluating: Proposal(s) submitted
Evaluating --> Voting: Evaluation(s) submitted
Voting --> Committed: Commitment accepted
Committed --> [*]
note right of Proposing: Any participant submits proposals
note right of Evaluating: Participants evaluate with APPROVE/REVIEW/BLOCK/REJECT
note right of Voting: Participants vote APPROVE/REJECT/ABSTAIN
note right of Committed: Initiator binds outcome
```

In our scenario, this is where things get interesting. The architect proposes blue-green deployment. The security reviewer evaluates it — APPROVE, with high confidence. The cost optimizer votes in favor. Here is the code:
```ts
// Decision Mode — TypeScript
await session.propose({ proposalId: 'p1', option: 'Blue-green deploy', rationale: 'Zero downtime' });
await session.evaluate({ proposalId: 'p1', recommendation: 'APPROVE', confidence: 0.9 });
await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
await session.commit({ action: 'decision.accepted', authorityScope: 'session', reason: 'Majority approved' });
```

Proposal Mode: when agents need to negotiate
Sometimes you do not want a formal vote — you want agents to negotiate. Proposal mode supports bounded negotiation with proposals and counterproposals. Think of it as a structured back-and-forth that must eventually converge or terminate.
```mermaid
stateDiagram-v2
[*] --> Negotiating: SessionStart
Negotiating --> Negotiating: Proposal / CounterProposal
Negotiating --> Converged: Accept convergence
Negotiating --> Rejected: Terminal Reject
Converged --> Committed: Commitment
Rejected --> Committed: Commitment
Committed --> [*]
```

Task Mode: delegation with accountability
Task mode is the simplest interaction pattern: one agent requests work, another performs it. But even here, the protocol adds value — it tracks the task through acceptance, progress updates, and completion or failure, ensuring the requester always knows the current state.
```mermaid
stateDiagram-v2
[*] --> Requested: SessionStart + TaskRequest
Requested --> InProgress: TaskAccept
Requested --> Unassigned: TaskReject
InProgress --> Completed: TaskComplete
InProgress --> Failed: TaskFail
Completed --> Committed: Commitment
Failed --> Committed: Commitment
Committed --> [*]
```

Handoff Mode: passing the baton
When one agent needs to transfer responsibility to another — along with the context needed to continue — Handoff mode provides a structured transfer protocol. The offering agent can attach context, and the receiving agent explicitly accepts or declines.
```mermaid
stateDiagram-v2
[*] --> Offered: SessionStart + HandoffOffer
Offered --> Enriched: HandoffContext (optional)
Enriched --> Accepted: HandoffAccept
Enriched --> Declined: HandoffDecline
Offered --> Accepted: HandoffAccept
Offered --> Declined: HandoffDecline
Accepted --> Committed: Commitment
Declined --> Committed: Commitment
Committed --> [*]
```

Quorum Mode: threshold-based approval
Quorum mode is for situations where you need a specific number of approvals from a pool of participants — think code review approvals, compliance sign-offs, or multi-party authorization. The mode tracks votes against a threshold and resolves when the threshold is met or becomes unreachable.
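The "threshold unreachable" test reduces to simple arithmetic. In this sketch abstentions are treated as neutral — in reality that interpretation is itself a policy knob:

```python
# Sketch of the Quorum-mode reachability check: once the approvals still
# obtainable from outstanding voters cannot reach the requirement, the
# session is eligible for a negative commitment. Abstentions are treated
# as neutral here (an assumption; policies may interpret them otherwise).
def threshold_unreachable(approvals: int, rejections: int, abstentions: int,
                          eligible: int, required_approvals: int) -> bool:
    outstanding = eligible - approvals - rejections - abstentions
    return approvals + outstanding < required_approvals
```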
```mermaid
stateDiagram-v2
[*] --> Voting: SessionStart + ApprovalRequest
Voting --> Voting: Approve / Reject / Abstain
Voting --> ThresholdMet: required_approvals reached
Voting --> ThresholdUnreachable: remaining cannot reach threshold
ThresholdMet --> Committed: Commitment (positive)
ThresholdUnreachable --> Committed: Commitment (negative)
Committed --> [*]
```

The one rule that unifies all five modes
Across all five modes, there is a single invariant that matters more than any other: only a Commitment message resolves the session. Intermediate outcome messages — TaskComplete, HandoffAccept, Approve — make the session eligible for commitment but do not transition the session to RESOLVED.
This is a deliberate design choice, and it is worth pausing to understand why. In a world of autonomous agents, the protocol needs a single, unambiguous moment when the outcome becomes binding. By separating "the work is done" from "the outcome is committed," MACP gives the initiator (or the policy-designated authority) final say. It also creates a clean audit point: when you see a Commitment in the log, you know the session is over and the outcome is authoritative.
Governance Built In: Policy Application
Our architect agent chose policy.majority when starting the session. But what does that actually mean? How does the runtime enforce governance rules? This is where MACP's policy system comes in — declarative governance rules that constrain how modes operate, defined in RFC-MACP-0012.
A two-phase lifecycle: resolve, then evaluate
Policies have an elegant two-phase lifecycle. In the first phase, at session start, the runtime resolves the policy — looking it up in the registry, validating it, and binding it immutably to the session. In the second phase, when a Commitment message arrives, the runtime evaluates the policy against the session's accumulated history to decide whether the commitment is allowed.
```mermaid
sequenceDiagram
participant Agent
participant RT as Runtime
participant PR as Policy Registry
participant PE as Policy Evaluator
Note over Agent,PE: Phase 1: Resolution at SessionStart
Agent->>RT: SessionStart(policy_version="policy.majority")
RT->>PR: Lookup "policy.majority"
PR-->>RT: PolicyDescriptor (rules, schema_version)
RT->>RT: Bind policy immutably to session
Note over Agent,PE: Phase 2: Evaluation at Commitment
Agent->>RT: Commitment(action, reason)
RT->>PE: Evaluate(policy_rules, accepted_history, participants)
PE->>PE: Pure function — no I/O, no wall-clock, no randomness
PE-->>RT: PolicyDecision::Allow
RT->>Agent: Ack(ok=true, session_state=RESOLVED)
```

How policy resolution works
The resolution process is straightforward but strict:
1. Extract `policy_version` from `SessionStartPayload`
2. If empty, resolve to `policy.default` (no additional constraints)
3. If non-empty, look up in policy registry
4. If not found, reject with `UNKNOWN_POLICY_VERSION`
5. If mode mismatch (policy targets different mode), reject with `INVALID_POLICY_DEFINITION`
6. Store resolved `PolicyDescriptor` on the session — immutable for its lifetime
Notice step 5: a policy designed for Quorum mode cannot be used in a Decision mode session. This prevents subtle configuration errors where governance rules do not match the coordination semantics.
What policies can control
Each mode exposes different policy knobs. The policy system is not one-size-fits-all — it adapts to the semantics of the mode it governs:
| Mode | Policy Controls |
|---|---|
| Decision | Voting algorithm (majority/supermajority/weighted), quorum requirements, objection veto thresholds, commitment authority |
| Proposal | Acceptance criteria (all_parties/counterparty/initiator), counter-proposal round limits, terminal rejection |
| Task | Allow reassignment on reject, require output on completion |
| Handoff | Implicit accept timeout, commitment authority |
| Quorum | Threshold override, abstention interpretation (neutral/implicit_reject/ignored) |
In our scenario, policy.majority tells the Decision mode that a simple majority of votes is enough for the architect to commit the outcome. If the security reviewer had wanted veto power, a different policy — perhaps one with objection thresholds or supermajority requirements — would have been needed. The point is that these governance decisions are made explicitly at session creation, not implicitly during coordination.
The determinism guarantee
This is perhaps the most important property of the policy system. Policy evaluation is a pure function of:
- The resolved policy rules (immutable for the session)
- The accumulated accepted message history
- The session's declared participants
It MUST NOT depend on wall-clock time, external services, randomness, or any state outside the session boundary. This ensures that policy decisions are identical during replay. If you replay a session and the same history leads to a different policy decision, something is deeply wrong.
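A minimal sketch of that purity property, assuming a hypothetical majority rule over vote messages (`evaluate_commitment` and its field names are invented for illustration):

```python
def evaluate_commitment(rules: dict, accepted_history: list, participants: list) -> bool:
    """Pure evaluation: the decision depends only on these three arguments.
    No clock reads, no I/O, no randomness anywhere in the body."""
    approvals = sum(
        1 for msg in accepted_history
        if msg["type"] == "Vote" and msg["vote"] == "APPROVE"
    )
    return approvals > len(participants) * rules.get("threshold", 0.5)

history = [
    {"type": "Vote", "sender": "architect", "vote": "APPROVE"},
    {"type": "Vote", "sender": "security", "vote": "APPROVE"},
    {"type": "Vote", "sender": "cost", "vote": "REJECT"},
]
agents = ["architect", "security", "cost"]

# The replay invariant: identical inputs always yield an identical decision.
assert evaluate_commitment({"threshold": 0.5}, history, agents) == \
       evaluate_commitment({"threshold": 0.5}, list(history), list(agents))
```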
The Wire: Message Flow and Results
We have talked about what messages mean. Now let us talk about how they travel. Every MACP message — proposals, votes, commitments, signals — is wrapped in a canonical Envelope (RFC-MACP-0001). The Envelope is the universal container that carries any message type through the system.
The Envelope: one format to carry them all
message Envelope {
string macp_version = 1; // Protocol version (e.g., "2026-03-02")
string mode = 2; // Empty for ambient signals
string message_type = 3; // Discriminator (e.g., "Proposal", "Vote")
string message_id = 4; // Unique ID for idempotency
string session_id = 5; // Empty for ambient signals
string sender = 6; // Authenticated identity (runtime-derived)
int64 timestamp_unix_ms = 7; // Informational timestamp
bytes payload = 8; // Mode-defined content (protobuf-encoded)
}
The design here is worth appreciating. The Envelope separates routing information (session, mode, sender) from content (payload). The message_id enables idempotent delivery — send the same message twice, and the runtime will deduplicate. The sender field, as we discussed, is always runtime-derived from authentication, never self-asserted.
The Send/Ack cycle: truth is authoritative
The primary message pattern is deceptively simple — unary Send followed by Ack — but the semantics are precise:
sequenceDiagram
participant A as Agent A
participant RT as Runtime
participant B as Agent B (streaming)
A->>RT: Send(Envelope) — unary gRPC
RT->>RT: Admission pipeline<br/>(auth → validate → dedup → append)
RT->>A: Ack(ok=true, session_state, accepted_at)
RT->>B: StreamSession: Accepted Envelope (in order)
Note over A,B: Ack is authoritative per-message.<br/>StreamSession delivers accepted<br/>envelopes in order to subscribers.
The Ack is the runtime's authoritative verdict on a message. It tells the sender not just whether the message was accepted, but the current session state after processing:
| Field | Description |
|---|---|
| `ok` | Whether the message was accepted |
| `duplicate` | Whether the `message_id` was already seen |
| `message_id` | Reference to the sent message |
| `session_id` | Session context |
| `accepted_at_unix_ms` | Server-side acceptance timestamp |
| `session_state` | Current session state after processing |
| `error` | Error details if `ok=false` |
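The dedup behavior can be sketched as a per-session set of seen message IDs. This is a heavy simplification of the admission pipeline, and whether a duplicate is acknowledged with `ok=true, duplicate=true` or NACKed with `DUPLICATE_MESSAGE` is a convention choice; the sketch assumes the former:

```python
from dataclasses import dataclass, field

@dataclass
class Ack:
    ok: bool
    duplicate: bool
    message_id: str
    session_state: str

@dataclass
class Session:
    state: str = "OPEN"
    seen_ids: set = field(default_factory=set)

    def admit(self, message_id: str) -> Ack:
        """Simplified admission: reject on closed session, dedup by message_id."""
        if self.state != "OPEN":
            return Ack(False, False, message_id, self.state)
        if message_id in self.seen_ids:
            # Idempotent delivery: acknowledge the duplicate, never re-apply it.
            return Ack(True, True, message_id, self.state)
        self.seen_ids.add(message_id)
        return Ack(True, False, message_id, self.state)
```

This is why retrying a send after a transport failure is safe: the worst case is a `duplicate=true` acknowledgment.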
Streaming: watching coordination unfold
The StreamSession RPC provides a bidirectional gRPC stream bound to a single session. Subscribers receive accepted envelopes in authoritative order — the order the runtime accepted them, which is the canonical ordering for replay and audit.
// TypeScript — streaming
const stream = client.openStream({ auth: Auth.bearer('observer-token') });
// Send via stream
await stream.send(envelope);
// Receive accepted envelopes
for await (const received of stream.responses()) {
console.log(received.messageType, received.sender);
// Process in acceptance order
}

# Python — streaming
stream = client.open_stream()
stream.send(envelope)
for envelope in stream.responses(timeout=30.0):
print(f"{envelope.message_type} from {envelope.sender}")
Client-side projections: making sense of the stream
Raw envelopes are useful, but agents usually want to know higher-level things: "How many votes does my proposal have? Is there a majority winner? Has anyone raised a blocking objection?" Both SDKs maintain client-side projections — pure state machines that track accepted envelopes and derive higher-level state locally:
// After voting
const totals = session.projection.voteTotals();
// { 'proposal-1': 3, 'proposal-2': 1 }
const winner = session.projection.majorityWinner();
// 'proposal-1'
const blocking = session.projection.hasBlockingObjection('proposal-1');
// false
These projections are "pure" in the functional programming sense — given the same sequence of accepted envelopes, they always produce the same state. This makes them safe for use in agent decision logic, because the agent's view of the session is always consistent with the runtime's authoritative ordering.
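The projections above can be sketched as pure folds over the list of accepted envelopes. Field names and the dict-shaped envelopes are illustrative; the SDK projections are richer:

```python
from collections import Counter
from typing import Optional

def vote_totals(accepted: list) -> Counter:
    """Pure projection: fold accepted envelopes into per-proposal approval counts."""
    totals = Counter()
    for env in accepted:
        if env["message_type"] == "Vote" and env["vote"] == "APPROVE":
            totals[env["proposal_id"]] += 1
    return totals

def majority_winner(accepted: list, participant_count: int) -> Optional[str]:
    """A proposal wins once its approvals exceed half the declared participants."""
    for proposal, count in vote_totals(accepted).items():
        if count > participant_count / 2:
            return proposal
    return None

accepted = [
    {"message_type": "Vote", "vote": "APPROVE", "proposal_id": "proposal-1"},
    {"message_type": "Vote", "vote": "APPROVE", "proposal_id": "proposal-1"},
    {"message_type": "Vote", "vote": "REJECT", "proposal_id": "proposal-2"},
]
```

Because the fold never consults anything outside its input list, two clients holding the same accepted history always compute the same totals.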
The Other Channel: Ambient Signals
Not everything in a multi-agent system is coordination. Sometimes agents need to broadcast status updates, heartbeats, or progress reports without binding them to a session outcome. MACP separates these two concerns into distinct planes (RFC-MACP-0001):
flowchart TB
subgraph Ambient["Ambient Plane"]
direction LR
S1["Agent A"] -->|"Signal\n(session_id='', mode='')"| Bus["Signal Bus"]
Bus --> Sub1["Subscriber 1"]
Bus --> Sub2["Subscriber 2"]
end
subgraph Coordination["Coordination Plane"]
direction LR
M1["Agent A"] -->|"Envelope\n(session_id='abc', mode='decision.v1')"| Session["Session abc"]
Session --> Log["Append-only Log"]
end
Ambient ~~~ Coordination
style Ambient fill:#1a1a2e,stroke:#4a9eff
style Coordination fill:#1a1a2e,stroke:#9f7aea
The separation is deliberate and important. Coordination messages enter a durable, ordered log and can affect session state. Signals do neither — they are ephemeral, non-binding, and broadcast to whoever is listening.
Signal semantics
The rules for signals are defined by what they cannot do:
- Signals carry an empty `session_id` and an empty `mode`
- Signals are non-binding — they MUST NOT create sessions, mutate session state, or produce binding outcomes
- Signals are ephemeral — they are not required to enter durable replay history
- Signals may include a `correlation_session_id` in their payload for informational cross-referencing
- Signals are broadcast via the `WatchSignals` RPC to all subscribers
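A sketch of the plane-separation rule, assuming a dict-shaped envelope; note that mapping a half-filled pair of routing fields to `INVALID_ENVELOPE` is my assumption here, not quoted spec text:

```python
def classify_envelope(envelope: dict) -> str:
    """Classify by the plane-separation rules: ambient signals carry
    empty session_id and mode; coordination messages carry both."""
    if envelope["session_id"] == "" and envelope["mode"] == "":
        return "ambient"        # broadcast, non-binding, ephemeral
    if envelope["session_id"] and envelope["mode"]:
        return "coordination"   # durable, ordered, can affect session state
    raise ValueError("INVALID_ENVELOPE")  # half-filled routing fields (assumed)
```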
Signal types
The SignalPayload is intentionally flexible:
- `signal_type` — Discriminator (e.g., `"heartbeat"`, `"status_update"`)
- `data` — Arbitrary payload bytes
- `confidence` — Optional confidence score
- `correlation_session_id` — Optional session cross-reference (does NOT make the signal session-scoped)
Progress signals
One particularly useful signal type is ProgressPayload, designed for reporting work progress back to observers:
- `progress_token` — Identifies the progress stream
- `progress` / `total` — Numeric progress indicators
- `message` — Human-readable status
- `target_message_id` — Which message this progress relates to
In our deployment scenario, while the cost optimizer is evaluating proposals, it might broadcast progress signals: "Analyzing infrastructure costs... 40% complete." These signals let the UI show progress without polluting the coordination log with non-binding chatter.
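As an illustration, the optimizer's broadcast might be built like this. The dict is a loose stand-in for the protobuf `SignalPayload`, and `progress_signal` is a hypothetical helper, not an SDK function:

```python
def progress_signal(token: str, done: int, total: int, message: str) -> dict:
    """Build an ambient progress signal: empty session_id and mode keep it
    on the signal plane, out of the durable coordination log."""
    return {
        "mode": "",
        "session_id": "",
        "message_type": "Signal",
        "payload": {
            "signal_type": "progress",
            "progress_token": token,
            "progress": done,
            "total": total,
            "message": message,
        },
    }

sig = progress_signal("cost-analysis", 40, 100, "Analyzing infrastructure costs...")
```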
Bridging Two Worlds: Control Plane and Runtime Interaction
We have seen the runtime's perspective (sessions, envelopes, modes) and the SDK's perspective (typed clients, projections). Now let us look at how the control plane bridges these two worlds — taking raw gRPC events from the runtime and transforming them into something a human operator can watch, query, and replay.
The event pipeline
Every event from the runtime passes through a normalization and projection pipeline before reaching the UI. This pipeline is where raw protocol events become meaningful operational data:
sequenceDiagram
participant RT as MACP Runtime
participant SC as Stream Consumer
participant EN as Event Normalizer
participant ES as Event Service
participant PS as Projection Service
participant MS as Metrics Service
participant SH as Stream Hub
participant UI as UI Client
RT->>SC: Accepted envelope (gRPC stream)
SC->>EN: Raw runtime event
EN->>EN: Normalize to canonical event
EN->>ES: Canonical event
ES->>ES: Allocate sequence number (transactional)
ES->>ES: Persist raw + canonical (atomic write)
ES->>PS: Apply event to projection
PS->>PS: Build RunStateProjection
PS->>PS: Persist to run_projections
ES->>MS: Record metrics (tokens, costs, counts)
ES->>SH: Publish event
SH->>UI: SSE (canonical_event)
Canonical event types
Events are normalized into a standard taxonomy. This normalization is what makes the control plane's UI possible — instead of dealing with raw Protobuf envelopes, the UI works with a clean, categorized event stream:
| Category | Event Types |
|---|---|
| Run lifecycle | run.created, run.started, run.completed, run.failed, run.cancelled |
| Session | session.bound, session.stream.opened, session.state.changed |
| Participants | participant.seen |
| Messages | message.sent, message.received, message.send_failed |
| Signals | signal.emitted |
| Coordination | proposal.created, decision.proposed, decision.finalized |
| Tools | tool.called, tool.completed |
| Policy | policy.resolved, policy.commitment.evaluated, policy.denied |
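Normalization can be sketched as a lookup from raw event shape to canonical type. The mapping table below is a tiny illustrative subset, not the control plane's actual table:

```python
def normalize(raw: dict) -> dict:
    """Map a raw runtime event to a canonical event type (illustrative mapping)."""
    table = {
        ("envelope_accepted", "Proposal"): "proposal.created",
        ("envelope_accepted", "Signal"): "signal.emitted",
        ("session_state_changed", None): "session.state.changed",
    }
    key = (raw["kind"], raw.get("message_type"))
    event_type = table.get(key) or table.get((raw["kind"], None))
    if event_type is None:
        raise ValueError(f"unmapped runtime event: {raw['kind']}")
    # Keep the raw event alongside the canonical type: the pipeline persists both.
    return {"type": event_type, "source": raw}
```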
The RunStateProjection: a real-time read model
The projection engine builds a comprehensive read model from the event stream. This projection is what powers the control plane's UI — a single query returns the complete current state of a run:
interface RunStateProjection {
run: RunSummaryProjection; // Status, timing, mode
participants: ParticipantProjection[]; // Activity per participant
graph: GraphProjection; // Message dependency graph
decision: DecisionProjection; // Decision-specific state
signals: SignalProjection; // Signal summary
progress: ProgressProjection; // Progress tracking
timeline: TimelineProjection; // Chronological events
trace: TraceSummary; // Distributed trace info
outboundMessages: OutboundMessageSummary;
policy: PolicyProjection; // Policy resolution status
}
Circuit breaker: failing gracefully
The runtime provider implements a circuit breaker pattern — a nod to the reality that distributed systems fail. If the runtime becomes unreachable, the circuit opens and rejects new requests immediately rather than waiting for timeouts. This prevents cascading failures: a slow runtime should not make the control plane slow; it should make it fast at returning errors. The circuit resets after a configurable cooldown.
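A minimal circuit breaker sketch, assuming a consecutive-failure threshold and a full reset after cooldown (real implementations, presumably including the control plane's, usually add a half-open probing state):

```python
import time

class CircuitBreaker:
    """Open after N consecutive failures; fail fast while open;
    reset after a cooldown. The clock is injectable for testing."""
    def __init__(self, threshold: int = 3, cooldown_s: float = 30.0, clock=time.monotonic):
        self.threshold, self.cooldown_s, self.clock = threshold, cooldown_s, clock
        self.failures = 0
        self.opened_at = None

    def allow(self) -> bool:
        if self.opened_at is None:
            return True
        if self.clock() - self.opened_at >= self.cooldown_s:
            self.opened_at, self.failures = None, 0   # cooldown elapsed: reset
            return True
        return False                                  # fail fast, no timeout wait

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.threshold:
            self.opened_at = self.clock()

    def record_success(self) -> None:
        self.failures = 0
```

While `allow()` returns False, callers get an immediate error (`CIRCUIT_BREAKER_OPEN` in the control plane's taxonomy) instead of a stalled gRPC call.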
SSE streaming to clients
UI consumers connect via GET /runs/:id/stream (Server-Sent Events), and the experience is designed for real-time watching:
- On connect: receive a `snapshot` event with the full current `RunStateProjection`
- As events occur: receive `canonical_event` messages in real time
- On disconnect: automatic reconnection with the `Last-Event-ID` header for resumption
That snapshot-on-connect pattern is worth noting. A UI that connects mid-session does not have to replay the entire event history — it gets the current projection immediately, then stays in sync via the event stream.
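The snapshot-then-live pattern can be sketched as a server-side SSE frame generator. This is illustrative; the control plane's actual framing may differ:

```python
import json

def sse_stream(projection: dict, live_events):
    """Snapshot-on-connect: emit the full projection first, then live events.
    A client that connects mid-run never has to replay history."""
    yield f"event: snapshot\ndata: {json.dumps(projection)}\n\n"
    for seq, event in live_events:
        # The id: field is what feeds the client's Last-Event-ID on reconnect.
        yield f"id: {seq}\nevent: canonical_event\ndata: {json.dumps(event)}\n\n"
```

On reconnect, the server would use the `Last-Event-ID` header to resume `live_events` from the right sequence number, or send a fresh snapshot if the gap is too large.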
Watching Everything: Observability
A coordination system that you cannot observe is a coordination system you cannot trust. MACP provides observability at every layer, from Rust-level structured logging to distributed traces that span from the UI through the control plane into the runtime.
flowchart LR
subgraph Runtime["Runtime — Rust"]
RL["tracing crate\nstructured logs"]
RM["metrics.rs\nper-mode counters"]
RO["OpenTelemetry\n(optional otel feature)"]
end
subgraph ControlPlane["Control Plane — NestJS"]
CL["pino\nstructured JSON logs"]
CM["prom-client\nPrometheus metrics"]
CO["OpenTelemetry\nNode SDK"]
CT["TraceService\nmanual spans"]
end
subgraph Endpoints["API Endpoints"]
E1["GET /runs/:id/traces"]
E2["GET /runs/:id/metrics"]
E3["GET /runs/:id/artifacts"]
end
RL --> CL
RM --> CM
RO --> CO
CO --> CT
CT --> E1
CM --> E2
E3
Runtime observability
The Rust runtime uses the tracing crate for structured logging, controlled by the RUST_LOG environment variable. Every significant event is logged with structured fields:
- Session creation: `session_id`, `mode`, `sender`
- Message acceptance: `session_id`, `message_type`, `sender`, resulting `state`
- Session resolution/expiry: `session_id`, `mode`
- Auth failures and rate limit violations
- Storage warnings during crash recovery
Per-mode metrics are tracked as atomic counters — lightweight enough to leave on in production:
- `sessions_started` / `sessions_resolved` / `sessions_expired` / `sessions_cancelled`
- `messages_accepted` / `messages_rejected`
- `commitments_accepted` / `commitments_rejected`
OpenTelemetry support (enabled via the otel cargo feature) provides distributed tracing:
- OTLP exporter configured via `OTEL_EXPORTER_OTLP_ENDPOINT`
- Batch export integrated with the tokio runtime
- Trace context propagated via gRPC metadata
Control plane observability
The NestJS control plane adds its own observability layer:
Structured logging via pino with JSON output — machine-parseable, grep-friendly.
Prometheus metrics via prom-client, exposed for scraping by your existing monitoring infrastructure.
OpenTelemetry integration ties everything together:
- Node SDK with auto-instrumentations
- `TraceService` for manual span management
- W3C Trace Context propagation (`traceId` flows from UI to Control Plane to Runtime)
- Per-run traces accessible via `GET /runs/:id/traces`
Per-run metrics persisted in the run_metrics table provide granular accounting:
- Event, message, and signal counts
- Token usage extracted from event payloads
- Estimated cost via model pricing lookup
Audit events
Both layers log security-relevant events per RFC-MACP-0004. In a world of autonomous agents, audit is not a nice-to-have — it is how you answer questions like "who tried to impersonate the security reviewer at 3 AM?":
- Authentication failures
- Authorization failures
- Duplicate message rejections
- Terminal state transitions
- Cancellation events
- Rate limit violations
When Things Go Wrong: Error Handling
Distributed systems fail. Agents send invalid messages. Networks partition. Runtimes crash. MACP does not pretend otherwise — it provides structured, consistent error handling across every layer, rooted in a shared taxonomy from the error code registry.
Runtime error codes
The runtime returns precise, actionable error codes. Notice how each code maps to an HTTP status, making it straightforward to surface errors in REST APIs:
| Code | HTTP | When |
|---|---|---|
| `UNAUTHENTICATED` | 401 | Authentication failed or missing |
| `FORBIDDEN` | 403 | Authenticated but not authorized |
| `SESSION_NOT_FOUND` | 404 | Session ID doesn't exist |
| `SESSION_NOT_OPEN` | 409 | Session is RESOLVED or EXPIRED |
| `DUPLICATE_MESSAGE` | 409 | `message_id` already accepted in session |
| `SESSION_ALREADY_EXISTS` | 409 | SessionStart for an existing `session_id` |
| `INVALID_ENVELOPE` | 400 | Envelope validation failed |
| `UNSUPPORTED_PROTOCOL_VERSION` | 400 | No mutual protocol version |
| `MODE_NOT_SUPPORTED` | 400 | Mode not available or not registered |
| `INVALID_SESSION_ID` | 400 | Session ID format invalid |
| `PAYLOAD_TOO_LARGE` | 413 | Exceeds maximum payload size |
| `RATE_LIMITED` | 429 | Too many requests from this sender |
| `UNKNOWN_POLICY_VERSION` | 404 | Policy not found in registry |
| `POLICY_DENIED` | 403 | Commitment rejected by governance rules |
| `INVALID_POLICY_DEFINITION` | 400 | Policy fails schema validation |
| `INTERNAL_ERROR` | 500 | Unrecoverable runtime error |
Control plane error codes
The control plane adds its own error codes for orchestration-level failures — things the runtime does not know about because they happen in the layer above it:
| Code | When |
|---|---|
| `RUN_NOT_FOUND` | Run ID doesn't exist |
| `INVALID_STATE_TRANSITION` | Cannot transition run to requested state |
| `RUNTIME_UNAVAILABLE` | Cannot connect to runtime |
| `RUNTIME_TIMEOUT` | gRPC deadline exceeded |
| `STREAM_EXHAUSTED` | Max stream reconnection retries exceeded |
| `SESSION_EXPIRED` | Runtime session expired during run |
| `KICKOFF_FAILED` | Initial kickoff message rejected |
| `MODE_NOT_SUPPORTED` | Requested mode not available on runtime |
| `CIRCUIT_BREAKER_OPEN` | Runtime circuit breaker is open |
| `MESSAGE_SEND_FAILED` | Mid-session message send failed |
SDK exception hierarchy
Both SDKs wrap these error codes in typed exception hierarchies that make error handling in agent code clean and idiomatic.
TypeScript:
MacpSdkError (base)
├── MacpTransportError — gRPC connection failure
├── MacpAckError — Runtime NACK (carries ack.error.code)
├── MacpSessionError — Session state violation
├── MacpTimeoutError — Deadline exceeded
└── MacpRetryError — All retries exhausted
Python:
MacpSdkError (base)
├── MacpAckError — Runtime NACK (carries AckFailure)
├── MacpSessionError — Session state violation
├── MacpTransportError — gRPC failure
│ ├── MacpTimeoutError — Deadline exceeded
│ └── MacpRetryError — Retries exhausted
Error handling in practice
Back in our scenario, what happens if the cost optimizer tries to vote after the session has already been resolved? Here is how the SDKs handle it:
// TypeScript
try {
await session.vote({ proposalId: 'p1', vote: 'APPROVE' });
} catch (err) {
if (err instanceof MacpAckError) {
// Runtime rejected the message
console.error(err.ack.error?.code); // 'SESSION_NOT_OPEN', 'FORBIDDEN', etc.
} else if (err instanceof MacpTransportError) {
// gRPC connection issue — may be retryable
} else if (err instanceof MacpTimeoutError) {
// Deadline exceeded
}
}

# Python
try:
session.vote(proposal_id="p1", vote="APPROVE")
except MacpAckError as e:
print(f"Rejected: {e.failure.code}") # 'SESSION_NOT_OPEN', etc.
except MacpTransportError:
print("Connection failed")
except MacpTimeoutError:
print("Deadline exceeded")
The error hierarchy is designed so that the most specific exceptions are caught first. A MacpAckError means the runtime understood the message but rejected it — you need to look at the error code to decide what to do. A MacpTransportError means the message may not have reached the runtime at all — retrying might make sense. This distinction matters for building resilient agent logic.
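That retry decision can be sketched against a miniature version of the Python hierarchy. The set of retryable NACK codes below is my assumption for illustration, not spec-defined:

```python
class MacpSdkError(Exception):
    pass

class MacpTransportError(MacpSdkError):
    pass

class MacpTimeoutError(MacpTransportError):
    pass

class MacpAckError(MacpSdkError):
    def __init__(self, code: str):
        super().__init__(code)
        self.code = code

# Hypothetical: which NACK codes signal a transient condition worth retrying.
RETRYABLE_ACK_CODES = {"RATE_LIMITED", "INTERNAL_ERROR"}

def should_retry(err: MacpSdkError) -> bool:
    """Transport errors: the message may never have arrived, and message_id
    dedup makes redelivery idempotent, so retry is safe. NACKs: retry only
    when the code signals a transient condition."""
    if isinstance(err, MacpTransportError):
        return True
    if isinstance(err, MacpAckError):
        return err.code in RETRYABLE_ACK_CODES
    return False
```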
Rewinding Time: Replay and Determinism
We saved one of the most powerful features for near the end, because replay only makes sense once you understand everything that came before it.
MACP provides a structural replay guarantee: replaying identical accepted Envelope sequences under identical bound versions MUST reproduce identical state transitions (RFC-MACP-0003). This is not an aspirational goal — it is an invariant enforced by every design decision we have discussed so far: immutable version binding, pure policy evaluation, authoritative ordering, and runtime-derived sender identity.
The determinism boundary
The replay engine takes a small, well-defined set of inputs and guarantees that a specific set of outputs will be identical:
flowchart LR
subgraph Inputs["Deterministic Inputs"]
H["Accepted Envelope\nSequence"]
MV["mode_version"]
CV["configuration_version"]
PV["policy_version"]
PRT["macp_version"]
end
subgraph Outputs["Guaranteed Identical"]
ST["State transitions"]
AD["Accept/reject decisions"]
TS["Terminal state\nRESOLVED or EXPIRED"]
TM["Terminal message"]
end
Inputs --> F["Deterministic\nReplay Engine"]
F --> Outputs
What is guaranteed
Given identical accepted envelope history and identical bound versions:
- Session lifecycle transitions are identical
- Within-session acceptance order is identical
- Idempotent duplicate handling is identical
- Terminal state (RESOLVED/EXPIRED) and terminal message are identical
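The guarantee reads as a testable property: folding the same accepted history through the same state function twice yields the same state sequence. The toy transition below stands in for a real mode implementation:

```python
def replay(history: list, apply) -> list:
    """Fold an accepted-envelope history through a state-transition function,
    returning the sequence of states after each envelope."""
    state, states = "OPEN", []
    for env in history:
        state = apply(state, env)
        states.append(state)
    return states

def apply_decision(state: str, env: dict) -> str:
    # Toy transition: a Commitment resolves an OPEN session.
    if state == "OPEN" and env["message_type"] == "Commitment":
        return "RESOLVED"
    return state

history = [
    {"message_type": "Proposal"},
    {"message_type": "Vote"},
    {"message_type": "Commitment"},
]
# The structural replay guarantee, as a property check.
assert replay(history, apply_decision) == replay(list(history), apply_decision)
```

Crash recovery relies on exactly this property: rebuilding state from the log is just another call to the same fold.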
What is NOT guaranteed — and why that is fine
Not everything can or should be deterministic. The protocol is honest about its boundaries:
- Semantic outcomes — Mode-defined results (e.g., Task mode is structural-only; external side effects may differ)
- Error message text — May vary between runtime versions
- Cross-session ordering — Only within-session order is deterministic
- External side effects — Application responsibility
Determinism classes by mode
Each mode has a determinism class that tells you exactly what replay guarantees you get:
| Mode | Class | Meaning |
|---|---|---|
| Decision | Semantic-deterministic | Same history + versions = same outcome |
| Proposal | Semantic-deterministic | Same history + versions = same outcome |
| Task | Structural-only | State transitions guaranteed; execution results may differ |
| Handoff | Context-frozen | Deterministic only if bound context replayed exactly |
| Quorum | Semantic-deterministic | Same ballots + threshold = same quorum state |
Our deployment decision scenario uses Decision mode, which is semantic-deterministic. If you replay the same proposals, evaluations, and votes under the same mode, configuration, and policy versions, the outcome will always be "blue-green deploy, approved by majority." Always.
TTL determinism
Even time-based expiration is deterministic. Session TTL is computed from the SessionStart envelope's timestamp_unix_ms:
expiration = SessionStart.timestamp_unix_ms + ttl_ms
During replay, the pre-computed deadline from the original session is used — never wall-clock time. If TTL elapsed before a terminal condition was accepted, the session is EXPIRED. This means you can replay a session that originally ran for two minutes in two seconds, and the expiration logic still behaves correctly.
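The expiration rule is small enough to state directly. In this sketch, `event_ms` is the message timestamp being evaluated during replay, never wall-clock now:

```python
def is_expired(session_start_ms: int, ttl_ms: int, event_ms: int) -> bool:
    """Deterministic TTL: the deadline derives from the SessionStart
    envelope's timestamp, so replay at any speed gives the same answer."""
    return event_ms >= session_start_ms + ttl_ms
```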
Replay via the control plane
The control plane supports three replay modes, each useful for different scenarios:
| Mode | Behavior |
|---|---|
| `instant` | All events emitted immediately |
| `timed` | Events replayed with proportional inter-event timing (speed multiplier supported) |
| `step` | Events emitted one at a time on request |
- `POST /runs/:id/replay` — Start a replay session
- `GET /runs/:id/replay/stream` — SSE of replayed events
- `GET /runs/:id/replay/state` — Projection at a specific sequence number
The timed mode is particularly useful for post-mortems — you can watch a coordination session unfold at 10x speed, seeing exactly when each agent acted and how long deliberation took. The step mode is a debugger's best friend: advance one event at a time and inspect the projection at each step.
Storage and crash recovery
Here is an elegant detail: the same determinism guarantee that serves replay also serves crash recovery. The runtime uses an append-only log per session. On startup, sessions are recovered by replaying their logs through the same deterministic state machine. Crash recovery is just replay with a different trigger.
Storage backends:
- FileBackend — `session.json` + `log.jsonl` per session (default)
- RocksDB — Embedded key-value store
- Redis — Shared storage for multi-instance deployments
Putting It All Together
Let us return one last time to our three agents — the architect, the security reviewer, and the cost optimizer — and trace their deployment decision through the entire system, from the first connection to the final commit.
sequenceDiagram
participant UI as API Consumer
participant CP as Control Plane
participant RT as Runtime
participant A as Agent A (SDK)
participant B as Agent B (SDK)
Note over UI,B: 1. Initialization
A->>RT: Initialize
B->>RT: Initialize
UI->>CP: POST /runs (ExecutionRequest)
Note over UI,B: 2. Session Creation
CP->>RT: SessionStart (gRPC stream)
RT->>RT: Validate, create session OPEN
RT-->>CP: Ack (session bound)
CP->>RT: Kickoff messages
Note over UI,B: 3. Coordination
A->>RT: Proposal
RT-->>A: Ack
RT-->>CP: Accepted envelope (stream)
CP-->>UI: SSE canonical_event
B->>RT: Vote
RT-->>B: Ack
RT-->>CP: Accepted envelope (stream)
Note over UI,B: 4. Resolution
A->>RT: Commitment
RT->>RT: Policy evaluation (pure function)
RT->>RT: Session → RESOLVED
RT-->>A: Ack (session_state=RESOLVED)
RT-->>CP: Session resolved
CP->>CP: Normalize + project + persist
CP-->>UI: SSE run.completed
Note over UI,B: 5. Observability
UI->>CP: GET /runs/:id/state
CP-->>UI: RunStateProjection
UI->>CP: GET /runs/:id/traces
CP-->>UI: OpenTelemetry spans
The agents connected and negotiated capabilities. The control plane opened a session with majority voting policy. The architect proposed blue-green deployment. The security reviewer evaluated it favorably. The cost optimizer voted to approve. The architect committed the decision, the runtime evaluated the majority policy and confirmed it, and the session resolved.
Every step was authenticated. Every message was validated through the admission pipeline. Every event was persisted, normalized, and projected for the UI. The entire session can be replayed — instantly, at speed, or step by step — and will always produce the same outcome. Distributed traces connect every span from the UI through the control plane into the runtime. Audit logs capture every authentication attempt, every authorization decision, every state transition.
This is MACP's core promise: when autonomous agents need to produce one binding outcome, the protocol specification defines the rules, the runtime enforces them, the control plane orchestrates and observes, and the SDKs give agents a typed interface to participate. All four layers working together to turn a chaotic multi-agent conversation into a structured, auditable, replayable coordination process with a single authoritative result.