1. Scope and conformance
This document specifies the ChiefLab Execution Contract: a lifecycle, a set of required tool shapes, and an approval protocol that any "operator" (a server-side tool that performs business work on behalf of an agent and a human) MUST implement to interoperate with other ChiefLab operators and clients.
The contract is intentionally minimal. It says nothing about WHAT business work an operator does (launch a product, post to a channel, run a sales sequence, manage a knowledge base). It only says HOW that work must be staged, reviewed, approved, executed, measured, and persisted so that a stateless agent and a stateful human can collaborate safely across time.
The key words MUST, MUST NOT, SHOULD, MAY, REQUIRED, RECOMMENDED, and OPTIONAL in this document are to be interpreted as described in RFC 2119.
Why this contract exists
The Model Context Protocol (MCP) — released November 2024 — defined the syntax of agent-to-tool communication: JSON-RPC over stdio or HTTP, tools/list, tools/call, resources, prompts. It did not define the semantics of safe agent action: approval state machines, audit trails, idempotency under retry, signed review URLs, cross-run memory, lifecycle stages.
That gap is what this specification fills. ChiefLab's Execution Contract is the semantic layer on top of MCP's syntactic layer. We do not replace MCP. We complement it.
Concretely, stateless LLMs can draft. They cannot:
- hold OAuth tokens across sessions,
- enforce "no publish without explicit human approval" server-side,
- poll measurement APIs 24 hours after publish,
- persist tenant-scoped memory across runs,
- guarantee idempotency under retry.
These are the irreducible stateful primitives of agent-mediated business execution. The contract names them, formalizes their interfaces, and makes them composable across operators that follow it.
2. Vocabulary
- Operator: A server-side component that performs one category of business work and conforms to this specification. ChiefLab's reference implementations include chieflab-launch, chieflab-post, chieflab-email, chieflab-measure, chieflab-brain, and chieflab-connect. An operator exposes its capability through one or more tools.
- Tool: A JSON-RPC method exposed via Model Context Protocol (MCP). Tools belong to operators. An operator MAY expose multiple tools, but every tool MUST be reachable through the operator's primaryTool.
- Workspace: The tenant boundary. Every run, action, asset, secret, and memory entry MUST be scoped by a workspaceId. A workspace MAY serve thousands of downstream tenants (an agency workspace, for example) but the workspace itself is the auth boundary.
- Run: A single invocation of an operator producing one or more actions. Identified by runId (UUID). Runs MUST be persisted with their inputs, outputs, status, and lineage.
- Asset: A piece of content produced during a run (a draft post, an email body, positioning text, a generated image). Assets are tenant-scoped, persisted, and referenced by assetId (UUID).
- Action: A staged external side effect (publish, send, charge, modify). Actions are persisted with status awaiting_approval → approved → executed (or rejected). An action MUST NOT execute until status transitions to approved via the approval protocol defined in §4.
- Approval: A signed, human-authored transition of an action from awaiting_approval to approved or rejected. Approvals MAY arrive via the in-chat path (agent calls chieflab_approve_action with the user's consent) or via the web path (user clicks an approval URL).
- Review URL: A signed, time-limited URL (default 7-day TTL) that surfaces a run's pending actions for human review. Specified in §4. Operators MUST issue a review URL alongside every run that contains pending actions.
- Connector: An OAuth-authenticated integration with an external service (Zernio, Resend, GA4, Search Console, etc). Connectors are workspace-scoped and persist refresh tokens server-side. Actions reference the connector they require.
- Brain: The per-tenant memory store: brand voice, repo facts, proof assets, channel performance history, claim risk log. Operators write to the brain at the remember stage and read from it at the prepare stage.
- Idempotency key: A caller-provided string that uniquely identifies a logical operation. The same key MUST produce the same response on retry (no duplicate actions, no duplicate publishes).
3. The execution lifecycle
Every operator MUST implement the six-stage lifecycle:
prepare → review → approve → execute → measure → remember
Stages MAY overlap (an operator MAY emit measurement metadata at prepare time). Stages MUST NOT be skipped (a publish MUST NOT happen without a preceding approve). The lifecycle is a contract over state transitions, not a strict ordering of API calls.
3.1 prepare
The operator receives caller context (workspace, tenant, repo evidence, goal) and produces:
- One or more assets (drafts, plans, briefs), persisted with assetId.
- Zero or more actions, persisted with status awaiting_approval.
- A review URL covering the run (§4).
- An agentGuide object instructing the calling agent how to surface results to the human.
The prepare stage MUST NOT have external side effects. No publishing. No emailing. No charging. No connector mutations. Only reads from the brain, reads from connectors, and writes to the operator's own asset/action store.
Required response shape at minimum:
{
  "runId": "<UUID>",
  "workspaceId": "<workspace>",
  "assets": [ { "id", "type", "title", "body" } ],
  "actions": [ { "id", "channel", "connector", "executorTool", "status": "awaiting_approval", "preflight": { ... } } ],
  "reviewUrl": "https://<host>/runs/<runId>?token=<hmac>",
  "agentGuide": {
    "renderInChat": { ... }, // see §6
    "userMessage": "<verbatim to user>",
    "nextToolCalls": { "primary": { ... }, "fallback": { ... } },
    "stopRule": "<one-sentence stop instruction>",
    "agentDependency": [ ... ] // why a stateless LLM cannot replace this
  },
  "stopRule": "<top-level mirror of agentGuide.stopRule>"
}
3.2 review
The agent surfaces the prepared assets and pending actions to the human inline in the agent's runtime (IDE chat, voice agent transcript, custom UI). The agent MUST render agentGuide.renderInChat[*].body for each channel that produced renderable content.
The reviewUrl is a fallback surface — used when the human cannot read content in the agent's runtime (phone approval, multi-person review, image-variant gallery review, audit trail). It MUST NOT be the agent's primary surface. Pushing the user to a URL when in-chat render is available is a conformance violation.
Operators MUST emit clean human-readable content in renderInChat[*].body. Drafting briefs ("brief mode" / "context mode" / system prompts intended for the calling LLM) MUST NOT bleed into human-facing fields. See §6 for the rationale (the "approval-theater" failure mode).
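The review rules above can be sketched from the agent's side. This is a minimal illustration, not a reference renderer: it assumes renderInChat is a mapping from channel name to an object with a body field (the contract only shows renderInChat[*].body), and the function name render_for_review and the wrapper pattern are hypothetical.

```python
import re

# Heuristic built from the single frame-wrapper example quoted in this spec;
# a real check would likely need more patterns (assumption).
FRAME_WRAPPER = re.compile(r"^##\s*Generated\s*\(mode=", re.MULTILINE)


def render_for_review(prepare_response: dict) -> str:
    """Return the text an agent would show in chat for a prepared run."""
    guide = prepare_response.get("agentGuide", {})
    sections = []
    # Assumed shape: {"<channel>": {"body": "<human-readable draft>"}}
    for channel, entry in guide.get("renderInChat", {}).items():
        body = (entry.get("body") or "").strip()
        if not body:
            continue  # nothing readable for this channel
        if FRAME_WRAPPER.search(body):
            raise ValueError(f"drafting frame leaked into body for {channel!r}")
        sections.append(f"[{channel}]\n{body}")
    if sections:
        return "\n\n".join(sections)
    # Fallback surface only: no channel produced renderable content.
    return f"Nothing to render in chat. Review at: {prepare_response['reviewUrl']}"
```

The review URL appears only when no channel has readable content, which mirrors the rule that the URL is a fallback surface and not the primary one.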
3.3 approve
A human authors an approval decision for each pending action. Two paths:
In-chat path (RECOMMENDED for the IDE-native flow):
chieflab_approve_action({ actionId: "<UUID>" })
chieflab_reject_action({ actionId: "<UUID>", reason: "<human feedback>" })
The agent calls these tools when the user types "approve linkedin" / "reject email / let's edit" in chat.
Web path (fallback):
The human opens the review URL and clicks Approve / Reject / Edit. The server records the decision against the action and transitions status.
Both paths converge on the same state machine:
awaiting_approval → approved (human said yes; ready for execute)
awaiting_approval → rejected (human said no; never executes)
awaiting_approval → edited (human modified content; goes back to awaiting_approval)
An approved action MAY then be executed by the operator. Approval is necessary but not sufficient: connector readiness checks (the preflight fields in §5) MAY still block execution.
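One way to read the state machine is as a single server-side function that both the in-chat path and the web path call. The sketch below is illustrative only: the Action class, the decide function, and the history tuple layout are hypothetical, and it treats edited as a transient marker that is logged before the action returns to awaiting_approval, which is one interpretation of the third transition.

```python
from dataclasses import dataclass, field

# Human decisions and the status each one leads to.
DECISIONS = {"approve": "approved", "reject": "rejected", "edit": "edited"}


@dataclass
class Action:
    id: str
    status: str = "awaiting_approval"
    history: list = field(default_factory=list)  # (from, to, via, decided_by)


def decide(action: Action, decision: str, via: str, decided_by: str) -> Action:
    """Apply one human decision; `via` is "chat" or "web" and both share this code."""
    if decision not in DECISIONS:
        raise ValueError(f"unknown decision: {decision!r}")
    if action.status != "awaiting_approval":
        raise ValueError(f"action {action.id} is {action.status}, not awaiting_approval")
    target = DECISIONS[decision]
    action.history.append((action.status, target, via, decided_by))
    # An edit is recorded, then the action waits for a fresh decision.
    action.status = "awaiting_approval" if target == "edited" else target
    return action
```

Because approved and rejected have no outgoing transitions here, a second decision on the same action raises instead of silently overwriting the first.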
3.4 execute
An operator executes an action by calling its bound executorTool. The executor MUST:
- Verify the action's status is exactly approved. Reject otherwise.
- Verify the action belongs to the caller's workspace. Reject cross-workspace access with 401, never 404 (do not leak existence).
- Honor the caller's idempotency key. Retries with the same key MUST return the prior result.
- Persist external side effects (postId, messageId, paymentId) on the action record.
- Transition action status from approved → executed exactly once.
A publish executor that fires without status approved is a critical conformance violation.
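The executor requirements above can be sketched as a single guarded function. This is a simplified, single-process illustration under stated assumptions: the Executor and ExecutionRefused names are hypothetical, storage is an in-memory dict, the idempotency cache is scoped per workspace, and the reason strings are borrowed from the standard reason codes in §7. A production executor would also need locking or a transactional store to make the transition happen exactly once under concurrency.

```python
class ExecutionRefused(Exception):
    def __init__(self, reason: str, http_status=None):
        super().__init__(reason)
        self.reason = reason
        self.http_status = http_status


class Executor:
    def __init__(self, actions: dict, side_effect):
        self.actions = actions          # action id -> action record (dict)
        self.side_effect = side_effect  # callable(action) -> external id
        self.results = {}               # (workspace id, idempotency key) -> result

    def execute(self, action_id: str, workspace_id: str, idempotency_key: str) -> dict:
        action = self.actions.get(action_id)
        if action is None:
            raise ExecutionRefused("invalid_action_id")
        if action["workspaceId"] != workspace_id:
            raise ExecutionRefused("wrong_workspace", http_status=401)
        cache_key = (workspace_id, idempotency_key)
        if cache_key in self.results:
            return self.results[cache_key]  # retry: prior result, no new side effect
        if action["status"] != "approved":
            raise ExecutionRefused("requires_approval")
        external_id = self.side_effect(action)
        action["externalId"] = external_id  # persist the side effect on the record
        action["status"] = "executed"       # approved -> executed
        result = {"ok": True, "actionId": action_id, "externalId": external_id}
        self.results[cache_key] = result
        return result
```

The idempotency lookup sits before the status check on purpose: after a successful run the action is executed, so a retry with the same key has to be answered from the cache instead of being rejected as not approved.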
3.5 measure
24 hours (or an operator-defined interval) after execution, the operator pulls metrics from the relevant connectors and produces a measurement asset. Measurement MUST be tied to the originating runId. Operators SHOULD also recommend a next move (the "compounding" property; see §3.6).
3.6 remember
Operators write measurement results + human signals (approvals, rejections, edits, brand voice deltas) back into the brain for the tenant. Subsequent prepare calls read from the brain.
The brain is the substrate of cross-run memory — what makes a stateful operator stack different from a stateless LLM chain. Operators that do not write to the brain are still conformant but forfeit the compounding-state property.
4. Approval URL specification
Operators MUST issue a signed review URL for every run that contains at least one pending action. Format:
https://<host>/runs/<runId>?token=<token>
Where <token> is a URL-safe base64-encoded JSON object with HMAC signature:
{
  "runId": "<UUID>",
  "workspaceId": "<workspace>",
  "iat": <unix seconds, issued-at>,
  "exp": <unix seconds, expiry; default iat + 7 days>,
  "scope": "review"
}
The token MUST be signed with HMAC-SHA256 using the operator's RUN_TOKEN_SECRET. The signature MUST be appended to the token in a fixed encoding so that verifiers can compare it in constant time.
Token verification MUST:
- Reject tokens where runId in the path differs from runId in the token.
- Reject tokens past exp.
- Use constant-time comparison for the HMAC signature.
- Return 403 on token failure (not 401 — token is invalid, not absent).
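The signing and verification rules can be sketched with the Python standard library. The contract does not fix how the signature is appended, so the wire format here (unpadded URL-safe base64 payload, a dot, then the unpadded URL-safe base64 signature) is an assumption, as are the function names sign_token and verify_token.

```python
import base64
import hashlib
import hmac
import json
import time

SEVEN_DAYS = 7 * 24 * 60 * 60


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(secret: bytes, run_id: str, workspace_id: str, now=None) -> str:
    iat = int(time.time() if now is None else now)
    payload = {"runId": run_id, "workspaceId": workspace_id,
               "iat": iat, "exp": iat + SEVEN_DAYS, "scope": "review"}
    body = _b64(json.dumps(payload, separators=(",", ":")).encode("utf-8"))
    sig = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
    return f"{body}.{_b64(sig)}"  # assumed layout: <payload>.<signature>


def verify_token(secret: bytes, token: str, path_run_id: str, now=None):
    """Return (200, payload) on success or (403, None) on any token failure."""
    now = int(time.time() if now is None else now)
    try:
        body, sig = token.split(".")
        expected = hmac.new(secret, body.encode("ascii"), hashlib.sha256).digest()
        if not hmac.compare_digest(expected, _unb64(sig)):  # constant-time compare
            return 403, None
        payload = json.loads(_unb64(body))
    except ValueError:  # malformed token, bad base64, or bad JSON
        return 403, None
    if payload.get("runId") != path_run_id:
        return 403, None
    if now > payload.get("exp", 0):
        return 403, None
    return 200, payload
```

The signature is checked before the payload is parsed, so untrusted JSON is never decoded, and every failure collapses to the same 403 so a caller cannot tell which check failed.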
5. Audit trail format
Operators MUST persist a complete audit trail per run. Required fields per action:
{
  "id": "<UUID>",
  "runId": "<UUID>",
  "workspaceId": "<workspace>",
  "tenantId": "<tenant>", // optional but recommended
  "type": "<channel>_<verb>", // e.g., "linkedin_post"
  "channel": "<channel id>",
  "connector": "<connector id>",
  "assetId": "<UUID>", // optional; links to the rendered content
  "executorTool": "<tool name>",
  "status": "awaiting_approval | approved | rejected | edited | executed | failed",
  "preflight": {
    "severity": "low | medium | high",
    "warnings": [ ... ],
    "recommendations": [ ... ],
    "gates": [ ... ],
    "connectorReady": <boolean>,
    "connectorBlocker": "<id|null>",
    "connectorFixHint": "<string|null>",
    "estimatedCostCredits": <number>
  },
  "approvedBy": "<user identifier|null>",
  "approvedAt": "<ISO 8601|null>",
  "executedAt": "<ISO 8601|null>",
  "externalId": "<post id|message id|charge id|null>", // after execute
  "metadata": { ... }
}
The audit trail MUST be queryable by workspaceId alone (for tenant-isolation audits) and by runId (for per-run lineage).
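The two query requirements suggest indexing the store on both keys. The sketch below uses an in-memory SQLite database purely for illustration; the table layout, index names, and helper functions are hypothetical, and it keeps only the latest record per action, whereas a real audit trail may prefer append-only history.

```python
import json
import sqlite3


def open_audit_store() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE actions ("
        " id TEXT PRIMARY KEY,"
        " run_id TEXT NOT NULL,"
        " workspace_id TEXT NOT NULL,"
        " record TEXT NOT NULL)"  # full action record as JSON
    )
    # One index per required query path.
    db.execute("CREATE INDEX idx_actions_workspace ON actions (workspace_id)")
    db.execute("CREATE INDEX idx_actions_run ON actions (run_id)")
    return db


def record_action(db: sqlite3.Connection, action: dict) -> None:
    db.execute(
        "INSERT OR REPLACE INTO actions VALUES (?, ?, ?, ?)",
        (action["id"], action["runId"], action["workspaceId"], json.dumps(action)),
    )


def by_workspace(db: sqlite3.Connection, workspace_id: str) -> list:
    rows = db.execute(
        "SELECT record FROM actions WHERE workspace_id = ? ORDER BY id", (workspace_id,)
    )
    return [json.loads(row[0]) for row in rows]


def by_run(db: sqlite3.Connection, run_id: str) -> list:
    rows = db.execute(
        "SELECT record FROM actions WHERE run_id = ? ORDER BY id", (run_id,)
    )
    return [json.loads(row[0]) for row in rows]
```

Querying by workspace alone, with no run filter, is what makes a tenant-isolation audit possible: every row returned for one workspace can be checked for foreign identifiers.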
6. Operator conformance
A tool stack is a conformant operator if and only if:
- It exposes a primaryTool that returns the §3.1 response shape with all required fields populated.
- Every action it stages goes through the §3.3 approval state machine before §3.4 execution.
- It issues a §4-conformant review URL for every run with pending actions.
- It persists a §5-conformant audit trail.
- It returns clean human-readable content in renderInChat[*].body. No frame wrappers ("## Generated (mode=context, route=...)"), no drafting prompts intended for the calling LLM, no metadata-only cards. This is the approval-theater rule: the human approving an action MUST be able to read what they're approving in the surface they're in. Failure to honor this is the most common conformance violation observed in practice.
- It honors idempotency keys at every external side effect.
- It is listed in the host's .well-known/mcp.json under operators with: id, label, function, primaryTool, connectors, status.
A tool that exposes MCP without honoring the lifecycle and approval state machine is NOT a conformant operator. It is "a tool with MCP." The distinction is load-bearing.
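The listing criterion is the easiest one to check mechanically. As a small sketch, assuming operators in .well-known/mcp.json is a list of objects (the contract names the fields but not the container type), a hypothetical helper could report which required fields an entry lacks:

```python
REQUIRED_FIELDS = ("id", "label", "function", "primaryTool", "connectors", "status")


def missing_operator_fields(manifest: dict, operator_id: str) -> list:
    """Return the required listing fields that are absent for one operator."""
    for entry in manifest.get("operators", []):  # assumed: list of operator objects
        if entry.get("id") == operator_id:
            return [name for name in REQUIRED_FIELDS if name not in entry]
    # Operator is not listed at all, so every field is missing.
    return list(REQUIRED_FIELDS)
```

An empty result covers only this one criterion; the lifecycle, approval, and audit criteria still have to be verified against the operator's behavior.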
7. Error semantics & recovery
Every recoverable failure MUST return the recovery shape:
{
  "ok": false,
  "reason": "<reason code>",
  "summaryForUser": "<one-line human-readable>",
  "userMessage": "<verbatim to surface to the human>",
  "fixActionForAgent": "<one-line instruction to the calling agent>",
  "recoveryTool": { "name": "<tool>", "args": { ... } } | null,
  "retryable": <boolean>,
  "stopRule": "<one-line stop instruction>"
}
Standard reason codes (operators MAY add more, MUST NOT redefine these):
- missing_api_key: caller lacks a valid workspace token; recoveryTool SHOULD point to a workspace signup tool.
- missing_connector: required connector not yet wired; recoveryTool SHOULD point to the connect flow.
- requires_approval: action exists but is not yet approved; surface review URL or call the in-chat approval tool.
- missing_repo_context: operator needs more grounding evidence to do its job.
- missing_credits: workspace has insufficient credits for a metered action.
- workspace_not_found: token resolves but no workspace exists at that ID.
- connector_failed: connector returned an error during the side-effect call.
- measurement_unavailable: measurement window not yet reached, or upstream API down.
- invalid_action_id: action ID unknown or malformed.
- wrong_workspace: action exists in a different workspace than the caller's token.
- rate_limited: operator-level or connector-level rate limit hit.
- provider_not_live: connector is registered but not yet generally available.
The calling agent's behavior on a recovery shape MUST be:
- If retryable is true AND recoveryTool is non-null: call the recovery tool, then re-attempt the original tool.
- If retryable is false: STOP. Surface userMessage verbatim to the human. Do not retry. Do not synthesize a workaround.
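The two rules above can be sketched as one agent-side handler. The function name handle_failure and the call_tool callback are hypothetical. The contract does not say what to do when retryable is true but recoveryTool is null, so this sketch conservatively stops in that case, which is an assumption and not part of the specification.

```python
def handle_failure(recovery: dict, call_tool, original_name: str, original_args: dict) -> dict:
    """Decide the agent's next step for a response with ok == false.

    call_tool(name, args) is whatever function the agent runtime uses to
    invoke an MCP tool; it is injected so the logic stays testable.
    """
    stop = {"action": "stop", "say": recovery["userMessage"]}  # verbatim, no rewording
    if not recovery["retryable"]:
        return stop
    tool = recovery.get("recoveryTool")
    if tool is None:
        # Not covered by the contract; stopping is this sketch's choice.
        return stop
    call_tool(tool["name"], tool["args"])  # run the recovery step first
    # Re-attempt the original tool exactly once; no retry loop.
    return {"action": "retried", "result": call_tool(original_name, original_args)}
```

Retrying once and returning whatever comes back keeps the agent from looping: if the second attempt also fails, the caller receives a fresh recovery shape and applies the same rules again.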
8. Transports
This spec is transport-agnostic. Reference implementations support:
- HTTP MCP — JSON-RPC 2.0 over HTTPS POST, Bearer token auth. Recommended for hosted agent platforms (ChatGPT cloud agents, Claude API agents, Cursor cloud agents, custom server-side agents).
- stdio MCP — JSON-RPC 2.0 over stdio per the official Model Context Protocol spec. Recommended for desktop IDE runtimes.
Both transports MUST expose identical tool surfaces and response shapes. The contract is preserved across transports.
9. Reference implementations
ChiefLab ships six conformant operators:
| Operator | Primary tool | Function |
|---|---|---|
| chieflab-launch | chieflab_launch_product | Orchestrates a full launch: positioning + per-channel drafts + landing + email + measurement. Internally composes the other operators. |
| chieflab-post | chieflab_post | Single-channel publish. Accepts channel (linkedin / x / hn / reddit / product_hunt). Returns one draft + one publishAction. |
| chieflab-email | chieflab_send | Single email send via Resend. Returns one draft + one sendAction. |
| chieflab-measure | chieflab_measure | 24h readback for a runId. Pulls GA4, Search Console, Zernio engagement. Returns measurement asset + next-move recommendation. |
| chieflab-brain | chieflab_brain_read / chieflab_brain_record | Per-tenant memory: brand voice, repo facts, proof assets, channel performance. Drives cross-run grounding. |
| chieflab-connect | chieflab_connect_provider | OAuth flow for connectors (Zernio, Resend, GA4, Search Console, HubSpot). |
Operators outside ChiefLab MAY adopt the contract. We will list conformant third-party operators in .well-known/mcp.json on request.
10. Changelog
- v0.1 (2026-05-12, initial draft): First public version of the execution contract. Defines the six-stage lifecycle, approval URL spec, audit trail format, operator conformance criteria, error recovery shape, and six reference operators. Status: draft. Feedback welcome at hi@chieflab.io.
11. License
This specification is released under the MIT License. Anyone MAY implement it, fork it, extend it, or build operators against it without permission. We ask only that derivative specs cite this version and that conformant operators link back to chieflab.io/spec/v0.1 so the contract surface stays discoverable.