What's live. What's beta. What's planned.

Live items are verifiable. Beta items name their caveat. Planned items have a target window.

Updated 2026-05-15

Live — verifiable today

Passes a production smoke check or has verifiable artifact IDs.

Hosted MCP at chieflab.io/api/mcp

45 production tools focused on the launch-operator wedge. Speaks JSON-RPC 2.0 over HTTP transport, with discovery files at llms.txt, .well-known/mcp.json, and smithery.yaml. api.chieflab.io is the proxy-free Vercel-direct primary endpoint; chieflab.io/api/mcp is the brand-domain fallback.

proof: POST chieflab.io/api/mcp with tools/list — returns 45 tools (customer-smoke gates this number).
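As a rough sketch of the proof call above, the following builds a JSON-RPC 2.0 tools/list request and counts the tools in a response body. The https scheme, the Accept header, and the absence of an auth header are assumptions; the helper names are illustrative, and the request is built but not sent.

```python
import json
import urllib.request

# Endpoint named in this entry; the https scheme is an assumption.
MCP_URL = "https://chieflab.io/api/mcp"


def build_tools_list_request(request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 tools/list POST."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: the server may answer with plain JSON or an SSE stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


def count_tools(response_body: str) -> int:
    """Count tools in a result shaped like {"result": {"tools": [...]}}."""
    payload = json.loads(response_body)
    return len(payload["result"]["tools"])
```

Sending the request with urllib.request.urlopen and passing the decoded body to count_tools would give the number that the customer smoke check gates on, provided the endpoint replies with a single JSON document.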

chieflab_launch_product end-to-end loop

Plan → create → approve → publish → measure → recommend, in real artifacts. X publish via Zernio (real post 69f8a74fc50416d0f77f852e). Email via Resend (real messageId ba122a9c-2843-4d38-b998-a345990911dc). 24h post-launch review pulls Zernio + GA4 + Search Console and recommends next iteration.

proof: CLOSED_LOOP_VERIFIED.md in repo lists every artifact ID.

Signed reviewUrl + workspace Inbox at /app

Two human-control surfaces, one backend. Signed HMAC reviewUrl (no login, 7-day TTL) for the agent-shareable approval. Workspace Inbox for owner-side multi-run control. Approve / reject / inline feedback — rejection feedback flows into per-tenant memory.

proof: POST chieflab.io/api/sandbox/launch and click the returned reviewUrl.
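To make the signed-link idea concrete, here is a minimal sketch of an HMAC-signed URL with a 7-day expiry. It is not ChiefLab's actual scheme: the query parameter names (run, exp, sig), the signed message layout, and both function names are hypothetical.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

TTL_SECONDS = 7 * 24 * 60 * 60  # 7-day TTL, as stated in this entry


def sign_review_url(base_url: str, run_id: str, secret: bytes,
                    now: Optional[float] = None) -> str:
    """Return base_url with hypothetical run, exp, and sig query parameters."""
    issued = time.time() if now is None else now
    expires = int(issued + TTL_SECONDS)
    message = f"{run_id}.{expires}".encode("utf-8")
    sig = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{base_url}?{urlencode({'run': run_id, 'exp': expires, 'sig': sig})}"


def verify_review_url(url: str, secret: bytes, now: Optional[float] = None) -> bool:
    """Check the signature in constant time, then reject expired links."""
    query = parse_qs(urlparse(url).query)
    try:
        run_id = query["run"][0]
        expires = int(query["exp"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    message = f"{run_id}.{expires}".encode("utf-8")
    expected = hmac.new(secret, message, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected.encode("utf-8"), sig.encode("utf-8")):
        return False  # tampered link or wrong secret
    current = time.time() if now is None else now
    return current <= expires
```

Because the expiry is part of the signed message, extending a link's lifetime by editing exp invalidates the signature, which is what lets such a link work without a login.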

Live OAuth reads — GA4 + Search Console

Real OAuth flow, encrypted token storage, real fetch against analyticsdata.googleapis.com and webmasters/v3. Wired into chieflab_post_launch_review.

proof: Connect from /app/connections — read snapshot returns real data.

Approved publish — Zernio (social) + Resend (email)

Real per-platform publishing across LinkedIn / X / Threads / IG / FB / Bluesky / TikTok via Zernio. Real email send via Resend (mail.chieflab.io is verified; onboarding@resend.dev is the bootstrap fallback).

proof: Approve a publishAction or sendAction — the connector fires immediately.

Per-tenant brand context + voice memory

Inline brand: { audience, voice, pillars } threads into agent drafting prompts. Approved and rejected voice samples persist. Tenancy is the spine: every run, action, secret, and memory entry is scoped to workspace + tenant.

proof: chieflab_create_tenant + chieflab_set_tenant_context — second run uses the context.

Image generation (Gemini 2.5 Flash, brand-grounded)

Opt-in via imagesNeeded > 0. Three modes: brief / prompt for caller's image model (free, default), BYOK image gen (free), hosted Gemini ($0.04 / image). No surprise bills.

proof: Sandbox launch with imagesNeeded: 1 — returns a generated image URL.

8 install paths — verified configs

Cursor (one-click cursor:// deeplink), Claude Desktop, Codex, Lovable, Bolt, OpenClaw, Hermes, Base44. Direct HTTPS works for any agent that can fetch.

proof: Each config block is in /install/<runtime> and re-tested on every release.

50/50 customer-perspective smoke checks

npm run smoke:customer runs after every prod deploy and gates the Cloudflare Pages function bundle that proxies chieflab.io/api/* to the API. A documented prior regression makes this script load-bearing.

proof: Run scripts/customer-smoke.mjs — non-zero exit on any regression.

LaunchBench — public reproducible benchmark

Apache 2.0 benchmark in benchmark/. 20 real product URLs, 6-dimension rubric, deterministic LLM-judge prompt. Compares ChiefLab against raw Sonnet, raw GPT-5, and Sonnet with the chieflab-launch skill.

proof: cd benchmark && node run.mjs — leaderboard.md is the artifact.

P10 scaffolding — Multi-agent GTM orchestrator + 8 agents

Eight specialized GTM sub-agents under ChiefMO (launch / social / email / blog-seo / creative / analytics / experiment / approval). Typed handoffs, per-tenant shared memory, declared dependencies. Orchestrator routes; agents never call each other directly. 8 MCP tools: chiefmo_gtm_run_start + 7 read/write helpers. Public agents page at /agents. Architecture spec: docs/proposals/p10-multi-agent-gtm.md. NB: this row is the SCAFFOLD only; the dogfood + design-partner proof is in Beta below.

proof: node scripts/gtm-smoke.mjs — orchestrator end-to-end smoke green. node scripts/security-mismatch-smoke.mjs — workspace isolation contract on every GTM tool green (18 assertions).
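The routing rule described above (declared dependencies, typed handoffs, agents never calling each other) can be illustrated with a small sketch. The dependency map, the Handoff fields, and run_orchestrator are all hypothetical; only the agent names come from this page, and only four of the eight are shown.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Handoff:
    """A typed handoff record; the fields are illustrative."""
    agent: str
    summary: str


# Hypothetical declarations: agent -> agents whose handoffs it needs first.
DEPENDS_ON: Dict[str, List[str]] = {
    "launch": [],
    "social": ["launch"],
    "email": ["launch"],
    "approval": ["social", "email"],
}

Agent = Callable[[Dict[str, Handoff]], Handoff]


def run_orchestrator(agents: Dict[str, Agent]) -> List[str]:
    """Run agents in dependency order; only the orchestrator passes handoffs on."""
    memory: Dict[str, Handoff] = {}  # stand-in for per-tenant shared memory
    order: List[str] = []
    for name in TopologicalSorter(DEPENDS_ON).static_order():
        # Each agent sees only the handoffs it declared a dependency on.
        inputs = {dep: memory[dep] for dep in DEPENDS_ON[name]}
        memory[name] = agents[name](inputs)
        order.append(name)
    return order
```

Keeping the dependency map in one place means the run order can be derived rather than hard-coded, and an agent that tries to read an undeclared handoff simply never receives it.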

Stripe webhook (subscription state sync) — closes P7 revenue gap

POST /webhooks/stripe verifies the Stripe-Signature header (HMAC-SHA256, 5-min replay tolerance) and applies customer.subscription.{created,updated,deleted} + invoice.paid to chieflab_workspace_owners. Idempotent via chieflab_stripe_events. STRIPE_WEBHOOK_SECRET env required; STRIPE_PRICE_TO_PLAN env maps price ids to plan names.

proof: node scripts/stripe-webhook-smoke.mjs — 7/7 green (signature missing/malformed/stale/wrong-secret all 400; valid sig 200; replay 200; invoice.paid 200).
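For readers unfamiliar with the check, here is a minimal sketch of Stripe-Signature verification following Stripe's documented scheme: the header carries t=<timestamp> and one or more v1=<hex> values, and the signed payload is the timestamp, a dot, and the raw request body. The function name and the boolean return are illustrative; this is not the code behind /webhooks/stripe.

```python
import hashlib
import hmac
import time
from typing import List, Optional

TOLERANCE_SECONDS = 5 * 60  # 5-min replay tolerance, as stated in this entry


def verify_stripe_signature(payload: bytes, header: str, secret: str,
                            now: Optional[float] = None) -> bool:
    """Return True only for a well-formed, fresh, correctly signed header."""
    timestamp: Optional[str] = None
    candidates: List[str] = []
    for part in header.split(","):
        key, _, value = part.strip().partition("=")
        if key == "t":
            timestamp = value
        elif key == "v1":
            candidates.append(value)
    if timestamp is None or not candidates:
        return False  # missing header fields
    if not (timestamp.isascii() and timestamp.isdigit()):
        return False  # malformed timestamp
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > TOLERANCE_SECONDS:
        return False  # stale timestamp
    signed = timestamp.encode("ascii") + b"." + payload
    expected = hmac.new(secret.encode("utf-8"), signed, hashlib.sha256).hexdigest()
    # Constant-time comparison against every v1 candidate in the header.
    return any(
        hmac.compare_digest(expected.encode("ascii"), c.encode("utf-8"))
        for c in candidates
    )
```

In a handler, a False result would map to the 400 cases listed in the smoke output. The replay case returning 200 is presumably handled one step later by the chieflab_stripe_events idempotency record rather than by the signature check.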

P21 — Cold-start usability sweep

Strategic fix: the manualFallback default now surfaces paste-ready briefs inline when zero connectors are wired (previously a missing_connector error). Adds a brand-scrape pre-pass and an anti-fabrication rail (3 stacked rules: no refusals, no fabrication, honest degradation with _inferred_ markers). chieflab_help is the first-contact tool, and _meta.category on all tools lets cold agents discover the primary tools first. (The earlier experimental.progress capability was removed when the official MCP SDK rejected unknown experimental fields; see /status for the current initialize check.) chieflab_redraft handles multi-turn revision (two-call pattern: brief → render → commit). Smart channel narrowing on thin context drafts 1 channel instead of 5-7 when the scrape is thin AND repoContext is light.

proof: Cold-stranger smoke 26/26 PASS post-deploy (scripts/cold-stranger-smoke.mjs). Spec ↔ implementation parity green (12/12 reason codes match). Drafts verified honest: zero fabrications, _inferred_ markers applied, no brand conflation.

P22.1 — BYOK + cost transparency

Provider-key vault: chieflab_set_provider_key / chieflab_list_provider_keys / chieflab_revoke_provider_key. Supported providers: gemini, resend, zernio, anthropic, openai. costEstimate on every chieflab_launch_product response with per-provider source (byok | hosted | no_key | n/a) + per-provider USD. Per-workspace daily cost cap (CHIEFLAB_DAILY_COST_CAP_USD, default $5) prevents runaway hosted bills.

proof: BYOK lifecycle verified live: set Gemini key → costEstimate.gemini.source: byok → revoke → costEstimate.gemini.source: hosted. Migration trail at supabase/migrations/202605120{100,300}_connector_secrets_*.sql.
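The cap arithmetic can be sketched as follows. The source values, the $5 default, and the $0.04 hosted image price come from this page; the ProviderCost shape, the cents-based accounting, and the assumption that only hosted spend counts toward the cap are illustrative guesses rather than the real costEstimate schema.

```python
from dataclasses import dataclass
from typing import Dict

DAILY_CAP_CENTS = 500    # default $5 cap (CHIEFLAB_DAILY_COST_CAP_USD)
HOSTED_IMAGE_CENTS = 4   # hosted Gemini price of $0.04 per image


@dataclass(frozen=True)
class ProviderCost:
    source: str  # one of: byok | hosted | no_key | n/a
    cents: int   # estimated spend in cents (hypothetical field)


def hosted_total_cents(estimate: Dict[str, ProviderCost]) -> int:
    """Sum hosted spend only; assumes BYOK usage is billed by the provider."""
    return sum(cost.cents for cost in estimate.values() if cost.source == "hosted")


def within_daily_cap(spent_today_cents: int, estimate: Dict[str, ProviderCost],
                     cap_cents: int = DAILY_CAP_CENTS) -> bool:
    """True if running this launch keeps the workspace at or under its cap."""
    return spent_today_cents + hosted_total_cents(estimate) <= cap_cents


# With nothing else spending, the default cap covers 125 hosted images a day.
MAX_HOSTED_IMAGES_PER_DAY = DAILY_CAP_CENTS // HOSTED_IMAGE_CENTS
```

Working in integer cents avoids floating-point drift when many small charges are summed against the cap.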

P23.1 — Reply loop (chieflab-reply operator)

Inbound: chieflab_record_engagement (push from agent or webhook) → chieflab_inbox → chieflab_draft_reply (brain voice + launch context + anti-template rules) → approval gate → chiefmo_send_reply (channel adapter routes to Zernio for social / Resend for email). Brain integration: every approved reply becomes a voice sample. Same architecture for the outbound counterpart.

proof: 5 new tools (chieflab_record_engagement / chieflab_inbox / chieflab_draft_reply / chiefmo_send_reply / chieflab_dismiss_engagement) live on chieflab.io/api/mcp. Migration: supabase/migrations/202605120200_engagement_events.sql.

P24.1 — Outbound operator (chieflab-outbound)

Cold prospecting: chieflab_record_prospect → chieflab_outbound_inbox → chieflab_draft_outbound (brain voice + product scrape + anti-canned-opener rules + ≤120w cold-email format) → approval gate → chiefmo_send_outbound (Resend adapter; LinkedIn InMail / Twitter DM stubbed for later). When the prospect replies, the chieflab-reply loop catches it via chieflab_record_engagement. Closed selling loop end-to-end.

proof: Full outbound loop verified live with Resend test address delivered@resend.dev: 6/7 steps green; step 5 (real Resend send) requires workspace BYOK Resend key (config gap, not code). Migration: supabase/migrations/202605120400_outbound_prospects.sql.

P25.1 — Security audits + namespace cleanup + status/brain aggregators

Approval-gate adversarial audit (scripts/approval-gate-bypass-test.mjs): 4 executors × 7 bypass attempts — 0 breaches. reviewUrl HMAC adversarial audit (scripts/review-url-hmac-test.mjs): 8 attacks — 0 breaches. chieflab_* surface promoted to primary; legacy chiefmo_* names demoted to category=legacy (back-compat retained). chieflab_status: single aggregator returning pending approvals + ripe measurements + new engagements + queued prospects + recent launches + today's hosted spend. chieflab_brain_summary: plain-English narrative of what the workspace brain has learned (the moat made visible).

proof: scripts/approval-gate-bypass-test.mjs and scripts/review-url-hmac-test.mjs both exit 0. Audits at docs/APPROVAL_BYPASS_AUDIT_2026-05-12.md and docs/REVIEW_URL_HMAC_AUDIT_2026-05-12.md.

Beta — works, but with caveats

Ships today with a known caveat.

Repo-aware launch context

API shape designed in docs/proposals/repo-aware-mvp.md. Caller passes routes, recent commits, README excerpts; ChiefLab uses them in drafting prompts. Currently behind a flag while we evaluate against the repo-aware fixture set in benchmark/fixtures/. Targeting general availability after the first 5 design partners run on real repos.

caveat: Today, repo context flows through, but the eval signal against a plain-agent baseline is still being collected.

HubSpot connector (read)

OAuth completes and the token is captured, but the encrypt-and-persist step and the getDealsForWorkspace read function are still TODO. ChiefSales runs return mock-shaped data until the read path lands.

caveat: Hidden from /app/connections behind HUBSPOT_READS_LIVE flag — listed as 'Beta' there once unhidden.

Stripe billing (paid plans actually charging)

Webhook handler is shipped + smoke-gated (see Live above). Plan column flips on customer.subscription.{created,updated,deleted}; invoice.paid stamps last_invoice_paid_at. What's still required to actually charge a customer: (1) STRIPE_WEBHOOK_SECRET env on Vercel; (2) Stripe products/prices created in the dashboard; (3) STRIPE_PRICE_TO_PLAN env (JSON map); (4) the 202605070400 migration applied. All 4 are dashboard/env tasks, not code work.

caveat: Until steps (1)–(4) are completed, there are still zero paid customers. Code is no longer the blocker.
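Step (3) is the least obvious of the four, so here is a sketch of how a JSON price-to-plan map might be read and applied to a subscription event. The price IDs are placeholders and both function names are hypothetical; the items.data[].price.id path follows Stripe's subscription object, and the plan name 'starter' is the one this page uses.

```python
import json
import os
from typing import Dict, Mapping, Optional


def load_price_to_plan(env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Parse STRIPE_PRICE_TO_PLAN, expected to be a JSON object of price id -> plan."""
    source = os.environ if env is None else env
    mapping = json.loads(source.get("STRIPE_PRICE_TO_PLAN", "{}"))
    if not isinstance(mapping, dict):
        raise ValueError("STRIPE_PRICE_TO_PLAN must be a JSON object")
    return {str(price_id): str(plan) for price_id, plan in mapping.items()}


def plan_for_event(event: dict, price_to_plan: Mapping[str, str]) -> Optional[str]:
    """Resolve the plan for a customer.subscription.* event; None if unmapped."""
    for item in event["data"]["object"]["items"]["data"]:
        plan = price_to_plan.get(item["price"]["id"])
        if plan is not None:
            return plan
    return None
```

An unmapped price returns None here, so a handler could leave the plan column untouched instead of guessing.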

Operator-daily cron

Route + schedule + queue table all shipped (vercel.json schedules 23 8 * * *, which is daily at 08:23 UTC). Skips with 'env not set' until CHIEFLAB_OPERATOR_WORKSPACE is configured.

caveat: Toggle is one env var away from firing; first run after setup populates the queue.
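To read the schedule above: the cron fields are minute, hour, day of month, month, and day of week, so 23 8 * * * fires once a day at 08:23, and Vercel evaluates cron expressions in UTC. A small sketch for working out the next firing time follows; the function name is illustrative.

```python
from datetime import datetime, timedelta, timezone


def next_daily_run(now: datetime, hour: int = 8, minute: int = 23) -> datetime:
    """Next firing of a daily 'minute hour * * *' schedule, evaluated in UTC.

    Pass a timezone-aware datetime; it is converted to UTC before comparing.
    """
    now_utc = now.astimezone(timezone.utc)
    candidate = now_utc.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if candidate <= now_utc:
        # Today's slot has already passed (or is exactly now), so roll forward.
        candidate += timedelta(days=1)
    return candidate
```

Calling next_daily_run(datetime.now(timezone.utc)) after setting the env var gives the earliest time the queue could be populated.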

Workspace claiming (agent-first → human Google sign-in)

An agent that mints a key gets a workspace; the design for letting the human later attach their Google account to the same workspace is in docs/proposals/workspace-claiming.md.

caveat: Today: agent-first signup works; the human-claim handoff is the next ship.

Dashboard analytics

/app/usage exists; ingestion is per-workspace metered. Cloudflare Web Analytics is the planned host-level analytics for chieflab.io itself (no signup, native to Pages).

caveat: Analytics on chieflab.io is not yet enabled.

P10 — GTM command center at /app/gtm

Six panes: active runs, agent kanban (one column per agent), approvals queue, scheduled work, assets timeline, results. Reads from /workspace/gtm/runs (list, paginated) and /workspace/gtm/runs/:id (full agent_runs + handoffs timeline) — both wired in apps/api/src/server.js as of 2026-05-08.

caveat: Page renders the empty-state copy until a workspace fires its first chiefmo_gtm_run_start AND the GTM migration (202605070300) is applied to prod. Before the migration, the orchestrator falls back to in-memory state, so the dashboard sees 0 rows.

P10 — Zero dogfood runs, zero design partners (the wedge isn't proven yet)

The orchestrator works in smokes; no real launch has been run through it yet, on chieflab.io or any partner repo. Until that happens, the multi-agent claim is architecture, not validated product. Outreach templates ready in docs/DESIGN_PARTNER_LOOP.md; first-run script ready at scripts/dogfood-gtm.mjs.

caveat: Honest framing for any investor / design-partner conversation: 'orchestrator + 8 agents are scaffolded with typed handoffs; first real-world run is the next ship.' Not 'multi-agent GTM team is live.'

P10 agents — all 8 emit grounded briefs; final rendering is caller's LLM (outputMode: context)

Live behavior: launch (positioning + channel picker heuristic), social (per-platform briefs → Zernio publish on approval), email (per-segment briefs → Resend send on approval), blog-seo (multi-source target-query picker reading positioning + repoContext + recent commits, de-duped against prior workspace targets), creative (Gemini image-gen opt-in), analytics (real GA4 + SC reads), experiment (3 hypotheses grounded in metrics movers + P9 brain channel.performance, skips top hypothesis if it matches prior run's recommendation), approval (existing approval-pack skill + reviewUrl flow).

caveat: By design ChiefLab does NOT pay for the writing — the caller's LLM renders the final copy from each brief. The next ship is wiring the optional outputMode: 'full' path through providers.modelRouter for callers who don't have an LLM in front of them (e.g. stand-alone CLI flows).

Planned — next 30 days

If not shipped by 2026-06-05, moves to the 60-day bucket with a note.

5–10 design partners running on real repos

Recruit + onboard. Weekly feedback loop, per-partner case studies. Outreach templates in docs/DESIGN_PARTNER_LOOP.md.

Repo-aware GA + first eval results

Run benchmark/fixtures against ChiefLab vs raw-Sonnet, publish leaderboard.md, link from this page.

MCP registry listings

Submit to Smithery, Cursor MCP catalog, Anthropic MCP catalog after eval signal exists. Submission packets in docs/MCP_REGISTRY_SUBMISSION.md (currently being rewritten with accurate connector claims).

Resend domain verification + custom 'from'

mail.chieflab.io DKIM/SPF/DMARC pending DNS — drop the onboarding@resend.dev fallback once verified.

P10 — blog-seo + experiment agents to real model calls

Migration is applied; REST endpoints are wired; orchestrator and 6 of 8 agents call real providers (launch heuristic, social/Zernio, email/Resend, creative/Gemini, analytics/GA4+SC, approval). blog-seo and experiment still emit valid frame-stub handoffs in mock mode — next ship wires them through the existing skill registry + modelRouter so the multi-agent loop is fully provider-backed end-to-end.

P10 — Run first multi-agent GTM run on chieflab.io itself

Invoke chiefmo_gtm_run_start({ goal: 'Launch the GTM operating system to existing ChiefLab users', productUrl: 'chieflab.io' }) — capture handoffs, verify orchestrator picker accuracy, screenshot the command center for the deck.
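Over MCP, that invocation would travel as a JSON-RPC 2.0 tools/call request. The sketch below builds the envelope with the two arguments shown above; the helper name is illustrative, and whether the tool accepts further arguments is not something this page states.

```python
import json


def build_tools_call(name: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 tools/call request for an MCP server."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# The dogfood run described above, expressed as a request body.
DOGFOOD_REQUEST = build_tools_call(
    "chiefmo_gtm_run_start",
    {
        "goal": "Launch the GTM operating system to existing ChiefLab users",
        "productUrl": "chieflab.io",
    },
)
```

The resulting string is what a client would POST to the MCP endpoint; capturing the response alongside it gives a reproducible record of the first run.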

Planned — next 60 days

HubSpot read path → ship token-persist + getDealsForWorkspace

Move HubSpot from 'Beta' to 'Live' on /trust. Unhide in /app/connections.

Activate Stripe webhook in prod (env + dashboard)

Code is live (see Live above). Remaining: set STRIPE_WEBHOOK_SECRET env on Vercel, create Starter $39/mo + Pro $149/mo prices in Stripe dashboard, register the webhook URL https://api.chieflab.io/webhooks/stripe, paste the price IDs into STRIPE_PRICE_TO_PLAN env, apply migration 202605070400. After that, the first real subscription flips a workspace to plan='starter' automatically.

Self-serve data delete in /app/settings

Today: delete by emailing hi@chieflab.io. Goal: one-click workspace + tenant delete with 7-day soft-delete window.

3 launch case studies published

Repo context → launch pack → approval → publish/send → measured 24h result. Template in marketing/case-study-template.md.

Planned — next 90 days (driven by demand)

Ships only if design-partner pull justifies it.

ChiefSales as second operator (driven by demand)

Only after the launch operator has signal from design partners. HubSpot real OAuth + a real CRM-writes path. Not before.

Python SDK

TypeScript SDK first. Python second, when we see Python-runtime callers in the wild.

Public Discord + community

ChiefLab needs its own Discord server for support + community. ~10 min to spin up; deferred until after design partners are running.

Not on the roadmap — deliberately

Considered and ruled out.

ChiefSupport / ChiefFI / ChiefOps as launch operators

Cut from the wedge. Tool definitions exist, zero connectors wired. Will revisit only after launch has clear pull.

Self-host story / OSS runtime

Hosted runtime is the product. If you need self-hosting, email hi@chieflab.io, but we won't promise OSS we can't support.

Per-seat SaaS pricing

Usage-based or workspace-flat. Not per-seat. The buyer is a developer, not a 50-seat marketing team.

Want to push something up the priority list?

Design partners shape this roadmap directly. If you'd run ChiefLab on a real repo and give weekly feedback, email hi@chieflab.io.