Principles

How I build production AI agents.

Opinionated rules from four years of shipping agentic systems: pyAGI (acquired by AGI House), agent-audit-kit, agent-airlock, mnemo, and production work at Attri.ai. This is a living document, revised whenever the world contradicts a rule.

  1. Audit-trail first. If you can't replay it, you don't own it.

    Every agent run needs to be reconstructable from logs alone. No replay → no debugging → no audit → no production. The audit-emit channel ships before the feature, not after.

  2. Capability leases over long-lived keys.

    Long-lived secrets are the SQL injection of the agent era. Every tool call gets a short-lived, scoped, revocable lease. Okta NHI, Cloudflare Mesh, and Cisco Agentic Workforce Identity all converge here for a reason.

  3. Deny by default on tool calls. Allow-listing is the only safe surface.

    An agent that can call any function is an agent that will call the wrong function. Tool inventories shrink, not grow, as you mature. Every new tool is a new attack surface — justify each addition.

  4. Eval harnesses ship before features.

    Feature-without-eval is technical debt with a deadline. Golden sets are written first; the feature is just whatever passes them. The team that ships evals last ships rollbacks first.

  5. MCP STDIO is untrusted. Sandbox or die.

    CVE-2026-30623 confirmed it: STDIO MCP is RCE-by-design. Every connector — first-party or community — is treated as untrusted. E2B / Firecracker / gVisor sandboxes are non-negotiable. No exceptions for vendors you trust this week.

  6. Cost router > model snobbery.

    Simple problems get simple models. The cascading router pattern (small model first, escalate on uncertainty) cuts production cost 60-80% with zero quality loss on the long tail. The model fanboy is the cost center.

  7. Vendor-multi by default. Substrate deals shift.

    Microsoft–OpenAI restructured in April 2026; the AGI clause is gone. Anyone who standardized on one frontier vendor is one announcement away from a forced migration. Your routing layer should outlive any single vendor's term sheet.

  8. Ship the changelog publicly. Build-in-public is the cheapest distribution.

    A dated, public build log earns more inbound than the polished landing page it points to. The VAJRA build log is more valuable than the VAJRA homepage — because the log is evidence.

  9. Reproducibility is a feature, not a chore.

    If the run is non-deterministic, the audit is fiction. Seed everything that can be seeded. Pin model versions in the contract. "It worked yesterday" is the saddest sentence in agent operations.

  10. Acquisitions die. The patterns survive.

    pyAGI was acquired in 2022. The task-create / execute / prioritize / replan loop it embodied has shipped in a hundred frameworks since. Write down what survived the acquisition — that's the thing that compounds.

  11. Less, sharper, louder.

    Five weak repos cost more attention than two strong ones earn. Five middling Gumroad products earn less than one serialized book. Six competing CTAs earn less than one obvious next step. The strongest move in a noisy market is silence on everything else.

  12. Naming a thing is half of owning it.

    "Context Engineering" (Lance Martin). "AI Evals" (Hamel Husain). "Antibrittle Agents" (Hrishi Olickel). None of these existed before someone wrote three essays. The category-shaper is whoever defines the vocabulary, not whoever ships the most code.

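A minimal sketch of rule 1 (audit-trail first), assuming an in-memory, append-only log. `AuditLog`, `emit`, and `replay` are hypothetical names for illustration, not the API of agent-audit-kit or any other library; a production channel would write to durable storage.

```python
import json
from dataclasses import dataclass, field


@dataclass
class AuditLog:
    """Append-only audit channel (hypothetical; in-memory for the demo)."""
    lines: list[str] = field(default_factory=list)

    def emit(self, run_id: str, kind: str, payload: dict) -> None:
        # Serialize at emit time so later mutation of `payload`
        # cannot rewrite what was recorded.
        event = {"seq": len(self.lines), "run_id": run_id,
                 "kind": kind, "payload": payload}
        self.lines.append(json.dumps(event, sort_keys=True))


def replay(lines: list[str], run_id: str) -> list[tuple[str, dict]]:
    """Reconstruct one run from the serialized log alone, in emit order."""
    events = [json.loads(line) for line in lines]
    events = [e for e in events if e["run_id"] == run_id]
    events.sort(key=lambda e: e["seq"])
    return [(e["kind"], e["payload"]) for e in events]


log = AuditLog()
log.emit("run-1", "prompt", {"text": "summarize ticket 42"})
log.emit("run-1", "tool_call", {"tool": "search", "args": {"q": "ticket 42"}})
log.emit("run-2", "prompt", {"text": "unrelated run"})
log.emit("run-1", "tool_result", {"tool": "search", "hits": 3})

steps = replay(log.lines, "run-1")
```

`replay` reads only the serialized lines, which is the point: if a step cannot be rebuilt from them, it was never logged.
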
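Rule 2 (capability leases) could look roughly like this. `Lease` and `LeaseBroker` are hypothetical, in-process stand-ins; a real broker would sit behind an identity provider and issue signed tokens. The clock is injected so expiry can be shown without sleeping.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Lease:
    token: str
    tool: str          # the single tool this lease covers
    expires_at: float  # absolute time in the broker's clock units


class LeaseBroker:
    """Hypothetical broker: grants, validates, and revokes leases."""

    def __init__(self, clock):
        self._clock = clock   # injected so tests control time
        self._revoked = set()

    def grant(self, tool: str, ttl_seconds: float) -> Lease:
        return Lease(secrets.token_hex(16), tool, self._clock() + ttl_seconds)

    def revoke(self, lease: Lease) -> None:
        self._revoked.add(lease.token)

    def is_valid(self, lease: Lease, tool: str) -> bool:
        # Valid only if in scope, unexpired, and not revoked.
        return (
            lease.tool == tool
            and self._clock() < lease.expires_at
            and lease.token not in self._revoked
        )


now = [1000.0]  # fake clock for the demo
broker = LeaseBroker(clock=lambda: now[0])

lease = broker.grant("read_calendar", ttl_seconds=30)
in_scope = broker.is_valid(lease, "read_calendar")
out_of_scope = broker.is_valid(lease, "send_email")

now[0] += 31  # advance past the TTL
after_expiry = broker.is_valid(lease, "read_calendar")

second = broker.grant("read_calendar", ttl_seconds=30)
broker.revoke(second)
after_revoke = broker.is_valid(second, "read_calendar")
```

Each of the three properties in the rule (short-lived, scoped, revocable) maps to one condition in `is_valid`.
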
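For rule 3 (deny by default), one possible shape is a gate that only dispatches to tools registered up front. `ToolGate` and `ToolNotAllowed` are hypothetical names, and `get_weather` is a stub.

```python
class ToolNotAllowed(PermissionError):
    """Raised when the agent asks for a tool that is not on the allow-list."""


class ToolGate:
    def __init__(self, allowed: dict):
        # name -> callable; copied so the list cannot be widened from outside
        self._allowed = dict(allowed)

    def call(self, name: str, **kwargs):
        tool = self._allowed.get(name)
        if tool is None:
            # Anything not explicitly registered is refused.
            raise ToolNotAllowed(f"tool {name!r} is not on the allow-list")
        return tool(**kwargs)


def get_weather(city: str) -> str:
    return f"weather for {city}: unknown (stub)"


gate = ToolGate({"get_weather": get_weather})
ok = gate.call("get_weather", city="Pune")

try:
    gate.call("delete_database")
    denied = False
except ToolNotAllowed:
    denied = True
```

Adding a tool means editing the dictionary passed to `ToolGate`, so every addition shows up in review.
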
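Rule 4 (evals before features) can be sketched as a golden set plus a runner that reports a pass rate. The cases, `run_evals`, and `candidate_feature` are illustrative assumptions; exact string match is the simplest possible grader and most real evals need something richer.

```python
import operator

# Golden set written before the feature exists (illustrative cases).
GOLDEN_SET = [
    {"input": "2 + 2", "expected": "4"},
    {"input": "10 / 4", "expected": "2.5"},
    {"input": "3 * 7", "expected": "21"},
]


def run_evals(feature, golden_set):
    """Run every case; return (pass_rate, failures)."""
    failures = []
    for case in golden_set:
        try:
            actual = feature(case["input"])
        except Exception as exc:  # a crash counts as a failed case
            actual = f"<error: {exc}>"
        if actual != case["expected"]:
            failures.append({**case, "actual": actual})
    pass_rate = (len(golden_set) - len(failures)) / len(golden_set)
    return pass_rate, failures


OPS = {"+": operator.add, "*": operator.mul, "/": operator.truediv}


def candidate_feature(expression: str) -> str:
    """The feature under test: a tiny 'a op b' calculator."""
    left, op, right = expression.split()
    return f"{OPS[op](float(left), float(right)):g}"


pass_rate, failures = run_evals(candidate_feature, GOLDEN_SET)
broken_rate, broken_failures = run_evals(lambda _: "4", GOLDEN_SET)
```

A release gate would then be a single comparison, for example refusing to ship while `pass_rate` is below an agreed threshold.
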
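The cascading router in rule 6 might be sketched as below. The tiers, costs, and the 0.8 threshold are made-up numbers, and the models are stubs; how a real model's confidence is estimated is assumed away here.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Answer:
    text: str
    confidence: float  # assumed to be in [0, 1]


@dataclass
class Tier:
    name: str
    cost: float  # illustrative cost units per call
    model: Callable[[str], Answer]


def route(prompt: str, tiers: list, threshold: float = 0.8):
    """Try tiers cheapest-first and stop at the first confident answer.

    Returns (answer, tier_name, total_cost). The last tier's answer is
    returned even if it is below the threshold.
    """
    if not tiers:
        raise ValueError("at least one tier is required")
    spent = 0.0
    for tier in tiers:
        answer = tier.model(prompt)
        spent += tier.cost
        if answer.confidence >= threshold:
            break
    return answer, tier.name, spent


def small_model(prompt: str) -> Answer:
    # Stub: confident only on short prompts.
    sure = len(prompt.split()) <= 5
    return Answer("small: " + prompt, 0.95 if sure else 0.3)


def large_model(prompt: str) -> Answer:
    return Answer("large: " + prompt, 0.9)


tiers = [Tier("small", 1.0, small_model), Tier("large", 20.0, large_model)]
easy = route("capital of France?", tiers)
hard = route("draft a migration plan for a sharded multi-region database", tiers)
```

Note that an escalated request pays for both calls, so the savings depend on how often the small tier is confident.
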
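For rule 7 (vendor-multi by default), a routing layer can hide vendors behind one interface and fail over in order. `Provider`, `Router`, and `StubVendor` are hypothetical; real adapters would wrap each vendor's SDK behind the same `complete` method.

```python
from typing import Protocol


class ProviderUnavailable(RuntimeError):
    """Raised when a vendor cannot serve the request."""


class Provider(Protocol):
    name: str

    def complete(self, prompt: str) -> str: ...


class Router:
    """Calls providers in preference order, failing over on unavailability."""

    def __init__(self, providers):
        self._providers = list(providers)

    def complete(self, prompt: str):
        errors = []
        for provider in self._providers:
            try:
                return provider.name, provider.complete(prompt)
            except ProviderUnavailable as exc:
                errors.append(f"{provider.name}: {exc}")
        raise ProviderUnavailable("; ".join(errors) or "no providers configured")


class StubVendor:
    def __init__(self, name: str, healthy: bool):
        self.name = name
        self.healthy = healthy

    def complete(self, prompt: str) -> str:
        if not self.healthy:
            raise ProviderUnavailable("contract ended")
        return f"[{self.name}] {prompt}"


router = Router([StubVendor("vendor-a", healthy=False),
                 StubVendor("vendor-b", healthy=True)])
used, text = router.complete("hello")

try:
    Router([StubVendor("vendor-a", healthy=False)]).complete("hello")
    all_down_raised = False
except ProviderUnavailable:
    all_down_raised = True
```

Application code depends only on `Router`, so dropping a vendor is a change to the provider list rather than a migration.
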
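Rule 9 (reproducibility) in miniature: seed a local random generator, record the pinned configuration in a run manifest, and fingerprint the result. `run_agent` and the model name are hypothetical, and the random choices stand in for sampling. This only controls local randomness; a hosted model may still vary between calls even with a fixed seed.

```python
import hashlib
import json
import random


def run_agent(config: dict) -> dict:
    # A local RNG avoids hidden global state shared with other code.
    rng = random.Random(config["seed"])
    # Stand-in for the agent's sampled decisions.
    steps = [rng.choice(["search", "summarize", "ask_user"]) for _ in range(5)]
    manifest = {"config": dict(config), "steps": steps}
    # Fingerprint covers both the pinned config and the observed steps.
    digest = hashlib.sha256(json.dumps(manifest, sort_keys=True).encode())
    manifest["fingerprint"] = digest.hexdigest()
    return manifest


config = {
    "seed": 1234,
    "model": "example-model-2026-01-01",  # hypothetical pinned version
    "temperature": 0.0,
}
first = run_agent(config)
second = run_agent(config)
other = run_agent({**config, "seed": 99})
```

Two runs with the same manifest fingerprint can be compared step by step; a differing fingerprint says something in the contract or the behavior changed.
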
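The task-create / execute / prioritize / replan loop named in rule 10 has roughly this generic shape. This is not pyAGI's code; `run_loop`, `execute`, and `replan` are hypothetical, and the stubs replace what would be model calls.

```python
import heapq


def run_loop(objective: str, execute, replan, max_steps: int = 10):
    """Pop the most urgent task, run it, and queue any follow-up tasks."""
    # Entries are (priority, tiebreak, task); lower priority runs first.
    queue = [(0, 0, objective)]
    counter = 1
    done = []
    while queue and len(done) < max_steps:
        _, _, task = heapq.heappop(queue)                # prioritize
        result = execute(task)                           # execute
        done.append((task, result))
        for priority, new_task in replan(task, result):  # create / replan
            heapq.heappush(queue, (priority, counter, new_task))
            counter += 1
    return done


def execute(task: str) -> str:
    return f"done: {task}"


def replan(task: str, result: str):
    # Stub planner: only the top-level objective spawns follow-ups.
    if task == "ship feature":
        return [(2, "write docs"), (1, "write evals")]
    return []


history = run_loop("ship feature", execute, replan)
order = [task for task, _ in history]
```

The `max_steps` bound matters in practice, because a planner that keeps creating tasks would otherwise never terminate.
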
Last revised 2026-05-23. Disagree with one of these? Tell me why.