Case Studies

Agentic & GenAI Systems

I ship signed-capability-lease primitives for agent runtimes — the missing security contract under MCP. Production deployments of multi-agent AI systems, voice-first platforms, and enterprise GenAI solutions with 99.9% uptime.

Multi-Agent Orchestration
LLM Integration
Voice AI
Production Scale
99.9% · Platform Uptime
15+ · AI Agents
70% · Cost Reduction
24/7 · Availability
Flagship · Platform Engineering

Built the Org-Wide CI/CD Floor — 200-Repo Audit → Shipped Kit

208 active repos
Platform Tech Lead — design, build, operate

Ran an org-wide CI/CD audit across all active repositories at Attri.ai. The initial pass surfaced 1,135 CRITICAL findings across 10 audit dimensions: 81% of repos had no branch protection; 72 repos had verified live secret leaks (332 OpenAI keys + 88 AWS access tokens + 30 GitHub PATs in git history); 0% had GitHub secret scanning or Dependabot security updates enabled; and 48% of merged PRs had zero review records. Then designed and shipped `attri-dev-kit`: semver-versioned (current v1.7.0), self-testing, enabled with a one-line YAML opt-in, and language-aware across Python, TypeScript, Rust, Terraform, Shell, and C#.

Outcomes

  • Org-wide adoption rolling out across 208 repos via a single repository-rollout tracker
  • AI-specific guardrails: hallucinated-import detection, swallowed-exception flags across 5 languages, unjustified-lint-disable hard-blocks, AI-author signature failure escalation, test-delta gate on production code
  • Self-gating: the kit runs against itself on every PR — we don't ship a version that fails its own check
  • Framing: 'AI as a risk-multiplier' — pre-AI the cost of a careless commit was bounded by typing speed; post-AI a frontier model produces 200 lines of plausible-but-wrong code in 30 seconds. The kit absorbs the multiplier so individual engineers don't have to remember to defend against it
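To make two of the AI-specific guardrails concrete, here is a minimal Python-only sketch of hallucinated-import detection and swallowed-exception flagging. It is an illustration built on the standard `ast` and `importlib` modules, not the kit's actual implementation; the function names are hypothetical, and "hallucinated" is approximated here as "cannot be resolved in the current environment".

```python
import ast
import importlib.util


def find_unresolvable_imports(source: str) -> list[str]:
    """Return top-level modules imported by `source` that cannot be found locally."""
    missing: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        for name in names:
            top_level = name.split(".")[0]
            if top_level not in missing and importlib.util.find_spec(top_level) is None:
                missing.append(top_level)
    return missing


def find_swallowed_exceptions(source: str) -> list[int]:
    """Return line numbers of `except` handlers whose body is only `pass` or `...`."""
    flagged: list[int] = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        only_noops = all(
            isinstance(stmt, ast.Pass)
            or (
                isinstance(stmt, ast.Expr)
                and isinstance(stmt.value, ast.Constant)
                and stmt.value.value is Ellipsis
            )
            for stmt in node.body
        )
        if only_noops:
            flagged.append(node.lineno)
    return sorted(flagged)
```

In a CI job, a non-empty result from either function would fail the check; resolving imports against the project's declared dependencies rather than the runner's environment would be the stricter variant.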
GitHub Actions
Reusable Workflows
Composite Actions
Trufflehog
Gitleaks
CycloneDX SBOM
Sigstore
Customer Engagements · 17 months

Six anonymized engagements across six verticals.

Industry vertical · role · scope · outcome. Client names withheld under contractual confidentiality. Every number, scope item, and outcome below is sourced from internal evidence (commit history, PR review counts, authored docs, calendar collaborator graph).

US Commercial Insurance
50-state regulated carrier

50-State Production API Integration + MS Graph OAuth2 Migration

Tech Lead — end-to-end ownership

Owned the integration of a regulated 50-state premium / tax / coverage-type API across all US jurisdictions for a commercial insurance carrier. Drove the cutover from legacy SMTP to Microsoft Graph + OAuth2 client-credentials for compliance-sensitive transactional notifications. Diagnosed and resolved a vendor auth ambiguity (Secret-ID vs Secret-Value confusion, AADSTS7000215) in one business day — validated token endpoint (200 OK) + Graph sendMail (202 Accepted) production-ready before EOD.

  • All 50-state production integrations live with state-by-state tax + stamping fee validation
  • MS Graph migration completed without a customer-visible incident
  • Authored a long-term stabilization memo to leadership identifying team rigidity and missing automation ownership; the proposed Centralized Source of Truth + Issue Tracking was adopted
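The Secret-ID vs Secret-Value mix-up behind AADSTS7000215 can be caught before a request leaves the service. A minimal sketch, assuming the Microsoft identity platform v2.0 token endpoint and the `https://graph.microsoft.com/.default` scope; the function names and the "a GUID-shaped secret is probably the Secret ID" heuristic are illustrative assumptions, not the engagement's code.

```python
import re
from urllib.parse import urlencode

# Secret IDs shown in the Azure portal are GUIDs; secret values are not.
GUID_PATTERN = re.compile(
    r"^[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}$"
)


def looks_like_secret_id(value: str) -> bool:
    """Heuristic (assumption): a GUID-shaped string is likely the Secret ID."""
    return bool(GUID_PATTERN.match(value.strip()))


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, bytes]:
    """Build the URL and form body for an OAuth2 client-credentials token request."""
    if looks_like_secret_id(client_secret):
        raise ValueError(
            "client_secret looks like a Secret ID (GUID); "
            "the token endpoint expects the Secret Value"
        )
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": "https://graph.microsoft.com/.default",
        }
    ).encode("ascii")
    return url, body
```

The returned URL and body can be sent with any HTTP client as a POST with `Content-Type: application/x-www-form-urlencoded`; no network call is made in the sketch itself.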
FastAPI
PostgreSQL
Azure
Microsoft Graph API
OAuth2 client-credentials
Alembic
US Construction
Long-tenured US general contractor

Production Ops Portal v4 — Jobs / Timesheets / ERP Integrations

Senior IC + customer principal — 17 months

Owned the production operations portal for a US general contractor: jobs management, labor timesheets, hours summary export, audit log, trucking, supplier + products DB, vendor portal, daily recap emails with material/equipment cost columns. Built integrations with industry-standard inventory + construction-management platforms (end-to-end OAuth flows, sandboxed test envs, prod cutover). Drove a vendor-diversity automation pipeline: matching company records against a public diversity-program registry with confidence thresholds + manual override UI.
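The confidence-threshold matching with a manual-override lane can be sketched as below. This is an illustration only: the thresholds, the suffix list, and the use of `difflib.SequenceMatcher` are assumptions made for the example, not the pipeline's actual matching logic.

```python
import re
from difflib import SequenceMatcher
from typing import Optional

AUTO_ACCEPT = 0.92   # hypothetical threshold: accept without review
NEEDS_REVIEW = 0.75  # hypothetical threshold: route to the manual override UI
LEGAL_SUFFIXES = {"llc", "inc", "corp", "co", "ltd"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def classify_match(company: str, registry: list[str]) -> tuple[str, Optional[str], float]:
    """Return (decision, best registry entry, similarity score)."""
    best_name, best_score = None, 0.0
    target = normalize(company)
    for candidate in registry:
        score = SequenceMatcher(None, target, normalize(candidate)).ratio()
        if score > best_score:
            best_name, best_score = candidate, score
    if best_score >= AUTO_ACCEPT:
        return "auto_accept", best_name, best_score
    if best_score >= NEEDS_REVIEW:
        return "manual_review", best_name, best_score
    return "no_match", None, best_score
```

Keeping a middle band between the two thresholds is what gives the manual override UI something to do: only ambiguous matches reach a human.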

  • Largest single-repo footprint of tenure: 205 commits, +92,936 / −60,143 LOC across 1,629 files
  • Eliminated the N+1 query class across supplier lookups, shipped race-condition guards on critical writes (`get_or_create` paths, place-order)
  • Activated auto-replenish cron, designed-equipment red-marker, hours-summary export for trucking payroll admins
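One common shape for a race-condition guard on a `get_or_create` path is to let a database unique constraint arbitrate and recover from the conflict. The sketch below shows that pattern with the standard-library `sqlite3` module so it runs anywhere; the table, column, and function names are hypothetical, and the production code ran on Django and PostgreSQL rather than this.

```python
import sqlite3


def make_schema(conn: sqlite3.Connection) -> None:
    """Create a supplier table whose UNIQUE constraint arbitrates concurrent inserts."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS supplier ("
        "id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE)"
    )


def get_or_create_supplier(conn: sqlite3.Connection, name: str) -> tuple[int, bool]:
    """Return (supplier id, created flag) without creating duplicates under a race."""
    row = conn.execute("SELECT id FROM supplier WHERE name = ?", (name,)).fetchone()
    if row:
        return row[0], False
    try:
        with conn:  # commits on success, rolls back on error
            cursor = conn.execute("INSERT INTO supplier (name) VALUES (?)", (name,))
        return cursor.lastrowid, True
    except sqlite3.IntegrityError:
        # Another writer inserted the same name between our SELECT and INSERT.
        row = conn.execute("SELECT id FROM supplier WHERE name = ?", (name,)).fetchone()
        return row[0], False
```

Without the unique constraint, two concurrent callers can both miss on the SELECT and both insert, which is the duplicate-row failure this guard is meant to prevent.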
Django
DRF
PostgreSQL
Google App Engine
Sentry
OAuth2
US Healthcare AI
Greenfield product launch

Greenfield Healthcare AI Platform — Both Halves in 6 Weeks

Tech Lead — end-to-end greenfield

End-to-end ownership of a new healthcare AI product launch — both halves of the stack (FastAPI backend + React frontend). Brought to production from empty repo to v1 cutover in 6 weeks. Authored the Infrastructure & Compliance Audit identifying 8 critical pre-launch gaps: unredacted PHI passed to model providers, subscription ownership mismatch, missing BAA/SLA/IP clauses in vendor contracts, plaintext secrets in App Service settings, Postgres `log_statement=all` logging PHI parameters, missing diagnostic settings.

  • 95k LOC across 885 files shipped in 6 weeks (backend + frontend)
  • Verified strengths pre-launch: production data plane on private endpoints, Postgres CMK + 35-day backups, three-role RBAC + Entra ID OAuth, no third-party telemetry with PHI
  • Closed-loop compliance audit before customer go-live
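The "unredacted PHI passed to model providers" gap is typically closed by a redaction step in front of every provider call. A minimal sketch, assuming simple regex patterns for three identifier types; the pattern set and the `redact` name are illustrative, and pattern matching alone should not be read as sufficient for HIPAA de-identification.

```python
import re

# Illustrative patterns only; real PHI coverage needs far more than these three.
REDACTION_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace each matched identifier with a bracketed label before the text leaves the tenant."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

The same function can be applied to anything destined for logs, which addresses the same class of leak as the `log_statement=all` finding from the application side.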
FastAPI
React
Azure App Service
Azure Postgres
Entra ID OAuth
Azure Key Vault
Mistral OCR
Anthropic Claude
US EdTech (K-12 test prep)
Struggling MERN platform

121-Issue Codebase Audit in 30 Days + Remediation Roadmap

Senior auditor — code review, architecture, remediation lead

Inherited a struggling MERN-stack platform serving K-12 students. Ran a parallel-AI-agent code review across all three repos (frontend, backend, AI service) producing 121 categorised issues, 23 of them Critical including: auth bypass, concurrency-driven data corruption, payment gaps, zero automated test coverage. Authored a 16-section Master Engineering Plan: Mongo→Postgres migration, AI question-generation rebuild, Bayesian mastery / IRT engine fixes, infra modernization, observability from zero, COPPA/FERPA/PCI DSS compliance roadmap, mobile (React Native / Expo) strategy.

  • Full audit + 30+ findings per service delivered in 30 days, single-handed
  • Architecture diagrams (system, data model, user flows, adaptive testing engine, subscription flow), reusable by the in-house team
  • Took ownership of remediation: GitHub access for the team, Stripe sandbox integration, SendGrid setup, PostHog analytics, Azure infra + MFA, ongoing daily updates
MongoDB → PostgreSQL
React Native + Expo
Stripe
SendGrid
PostHog
Azure
Parallel AI code review
US AmLaw 200 Legal Firm
Enterprise legal-AI deployment

Enterprise Claude AI Audit + Observability Platform

Platform Tech Lead — system design, security, runbooks

Designed and operates a Claude-based audit + observability platform running inside the firm's own Azure tenant — captures every prompt, response, and tool use into private Azure PostgreSQL in the US, exposed through a private API for compliance reviewers. Designed for ABA Model Rule 1.6 confidentiality + ABA Formal Opinion 512 (generative-AI ethics). Three-party model: model vendor (Anthropic) + cloud platform vendor (Attri) + IT partner (managed-services provider).

  • Authored: System Overview (SSOT for compliance + vendor reviewers), cross-platform desktop-agent Security & MDM-Readiness Assessment (Windows + macOS), SigNoz on Azure private-subnet runbook (~$206/mo), code-signing posture status, Anthropic enterprise escalation memo for Claude Code-class hook parity
  • OTel + SigNoz design: no public IP on VM, ingress via Azure Standard Load Balancer:443, NAT GW egress, TLS via Nginx + Let's Encrypt, bearer-token auth at the Nginx layer, UI accessible only via SSH tunnel
  • Drove code-signing setup (Authenticode + Apple Developer ID notarization) for the desktop agent that ships to attorneys via MDM
Anthropic Claude
Azure
PostgreSQL
OpenTelemetry
SigNoz
Nginx
MDM / Intune
ABA Rule 1.6 compliance
US Geospatial / Drone Inspection
Vendor selection POC

Drone-Inspection POC — Vendor Selection + KMZ Verification

Technical Lead — POC scope, vendor eval, ingestion design

Defined the POC scope for an automated drone-inspection pipeline for a US commercial real-estate buyer. Required KML/KMZ exportable flight-mission file + Smart Oblique capture for repeatability. Personally verified `.kmz` flight-path data (`waylines.wpml`) — 'the golden ticket' for reproducible captures. Confirmed multi-format deliverables (OBJ + LAS/LAZ + DXF). Requested AT (aerial triangulation) / Block-Exchange XML for centimeter-grade accuracy. Authorized capture; designed Phase-2 AI ingestion pipeline.

  • Vendor selection criteria locked in writing: repeatable flight-mission + multi-format deliverables + AT XML for accuracy
  • POC capture authorized for first target building
  • Phase-2 AI ingestion pipeline scoped
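The `.kmz` verification step can be partly automated, since a KMZ file is a ZIP archive. The sketch below checks that an archive contains a well-formed `waylines.wpml` entry; the function name is hypothetical, and it deliberately matches on the file name alone rather than assuming a fixed directory layout inside the archive.

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Optional


def find_waylines(kmz_bytes: bytes) -> Optional[str]:
    """Return the archive path of a parseable `waylines.wpml`, or None if absent."""
    with zipfile.ZipFile(io.BytesIO(kmz_bytes)) as archive:
        for name in archive.namelist():
            if name.rsplit("/", 1)[-1] == "waylines.wpml":
                # Raises xml.etree.ElementTree.ParseError if the XML is malformed.
                ET.fromstring(archive.read(name))
                return name
    return None
```

This only confirms the flight-path file exists and parses; checking that the waypoints reproduce the intended capture still requires inspecting the mission contents.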
KML/KMZ
WPML
OBJ / LAS / LAZ / DXF
AT/Block-Exchange XML
Smart Oblique
Open Source Production Tools

Scanners & Firewalls for Agentic Systems

Three OSS tools I ship and use on my own agents — plus the evaluation plugin that rates them.

As of 2026-05-24 · Metrics auto-synced from GitHub.

Agent Security
v0.3.24

agent-audit-kit

SAST-style scanner for agentic AI systems. Full OWASP Agentic + MCP Top-10 coverage, SARIF output, 11-framework compliance reporting (including EU AI Act, SOC 2, HIPAA, NIST AI RMF, ISO 42001).

148 rules · released 2026-05-23
GitHub
Runtime Firewall
7 releases

agent-airlock

Runtime firewall for AI agents. Ghost-argument stripping, strict type validation, PII masking, RBAC, E2B sandboxing, network airgap, circuit breaker, cost tracking.
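As an illustration of the idea behind ghost-argument stripping, the sketch below drops any model-proposed argument that the target tool does not declare. It reads "ghost argument" as "a keyword the tool's signature does not accept", which is an assumption; the function name is hypothetical and this is not agent-airlock's API.

```python
import inspect
from typing import Any, Callable


def strip_ghost_arguments(
    tool: Callable[..., Any], proposed: dict[str, Any]
) -> tuple[dict[str, Any], list[str]]:
    """Split proposed kwargs into (accepted kwargs, names of stripped ghosts)."""
    params = inspect.signature(tool).parameters
    # A tool that takes **kwargs accepts any keyword, so nothing is a ghost.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(proposed), []
    allowed = {
        name
        for name, p in params.items()
        if p.kind in (inspect.Parameter.POSITIONAL_OR_KEYWORD, inspect.Parameter.KEYWORD_ONLY)
    }
    kept = {k: v for k, v in proposed.items() if k in allowed}
    ghosts = sorted(k for k in proposed if k not in allowed)
    return kept, ghosts
```

Returning the stripped names alongside the cleaned arguments lets the caller log or alert on them instead of discarding them silently.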

1,157 tests · 9 framework integrations · PyPI
GitHub
Agent Memory
132 tests

mnemo

MCP-native embedded memory database for AI agents, written in Rust. REMEMBER/RECALL/FORGET/SHARE primitives, hybrid vector search (RRF), AES-256-GCM encryption, branching/replay, RBAC.
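Reciprocal rank fusion (RRF), the technique named above for hybrid search, merges several ranked lists by summing 1 / (k + rank) for each item. mnemo itself is written in Rust; the Python sketch below only illustrates the general formula, with k = 60 as a commonly used default and a hypothetical function name.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse ranked ID lists; items ranked high in several lists rise to the top."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, item_id in enumerate(ranking, start=1):
            scores[item_id] += 1.0 / (k + rank)
    # Sort by descending fused score, breaking ties by ID for determinism.
    return sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
```

For example, fusing a vector ranking `["a", "b", "c"]` with a keyword ranking `["b", "d", "a"]` puts `b` first, because it ranks near the top of both lists.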

15 framework integrations · DuckDB + PostgreSQL backends
GitHub
Quality Evaluation
Claude Code plugin

verdict

Universal quality judge for Claude Code. 7-dimension scoring (including correctness, completeness, adherence, efficiency, safety), configurable rubrics, threshold blocking, auto-hooks.

7-dimension scoring · /judge command · auto-hooks
GitHub
Side Project Spotlight · Try it live

Consumer AI — beyond enterprise

Solo-built consumer product that proves the agent-feasibility playbook. Type a problem below — get the full analysis on the live product in a new tab.

AI that doesn't care about your feelings

Why can't we have an agent for this?

Type any problem you wish an AI agent would do for you. Get a brutally honest roast, viability score, competitive landscape, open-source alternatives, agent-readiness scorecard, and a CLAUDE.md scaffold you can drop into Cursor — in ~60 seconds.

No signup. No BS. Opens in a new tab on the live product.

What you'll get back · ~60 seconds · no signup

A full agent-feasibility report

Every roast returns the same eight-section structure — quick to skim, brutal where it needs to be, deterministic agent-readiness score at the end.

  1. Verdict + 1-10 score
     Tier from "Build it yesterday" to "Don't bother"
  2. Viability sub-scores
     Market demand · feasibility · competition · monetization · disruption risk · fun factor
  3. Pros & cons
     What's going for it · what's against it · what kills it
  4. Who you're up against
     Real competitors with positioning + threat level
  5. Open-source alternatives
     What you could fork instead of building from zero
  6. Big-AI killer timeline
     Who absorbs your idea + when + your survival strategy
  7. Build estimate
     Solo-dev time · team size · cost · suggested tech stack
  8. CLAUDE.md scaffold
     Drop-in starter spec for Claude Code / Cursor
Solo-built · Hacker News #1 referrer Week 1
See live product

The Production Agent

Newsletter

Weekly lessons from running 15+ AI agents in production. Governance, security, memory, cost optimization. No demos — systems that work.

  • Agent orchestration & governance patterns
  • LLM cost optimization strategies
  • Security & memory architecture deep-dives
  • Real production war stories & lessons
Subscribe for Free

Free forever. No spam. Unsubscribe anytime.

Build Your AI Platform

Looking to implement multi-agent systems or GenAI solutions? Let's discuss how I can help architect and build production-grade AI platforms for your organization.

Need AI Consulting? Book Discovery Call
Explore More · View All Projects