Case Studies

Agentic & GenAI Systems

I ship signed-capability-lease primitives for agent runtimes — the missing security contract under MCP. Production deployments of multi-agent AI systems, voice-first platforms, and enterprise GenAI solutions with 99.9% uptime.

Multi-Agent Orchestration
LLM Integration
Voice AI
Production Scale
99.9% · Platform Uptime
15+ · AI Agents
70% · Cost Reduction
24/7 · Availability
Featured · Mental Health Tech

MannSetu - AI Mental Wellness Platform

2024 - Present

India's first voice-first AI mental wellness companion with real-time emotion analysis and CBT-based guidance

Problem & Context

India faces a severe mental health crisis with a massive treatment gap. Young Indians (18-35) struggle with exam stress, family pressure, workplace anxiety, and societal expectations, but lack accessible, culturally-aware mental health support.

Key Challenge: Build a voice-first AI platform that provides 24/7 mental wellness support in Hindi, English, and Hinglish while maintaining privacy and cultural sensitivity.

Solution & Technical Approach

Voice-First AI Architecture

  • Real-time emotion analysis from voice tone
  • Multilingual support (Hindi/English/Hinglish)
  • 60-second voice message processing

Wellness Features

  • CBT-based therapy guidance
  • Smart AI journaling
  • Mood tracking and analysis

Privacy & Compliance

  • End-to-end encryption
  • Data hosted on Indian servers
  • DPDP Act 2023 compliant

Accessibility & Safety

  • Free for students
  • 24/7 availability
  • Crisis support via Tele-MANAS

Technology Stack

React
Voice AI
Emotion Analysis
CBT Algorithms
Privacy-First
Hindi/English NLP

Before vs After

❌ Before MannSetu

  • 2-4 week wait times for therapy appointments
  • ₹2,000-5,000 per session cost barrier
  • Limited Hindi/regional language support
  • Stigma preventing help-seeking behavior
  • No support during crisis hours (nights/weekends)

✅ After MannSetu

  • Instant access - zero wait time
  • Free for students, affordable for all
  • Full Hindi/English/Hinglish support
  • Private voice-first interaction (no stigma)
  • 24/7 availability including crisis support

Results & Impact

50+ Active Users: Engaged users on the platform
4.8/5 User Rating: Based on user feedback
40%+ Engagement: vs 1-5% industry standard
Featured · Enterprise AI

Agentify - Multi-Agent AI Platform

Dec 2024 - Present

Production-grade platform orchestrating 15+ specialized AI agents for enterprise application development

Problem & Context

Attri.ai needed a production-grade platform to orchestrate 15+ specialized AI agents for enterprise application development. The platform had to support complete SDLC automation, secure code execution, real-time collaboration, and comprehensive observability - all while maintaining 99.9% uptime.

Key Challenge: Build an enterprise-ready multi-agent system that automates the entire software development lifecycle while maintaining security, scalability, and reliability.

Solution & Architecture

15+ Specialized Agents

  • Orchestrator Agent: Workflow coordination
  • PRD Agent: Requirements generation
  • Solution Architect: System design
  • Coder Agent: Code generation

Enterprise Features

  • Complete SaaS billing (Stripe)
  • Multi-tenant workspace management
  • Zero-downtime deployments
  • Real-time collaboration

Security & Isolation

  • E2B MicroVMs for code execution
  • Sandboxed environments per agent
  • Secure API key management
  • Rate limiting and abuse prevention

Advanced Capabilities

  • 100+ agent templates library
  • MCP (Model Context Protocol)
  • RAG pipeline with pgvector
  • React Flow workflow visualization

Technology Stack

Claude Opus 4.7
Claude Sonnet 4.7
GPT-5.5
MCP
Stripe
Azure WebPubSub
E2B MicroVMs
Datadog
Vercel
pgvector

Before vs After

❌ Before Agentify

  • 2-3 weeks for PRD to deployment
  • Manual code reviews causing delays
  • Siloed teams, fragmented workflows
  • High developer costs ($150-250/hr)
  • Inconsistent code quality across projects

✅ After Agentify

  • 3-5 days PRD to production deployment
  • Automated code review by AI agents
  • Unified 15-agent orchestration pipeline
  • 70% cost reduction on development
  • Consistent, enterprise-grade code output

Outcomes & Business Impact

99.9% Platform Uptime: Enterprise-grade reliability
50% Faster Task Completion: Measured against manual development
80% Faster Incident Response: Real-time monitoring & alerts
Featured · AgentOps Infrastructure

FerrumDeck - AgentOps Control Plane

Open Source

Production-grade platform for running agentic AI workflows with deterministic governance, comprehensive observability, and measurable reliability

Problem & Context

LLMs are probabilistic and unpredictable, but production systems demand strict governance, audit trails, and budget controls. AI agents can make costly mistakes through excessive token spending, incorrect tool calls, or prompt injection attacks—with no visibility into what went wrong.

Key Challenge: Bridge the gap between probabilistic AI and deterministic production requirements with governance, observability, and reproducibility built-in.

Solution: Dual-Plane Architecture

Control Plane (Rust)

  • Deterministic state management
  • Policy enforcement & approval gates
  • Budget tracking (tokens/cost/time)
  • Immutable audit logging
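
One common way to make an audit log tamper-evident is hash chaining: each entry's hash covers the previous entry's hash, so editing any past event breaks every hash after it. The sketch below illustrates that idea only. It is written in Python for brevity (the control plane itself is Rust), and `AuditLog`, `append`, and `verify` are hypothetical names, not FerrumDeck's actual API.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Canonical JSON so the same event always hashes to the same value.
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


@dataclass
class AuditLog:
    entries: list = field(default_factory=list)  # (event, hash) pairs, oldest first

    def append(self, event: dict) -> str:
        prev = self.entries[-1][1] if self.entries else GENESIS
        entry_hash = _digest(prev, event)
        self.entries.append((dict(event), entry_hash))
        return entry_hash

    def verify(self) -> bool:
        # Recompute the chain from the start; any edited event breaks it.
        prev = GENESIS
        for event, entry_hash in self.entries:
            if _digest(prev, event) != entry_hash:
                return False
            prev = entry_hash
        return True
```

In this sketch the chain only detects tampering. Preventing it would still need write-once storage or an externally anchored head hash.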

Data Plane (Python)

  • Probabilistic LLM execution
  • MCP tool routing with policy checks
  • Step execution & artifact storage
  • Multi-model support (Claude, GPT-4)

Governance Features

  • Deny-by-default tool execution
  • Risk levels: Low → Critical
  • Human approval gates for high-risk actions
  • Automatic termination on budget breach
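
The four governance rules above can be sketched as a small policy object. This is an illustration under stated assumptions, not FerrumDeck's implementation: the `Policy` class, the `decide` and `charge` methods, and the intermediate `MEDIUM` and `HIGH` levels are hypothetical (the source names only Low and Critical as the endpoints).

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2    # assumed intermediate level
    HIGH = 3      # assumed intermediate level
    CRITICAL = 4


class BudgetExceeded(Exception):
    """Raised to terminate a run once its token budget is spent."""


@dataclass
class Policy:
    allowed_tools: dict        # tool name -> Risk; any tool not listed is denied
    approval_threshold: Risk   # this level and above needs a human approval
    max_tokens: int
    tokens_used: int = 0

    def decide(self, tool: str) -> str:
        risk = self.allowed_tools.get(tool)
        if risk is None:
            return "deny"              # deny-by-default
        if risk >= self.approval_threshold:
            return "needs_approval"    # human approval gate
        return "allow"

    def charge(self, tokens: int) -> None:
        self.tokens_used += tokens
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(f"{self.tokens_used} > {self.max_tokens} tokens")
```

A caller would run `decide` before every tool call and `charge` after every model response, letting `BudgetExceeded` propagate to stop the run.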

Observability Stack

  • OpenTelemetry + Jaeger tracing
  • Real-time token counting
  • GenAI semantic conventions
  • Visual trace exploration

Technology Stack

Rust
Python 3.12
Next.js 16
PostgreSQL
pgvector
Redis Streams
OpenTelemetry
Jaeger
MCP
Docker
LiteLLM

Before vs After

❌ Without FerrumDeck

  • No visibility into agent decisions
  • Unbounded token/cost spending
  • Prompt injection vulnerabilities
  • Impossible to debug failures
  • No audit trail for compliance

✅ With FerrumDeck

  • Full trace of every agent step
  • Hard budget limits enforced
  • Deny-by-default tool security
  • Step-level replay for debugging
  • Immutable audit logs for compliance

Key Capabilities

100% Reproducibility: Version-controlled agents & prompts
Zero Trust by Default: Explicit tool permissions required
Full Observability: OpenTelemetry + Jaeger tracing
Security disclosure · CVE-2026-30623

MCP STDIO is RCE-by-design — my deployments treat it as untrusted

OX Security's April 2026 disclosure (CVE-2026-30623) confirms MCP STDIO is RCE-by-design across 150M+ downloads. My deployments treat any STDIO MCP as untrusted: sandboxed in E2B, no host filesystem, no host network, signed-capability-lease envelope per call. /api/quote does not invoke MCP STDIO.

2026-04-28 update: Anthropic's nine-connector creative launch ships first-party MCP servers for Adobe, Figma, Canva, and others. The STDIO-sandbox posture above still treats every connector — first-party or community — as untrusted by default. The CVE-2026-30623 design flaw is in the protocol surface, not in any one vendor's implementation.
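
The page does not specify the lease format, so the following is only a sketch of what a per-call signed capability lease could look like: a short-lived claim naming one tool, signed so the sandbox can check it before executing. It uses HMAC-SHA256 with a shared key for simplicity, where a real deployment might well use asymmetric signatures instead. `issue_lease`, `check_lease`, and the claim fields `tool` and `exp` are hypothetical names.

```python
import hashlib
import hmac
import json
import time


def _canonical(claims: dict) -> bytes:
    # Stable byte encoding so signer and verifier hash identical input.
    return json.dumps(claims, sort_keys=True, separators=(",", ":")).encode("utf-8")


def issue_lease(key: bytes, tool: str, ttl_s: int, now=None) -> dict:
    now = time.time() if now is None else now
    claims = {"tool": tool, "exp": now + ttl_s}
    sig = hmac.new(key, _canonical(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}


def check_lease(key: bytes, lease: dict, tool: str, now=None) -> bool:
    now = time.time() if now is None else now
    expected = hmac.new(key, _canonical(lease["claims"]), hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking the signature via timing.
    if not hmac.compare_digest(expected, lease["sig"]):
        return False
    claims = lease["claims"]
    return claims["tool"] == tool and now < claims["exp"]
```

Scoping each lease to a single tool and a short expiry means a leaked or replayed lease grants little, which fits the untrusted-by-default posture described above.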

Agent-on-Agent Commerce · April 2026

Posture: every engagement ships agent-readable

Agent-on-agent commerce is a real surface as of April 2026 (Anthropic Project Deal, Cloudflare Mesh, Cisco Agentic Workforce Identity at RSAC 2026). My posture: every consulting engagement ships with a signed-capability-lease envelope, an audit-emit channel, and a public /api/quote endpoint that any compliant buyer-agent can hit to retrieve scope, rate, availability, and identity proofs. No human-in-the-loop required for first-touch.

Try it now → live POST /api/quote

Buyer-agents: validate your POST body against /api/quote/schema.json (JSON Schema draft 2020-12) before sending. Discovery via /agents-registry.json.

Agent contract: Capability lease required · audit-emit channel · DID-verified buyer.

Hit the live /api/quote endpoint from this page. Rate-limited 30/hr/IP. Response is the same JSON a buyer-agent would receive over the wire.
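
A 30-requests-per-hour-per-IP limit can be enforced in several ways, and the page does not say which one the endpoint uses. The sketch below shows one option, an in-memory sliding window. `SlidingWindowLimiter` and `allow` are hypothetical names, and a deployed version would more likely keep its counters in a shared store than in process memory.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    def __init__(self, limit: int = 30, window_s: float = 3600.0):
        self.limit = limit
        self.window_s = window_s
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip: str, now=None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

Only accepted requests are recorded, so a client that keeps retrying while blocked does not extend its own lockout.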

Agent Identity · GA 2026-04-30

Okta for AI Agents GA — non-human identities are first-class today

As of April 30, 2026, Okta for AI Agents is GA. Non-human identities (NHIs) for agents now sit in the same Universal Directory as human users — lifecycle, audit, and revocation are first-class. With Cisco Agentic Workforce Identity (RSAC 2026) and Cloudflare Mesh, this finalizes the three-layer identity pattern: directory · cryptographic identity · per-call authorization.

My posture: the consulting starter ships Okta agent-identity bindings as an opt-in module from today. Existing engagements get a one-line patch that binds capability leases to Okta NHI tokens. /api/quote/schema.json v0.2 (also shipped today) accepts an optional okta_nhi_token field for buyer-agent attribution.

Market signals · this week

≤ 7-day primary-source signals shaping the consulting stance

Each card cites a primary source and pairs it with a one-line operator stance. Cards auto-prune at 14 days unless explicitly pinned. The component refuses to render without a primary-source URL.
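
The card rules above (prune at 14 days unless pinned, refuse to render without a primary-source URL) can be sketched as a filter. This is an illustration only: `SignalCard`, `renderable`, and the field names are hypothetical, and treating a card that is exactly 14 days old as pruned is an assumption.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

MAX_AGE = timedelta(days=14)


@dataclass
class SignalCard:
    title: str
    published: date
    source_url: Optional[str] = None
    pinned: bool = False


def renderable(cards, today: date):
    # Refuse to render anything if a card lacks its primary source.
    for card in cards:
        if not card.source_url:
            raise ValueError(f"card {card.title!r} has no primary-source URL")
    # Pinned cards survive; the rest are pruned once they reach 14 days old.
    return [c for c in cards if c.pinned or today - c.published < MAX_AGE]
```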

Microsoft / OpenAI

Microsoft–OpenAI partnership restructured to non-exclusive (AGI clause removed)

My consulting stance has always been multi-vendor by default; this is the canonical citation that even the largest substrate deal is no longer exclusive.

Primary source
AWS

AWS Bedrock Managed Agents — OpenAI Codex / GPT-5.5 (limited preview)

agent-airlock now wraps Bedrock-Codex / GPT-5.5 invocations with sandbox + capability-lease envelope. /api/quote schema v0.2 accepts an optional bedrock_invocation_arn for buyer-agent attribution.

Primary source
Anthropic

Anthropic Claude for Creative Work — 9 first-party MCP connectors

First-party MCP ≠ trusted MCP. agent-airlock's STDIO-sandbox posture (CVE-2026-30623) treats every connector — first-party or community — as untrusted by default.

Primary source
Okta

Okta for AI Agents GA — non-human identities in Universal Directory

Consulting starter ships Okta agent-identity bindings as opt-in from today. /api/quote schema v0.2 accepts an optional okta_nhi_token. See /identity-posture.

Primary source
Microsoft

Microsoft Q3 FY26: $40B capex in one quarter (substrate moat dynamics)

Market-context signal — informs the multi-vendor stance. No portfolio surface change, just a citation.

Primary source
DeepSeek

DeepSeek V4-Pro / V4-Flash released

Watch-list footnote in /llms-full.txt model-posture line. NOT in production routing fallback.

Primary source
Mistral

Mistral Medium 3.5 released

Watch-list footnote in /llms-full.txt model-posture line. NOT in production routing fallback.

Primary source
OpenObserve

OpenObserve Observability 3.0 + autonomous AI-SRE agent

Category-comparable to agent-audit-kit at the runtime layer (not the SAST layer). The two are complementary, not competing.

Primary source
Frontier Security Context · April 2026

Where this work fits the 2026 frontier-security stack

Three signals from the past 19 days place agent-audit-kit, agent-airlock, and verdict directly on the relevant lines of the 2026 production-AI security map.

Apr 7 · Anthropic

Mythos Preview / Project Glasswing

Anthropic's gated security-research model is the upstream attacker — already credited with thousands of zero-days incl. a 17-year FreeBSD NFS RCE. agent-audit-kit + agent-airlock are the downstream defender controls.

Project Glasswing announcement
Apr 20 · TheHackerWire

LangChain-ChatChat 0.3.1 RCE via MCP STDIO

11 CVEs across LiteLLM, LangChain, LangFlow, Flowise, LettaAI, LangBot — all rooted in unsanitized MCP STDIO config. agent-audit-kit's STDIO-config rule family detects this exact class.

Disclosure write-up
Apr 22 · Claude Code

Claude Code v2.1.117 sandbox hardening

PID-namespace subprocess isolation when CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1. verdict is verified against v2.1.117 — the rubric matches the new isolation contract.

Claude Code changelog
Open Source Production Tools

Scanners & Firewalls for Agentic Systems

Three OSS tools I ship and use on my own agents — plus the evaluation plugin that rates them.

As of 2026-05-04 · Metrics auto-synced from GitHub.

Agent Security
v0.3.13

agent-audit-kit

SAST-style scanner for agentic AI systems. Full OWASP Agentic + MCP Top-10 coverage, SARIF output, 11-framework compliance reporting (EU AI Act, SOC 2, HIPAA, NIST AI RMF, ISO 42001).

148 rules · released 2026-05-03
GitHub
Runtime Firewall
7 releases

agent-airlock

Runtime firewall for AI agents. Ghost-argument stripping, strict type validation, PII masking, RBAC, E2B sandboxing, network airgap, circuit breaker, cost tracking.

1,157 tests · 9 framework integrations · PyPI
GitHub
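
As an illustration of ghost-argument stripping, the sketch below reads "ghost arguments" as parameters that a model supplies but the tool's signature does not declare. That reading is an interpretation, not something this page confirms. `strip_ghost_args` and `read_file` are hypothetical and are not agent-airlock's API.

```python
import inspect


def strip_ghost_args(func, call_args: dict):
    """Return (kept_args, ghost_names) for a proposed tool call."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the tool accepts arbitrary names by design.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(call_args), []
    kept = {k: v for k, v in call_args.items() if k in params}
    ghosts = sorted(k for k in call_args if k not in params)
    return kept, ghosts


def read_file(path: str, max_bytes: int = 4096) -> str:
    # Stand-in tool used only to demonstrate the filter.
    return f"read {max_bytes} bytes from {path}"
```

Returning the stripped names alongside the cleaned arguments lets the caller log or reject the call rather than silently dropping input.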
Agent Memory
132 tests

mnemo

MCP-native embedded memory database for AI agents, written in Rust. REMEMBER/RECALL/FORGET/SHARE primitives, hybrid vector search (RRF), AES-256-GCM encryption, branching/replay, RBAC.

15 framework integrations · DuckDB + PostgreSQL backends
GitHub
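
Reciprocal rank fusion (RRF), the hybrid-search method named in the mnemo card, merges several ranked lists by summing 1 / (k + rank) for each document across the lists. The sketch below shows the general technique in Python, not mnemo's Rust implementation. The function name is hypothetical, and k = 60 is the constant commonly used in the RRF literature.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k: int = 60):
    """Fuse ranked lists of document ids into a single ranking."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))
```

Because only ranks are used, the vector and keyword scores never need to be normalized onto a common scale before fusing.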
Quality Evaluation
Claude Code plugin

verdict

Universal quality judge for Claude Code. 7-dimension scoring (including correctness, completeness, adherence, efficiency, and safety), configurable rubrics, threshold blocking, auto-hooks.

7-dimension scoring · /judge command · auto-hooks
GitHub
Side Project Spotlight

Consumer AI: Beyond Enterprise

Demonstrating consumer AI product thinking beyond enterprise work

Consumer AI Tool
LIVE

Why Can't We Have An Agent For This?

A consumer AI tool that analyzes everyday problems and generates AI agent feasibility analyses. Solo-built and launched during Holi 2026 with zero marketing spend — 157+ visitors from 5+ countries in Week 1, with Hacker News as the #1 traffic source.

Technical Highlights

  • Claude API prompt engineering for feasibility analysis
  • 7-layer security architecture
  • Dynamic OG image generation
  • SEO-first design with structured data

Launch Metrics

  • 157+ visitors in Week 1
  • Hacker News as #1 referrer
  • 5+ countries reached
  • $3-5/day operating cost
Claude API
Next.js
Vercel
Upstash Redis
TypeScript
Visit Live Site

The Production Agent

Newsletter

Weekly lessons from running 15+ AI agents in production. Governance, security, memory, cost optimization. No demos — systems that work.

  • Agent orchestration & governance patterns
  • LLM cost optimization strategies
  • Security & memory architecture deep-dives
  • Real production war stories & lessons
Subscribe for Free

Free forever. No spam. Unsubscribe anytime.

Build Your AI Platform

Looking to implement multi-agent systems or GenAI solutions? Let's discuss how I can help architect and build production-grade AI platforms for your organization.

Need AI Consulting? Book Discovery Call
Explore More: View All Projects