Case Studies

Agentic & GenAI Systems

I ship signed-capability-lease primitives for agent runtimes: the missing security contract under MCP. I run production deployments of multi-agent AI systems, voice-first platforms, and enterprise GenAI solutions at 99.9% uptime.

Multi-Agent Orchestration

LLM Integration

Voice AI

Production Scale

99.9%

Platform Uptime

15+

AI Agents

70%

Cost Reduction

24/7

Availability

Featured · Mental Health Tech

MannSetu - AI Mental Wellness Platform

2024 - Present

India's first voice-first AI mental wellness companion with real-time emotion analysis and CBT-based guidance

Problem & Context

India faces a severe mental health crisis with a massive treatment gap. Young Indians (18-35) struggle with exam stress, family pressure, workplace anxiety, and societal expectations, but lack accessible, culturally-aware mental health support.

Key Challenge: Build a voice-first AI platform that provides 24/7 mental wellness support in Hindi, English, and Hinglish while maintaining privacy and cultural sensitivity.

Solution & Technical Approach

Voice-First AI Architecture

• Real-time emotion analysis from voice tone
• Multilingual support (Hindi/English/Hinglish)
• 60-second voice message processing

Wellness Features

• CBT-based therapy guidance
• Smart AI journaling
• Mood tracking and analysis

Privacy & Compliance

• End-to-end encryption
• Data hosted on Indian servers
• DPDP Act 2023 compliant

Accessibility & Safety

• Free for students
• 24/7 availability
• Crisis support via Tele-MANAS

Technology Stack

React

Voice AI

Emotion Analysis

CBT Algorithms

Privacy-First

Hindi/English NLP

Before vs After

❌ Before MannSetu

• 2-4 week wait times for therapy appointments
• ₹2,000-5,000 per session cost barrier
• Limited Hindi/regional language support
• Stigma preventing help-seeking behavior
• No support during crisis hours (nights/weekends)

✅ After MannSetu

• Instant access - zero wait time
• Free for students, affordable for all
• Full Hindi/English/Hinglish support
• Private voice-first interaction (no stigma)
• 24/7 availability including crisis support

Results & Impact

50+

Active Users

Engaged users on the platform

4.8/5

User Rating

Based on user feedback

40%+

Engagement

vs 1-5% industry standard

Visit MannSetu Platform

Featured · Enterprise AI

Agentify - Multi-Agent AI Platform

Dec 2024 - Present

Production-grade platform orchestrating 15+ specialized AI agents for enterprise application development

Problem & Context

Attri.ai needed a production-grade platform to orchestrate 15+ specialized AI agents for enterprise application development. The platform had to support complete SDLC automation, secure code execution, real-time collaboration, and comprehensive observability - all while maintaining 99.9% uptime.

Key Challenge: Build an enterprise-ready multi-agent system that automates the entire software development lifecycle while maintaining security, scalability, and reliability.

Solution & Architecture

15+ Specialized Agents

• Orchestrator Agent: Workflow coordination
• PRD Agent: Requirements generation
• Solution Architect: System design
• Coder Agent: Code generation

Enterprise Features

• Complete SaaS billing (Stripe)
• Multi-tenant workspace management
• Zero-downtime deployments
• Real-time collaboration

Security & Isolation

• E2B MicroVMs for code execution
• Sandboxed environments per agent
• Secure API key management
• Rate limiting and abuse prevention

Advanced Capabilities

• 100+ agent templates library
• MCP (Model Context Protocol)
• RAG pipeline with pgvector
• React Flow workflow visualization

Technology Stack

Claude Opus 4.7

Claude Sonnet 4.7

GPT-5.5

MCP

Stripe

Azure WebPubSub

E2B MicroVMs

Datadog

Vercel

pgvector

Before vs After

❌ Before Agentify

• 2-3 weeks for PRD to deployment
• Manual code reviews causing delays
• Siloed teams, fragmented workflows
• High developer costs ($150-250/hr)
• Inconsistent code quality across projects

✅ After Agentify

• 3-5 days PRD to production deployment
• Automated code review by AI agents
• Unified orchestration pipeline across 15+ agents
• 70% cost reduction on development
• Consistent, enterprise-grade code output

Outcomes & Business Impact

99.9%

Platform Uptime

Enterprise-grade reliability

50%

Faster Task Completion

Measured against manual development

80%

Faster Incident Response

Real-time monitoring & alerts

Visit Attri.ai Platform

Featured · AgentOps Infrastructure

FerrumDeck - AgentOps Control Plane

Open Source

Production-grade platform for running agentic AI workflows with deterministic governance, comprehensive observability, and measurable reliability

Problem & Context

LLMs are probabilistic and unpredictable, but production systems demand strict governance, audit trails, and budget controls. AI agents can make costly mistakes through excessive token spending, incorrect tool calls, or prompt injection attacks—with no visibility into what went wrong.

Key Challenge: Bridge the gap between probabilistic AI and deterministic production requirements with governance, observability, and reproducibility built-in.

Solution: Dual-Plane Architecture

Control Plane (Rust)

• Deterministic state management
• Policy enforcement & approval gates
• Budget tracking (tokens/cost/time)
• Immutable audit logging

Data Plane (Python)

• Probabilistic LLM execution
• MCP tool routing with policy checks
• Step execution & artifact storage
• Multi-model support (Claude, GPT-4)

Governance Features

• Deny-by-default tool execution
• Risk levels: Low → Critical
• Human approval gates for high-risk
• Automatic termination on budget breach
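The governance rules above can be sketched as a small policy gate. This is a minimal illustration, not FerrumDeck's actual code (whose control plane is written in Rust): the names `Policy`, `Run`, `check_tool`, and `charge` are hypothetical, the intermediate `MEDIUM` risk level is an assumption, and only a token budget is modelled.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2  # assumed intermediate level; the source only names Low and Critical
    HIGH = 3
    CRITICAL = 4


class BudgetExceeded(RuntimeError):
    """Raised to terminate a run once its token budget is spent."""


@dataclass
class Policy:
    # Tools missing from this map are denied (deny-by-default).
    allowed_tools: Dict[str, Risk] = field(default_factory=dict)
    # Tools at or above this risk level need a human approval.
    approval_threshold: Risk = Risk.HIGH
    max_tokens: int = 10_000


@dataclass
class Run:
    policy: Policy
    tokens_used: int = 0

    def charge(self, tokens: int) -> None:
        """Record token spend and stop the run when the budget is breached."""
        self.tokens_used += tokens
        if self.tokens_used > self.policy.max_tokens:
            raise BudgetExceeded(
                f"used {self.tokens_used} of {self.policy.max_tokens} tokens"
            )

    def check_tool(self, tool: str) -> str:
        """Return 'deny', 'needs_approval', or 'allow' for a tool call."""
        risk = self.policy.allowed_tools.get(tool)
        if risk is None:
            return "deny"
        if risk >= self.policy.approval_threshold:
            return "needs_approval"
        return "allow"
```

Keeping the decision in a pure function of the policy and the tool name is what makes the gate deterministic: the same inputs always produce the same verdict, regardless of what the model generated.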

Observability Stack

• OpenTelemetry + Jaeger tracing
• Real-time token counting
• GenAI semantic conventions
• Visual trace exploration

Technology Stack

Rust

Python 3.12

Next.js 16

PostgreSQL

pgvector

Redis Streams

OpenTelemetry

Jaeger

MCP

Docker

LiteLLM

Before vs After

❌ Without FerrumDeck

• No visibility into agent decisions
• Unbounded token/cost spending
• Prompt injection vulnerabilities
• Impossible to debug failures
• No audit trail for compliance

✅ With FerrumDeck

• Full trace of every agent step
• Hard budget limits enforced
• Deny-by-default tool security
• Step-level replay for debugging
• Immutable audit logs for compliance

Key Capabilities

100%

Reproducibility

Version-controlled agents & prompts

Zero

Trust by Default

Explicit tool permissions required

Full

Observability

OpenTelemetry + Jaeger tracing

View on GitHub · Read Documentation

Security disclosure · CVE-2026-30623

MCP STDIO is RCE-by-design — my deployments treat it as untrusted

OX Security's April 2026 disclosure (CVE-2026-30623) confirms MCP STDIO is RCE-by-design across 150M+ downloads. My deployments treat any STDIO MCP as untrusted: sandboxed in E2B, no host filesystem, no host network, signed-capability-lease envelope per call. /api/quote does not invoke MCP STDIO.

2026-04-28 update: Anthropic's nine-connector creative launch ships first-party MCP servers for Adobe, Figma, Canva, and others. The STDIO-sandbox posture above still treats every connector — first-party or community — as untrusted by default. The CVE-2026-30623 design flaw is in the protocol surface, not in any one vendor's implementation.
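To make the per-call lease idea concrete, here is one way a signed capability lease could work. The page does not specify the envelope format, so everything here is an assumption for illustration: the HMAC-SHA256 signature, the `tool` and `exp` claim names, and the `sign_lease` / `verify_lease` helpers are all hypothetical.

```python
import hashlib
import hmac
import json
import time
from typing import Optional


def _payload(claims: dict) -> bytes:
    # Canonical JSON so signer and verifier hash identical bytes.
    return json.dumps(claims, sort_keys=True).encode()


def sign_lease(secret: bytes, tool: str, ttl_s: float, now: Optional[float] = None) -> dict:
    """Issue a lease that authorizes one tool until an expiry time."""
    issued = time.time() if now is None else now
    claims = {"tool": tool, "exp": issued + ttl_s}
    sig = hmac.new(secret, _payload(claims), hashlib.sha256).hexdigest()
    return {"claims": claims, "sig": sig}


def verify_lease(secret: bytes, lease: dict, tool: str, now: Optional[float] = None) -> bool:
    """Accept the call only if the signature, tool, and expiry all check out."""
    current = time.time() if now is None else now
    expected = hmac.new(secret, _payload(lease["claims"]), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, lease["sig"]):
        return False
    claims = lease["claims"]
    return claims["tool"] == tool and current < claims["exp"]
```

A sandboxed tool runner would call `verify_lease` before every invocation, so a lease issued for one tool cannot be replayed against another or used after it expires.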

OX Security disclosure →

Technical deep dive →

OWASP Agentic Top-10 →

Agent-on-Agent Commerce · April 2026

Posture: every engagement ships agent-readable

Agent-on-agent commerce is a real surface as of April 2026 (Anthropic Project Deal, Cloudflare Mesh, Cisco Agentic Workforce Identity at RSAC 2026). My posture: every consulting engagement ships with a signed-capability-lease envelope, an audit-emit channel, and a public /api/quote endpoint that any compliant buyer-agent can hit to retrieve scope, rate, availability, and identity proofs. No human-in-the-loop required for first-touch.

POST /api/quote (agent endpoint) →

AGENTS.md manifest →

Topmate (human-in-the-loop) →

Try it now → live POST /api/quote

Buyer-agents: validate your POST body against /api/quote/schema.json (JSON Schema draft 2020-12) before sending. Discovery via /agents-registry.json.

Agent contract: Capability lease required · audit-emit channel · DID-verified buyer.

Hit the live /api/quote endpoint from this page. Rate-limited 30/hr/IP. Response is the same JSON a buyer-agent would receive over the wire.
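The 30-requests-per-hour-per-IP limit can be illustrated with a sliding-window counter. This is an in-memory sketch under assumptions: the page does not say which algorithm or store the endpoint uses, and `SlidingWindowLimiter` is a hypothetical name.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per client within any `window_s` seconds."""

    def __init__(self, limit: int = 30, window_s: float = 3600.0) -> None:
        self.limit = limit
        self.window_s = window_s
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A deployed version would keep the timestamps in a shared store rather than process memory, so the limit holds across serverless instances.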

Agent Identity · GA 2026-04-30

Okta for AI Agents GA — non-human identities are first-class today

As of April 30, 2026, Okta for AI Agents is GA. Non-human identities (NHIs) for agents now sit in the same Universal Directory as human users — lifecycle, audit, and revocation are first-class. With Cisco Agentic Workforce Identity (RSAC 2026) and Cloudflare Mesh, this finalizes the three-layer identity pattern: directory · cryptographic identity · per-call authorization.

My posture: the consulting starter ships Okta agent-identity bindings as an opt-in module from today. Existing engagements get a one-line patch that binds capability leases to Okta NHI tokens. /api/quote/schema.json v0.2 (also shipped today) accepts an optional okta_nhi_token field for buyer-agent attribution.

/identity-posture (full declaration) →

/identity-posture.json (machine-readable) →

Okta Showcase 2026 (primary source) →

Market signals · this week

≤ 7-day primary-source signals shaping the consulting stance

Each card cites a primary source and pairs it with a one-line operator stance. Cards auto-prune at 14 days unless explicitly pinned. The component refuses to render without a primary-source URL.
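The pruning and source-URL rules can be expressed in a few lines. This is a sketch, not the site's actual component: `SignalCard` and `visible_cards` are hypothetical names, and treating a card that is exactly 14 days old as already pruned is an assumption about the boundary.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass(frozen=True)
class SignalCard:
    title: str
    published: date
    source_url: str
    pinned: bool = False

    def __post_init__(self) -> None:
        # Refuse to build (and therefore render) a card with no primary source.
        if not self.source_url.strip():
            raise ValueError(f"card {self.title!r} has no primary-source URL")


def visible_cards(cards: List[SignalCard], today: date, max_age_days: int = 14) -> List[SignalCard]:
    """Keep pinned cards and cards younger than `max_age_days`."""
    return [
        c for c in cards
        if c.pinned or (today - c.published).days < max_age_days
    ]
```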

Microsoft / OpenAI · 2026-04-27

Microsoft–OpenAI partnership restructured to non-exclusive (AGI clause removed)

My consulting stance has always been multi-vendor by default; this is the canonical citation that even the largest substrate deal is no longer exclusive.

Primary source

AWS · 2026-04-28

AWS Bedrock Managed Agents — OpenAI Codex / GPT-5.5 (limited preview)

agent-airlock now wraps Bedrock-Codex / GPT-5.5 invocations with sandbox + capability-lease envelope. /api/quote schema v0.2 accepts an optional bedrock_invocation_arn for buyer-agent attribution.

Primary source

Anthropic · 2026-04-28

Anthropic Claude for Creative Work — 9 first-party MCP connectors

First-party MCP ≠ trusted MCP. agent-airlock's STDIO-sandbox posture (CVE-2026-30623) treats every connector — first-party or community — as untrusted by default.

Primary source

Okta · 2026-04-30

Okta for AI Agents GA — non-human identities in Universal Directory

Consulting starter ships Okta agent-identity bindings as opt-in from today. /api/quote schema v0.2 accepts an optional okta_nhi_token. See /identity-posture.

Primary source

Microsoft · 2026-04-29

Microsoft Q3 FY26 — $40B capex one quarter (substrate moat dynamics)

Market-context signal — informs the multi-vendor stance. No portfolio surface change, just a citation.

Primary source

DeepSeek · 2026-04-24

DeepSeek V4-Pro / V4-Flash released

Watch-list footnote in /llms-full.txt model-posture line. NOT in production routing fallback.

Primary source

Mistral · 2026-04-29

Mistral Medium 3.5 released

Watch-list footnote in /llms-full.txt model-posture line. NOT in production routing fallback.

Primary source

OpenObserve · 2026-04-29

OpenObserve Observability 3.0 + autonomous AI-SRE agent

Category-comparable to agent-audit-kit at the runtime layer (not the SAST layer). The two are complementary, not competing.

Primary source

Frontier Security Context · April 2026

Where this work fits the 2026 frontier-security stack

Three signals from the past 19 days place agent-audit-kit, agent-airlock, and verdict directly on the relevant lines of the 2026 production-AI security map.

Apr 7 · Anthropic

Mythos Preview / Project Glasswing

Anthropic's gated security-research model is the upstream attacker, already credited with thousands of zero-days, including a 17-year-old FreeBSD NFS RCE. agent-audit-kit + agent-airlock are the downstream defender controls.

Project Glasswing announcement

Apr 20 · TheHackerWire

LangChain-ChatChat 0.3.1 RCE via MCP STDIO

11 CVEs across LiteLLM, LangChain, LangFlow, Flowise, LettaAI, LangBot — all rooted in unsanitized MCP STDIO config. agent-audit-kit's STDIO-config rule family detects this exact class.

Disclosure write-up

Apr 22 · Claude Code

Claude Code v2.1.117 sandbox hardening

PID-namespace subprocess isolation when CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1. verdict is verified against v2.1.117 — the rubric matches the new isolation contract.

Claude Code changelog

Open Source Production Tools

Scanners & Firewalls for Agentic Systems

Three OSS tools I ship and use on my own agents — plus the evaluation plugin that rates them.

As of 2026-05-04 · Metrics auto-synced from GitHub.

Agent Security

v0.3.13

agent-audit-kit

SAST-style scanner for agentic AI systems. Full OWASP Agentic + MCP Top-10 coverage, SARIF output, 11-framework compliance reporting (EU AI Act, SOC 2, HIPAA, NIST AI RMF, ISO 42001).

148 rules · released 2026-05-03

GitHub

Runtime Firewall

7 releases

agent-airlock

Runtime firewall for AI agents. Ghost-argument stripping, strict type validation, PII masking, RBAC, E2B sandboxing, network airgap, circuit breaker, cost tracking.

1,157 tests · 9 framework integrations · PyPI

GitHub
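Two of the agent-airlock checks listed above, ghost-argument stripping and strict type validation, can be illustrated with the standard library alone. This is a sketch of the general technique and not agent-airlock's API: `strip_ghost_args` and `validate_types` are hypothetical helpers, and only plain-class annotations such as `str` or `int` are checked.

```python
import inspect
import typing
from typing import Any, Callable, Dict


def strip_ghost_args(func: Callable[..., Any], args: Dict[str, Any]) -> Dict[str, Any]:
    """Drop arguments the model invented that `func` does not declare."""
    params = inspect.signature(func).parameters
    return {name: value for name, value in args.items() if name in params}


def validate_types(func: Callable[..., Any], args: Dict[str, Any]) -> None:
    """Reject values whose exact type differs from the annotation (no coercion)."""
    hints = typing.get_type_hints(func)
    for name, value in args.items():
        expected = hints.get(name)
        # Only plain classes are checked; generics like list[int] are skipped here.
        if isinstance(expected, type) and type(value) is not expected:
            raise TypeError(
                f"{name}: expected {expected.__name__}, got {type(value).__name__}"
            )


def get_weather(city: str, days: int = 1) -> str:
    """Example tool used to demonstrate the two checks."""
    return f"{city}:{days}"
```

For a model-produced call such as `{"city": "Pune", "days": 2, "force": True}`, stripping removes the undeclared `force` key, and validation then passes because both remaining values match their annotations exactly. A string `"2"` for `days` would be rejected rather than silently coerced.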

Agent Memory

132 tests

mnemo

MCP-native embedded memory database for AI agents, written in Rust. REMEMBER/RECALL/FORGET/SHARE primitives, hybrid vector search (RRF), AES-256-GCM encryption, branching/replay, RBAC.

15 framework integrations · DuckDB + PostgreSQL backends

GitHub
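The RRF in mnemo's hybrid search refers to reciprocal rank fusion, which merges several ranked result lists by summing 1 / (k + rank) for each item. The sketch below shows the standard formula in Python for readability. It is not mnemo's Rust implementation, and the constant k = 60 is the commonly used default, assumed here.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse ranked lists of ids; a higher summed 1/(k + rank) ranks first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by id for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]   # e.g. nearest neighbours by embedding
keyword_hits = ["b", "d", "a"]  # e.g. full-text matches
fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
```

Because only ranks are used, the two retrievers do not need comparable score scales, which is the usual reason to prefer RRF over a weighted sum of raw scores.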

Quality Evaluation

Claude Code plugin

verdict

Universal quality judge for Claude Code. 7-dimension scoring (including correctness, completeness, adherence, efficiency, and safety), configurable rubrics, threshold blocking, auto-hooks.

7-dimension scoring · /judge command · auto-hooks

GitHub

Side Project Spotlight

Consumer AI: Beyond Enterprise

Demonstrating consumer AI product thinking beyond enterprise work

Consumer AI Tool

LIVE

Why Can't We Have An Agent For This?

A consumer AI tool that analyzes everyday problems and generates AI agent feasibility analyses. Solo-built and launched during Holi 2026 with zero marketing spend — 157+ visitors from 5+ countries in Week 1, with Hacker News as the #1 traffic source.

Technical Highlights

• Claude API prompt engineering for feasibility analysis
• 7-layer security architecture
• Dynamic OG image generation
• SEO-first design with structured data

Launch Metrics

• 157+ visitors in Week 1
• Hacker News as #1 referrer
• 5+ countries reached
• $3-5/day operating cost

Claude API

Next.js

Vercel

Upstash Redis

TypeScript

Visit Live Site

The Production Agent

Newsletter

Weekly lessons from running 15+ AI agents in production. Governance, security, memory, cost optimization. No demos — systems that work.

✓ Agent orchestration & governance patterns
✓ LLM cost optimization strategies
✓ Security & memory architecture deep-dives
✓ Real production war stories & lessons

Subscribe for Free

Free forever. No spam. Unsubscribe anytime.

Build Your AI Platform

Looking to implement multi-agent systems or GenAI solutions? Let's discuss how I can help architect and build production-grade AI platforms for your organization.

Hiring? View Resume

Need AI Consulting? Book Discovery Call

Explore More: View All Projects