Engineering + Platform

Replatforming, reliability, and agentic systems at scale. I've led engineering organizations of 150+ across cloud-native transformations, SRE maturity, and — most recently — production AI/agentic platform builds. At Pine Labs I re-architected a legacy payments stack into reactive microservices on AWS/GCP, scaling throughput 25×. Today at STAR Systems and Commusoft I architect the infrastructure that runs AI-Native Enterprises and production workflow automation.

Current — AI/Agentic Infrastructure

  • Commusoft Workflow Engine (CWE): Python/FastAPI MCP server, TypeScript Temporal monorepo, Angular 17 canvas — live on Linode
  • AINE platform infra: L1 compute layer across IBM Cloud, Linode, and IBM Code Engine
  • MCP server design: action registry, domain anchor pre-pass, role-scoped routing
  • Temporal durable workflows: retry caps, activity design, HumanHandover with SendGrid
  • Six-layer test framework — 91%+ pass rate across autoplay suites
MCP · Temporal · FastAPI · Angular 17 · IBM Code Engine

Prior — Cloud-Native Platform (Pine Labs)

  • Re-architected legacy online payments stack → reactive microservices on AWS/GCP
  • 25× throughput scaling; P99 latency down 60% via query shaping and async pipelines
  • SLO dashboards and error budgets → MTTR down 40%
  • Led an engineering org of 150+; instituted DevOps/SRE practices
  • Blue/green deployments, HPA autoscaling, chaos testing, paved-road platform
Kubernetes · Kafka · AWS/GCP · SRE · CI/CD

What I lead

  • Agentic platform architecture: MCP, Temporal, multi-agent coordination
  • Replatforming: monolith → microservices, strangler patterns, service boundaries
  • Cloud-native: Kubernetes, Terraform, GitHub Actions, progressive delivery
  • Streaming & data integrity: Kafka, exactly-once, idempotency, retries/DLQs
  • Reliability: SRE practices, incident drills, error budgets, chaos testing
  • Org design: hiring loops, ladders, operating cadence, delivery rituals

Representative wins

  • CWE v3.0 — production agentic engine; 7/7 autoplay tests passing against live Commusoft Stage
  • AINE L1 substrate audit — 30 groundable actions identified across 1,200+ CS API endpoints
  • IBM Code Engine deployments — WAR Machine, watsonx Learning Platform, StintLabs, Sangli
  • Pine Labs — 25× throughput, 60% P99 latency reduction, 40% MTTR improvement
  • GNN-based code call-graph CLI — OSS/Ollama-only, open source