Engineering + Platform
Replatforming, reliability, and agentic systems at scale.
I've led engineering organizations of 150+ engineers through cloud-native transformations,
SRE maturity programs, and, most recently, production AI/agentic platform builds. At Pine Labs
I re-architected a legacy payments stack into reactive microservices on AWS/GCP, scaling
throughput 25×. Today at STAR Systems and Commusoft I architect the infrastructure that runs
AI-Native Enterprises and production workflow automation.
Current — AI/Agentic Infrastructure
- Commusoft Workflow Engine (CWE): Python/FastAPI MCP server, TypeScript Temporal monorepo, Angular 17 canvas — live on Linode
- AINE platform infra: L1 compute layer across IBM Cloud, Linode, and IBM Code Engine
- MCP server design: action registry, domain anchor pre-pass, role-scoped routing
- Temporal durable workflows: retry caps, activity design, HumanHandover with SendGrid
- Six-layer test framework — 91%+ pass rate across autoplay suites
MCP, Temporal, FastAPI, Angular 17, IBM Code Engine
Prior — Cloud-Native Platform (Pine Labs)
- Re-architected legacy online payments stack → reactive microservices on AWS/GCP
- 25× throughput scaling; P99 latency down 60% via query shaping and async pipelines
- SLO dashboards and error budgets → MTTR down 40%
- Led an organization of 150+ engineers; instituted DevOps/SRE practices
- Blue/green deployments, HPA autoscaling, chaos testing, paved-road platform
Kubernetes, Kafka, AWS/GCP, SRE, CI/CD
What I lead
- Agentic platform architecture: MCP, Temporal, multi-agent coordination
- Replatforming: monolith → microservices, strangler patterns, service boundaries
- Cloud-native: Kubernetes, Terraform, GitHub Actions, progressive delivery
- Streaming & data integrity: Kafka, exactly-once, idempotency, retries/DLQs
- Reliability: SRE practices, incident drills, error budgets, chaos testing
- Org design: hiring loops, ladders, operating cadence, delivery rituals
Representative wins
- CWE v3.0 — production agentic engine; 7/7 autoplay tests passing against live Commusoft Stage
- AINE L1 substrate audit — 30 groundable actions identified across 1,200+ CS API endpoints
- IBM Code Engine deployments — WAR Machine, watsonx Learning Platform, StintLabs, Sangli
- Pine Labs — 25× throughput, 60% P99 latency reduction, 40% MTTR improvement
- GNN-based code call-graph CLI — Ollama-only, open source