360K+ lines of production AI infrastructure — compilers, kernels, LLM engines & multi-agent platforms — all built from scratch in Rust, CUDA & TypeScript.
Production RAG pipelines, multi-agent orchestration, vector search, LLM fine-tuning & custom inference engines.
Built a self-hosting compiler from scratch: JIT and AOT backends via Cranelift, Hindley-Milner type inference, SIMD codegen.
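To give a flavor of the type-inference work mentioned above, here is a minimal sketch of Hindley-Milner style unification over a toy type language. The type representation, names, and error strings are illustrative assumptions, not the production compiler's actual code.

```rust
use std::collections::HashMap;

#[derive(Clone, Debug, PartialEq)]
enum Ty {
    Var(u32),              // type variable, e.g. 'a
    Int,                   // a base type
    Fun(Box<Ty>, Box<Ty>), // function type t1 -> t2
}

type Subst = HashMap<u32, Ty>;

// Walk a type, replacing solved variables with their bindings.
fn apply(s: &Subst, t: &Ty) -> Ty {
    match t {
        Ty::Var(v) => s.get(v).map(|u| apply(s, u)).unwrap_or_else(|| t.clone()),
        Ty::Int => Ty::Int,
        Ty::Fun(a, b) => Ty::Fun(Box::new(apply(s, a)), Box::new(apply(s, b))),
    }
}

// Occurs check: prevents infinite types like 'a = 'a -> 'a.
fn occurs(v: u32, t: &Ty) -> bool {
    match t {
        Ty::Var(w) => v == *w,
        Ty::Int => false,
        Ty::Fun(a, b) => occurs(v, a) || occurs(v, b),
    }
}

// Unify two types, extending the substitution, or fail.
fn unify(s: &mut Subst, a: &Ty, b: &Ty) -> Result<(), String> {
    let (a, b) = (apply(s, a), apply(s, b));
    match (a, b) {
        (Ty::Int, Ty::Int) => Ok(()),
        (Ty::Var(v), t) | (t, Ty::Var(v)) => {
            if t == Ty::Var(v) {
                Ok(())
            } else if occurs(v, &t) {
                Err("occurs check failed".into())
            } else {
                s.insert(v, t);
                Ok(())
            }
        }
        (Ty::Fun(a1, r1), Ty::Fun(a2, r2)) => {
            unify(s, &a1, &a2)?;
            unify(s, &r1, &r2)
        }
        (a, b) => Err(format!("cannot unify {:?} with {:?}", a, b)),
    }
}

fn main() {
    // Unify 'a -> Int with Int -> 'b, solving 'a = Int and 'b = Int.
    let mut s = Subst::new();
    let lhs = Ty::Fun(Box::new(Ty::Var(0)), Box::new(Ty::Int));
    let rhs = Ty::Fun(Box::new(Ty::Int), Box::new(Ty::Var(1)));
    unify(&mut s, &lhs, &rhs).unwrap();
    println!("{:?}", apply(&s, &Ty::Var(0))); // prints Int
    println!("{:?}", apply(&s, &Ty::Var(1))); // prints Int
}
```

A full implementation adds generalization and instantiation of type schemes on top of this unifier; the unifier itself stays this small.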
Custom CUDA kernels for LLM training and inference: 7,800+ tok/s decoding with fused attention, flash decoding, and mixed precision.
Bare-metal OS kernel, custom bootloader, memory allocators, interrupt handlers — zero external dependencies.
High-performance Next.js/React frontends with FastAPI/Node backends. WebSocket dashboards, real-time telemetry.
CI/CD pipelines, Docker orchestration, Vercel deployments, monitoring stacks, M365 administration & security.
Multi-agent AI evolution platform — 4 LLMs compete, evolve & breed code
MCP Smart Gateway — semantic tool routing, RBAC & real-time observability
Chrome DevTools for AI agents — time-travel debugging & cost tracking
Self-evolving JIT-compiled language with genetic fitness evaluation
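The genetic-fitness loop behind the evolution projects above can be sketched in a few lines: a population of candidates, tournament selection, and point mutation. Everything here is a toy stand-in (byte-vector "programs", a distance-to-target fitness, a tiny deterministic PRNG), not the platform's actual implementation.

```rust
// Fitness: negative distance to a target byte string (higher is better).
fn fitness(candidate: &[u8], target: &[u8]) -> i64 {
    -candidate
        .iter()
        .zip(target)
        .map(|(a, b)| (*a as i64 - *b as i64).abs())
        .sum::<i64>()
}

// Tiny deterministic PRNG (LCG) so the sketch needs no external crates.
fn next_rand(state: &mut u64) -> u64 {
    *state = state
        .wrapping_mul(6364136223846793005)
        .wrapping_add(1442695040888963407);
    *state >> 33
}

// Tournament of size 2: pick the fitter of two random individuals.
fn select<'a>(pop: &'a [Vec<u8>], target: &[u8], rng: &mut u64) -> &'a Vec<u8> {
    let a = &pop[(next_rand(rng) as usize) % pop.len()];
    let b = &pop[(next_rand(rng) as usize) % pop.len()];
    if fitness(a, target) >= fitness(b, target) { a } else { b }
}

// One generation: breed a new population by mutating tournament winners.
fn evolve(pop: &[Vec<u8>], target: &[u8], rng: &mut u64) -> Vec<Vec<u8>> {
    (0..pop.len())
        .map(|_| {
            let mut child = select(pop, target, rng).clone();
            let i = (next_rand(rng) as usize) % child.len();
            child[i] = (next_rand(rng) % 256) as u8; // point mutation
            child
        })
        .collect()
}

fn main() {
    let target = b"rust".to_vec();
    let mut rng = 42u64;
    let mut pop: Vec<Vec<u8>> = (0..32)
        .map(|_| (0..4).map(|_| (next_rand(&mut rng) % 256) as u8).collect())
        .collect();
    let start = pop.iter().map(|c| fitness(c, &target)).max().unwrap();
    for _ in 0..200 {
        pop = evolve(&pop, &target, &mut rng);
    }
    let end = pop.iter().map(|c| fitness(c, &target)).max().unwrap();
    println!("best fitness: generation 0 = {}, generation 200 = {}", start, end);
}
```

The real platform swaps the byte vectors for JIT-compiled programs and the distance metric for execution-based fitness, but the selection/mutation skeleton is the same shape.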
End-to-end design for production AI — from LLM selection and prompt engineering to vector store architecture and multi-agent orchestration.
CUDA kernel optimization, Rust rewrites of Python bottlenecks, compiler-level performance tuning. Demonstrated speedups of up to 139x.
Architecture reviews, technology strategy, CTO-as-a-service for startups. Deep expertise across the entire stack from silicon to UI.
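As a flavor of the Python-to-Rust rewrites mentioned above: a per-element Python loop becomes a tight iterator chain the Rust compiler can auto-vectorize. The function here is a generic hypothetical example of such a hot inner loop, not a specific client engagement.

```rust
/// Sum of squared differences between two equal-length slices,
/// a common inner loop in distance and loss computations.
fn sum_sq_diff(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len());
    a.iter()
        .zip(b)
        .map(|(x, y)| {
            let d = x - y;
            d * d
        })
        .sum()
}

fn main() {
    let a = vec![1.0f32; 1_000];
    let b = vec![0.5f32; 1_000];
    // 1000 elements, each contributing 0.25
    println!("{}", sum_sq_diff(&a, &b)); // prints 250
}
```

Bounds checks hoist out of the loop thanks to the up-front length assertion, which is typically what lets LLVM emit SIMD for chains like this.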
Currently accepting AI architecture consulting, performance engineering projects, and technical advisory roles.