McKinsey · Expert Generative AI Engagement Manager

Building the future
of

I lead end-to-end delivery of large-scale Generative AI solutions for financial institutions — from multi-agent architectures and patent-pending hallucination detection to production agentic systems serving millions.

3×
Faster Promotion
85%
GenAI Adoption Rate
6+
Banks Deployed
57%
Efficiency Gain

Deep expertise across
the AI & cloud stack

🧠

GenAI & Agentic Systems

Designing production multi-agent platforms, RAG architectures, and LLM-powered automation — including a patent-pending hallucination detection framework commercially deployed across banking clients.

AWS Strands LangGraph Bedrock Agents Claude API OpenAI Agents RAG

Deep Dive

  • Patent-pending hallucination detection for LLM and agentic responses — deployed commercially across multiple banking clients
  • Trained a Small Language Model using LoRA fine-tuning, DPO, and knowledge distillation for a Central Asian government tax platform
  • Built a GraphRAG Compliance Engine performing real-time agentic global legal search for a global ride-sharing company
  • Multi-agent orchestration with function calling, vector databases, and prompt engineering at enterprise scale
6+
Production Agent Systems
1
Patent Pending
☁️

Cloud & Infrastructure

Architecting scalable, secure cloud infrastructure across AWS, Azure, and GCP — from serverless microservices to GPU clusters for model training and inference at enterprise scale.

AWS Azure GCP Lambda Labs Docker Kubernetes Terraform

Deep Dive

  • Designed best-in-class AWS architecture for agent orchestration using Strands + Bedrock
  • Deployed an SLM on a hybrid Lambda Labs + Azure GPU server stack for government production workloads
  • Built full-stack on-premises infrastructure for real-time agentic call center operations
  • MCP / A2A protocol integration, OpenTelemetry observability, and Terraform-based IaC
3
Cloud Platforms
E2E
Infra to Production
🏦

Banking & Risk Automation

Automating RCSA workflows, fraud detection, credit risk modeling, and regulatory compliance — deploying AI-powered risk intelligence across major financial institutions in LATAM and North America.

RCSA Fraud Detection Credit Risk CCAR Compliance

Deep Dive

  • RCSA agentic solution deployed across 6+ regional banks with multi-agent compliance orchestration
  • AI-driven reconciliation algorithms for syndicated lending at a top-3 Japanese bank — 92% backlog reduction
  • Validated fraud, optimization, and credit risk models for multiple LATAM and North American banks
  • Led C&I CCAR modeling and automated validation suites at U.S. Bancorp
92%
Backlog Reduction
16+
Banks Served

Data Engineering & ML Ops

Building high-performance data pipelines, model training infrastructure, and real-time analytics systems — from Snowflake ETL to LoRA fine-tuning and Monte Carlo scenario analysis.

Python PyTorch Snowflake CUDA CI/CD FastAPI

Deep Dive

  • Automated SQL querying and model training with real-time Monte Carlo & Bayesian scenario analysis covering a workforce of 30k+ employees
  • Developed McKinsey's Model Bias Testing Framework — identified critical biases in 3 production healthcare models
  • Built object detection neural networks and adversarial detection models for DARPA / Department of Defense
  • Researched neural network video compression on IBM HAL cluster at NCSA
30k+
Employees Analyzed
DARPA
Research Partner

Impact-driven
projects at scale

01

Multi-Agent Workforce Planning Platform

AWS Strands + Bedrock agentic platform enabling automated SQL querying, model training, and real-time Monte Carlo & Bayesian scenario analysis for 30,000+ employees at a major regional bank.

30k+
Employees Covered

Technical Stack

  • AWS Strands agent orchestration
  • Amazon Bedrock (Anthropic Claude)
  • Automated SQL generation & execution
  • Real-time Monte Carlo simulation
  • Bayesian scenario modeling
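
As a rough illustration of how Bayesian scenario modeling and Monte Carlo simulation can fit together in workforce planning, here is a minimal sketch using only the Python standard library. It is not the platform's code: the Beta prior, the monthly attrition framing, and the names `posterior_attrition`, `simulate_headcount`, and `summarize` are all assumptions made for the example. Each scenario draws one attrition rate from the posterior and applies the expected number of leavers each month, so the spread in outcomes reflects uncertainty in the rate only.

```python
import random
import statistics


def posterior_attrition(prior_a, prior_b, leavers, stayers):
    """Beta-Binomial update: posterior (alpha, beta) for the monthly attrition rate."""
    return prior_a + leavers, prior_b + stayers


def simulate_headcount(start, alpha, beta, months=12, hires_per_month=0,
                       n_sims=2000, seed=0):
    """Monte Carlo over the posterior: one sampled attrition rate per scenario."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    finals = []
    for _ in range(n_sims):
        rate = rng.betavariate(alpha, beta)  # posterior draw for this scenario
        headcount = start
        for _ in range(months):
            # Simplification (assumption): apply expected leavers, not a binomial draw.
            headcount = headcount - round(headcount * rate) + hires_per_month
        finals.append(headcount)
    return finals


def summarize(finals):
    """5th percentile, median, and 95th percentile of simulated final headcount."""
    ordered = sorted(finals)
    n = len(ordered)
    return {
        "p05": ordered[int(0.05 * n)],
        "median": statistics.median(ordered),
        "p95": ordered[int(0.95 * n)],
    }


if __name__ == "__main__":
    # Hypothetical observation: 300 leavers out of 30,000 employees last month.
    alpha, beta = posterior_attrition(1, 1, leavers=300, stayers=29_700)
    print(summarize(simulate_headcount(30_000, alpha, beta)))
```

In a setup like this, the scenario knobs (`months`, `hires_per_month`, the prior) are the kind of parameters an agent could set from a natural-language question before running the simulation.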

Impact & Outcomes

  • Covers 30,000+ employees in real time
  • Automated model training pipeline
  • Best-in-class AWS agent architecture
  • Executive-ready reporting dashboards
02

Real-Time Agentic Call Center Solution

Full-stack delivery, from on-premises infrastructure setup through agent development, testing, and production rollout, completed end to end in 6 months.

57%
Efficiency Gain

What Was Built

  • On-premises infrastructure design & setup
  • Real-time agentic response system
  • End-to-end testing & QA framework
  • Production deployment & monitoring

Results

  • 57% efficiency improvement in response handling
  • 6-month concept-to-production timeline
  • Full-stack ownership: infra → agents → deploy
03

RCSA Agentic Automation

AI-powered Risk & Control Self-Assessment solution deployed across 6 regional banks, automating complex compliance workflows with multi-agent orchestration.

6
Banks Deployed

Architecture

  • Multi-agent compliance orchestration
  • Automated risk assessment workflows
  • Cross-bank deployment framework
  • Regulatory document analysis

Scale

  • 6 regional banks in production
  • Complex compliance workflow automation
  • Standardized RCSA across institutions
04

GraphRAG Compliance Engine

Real-time agentic global legal search engine for a global ride-sharing company, performing automated compliance analysis across international regulatory frameworks.

Global
Legal Coverage

How It Works

  • Knowledge graph of global regulations
  • Agentic retrieval-augmented generation
  • Real-time legal search across jurisdictions
  • Automated compliance gap analysis
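
To make the graph-based retrieval step concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the engine described above: the toy graph, its placeholder node texts, and the helpers `seed_nodes`, `expand`, and `build_context` are hypothetical, and keyword overlap stands in for whatever embedding or search index a real system would use. The idea shown is the usual GraphRAG pattern: find seed nodes for a query, follow graph links a bounded number of hops, and pass the collected passages to the language model as context.

```python
from collections import deque

# Toy regulation graph (placeholder content, not real legal text).
TOY_GRAPH = {
    "A": {"text": "driver background checks required", "jurisdiction": "X", "links": ["B"]},
    "B": {"text": "records retention period", "jurisdiction": "X", "links": ["C"]},
    "C": {"text": "insurance minimum coverage", "jurisdiction": "Y", "links": []},
}


def seed_nodes(graph, query):
    """Pick nodes whose text shares at least one word with the query."""
    terms = set(query.lower().split())
    return [nid for nid, node in graph.items()
            if terms & set(node["text"].lower().split())]


def expand(graph, seeds, max_hops=1):
    """Breadth-first expansion from the seeds, up to max_hops links away."""
    hops = dict.fromkeys(seeds, 0)  # node id -> distance from nearest seed
    queue = deque(seeds)
    while queue:
        nid = queue.popleft()
        if hops[nid] >= max_hops:
            continue
        for neighbor in graph[nid]["links"]:
            if neighbor not in hops:
                hops[neighbor] = hops[nid] + 1
                queue.append(neighbor)
    return list(hops)


def build_context(graph, query, jurisdiction=None, max_hops=1):
    """Return (node id, text) pairs to hand to the LLM, optionally filtered by jurisdiction."""
    ids = expand(graph, seed_nodes(graph, query), max_hops)
    return [(nid, graph[nid]["text"]) for nid in ids
            if jurisdiction is None or graph[nid]["jurisdiction"] == jurisdiction]


if __name__ == "__main__":
    print(build_context(TOY_GRAPH, "background checks", jurisdiction="X", max_hops=2))
```

The hop limit is the main design lever here: a larger `max_hops` pulls in more related provisions at the cost of a longer, noisier context.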

Capabilities

  • Multi-jurisdictional regulatory coverage
  • Natural language legal querying
  • Continuous regulatory update ingestion
05

GenAI Knowledge Retrieval — Commercial Bank

Led full lifecycle of a Knowledge Retrieval platform — process mapping, business case, requirements, cross-regional build, and CoE launch — leveraging Azure and advanced RAG for 10,000+ documents across EMEA, Americas, and India.

85%
Adoption Rate

Leadership Scope

  • 10+ cross-functional technical teams
  • McKinsey, client, and vendor coordination
  • Global delivery: Europe, India, U.S.

Outcomes

  • 85% user adoption rate
  • Enterprise-wide knowledge retrieval
  • Commercial Investment Bank deployment
06

SLM Fine-Tuning — Government Tax Platform

Trained a Small Language Model for a Central Asian government using LoRA fine-tuning, DPO, and knowledge distillation, deployed within an Agentic RAG architecture.

SLM
Custom Model

ML Techniques

  • LoRA fine-tuning for domain adaptation
  • Direct Preference Optimization (DPO)
  • Knowledge distillation from larger models
  • Agentic RAG deployment architecture
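
For readers unfamiliar with LoRA, the core idea can be sketched in a few lines of plain Python. This is a didactic toy, not the training code used on the project: the helper names `matmul`, `lora_weight`, and `trainable_params` are made up for the example. LoRA keeps the base weight matrix W frozen and learns a low-rank update, so the effective weight is W + (alpha / r) · B · A, where A is r × k and B is d × r.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_weight(w, a, b, alpha):
    """Effective weight W + (alpha / r) * B @ A, with A of shape r x k and B of shape d x r."""
    r = len(a)  # rank of the adapter
    scale = alpha / r
    delta = matmul(b, a)  # d x k low-rank update
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def trainable_params(d, k, r):
    """Parameters updated by full fine-tuning versus a rank-r LoRA adapter."""
    return {"full": d * k, "lora": r * (d + k)}


if __name__ == "__main__":
    w = [[1.0, 2.0], [3.0, 4.0]]
    a = [[1.0, 1.0]]       # r = 1, k = 2
    b = [[0.0], [0.0]]     # B starts at zero, so the adapter is a no-op at first
    print(lora_weight(w, a, b, alpha=2.0))
    print(trainable_params(4096, 4096, 8))
```

For a single 4096 × 4096 layer at rank 8, the adapter trains 65,536 parameters instead of 16,777,216, which is why LoRA adapters stay small relative to the base model.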

Infrastructure

  • Lambda Labs GPU training cluster
  • Azure production serving stack
  • Government-grade security compliance
07

Taxy.AI — Agentic Tax Preparation Assistant

Locally hosted, AI-powered tax prep assistant mirroring the TurboTax guided experience. Combines Mistral OCR, dual-LLM analysis (Claude + OpenAI with RAG), confidence scoring, IRS Form 1040 generation, and autonomous agentic orchestration.

View on GitHub →
9-Step
Wizard UI

Technical Stack

  • Autonomous n0 agent loop with TodoWrite planning
  • Dual-LLM analysis (Anthropic Claude + OpenAI Assistants/RAG)
  • Mistral OCR 3 for document extraction
  • React + Vite frontend, FastAPI backend
  • OpenTelemetry tracing & JSONL audit trail

Impact & Outcomes

  • IRS Form 1040 AcroForm PDF generation (23 fields)
  • GREEN/AMBER/RED/YELLOW confidence scoring engine
  • 97 automated tests with digital twin framework
  • Real-time SSE streaming with human-in-the-loop
08

LoRA MultiModal Fine-Tuning

End-to-end pipeline for fine-tuning HunyuanVideo (13B params) with LoRA to generate personalized videos from text prompts. Uses a trigger-token approach to bind specific subjects during training, enabling placement into novel scenes during inference.

View on GitHub →
13B
Parameter Model

Key Technical Features

  • HunyuanVideo 13B base model with ~100MB LoRA adapter
  • Trigger-token subject binding for personalized generation
  • Gemini 2.5 Flash automated captioning pipeline
  • 6x data augmentation (temporal crops + horizontal flips)
  • FP8 quantization & bfloat16 mixed-precision training
  • FFmpeg/OpenCV normalization (768×512, 24fps)
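
One way a 6x augmentation factor can arise from temporal crops plus horizontal flips is three crops per clip, each kept in original and mirrored form. The sketch below shows that combination in plain Python. The 3 × 2 split is an assumption, as are the helper names `temporal_crops`, `hflip`, and `augment`; frames are modeled as nested lists of pixel rows rather than real video tensors.

```python
def temporal_crops(frames, crop_len, n_crops=3):
    """Return n_crops evenly spaced windows of crop_len consecutive frames."""
    if crop_len > len(frames):
        raise ValueError("crop_len exceeds clip length")
    max_start = len(frames) - crop_len
    if n_crops == 1:
        starts = [0]
    else:
        starts = [round(i * max_start / (n_crops - 1)) for i in range(n_crops)]
    return [frames[s:s + crop_len] for s in starts]


def hflip(clip):
    """Mirror every frame left to right by reversing each pixel row."""
    return [[row[::-1] for row in frame] for frame in clip]


def augment(frames, crop_len, n_crops=3):
    """Each temporal crop plus its mirrored copy: 2 * n_crops clips per source video."""
    out = []
    for clip in temporal_crops(frames, crop_len, n_crops):
        out.append(clip)
        out.append(hflip(clip))
    return out


if __name__ == "__main__":
    # Ten tiny "frames", each a single row of two pixels.
    video = [[[t, t + 1]] for t in range(10)]
    print(len(augment(video, crop_len=4)))
```

Horizontal flips are only safe when the subject has no meaningful left/right asymmetry (text, logos), which is worth checking before applying this to a personalized-subject dataset.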

See it in action

COBOL → Python Migration

AI-powered legacy code transformation — converting enterprise COBOL to modern Python with full logic preservation and test coverage.

Claude Python AST Parsing
Watch on YouTube →
Workforce Management Planning

Multi-agent platform leveraging AWS Strands and Bedrock for automated SQL querying, model training, and real-time scenario analysis.

AWS Strands Bedrock Monte Carlo
Watch on YouTube →
RCSA Agentic Automation

AI-powered Risk & Control Self-Assessment automation — streamlining compliance workflows with multi-agent orchestration.

Multi-Agent Compliance LangGraph
Watch on YouTube →
Agentic Taxy.AI

Full walkthrough of the AI-powered tax preparation assistant — from document upload and OCR to dual-LLM analysis, confidence scoring, and IRS Form 1040 generation.

Claude OpenAI OCR React
Watch on Vimeo →
LoRA MultiModal Fine-Tuning

End-to-end LoRA fine-tuning pipeline — from video normalization and Gemini-powered captioning through multi-GPU training to personalized video generation from text prompts.

PyTorch LoRA Gemini HuggingFace
Watch on YouTube →

Tools & technologies
I work with daily

AI & Frameworks

  • OpenAI Agents SDK
  • AWS Strands / Bedrock Agents
  • LangGraph / AutoGen
  • Claude API / Gemini API
  • PyTorch / CUDA
  • Palantir AIP
  • RAG / Vector DBs

Cloud & Infrastructure

  • AWS (Lambda, S3, DynamoDB)
  • Azure / GCP
  • Lambda Labs (GPU)
  • Docker / Kubernetes
  • Terraform / IaC
  • MCP / A2A Protocols
  • CI/CD / OpenTelemetry

Languages & Data

  • Python
  • C/C++ / CUDA
  • Julia
  • JavaScript / TypeScript
  • SQL / NoSQL
  • React / FastAPI
  • Claude Code / CODEX

Thoughts on AI,
engineering & strategy

Deep Dive · 10 min read

Vibe Coding vs. Production Agentic AI: What the Demos Won't Show You

The gap between a working prototype and a production agentic system is not incremental — it is architectural. Exploring seven critical failure modes and the infrastructure to survive them.

Read Deep Dive →
  • AI Code Generation Is Barely Touching 30% of Software (LinkedIn)
  • Revolutionizing KYC with Agentic AI and Semantic Search (Medium)
  • 5 Reasons Agentic AI Fails — and How to Avoid Them (Medium)
  • From Art to Engineering: A Practical Rubric for GPT-4.1 Prompt Design (Medium)
  • Enhancing Entity Resolution Using Generative AI — Part 1 (Medium)
  • GenAI Defensive Data Poisoning (Medium)
  • Knowledge Graphs vs. Agentic RAG — Part 1 (Medium)
  • Reviewing YOLOv4 (Medium)
  • YOLOv3 PyTorch Video & Image Model (Medium)
  • What Is ShuffleNet? (Medium)

View All on Medium →

From founder to
enterprise AI leader

2024 – Present
Expert Engagement Manager — AI
McKinsey & Company, Austin, TX
Leading end-to-end delivery of large-scale GenAI solutions for financial institutions — Knowledge Retrieval platforms, Data Quality GenAI systems, Banking Control AI, and Knowledge Graph RAG frameworks. Patent pending on hallucination detection. Promoted 3× faster than standard timeline.
2023 – 2024
Specialist — Data Science & Analytics
McKinsey & Company, Boston, MA
Developed a patent-pending hallucination detection framework. Designed AI reconciliation algorithms reducing backlog by 92% at a top-3 Japanese bank. Built McKinsey's Model Bias Testing Framework.
2021 – 2023
Senior Analyst
McKinsey & Company
Leading expert in Banking Fraud/AML analytics and transformations. Developed best-in-class Fairness and Bias modeling standards. Expertise in Loan Operations and Model Risk Management across international clients.
2020 – 2021
Lead Quantitative Risk Model Developer
U.S. Bank, Minneapolis, MN
Advanced from intern to Lead in 6 months. Led C&I CCAR modeling, developed the first full Python model development pipeline, and converted CCAR/CECL SAS code to Python. Built Wholesale hazard and failure time models.
2020
AI/FinTech Machine Learning Engineer
Neocova, St. Louis, MO
Managed four teams (24 interns) in an Agile environment. Developed ML models for Community Bank valuation and deployed FinTech valuation tools on Azure/R/Python in a record six weeks.
2020
ML & AI Summer Consultant
Retinal Care Inc., Durham, NC
Developed and deployed a Deep Learning Rank Model for Diabetic Retinopathy identification, outperforming current industry models. Led a cross-discipline team of four through the full development cycle.
2019 – 2021
Graduate Research Assistant
Duke Applied Machine Learning Lab
Worked on classified Department of Defense Machine Learning projects. Implemented YOLOv3 and CenterNet models in PyTorch. Multiple publications and conference presentations.
2019
Research Fellow
National Center for Supercomputing Applications (NCSA), UIUC
Researched statistical learning for graphene nanomanufacturing and designed a deep learning framework for near-duplicate image detection that outperformed all prior work. Nominated for Best Oral Presentation at ISRS'19.
2013 – 2020
Founder & CEO
Fast River Logistics Inc., Houston, TX
Founded and scaled an interstate freight trucking company from 0 to 14 vehicles and 18 employees. Expanded to all 48 states and Mexico with 6 consecutive years of profit growth.

The person behind
the architecture

Before I ever wrote a line of code for McKinsey, I was running 18-wheelers across 48 states. I founded Fast River Logistics at 22 and spent seven years learning that the hardest engineering problems aren't technical — they're about people, systems, and relentless execution under pressure.

That operator's mindset followed me through a Computer Engineering Master's at Duke, DARPA-funded research in adversarial AI at the Applied Machine Learning Lab, and into McKinsey, where I now lead the end-to-end delivery of enterprise Generative AI solutions for financial institutions worldwide. I've shipped agentic systems serving tens of thousands of users, have a patent pending on LLM hallucination detection, and was promoted three times faster than the standard timeline.

Fluent in English and Spanish, conversational in Russian, and learning Arabic — I bring a global perspective and a builder's intensity to every system I architect.

Download Résumé
🚀

3× Faster Promotion

Specialist → Engagement Manager in under 1 year at McKinsey. Standard timeline is 3+ years.

📜

Patent Pending

Novel hallucination detection methodology for LLMs and agentic systems, commercially deployed across banking clients.

🎓

Duke + DARPA Research

MS Computer Engineering (3.8 GPA). Developed adversarial detection models for the Department of Defense.

🏗️

Founder at 22

Built Fast River Logistics from zero to 48-state operations with 6 years of consistent profit growth.

🌍

Multilingual

English & Spanish (native/bilingual), Russian (elementary) — effective across global teams.

📚

Teaching & Mentorship

Graduate TA at Duke Fuqua (MBA) and Pratt (Engineering). Student mentor for Duke Athletics and subcommittee chair on Pratt's DEI Committee.

🔬

Published Researcher

NCSA Research Fellow (UIUC), XSEDE EMPOWER Apprentice, PEARC'19 conference publication, nominated for Best Oral Presentation at ISRS'19.

Academic foundations

M.Eng. Computer Engineering
Duke University — Pratt School of Engineering
2019–2021
Summer Research — Computer Engineering
University of Illinois at Urbana-Champaign (NCSA)
2019
B.S. Computer Engineering
University of Houston-Clear Lake
2017–2019
B.A. Economics
Grinnell College
2009–2013
Physics
Dartmouth College
2008–2009

Continuous learning

🧑‍💻
Claude Code in Action
Anthropic
🤖
AI Agents Fundamentals
Nanodegree
👁️
Computer Vision Nanodegree
Udacity
🐍
AI Programming with Python Nanodegree
Udacity
📊
Fundamentals of Quantitative Modeling
Wharton Online
📈
Modeling Risk and Realities
Wharton Online
Let's Connect

Ready to build
something intelligent?

Whether you're exploring AI transformation, scaling agentic systems, or modernizing financial infrastructure — I'd love to hear about your challenge.