Senior AI Engineer · LLM Agents · RAG · NLP

I build production AI for enterprise.

Senior AI Engineer with 6+ years building production AI systems from zero to one. At Aunalytics, created the Text-to-SQL training pipeline, Support RAG system, and Elasticsearch autocomplete from scratch; led BERT → GPT-4 migration (~20% accuracy gain, zero production hotfixes). Now building agentic systems independently — Jupiter is live. YouTube @LLMImplementation (2K+ subscribers, 35+ videos). Columbia University, M.A. Statistics.


Engineer, not evangelist.

I'm an AI engineer who builds production AI systems from zero to one — not demos, not wrappers. My career has been spent at the intersection of NLP, data science, and enterprise software, building tools that real users depend on daily.

At Aunalytics, I created the Text-to-SQL training pipeline, Support RAG system, and Elasticsearch autocomplete from scratch, and led the BERT → GPT-4 migration (~20% accuracy gain, zero production hotfixes). I also mentored junior data scientists and partnered cross-functionally with Product, DevOps, and QA.

Now building agentic systems independently — Jupiter (jupiterpath.dev) is live. I also publish hands-on walkthroughs on YouTube @LLMImplementation (2K+ subscribers, 35+ videos) and fine-tune/align LLMs using LoRA, DPO, and prompt distillation.

6+ years NLP experience
35+ tutorials published
2K+ YouTube subscribers
98% intent classification accuracy

Where I've built.
Jul 2024 – Present
Senior AI Engineer | Remote
Independent · YouTube @LLMImplementation
  • Jupiter Career Agent (jupiterpath.dev) — Live production app: Built LangGraph agentic system with intent-based routing and hybrid RAG retrieval (Milvus + BM25 + Cohere Rerank). Stack: FastAPI, PostgreSQL, Milvus, Redis.
  • Tiered model routing: lightweight models for intent classification, reasoning models for synthesis. Indexed 5,300+ jobs with semantic + keyword hybrid search. 98% intent classification accuracy (N=120).
  • Rolling out premium features: Text-to-SQL, web search (Tavily), data analysis dashboards, multi-step planning (Planner → Executor → Reflector → Composer), and artifact generation pipeline.
  • Technical Content & Consulting:
  • Fine-tuned and aligned LLMs using LoRA, DPO, and prompt distillation (120B → 30B for $0.62). Published hands-on walkthroughs (2K+ subs, 35+ videos; top tutorial: 16K+ views).
  • Built the channel using an AI-native content pipeline: LLMs for transcript generation and video structure, TTS voice cloning from a single voice sample, and human-in-the-loop review for technical accuracy.
  • Delivered a paid industry collaboration, building end-to-end SFT + RL fine-tuning demos with the partner's SDK.
Jan 2022 – Jul 2024
Data Scientist, NLP Tech Lead
Aunalytics · South Bend, Indiana
  • NLP Tech Lead for a Generative AI product serving community banks. Led migration from BERT/LSTM to LLM APIs (Codex → GPT-3.5 → GPT-4), improving SQL accuracy ~20% and reducing client onboarding from ~4 weeks to 1 sprint. Owned release lifecycle with zero hotfixes; mentored junior data scientists; partnered cross-functionally with Product/DevOps/QA.
  • Text-to-SQL: Vector-based schema pruning cutting token costs ~60%; business term disambiguation; compliance-first SQL validator with strict whitelisting; self-improving few-shot example store.
  • Financial Agent: Multi-tool agent with context-aware routing achieving >95% precision/recall, eliminating tool-hallucination blockers. Validation & retry for JSON payloads (+20% execution success).
  • Support RAG: Identified bottleneck via stakeholder interviews (100+ issues/week). Indexed Jira tickets + logs into Elasticsearch vector search with LLM summarization — triage reduced from hours to <1 min.
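
The "validation & retry for JSON payloads" pattern mentioned above can be sketched roughly as follows. This is an illustrative outline only, not the production code: `call_with_json_retry`, `fake_model`, the required keys, and the retry limit are all hypothetical names and values chosen for the example.

```python
import json

def call_with_json_retry(generate, required_keys, max_attempts=3):
    """Call `generate` until it returns a JSON object containing all required keys.

    `generate` is any callable that takes the previous error message (or None)
    and returns a string; in a real agent it would wrap a model call.
    """
    feedback = None
    for attempt in range(1, max_attempts + 1):
        raw = generate(feedback)
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            feedback = f"Invalid JSON: {exc.msg}"
            continue
        if not isinstance(payload, dict):
            feedback = "Payload must be a JSON object"
            continue
        missing = [key for key in required_keys if key not in payload]
        if missing:
            feedback = f"Missing keys: {missing}"
            continue
        return payload, attempt
    raise ValueError(f"No valid payload after {max_attempts} attempts: {feedback}")

# Stand-in for a model: the first reply is truncated JSON, the second is valid.
_replies = iter([
    '{"tool": "balance"',
    '{"tool": "balance", "args": {"account": "A1"}}',
])

def fake_model(feedback):
    return next(_replies)

payload, attempts = call_with_json_retry(fake_model, ["tool", "args"])
```

Passing the validation error back into the next call gives the model a chance to correct the specific problem instead of retrying blindly.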
Feb 2020 – Dec 2021
Data Scientist, NLP / NL2SQL
Aunalytics · South Bend, Indiana
  • Co-authored "An Optimized NL2SQL System for Enterprise Data Mart" (ECML PKDD 2021, Springer LNCS). Created synthetic training data from scratch when no banking NL-to-SQL datasets existed — grew to 150K pairs.
  • Semantic value-matching raised exact-match accuracy from ~1% to ~45%.
  • Re-architected autoregressive Transformer prototype into low-latency Elasticsearch autocomplete (seconds → milliseconds) for production banking clients.

What I've shipped.
Live Product
Jupiter Career Agent — jupiterpath.dev
Live production agentic system with intent-based routing and hybrid RAG retrieval (Milvus + BM25 + Cohere Rerank). Tiered model routing: lightweight models for intent classification, reasoning models for synthesis. 5,300+ jobs indexed with semantic + keyword hybrid search. 98% intent classification accuracy (N=120). Rolling out premium features: Text-to-SQL, web search (Tavily), data analysis dashboards, multi-step planning, and artifact generation.
LangGraph · FastAPI · Next.js · PostgreSQL · Milvus · Redis · BM25 · Cohere Rerank · Tavily
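
One common way to merge dense and keyword results in a hybrid retrieval setup like this is reciprocal rank fusion (RRF). The sketch below is illustrative only: the description above does not say how Jupiter combines Milvus and BM25 hits before reranking, so the function name, the `k=60` constant, and the job IDs are assumptions.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) per list it appears in; scores are summed.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))

dense_hits = ["job_12", "job_7", "job_3"]    # hypothetical vector-search results
keyword_hits = ["job_7", "job_9", "job_12"]  # hypothetical BM25 results
fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
```

Documents that appear in both lists rise to the top, and the fused candidates could then be passed to a reranker.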
Alignment
DPO Safety — Llama 3.2
Compared PPO-RLHF vs DPO for safety alignment. DPO achieved 100% jailbreak refusal at ~72% lower cost. Custom eval harness with LLM-as-a-judge rubric and adversarial testing.
DPO · RLHF · LoRA · Unsloth · Evaluation
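
For context, the DPO objective for a single preference pair fits in a few lines. This is a minimal sketch of the standard published loss, not the training code from this project; `beta=0.1` and the log-probability values are hypothetical.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """DPO loss for one (chosen, rejected) pair given sequence log-probabilities."""
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable way.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))

# Policy identical to the reference: the loss equals log(2).
neutral = dpo_loss(-12.0, -15.0, -12.0, -15.0)
# Policy prefers the chosen (safe) response more than the reference does: lower loss.
improved = dpo_loss(-10.0, -18.0, -12.0, -15.0)
```

Unlike PPO-based RLHF, this loss needs no separate reward model or sampling loop during training.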
RAG
Local RAG Agent — Zero Cost
Fully local RAG with LangGraph + Ollama and self-correcting retrieval. EmbeddingGemma with dimension truncation (768→256). 1.3s latency — 60% faster than GPT-4o.
LangGraph · Ollama · sqlite-vec · EmbeddingGemma
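
Dimension truncation of the kind described here (768→256) usually means keeping the leading components of the embedding and re-normalizing, so cosine similarity still behaves as expected. A minimal sketch, assuming the model was trained so that a prefix of the vector remains meaningful; the function name and the toy input vector are hypothetical.

```python
import math

def truncate_embedding(vector, dim=256):
    """Keep the first `dim` components and rescale them to unit L2 norm."""
    head = list(vector[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # avoid dividing by zero for an all-zero vector
    return [x / norm for x in head]

full = [math.sin(i) for i in range(768)]  # stand-in for a real 768-d embedding
short = truncate_embedding(full, dim=256)
```

Storing 256 instead of 768 floats per chunk cuts index size to a third, which matters for a fully local setup.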
Fine-tuning
$0.62 Prompt Distillation
Knowledge distillation from 120B teacher to Qwen 30B MoE student via LoRA on Tinker. Synthetic data generation + distributed training for under $1. Tutorial reached 16K+ views.
Tinker · LoRA · Qwen 30B MoE · Distillation
Evaluation
LLM Evaluation Pipeline
Perplexity diagnostics, Functional Correctness (pass@k), Semantic Similarity (embeddings), and LLM-as-a-Judge with position bias testing. Three-pillar prompt design for robust scoring.
Python · sentence-transformers · Hugging Face · OpenRouter
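
As an illustration of the functional-correctness metric named above, pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass the tests. The sketch below assumes that estimator; the pipeline's actual implementation may differ.

```python
from math import comb

def pass_at_k(n, c, k):
    """Estimate the probability that at least one of k samples passes.

    n: samples generated, c: samples that passed, k: evaluation budget.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        return 1.0  # fewer than k failures, so any k samples include a pass
    return 1.0 - comb(n - c, k) / comb(n, k)

estimate = pass_at_k(n=10, c=3, k=1)
```

With k=1 the estimate reduces to the plain pass rate c / n.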

What I teach.

LLM Implementation

Hands-on LLM engineering — RAG, agents, fine-tuning, evaluation.

What I work with.
Agents & RAG
  • LangGraph / LangChain
  • Google ADK / MCP
  • Tool Routing
  • RAG Design
  • LLM Evaluation
  • Prompt Engineering
Training
  • PyTorch / Hugging Face
  • Unsloth / Tinker
  • LoRA / QLoRA
  • SFT / DPO
  • 4-bit Quantization
  • Weights & Biases
Engineering
  • Python / FastAPI
  • Docker / CI/CD
  • SQL / Elasticsearch
  • Milvus / sqlite-vec
  • GCP (Vertex AI, Cloud Run)
  • LangSmith / Langfuse
Models
  • GPT-4 / GPT-OSS
  • Gemini 2.5 / 3
  • Claude
  • Llama 3 / Qwen / DeepSeek
  • IBM Granite
  • Sentence-BERT / EmbeddingGemma
Data Science
  • NL2SQL / Semantic Parsing
  • Statistical Modeling
  • scikit-learn / XGBoost
  • Pandas / NumPy
  • Experiment Design
  • Semantic Search
Workflow
  • Git / GitHub
  • uv / pip
  • Google Colab
  • VS Code / Cline
  • LangGraph Studio
  • macOS / Linux

Let's build something.

Open to full-time, contract, and remote opportunities. Based in Burnaby, BC, Canada — Open Work Permit.