Senior LLM Engineer · Data Scientist (NLP)

I build production AI for enterprise.

6+ years shipping NLP products end-to-end — from NL2SQL and semantic search to RAG, agent orchestration, and fine-tuning — for banking and financial services clients. Columbia University, M.A. Statistics.


Engineer, not evangelist.

I'm an AI/LLM engineer who ships production systems — not demos, not wrappers. My career has been spent at the intersection of NLP, data science, and enterprise software, building tools that real users depend on daily.

At Aunalytics, I led the modernization of legacy deep learning systems into LLM-based architectures for banking clients — building NL2SQL engines, intelligent agent routing, RAG pipelines, and compliance-first safeguards.

Now I focus on LLM fine-tuning (LoRA, DPO, RLHF), agentic workflows (LangGraph, Google ADK), and evaluation frameworks. I publish hands-on walkthroughs on YouTube to share what I learn while building.

6+ years NLP experience
35+ tutorials published
2K+ YouTube subscribers
95% agent eval accuracy

Where I've built.
Jul 2024 – Present
Independent AI Researcher & Educator
Remote · YouTube @LLMImplementation
  • Implemented DPO safety alignment on Llama 3.2, achieving 100% jailbreak refusal at ~72% lower cost than RLHF/PPO ($28 vs $100+).
  • Engineered a $0.62 prompt-distillation pipeline: 120B teacher → Qwen 30B student via LoRA (tutorial reached 15K+ views).
  • Built fully local RAG agent (LangGraph + Ollama) with ~60% latency reduction versus GPT-4o baseline.
  • Designed production Career Agent — LangGraph state machine with multi-layer filtering, self-reflection, and 4-layer evaluation framework (95% intent accuracy, 100% SQL correctness).
  • Delivered paid industry collaboration building end-to-end fine-tuning demos (SFT + RL) using partner SDK.
Jan 2022 – Jul 2024
Data Scientist — NLP / Product Tech Lead
Aunalytics · South Bend, Indiana
  • NLP Tech Lead for Generative AI — re-architected legacy DL systems into LLM architectures (GPT-4) for enterprise banking clients.
  • Built NL2SQL engine with vector-based schema pruning (60% token cost reduction) and compliance-first SQL validation.
  • Engineered Financial Agent with context-aware routing — >95% Precision/Recall, with validation & retry for JSON payloads.
  • Deployed Support RAG system indexing Jira tickets and logs, reducing triage time by 90% via LLM summarization.
  • Owned release lifecycle — FastAPI, Docker, Elasticsearch consolidation.
Feb 2020 – Dec 2021
Data Scientist — NLP / NL2SQL
Aunalytics · South Bend, Indiana
  • Co-authored "An Optimized NL2SQL System for Enterprise Data Mart" for banking data marts.
  • Built synthetic data generation engine for finance-domain NL–SQL pairs at scale.
  • Designed semantic value-matching (Elasticsearch) improving exact-match accuracy from ~1% to ~45%.
  • Re-architected NL autocomplete to low-latency Elasticsearch service (seconds → milliseconds).
Jul – Dec 2019
Data Research & Analysis Intern
Impending Bloom · New York
  • Built ML classification and entity-matching pipelines (BigQuery, PostgreSQL, Elasticsearch, MongoDB) achieving ~90% precision.

What I've shipped.
Automation
Career Agent — Daily Pipeline
AI job-matching system running daily in production. LangGraph state machine with ReAct self-reflection, multi-layer filtering (heuristic + LLM), artifact generation, and a 4-layer evaluation framework (95% intent accuracy).
LangGraph · Gemini · SQLite · Milvus · FastAPI
Alignment
DPO Safety — Llama 3.2
Compared PPO-RLHF vs DPO for safety alignment. DPO achieved 100% jailbreak refusal at ~72% lower cost. Custom eval harness with LLM-as-a-judge rubric and adversarial testing.
DPO · RLHF · LoRA · Unsloth · Evaluation
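For readers unfamiliar with DPO, the per-pair objective can be sketched in a few lines. This is an illustrative sketch only, not the project's training code: the function name, the example log-probabilities, and the default `beta=0.1` are assumptions chosen for the demonstration.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair.

    Inputs are summed log-probabilities of the chosen and rejected
    responses under the policy (pi_*) and the frozen reference model
    (ref_*). beta=0.1 is an assumed value, not taken from the project.
    """
    # How much more the policy prefers "chosen" over "rejected",
    # relative to the reference model, scaled by beta.
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(margin)) == log(1 + exp(-margin)), computed stably.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))

if __name__ == "__main__":
    # Hypothetical log-probs: the policy favors the safe (chosen) reply.
    print(dpo_loss(pi_chosen=-12.0, pi_rejected=-20.0,
                   ref_chosen=-15.0, ref_rejected=-15.0))
```

The loss falls as the policy raises the chosen response's likelihood relative to the reference, which is why DPO needs no separate reward model or PPO rollout loop.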
RAG
Local RAG Agent — Zero Cost
Fully local RAG with LangGraph + Ollama and self-correcting retrieval. EmbeddingGemma with dimension truncation (768→256). 1.3 s latency, ~60% lower than the GPT-4o baseline.
LangGraph · Ollama · sqlite-vec · EmbeddingGemma
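The 768→256 dimension truncation mentioned above can be sketched as follows. This is a minimal illustration, assuming the embedding model was trained so that leading dimensions carry most of the signal (Matryoshka-style); the function name and the stand-in vector are hypothetical, and the project's actual code may differ.

```python
import math

def truncate_embedding(vec, dim=256):
    """Keep the first `dim` components and L2-normalize the result."""
    if dim <= 0 or dim > len(vec):
        raise ValueError("dim must be between 1 and len(vec)")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return list(head)  # all-zero vector: nothing to normalize
    return [x / norm for x in head]

if __name__ == "__main__":
    # Stand-in for a real 768-dimensional embedding.
    full = [math.sin(i + 1) for i in range(768)]
    small = truncate_embedding(full, 256)
    print(len(small))
```

Re-normalizing after truncation keeps cosine similarity well defined, and the shorter vectors cut index size and distance-computation cost to roughly a third.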
Fine-tuning
$0.62 Prompt Distillation
Knowledge distillation from 120B teacher to Qwen 30B MoE student via LoRA on Tinker. Synthetic data generation + distributed training for under $1. Tutorial reached 15K+ views.
Tinker · LoRA · Qwen 30B MoE · Distillation
Evaluation
LLM Evaluation Pipeline
Perplexity diagnostics, Functional Correctness (pass@k), Semantic Similarity (embeddings), and LLM-as-a-Judge with position bias testing. Three-pillar prompt design for robust scoring.
Python · sentence-transformers · Hugging Face · OpenRouter
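As an illustration of the functional-correctness metric named above, pass@k is commonly computed with the unbiased estimator 1 - C(n-c, k) / C(n, k), where n samples are generated per problem and c of them pass the tests. The sketch below assumes that standard formulation; the function name and the example counts are hypothetical and not taken from the pipeline itself.

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimate for one problem.

    n: total samples generated, c: samples that passed, k: budget.
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    if n - c < k:
        return 1.0  # every size-k subset contains at least one pass
    # Probability that a random size-k subset contains no passing sample.
    return 1.0 - comb(n - c, k) / comb(n, k)

if __name__ == "__main__":
    # Hypothetical run: 10 samples, 3 passed the unit tests.
    for k in (1, 5, 10):
        print(k, pass_at_k(10, 3, k))
```

Averaging this value over all problems in a benchmark gives the reported pass@k score.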

What I teach.

LLM Implementation

Hands-on LLM engineering — RAG, agents, fine-tuning, evaluation.


What I work with.
AI & Agents
  • LangGraph / LangChain
  • Google ADK / MCP
  • RAG Pipelines
  • Tool Routing & Orchestration
  • LLM Evaluation (LLM-as-a-Judge)
  • Ollama / LiteLLM
Fine-tuning & Training
  • PyTorch / Hugging Face
  • Unsloth / QLoRA / LoRA
  • DPO / PPO / RLHF
  • Prompt Distillation (Tinker)
  • 4-bit Quantization
  • Weights & Biases
MLOps & Infra
  • Python / FastAPI
  • Docker / CI/CD
  • GCP (Vertex AI, Cloud Run)
  • LangSmith / Langfuse
  • SQL / Elasticsearch
  • Vector DBs (Milvus, sqlite-vec)
Models
  • GPT-4 / GPT-OSS
  • Gemini 2.5 / 3
  • Claude
  • Llama 3 / Qwen / DeepSeek
  • IBM Granite
  • Sentence-BERT / EmbeddingGemma
Data Science
  • NL2SQL / Semantic Parsing
  • Statistical Modeling
  • scikit-learn / XGBoost
  • Pandas / NumPy
  • Experiment Design
  • Semantic Search
Workflow
  • Git / GitHub
  • uv / pip
  • Google Colab
  • VS Code / Cline
  • LangGraph Studio
  • macOS / Linux

Let's build something.

Open to full-time, contract, and remote opportunities. Based in Burnaby, BC, Canada — Open Work Permit.