I build and deploy ML systems including LLM pipelines, recommendation engines, and analytics. I also do AI research, focusing on mechanistic interpretability.
Revealed a late semantic repair pathway inside I-JEPA using causal activation patching and semantic occlusion analysis. Found a decisive bottleneck at encoder layer 29 that specifically repairs missing object-level structure, not just generic masked pixels. Demonstrated that this mechanism generalizes across datasets, object scales, and masking geometries.
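The core mechanic of causal activation patching can be sketched in a few lines. This is a toy illustration, not the I-JEPA study itself: a small linear stack stands in for the encoder, the layer count and channel split are invented, and "restoration" is just a normalized output distance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# Toy 4-layer stack standing in for the encoder (the real study patches I-JEPA).
model = nn.Sequential(*[nn.Linear(8, 8) for _ in range(4)])

def run_with_patch(model, x, patch_layer=None, patch_act=None):
    """Run the model, caching every layer's output; optionally overwrite
    part of one layer's output with a cached "clean" activation."""
    cache, handles = {}, []
    for i, layer in enumerate(model):
        def hook(mod, inp, out, i=i):
            if i == patch_layer:
                out = out.clone()
                out[:, :4] = patch_act[:, :4]  # patch only the hypothetical "object" channels
            cache[i] = out.detach()
            return out
        handles.append(layer.register_forward_hook(hook))
    try:
        y = model(x)
    finally:
        for h in handles:
            h.remove()
    return y, cache

clean = torch.randn(1, 8)    # stand-in for the unmasked input
corrupt = torch.randn(1, 8)  # stand-in for the occluded input

y_clean, clean_cache = run_with_patch(model, clean)
y_corrupt, _ = run_with_patch(model, corrupt)

# Patch each layer's clean activation into the corrupted run; a sharp jump in
# restoration at one depth marks a causal bottleneck (layer 29 in the study).
restorations = []
for i in range(4):
    y_patched, _ = run_with_patch(model, corrupt, patch_layer=i, patch_act=clean_cache[i])
    restored = 1 - (y_patched - y_clean).norm() / (y_corrupt - y_clean).norm()
    restorations.append(restored.item())
    print(f"layer {i}: restoration = {restored.item():.2f}")
```

The same cache-then-overwrite hook pattern scales to transformer blocks; the real analysis additionally varies datasets, object scales, and masking geometries.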
Mapped how cognitive capabilities emerge across the layers of LLM architectures using cross-model probing and causal tests. Found a shared broad hierarchy—spatial and logical signals appear earlier, while executive and pattern-based signals emerge later—alongside architecture-specific late-layer dynamics. Showed that these internal representations transfer to paraphrased prompts and can causally influence outputs.
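The layer-wise probing idea reduces to fitting one linear classifier per layer and tracing accuracy against depth. A minimal sketch, with synthetic activations in place of real hidden states and a "concept" signal injected only into the later layers so the emergence curve is visible:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic (n_samples, n_layers, d) activations; in the real study these are
# hidden states extracted from the LLMs under test.
rng = np.random.default_rng(0)
n, n_layers, d = 400, 4, 32
labels = rng.integers(0, 2, n)
acts = rng.normal(size=(n, n_layers, d))
for layer in range(2, n_layers):                 # concept becomes decodable late
    acts[:, layer, 0] += 3.0 * (labels - 0.5)

# One linear probe per layer; cross-validated accuracy vs. depth shows where
# the capability emerges in the stack.
accs = []
for layer in range(n_layers):
    probe = LogisticRegression(max_iter=1000)
    acc = cross_val_score(probe, acts[:, layer, :], labels, cv=5).mean()
    accs.append(acc)
    print(f"layer {layer}: probe accuracy = {acc:.2f}")
```

Probes establish decodability only; the causal claims in the project come from separate intervention tests on the probed directions.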
Developed a cross-architecture auditing method to visualize and compare how LLMs internally process empathy and social reasoning. Found that similar compassionate responses can mask sharply different internal strategies, including brittle linear processing versus adaptive self-correction. This provides a practical framework for selecting safer, better-calibrated models for human-facing applications.
This research shows that self-supervised Vision Transformers use different strategies to recover missing image information: MAE repairs pixels early, while I-JEPA and BEiT infer meaning later through semantic circuits. It uses causal patching and sparse autoencoders to reveal object-selective mechanisms in modern vision models that could improve interpretability and robustness.
Mendeley dataset; 76 engineered features; gradient boosting; 73.6% accuracy classifying menstrual cycle phases.
PyTorch model growing patterns from a single seed; demonstrates self-healing from user damage via a learned convolutional update rule.
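A single update step of such a cellular automaton is compact. This sketch assumes a 16-channel grid with channel 3 as the "alive" alpha channel, in the style of Mordvintsev et al.'s growing-NCA setup; the layer sizes are illustrative and the rule is untrained (the final layer is zero-initialized, so the seed persists unchanged until training shapes the dynamics).

```python
import torch
import torch.nn as nn

class NCAStep(nn.Module):
    """One learned update: perceive the 3x3 neighborhood, apply a per-cell rule."""
    def __init__(self, channels=16, hidden=64):
        super().__init__()
        self.perceive = nn.Conv2d(channels, hidden, 3, padding=1)  # local perception
        self.update = nn.Conv2d(hidden, channels, 1)               # per-cell update rule
        nn.init.zeros_(self.update.weight)                         # start as "do nothing"
        nn.init.zeros_(self.update.bias)

    def forward(self, state):
        delta = self.update(torch.relu(self.perceive(state)))
        state = state + delta                        # residual update
        alive = (state[:, 3:4] > 0.1).float()        # mask out dead cells
        return state * alive

# Grow from a single seed cell at the grid center.
step = NCAStep()
grid = torch.zeros(1, 16, 32, 32)
grid[:, 3:, 16, 16] = 1.0                            # the seed
for _ in range(8):
    grid = step(grid)
```

Training optimizes the two convolutions so that iterating this step grows a target pattern; self-healing falls out of applying damage during training rollouts.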
Troika PPG + accelerometer; feature engineering + RF regression; 8.8 BPM MAE @ 90% availability; validated on CAST clinical data.
Analyzed 1.48M cancer cases from the SEER21 dataset using Python/ML to identify 15 novel early-onset risk factors (up to 5.4× elevated risk), including a tumor-size paradox and racial disparities; achieved AUC = 0.65.
Real‑time price + social sentiment; continuous YouTube ingestion and scoring; trend indicators and watchlist alerts.
MovieLens 10M: baselines → regularized movie + user + release effects → matrix factorization; parameter tuning; best RMSE 0.783.
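The regularized-effects stage can be sketched on a toy ratings frame. The column names mirror MovieLens (`userId`, `movieId`, `rating`), the data is random, and the shrinkage strength `lam` is illustrative; in the project it is tuned by cross-validation.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
ratings = pd.DataFrame({
    "userId": rng.integers(0, 50, 2000),
    "movieId": rng.integers(0, 100, 2000),
    "rating": rng.integers(1, 6, 2000).astype(float),
})

mu = ratings["rating"].mean()   # global mean baseline
lam = 5.0                       # shrinkage strength (tuned in the real project)

# Regularized movie effect: each movie's mean residual, shrunk toward 0 so
# rarely-rated movies don't get extreme estimates.
b_i = ratings.groupby("movieId")["rating"].apply(lambda r: (r - mu).sum() / (len(r) + lam))

# Regularized user effect, fit on what the movie effect leaves over.
resid = ratings["rating"] - mu - ratings["movieId"].map(b_i)
b_u = resid.groupby(ratings["userId"]).apply(lambda r: r.sum() / (len(r) + lam))

pred = mu + ratings["movieId"].map(b_i) + ratings["userId"].map(b_u)
rmse = np.sqrt(((ratings["rating"] - pred) ** 2).mean())
print(f"training RMSE: {rmse:.3f}")
```

Matrix factorization then models the structure these additive effects miss, which is what pushes the RMSE down to 0.783 on the real data.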
Probed DeepSeek‑R1 layer activations across tasks; compared layer groups to reveal specialization and cognitive pattern emergence.
Pygame simulation; genetic algorithm trains a neural network; fitness based on survival time; features elitism and mutation.
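The evolutionary loop itself is simple. In this sketch, flat weight vectors stand in for the network and a toy fitness function (negative distance to a target genome) replaces survival time in the simulation; population size, elite count, and mutation scale are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
GENOME, POP, ELITE, SIGMA = 20, 40, 4, 0.1
target = rng.normal(size=GENOME)

def fitness(genome):
    # Stand-in for survival time: closer to the target genome is fitter.
    return -np.linalg.norm(genome - target)

pop = rng.normal(size=(POP, GENOME))
history = []
for _ in range(50):
    scores = np.array([fitness(g) for g in pop])
    history.append(scores.max())
    order = np.argsort(scores)[::-1]
    elite = pop[order[:ELITE]]                        # elitism: best genomes survive intact
    parents = elite[rng.integers(0, ELITE, POP - ELITE)]
    children = parents + rng.normal(scale=SIGMA, size=parents.shape)  # Gaussian mutation
    pop = np.concatenate([elite, children])

best = max(pop, key=fitness)
print(f"best fitness after 50 generations: {fitness(best):.3f}")
```

Elitism makes the best score monotonically non-decreasing across generations, which is what keeps the simulated agents from regressing between runs.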
Built a full top-down autonomous driving stack in Python using Highway-Env + Stable-Baselines3 PPO; iteratively tuned reward shaping and episode length to improve speed discipline, yielding a model that consistently survives 200-step evaluations with high-speed, collision-free highway behavior.
Built production-grade retrieval-augmented generation system processing 800+ PDFs (1M+ chunks) with hybrid search (ChromaDB + BM25), GPU-accelerated cross-encoder reranking, multi-tier conversation memory, streaming responses, and query rewriting—achieving 6-8s query times with context rot prevention and research-backed optimizations.
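The merge step of a hybrid search like this is often reciprocal rank fusion (RRF). A minimal sketch with hard-coded document IDs standing in for ChromaDB and BM25 results; the `k=60` damping constant is the commonly used default, not a value from this system:

```python
def rrf(rankings, k=60):
    """Fuse ranked lists of doc ids; k damps the influence of top ranks."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc7", "doc2", "doc9", "doc1"]    # keyword (BM25) ranking
dense_hits = ["doc2", "doc5", "doc7", "doc3"]   # embedding (vector-store) ranking
fused = rrf([bm25_hits, dense_hits])
print(fused)  # documents ranked well in both lists rise to the top
```

In the full pipeline the fused candidates then go to the cross-encoder reranker, which is the expensive, GPU-accelerated stage.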
Converted FoodSeg103 dataset masks to YOLO-seg polygons and fine-tuned YOLO11x-seg on my GPU for improved food segmentation.
Built an AI pipeline that converts food photos into precise macro and micro nutrient breakdowns using automated USDA data matching. Try it out by clicking here!
University (grade)
Professional
Independent
AI
Data Science
Analysis
Finance
Health
Other
Reach out to me on LinkedIn or fill out the form below!