Jay Shah

Lead Data Scientist • LLM/RAG/Agents • AI Tinkerer

About Me

Eight years shipping ML systems in production: from wind turbine anomaly detection to LLM agent platforms.

Currently at Avathon, I have led foundation-model MLOps, cut release cycles by 45%, and built an agent platform with MCP tools that now powers 20+ workflows. Before that: RAG compliance agents, solar-storage anomaly detection saving $500k+, and forecasting systems across energy domains.

I did my graduate research at Texas A&M under Dr. Yu Ding, working on wind energy failure prediction. That is where I learned that models in production behave very differently from models in notebooks.

Outside work: DataKind Ambassador (student dropout prediction with John Jay College), Gujarati Llama (fine-tuned Llama-2 for a language with almost no open datasets), and too many side projects to list cleanly.

I write about what I build. If something is on this site, it actually shipped.

Outside the Terminal

Cricket over everything else. Tea, not coffee, always. I follow Sadhguru's work and have a consistent yoga practice; it is the one thing that survives a newborn's sleep schedule.

Deeply interested in Indic languages and culture. Gujarati Llama started as a personal obsession before it became a project. I believe the next wave of AI needs to work for the billion people whose first language is not English.

I read across domains: systems thinking, philosophy, energy policy. The best ML insights I have had came from outside ML entirely.

Skills

Data Science & Machine Learning

Predictive Modeling, Time Series Forecasting, Anomaly Detection, Deep Learning, CNN, RNN, LSTM, Transformers, LLM Fine-tuning, RAG, AI Agents, Hyperparameter Tuning, Feature Engineering

Libraries & Frameworks

PyTorch, TensorFlow, Scikit-learn, XGBoost, Statsmodels, Darts, Transformers, LangChain, LlamaIndex, ColBERT

Cloud & Data Engineering

AWS, GCP, Azure, Apache Airflow, Dask, Pandas, Spark, Hadoop, ETL Pipelines

MLOps & Development

MLflow, BentoML, Modal, Docker, Kubernetes, Git, FastAPI, Flask, CI/CD, Python, SQL, Bash, JavaScript

Honors & Awards

  • Winner at Ragathon by LlamaIndex (Feb 2024)
  • Outstanding Master of Science Student (Apr 2019)
  • 2nd Runner-up at Texas Datathon by Citadel (Feb 2018)

Experience

  • Avathon — Data Scientist III (Jan 2022–Present), Data Scientist II (May 2021–Dec 2021), Data Scientist (Jun 2019–May 2021)
  • DataKind — Data Ambassador (Jan 2022–Sep 2022)
  • Texas A&M University — Graduate Research Assistant (Jan 2019–May 2019)
  • Utilities and Energy Services — Graduate Student Analyst (Dec 2017–Jun 2018)

Education

Patents

Languages

English, Hindi, Gujarati, German (elementary)