I'm a machine learning engineer working on LLM inference and applied AI. Currently building Vajra at Georgia Tech and consulting independently.
Building Vajra, an open-source LLM inference engine. Contributions so far: quantization & MoE support, experiment infra, CI, and telemetry. Lead developer of Veeksha (LLM performance & quality evaluation).
Advising companies on end-to-end applied AI: requirements, data, training, eval, and serving.
Studied advanced batching for LLM inference during my MSc.
ML systems for asset & credit risk. Productionized work with measured impact in the tens of millions.
~2 years across two projects. Explainable AI research (evaluation of attention-based explanations) under Dr. Alvin Jia and Prof. Albert Bifet. Data analysis and modeling for COPEDI-Cat, a Catalan COVID-19 paediatric response network (~150 collaborating paediatricians).
First cohort of Spain's data science program. Thesis and internships centered on ML.
Me and Cookie 🍪