About Me
Videet Mehta.
I'm a student at MIT studying Computer Science. I'm passionate about frontier AI research in multi-modal large language models.
I'm currently interning at Mercuria Energy Trading, where I'm working on forecasting marginal prices and doing low-level GPU optimizations for weather models. I'm also working at Sarvam AI to build India's first conversational speech AI in Hindi and English.
I'm also doing research at MIT's Spoken Language Systems Lab under Jehanzeb Mirza on finding optimal attention heads for audio event classification & spoofing detection.
I'm also proud to have represented the USA at the 2024 International Olympiad in Artificial Intelligence, where I won a gold medal! I'm now part of the scientific committee for the 2025 USA AI Olympiad team.
Here are a few technologies I've been working with recently:
- PyTorch
- JAX
- DDP
- DeepSpeed
- CUDA
- SQL
- Node.js
- React
Work Experience
- Machine Learning Engineer @ Sarvam AI, June 2025 - Present
Building Hindi and English speech-to-speech foundation models and large-scale distributed training infrastructure.
- AI Research Scientist Intern @ Mercuria Energy Trading, March 2025 - Present
Built ML systems for power markets: forecasted data-center load, optimized GPU/CUDA pipelines to halve FM fine-tuning costs, and designed graph+sequence models that improved LMP MAE/RMSE by ~25%.
- AI Researcher @ MIT CSAIL Spoken Language Systems, September 2024 - Present
Built a PEFT multimodal AVSR pipeline with distributed training that reduced WER by 15% and prototyped sparse attention-routing on Qwen2.5-Omni for audio tasks.
- Founding Engineer @ Hidden Studios, Feb. 2025 - June 2025
Built a full-stack in-game advertising platform with analytics, ML-based impression prediction, and automated gameplay data collection.
- AI Researcher @ Regenerative Neurotechnology Lab, March 2024 - August 2024
Built a Viterbi decoding pipeline with statistical LMs for neural-to-word decoding and aligned GPT-2 hidden states with EEG features.
- AI Research Scientist @ Houston Learning Algorithms, March 2023 - April 2024
Built a conditional video-diffusion model for real-time wildfire prediction and co-authored an IEEE MMSP 2024 paper with an arXiv preprint.
Projects
Adaptive Splash Attention CUDA Kernel
- PyTorch
- CUDA
- GPU Profiling
- C++
Brain Wave Decoding Algorithm
- PyTorch
- OpenBCI
- Forecasting
Reasoning in Diffusion Language Models
- PyTorch
- Distributed Training
- LLM Reasoning
- SLURM
Cipher ML
- Python
- Node.js
- scikit-learn
- GPT API
Parameter Efficient Fine Tuning in Audio Visual Language Models
- PyTorch
- Python
- Distributed Training
- OpenAI Whisper
Recent Posts
- Flow Matching Models
Flow Matching Models are Generative Models
- Denoising Diffusion Models (Literally)
Understand WTF a Diffusion Model via Probabilities
- Proximal Policy Optimization (PPO)
Learning foundational and popular reinforcement learning algorithms
What’s Next?
Get In Touch
I'm always open to new opportunities and connecting with other students and professionals. Whether you have a question or just want to say hi, feel free to reach out!