About Me

Hi, my name is Videet Mehta.

I'm a student at MIT studying Computer Science. I'm passionate about frontier AI research in multimodal LLMs and hardware/software model acceleration.

I'm currently working at MIT's HAN Lab, where I use diffusion language models for speculative decoding. I also do research on test-time training at MIT's Spoken Language Systems Lab. I'll soon be joining Together AI as a kernels researcher.

Earlier this year, I worked at Liquid AI, pretraining audio encoders for on-device deployment. Before that, I was at Mercuria Energy Trading, where I forecasted marginal prices. I also worked at Sarvam AI on building India's first conversational speech AI in Hindi and English.

I'm also proud to have represented the USA at the 2024 International Olympiad in Artificial Intelligence, where I won a gold medal! I now serve on the organizing and scientific committee for the 2026 USA AI Olympiad Team.

Here are a few technologies I've been working with recently:

PyTorch
JAX
DDP
DeepSpeed
CUDA
SQL
Node.js
React

Check out my GitHub!

Work Experience

AI Researcher @ Liquid AI
Jan. 2026 - Present
MIT
Working on pre-training a new architecture for a causal audio encoder designed for CPU-constrained large audio language models.
AI Researcher @ HAN Lab
Sep. 2025 - Present
MIT
Working with Qinghao Hu on accelerating language models via speculative decoding.
Machine Learning Engineer @ Sarvam AI
June 2025 - September 2025
Building Hindi and English speech-to-speech foundation models and large-scale distributed training infrastructure.
AI Research Scientist Intern @ Mercuria Energy Trading
March 2025 - August 2025
Houston, TX
Built ML systems for power markets: forecasted data-center load, optimized GPU/CUDA pipelines to halve foundation-model (FM) fine-tuning costs, and designed graph+sequence models that improved locational marginal price (LMP) MAE/RMSE by ~25%.
AI Researcher @ MIT CSAIL Spoken Language Systems
September 2024 - Present
Cambridge, MA
Built a PEFT multimodal AVSR pipeline with distributed training that reduced WER by 15% and prototyped sparse attention-routing on Qwen2.5-Omni for audio tasks.
Founding Engineer @ Hidden Studios
Feb. 2025 - June 2025
Cambridge, MA
Built a full-stack in-game advertising platform with analytics, ML-based impression prediction, and automated gameplay data collection.
AI Researcher @ Regenerative Neurotechnology Lab
March 2024 - August 2024
UT Health System
Built a Viterbi decoding pipeline with statistical LMs for neural-to-word decoding and aligned GPT-2 hidden states with EEG features.
AI Research Scientist @ Houston Learning Algorithms
March 2023 - April 2024
University of Houston
Built a conditional video-diffusion model for real-time wildfire prediction and co-authored an IEEE MMSP 2024 paper with an arXiv preprint.

Projects

Multivariate Distribution Parallelism
- CUDA
- GPU Profiling
- C++
Adaptive Splash Attention CUDA Kernel
- PyTorch
- CUDA
- GPU Profiling
- C++
Brain Wave Decoding Algorithm
- PyTorch
- OpenBCI
- Forecasting
Cipher ML
- Python
- Node.js
- scikit-learn
- GPT API
Parameter Efficient Fine Tuning in Audio Visual Language Models
- PyTorch
- Python
- Distributed Training
- OpenAI Whisper

What’s Next?

Get In Touch

I'm always open to new opportunities and connecting with other students and professionals. Whether you have a question or just want to say hi, feel free to reach out!

Say Hello

Recent Posts

Compute-Aware Hybrid Attention Architecture Search

Fast Humanoid Loco-Manipulation via Flow Matching

VLM Spatial Reasoning via GRPO
