Profile

Hayden Prairie

CSE PhD @ University of California, San Diego

Hi, I’m Hayden! I am currently a PhD student at UCSD studying Computer Science and Engineering. I am part of the Sandy Research Lab, advised by Dan Fu and Taylor Berg-Kirkpatrick. I am originally from Austin, Texas, and did my undergraduate degree at the University of Texas at Austin. My primary interests are core machine learning and computer systems!

My research mostly covers the intersection of ML and systems, including SSMs, structured sparsity, and all things GPU. I am broadly interested in developing an understanding of how we can better interpret and exploit sparsity to improve the efficiency and expressivity of large models.

Please check out my GitHub to see what I am currently working on and the projects I have contributed to!

Updates

Jul 2025
I started working part-time as a research intern on the kernels team @together.ai.
Apr 2025
I will be starting my PhD this September at UCSD, working with Dan Fu.

Publications

Upweighting Easy Samples in Fine-Tuning Mitigates Forgetting
Sunny Sanyal*, Hayden Prairie*, Rudrajit Das*, Ali Kavis*, Sujay Sanghavi
ICML, Spotlight
Fine-tuning a pre-trained model on a downstream task often degrades its original capabilities, a phenomenon known as "catastrophic forgetting". This is especially an issue when one does not have access to the data and recipe used to develop the pre-trained model. Under this constraint, most existing methods for mitigating forgetting are inapplicable. To address this challenge, we propose a sample weighting scheme for the fine-tuning data based solely on the pre-trained model's losses. Specifically, we upweight the easy samples on which the pre-trained model's loss is low, and vice versa, to limit the drift from the pre-trained model. Our approach is orthogonal and yet complementary to existing methods; while such methods mostly operate on parameter or gradient space, we concentrate on the sample space. We theoretically analyze the impact of fine-tuning with our method in a linear setting, showing that it stalls learning in a certain subspace, which inhibits overfitting to the target task. We empirically demonstrate the efficacy of our method on both language and vision tasks. As an example, when fine-tuning Gemma 2 2B on MetaMathQA, our method results in only a small drop in accuracy on GSM8K (another math dataset) compared to standard fine-tuning, while preserving more accuracy on the pre-training datasets.
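For intuition, here is a minimal PyTorch sketch of the core idea: compute each sample's loss under the frozen pre-trained model and give lower-loss (easier) samples larger weights in the fine-tuning objective. The softmax-over-negative-losses weighting and the temperature parameter are illustrative assumptions for this sketch, not the exact recipe from the paper.

```python
import torch
import torch.nn.functional as F

def upweighted_finetune_loss(pretrained_model, model, inputs, labels,
                             temperature=1.0):
    """Illustrative sketch: weight per-sample fine-tuning losses by how
    'easy' each sample is for the frozen pre-trained model
    (low pre-trained loss => high weight)."""
    # Per-sample loss under the frozen pre-trained model (no gradients).
    with torch.no_grad():
        ref_logits = pretrained_model(inputs)
        ref_loss = F.cross_entropy(ref_logits, labels, reduction="none")

    # Easy samples (low pre-trained loss) get larger weights; a softmax
    # over negative losses is one hypothetical choice of weighting scheme.
    weights = F.softmax(-ref_loss / temperature, dim=0)

    # Weighted fine-tuning loss on the model being trained.
    logits = model(inputs)
    per_sample_loss = F.cross_entropy(logits, labels, reduction="none")
    return (weights * per_sample_loss).sum()
```

Because the weights depend only on the pre-trained model's losses, this scheme needs no access to the pre-training data or recipe, which is the constraint the paper targets.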

Blog Posts

No blog posts available yet.