Publications
Research & Papers
Sharing my work with the research community.
2025
Quant-Trim: Hardware-Neutral Quantization-Robust Checkpoints via Progressive Fake Quantization and Reverse Pruning
EurIPS 2025
A training-phase method that produces hardware-neutral checkpoints robust to backend and precision choices. It combines progressive fake quantization with reverse pruning to tame outlier-driven scale inflation. The method is agnostic to the quantization scheme (symmetric/asymmetric, per-tensor/per-channel, INT8/INT4), requires no vendor-specific graph changes, and avoids per-backend retraining.
2024
Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood
NeurIPS 2024
A principled Bayesian approach to neural network sparsification using the marginal likelihood, achieving state-of-the-art compression while maintaining model performance.
2024
Scaling Sparse Models with OPD
AABI 2024
Structured sparsity approaches for vision transformers and large language models, extending Bayesian methods to modern architectures at scale.