Research & Papers

Sharing my work with the research community.

2025

Quant-Trim: Hardware-Neutral Quantization-Robust Checkpoints via Progressive Fake Quantization and Reverse Pruning

EurIPS 2025

A training-phase method producing hardware-neutral checkpoints robust to backend and precision choices. Combines progressive fake quantization with reverse pruning to tame outlier-driven scale inflation. Agnostic to quantization schemes (symmetric/asymmetric, per-tensor/per-channel, INT8/INT4), requires no vendor-specific graph changes, and avoids per-backend retraining.
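
For readers unfamiliar with the terms, the following is a minimal sketch of generic symmetric per-tensor fake quantization (quantize to an integer grid, then dequantize) and of how a single outlier inflates the scale. It is not the Quant-Trim method itself. The function names, the weight values, and the 20.0 outlier are illustrative assumptions, not taken from the paper.

```python
def fake_quantize(values, num_bits=8):
    """Generic symmetric per-tensor fake quantization (illustrative only).

    Values are mapped to a signed integer grid and immediately mapped back,
    so the result stays in floating point but carries the rounding error.
    """
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    dequantized = []
    for v in values:
        q = round(v / scale)                  # nearest integer level
        q = max(-qmax, min(qmax, q))          # clamp to the signed range
        dequantized.append(q * scale)         # back to floating point
    return dequantized, scale


def mean_abs_error(reference, approx):
    return sum(abs(r - a) for r, a in zip(reference, approx)) / len(reference)


# Hypothetical weights in [-0.5, 0.5], plus one hypothetical outlier.
weights = [0.01 * i for i in range(-50, 51)]
with_outlier = weights + [20.0]

deq_clean, scale_clean = fake_quantize(weights)
deq_outlier, scale_outlier = fake_quantize(with_outlier)

# Compare the error on the ordinary weights only, ignoring the outlier itself.
err_clean = mean_abs_error(weights, deq_clean)
err_outlier = mean_abs_error(weights, deq_outlier[:len(weights)])

print(f"scale without outlier: {scale_clean:.5f}, error: {err_clean:.5f}")
print(f"scale with outlier:    {scale_outlier:.5f}, error: {err_outlier:.5f}")
```

Because the scale is set by the largest magnitude, the single outlier stretches the grid and the ordinary weights are rounded much more coarsely. This is the general effect that "outlier-driven scale inflation" refers to.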

2024

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

NeurIPS 2024

A principled Bayesian approach to neural network sparsification using the marginal likelihood, achieving state-of-the-art compression while maintaining model performance.

2024

Scaling Sparse Models with OPD

AABI 2024

Structured sparsity approaches for vision transformers and large language models, extending Bayesian methods to modern architectures at scale.

2023

From Judgement's Premises Towards Key Points

arXiv

Extracting key points from legal documents using NLP techniques including BERT, IBM Debater, and graph-based approaches.