Publications

Research & Papers

Peer-reviewed papers and preprints on efficient AI: Bayesian sparsification, quantization, and deployment-robust model compression.

2026 · Under Review

Efficient Image Enhancement on the Edge

Submitted to NeurIPS 2026 (first author)

State-of-the-art image enhancement designed for edge deployment, developed within the INSAIT / ETH Zurich research ecosystem in collaboration with academic and industry partners. Details will follow upon publication.

2025

Quant-Trim in Practice: Improved Cross-Platform Low-Bit Deployment on Edge NPUs

EurIPS 2025

A training-phase method that produces hardware-neutral checkpoints robust to backend and precision choices. It combines progressive fake quantization with reverse pruning to tame outlier-driven scale inflation. The method is agnostic to the quantization scheme (symmetric/asymmetric, per-tensor/per-channel, INT8/INT4), requires no vendor-specific graph changes, and avoids per-backend retraining.
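To make "outlier-driven scale inflation" concrete, here is a minimal pure-Python sketch of generic symmetric per-tensor fake quantization (quantize, then dequantize). It is not the paper's implementation: the function names, the `clip` threshold, and the linear `blend` schedule are hypothetical choices made for illustration, and the clipping step is only a stand-in for the effect of limiting outliers, not the reverse pruning procedure itself.

```python
def fake_quantize(values, num_bits=8, clip=None):
    """Symmetric per-tensor quantize-dequantize: snap to the integer grid, map back to floats."""
    if clip is not None:
        # Illustrative outlier handling (an assumption): clamp magnitudes to +/- clip.
        values = [max(-clip, min(clip, v)) for v in values]
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT8, 7 for INT4
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    dequantized = [max(-qmax, min(qmax, round(v / scale))) * scale for v in values]
    return dequantized, scale


def blend(values, lam, num_bits=8, clip=None):
    """Assumed 'progressive' variant: lam=0 keeps floats, lam=1 is fully fake-quantized."""
    quantized, _ = fake_quantize(values, num_bits, clip)
    return [(1 - lam) * v + lam * q for v, q in zip(values, quantized)]


def mean_abs_error(a, b):
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Toy weight tensor: five small weights plus one large outlier.
weights = [0.02, -0.05, 0.11, -0.08, 0.04, 6.0]
bulk = weights[:-1]

fq_raw, scale_raw = fake_quantize(weights, num_bits=4)
fq_clip, scale_clip = fake_quantize(weights, num_bits=4, clip=0.2)

# Error is measured on the small weights only, where scale inflation hurts most.
err_raw = mean_abs_error(bulk, fq_raw[:-1])
err_clip = mean_abs_error(bulk, fq_clip[:-1])

print(f"scale without clipping: {scale_raw:.4f}, bulk error: {err_raw:.4f}")
print(f"scale with clipping:    {scale_clip:.4f}, bulk error: {err_clip:.4f}")
```

In this toy setup the single outlier sets the quantization scale, so the small weights lose most of their resolution at INT4; limiting the outlier shrinks the scale and lowers the error on the remaining weights.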

📄 arXiv 📋 PDF

2024

Shaving Weights with Occam's Razor: Bayesian Sparsification for Neural Networks Using the Marginal Likelihood

NeurIPS 2024 (Main Track) & AABI 2024

A principled Bayesian approach to neural network sparsification using the marginal likelihood, achieving state-of-the-art compression while maintaining model performance — with applications to vision transformers and LLMs.

📄 NeurIPS 🔗 OpenReview (AABI)

2023

From Judgement's Premises Towards Key Points

arXiv

NLP methods for extracting key points from legal texts, using BERT, IBM Debater, and graph-based approaches to improve automated document analysis.

📄 arXiv 📋 PDF

Academic Service

Reviewing

Scientific reviewer for top-tier machine learning conferences (ICML, NeurIPS) and journals (JAIR), covering probabilistic inference, compression, efficient AI, computer vision, and LLMs.