Research & Engineering
Projects
Research-driven work spanning model optimization, edge deployment, perception, and applied ML — shipped at Zeiss, BMW, Huawei, Intel, CARIAD, and academic labs.
Featured Research
Quant-Trim: Hardware-Neutral Quantization-Robust Checkpoints
Edge accelerators rely on low-bit quantization, but vendor compilers differ in scaling, clipping, and kernel support, and often behave as black boxes. As a result, the same floating-point checkpoint yields inconsistent accuracy across backends. Quant-Trim is a training-phase method that produces hardware-neutral checkpoints robust to backend and precision choices. It combines progressive fake quantization with reverse pruning to tame outlier-driven scale inflation. The method is agnostic to the quantization scheme (symmetric/asymmetric, per-tensor/per-channel, INT8/INT4) and requires no vendor-specific graph changes.
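To make the outlier problem concrete, here is a minimal sketch of symmetric per-tensor fake quantization in plain Python. It is not the Quant-Trim training procedure: the function names (`fake_quantize`, `mean_abs_error`) and the toy weights are assumptions chosen for illustration. It only shows how a single large weight inflates the shared scale and raises the rounding error on every other weight.

```python
def fake_quantize(values, num_bits=8):
    """Quantize to a signed integer grid and map back to floats (symmetric, per-tensor)."""
    qmax = 2 ** (num_bits - 1) - 1            # 127 for INT8
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return list(values), 1.0
    scale = max_abs / qmax                    # one scale shared by the whole tensor
    dequantized = []
    for v in values:
        q = max(-qmax, min(qmax, round(v / scale)))   # round, then clip to the grid
        dequantized.append(q * scale)
    return dequantized, scale


def mean_abs_error(reference, approximation):
    """Average absolute difference between two equally long sequences."""
    return sum(abs(r - a) for r, a in zip(reference, approximation)) / len(reference)


# Hypothetical toy weights: a well-behaved bulk in [-0.5, 0.5] ...
bulk = [0.01 * i for i in range(-50, 51)]
# ... and the same bulk plus one outlier.
with_outlier = bulk + [20.0]

bulk_q, scale_clean = fake_quantize(bulk)
outlier_q, scale_outlier = fake_quantize(with_outlier)

error_clean = mean_abs_error(bulk, bulk_q)
# Measure the error on the bulk only, so the two numbers are comparable.
error_outlier = mean_abs_error(bulk, outlier_q[:len(bulk)])

print(f"scale without outlier: {scale_clean:.5f}, bulk error: {error_clean:.5f}")
print(f"scale with outlier:    {scale_outlier:.5f}, bulk error: {error_outlier:.5f}")
```

In this toy case the outlier makes the quantization step 40 times coarser, so many small weights collapse to zero. Removing or clipping the outlier before the scale is computed restores the finer grid.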
Bayesian Sparsification via Occam's Razor
A principled Bayesian approach to neural network sparsification using the marginal likelihood. Achieves state-of-the-art compression ratios while preserving model quality — applied to CNNs, ViTs, and LLMs. Extended at AABI 2024 to structured sparsity for vision transformers and large language models at scale.
Foundation Model Optimization for Edge Deployment
Optimizing large-scale foundation and world models — SAM, DepthAnything, DINOv2, Vision Transformers, and LLMs — for real-time inference on edge accelerators in safety-critical environments (surgical rooms, industrial inspection). Quantization, distillation, and architecture-aware compression pipelines targeting Jetson, FPGAs, and custom silicon with strict latency, energy, and memory budgets.
Research & Industry
Sparse & Quantized Networks for Edge Inference
Intel Data Innovation Lab, Munich — 2022
Researched and implemented combined sparsification + quantization pipelines for edge device deployment. Benchmarked across SparseZoo models targeting INT8/pruned inference on CPU and accelerator backends.
Technical Report
Neural Network Optimization for Automotive ECUs
BMW, Munich — 2023–2024
Designed an end-to-end optimization framework for deploying neural networks on automotive Electronic Control Units (ECUs). The pipeline covers quantization, graph optimization (ONNX), and backend-specific compilation for resource-constrained ECU hardware.
Robot Fleet Task Optimization with RL & LLMs
Huawei Munich Research Center — 2023
Optimized multi-robot task assignment and monitoring using reinforcement learning and LLM-based planners. Compressed control networks for on-device inference on Jetson platforms in SLAM-based environments.
AI Pipeline for Microchip Leakage Detection
Infineon Technologies, Munich — 2020–2023
Built a production AI pipeline for semiconductor leakage detection using efficient autoencoders. Also designed an NFC encoder/decoder in VHDL to replace RAM in embedded NFC systems.
Open-World Instance Segmentation with Knowledge Distillation
CARIAD (Volkswagen Group), Munich — 2024
Developed visual perception techniques for unknown/novel objects using instance segmentation, language-to-vision alignment, and knowledge distillation. Achieved SOTA on COCO benchmarks. Submitted to CVPR.
Data Augmentation for Saliency Prediction
EPFL, Lausanne — 2022–2023
Developed novel augmentation strategies for visual saliency prediction models, achieving competitive performance on the MIT300 Validation benchmark.
Real-Time License Plate Detection & Pedestrian Tracking
TUM, Munich
End-to-end pipeline combining OCR with Kalman filtering for license plate recognition and multi-object pedestrian tracking in real-time video streams.
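As an illustration of the Kalman filtering step, the following is a minimal one-dimensional constant-velocity filter in plain Python. The state model, noise variances, and class name (`ConstantVelocityKalman1D`) are assumptions; the pipeline's actual state layout and tuning are not described here. In a tracker, one such filter could smooth each coordinate of a detected bounding-box center.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, process_var=1e-3, measurement_var=1.0):
        self.pos, self.vel = 0.0, 0.0
        # Large initial covariance: the first measurements dominate the estimate.
        self.p00, self.p01, self.p10, self.p11 = 100.0, 0.0, 0.0, 100.0
        self.q = process_var
        self.r = measurement_var

    def predict(self, dt=1.0):
        # State transition x' = F x with F = [[1, dt], [0, 1]].
        self.pos += self.vel * dt
        # Covariance P' = F P F^T + Q, written out for the 2x2 case.
        p00 = self.p00 + dt * (self.p10 + self.p01) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, measured_pos):
        # Measurement model z = H x with H = [1, 0].
        innovation = measured_pos - self.pos
        s = self.p00 + self.r                      # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s        # Kalman gain
        self.pos += k0 * innovation
        self.vel += k1 * innovation
        # Covariance P = (I - K H) P.
        p00 = (1.0 - k0) * self.p00
        p01 = (1.0 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11


# Hypothetical track: an object moving 2 units per frame, observed with
# a small alternating measurement error.
kf = ConstantVelocityKalman1D()
for step in range(1, 31):
    noise = 0.2 if step % 2 else -0.2
    kf.predict()
    kf.update(2.0 * step + noise)

print(f"estimated position: {kf.pos:.2f}, estimated velocity: {kf.vel:.2f}")
```

The velocity is never measured directly; the filter infers it from successive positions, which is what lets a tracker keep predicting an object through short detection gaps.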
Key Point Extraction from Legal Judgements
TUM, Munich — 2022 · Published on arXiv
Automated extraction of key points from the premises of legal documents using BERT, IBM Debater, PageRank, and clustering.
arXiv Paper
Globywood: Global Cinema Trend Analysis
EPFL Applied Data Analysis — 2022
Large-scale NLP analysis of world cinema trends across decades and geographies using sentence embeddings, clustering, and statistical methods on movie plot corpora.
GitHub
Autonomous Agricultural Robot with Real-Time Classification
UnternehmerTUM, Munich
Designed a field robot for real-time raspberry plant gender classification and precision spraying. On-device inference via TensorRT on Jetson with a custom CUDA pipeline.
Demo Video
Vision-SLAM Autonomous Navigation Robot
EPFL, Lausanne
Autonomous robot with visual navigation using SLAM-based localization and path planning. Fused vision and LIDAR data for obstacle avoidance in dynamic environments.
Instruction Set Simulator — ELF Loader
TUM, Chair of Electronic Design Automation — 2020–2021
Contributed to the ETISS Instruction Set Simulator. Implemented a full ELF loader and memory layout manager for RISC-V architecture simulation.
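For readers unfamiliar with what an ELF loader does, here is a minimal sketch in Python. It is not the ETISS implementation; the names `load_elf32` and `build_demo_image` and the synthetic test image are assumptions for illustration. It parses a little-endian ELF32 header, checks that the target is RISC-V, and copies each loadable segment to its virtual address, zero-filling the part that has no file contents.

```python
import struct

ELF_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")   # ELF32 file header, 52 bytes
PROGRAM_HEADER = struct.Struct("<IIIIIIII")       # ELF32 program header, 32 bytes
PT_LOAD = 1        # segment type: loadable
EM_RISCV = 243     # machine type: RISC-V


def load_elf32(image):
    """Return (entry_point, {virtual_address: segment_bytes}) for an ELF32 image."""
    (ident, _type, machine, _version, entry, phoff, _shoff, _flags,
     _ehsize, phentsize, phnum, _shentsize, _shnum, _shstrndx) = ELF_HEADER.unpack_from(image, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 1 or ident[5] != 1:
        raise ValueError("only little-endian ELF32 is handled in this sketch")
    if machine != EM_RISCV:
        raise ValueError("not a RISC-V binary")

    segments = {}
    for index in range(phnum):
        (p_type, p_offset, p_vaddr, _p_paddr, p_filesz, p_memsz,
         _p_flags, _p_align) = PROGRAM_HEADER.unpack_from(image, phoff + index * phentsize)
        if p_type != PT_LOAD:
            continue
        data = bytearray(image[p_offset:p_offset + p_filesz])
        data.extend(bytes(p_memsz - p_filesz))    # zero-fill memory beyond the file contents
        segments[p_vaddr] = bytes(data)
    return entry, segments


def build_demo_image():
    """Build a tiny synthetic ELF32 image with one loadable segment (hypothetical data)."""
    code = bytes.fromhex("13000000")              # RISC-V nop (addi x0, x0, 0), little-endian
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
    header = ELF_HEADER.pack(ident, 2, EM_RISCV, 1, 0x80000000, 52, 0, 0, 52, 32, 1, 0, 0, 0)
    segment = PROGRAM_HEADER.pack(PT_LOAD, 84, 0x80000000, 0x80000000,
                                  len(code), len(code) + 4, 5, 4)
    return header + segment + code


entry_point, memory = load_elf32(build_demo_image())
print(hex(entry_point), {hex(addr): data.hex() for addr, data in memory.items()})
```

A simulator would then write each segment into its memory model and start execution at the entry point. A full loader also has to handle ELF64, section headers, and symbol tables, which this sketch leaves out.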
Breast Cancer Detection & Segmentation from MRI
Lauzhack Hackathon, Lausanne
Transformer-based model for 3D MRI breast cancer detection and segmentation. Built during a 24-hour hackathon with a Docker-based inference pipeline.
Pre-Symptomatic Disease Detection from Wearables
TUM.AI, Munich
Anomaly detection on physiological time-series from wearable devices for early disease detection before symptom onset. Used neural ODE-inspired architectures.