Full PhD dissertation defense slides covering HDR-Q (MLLM for HDR VQA), LumaFlux (inverse tone mapping with diffusion transformers), and Rectified-CFG++ (geometry-aware guidance for flow models).
Presentation slides for HDR-Q (CVPR 2026) — the first multimodal LLM for HDR video quality assessment, featuring HAPO (HDR-Aware Policy Optimization) with contrastive KL, dual-entropy regularization, and SigLIP-2 HDR-aware encoding.
Presentation slides for LumaFlux — SDR-to-HDR inverse tone mapping using Flux 12B with Physically-Guided Adaptation (PGA), Perceptual Cross-Modulation (PCM), and Rational Quadratic Spline (RQS) decoder with only ~17M trainable parameters.
Presentation slides for Rectified-CFG++ (NeurIPS 2025) — a predictor-corrector guidance method that fixes CFG artifacts on flow models like Flux, SD3, and Lumina-Next with theoretical guarantees and zero extra training cost.
An intuition-first deep dive into Rectified Flow with the core equations, geometric interpretation of trajectory straightness, and minimal PyTorch-style code snippets for training and sampling.
Why measuring quality in high-dynamic-range video is fundamentally different from SDR, and what makes it one of the most challenging problems in perceptual quality research.
Why classifier-free guidance breaks on flow models, and how a geometry-aware predictor-corrector fixes it — with a deep dive into text rendering quality and the adaptive schedule that makes it work.
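The Rectified Flow deep dive above advertises the core equations plus minimal PyTorch-style training and sampling snippets. As a taste of what that entry covers, here is a hedged toy sketch of the two loops: regress a network toward the constant velocity x1 − x0 along the straight-line interpolant, then sample by Euler-integrating the learned ODE. The `VelocityNet` MLP, function names, and hyperparameters are illustrative assumptions, not the post's actual code.

```python
import torch
import torch.nn as nn

# Hypothetical toy velocity network; real models (Flux, SD3) are large transformers.
class VelocityNet(nn.Module):
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # Condition on time by simple concatenation.
        return self.net(torch.cat([x, t], dim=-1))

def rf_training_step(model, x1, opt):
    """One Rectified Flow step: match v_theta(x_t, t) to x1 - x0."""
    x0 = torch.randn_like(x1)          # noise endpoint
    t = torch.rand(x1.shape[0], 1)     # uniform t in [0, 1]
    xt = (1 - t) * x0 + t * x1         # straight-line interpolant
    v_target = x1 - x0                 # constant velocity along the line
    loss = ((model(xt, t) - v_target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def rf_sample(model, n=16, dim=2, steps=50):
    """Euler-integrate dx/dt = v_theta(x, t) from noise (t=0) to data (t=1)."""
    x = torch.randn(n, dim)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((n, 1), i * dt)
        x = x + dt * model(x, t)
    return x
```

The geometric point the deep dive makes is visible here: because the target velocity is constant along each interpolation line, a perfectly trained model yields straight trajectories, so even a coarse Euler schedule lands close to the data distribution.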