Full PhD dissertation defense slides covering HDR-Q (MLLM for HDR VQA), LumaFlux (inverse tone mapping with diffusion transformers), and Rectified-CFG++ (geometry-aware guidance for flow models).
Presentation slides for HDR-Q (CVPR 2026) — the first multimodal LLM for HDR video quality assessment, featuring HAPO (HDR-Aware Policy Optimization) with contrastive KL, dual-entropy regularization, and SigLIP-2 HDR-aware encoding.
Presentation slides for LumaFlux — SDR-to-HDR inverse tone mapping using Flux 12B with Physically-Guided Adaptation (PGA), Perceptual Cross-Modulation (PCM), and Rational Quadratic Spline (RQS) decoder with only ~17M trainable parameters.
Presentation slides for Rectified-CFG++ (NeurIPS 2025) — a predictor-corrector guidance method that fixes CFG artifacts on flow models like Flux, SD3, and Lumina-Next with theoretical guarantees and zero extra training cost.
An intuition-first deep dive into Rectified Flow with the core equations, geometric interpretation of trajectory straightness, and minimal PyTorch-style code snippets for training and sampling.
Why measuring quality in high-dynamic-range video is fundamentally different from SDR, and what makes it one of the most challenging problems in perceptual quality research.
Why classifier-free guidance breaks on flow models, and how a geometry-aware predictor-corrector fixes it — with a deep dive into text rendering quality and the adaptive schedule that makes it work.
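The Rectified Flow deep dive above advertises the core equations plus minimal PyTorch-style training and sampling snippets. As a taste of what that entry covers, here is a hedged toy sketch of the two loops: regress a network toward the constant velocity x1 − x0 along the straight-line interpolant, then sample by Euler-integrating the learned ODE. The `VelocityNet` MLP, function names, and hyperparameters are illustrative assumptions, not the post's actual code.

```python
import torch
import torch.nn as nn

# Hypothetical toy velocity network; real models (Flux, SD3) are large transformers.
class VelocityNet(nn.Module):
    def __init__(self, dim=2, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x, t):
        # Condition on time by simple concatenation.
        return self.net(torch.cat([x, t], dim=-1))

def rf_training_step(model, x1, opt):
    """One Rectified Flow step: match v_theta(x_t, t) to x1 - x0."""
    x0 = torch.randn_like(x1)          # noise endpoint
    t = torch.rand(x1.shape[0], 1)     # uniform t in [0, 1]
    xt = (1 - t) * x0 + t * x1         # straight-line interpolant
    v_target = x1 - x0                 # constant velocity along the line
    loss = ((model(xt, t) - v_target) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def rf_sample(model, n=16, dim=2, steps=50):
    """Euler-integrate dx/dt = v_theta(x, t) from noise (t=0) to data (t=1)."""
    x = torch.randn(n, dim)
    dt = 1.0 / steps
    for i in range(steps):
        t = torch.full((n, 1), i * dt)
        x = x + dt * model(x, t)
    return x
```

The geometric point the deep dive makes is visible here: because the target velocity is constant along each interpolation line, a perfectly trained model yields straight trajectories, so even a coarse Euler schedule lands close to the data distribution.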