Pretrained diffusion models generate realistic images, but their outputs remain constrained by the statistical biases of their training data, limiting their ability to produce high dynamic range (HDR) content. In this work, we introduce LumaGuide, a training-free framework for distribution shaping in diffusion models. Instead of modifying model parameters, LumaGuide steers the sampling process to match target feature distributions via differentiable energy-based guidance. We instantiate this framework for HDR generation by controlling luminance distributions in perceptually uniform PQ space. Our results show that aligning luminance histograms can induce HDR-consistent behavior, including coherent highlights and preserved shadow detail, while maintaining semantic fidelity. Beyond HDR, LumaGuide enables flexible specification of target distributions through data-driven presets, reference images, or text-driven predictors, and extends naturally to video generation with temporal consistency constraints.
At every denoising step, LumaGuide computes a differentiable soft histogram of the predicted clean image in perceptually uniform PQ space and minimizes a Wasserstein-1 distance to a target histogram. The resulting gradient is back-propagated through the VAE decoder to shape the sampling velocity. Because the histogram is permutation-invariant, the diffusion prior is free to handle semantics and spatial structure — LumaGuide only constrains the global luminance statistics.
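A minimal PyTorch sketch of this energy, assuming a triangular soft-binning kernel, 64 bins, and linear luminance pre-normalized to a 10,000-nit peak; the bin count and kernel are illustrative choices, not necessarily the paper's exact settings:

```python
import torch
import torch.nn.functional as F

# SMPTE ST 2084 (PQ) constants; linear luminance is normalized by a 10,000-nit peak.
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_encode(lum: torch.Tensor) -> torch.Tensor:
    """Map linear luminance in [0, 1] to perceptually uniform PQ in [0, 1]."""
    y = lum.clamp(min=1e-6) ** M1
    return ((C1 + C2 * y) / (1 + C3 * y)) ** M2

def soft_histogram(pq: torch.Tensor, n_bins: int = 64) -> torch.Tensor:
    """Differentiable histogram: each pixel spreads mass over nearby bin centers
    with a triangular kernel, so gradients flow back to pixel values."""
    centers = torch.linspace(0.0, 1.0, n_bins, device=pq.device)
    width = 1.0 / (n_bins - 1)
    weights = F.relu(1.0 - (pq.reshape(-1, 1) - centers).abs() / width)
    hist = weights.sum(dim=0)
    return hist / hist.sum()

def w1_loss(hist: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Wasserstein-1 between 1D histograms: the area between their CDFs
    (the mean over bins absorbs the constant bin width)."""
    return (hist.cumsum(0) - target.cumsum(0)).abs().mean()
```

Because W1 compares cumulative mass, its gradient moves luminance mass toward under-populated regions of the target rather than matching bin heights independently.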
LumaGuide reshapes the generated PQ-luminance histograms toward the target HDR distribution while preserving semantic structure, across Flux.1, SD3, SDXL, and CogVideoX. No fine-tuning is required.
Best value per column is shown in bold; LumaGuide rows are highlighted; arrows indicate whether higher or lower values are better.
LumaGuide achieves the best alignment and the largest dynamic range, with competitive quality and moderate runtime, while remaining entirely training-free.
| Method | Q-quality ↑ | Q-alignment ↑ | DR (stops) ↑ | JOD ↑ | Time (s) ↓ |
|---|---|---|---|---|---|
| LEDiff | 0.425 | 0.612 | 4.71 | −0.88 | ~8.6 |
| BracketDiffusion | 0.448 | 0.648 | 12.25 | −0.30 | ~389 |
| X2HDR | **0.579** | 0.773 | 11.41 | +0.43 | **~6** |
| **LumaGuide** | 0.568 | **0.814** | **14.99** | **+0.75** | 7.8 |
Guidance in PQ space with the W1 distance achieves by far the closest histogram alignment, well ahead of the ℓ2 and KL variants and of linear-domain guidance; PQ+KL reaches a slightly larger dynamic range, but at much worse alignment. Minimal sketches of these distances follow the table.
| Domain | Distance | uW1 ↓ | p50 dist. ↓ | p99 dist. ↓ | DR (stops) ↑ |
|---|---|---|---|---|---|
| Linear | W1 | 3.73 | 0.115 | 0.199 | 16.37 |
| Linear | ℓ2 | 3.79 | 0.116 | 0.207 | 16.33 |
| PQ | ℓ2 | 3.40 | 0.100 | 0.206 | 16.18 |
| PQ | KL | 2.06 | 0.065 | 0.143 | **16.51** |
| **PQ** | **W1** | **0.58** | **0.024** | **0.053** | 14.99 |
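For reference, the ablated distances can be written against the soft histograms from the sketch above; these are generic definitions (the KL direction and reduction shown are assumptions, since the exact forms are not given here):

```python
def l2_loss(hist: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    """Squared ℓ2 distance between normalized histograms."""
    return (hist - target).pow(2).sum()

def kl_loss(hist: torch.Tensor, target: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """KL(target || prediction); the direction is an assumption."""
    return (target * ((target + eps) / (hist + eps)).log()).sum()
```

Unlike these bin-wise comparisons, W1 accounts for how far mass must move between bins, which is consistent with its much lower alignment error in the table.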
Drop-in across Flux.1, SD3, and SDXL, with no retraining; a schematic of how the guidance hooks into a single sampling step follows the table.
| Backbone | Q-quality ↑ | Q-alignment ↑ | DR (stops) ↑ |
|---|---|---|---|
| Flux.1 | **0.568** | **0.814** | 14.99 |
| SD3 | 0.512 | 0.795 | 15.42 |
| SDXL | 0.431 | 0.655 | **15.90** |
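Integration is backbone-agnostic: the same energy plugs into any sampler that exposes a clean-image prediction. The sketch below continues the code above and assumes a rectified-flow backbone (as in Flux.1 and SD3); the names and signatures (`model`, `vae.decode`) are illustrative placeholders, not a real API:

```python
def luma_guided_velocity(model, vae, x_t, t, target_hist, scale=1.0):
    """Schematic guided step: differentiate the W1 histogram energy through
    the VAE decoder and use the gradient to shape the sampling velocity."""
    x_t = x_t.detach().requires_grad_(True)
    v = model(x_t, t)            # backbone velocity prediction
    x0_hat = x_t - t * v         # clean-latent estimate under x_t = (1-t)*x0 + t*noise
    img = vae.decode(x0_hat)     # differentiable decode to pixel space
    # Assumption: Rec. 2020 luma weights as the luminance proxy.
    w = torch.tensor([0.2627, 0.6780, 0.0593], device=img.device).view(1, 3, 1, 1)
    lum = (img.clamp(0.0, 1.0) * w).sum(dim=1)
    loss = w1_loss(soft_histogram(pq_encode(lum)), target_hist)
    grad = torch.autograd.grad(loss, x_t)[0]
    # The correction nudges the trajectory down the energy gradient;
    # the sign depends on the sampler's time convention.
    return v + scale * grad
```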
Applying the same distribution-shaping objective to a pretrained video diffusion model (CogVideoX) yields zero-shot HDR video synthesis. A Temporal Luminance Coherence (TLC) term penalizes highlight flicker across frames while preserving motion and semantics; a sketch of one plausible formulation follows.
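The precise TLC form is not spelled out here, so the snippet below is only one plausible instantiation: it tracks the brightest PQ-luminance quantile per frame (an assumed highlight statistic) and penalizes its frame-to-frame variation.

```python
import torch

def tlc_loss(pq_frames: torch.Tensor, top_q: float = 0.99) -> torch.Tensor:
    """Hypothetical Temporal Luminance Coherence penalty: discourage jumps in
    the brightest PQ-luminance quantile between consecutive frames (highlight
    flicker) while leaving spatial content and motion unconstrained.
    pq_frames: (T, H, W) PQ-encoded luminance for T frames."""
    highlights = torch.quantile(pq_frames.flatten(1), top_q, dim=1)  # (T,)
    return (highlights[1:] - highlights[:-1]).abs().mean()
```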
@article{chen2026lumaguide,
  title   = {LumaGuide: Distribution Shaping for Training-Free HDR Generation in Diffusion Models},
  author  = {Chen, Bowen and Saini, Shreshth and Adsumilli, Balu and Bovik, Alan C.},
  journal = {arXiv preprint},
  year    = {2026}
}