dense+dpo+diffusion+paper

2025-05-14 21:33:43


A Dense Reward View on Aligning Text-to-Image Diffusion with...

In this paper, we adopt a finer-grained, dense-reward perspective and derive a tractable alignment objective that emphasizes the initial steps of the text-to-image (T2I) reverse chain. In particular, we introduce temporal discounting into DPO-style explicit-reward-free objectives to break the temporal symmetry therein ...
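To make the idea of temporal discounting concrete, here is a minimal sketch of what a discounted DPO-style loss over a reverse chain might look like. It is not the paper's exact objective: the function name `discounted_dpo_loss`, the per-step log-probability inputs, the convention that index 0 is the first reverse step, and the defaults for `gamma` and `beta` are all assumptions made for illustration.

```python
import math

def discounted_dpo_loss(logp_w, logp_ref_w, logp_l, logp_ref_l,
                        gamma=0.9, beta=1.0):
    """Hypothetical DPO-style loss with per-step temporal discounting.

    Each argument is a list of per-step log-probabilities along the reverse
    chain (index 0 = first reverse step, an assumed convention):
      logp_w / logp_l         : trained policy on the preferred / rejected trajectory
      logp_ref_w / logp_ref_l : frozen reference policy on the same steps
    """
    n = len(logp_w)
    if not (len(logp_ref_w) == len(logp_l) == len(logp_ref_l) == n):
        raise ValueError("all inputs must have the same number of steps")

    margin = 0.0
    for t in range(n):
        ratio_w = logp_w[t] - logp_ref_w[t]   # implicit per-step reward, preferred
        ratio_l = logp_l[t] - logp_ref_l[t]   # implicit per-step reward, rejected
        # gamma < 1 weights early steps more heavily; gamma == 1 treats
        # all steps alike, which is the temporally symmetric case.
        margin += (gamma ** t) * (ratio_w - ratio_l)

    z = beta * margin
    # -log(sigmoid(z)) computed as a numerically stable softplus(-z).
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

With `gamma=1.0` the loss is unchanged if the same per-step advantage is moved from the first step to the last. With `gamma<1.0` an advantage at an early step lowers the loss more than the same advantage at a late step, which is one way to read "emphasizes the initial steps".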