A mystifying aspect of diffusion model training, often hidden in opaque hyperparameter tables in the appendices of research papers or in the default parameters of codebases, is the need to apply a very long average to get good results, with an averaging length that is often several percent of the entire training run. Using the ...