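This paper presents SnapFusion, a high-performance Stable Diffusion model for mobile devices. SnapFusion makes two core contributions: (1) through a layer-by-layer analysis of the existing UNet, it locates the speed bottlenecks and proposes a new efficient UNet architecture (Efficient UNet) that can serve as an equivalent replacement for the UNet in the original Stable Diffusion, achieving a 7.4x speedup; (2) the number of iteration steps at inference time is...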
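Google Research, Brain Team. We present Imagen, a text-to-image diffusion model with an unprecedented degree of photorealism and a deep level of language understanding. Imagen builds on the power of large Transformer language models in understanding text and relies on the strength of diffusion models in high-fidelity image generation. Our key discovery is that generic large language models (e.g. T5), pretrained on text-only corpora, are surprisingly effective at encoding...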
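As one of the leading text-to-image generation models (1K+ citations on Google in just one year), Imagen effectively applies large language models to the vision domain and can generate high-resolution 1024*1024 images from a given text. Let's walk through the paper together. Homepage: Imagen: Text-to-Image Diffusion Models (research.google) Paper: arxiv.org/abs/2205.1148 Video: Imagen ...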
A latent text-to-image diffusion model. Contribute to CompVis/stable-diffusion development by creating an account on GitHub.
Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model us...
to 54.69/84.17/61.71. Similarly, on DrawBench, improvements are observed in position, attribute, and count, notably boosting the attribute success rate from 48.20% to 97.50%. Additionally, as mentioned in the research paper, MIGC maintains a similar inference speed to the original stable dif...
Compared to the open-source text-to-image method (Stable Diffusion) and DALL-E2, our model consistently leads to improved synthesis quality. Panels: Stable Diffusion, DALL-E2, eDiff-I (ours). Prompt: There are two Chinese teapots on a table. One pot has a painting of a dragon, while the other pot has...
This research aims to tackle a challenging task called “MIG” and introduces a solution called MIGC to enhance the performance of Stable Diffusion in handling MIG tasks. One of the novel ideas is a divide-and-conquer strategy, which breaks a complex MIG task down into simpler tasks, foc...
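Title: SwiftBrush: One-Step Text-to-Image Diffusion Model with Variational Score Distillation. From VinAI Research. CVPR 2024. Figure 1. SwiftBrush overview. Highlight: The authors propose SwiftBrush, an image-free distillation method. The existing method Score Distillation Sampling (SDS) suffers from over-saturation, over-smoothing, and poor diversity; building on SDS, this paper proposes Variatio...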
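Pipeline, which supports dynamically loading lora and textual_inversion weights through the prompt; added the Stable Diffusion HiresFix Pipeline, which supports high-resolution fix; added the COCO eval evaluation metric for keypoint-controlled generation tasks; added Pipelines for diffusion models across multiple modalities, including video generation (Text-to-Video-Synth, Text-to-Video-Zero), audio generation (AudioLDM, Spectrogram Diffusion)...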