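For this new task, Freestyle Layout-to-Image Synthesis (FLIS), we build FreestyleNet on top of Stable Diffusion [1], a large pre-trained text-to-image model. Stable Diffusion provides rich semantics, but it accepts only text as input, so placing those semantics into a specified layout is a major challenge. To this end, we introduce Rectified Cross-Attention (Rect...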
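One general way to constrain which text tokens may influence which pixels is to mask the cross-attention scores before the softmax. The sketch below illustrates only that generic masking idea under stated assumptions; it is not FreestyleNet's implementation. The function name `masked_cross_attention` and the `scores` and `allowed` data layouts are hypothetical, chosen for this example.

```python
import math

def masked_cross_attention(scores, allowed):
    """Softmax over text tokens for each pixel, restricted by a layout mask.

    scores[i][k]  -- raw similarity between pixel i and text token k (assumed given)
    allowed[i][k] -- True if pixel i lies inside the layout region tied to token k
    Returns a list of per-pixel attention weights over the tokens.
    """
    weights = []
    for row, ok in zip(scores, allowed):
        kept = [s for s, a in zip(row, ok) if a]
        if not kept:
            # Assumption: a pixel covered by no token receives zero attention.
            weights.append([0.0] * len(row))
            continue
        peak = max(kept)  # subtract the max for numerical stability
        exps = [math.exp(s - peak) if a else 0.0 for s, a in zip(row, ok)]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights
```

With this masking, a token whose region does not contain a pixel contributes exactly zero weight at that pixel, regardless of how high its raw score is.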
Freestyle Layout-to-Image Synthesis (FLIS) results generated by our model. Each result takes two inputs: a layout of semantic masks (in the 1st column) and a text prompt (at the top of each result). For each layout, we show three example results with edited texts (3rd-5th columns...
Layout-to-Image Synthesis (LIS)

To generate images under the traditional LIS setting, run:

python scripts/LIS.py --batch_size 8 --config /path/to/config --ckpt /path/to/trained_model --dataset <dataset name> --outdir /path/to/output --txt_file /path/to/dataset/with/val.txt --data...