moma+multimodal+llm

2025-04-14 00:57:14

MoMA: Multimodal LLM Adapter for Fast Personalized Image...

Utilizing an open-source, Multimodal Large Language Model (MLLM), we train MoMA to serve a dual role as both a feature extractor and a generator. This approach effectively synergizes reference image and text prompt information to produce valuable image features, facilitating an image diffusion ...