BLIP-2 (Bootstrapping Language-Image Pre-training) is an AI model that can perform various multi-modal tasks, such as visual question answering, image-text retrieval (image-text matching), and image captioning. It can analyze an image, understand its content, and generate a relevant and concise captio...
python src/analyze.py path/to/directory "Describe the image"

BLIP3 Autocaptioning Tools

Welcome to this XGEN-MM (BLIP3) Autocaptioning Tools repository! This project sets up tools for autocaptioning using state-of-the-art models.

✅ Chat Mode
✅ Caption Mode
✅ Fa...
python src/analyze.py path/to/directory "Describe the image"

Saving Responses

To save the AI's responses to text files, add the --save_response flag:

python src/analyze.py path/to/image.jpg "Describe the image" --save_response

Using the Chat Interface

You can interact with the BLIP3 mo...
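
As a rough illustration of the "Saving Responses" step above, the sketch below shows one way a response could be written to a text file next to its image, and how the images in a directory could be listed for batch use. It is not taken from src/analyze.py. The helper names (`find_images`, `response_path`, `save_response`), the list of image extensions, and the "same name with .txt" rule are all assumptions made for this example.

```python
from pathlib import Path
from typing import List

# Assumed set of image extensions; the real tool may accept others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(directory: Path) -> List[Path]:
    """Return the image files directly inside `directory`, sorted by path."""
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def response_path(image_path: Path) -> Path:
    """Hypothetical naming rule: same folder and stem as the image, .txt suffix."""
    return image_path.with_suffix(".txt")


def save_response(image_path: Path, response: str) -> Path:
    """Write the model's response (whitespace-trimmed) next to the image."""
    out = response_path(image_path)
    out.write_text(response.strip() + "\n", encoding="utf-8")
    return out
```

With these helpers, a batch run would loop over `find_images(directory)`, obtain a response for each image from the model, and pass it to `save_response`. The model call itself is left out because its interface is not shown in this excerpt.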