git clone https://github.com/baaivision/tokenize-anything
cd tokenize-anything && pip install .

Environment dependencies:
pip3 install fschat==0.2.20
pip3 install open-clip-torch==2.7.0
pip3 install gradio-image-prompter

Run the Gradio demo:
python3 scripts/app_gradio.py --model-type tap_vit_b --checkpoint ....
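Beyond the Gradio demo, the model can also be loaded from Python. The following is only a minimal loading sketch: the `model_registry` entry point and the checkpoint filename are assumptions made for this note, so check the tokenize-anything README for the exact interface.

# Minimal sketch, not verified against the current tokenize-anything API:
# `model_registry` and the checkpoint path below are assumptions.
import torch
from tokenize_anything import model_registry  # assumed entry point

checkpoint = "models/tap_vit_b.pkl"  # hypothetical path to downloaded weights
model = model_registry["tap_vit_b"](checkpoint=checkpoint).eval()
if torch.cuda.is_available():
    model = model.cuda()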
Tokenize Anything via Prompting

We present a unified, promptable model capable of simultaneously segmenting, recognizing, and captioning anything. Unlike SAM, we aim to build a versatile region representation in the wild via visual prompting. To achieve this, we train a generalizable model with massive segmentation masks, e.g., SA-1B masks, and semantic priors from a pre-trained CLIP model with 5 billion parameters. Specifically, we construct a promptable image ...
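As a rough illustration of the interface described above (one visual prompt in, a mask plus region semantics out), here is a self-contained toy sketch in PyTorch. The class and head names are invented for this note, and the prompt-image fusion is deliberately simplistic; it mirrors only the high-level idea, not the model's actual architecture.

import torch
import torch.nn as nn

class TinyPromptableDecoder(nn.Module):
    # Toy stand-in: one box prompt -> mask logits, concept logits, caption logits.
    def __init__(self, dim=64, num_concepts=10, vocab_size=100):
        super().__init__()
        self.prompt_proj = nn.Linear(4, dim)              # encode a box prompt (x1, y1, x2, y2)
        self.mask_head = nn.Linear(dim, 32 * 32)          # low-resolution mask logits
        self.concept_head = nn.Linear(dim, num_concepts)  # "semantic" output for recognition
        self.caption_head = nn.Linear(dim, vocab_size)    # toy token logits for captioning

    def forward(self, image_feat, box_prompt):
        # Real models fuse prompt and image features with cross-attention; addition is a placeholder.
        q = self.prompt_proj(box_prompt) + image_feat
        return (self.mask_head(q).view(-1, 32, 32),
                self.concept_head(q),
                self.caption_head(q))

decoder = TinyPromptableDecoder()
image_feat = torch.randn(1, 64)                  # pretend pooled image feature
box = torch.tensor([[0.1, 0.2, 0.6, 0.8]])       # one normalized box prompt
mask, concept, caption = decoder(image_feat, box)
print(mask.shape, concept.shape, caption.shape)  # (1, 32, 32), (1, 10), (1, 100)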
Tokenize Anything via Prompting
Ting Pan 1,2*, Lulu Tang 2*, Xinlong Wang 2¶, Shiguang Shan 1
1 ICT-CAS, 2 BAAI
* Equal Contribution, ¶ Project Lead
[Paper] [🤗 Demo]

We present Tokenize Anything via Prompting, a unified and promptable model capable of simultaneously segmenting, recognizing, and captioning arbitrary regions, with flexible visual prompts (point, box and sketch). The model is trained with exhaustive segmentation masks sourced from SA-1B, coupled with semantic priors from a pre-trained CLIP model with 5 billion parameters.
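The "point, box and sketch" prompts are all just geometric hints attached to an image. Below is one minimal way to represent them, purely for illustration; these dataclasses are invented here and are not the repository's actual prompt format.

# Illustrative prompt containers only; the real project defines its own prompt format.
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PointPrompt:
    xy: Tuple[float, float]                   # pixel coordinates of a click
    label: int = 1                            # 1 = foreground, 0 = background

@dataclass
class BoxPrompt:
    xyxy: Tuple[float, float, float, float]   # top-left and bottom-right corners

@dataclass
class SketchPrompt:
    polyline: List[Tuple[float, float]]       # freehand stroke sampled as points

prompts = [
    PointPrompt(xy=(320.0, 180.0)),
    BoxPrompt(xyxy=(100.0, 50.0, 400.0, 300.0)),
    SketchPrompt(polyline=[(120.0, 60.0), (150.0, 90.0), (200.0, 120.0)]),
]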