In this project, we propose VLM-R1, a stable and generalizable R1-style Large Vision-Language Model. Specifically, for the task of Referring Expression Comprehension (REC), we trained Qwen2.5-VL using both R1 and SFT approaches. The results reveal that, on the in-domain test data, the ...
Paper: "VLM-R1: A stable and generalizable R1-style Large Vision-Language Model". Paper link: https://arxiv.org/abs/2504.07615. GitHub: https://github.com/om-ai-lab/VLM-R1. When I previously covered the Visual-RFT paper (https://zhuanlan.zhihu.com/p/28124490367), I mentioned that VLM-R1 only had a project repository and no paper yet; now...
om-ai-lab/VLM-R1: Solve Visual Understanding with Reinforced VLMs.
VLM-R1-evaluation, project address: github.com/om-ai-lab/VL Regarding the data: the project is built on an object detection dataset, RefCOCO+. I had not worked in this area before, so I looked it up; the format is roughly this: each image comes with a set of bounding boxes, and each box has one or more textual descriptions referring to it. The whole project is built on top of the open-r1 project, so the approach is simple: focus on the data processing and the reward (a minimal sketch of such a sample and reward follows below). Its prompt...
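To make the data format and reward concrete, here is a minimal sketch, assuming a RefCOCO+-style sample (image, referring expression, ground-truth box in xyxy pixel coordinates) and an IoU-based reward for the box the model emits. The field names, the example values, and the `iou_reward` helper are illustrative assumptions, not the repository's actual code.

```python
# Hypothetical RefCOCO+-style sample: one image, one referring expression,
# one ground-truth box in (x1, y1, x2, y2) pixel coordinates.
sample = {
    "image": "COCO_train2014_000000000139.jpg",   # illustrative file name
    "referring_expression": "the person in the red jacket on the left",
    "bbox": [34.0, 112.0, 210.0, 385.0],
}

def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def iou_reward(predicted_box, gt_box):
    """Scalar reward for one sampled completion: the IoU of the box the model
    emitted against the ground-truth box (0.0 if no box could be parsed)."""
    if predicted_box is None:
        return 0.0
    return iou(predicted_box, gt_box)
```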
A mirror repository, synced daily, is provided to improve download speeds within mainland China; the original repository is https://github.com/om-ai-lab/VLM-R1.
VLM-R1 is a vision-language model built on reinforcement learning; it can precisely localize targets in an image from natural-language instructions and supports multimodal reasoning. 1. Referring Expression Comprehension: it parses natural-language instructions and accurately localizes the specified target in the image. 2. Reinforcement-learning optimization: it is trained with GRPO, performs well in complex scenes, and generalizes better (a sketch of GRPO's group-relative advantage follows below). What is VLM-R1: VLM-R1, developed by Zhejiang University's Om AI Lab, is built on reinforcement learning tech...
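For reference, GRPO scores a group of completions sampled for the same prompt and uses the group-normalized reward as the advantage, so no separate value model is needed. The following is a minimal sketch of that group-relative advantage, assuming a per-completion scalar reward such as the IoU reward sketched earlier; it illustrates the idea rather than reproducing the project's implementation.

```python
import torch

def grpo_advantages(rewards: torch.Tensor, eps: float = 1e-4) -> torch.Tensor:
    """rewards: shape (G,), the rewards of the G completions sampled for one
    prompt. Each completion's advantage is its reward normalized by the group
    mean and standard deviation."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)

# Example: four completions for one referring expression, scored by IoU.
rewards = torch.tensor([0.82, 0.10, 0.55, 0.0])
print(grpo_advantages(rewards))
```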