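They asked GPT-4V to take the necessary actions and explain its choices, probing the limits of its capabilities in real driving scenarios. The tests used carefully selected images and videos representing different driving scenarios. The test samples come from a variety of sources, including nuScenes, the Waymo Open dataset, the Berkeley Deep Drive-X (eXplanation) Dataset (BDD-X), D²-city, the Car Crash Dataset (CCD), TSDD, ...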
[11] K. Kärkkäinen and J. Joo, “Fairface: Face attribute dataset for balanced race, gender, and age,” arXiv preprint arXiv:1908.04913, 2019. [12] G. B. Huang, M. Mattar, T. Berg, and E. Learned-Miller, “Labeled faces in the wild: A database for studying face recognition...
D-city, Car Crash Dataset (CCD), TSD, CODA, ADD, and V2X datasets such as DAIR-V2X and CitySim.
The ShareGPT4V dataset is a large-scale resource of 1.2 million highly descriptive captions. These captions surpass those of existing datasets in diversity and information content. They cover a wide range of topics, including world
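Alignment Dataset: Early on, people used the open-source LAION-400M and LAION-5B for alignment training, but in practice the image-text pairs in these datasets are likely too noisy, so they are not very efficient for learning cross-modal alignment. One approach is to describe the alignment dataset at a finer granularity, which helps the model better learn relationships such as the relative positions of objects in an image, and ties in with the LLM's existing knowledge...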
Paper: https://arxiv.org/abs/2311.12793 Homepage: https://sharegpt4v.github.io/ Web Demo: https://huggingface.co/spaces/Lin-Chen/ShareGPT4V-7B Code and Dataset: https://github.com/InternLM/InternLM-XComp…
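Title: HoloAssist: an Egocentric Human Interaction Dataset for Interactive AI Assistants in the Real World Institutions: Microsoft, ETH Zurich Keywords: egocentric, human-computer interaction, physical manipulation tasks, real-time interaction data Link: arxiv.org/pdf/2309.1702 Code: holoassist.github.io/ 24. A generative-pretraining-based method for reconstructing ... in AI-assisted radiology image interpretation...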
Image source: GitHub - linhandev/dataset: list of medical imaging datasets 『An Index for Medical Imaging Datasets』 github.com/linhandev/da https://www.kaggle.com/datasets/nih-chest-xrays/data#:~:text=Class%20descriptions,Hernia Image input. GPT-V classification result: (pneumonia, 0.7) 1. Pneumonia: ◦ Confidence: 0.7 ◦ Reason: X-ray...
Design Principles and Characteristics of the RS-GPT4V Dataset: Illustrates the dataset's design principles, focusing on unity, diversity, correctness, complexity, richness, and robustness. Principles-Driven Pipeline for RS-GPT4V Dataset Construction: The construction process follows a structured approach integr...
Dataset Cards
All datasets can be found here. The structure of naming is shown below:
ALLaVA-4V
├── ALLaVA-Caption-4V
│   ├── ALLaVA-Caption-LAION-4V
│   └── ALLaVA-Caption-VFLAN-4V
├── ALLaVA-Instruct-4V
│   ├── ALLaVA-Instruct-LAION-4V
│   └── ALLaVA-Instruct-V...
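The entries visible in the naming structure above appear to follow the pattern `ALLaVA-<task>-<source>-4V`, with the source part omitted for the per-task parent. The following is a minimal Python sketch under that assumption only. The helper names `subset_name` and `parse_subset_name` are hypothetical and are not part of any ALLaVA tooling, and the truncated last entry is deliberately not reconstructed.

```python
from typing import Dict, Optional


def subset_name(task: str, source: Optional[str] = None) -> str:
    """Compose a name following the assumed ALLaVA-<task>[-<source>]-4V pattern."""
    parts = ["ALLaVA", task]
    if source is not None:
        parts.append(source)
    parts.append("4V")
    return "-".join(parts)


def parse_subset_name(name: str) -> Dict[str, Optional[str]]:
    """Split a subset name back into its task and optional source."""
    parts = name.split("-")
    # Assumption: names have exactly 3 or 4 hyphen-separated parts,
    # starting with "ALLaVA" and ending with "4V".
    if len(parts) not in (3, 4) or parts[0] != "ALLaVA" or parts[-1] != "4V":
        raise ValueError(f"unexpected subset name: {name!r}")
    return {"task": parts[1], "source": parts[2] if len(parts) == 4 else None}


if __name__ == "__main__":
    # Only combinations that are fully visible in the listing are used here.
    for task, source in [("Caption", None), ("Caption", "LAION"),
                         ("Caption", "VFLAN"), ("Instruct", "LAION")]:
        print(subset_name(task, source))
```

Keeping composition and parsing as two small functions makes it easy to check that a name round-trips before using it to look up a subset.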