In this paper, we introduce 3D-GRAND, a pioneering large-scale dataset\ncomprising 40,087 household scenes paired with 6.2 million densely-grounded\nscene-language instructions. Our results show that instruction tuning with\n3D-GRAND significantly enhances grounding capabilities and reduces\n...