liudaizong / Awesome 3D Visual Grounding - 😎 up-to-date & curated list of awesome 3D Visual Grounding papers, methods & resources.
GWxuan / TSP3D - [CVPR 2025, All Strong Accept] TSP3D: Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
iris0329 / SeeGround - [CVPR'25] SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
worldbench / 3EED - [NeurIPS 2025 DB Track] 3EED: Ground Everything Everywhere in 3D
be2rlab / Gsplatloc - [IROS 2025] GSplatLoc: Grounding Keypoint Descriptors into 3D Gaussian Splatting for Improved Visual Localization
yanmin-wu / EDA - [CVPR 2023] EDA: Explicit Text-Decoupling and Dense Alignment for 3D Visual Grounding
InternRobotics / VLM Grounder - [CoRL 2024] VLM-Grounder: A VLM Agent for Zero-Shot 3D Visual Grounding
CognitiveAISystems / 3DGraphLLM - [ICCV 2025] 3DGraphLLM is a model that uses a 3D scene graph and an LLM to perform 3D vision-language tasks.
jianghaojun / Awesome 3D Vision And Language - A collection of 3D vision and language (e.g., 3D Visual Grounding, 3D Question Answering and 3D Dense Caption) papers and datasets.
ZCMax / ScanReason - [ECCV 2024] Empowering 3D Visual Grounding with Reasoning Capabilities
sega-hsj / MVT 3DVG - [CVPR 2022] Multi-View Transformer for 3D Visual Grounding
WHU-USI3DV / CityAnchor - [ICLR'25] City-scale 3D Visual Grounding with Multi-modality LLMs
ZhanYang-nwpu / Mono3DVG - [AAAI 2024] Mono3DVG: 3D Visual Grounding in Monocular Images, AAAI, 2024
CurryYuan / ZSVG3D - [CVPR 2024] Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding
Ivan-Tang-3D / ViewRefer3D - (ICCV2023) Official implementation of 'ViewRefer: Grasp the Multi-view Knowledge for 3D Visual Grounding with GPT and Prototype Guidance'
zlccccc / 3DVL Codebase - [CVPR2022 Oral] 3DJCG: A Unified Framework for Joint Dense Captioning and Visual Grounding on 3D Point Clouds
pqh22 / ProxyTransformation - [CVPR2025] ProxyTransformation: Preshaping Point Cloud Manifold With Proxy Attention For 3D Visual Grounding
daveredrum / D3Net - [ECCV2022] D3Net: A Unified Speaker-Listener Architecture for 3D Dense Captioning and Visual Grounding
zyang-ur / SAT - SAT: 2D Semantics Assisted Training for 3D Visual Grounding, ICCV 2021 (Oral)
Leon1207 / 3DRefTR - This is a PyTorch implementation of 3DRefTR proposed by our paper "A Unified Framework for 3D Point Cloud Visual Grounding"