Lcd
[AAAI'20] LCD: Learned Cross-Domain Descriptors for 2D-3D Matching
Install / Use
/learn @hkust-vgd/LcdREADME
LCD: Learned Cross-domain Descriptors for 2D-3D Matching
This is the official PyTorch implementation of the following publication:
LCD: Learned Cross-domain Descriptors for 2D-3D Matching<br/> Quang-Hieu Pham, Mikaela Angelina Uy, Binh-Son Hua, Duc Thanh Nguyen, Gemma Roig, Sai-Kit Yeung<br/> AAAI Conference on Artificial Intelligence, 2020 (Oral)<br/> Paper | Homepage | Video
2D-3D Match Dataset
Download http://hkust-vgd.ust.hk/2d3dmatch/
We collect a new dataset of 2D-3D correspondences by leveraging the availability of several 3D datasets from RGB-D scans. Specifically, we use the data from SceneNN and 3DMatch. Our training dataset consists of 110 RGB-D scans, of which 56 scenes are from SceneNN and 54 scenes are from 3DMatch. The 2D-3D correspondence data is generated as follows. Given a 3D point which is randomly sampled from a 3D point cloud, we extract a set of 3D patches from different scanning views. To find a 2D-3D correspondence, for each 3D patch, we re-project its 3D position into all RGB-D frames for which the point lies in the camera frustum, taking occlusion into account. We then extract the corresponding local 2D patches around the re-projected point. In total, we collected around 1.4 millions 2D-3D correspondences.
Usage
Prerequisites
Required PyTorch 1.2 or newer. Some other dependencies are:
- h5py
- Open3D
Pre-trained models
We released three pre-trained LCD models with different descriptor size: LCD-D256, LCD-D128, and LCD-D64.
All of the models can be found in the logs folder.
Training
After downloading our dataset, put all of the hdf5 files into the data folder.
To train a model on the 2D-3D Match dataset, use the following command:
$ python train.py --config config.json --logdir logs/LCD
Log files and network parameters will be saved to the logs/LCD folder.
Applications
Aligning two point clouds with LCD
This demo aligns two 3D colored point clouds using our pre-trained LCD descriptor with RANSAC. How to run:
$ python -m apps.align_point_cloud samples/000.ply samples/002.ply --logdir logs/LCD-D256/
For more information, use the --help option.
After aligning two input point clouds, the final registration result will be shown. For example:
<p align="center"> <img src="https://github.com/hkust-vgd/lcd/blob/master/assets/aligned.png?raw=true" alt="Aligned point clouds"/> </p>Note: This demo requires Open3D installed.
Prepare your own dataset
We provide two scripts that we found useful during our data processing. Please take a look and adopt it to your need.
scripts/sample_train.py: Sample 2D-3D correspondences from the 3DMatch datasetscripts/convert_valtest.py: Convert theval-set.matandtest-set.matfiles from 3DMatch into HDF5 format.
Citation
If you find our work useful for your research, please consider citing:
@inproceedings{pham2020lcd,
title = {{LCD}: {L}earned cross-domain descriptors for 2{D}-3{D} matching},
author = {Pham, Quang-Hieu and Uy, Mikaela Angelina and Hua, Binh-Son and Nguyen, Duc Thanh and Roig, Gemma and Yeung, Sai-Kit},
booktitle = {AAAI Conference on Artificial Intelligence},
year = 2020
}
If you use our dataset, please cite the following papers:
@inproceedings{hua2016scenenn,
title = {{SceneNN}: {A} scene meshes dataset with a{NN}otations},
author = {Hua, Binh-Son, and Pham, Quang-Hieu and Nguyen, Duc Thanh and Tran, Minh-Khoi and Yu, Lap-Fai and Yeung, Sai-Kit},
booktitle = {International Conference on 3D Vision},
year = 2016
}
@inproceedings{zeng20173dmatch,
title = {{3DMatch}: {L}earning local geometric descriptors from {RGB}-{D} reconstructions},
author= {Zeng, Andy and Song, Shuran and Nie{\ss}ner, Matthias and Fisher, Matthew and Xiao, Jianxiong and Funkhouser, Thomas},
booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
year = 2017
}
License
Our code is released under BSD 3-Clause license (see LICENSE for more details).
Our dataset is released under CC BY-NC-SA 4.0 license.
Contact: Quang-Hieu Pham (pqhieu1192@gmail.com)
Related Skills
YC-Killer
2.7kA library of enterprise-grade AI agents designed to democratize artificial intelligence and provide free, open-source alternatives to overvalued Y Combinator startups. If you are excited about democratizing AI access & AI agents, please star ⭐️ this repository and use the link in the readme to join our open source AI research team.
API
A learning and reflection platform designed to cultivate clarity, resilience, and antifragile thinking in an uncertain world.
groundhog
398Groundhog's primary purpose is to teach people how Cursor and all these other coding agents work under the hood. If you understand how these coding assistants work from first principles, then you can drive these tools harder (or perhaps make your own!).
sec-edgar-agentkit
10AI agent toolkit for accessing and analyzing SEC EDGAR filing data. Build intelligent agents with LangChain, MCP-use, Gradio, Dify, and smolagents to analyze financial statements, insider trading, and company filings.
