RayEmb
[ACCV 2024 (Oral)] Official Implementation of "RayEmb: Arbitrary Landmark Detection in X-Ray Images Using Ray Embedding Subspace", Pragyan Shrestha, Chun Xie, Yuichi Yoshii, Itaru Kitahara
Comparison of landmark detection results between conventional fixed landmark
estimation and our arbitrary landmark estimation method. The 3D landmarks are
shown in magenta on the left, while the estimated 2D landmarks are displayed in
cyan and the ground truth in magenta on the right. Our method can generate a large
number of corresponding pairs of 3D landmarks and 2D projections, whereas the fixed
landmark estimation approach is limited to the pre-annotated landmarks.
🔥 Updates
- 2024/12/09: Interactive demo is live!
- 2024/12/04: Project page is live!
- 2024/12/01: Code available.
⭐ Overview
RayEmb introduces a novel approach for detecting arbitrary landmarks in X-ray images using ray embedding subspace. Our approach represents 3D points as distinct subspaces, formed by feature vectors (referred to as ray embeddings) corresponding to intersecting rays. Establishing 2D-3D correspondences then becomes a task of finding ray embeddings that are close to a given subspace, essentially performing an intersection test.
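The intersection test described above can be sketched with plain linear algebra. This is a toy illustration only, not the model's code: the feature dimension (32), the number of rays (4), and the data are made up.

```python
import numpy as np

def subspace_distance(query, basis):
    """Residual of `query` after projection onto the column span of `basis`."""
    q, _ = np.linalg.qr(basis)          # orthonormal basis of the subspace
    return np.linalg.norm(query - q @ (q.T @ query))

rng = np.random.default_rng(0)
basis = rng.normal(size=(32, 4))        # 4 ray embeddings for one 3D point
inside = basis @ rng.normal(size=4)     # lies in the subspace: near-zero residual
outside = rng.normal(size=32)           # generic vector: large residual

d_in = subspace_distance(inside, basis)
d_out = subspace_distance(outside, basis)
```

A ray embedding that truly intersects the 3D point's subspace behaves like `inside` (residual near zero), so correspondence search reduces to thresholding this distance.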
🚀 Features
- A CLI for downloading data, preprocessing, training and evaluating models.
- A PyTorch implementation of the RayEmb and FixedLandmark models.
- OpenCV-based PnP + RANSAC 2D-3D registration for initial pose estimates.
- DiffDRR-based refinement module.
📚 Requirements
- Python 3.10+
- PyTorch 2.0+
- CUDA 11.8+

We have tested the code on an RTX 3090 and an H100 GPU.
🛠️ Installation and Setup
Install the dependencies using poetry.
poetry install
Check that the rayemb CLI is on your PATH and executable.
rayemb --help
Usage: rayemb [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
evaluate
generate
train
Download the original DeepFluoro dataset (only the real X-rays, for testing) using the following command:
rayemb download deepfluoro \
--data_dir ./data
Preprocess the DeepFluoro dataset to get the templates using the following command:
sh scripts/generate/template_deepfluoro.sh
Download the RayEmb-CTPelvic1k checkpoint using the following command:
rayemb download checkpoint \
--checkpoint_dir ./checkpoints \
--model rayemb \
--dataset ctpelvic1k
This checkpoint can be used to evaluate the model on the CTPelvic1k dataset as well as the DeepFluoro dataset.
🎖️ Evaluation
rayemb evaluate arbitrary-landmark deepfluoro \
--checkpoint_path ./checkpoints/rayemb-ctpelvic1k.ckpt \
--num_templates 4 \
--image_size 224 \
--template_dir ./data/deepfluoro_templates \
--data_dir ./data/ipcai_2020_full_res_data.h5
🎉 Running the demo app locally
We use FastAPI for the backend and Vite + React for the frontend, so make sure you have Node.js and npm installed. First, start the server.
cd server
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Then, start the frontend.
cd ui
npm install
npm run dev
Visit http://localhost:5173/ to see the demo app.
📔 Training and Testing Custom Data
Below is an example of generating templates and training data from a single NIfTI file. To generate templates and images for multiple NIfTI files, write a shell script that loops over them; see the generate command for more details.
rayemb generate template custom \
--input_file <path_to_nifti_file> \
--output_dir <path_to_output_templates_dir> \
--height <image_height> \
--steps <number_of_steps> \
--pixel_size <pixel_size> \
--source_to_detector_distance <source_to_detector_distance>
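For intuition, here is how `--pixel_size` and `--source_to_detector_distance` relate to a pinhole camera model for the simulated projections. This is a sketch of the standard geometry only; the CLI's internals may differ.

```python
import numpy as np

def intrinsics(height, width, pixel_size, sdd):
    """Pinhole intrinsics: `pixel_size` is the detector pitch (mm/px),
    `sdd` the source-to-detector distance (mm)."""
    f = sdd / pixel_size  # focal length expressed in pixels
    return np.array([[f, 0.0, width / 2.0],
                     [0.0, f, height / 2.0],
                     [0.0, 0.0, 1.0]])

# Example values (made up): 224x224 detector, 1.4 mm pitch, 1020 mm SDD.
K = intrinsics(224, 224, 1.4, 1020.0)
```

Halving `--pixel_size` at a fixed distance doubles the focal length in pixels, i.e. narrows the field of view of the rendered templates.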
Now, we can generate the training and testing images using the following command. The --mask_dir option is optional; if it is omitted, a mask is generated by thresholding the volume with --threshold (which is ignored when --mask_dir is given):
rayemb generate dataset custom \
--input_file <path_to_nifti_file> \
--mask_dir <path_to_mask_dir> \
--threshold <threshold> \
--output_dir <path_to_output_images_dir> \
--height <image_height> \
--num_samples <number_of_samples> \
--device <device> \
--source_to_detector_distance <source_to_detector_distance> \
--pixel_size <pixel_size>
Split the generated images into training and validation sets using the following command:
rayemb generate splits \
--data_dir <path_to_data_dir> \
--type custom \
--split_ratio <split_ratio>
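The CLI's internal split logic is not shown here, but the general shape of a ratio-based split is simple to sketch (file names below are hypothetical):

```python
import random

def split_files(files, split_ratio=0.8, seed=0):
    """Deterministically shuffle, then cut the list at the ratio boundary."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * split_ratio)
    return files[:cut], files[cut:]

train, val = split_files([f"img_{i:04d}.png" for i in range(100)], split_ratio=0.8)
```

A fixed seed keeps the train/validation split reproducible across runs.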
Train the model using the following command:
sh scripts/train/rayemb_custom.sh <path_to_data_dir> <path_to_template_dir>
🏷️ TODO
- [x] Update the readme for evaluation, synthetic data generation and template generation.
- [x] Update the readme for training.
- [ ] Add typing and annotations to the codebase.
Contact
For any questions or collaboration inquiries, please contact shrestha.pragyan@image.iit.tsukuba.ac.jp.
Acknowledgements
We would like to thank eigenvivek for his awesome DiffDRR codebase.
Citations
If you find our work helpful for your research, please consider citing the following BibTeX entry.
@InProceedings{Shrestha_2024_ACCV,
  author    = {Shrestha, Pragyan and Xie, Chun and Yoshii, Yuichi and Kitahara, Itaru},
  title     = {RayEmb: Arbitrary Landmark Detection in X-Ray Images Using Ray Embedding Subspace},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {665-681}
}
