RayEmb
[ACCV 2024 (Oral)] Official Implementation of "RayEmb: Arbitrary Landmark Detection in X-Ray Images Using Ray Embedding Subspace", Pragyan Shrestha, Chun Xie, Yuichi Yoshii, Itaru Kitahara
Comparison of landmark detection results between conventional fixed landmark
estimation and our arbitrary landmark estimation method. The 3D landmarks are
shown in magenta on the left, while the estimated 2D landmarks are displayed in
cyan and the ground truth in magenta on the right. Our method can generate a large
number of corresponding pairs of 3D landmarks and 2D projections, whereas the fixed
landmark estimation approach is limited to the pre-annotated landmarks.
🔥 Updates
- 2024/12/09: Interactive demo is live!
- 2024/12/04: Project page is live!
- 2024/12/01: Code available.
⭐ Overview
RayEmb introduces a novel approach for detecting arbitrary landmarks in X-ray images using ray embedding subspace. Our approach represents 3D points as distinct subspaces, formed by feature vectors (referred to as ray embeddings) corresponding to intersecting rays. Establishing 2D-3D correspondences then becomes a task of finding ray embeddings that are close to a given subspace, essentially performing an intersection test.
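The intersection test described above can be sketched with plain linear algebra. This is a toy illustration only, not the model's code: the feature dimension (32), the number of rays (4), and the data are made up.

```python
import numpy as np

def subspace_distance(query, basis):
    """Residual of `query` after projection onto the column span of `basis`."""
    q, _ = np.linalg.qr(basis)          # orthonormal basis of the subspace
    return np.linalg.norm(query - q @ (q.T @ query))

rng = np.random.default_rng(0)
basis = rng.normal(size=(32, 4))        # 4 ray embeddings for one 3D point
inside = basis @ rng.normal(size=4)     # lies in the subspace: near-zero residual
outside = rng.normal(size=32)           # generic vector: large residual

d_in = subspace_distance(inside, basis)
d_out = subspace_distance(outside, basis)
```

A ray embedding that truly intersects the 3D point's subspace behaves like `inside` (residual near zero), so correspondence search reduces to thresholding this distance.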
🚀 Features
- A CLI for downloading data, preprocessing, training and evaluating models.
- A PyTorch implementation of the RayEmb and FixedLandmark models.
- OpenCV-based PnP + RANSAC 2D-3D registration for initial pose estimates.
- DiffDRR-based refinement module.
📚 Requirements
- Python 3.10+
- PyTorch 2.0+
- CUDA 11.8+

We have tested the code on an RTX 3090 and an H100 GPU.
🛠️ Installation and Setup
Install the dependencies using poetry.
poetry install
Check that the rayemb CLI is on your PATH and executable.
rayemb --help
Usage: rayemb [OPTIONS] COMMAND [ARGS]...
Options:
--help Show this message and exit.
Commands:
evaluate
generate
train
Download the original DeepFluoro dataset (only the real X-rays, for testing) using the following command:
rayemb download deepfluoro \
--data_dir ./data
Preprocess the DeepFluoro dataset to get the templates using the following command:
sh scripts/generate/template_deepfluoro.sh
Download the RayEmb-CTPelvic1k checkpoint using the following command:
rayemb download checkpoint \
--checkpoint_dir ./checkpoints \
--model rayemb \
--dataset ctpelvic1k
This checkpoint can be used to evaluate the model on the CTPelvic1k dataset as well as the DeepFluoro dataset.
🎖️ Evaluation
rayemb evaluate arbitrary-landmark deepfluoro \
--checkpoint_path ./checkpoints/rayemb-ctpelvic1k.ckpt \
--num_templates 4 \
--image_size 224 \
--template_dir ./data/deepfluoro_templates \
--data_dir ./data/ipcai_2020_full_res_data.h5
🎉 Running the demo app locally
We use FastAPI for the backend and Vite + React for the frontend, so make sure you have Node.js and npm installed. First, start the server.
cd server
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
Then, start the frontend.
cd ui
npm install
npm run dev
Visit http://localhost:5173/ to see the demo app.
📔 Training and Testing Custom Data
Below is an example of generating templates and training data from a single NIfTI file. To generate templates and images for multiple NIfTI files, write a shell script that loops over them; see the generate command for more details.
rayemb generate template custom \
--input_file <path_to_nifti_file> \
--output_dir <path_to_output_templates_dir> \
--height <image_height> \
--steps <number_of_steps> \
--pixel_size <pixel_size> \
--source_to_detector_distance <source_to_detector_distance>
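For intuition, here is how `--pixel_size` and `--source_to_detector_distance` relate to a pinhole camera model for the simulated projections. This is a sketch of the standard geometry only; the CLI's internals may differ.

```python
import numpy as np

def intrinsics(height, width, pixel_size, sdd):
    """Pinhole intrinsics: `pixel_size` is the detector pitch (mm/px),
    `sdd` the source-to-detector distance (mm)."""
    f = sdd / pixel_size  # focal length expressed in pixels
    return np.array([[f, 0.0, width / 2.0],
                     [0.0, f, height / 2.0],
                     [0.0, 0.0, 1.0]])

# Example values (made up): 224x224 detector, 1.4 mm pitch, 1020 mm SDD.
K = intrinsics(224, 224, 1.4, 1020.0)
```

Halving `--pixel_size` at a fixed distance doubles the focal length in pixels, i.e. narrows the field of view of the rendered templates.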
Now, we can generate the training and testing images using the following command. The --mask_dir option is optional; if it is omitted, a mask is generated by thresholding the volume with --threshold (which is ignored when --mask_dir is given):
rayemb generate dataset custom \
--input_file <path_to_nifti_file> \
--mask_dir <path_to_mask_dir> \
--threshold <threshold> \
--output_dir <path_to_output_images_dir> \
--height <image_height> \
--num_samples <number_of_samples> \
--device <device> \
--source_to_detector_distance <source_to_detector_distance> \
--pixel_size <pixel_size>
Split the generated images into training and validation sets using the following command:
rayemb generate splits \
--data_dir <path_to_data_dir> \
--type custom \
--split_ratio <split_ratio>
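The CLI's internal split logic is not shown here, but the general shape of a ratio-based split is simple to sketch (file names below are hypothetical):

```python
import random

def split_files(files, split_ratio=0.8, seed=0):
    """Deterministically shuffle, then cut the list at the ratio boundary."""
    files = sorted(files)
    random.Random(seed).shuffle(files)
    cut = int(len(files) * split_ratio)
    return files[:cut], files[cut:]

train, val = split_files([f"img_{i:04d}.png" for i in range(100)], split_ratio=0.8)
```

A fixed seed keeps the train/validation split reproducible across runs.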
Train the model using the following command:
sh scripts/train/rayemb_custom.sh <path_to_data_dir> <path_to_template_dir>
🏷️ TODO
- [x] Update the readme for evaluation, synthetic data generation and template generation.
- [x] Update the readme for training.
- [ ] Add typing and annotations to the codebase.
Contact
For any questions or collaboration inquiries, please contact shrestha.pragyan@image.iit.tsukuba.ac.jp.
Acknowledgements
We would like to thank eigenvivek for his awesome DiffDRR codebase.
Citations
If you find our work helpful for your research, please consider citing the following BibTeX entry.
@InProceedings{Shrestha_2024_ACCV,
  author    = {Shrestha, Pragyan and Xie, Chun and Yoshii, Yuichi and Kitahara, Itaru},
  title     = {RayEmb: Arbitrary Landmark Detection in X-Ray Images Using Ray Embedding Subspace},
  booktitle = {Proceedings of the Asian Conference on Computer Vision (ACCV)},
  month     = {December},
  year      = {2024},
  pages     = {665-681}
}
