Fines

Code for the paper "FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning" (NeurIPS 2025).

🚀 FineRS

Code for the paper "FineRS: Fine-grained Reasoning and Segmentation of Small Objects with Reinforcement Learning" (NeurIPS 2025). ArXiv

📊 Benchmark

👉 🔥 Our Benchmark on HuggingFace

example image

🛠️ Framework

example image

📦 Installation

# Create environment
conda create -n finers python=3.10
conda activate finers

# Project requirements
pip install -r requirements.txt

# Install PyTorch (CUDA 11.8)
pip install torch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 --index-url https://download.pytorch.org/whl/cu118

# xFormers
pip install -U xformers==0.0.29 --index-url https://download.pytorch.org/whl/cu118

# Core dependencies
pip install bitsandbytes accelerate loguru pycocotools matplotlib sam2
pip install flash-attn --no-build-isolation   # the build may take a long time; alternatively, install a prebuilt wheel from the flash-attn GitHub releases

# Editable install
pip install -e .
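
After installation, a quick sanity check can confirm that the packages are importable. The following is a minimal sketch, not part of the repository: the list of import names is an assumption derived from the pip commands above (for example, `flash-attn` is imported as `flash_attn`), and `missing_modules` is a hypothetical helper name.

```python
import importlib.util

# Import names (not pip names) of the packages installed above.
# Assumption: this list mirrors the pip commands in this README;
# it does not cover whatever requirements.txt pulls in.
REQUIRED_MODULES = [
    "torch", "torchvision", "torchaudio", "xformers", "bitsandbytes",
    "accelerate", "loguru", "pycocotools", "matplotlib", "sam2", "flash_attn",
]


def missing_modules(modules):
    """Return the module names that cannot be located in this environment."""
    missing = []
    for name in modules:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules were found.")
        import torch  # only imported once we know it is present
        print("CUDA available:", torch.cuda.is_available())
```

If `CUDA available` prints `False`, the installed PyTorch wheel probably does not match the local CUDA driver.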

🤖 Download Model

apt install git-lfs
mkdir ckpts && cd ckpts

git lfs install
git clone https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct
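
If Git LFS was not active during the clone, the weight files are small text pointers rather than real tensors, and model loading fails later with confusing errors. The sketch below detects such pointer stubs by their standard first line. It assumes the weights are stored as `*.safetensors` files directly inside the checkpoint directory; `find_lfs_pointers` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def find_lfs_pointers(ckpt_dir):
    """Return names of weight files that are still Git LFS pointer stubs."""
    stubs = []
    # Assumption: weights are *.safetensors shards at the top level.
    for path in sorted(Path(ckpt_dir).glob("*.safetensors")):
        with open(path, "rb") as f:
            head = f.read(len(LFS_POINTER_PREFIX))
        if head == LFS_POINTER_PREFIX:
            stubs.append(path.name)
    return stubs


if __name__ == "__main__":
    stubs = find_lfs_pointers("ckpts/Qwen2.5-VL-7B-Instruct")
    if stubs:
        print("Still LFS pointers (run `git lfs pull` in the repo):", stubs)
    else:
        print("No LFS pointer stubs found.")
```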

▶️ Run

1️⃣ LR Data Processing

python data_process/data_converter_fixed_512_gt_crop_random_region.py

2️⃣ LR Training

bash training_scripts/final_lr_training.sh

3️⃣ HR Data Processing (Two Methods)

3.1 Faster Method

python data_process/data_converter_fixed_1920_qa.py

3.2 Paper Method (Search-Based)

# Step 1: region search based on LR model
bash data_process/data_convert_1920_with_best_region_by_LR_model.sh

# Step 2: HR conversion
python data_process/data_converter_fixed_1920_qa_with_best_region.py

4️⃣ HR Training

bash training_scripts/final_hr_training.sh 

# Or, if you used the faster data processing from 3.1
bash training_scripts/final_hr_training_faster.sh
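
The order of steps 1 to 4 differs depending on which HR data processing method is chosen. As an illustration only, the sketch below assembles the commands listed above into an ordered list for each method. `build_pipeline` and `run_pipeline` are hypothetical helpers, not part of the repository, and the sketch assumes each script runs without extra arguments from the repository root, exactly as shown in this README.

```python
import subprocess


def build_pipeline(hr_method="paper"):
    """Return the README's commands in execution order for the chosen HR method."""
    lr_steps = [
        ["python", "data_process/data_converter_fixed_512_gt_crop_random_region.py"],
        ["bash", "training_scripts/final_lr_training.sh"],
    ]
    if hr_method == "faster":
        hr_steps = [
            ["python", "data_process/data_converter_fixed_1920_qa.py"],
            ["bash", "training_scripts/final_hr_training_faster.sh"],
        ]
    elif hr_method == "paper":
        hr_steps = [
            # The region search uses the LR model, so LR training must finish first.
            ["bash", "data_process/data_convert_1920_with_best_region_by_LR_model.sh"],
            ["python", "data_process/data_converter_fixed_1920_qa_with_best_region.py"],
            ["bash", "training_scripts/final_hr_training.sh"],
        ]
    else:
        raise ValueError(f"unknown hr_method: {hr_method!r}")
    return lr_steps + hr_steps


def run_pipeline(hr_method="paper", dry_run=True):
    """Print each command; execute it as well when dry_run is False."""
    for cmd in build_pipeline(hr_method):
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step


if __name__ == "__main__":
    run_pipeline("paper", dry_run=True)
```

With `dry_run=True` the helper only prints the commands, which is a cheap way to confirm the order before committing GPU time.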

🔄 Model Conversion (HuggingFace Format)

python3 training_scripts/model_merger.py --local_dir workdir/xxx/global_step_xxx/actor
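
The `xxx` placeholders in the path above stand for your own run name and training step. To avoid picking the step by hand, a small helper can locate the most recent checkpoint. This is a sketch under assumptions: it supposes the step directories are named `global_step_<integer>` and each contains an `actor` subdirectory, as the path pattern above suggests. `latest_actor_dir` is a hypothetical name, not part of the repository.

```python
import re
from pathlib import Path


def latest_actor_dir(run_dir):
    """Return the actor directory of the highest-numbered global_step_* folder, or None."""
    best_step, best_path = -1, None
    for child in Path(run_dir).iterdir():
        # Assumption: step folders look like global_step_<integer>.
        match = re.fullmatch(r"global_step_(\d+)", child.name)
        if match and (child / "actor").is_dir():
            step = int(match.group(1))
            if step > best_step:
                best_step, best_path = step, child / "actor"
    return best_path


if __name__ == "__main__":
    import sys
    # Usage: python find_ckpt.py <run directory under workdir>
    print(latest_actor_dir(sys.argv[1]))
```

The printed path can then be passed to `--local_dir` of `training_scripts/model_merger.py`.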

🧪 Evaluation

bash eval.sh

🔥 Our Pretrained Models for Inference

https://huggingface.co/mycfhs/FineRS/tree/main

Acknowledgement

Our repo is built on Seg-Zero, EasyR1, and veRL. We thank the authors for sharing their code.
This work uses models from Qwen2.5-VL and SAM2.
