SAMExporter — SAM / SAM2 / SAM2.1 / SAM3 / MobileSAM → ONNX

Export Segment Anything, MobileSAM, Segment Anything 2 / 2.1, and Segment Anything 3 to ONNX for easy, dependency-free deployment.

Supported models:

| Model | Prompt types | Notes |
|-------|--------------|-------|
| SAM ViT-B / ViT-L / ViT-H | Point, Rectangle | Original Meta SAM |
| SAM ViT-B / ViT-L / ViT-H (quantized) | Point, Rectangle | Smaller, faster variants |
| MobileSAM | Point, Rectangle | Lightweight; fast on CPU |
| SAM2 Tiny / Small / Base+ / Large | Point, Rectangle | Meta SAM 2 |
| SAM2.1 Tiny / Small / Base+ / Large | Point, Rectangle | Improved SAM 2 |
| SAM3 ViT-H | Text, Point, Rectangle | Open-vocabulary text-driven segmentation |


Installation

Requires Python 3.11+.

pip install torch==2.10.0 torchvision==0.25.0 --index-url https://download.pytorch.org/whl/cpu
pip install samexporter

Note — Windows users: The optional onnxsim model simplifier (used during ONNX export) has no pre-built wheel for Windows. If you plan to export models and want simplification, install with:

pip install "samexporter[export]"

or enable Windows Long Path support before installing.

From source

pip install torch==2.10.0 torchvision==0.25.0 --index-url https://download.pytorch.org/whl/cpu
git clone --recurse-submodules https://github.com/vietanhdev/samexporter
cd samexporter
pip install -e .

SAM / MobileSAM — Convert to ONNX

1. Download checkpoints

Place checkpoints in original_models/:

original_models/
  sam_vit_b_01ec64.pth
  sam_vit_l_0b3195.pth
  sam_vit_h_4b8939.pth
  mobile_sam.pt

2. Export encoder

# SAM ViT-H (most accurate)
python -m samexporter.export_encoder \
    --checkpoint original_models/sam_vit_h_4b8939.pth \
    --output output_models/sam_vit_h_4b8939.encoder.onnx \
    --model-type vit_h \
    --quantize-out output_models/sam_vit_h_4b8939.encoder.quant.onnx \
    --use-preprocess

# SAM ViT-B (fastest)
python -m samexporter.export_encoder \
    --checkpoint original_models/sam_vit_b_01ec64.pth \
    --output output_models/sam_vit_b_01ec64.encoder.onnx \
    --model-type vit_b \
    --quantize-out output_models/sam_vit_b_01ec64.encoder.quant.onnx \
    --use-preprocess

3. Export decoder

python -m samexporter.export_decoder \
    --checkpoint original_models/sam_vit_h_4b8939.pth \
    --output output_models/sam_vit_h_4b8939.decoder.onnx \
    --model-type vit_h \
    --quantize-out output_models/sam_vit_h_4b8939.decoder.quant.onnx \
    --return-single-mask

Remove --return-single-mask to return multiple mask proposals.

Batch convert all SAM models:

bash convert_all_meta_sam.sh
bash convert_mobile_sam.sh

4. Run inference

python -m samexporter.inference \
    --encoder_model output_models/sam_vit_h_4b8939.encoder.onnx \
    --decoder_model output_models/sam_vit_h_4b8939.decoder.onnx \
    --image images/truck.jpg \
    --prompt images/truck_prompt.json \
    --output output_images/truck.png \
    --show

python -m samexporter.inference \
    --encoder_model output_models/sam_vit_h_4b8939.encoder.onnx \
    --decoder_model output_models/sam_vit_h_4b8939.decoder.onnx \
    --image images/plants.png \
    --prompt images/plants_prompt1.json \
    --output output_images/plants_01.png \
    --show


SAM2 / SAM2.1 — Convert to ONNX

1. Download checkpoints

cd original_models && bash download_sam2.sh

Or download manually:

original_models/
  sam2_hiera_tiny.pt
  sam2_hiera_small.pt
  sam2_hiera_base_plus.pt
  sam2_hiera_large.pt
  sam2.1_hiera_tiny.pt
  sam2.1_hiera_small.pt
  sam2.1_hiera_base_plus.pt
  sam2.1_hiera_large.pt

2. Install SAM2 PyTorch package

pip install git+https://github.com/facebookresearch/segment-anything-2.git

3. Export

# Single model example (SAM2 Tiny)
python -m samexporter.export_sam2 \
    --checkpoint original_models/sam2_hiera_tiny.pt \
    --output_encoder output_models/sam2_hiera_tiny.encoder.onnx \
    --output_decoder output_models/sam2_hiera_tiny.decoder.onnx \
    --model_type sam2_hiera_tiny

# SAM2.1 example
python -m samexporter.export_sam2 \
    --checkpoint original_models/sam2.1_hiera_tiny.pt \
    --output_encoder output_models/sam2.1_hiera_tiny.encoder.onnx \
    --output_decoder output_models/sam2.1_hiera_tiny.decoder.onnx \
    --model_type sam2.1_hiera_tiny

Batch convert all SAM2 / SAM2.1 models:

bash convert_all_meta_sam2.sh
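
The contents of the batch script are not shown here. As a rough illustration of what a batch conversion involves, the sketch below builds one export_sam2 command per checkpoint, using only the flags from the examples above. It assumes, based on those two examples, that --model_type is the checkpoint file name without its .pt extension. The helper name build_export_command is hypothetical and is not part of samexporter.

```python
import shlex
import sys
from pathlib import Path

# Checkpoint file names as listed in the "Download checkpoints" step.
SAM2_CHECKPOINTS = [
    "sam2_hiera_tiny.pt",
    "sam2_hiera_small.pt",
    "sam2_hiera_base_plus.pt",
    "sam2_hiera_large.pt",
    "sam2.1_hiera_tiny.pt",
    "sam2.1_hiera_small.pt",
    "sam2.1_hiera_base_plus.pt",
    "sam2.1_hiera_large.pt",
]


def build_export_command(checkpoint: Path, output_dir: Path) -> list[str]:
    """Build the argument list for one export (hypothetical helper)."""
    # Assumption: the model type is the file stem,
    # e.g. "sam2.1_hiera_tiny.pt" -> "sam2.1_hiera_tiny".
    model_type = checkpoint.stem
    return [
        sys.executable, "-m", "samexporter.export_sam2",
        "--checkpoint", checkpoint.as_posix(),
        "--output_encoder", (output_dir / f"{model_type}.encoder.onnx").as_posix(),
        "--output_decoder", (output_dir / f"{model_type}.decoder.onnx").as_posix(),
        "--model_type", model_type,
    ]


commands = [
    build_export_command(Path("original_models") / name, Path("output_models"))
    for name in SAM2_CHECKPOINTS
]

if __name__ == "__main__":
    # Print the commands for review; nothing is executed here.
    for cmd in commands:
        print(shlex.join(cmd))
```

To actually run the exports, each argument list could be passed to subprocess.run(cmd, check=True) once the checkpoints and the SAM2 PyTorch package are in place.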

4. Run inference

python -m samexporter.inference \
    --encoder_model output_models/sam2_hiera_tiny.encoder.onnx \
    --decoder_model output_models/sam2_hiera_tiny.decoder.onnx \
    --image images/truck.jpg \
    --prompt images/truck_prompt.json \
    --sam_variant sam2 \
    --output output_images/sam2_truck.png \
    --show


SAM3 — Convert to ONNX

SAM3 extends the SAM family with open-vocabulary, text-driven segmentation. In addition to point and rectangle prompts, it accepts text prompts (e.g., "truck", "person") to detect and segment objects without any prior training on those classes.

SAM3 exports into three separate ONNX models: an image encoder, a language (text) encoder, and a decoder.

Pre-exported ONNX models

Pre-exported models are available on Hugging Face and are downloaded automatically:

vietanhdev/segment-anything-3-onnx-models
  sam3_image_encoder.onnx  (+ .data)
  sam3_language_encoder.onnx  (+ .data)
  sam3_decoder.onnx  (+ .data)

Export from PyTorch (optional)

# Clone the SAM3 source (required for export only, not inference)
git submodule update --init sam3

# Install SAM3 dependencies
pip install osam

# Export (add --simplify for ONNX simplification, requires [export] extra on Windows)
python -m samexporter.export_sam3 \
    --output_dir output_models/sam3 \
    --opset 18

Run inference

Text-only prompt (detects all instances matching the text):

python -m samexporter.inference \
    --sam_variant sam3 \
    --encoder_model output_models/sam3/sam3_image_encoder.onnx \
    --decoder_model output_models/sam3/sam3_decoder.onnx \
    --language_encoder_model output_models/sam3/sam3_language_encoder.onnx \
    --image images/truck.jpg \
    --prompt images/truck_sam3.json \
    --text_prompt "truck" \
    --output output_images/truck_sam3.png \
    --show

Text + rectangle prompt (text guides detection, rectangle refines region):

python -m samexporter.inference \
    --sam_variant sam3 \
    --encoder_model output_models/sam3/sam3_image_encoder.onnx \
    --decoder_model output_models/sam3/sam3_decoder.onnx \
    --language_encoder_model output_models/sam3/sam3_language_encoder.onnx \
    --image images/truck.jpg \
    --prompt images/truck_sam3_box.json \
    --text_prompt "truck" \
    --output output_images/truck_sam3_box.png \
    --show

Text + point prompt:

python -m samexporter.inference \
    --sam_variant sam3 \
    --encoder_model output_models/sam3/sam3_image_encoder.onnx \
    --decoder_model output_models/sam3/sam3_decoder.onnx \
    --language_encoder_model output_models/sam3/sam3_language_encoder.onnx \
    --image images/truck.jpg \
    --prompt images/truck_sam3_point.json \
    --text_prompt "truck" \
    --output output_images/truck_sam3_point.png \
    --show

Note: Always pass --text_prompt for SAM3. Without it the model defaults to a "visual" text token and may produce zero detections.


Prompt JSON format

Prompts are JSON files containing a list of mark objects:

[
  {"type": "point",     "data": [x, y],           "label": 1},
  {"type": "rectangle", "data": [x1, y1, x2, y2]},
  {"type": "text",      "data": "object description"}
]
  • label: 1 — foreground point; label: 0 — background point
  • type: "text" is specific to SAM3 (use --text_prompt on the CLI instead for convenience)
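
A prompt file in this format can also be generated from a script. The following is a minimal sketch that checks each mark against the three shapes shown above and writes the list as JSON. The function names validate_mark and write_prompt are hypothetical, the coordinate values are made up for illustration, and the sketch assumes that a point always carries a label of 0 or 1. The checks samexporter itself performs may differ.

```python
import json
from pathlib import Path


def _is_number_list(value, length: int) -> bool:
    """True if value is a list of exactly `length` numbers."""
    return (
        isinstance(value, list)
        and len(value) == length
        and all(isinstance(v, (int, float)) for v in value)
    )


def validate_mark(mark: dict) -> None:
    """Raise ValueError if a mark does not match the format shown above."""
    kind = mark.get("type")
    data = mark.get("data")
    if kind == "point":
        if not _is_number_list(data, 2):
            raise ValueError("point data must be [x, y]")
        # Assumption: every point has a label, 1 = foreground, 0 = background.
        if mark.get("label") not in (0, 1):
            raise ValueError("point label must be 0 or 1")
    elif kind == "rectangle":
        if not _is_number_list(data, 4):
            raise ValueError("rectangle data must be [x1, y1, x2, y2]")
    elif kind == "text":
        if not isinstance(data, str) or not data:
            raise ValueError("text data must be a non-empty string")
    else:
        raise ValueError(f"unknown mark type: {kind!r}")


def write_prompt(path, marks: list[dict]) -> None:
    """Validate all marks, then write them as a JSON list."""
    for mark in marks:
        validate_mark(mark)
    Path(path).write_text(json.dumps(marks, indent=2), encoding="utf-8")


if __name__ == "__main__":
    # Illustrative values only; real coordinates depend on the image.
    write_prompt(
        "example_prompt.json",
        [
            {"type": "point", "data": [100, 200], "label": 1},
            {"type": "rectangle", "data": [10, 20, 300, 400]},
        ],
    )
```

The resulting file can then be passed to the inference command through --prompt.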

Tips

  • Use quantized models (*.quant.onnx) for faster inference and smaller file size with minimal accuracy loss.
  • SAM ViT-B is the fastest SAM1 variant; SAM ViT-H is the most accurate.
  • SAM2 Tiny / SAM2.1 Tiny are good CPU-friendly choices for SAM2.
  • SAM3 is slower due to its three-model pipeline but uniquely supports natural-language object queries.
  • Run the encoder once per image; the lightweight decoder handles prompt changes in real time.
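
The last tip, encoding once and decoding many times, can be sketched as a small cache. The code below is an illustration of the pattern only: EmbeddingCache, fake_encode, and fake_decode are hypothetical names, and the two stand-in functions take the place of real ONNX encoder and decoder calls so that the example runs without any model files.

```python
from typing import Any, Callable, Hashable


class EmbeddingCache:
    """Run the (expensive) encoder once per image and reuse its output."""

    def __init__(
        self,
        encode: Callable[[Any], Any],
        decode: Callable[[Any, Any], Any],
    ) -> None:
        self._encode = encode
        self._decode = decode
        self._key: Hashable = None
        self._embedding: Any = None
        self._has_image = False
        self.encoder_calls = 0  # counter kept only to make the saving visible

    def set_image(self, key: Hashable, image: Any) -> None:
        """Encode the image unless the same key is already cached."""
        if self._has_image and key == self._key:
            return
        self._embedding = self._encode(image)
        self._key = key
        self._has_image = True
        self.encoder_calls += 1

    def predict(self, prompt: Any) -> Any:
        """Run only the decoder against the cached embedding."""
        if not self._has_image:
            raise RuntimeError("call set_image() before predict()")
        return self._decode(self._embedding, prompt)


def fake_encode(image: list[int]) -> int:
    """Stand-in for the encoder model (not a real embedding)."""
    return sum(image)


def fake_decode(embedding: int, prompt: dict) -> tuple[int, str]:
    """Stand-in for the decoder model (not a real mask)."""
    return embedding, prompt["type"]


if __name__ == "__main__":
    cache = EmbeddingCache(fake_encode, fake_decode)
    cache.set_image("truck", [1, 2, 3])
    cache.predict({"type": "point", "data": [100, 200], "label": 1})
    cache.predict({"type": "rectangle", "data": [10, 20, 300, 400]})
    print(cache.encoder_calls)  # the encoder ran once for two prompts
```

In an interactive tool, set_image would be called when the user opens an image and predict on every click or drag, so only the decoder runs in the interaction loop.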

Running tests

pip install pytest
pytest tests/

AnyLabeling

This package was originally develope
