# DeepImageSearch
DeepImageSearch is a Python library for fast and accurate image search, with GPU support and Vision Transformer-based models for identifying complex image patterns.
DeepImageSearch is a Python library for building AI-powered image search systems. It supports text-to-image search, image-to-image search, hybrid search, and LLM-powered captioning using CLIP/SigLIP/EVA-CLIP multimodal embeddings with FAISS/ChromaDB/Qdrant vector indexing. Built for the agentic RAG era with MCP server, LangChain tool, and PostgreSQL metadata storage out of the box.
## Features
- Text-to-Image Search -- find images using natural language queries like "a red car parked near a lake"
- Image-to-Image Search -- find visually similar images from a query image
- Hybrid Search -- combine text and image queries with weighted fusion
- Multimodal Embeddings -- CLIP, SigLIP, EVA-CLIP via open_clip, plus 500+ legacy timm models
- LLM Captioning -- auto-generate image captions using any OpenAI SDK-compatible provider
- Image Records -- every image tracked with ID, index, name, path, caption, timestamp (like a database)
- Multiple Vector Stores -- FAISS (default), ChromaDB, Qdrant with metadata filtering
- Metadata Storage -- local JSON (default) or PostgreSQL for production
- Agentic Integration -- MCP server for Claude, LangChain tool for agent pipelines
- GPU & CPU Support -- auto-detects CUDA, MPS (Apple Silicon), or CPU
- Modern Packaging -- uv/pip compatible via pyproject.toml, Python 3.10+
## Installation

### From PyPI (stable release)

```bash
pip install DeepImageSearch --upgrade
```

### From GitHub (latest v3)

```bash
pip install git+https://github.com/TechyNilesh/DeepImageSearch.git
```

Or with uv (recommended):

```bash
uv pip install git+https://github.com/TechyNilesh/DeepImageSearch.git
```

With optional extras from GitHub:

```bash
pip install "DeepImageSearch[all] @ git+https://github.com/TechyNilesh/DeepImageSearch.git"
```
### Optional Extras

```bash
pip install "DeepImageSearch[llm]"        # LLM captioning (OpenAI SDK)
pip install "DeepImageSearch[chroma]"     # ChromaDB vector store
pip install "DeepImageSearch[qdrant]"     # Qdrant vector store
pip install "DeepImageSearch[postgres]"   # PostgreSQL metadata store
pip install "DeepImageSearch[mcp]"        # MCP server for Claude
pip install "DeepImageSearch[langchain]"  # LangChain agent tool
pip install "DeepImageSearch[all]"        # Everything
```

If using a GPU, uninstall `faiss-cpu` and install `faiss-gpu` instead.
## Quick Start

```python
from DeepImageSearch import SearchEngine

engine = SearchEngine(model_name="clip-vit-b-32")

# Index from a folder or list of paths
engine.index("./photos")
engine.index(["img1.jpg", "img2.jpg", "img3.jpg"])

# Text search
results = engine.search("a sunset over mountains")

# Image search
results = engine.search("query.jpg")

# Hybrid search
results = engine.search("outdoor scene", image_query="photo.jpg", mode="hybrid")

# Filtered search
results = engine.search("red car", filters={"source": "instagram"})

# Plot results
engine.plot_similar_images("query.jpg", number_of_images=9)
```
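The hybrid mode above combines a text query and an image query into one ranking. The library's internal fusion isn't shown in this README; a common approach is a weighted sum of per-image similarity scores, sketched below (the `fuse_scores` helper and the 0.5 weights are illustrative, not the library's API):

```python
def fuse_scores(text_scores, image_scores, text_weight=0.5):
    """Weighted-sum fusion of per-image similarity scores.

    text_scores / image_scores: dicts mapping image id -> similarity.
    An image missing from one modality contributes 0 for that modality.
    """
    ids = set(text_scores) | set(image_scores)
    fused = {
        i: text_weight * text_scores.get(i, 0.0)
           + (1 - text_weight) * image_scores.get(i, 0.0)
        for i in ids
    }
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

ranked = fuse_scores({"a": 0.9, "b": 0.4}, {"a": 0.2, "b": 0.8})
```

Raising `text_weight` biases the ranking toward the natural-language query; lowering it biases toward visual similarity.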
### Image-to-Image Search

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/01_image_to_image_search.png" alt="Image to Image Search" width="700"></p>

### Text-to-Image Search

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/02_text_to_image_search.png" alt="Text to Image Search" width="800"></p>

### Search Mode Comparison (Image vs Text vs Hybrid)

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/03_search_mode_comparison.png" alt="Search Mode Comparison" width="800"></p>

## Search Results
Each result contains full image identity -- you always know which image matched:
```json
{
  "id": "a1b2c3...",
  "score": 0.87,
  "metadata": {
    "image_id": "a1b2c3...",
    "image_index": 42,
    "image_name": "sunset_042.jpg",
    "image_path": "/data/photos/sunset_042.jpg",
    "caption": "A sunset over mountains with orange sky",
    "indexed_at": "2026-03-28T10:30:00+00:00"
  }
}
```
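Because each result carries its metadata inline, downstream code can filter and map results without extra lookups. A small illustration using plain dicts in the shape above (the sample data is made up):

```python
# Two results in the documented shape (sample values, not real output)
results = [
    {"id": "a1b2c3", "score": 0.87,
     "metadata": {"image_name": "sunset_042.jpg",
                  "image_path": "/data/photos/sunset_042.jpg"}},
    {"id": "d4e5f6", "score": 0.61,
     "metadata": {"image_name": "beach_007.jpg",
                  "image_path": "/data/photos/beach_007.jpg"}},
]

# Keep confident matches and collect their file paths
confident = [r for r in results if r["score"] >= 0.7]
paths = [r["metadata"]["image_path"] for r in confident]
```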
## Image Records

Every indexed image is tracked as a structured record (maps directly to SQL):

```python
records = engine.get_records()           # all records
record = engine.get_record("a1b2c3...")  # by ID
print(engine.count)                      # total indexed
print(engine.info())                     # engine summary
```
## LLM Captioning

Auto-generate image captions using any OpenAI SDK-compatible provider. Just pass `model`, `api_key`, and `base_url`:

```python
from DeepImageSearch import SearchEngine

engine = SearchEngine(
    model_name="clip-vit-l-14",
    captioner_model="your-model-name",
    captioner_api_key="your-api-key",
    captioner_base_url="https://your-provider.com/v1",
)

engine.index("./photos", generate_captions=True)
results = engine.search("person holding umbrella")
```
Works with OpenAI, Google Gemini, Anthropic Claude, Ollama, Together AI, Groq, vLLM, or any OpenAI SDK-compatible endpoint.
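These providers are interchangeable because they all accept the OpenAI chat-completions request format with an inline base64 image. As a rough sketch of what such a captioning request looks like on the wire (this only builds the payload dict and sends nothing; the model name and prompt are placeholders, not the library's internals):

```python
import base64

def caption_payload(image_bytes: bytes, model: str = "your-model-name") -> dict:
    """Build an OpenAI-style chat-completions payload for image captioning."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }

payload = caption_payload(b"\xff\xd8\xff")  # tiny fake JPEG header stands in for a real file
```

Since every listed provider speaks this format, switching providers is just a matter of changing `base_url`, `api_key`, and the model name.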
## Vector Stores

```python
# FAISS (default)
engine = SearchEngine(model_name="clip-vit-b-32")

# ChromaDB
engine = SearchEngine(model_name="clip-vit-b-32", vector_store="chroma")

# Qdrant
engine = SearchEngine(model_name="clip-vit-b-32", vector_store="qdrant")
```
## Metadata Storage

Image records are stored locally in `image_records.json` by default. For production, use PostgreSQL:

```python
from DeepImageSearch import SearchEngine
from DeepImageSearch.metadatastore.postgres_store import PostgresMetadataStore

store = PostgresMetadataStore(
    connection_string="postgresql://user:pass@localhost:5432/mydb"
)
engine = SearchEngine(model_name="clip-vit-b-32", metadata_store=store)
engine.index("./photos")  # records go to PostgreSQL, vectors go to FAISS
```

You can implement your own backend by subclassing `BaseMetadataStore`.
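Conceptually, a metadata backend just persists and retrieves the image records described above, keyed by image ID. The exact `BaseMetadataStore` contract isn't reproduced in this README, so the dependency-free sketch below only illustrates that idea; the class and method names are hypothetical, and a real backend would subclass `BaseMetadataStore` instead:

```python
from datetime import datetime, timezone

class InMemoryMetadataStore:
    """Toy metadata backend keyed by image_id, mirroring the record fields."""

    def __init__(self):
        self._records = {}

    def add(self, image_id, image_index, image_name, image_path, caption=None):
        # Store one record with the same fields shown in "Search Results"
        self._records[image_id] = {
            "image_id": image_id,
            "image_index": image_index,
            "image_name": image_name,
            "image_path": image_path,
            "caption": caption,
            "indexed_at": datetime.now(timezone.utc).isoformat(),
        }

    def get(self, image_id):
        return self._records.get(image_id)

    def count(self):
        return len(self._records)

store = InMemoryMetadataStore()
store.add("a1b2c3", 42, "sunset_042.jpg", "/data/photos/sunset_042.jpg")
```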
## Embedding Presets
| Preset | Model | Text Search | Best For |
|---|---|---|---|
| clip-vit-b-32 | CLIP ViT-B/32 | Yes | Fast, general purpose |
| clip-vit-b-16 | CLIP ViT-B/16 | Yes | Better accuracy |
| clip-vit-l-14 | CLIP ViT-L/14 | Yes | High accuracy |
| clip-vit-l-14-336 | CLIP ViT-L/14@336 | Yes | Highest accuracy |
| siglip-vit-b-16 | SigLIP ViT-B/16 | Yes | Improved zero-shot |
| clip-vit-bigg-14 | CLIP ViT-bigG/14 | Yes | Maximum quality |
| vgg19 | VGG-19 (timm) | No | Legacy, image-only |
| resnet50 | ResNet-50 (timm) | No | Legacy, image-only |
Any timm model name also works for image-only search.
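The presets marked "Yes" for text search map text and images into a shared embedding space, where search reduces to cosine similarity between vectors; legacy timm models embed images only, so they support image-to-image search alone. A dependency-free reminder of the ranking step (the two-dimensional vectors are toy values, real CLIP embeddings have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

query = [1.0, 0.0]                          # e.g. an embedded text query
images = {"sunset.jpg": [0.9, 0.1],         # toy image embeddings
          "cat.jpg": [0.1, 0.9]}
best = max(images, key=lambda name: cosine(query, images[name]))
```

FAISS, ChromaDB, and Qdrant exist to make exactly this nearest-neighbor step fast over millions of vectors.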
## Agentic Integration

### MCP Server

Expose your image index as a tool for Claude:

```bash
deep-image-search-mcp --index-path ./my_index --model clip-vit-l-14
```

Claude Desktop config:

```json
{
  "mcpServers": {
    "image-search": {
      "command": "deep-image-search-mcp",
      "args": ["--index-path", "./my_index"]
    }
  }
}
```

### LangChain Tool

```python
from DeepImageSearch.agents.langchain_tool import create_langchain_tool

tool = create_langchain_tool(index_path="./my_index")
```

### Generic Tool

```python
from DeepImageSearch import ImageSearchTool

tool = ImageSearchTool(index_path="./my_index")
results = tool("a photo of a dog", k=5)
```
## Advanced Usage

For full control, use core modules directly:

```python
from DeepImageSearch.core.embeddings import EmbeddingManager
from DeepImageSearch.core.indexer import Indexer
from DeepImageSearch.core.searcher import Searcher
from DeepImageSearch.core.captioner import Captioner
from DeepImageSearch.vectorstores.faiss_store import FAISSStore
from DeepImageSearch.metadatastore.json_store import JsonMetadataStore

embedding = EmbeddingManager.create("clip-vit-l-14", device="cuda")
store = FAISSStore(dimension=embedding.dimension, index_type="hnsw")
metadata = JsonMetadataStore()
captioner = Captioner(
    model="your-model",
    api_key="your-key",
    base_url="https://your-provider.com/v1",
)

indexer = Indexer(embedding=embedding, vector_store=store, metadata_store=metadata, captioner=captioner)
searcher = Searcher(embedding=embedding, vector_store=store)

indexer.index(image_paths, generate_captions=True)
results = searcher.search_by_text("sunset photo")
```
## Backward Compatibility (v2 API)

Existing v2 code continues to work:

```python
from DeepImageSearch import Load_Data, Search_Setup

image_list = Load_Data().from_folder(["folder_path"])
st = Search_Setup(image_list=image_list, model_name="vgg19", pretrained=True)
st.run_index()
st.get_similar_images(image_path="query.jpg", number_of_images=10)
st.plot_similar_images(image_path="query.jpg", number_of_images=9)
```
## Architecture

```
DeepImageSearch/
├── core/
│   ├── embeddings.py   # CLIP/SigLIP/EVA-CLIP + timm backends
│   ├── indexer.py      # Batch indexing pipeline
│   ├── searcher.py     # Text/image/hybrid search + plotting
│   └
```