# DeepImageSearch
DeepImageSearch is a Python library for fast and accurate image search, with GPU support and Vision Transformer-based models for identifying complex image patterns.
DeepImageSearch is a Python library for building AI-powered image search systems. It supports text-to-image search, image-to-image search, hybrid search, and LLM-powered captioning using CLIP/SigLIP/EVA-CLIP multimodal embeddings with FAISS/ChromaDB/Qdrant vector indexing. Built for the agentic RAG era with MCP server, LangChain tool, and PostgreSQL metadata storage out of the box.
## Features
- Text-to-Image Search -- find images using natural language queries like "a red car parked near a lake"
- Image-to-Image Search -- find visually similar images from a query image
- Hybrid Search -- combine text and image queries with weighted fusion
- Multimodal Embeddings -- CLIP, SigLIP, EVA-CLIP via open_clip, plus 500+ legacy timm models
- LLM Captioning -- auto-generate image captions using any OpenAI SDK-compatible provider
- Image Records -- every image tracked with ID, index, name, path, caption, timestamp (like a database)
- Multiple Vector Stores -- FAISS (default), ChromaDB, Qdrant with metadata filtering
- Metadata Storage -- local JSON (default) or PostgreSQL for production
- Agentic Integration -- MCP server for Claude, LangChain tool for agent pipelines
- GPU & CPU Support -- auto-detects CUDA, MPS (Apple Silicon), or CPU
- Modern Packaging -- uv/pip compatible via pyproject.toml, Python 3.10+
## Installation

### From PyPI (stable release)

```bash
pip install DeepImageSearch --upgrade
```

### From GitHub (latest v3)

```bash
pip install git+https://github.com/TechyNilesh/DeepImageSearch.git
```

Or with uv (recommended):

```bash
uv pip install git+https://github.com/TechyNilesh/DeepImageSearch.git
```

With optional extras from GitHub:

```bash
pip install "DeepImageSearch[all] @ git+https://github.com/TechyNilesh/DeepImageSearch.git"
```
### Optional Extras

```bash
pip install "DeepImageSearch[llm]"        # LLM captioning (OpenAI SDK)
pip install "DeepImageSearch[chroma]"     # ChromaDB vector store
pip install "DeepImageSearch[qdrant]"     # Qdrant vector store
pip install "DeepImageSearch[postgres]"   # PostgreSQL metadata store
pip install "DeepImageSearch[mcp]"        # MCP server for Claude
pip install "DeepImageSearch[langchain]"  # LangChain agent tool
pip install "DeepImageSearch[all]"        # Everything
```

If using a GPU, uninstall `faiss-cpu` and install `faiss-gpu` instead.
## Quick Start

```python
from DeepImageSearch import SearchEngine

engine = SearchEngine(model_name="clip-vit-b-32")

# Index from a folder or list of paths
engine.index("./photos")
engine.index(["img1.jpg", "img2.jpg", "img3.jpg"])

# Text search
results = engine.search("a sunset over mountains")

# Image search
results = engine.search("query.jpg")

# Hybrid search
results = engine.search("outdoor scene", image_query="photo.jpg", mode="hybrid")

# Filtered search
results = engine.search("red car", filters={"source": "instagram"})

# Plot results
engine.plot_similar_images("query.jpg", number_of_images=9)
```
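The hybrid mode above combines a text query and an image query into one ranking. The library's internal fusion isn't shown in this README; a common approach is a weighted sum of per-image similarity scores, sketched below (the `fuse_scores` helper and the 0.5 weights are illustrative, not the library's API):

```python
def fuse_scores(text_scores, image_scores, text_weight=0.5):
    """Weighted-sum fusion of per-image similarity scores.

    text_scores / image_scores: dicts mapping image id -> similarity.
    An image missing from one modality contributes 0 for that modality.
    """
    ids = set(text_scores) | set(image_scores)
    fused = {
        i: text_weight * text_scores.get(i, 0.0)
           + (1 - text_weight) * image_scores.get(i, 0.0)
        for i in ids
    }
    # Highest fused score first
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

ranked = fuse_scores({"a": 0.9, "b": 0.4}, {"a": 0.2, "b": 0.8})
```

Raising `text_weight` biases the ranking toward the natural-language query; lowering it biases toward visual similarity.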
### Image-to-Image Search

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/01_image_to_image_search.png" alt="Image to Image Search" width="700"></p>

### Text-to-Image Search

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/02_text_to_image_search.png" alt="Text to Image Search" width="800"></p>

### Search Mode Comparison (Image vs Text vs Hybrid)

<p align="center"><img src="https://raw.githubusercontent.com/TechyNilesh/DeepImageSearch/main/images/screenshots/03_search_mode_comparison.png" alt="Search Mode Comparison" width="800"></p>

## Search Results
Each result contains full image identity -- you always know which image matched:
```json
{
  "id": "a1b2c3...",
  "score": 0.87,
  "metadata": {
    "image_id": "a1b2c3...",
    "image_index": 42,
    "image_name": "sunset_042.jpg",
    "image_path": "/data/photos/sunset_042.jpg",
    "caption": "A sunset over mountains with orange sky",
    "indexed_at": "2026-03-28T10:30:00+00:00"
  }
}
```
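Because each result carries its metadata inline, downstream code can filter and map results without extra lookups. A small illustration using plain dicts in the shape above (the sample data is made up):

```python
# Two results in the documented shape (sample values, not real output)
results = [
    {"id": "a1b2c3", "score": 0.87,
     "metadata": {"image_name": "sunset_042.jpg",
                  "image_path": "/data/photos/sunset_042.jpg"}},
    {"id": "d4e5f6", "score": 0.61,
     "metadata": {"image_name": "beach_007.jpg",
                  "image_path": "/data/photos/beach_007.jpg"}},
]

# Keep confident matches and collect their file paths
confident = [r for r in results if r["score"] >= 0.7]
paths = [r["metadata"]["image_path"] for r in confident]
```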
## Image Records

Every indexed image is tracked as a structured record (maps directly to SQL):

```python
records = engine.get_records()           # all records
record = engine.get_record("a1b2c3...")  # by ID
print(engine.count)                      # total indexed
print(engine.info())                     # engine summary
```
## LLM Captioning

Auto-generate image captions using any OpenAI SDK-compatible provider. Just pass `model`, `api_key`, and `base_url`:

```python
from DeepImageSearch import SearchEngine

engine = SearchEngine(
    model_name="clip-vit-l-14",
    captioner_model="your-model-name",
    captioner_api_key="your-api-key",
    captioner_base_url="https://your-provider.com/v1",
)

engine.index("./photos", generate_captions=True)
results = engine.search("person holding umbrella")
```
Works with OpenAI, Google Gemini, Anthropic Claude, Ollama, Together AI, Groq, vLLM, or any OpenAI SDK-compatible endpoint.
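These providers are interchangeable because they all accept the OpenAI chat-completions request format with an inline base64 image. As a rough sketch of what such a captioning request looks like on the wire (this only builds the payload dict and sends nothing; the model name and prompt are placeholders, not the library's internals):

```python
import base64

def caption_payload(image_bytes: bytes, model: str = "your-model-name") -> dict:
    """Build an OpenAI-style chat-completions payload for image captioning."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in one sentence."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    }

payload = caption_payload(b"\xff\xd8\xff")  # tiny fake JPEG header stands in for a real file
```

Since every listed provider speaks this format, switching providers is just a matter of changing `base_url`, `api_key`, and the model name.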
## Vector Stores

```python
# FAISS (default)
engine = SearchEngine(model_name="clip-vit-b-32")

# ChromaDB
engine = SearchEngine(model_name="clip-vit-b-32", vector_store="chroma")

# Qdrant
engine = SearchEngine(model_name="clip-vit-b-32", vector_store="qdrant")
```
## Metadata Storage

Image records are stored locally in `image_records.json` by default. For production, use PostgreSQL:

```python
from DeepImageSearch import SearchEngine
from DeepImageSearch.metadatastore.postgres_store import PostgresMetadataStore

store = PostgresMetadataStore(
    connection_string="postgresql://user:pass@localhost:5432/mydb"
)
engine = SearchEngine(model_name="clip-vit-b-32", metadata_store=store)
engine.index("./photos")  # records go to PostgreSQL, vectors go to FAISS
```

You can implement your own backend by subclassing `BaseMetadataStore`.
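Conceptually, a metadata backend just persists and retrieves the image records described above, keyed by image ID. The exact `BaseMetadataStore` contract isn't reproduced in this README, so the dependency-free sketch below only illustrates that idea; the class and method names are hypothetical, and a real backend would subclass `BaseMetadataStore` instead:

```python
from datetime import datetime, timezone

class InMemoryMetadataStore:
    """Toy metadata backend keyed by image_id, mirroring the record fields."""

    def __init__(self):
        self._records = {}

    def add(self, image_id, image_index, image_name, image_path, caption=None):
        # Store one record with the same fields shown in "Search Results"
        self._records[image_id] = {
            "image_id": image_id,
            "image_index": image_index,
            "image_name": image_name,
            "image_path": image_path,
            "caption": caption,
            "indexed_at": datetime.now(timezone.utc).isoformat(),
        }

    def get(self, image_id):
        return self._records.get(image_id)

    def count(self):
        return len(self._records)

store = InMemoryMetadataStore()
store.add("a1b2c3", 42, "sunset_042.jpg", "/data/photos/sunset_042.jpg")
```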
## Embedding Presets
| Preset | Model | Text Search | Best For |
|---|---|---|---|
| clip-vit-b-32 | CLIP ViT-B/32 | Yes | Fast, general purpose |
| clip-vit-b-16 | CLIP ViT-B/16 | Yes | Better accuracy |
| clip-vit-l-14 | CLIP ViT-L/14 | Yes | High accuracy |
| clip-vit-l-14-336 | CLIP ViT-L/14@336 | Yes | Highest accuracy |
| siglip-vit-b-16 | SigLIP ViT-B/16 | Yes | Improved zero-shot |
| clip-vit-bigg-14 | CLIP ViT-bigG/14 | Yes | Maximum quality |
| vgg19 | VGG-19 (timm) | No | Legacy, image-only |
| resnet50 | ResNet-50 (timm) | No | Legacy, image-only |
Any timm model name also works for image-only search.
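The presets marked "Yes" for text search map text and images into a shared embedding space, where search reduces to cosine similarity between vectors; legacy timm models embed images only, so they support image-to-image search alone. A dependency-free reminder of the ranking step (the two-dimensional vectors are toy values, real CLIP embeddings have hundreds of dimensions):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

query = [1.0, 0.0]                          # e.g. an embedded text query
images = {"sunset.jpg": [0.9, 0.1],         # toy image embeddings
          "cat.jpg": [0.1, 0.9]}
best = max(images, key=lambda name: cosine(query, images[name]))
```

FAISS, ChromaDB, and Qdrant exist to make exactly this nearest-neighbor step fast over millions of vectors.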
## Agentic Integration

### MCP Server

Expose your image index as a tool for Claude:

```bash
deep-image-search-mcp --index-path ./my_index --model clip-vit-l-14
```

Claude Desktop config:

```json
{
  "mcpServers": {
    "image-search": {
      "command": "deep-image-search-mcp",
      "args": ["--index-path", "./my_index"]
    }
  }
}
```

### LangChain Tool

```python
from DeepImageSearch.agents.langchain_tool import create_langchain_tool

tool = create_langchain_tool(index_path="./my_index")
```

### Generic Tool

```python
from DeepImageSearch import ImageSearchTool

tool = ImageSearchTool(index_path="./my_index")
results = tool("a photo of a dog", k=5)
```
## Advanced Usage

For full control, use core modules directly:

```python
from DeepImageSearch.core.embeddings import EmbeddingManager
from DeepImageSearch.core.indexer import Indexer
from DeepImageSearch.core.searcher import Searcher
from DeepImageSearch.core.captioner import Captioner
from DeepImageSearch.vectorstores.faiss_store import FAISSStore
from DeepImageSearch.metadatastore.json_store import JsonMetadataStore

embedding = EmbeddingManager.create("clip-vit-l-14", device="cuda")
store = FAISSStore(dimension=embedding.dimension, index_type="hnsw")
metadata = JsonMetadataStore()
captioner = Captioner(
    model="your-model",
    api_key="your-key",
    base_url="https://your-provider.com/v1",
)

indexer = Indexer(embedding=embedding, vector_store=store, metadata_store=metadata, captioner=captioner)
searcher = Searcher(embedding=embedding, vector_store=store)

indexer.index(image_paths, generate_captions=True)
results = searcher.search_by_text("sunset photo")
```
## Backward Compatibility (v2 API)

Existing v2 code continues to work:

```python
from DeepImageSearch import Load_Data, Search_Setup

image_list = Load_Data().from_folder(["folder_path"])
st = Search_Setup(image_list=image_list, model_name="vgg19", pretrained=True)
st.run_index()
st.get_similar_images(image_path="query.jpg", number_of_images=10)
st.plot_similar_images(image_path="query.jpg", number_of_images=9)
```
## Architecture

```
DeepImageSearch/
├── core/
│   ├── embeddings.py   # CLIP/SigLIP/EVA-CLIP + timm backends
│   ├── indexer.py      # Batch indexing pipeline
│   ├── searcher.py     # Text/image/hybrid search + plotting
│   └
```