MultiModalSearch
Multi Modal Search (Text+Image) using Tensorflow, HuggingFace in Python on Kaggle Shopee Dataset
In this repository I demonstrate multimodal (image + text) search: given a test image and its text description, find the most similar image + text pairs in a multimodal (text + image) database. I use the Kaggle Shopee dataset. A TensorFlow MobileNet CNN extracts image embeddings and a Hugging Face sentence-transformers BERT model extracts text embeddings; together they form a joint embedding search space. Given an image and its text description, I extract the joint embedding and then use a nearest neighbours algorithm to find the top 5 most similar image + text descriptions in that search space.
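The search step described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: random vectors stand in for the MobileNet and sentence-transformers outputs, the embedding sizes (1024 and 768) are assumed, concatenating L2-normalized vectors is one assumed way to build the joint embedding, and a brute-force cosine search in NumPy stands in for the nearest neighbours step. The helper names `l2_normalize`, `joint_embedding`, and `top_k_similar` are hypothetical.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit length so neither modality dominates the joint vector."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def joint_embedding(image_emb, text_emb):
    """Concatenate normalized image and text embeddings (an assumed fusion scheme)."""
    return np.hstack([l2_normalize(image_emb), l2_normalize(text_emb)])


def top_k_similar(database, query, k=5):
    """Return indices and cosine distances of the k database rows closest to the query."""
    db = l2_normalize(database)
    q = l2_normalize(query.reshape(1, -1))[0]
    distances = 1.0 - db @ q            # cosine distance to every database row
    order = np.argsort(distances)[:k]   # smallest distance first
    return order, distances[order]


rng = np.random.default_rng(0)

# Placeholder embeddings for 100 products; real ones would come from the
# image and text models. The dimensions here are assumptions.
image_emb = rng.normal(size=(100, 1024))
text_emb = rng.normal(size=(100, 768))
database = joint_embedding(image_emb, text_emb)

# Query with product 42's own image and text, so it should rank first.
query = joint_embedding(image_emb[42:43], text_emb[42:43])[0]
indices, distances = top_k_similar(database, query, k=5)
print(indices, distances)
```

With real embeddings, the brute-force search could be replaced by scikit-learn's `NearestNeighbors(n_neighbors=5, metric="cosine")`, fitted on the database matrix and queried with `kneighbors`.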
Pre-requisites
Python 3.6 https://www.python.org/downloads/release/python-360/
TensorFlow 2.0 or above https://www.tensorflow.org/install
Hugging Face transformers https://huggingface.co/transformers/
Sentence transformers https://www.sbert.net/
Kaggle Shopee dataset: https://www.kaggle.com/c/shopee-product-matching/data (download the dataset and copy it to the appropriate path)
References
MobileNet: https://arxiv.org/pdf/1704.04861.pdf
scikit-learn nearest neighbours: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.NearestNeighbors.html#sklearn.neighbors.NearestNeighbors , https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.DistanceMetric.html#sklearn.neighbors.DistanceMetric , https://scikit-learn.org/stable/modules/generated/sklearn.metrics.pairwise.paired_cosine_distances.html
BERT: http://jalammar.github.io/illustrated-bert/
