jina-ai / Clip As Service: 🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
mazzzystar / Queryable: Run OpenAI's CLIP and Apple's MobileCLIP models on iOS to search photos (a sketch of the underlying text-to-image retrieval pattern follows this list).
michaelfeil / Infinity: Infinity is a high-throughput, low-latency serving engine for text embeddings, reranking models, CLIP, CLAP, and ColPali.
facebookresearch / Perception Models: State-of-the-art image and video CLIP, multimodal large language models, and more!
SunzeY / AlphaCLIP: [CVPR 2024] Alpha-CLIP: A CLIP Model Focusing on Wherever You Want
AndreyGuzhov / AudioCLIP: Source code for the models described in the paper "AudioCLIP: Extending CLIP to Image, Text and Audio" (https://arxiv.org/abs/2106.13043)
LAION-AI / CLIP Benchmark: CLIP-like model evaluation
zhengli97 / Awesome Prompt Adapter Learning For VLMs: A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
moein-shariatnia / OpenAI CLIP: A simple implementation of OpenAI's CLIP model in PyTorch.
ljwztc / CLIP Driven Universal Model: [ICCV 2023] CLIP-Driven Universal Model; ranked first in the MSD competition.
v-iashin / Video Features: Extract video features from raw videos using multiple GPUs. Supports RAFT flow frames as well as S3D, I3D, R(2+1)D, VGGish, CLIP, and TIMM models.
microsoft / LLM2CLIP: LLM2CLIP significantly improves already state-of-the-art CLIP models.
slavabarkov / Tidy: Offline semantic text-to-image and image-to-image search on Android, powered by a quantized state-of-the-art pretrained vision-language CLIP model and the ONNX Runtime inference engine.
patrickjohncyh / Fashion Clip: FashionCLIP is a CLIP-like model fine-tuned for the fashion domain.
greyovo / PicQuery: 🔍 Search local images with natural language on Android, powered by OpenAI's CLIP model.
johanmodin / Clifs: Contrastive Language-Image Forensic Search allows free-text search through videos using OpenAI's CLIP model.
Syliz517 / CLIP ReID: Official implementation of "CLIP-ReID: Exploiting Vision-Language Model for Image Re-identification without Concrete Text Labels" (AAAI 2023).
Meituan-Dianping / Vision Ui: A visual UI analysis tool.
NasirKhalid24 / CLIP Mesh: Official implementation of CLIP-Mesh: Generating textured meshes from text using pretrained image-text models.
PathologyFoundation / Plip: Pathology Language and Image Pre-Training (PLIP) is the first vision-and-language foundation model for pathology AI (Nature Medicine). It is a large-scale pretrained model, fine-tuned from the original CLIP, that extracts visual and language features from pathology images and text descriptions.
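
Several of the search tools above (Queryable, Tidy, PicQuery, Clifs) share the same underlying pattern: embed a text query and a set of images into CLIP's joint space, then rank the images by similarity to the query. Below is a minimal sketch of that pattern using the Hugging Face transformers CLIP wrappers; the image file names are placeholders, not files from any of the listed projects.

```python
# Minimal CLIP text-to-image retrieval sketch (placeholder image files).
from PIL import Image
import torch
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Load candidate images (hypothetical local files).
images = [Image.open(name) for name in ("cat.jpg", "beach.jpg", "receipt.jpg")]

# Encode the text query and all images in one batch.
inputs = processor(
    text=["a photo of a cat"], images=images, return_tensors="pt", padding=True
)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_text holds the similarity of the query against each image;
# the highest-scoring image is the best match.
best = outputs.logits_per_text.argmax(dim=-1).item()
print(f"best match: image index {best}")
```

The on-device apps in the list apply the same idea at scale: image embeddings are computed once (often with a quantized model, as in Tidy) and cached, so each query only needs one text-encoder pass plus a nearest-neighbor search.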