Govind-S-B / Pdf To Text Chroma Search: Python scripts that convert PDF files to text, split them into chunks, and store their vector representations in a Chroma DB using GPT4All embeddings. A separate script queries the Chroma DB for similarity search based on user input.
SAP-samples / Cap AI Vector Engine Sample: This repository contains sample code for a CAP application that uses the CAP-LLM-Plugin to connect to SAP AI Core and SAP HANA Cloud for creating and storing vector embeddings, performing similarity searches, and requesting RAG responses.
botirk38 / Semanticcache: A Go library for semantic caching with LRU eviction, supporting vector-based similarity search with pluggable embedding backends (local or cloud).
jstrosch / Graph Maldoc Similar Images: A script that extracts embedded images from Office Open XML (OOXML) documents and generates image hash similarity graphs that cluster visually similar images together. The script computes the Average Hash of each extracted image, then graphs the images that meet the similarity threshold. It can be used as a technique for visually identifying malware campaigns involving documents. To use the script, supply a directory containing OOXML files. If LibreOffice is in your PATH, you can optionally convert non-OOXML Word, Excel, PowerPoint, and Rich Text Format documents to OOXML. The script outputs DOT files that can be exported as images using Graphviz. If Graphviz is in your PATH, you can also export to an SVG (preferred) or PNG image.
ViCCo-Group / SPoSE: Sparse Positive Object Similarity Embedding(s)
avidale / Dependency Paraphraser: A sentence paraphraser based on dependency parsing and word embedding similarity.
tahmedge / CETE LREC: Source code of the paper "Contextualized Embeddings based Transformer Encoder for Sentence Similarity Modeling in Answer Selection Task" published in LREC 2020.
rustyneuron01 / Conversation Genome Project: Structured data & semantic tagging pipeline. Turns raw text (conversations, web pages, surveys) into tagged data for AI and search. Coordinators set ground truth; workers run LLM inference on windows. Scoring via cosine similarity. Python, FastAPI, OpenAI/Anthropic/OpenRouter, embeddings, Docker.
code-kern-ai / Embedders: With embedders, you can easily convert your texts into sentence- or token-level embeddings within a few lines of code. Use cases for this include similarity search between texts, information extraction such as named entity recognition, or basic text classification.
dongfang91 / Text Similarity: Text similarity using BERT sentence embeddings
qtwang / SEAnet: KDD21 Deep Learning Embeddings for Data Series Similarity Search
totogot / ImageSimilarity: A repository for converting images to feature embeddings in order to assess image similarity.
maovshao / PLMAlign: PLMAlign utilizes per-residue embeddings as input to obtain specific alignments and more refined similarity
lingtengqiu / Facial Expression Similarity: This project aims to provide a fast, modular reference implementation of "A Compact Embedding for Facial Expression Similarity" models using PyTorch.
VioletCranberry / Coco Search: Local-first hybrid semantic code search tool. Indexes codebases into PostgreSQL with pgvector embeddings via Ollama, and combines vector similarity and keyword search using RRF fusion. Supports 30+ languages. Features a CLI, MCP server, web dashboard, and interactive REPL.
mavalliani / Semantic Similarity Of Sentences: Methods used: cosine similarity with GloVe, Smooth Inverse Frequency, Word Mover's Distance, sentence embedding models (InferSent and Google Sentence Encoder), and ESIM with pre-trained FastText embeddings. The best-performing method on the Quora Question Pairs dataset was an ensemble with 0.27 log-loss.
volom / PornStarSimilarity: Finds the porn actress whose face is most similar to the one in your input photo, searching more than 5k available photos. Fetches porn stars' photos, then extracts and embeds their faces so they can be compared by cosine similarity with the face embedding of the input photo.
prakhargurawa / Drug Similarity And Link Prediction Using Graph Embeddings On Medical Knowledge Graph: Uses graph neural networks and embeddings on the KEGG medical database to build link prediction and drug similarity systems.
Snehil-Shah / Multimodal Image Search Engine: Text to Image & Reverse Image Search Engine built upon Vector Similarity Search utilizing CLIP VL-Transformer for Semantic Embeddings & Qdrant as the Vector-Store
AnjaliDharmik / Text Similarity Using Siamese Deep Neural Network: A Keras-based implementation of a deep Siamese bidirectional LSTM network that captures phrase/sentence similarity using word embeddings.