mark-watson / Docs QA Swift: Swift Example for Documents Question Answering Using OpenAI GPT3 APIs and a Local Embeddings Vector Database
Nirerp / Knowledge Graph Rag: Hybrid Retrieval-Augmented Generation stack that ingests documents, extracts graph relations via LiteLLM, and stores them in Qdrant (vector DB) plus Neo4j (knowledge graph), all orchestrated with Docker, .env-based config, and notebooks for testing.
0xarchit / DocumentChat Simple Rag VectorDB Project: Chat with documents using the Gemini AI API, a MiniLM embedding model, Chroma Vector DB, RAG, and a Gradio UI
braillerap / DesktopBrailleRAP: Document authoring tool for the open source Braille embosser BrailleRAP. It lets you mix SVG vector graphics with Braille to build tactile documents.
Syed007Hassan / Document Querying With VectorDB: Document Querying with LLMs - Google PaLM API: Semantic Search With LLM Embeddings
chena / Text Proc Craig: Text processing with NLTK and building a vector space model for a collection of documents.
thieled / DictvectoR: 'dictvectoR' measures the similarity between a concept dictionary and documents, using fastText word vectors. Implements the "Distributed-Dictionary-Representation" (Garten et al. 2018) method in R.
CyberAgentAILab / Multipalette: Implementation of Color Recommendation for Vector Graphic Documents based on Multi-Palette Representation, WACV 2023
gosha70 / Document Assistant: RAG (Retrieval-Augmented Generation) framework that combines private vector databases storing unstructured documents with an LLM, and provides a Chat/QA application that can be run locally.
prokill-werewolfbb9 / CorelDRAW Graphics Suite: CorelDRAW Graphics Suite provides vector illustration, page layout, photo editing and typography tools. It includes CorelDRAW for drawing, PHOTO-PAINT for raster editing, font management, color palettes, multi-page documents, and file compatibility with industry formats.
JingheZ / TextMining: This project has two major tasks: text data processing and text categorization. Text data processing covers tokenization, stemming, normalization, etc. Vector space and statistical language models are also used to retrieve documents similar to a query. For text categorization, we build a text classification system that includes feature selection, classifiers (Naive Bayes and K Nearest Neighbor using brute force and random vectors), cross validation, and parameter tuning.
bradwellsb / Blazor Pdf Chat: A Blazor web app to load, vectorize, and chat with PDF documents using AI
abideenml / Detecting SocialEngineering Attacks: Detecting ☎️ Telephone-based Social Engineering Attacks Using Document 📑 Vectorization (Doc2Vec, Universal Vector Encoder), Clustering (K-means, DBSCAN, EM) and Classification of Scam Signatures
srslynow / Legal Text Mining: Experiments classifying legal documents into their sub-categories, e.g. civil law, criminal law, or administrative law. Classifiers used: k-Nearest Neighbours, linear Support Vector Machine, Random Forest, Convolutional Neural Network, and Long Short Term Memory Neural Network.
patakuti / Local Knowledge Rag MCP: A semantic search and retrieval system for local documents using vector embeddings. Powered by MCP (Model Context Protocol).
ca-srg / Ragent: CLI tool for building production RAG systems from Markdown, CSV, and PDF documents using hybrid search (BM25 + vector) with OpenSearch. Features MCP server, Slack bot, Web UI, multi-source ingestion (local/S3/GitHub), and multi-provider embeddings (Bedrock/Gemini).
harshvardhan-11 / ChatwithMultiplePDFs: The project is a Streamlit-based app that allows users to upload PDF files, processes the text into chunks, stores it in a FAISS vector store using Google Generative AI embeddings, and enables conversational Q&A with the documents using Google Generative AI’s chat model for answers.
sfeng15 / Machine Learning: Implemented Naïve Bayes, Support Vector Machines, and Random Forests to classify faces and documents, achieving 92% average accuracy using R and Python. Implemented EM algorithms to segment images in Python. Used the TensorFlow framework to implement Convolutional Neural Networks to recognize handwritten digits from the MNIST dataset and images from the CIFAR-10 dataset.
PiyushShinde / Yelp Challenge Dataset Information Retrieval: Filtered the Yelp Challenge Dataset to restaurant businesses in three major cities and built a recommendation system based on user reviews. Employed techniques such as Non-Negative Matrix Factorization (NMF), Term Frequency Inverse Document Frequency (TF-IDF), Document to Vector (Doc2Vec), and Word2Vec with Latent Dirichlet Allocation (LDA) to create Collaborative Filtering and Content-Based Recommendation Systems.
TinyRag / TinyRag: TinyRag is a minimal Python library for retrieval-augmented generation. It offers easy document ingestion, automatic text extraction, embedding generation, and retrieval with vector stores. Designed for quick setup and flexible provider configuration, TinyRag enables fast, contextual responses from language models.