1,086 skills found
Significant-Gravitas / AutoGPT: AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters.
mudler / LocalAI: LocalAI is the open-source AI engine. Run any model - LLMs, vision, voice, image, video - on any hardware. No GPU required.
ashishpatel26 / 500 AI Machine Learning Deep Learning Computer Vision NLP Projects With Code: 500 AI Machine learning Deep learning Computer vision NLP Projects with code
jacobgil / Pytorch Grad Cam: Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
web-infra-dev / Midscene: AI-powered, vision-driven UI automation for every platform.
kornia / Kornia: 🐍 Geometric Computer Vision Library for Spatial AI
dusty-nv / Jetson Inference: Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.
GetStream / Vision Agents: Open Vision Agents by Stream. Build Vision Agents quickly with any model or video provider. Uses Stream's edge network for ultra-low latency.
facebookresearch / Mmf: A modular framework for vision & language multimodal research from Facebook AI Research (FAIR)
TarrySingh / Artificial Intelligence Deep Learning Machine Learning Tutorials: A comprehensive list of Deep Learning / Artificial Intelligence and Machine Learning tutorials - rapidly expanding into areas of AI/Deep Learning / Machine Vision / NLP and industry specific areas such as Climate / Energy, Automotives, Retail, Pharma, Medicine, Healthcare, Policy, Ethics and more.
NVlabs / VILA: VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
SkyworkAI / Skywork R1V: Skywork-R1V is an advanced multimodal AI model series developed by Skywork AI, specializing in vision-language reasoning.
jonyzhang2023 / Awesome Embodied Vla Va Vln: A curated list of state-of-the-art research in embodied AI, focusing on vision-language-action (VLA) models, vision-language navigation (VLN), and related multimodal learning approaches.
roflcoopter / Viseron: Self-hosted, local only NVR and AI Computer Vision software. With features such as object detection, motion detection, face recognition and more, it gives you the power to keep an eye on your home, office or any other place you want to monitor.
enpeizhao / CVprojects: Computer vision projects | Fun computer-vision-related AI projects (Python, C++, embedded systems)
icereed / Paperless Gpt: Use LLMs and LLM Vision (OCR) to handle paperless-ngx - Document Digitalization powered by AI
qingchencloud / Clawpanel: 🦞 OpenClaw visual management panel with a built-in AI assistant (tool calling + image recognition + multimodal), one-click install | Visual management panel with built-in AI assistant (tool calling + vision + multimodal + i18n(11))
Intent-Lab / VisionClaw: Real-time AI assistant for Meta Ray-Ban smart glasses -- voice + vision + agentic actions via Gemini Live and OpenClaw
szczyglis-dev / Py Gpt: Desktop AI Assistant powered by GPT-5, GPT-4, o1, o3, Gemini, Claude, Ollama, DeepSeek, Perplexity, Grok, Bielik, chat, vision, voice, RAG, image and video generation, agents, tools, MCP, plugins, speech synthesis and recognition, web search, memory, presets, assistants, and more. Linux, Windows, Mac
cvzone / Cvzone: This is a computer vision package that makes it easy to run image processing and AI functions. At its core, it uses the OpenCV and Mediapipe libraries.