5 skills found
bytedance / UI-TARS-desktop: The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
NPC-Worldwide / npcpy: The Python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.
the-ai-merge / multimodal-agents-course: An MCP Multimodal AI Agent with eyes and ears!
Ejb503 / multimodal-mcp-client: A multimodal MCP client for voice-powered agentic workflows
WeberG619 / cadre-ai: Voice-driven AI professional agent. Real-time conversations powered by Gemini Live API, native audio streaming, and multimodal intelligence. BIM/Revit, financial analysis, and web search tools.