6 skills found
bytedance / UI-TARS-desktop: The Open-Source Multimodal AI Agent Stack: Connecting Cutting-Edge AI Models and Agent Infra
SamurAIGPT / Generative-Media-Skills: Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
NPC-Worldwide / npcpy: The Python library for research and development in NLP, multimodal LLMs, Agents, ML, Knowledge Graphs, and more.
the-ai-merge / multimodal-agents-course: An MCP Multimodal AI Agent with eyes and ears!
WeberG619 / cadre-ai: Voice-driven AI professional agent. Real-time conversations powered by Gemini Live API, native audio streaming, and multimodal intelligence. BIM/Revit, financial analysis, and web search tools.
rayk / cookbook: Look up examples from Claude Cookbooks for implementation patterns. Arguments: topic (what to look up, e.g., "tool use", "skills", "agents", "multimodal").