12 skills found
openclaw / openai-whisper-apiTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
openclaw / songseeGenerate spectrograms and feature-panel visualizations from audio with the songsee CLI.
mudler / LocalAI:robot: The free, Open Source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: Generate Text, MCP, Audio, Video, Images, Voice Cloning, Distributed, P2P and decentralized inference
SamurAIGPT / Generative-Media-SkillsMulti-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
SamurAIGPT / Generative-Media-SkillsMulti-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
A Model Context Protocol (MCP) server that enables AI assistants to generate images, text, and audio through the Pollinations APIs. Supports customizable parameters, image saving, and multiple model options.
AgriciDaniel / claude-shortsInteractive longform-to-shortform video creator — Claude Code skill with Remotion-rendered animated captions, AI segment scoring, cursor tracking, and audio-aware boundary snapping
wells1137 / media-skillsA collection of open-source Agent Skills for content creation — images, audio, and video.
WeberG619 / cadre-aiVoice-driven AI professional agent. Real-time conversations powered by Gemini Live API, native audio streaming, and multimodal intelligence. BIM/Revit, financial analysis, and web search tools.
agrathwohl / carla-mcp-serverAn MCP server for controlling the Carla audio plugin host
jftuga / transcript-criticClaude Code skill that transcribes audio/video with whisper.cpp to get structured critical analysis including timestamped summaries, evidence notes, logical fallacies, and underdeveloped areas
samson-art / transcriptor-mcpAn MCP server (stdio + HTTP/SSE) that fetches video transcripts/subtitles via yt-dlp, with pagination for large responses. Supports YouTube, Twitter/X, Instagram, TikTok, Twitch, Vimeo, Facebook, Bilibili, VK, Dailymotion. Whisper fallback — transcribes audio when subtitles are unavailable (local or OpenAI API). Works with Cursor and other MCP host