SamurAIGPT / Generative Media Skills: Multi-modal Generative Media Skills for AI Agents (Claude Code, Cursor, Gemini CLI). High-quality image, video, and audio generation powered by muapi.ai.
archinetai / Audio AI Timeline: A timeline of the latest AI models for audio generation, starting in 2023!
NeptuneHub / AudioMuse AI: AudioMuse-AI is an open-source, Dockerized environment that brings automatic playlist generation to Jellyfin, Navidrome, LMS, Lyrion and Emby. Using tools like Librosa and ONNX, it performs sonic analysis on your audio files locally, so you can curate playlists for any mood or occasion without relying on external APIs.
fspecii / HeartMuLa Studio: Suno-like music generation studio for HeartMuLa/heartlib. AI-powered music creation with reference audio style transfer.
ammaarreshi / Openjourney: Open-source clone of the MidJourney web interface featuring real AI image and video generation powered by Google's Gemini SDK. Uses Imagen 4 to generate images and Veo 2 and 3 for image-to-video and text-to-video with audio.
jgravelle / GroqCasters: GroqCasters is a Python application that generates podcast scripts and corresponding audio using AI technologies. It leverages PocketGroq for script generation and Bark for text-to-speech conversion, allowing for custom voice cloning.
okio-ai / Nendo Platform: Nendo is an open source platform for AI-driven audio management, intelligence, and generation.
innovatorved / Realtime Interview Copilot: Realtime Interview Copilot is a web application that assists users in crafting responses during interviews. It leverages real-time audio transcription and AI-powered response generation to provide relevant and concise answers.
aastroza / AI Podcast Generator: AI-powered tool for automatic podcast script and audio generation.
Fantety / FrameForge: FrameForge is a web application built with FastAPI and React. An AI-powered asset generation tool designed for game developers, it offers a variety of AI-driven features to help them quickly create the visual and audio assets their games require.
BernieTv / ElevenLabs Clone: A self-hosted ElevenLabs clone for text-to-speech, voice conversion, and AI audio generation with Docker, FastAPI, and Next.js. 🔊🎙️💡💻
kousen / OpenAIClient: Demonstrates how to use Spring to access OpenAI RESTful web services without using the Spring AI project. Tests call ChatGPT for text, DALL-E for image generation, and Whisper for audio transcriptions.
deepsingh132 / Aionair: A cutting-edge AI SaaS platform that enables users to create, discover, and enjoy podcasts with advanced features like text-to-audio conversion with multi-voice AI, podcast thumbnail image generation, and seamless playback. The platform is built using Next.js, TypeScript, Convex, OpenAI, Stripe, Clerk, ShadCN, and Tailwind CSS.
CloudAI-X / Z AI Playground V2: Z.AI API Playground - Complete examples for GLM-4.7, Vision, Image/Video Generation, Audio, and more. Powered by Z.AI-GLM-4.7-Coding Plan
RowanUnderwood / Synesthesia AI Video Director: Automate your AI music video workflow with Synesthesia Engine. This local Gradio app bridges audio analysis, LLM-driven storytelling, and LTX Desktop video generation. Simply drop in your song stems and lyrics, let the AI direct your storyboard, and batch-render your final cut.
RhythrosaLabs / Soundstorm: Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
georgbuechner / Dissonance: A command-line, keyboard-based strategy game written in C++, where audio input determines the AI strategy and seeds the map generation.
ebowwa / HeyCyanSmartGlassesSDK: Cross-platform SDK for HeyCyan smart glasses. Control photo/video capture, audio recording, and AI image generation via Bluetooth LE on iOS and Android.
nikhil-robinson / Openrouter Client: A comprehensive OpenRouter API client library for ESP32 (ESP-IDF), enabling seamless integration with OpenRouter’s AI models. Supports text generation, streaming responses, function calling, and multimodal capabilities including image and audio processing.
wasenderapi / Audio Chat N8n Wasenderapi: An n8n workflow for creating an AI-powered audio chat assistant. This project uses Wasenderapi for messaging, OpenAI for transcription and response generation, and Google Drive for file handling.