mudler / LocalAI: The free, open-source alternative to OpenAI, Claude and others. Self-hosted and local-first. Drop-in replacement, running on consumer-grade hardware. No GPU required. Runs gguf, transformers, diffusers and many more. Features: text generation, MCP, audio, video, images, voice cloning, and distributed, P2P and decentralized inference.
bluenviron / Mediamtx: Ready-to-use SRT / WebRTC / RTSP / RTMP / LL-HLS / MPEG-TS / RTP media server and media proxy that lets you read, publish, proxy, record and play back video and audio streams.
xhzengAIB / MessageDisplayKit: An IM app like WeChat that supports sending text, picture, audio, video, and location messages, managing the local address book, sharing to a circle of friends, drifting friends, shake, and other features.
QwenLM / Qwen2.5 Omni: Qwen2.5-Omni is an end-to-end multimodal model by the Qwen team at Alibaba Cloud, capable of understanding text, audio, vision, and video, and of generating speech in real time.
ampache / Ampache: A web-based audio/video streaming application and file manager that lets you access your music and videos from anywhere, using almost any internet-enabled device.
QwenLM / Qwen3 Omni: Qwen3-Omni is a natively end-to-end, omni-modal LLM developed by the Qwen team at Alibaba Cloud, capable of understanding text, audio, images, and video, as well as generating speech in real time.
vidstack / Player: UI components and hooks for building video/audio players on the web. Robust, customizable, and accessible. Modern alternative to JW Player and Video.js.
aserbao / AndroidCamera: Custom Android camera (TikTok-style) with video and audio editing. Features include video face-recognition stickers, beauty filter, segmented recording, video cropping, video frame processing, first-frame and key-frame extraction, video rotation, filters, watermarks, adding GIFs to video, text to video, picture to video, audio and video synthesis, voice changing, and SoundTouch and Fmod audio processing.
pedroSG94 / RootEncoder: RootEncoder for Android (rtmp-rtsp-stream-client-java) is a stream encoder that pushes video/audio to media servers over RTMP, RTSP, SRT and UDP, with all code written in Java/Kotlin.
muaz-khan / RTCMultiConnection: RTCMultiConnection is a WebRTC JavaScript library for peer-to-peer applications (screen sharing, audio/video conferencing, file sharing, media streaming, etc.).
yangjie10930 / EpMedia: A video processing framework for Android built on FFmpeg. It is simple, easy to use, and small in size, helping users implement video processing quickly. Functions include clipping, cropping, rotating, mirroring, merging, separating, changing speed, adding a logo, adding filters, adding background music, speeding up and slowing down video, and reversing audio and video.
polywock / GlobalSpeed: Web extension to set a default speed for video and audio.
tencent-ailab / V Express: V-Express aims to generate a talking-head video under the control of a reference image, an audio clip, and a sequence of V-Kps images.
muammar / Mkchromecast: Cast macOS and Linux audio/video to your Google Cast and Sonos devices.
brianwernick / ExoMedia: An Android ExoPlayer wrapper to simplify audio and video implementations.
hkchengrex / MMAudio: [CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
ThioJoe / Auto Synced Translated Dubs: Automatically translates the text of a video based on a subtitle file, then uses AI voice services to create a new dubbed and translated audio track in which the speech is synced to the subtitle timings.
LucklySpace / Lucky Client: A cross-platform instant messaging client built with Tauri and Vue 3, featuring one-to-one chat, group chat, file transfer, audio/video calling, screen recording, screenshot capture, and QR code login.
vkohaupt / VokoscreenNG: vokoscreenNG is a powerful screencast creator, available in many languages, for recording the screen, an area, or a window (Linux only). Recording audio from multiple sources is supported. With the built-in camera support, you can make your video more personal. Other tools such as systray, magnifying glass, countdown, timer, Showclick and Halo support will help
SociallyIneptWeeb / AICoverGen: A WebUI to create song covers with any RVC v2 trained AI voice from YouTube videos or audio files.