pytube / Pytube - Lightweight, dependency-free Python library and CLI for downloading YouTube videos, playlists, and captions.
instaloader / Instaloader - Download pictures (or videos) along with their captions and other metadata from Instagram.
mwaterfall / MWPhotoBrowser - A simple iOS photo and video browser with grid view, captions and selections.
vladmandic / Sdnext - SD.Next: All-in-one WebUI for AI generative image and video creation, captioning and processing
stephengpope / No Code Architects Toolkit - The NCA Toolkit API eliminates monthly subscription fees by consolidating common API functionalities into a single FREE API. Designed for businesses, creators, and developers, it streamlines advanced media processing, including video editing and captioning, image transformations, cloud storage, and Python code execution.
NVlabs / Describe Anything - [ICCV 2025] Implementation for Describe Anything: Detailed Localized Image and Video Captioning
ShareGPT4Omni / ShareGPT4Video - [NeurIPS 2024] An official implementation of "ShareGPT4Video: Improving Video Understanding and Generation with Better Captions"
YehLi / Xmodaler - X-modaler is a versatile and high-performance codebase for cross-modal analytics (e.g., image captioning, video captioning, vision-language pre-training, visual question answering, visual commonsense reasoning, and cross-modal retrieval).
m-bain / Webvid - Large-scale text-video dataset. 10 million captioned short videos.
snap-research / Panda 70M - [CVPR 2024] Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers
younatics / MediaBrowser - 🏞 A simple iOS photo and video browser with optional grid view, captions and selections, written in Swift 5.0
IuvenisSapiens / ComfyUI Qwen3 VL Instruct - Integrates the Qwen3-VL-Instruct series into the ComfyUI platform, supporting (but not limited to) text-based queries, video queries, single-image queries, and multi-image queries for generating captions or responses.
mira-space / MiraData - Official repo for the paper "MiraData: A Large-Scale Video Dataset with Long Durations and Structured Captions"
wesbos / Wes Bos Captions - Captions for my video courses
ncounterspecialist / Twick - AI-powered video editor SDK built with React. Features canvas timeline, drag-and-drop editing, AI captions, and serverless MP4 export. Perfect for building custom video apps.
forence / Awesome Visual Captioning - This repository focuses on Image Captioning, Video Captioning, Seq-to-Seq Learning, and NLP
xiadingZ / Video Caption.pytorch - PyTorch implementation of video captioning
scopeInfinity / Video2Description - Video to Text: Natural language description generator for a given video. [Video Captioning]
facebookresearch / Grounded Video Description - Video Grounding and Captioning
rom1504 / Cc2dataset - Easily convert Common Crawl to a dataset of captions and documents. Image/text, audio/text, video/text, ...