Local Meeting Transcription & Speaker Diarization AI Skill (SenseVoice + 3D-Speaker). A fully offline agent skill that converts video/audio to text with speaker separation (< 3 GB), integrating audio/video extraction and high-accuracy ASR.
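
Combining ASR output with speaker diarization usually comes down to aligning two timelines: transcribed segments and speaker turns. The sketch below illustrates one common way to do that, by assigning each transcript segment to the speaker turn it overlaps most. It is not this project's actual implementation and it does not call SenseVoice or 3D-Speaker; the names `Segment`, `Turn`, and `label_segments`, and the timestamps in the usage example, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """A transcribed span of audio (hypothetical ASR output shape)."""
    start: float  # seconds
    end: float    # seconds
    text: str


@dataclass
class Turn:
    """A span attributed to one speaker (hypothetical diarization output shape)."""
    start: float  # seconds
    end: float    # seconds
    speaker: str


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two intervals (0.0 if disjoint)."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def label_segments(segments, turns, unknown="unknown"):
    """Pair each segment's text with the speaker whose turn overlaps it most.

    Segments that overlap no turn at all are labeled with `unknown`.
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = unknown, 0.0
        for turn in turns:
            ov = overlap(seg.start, seg.end, turn.start, turn.end)
            if ov > best_overlap:
                best_speaker, best_overlap = turn.speaker, ov
        labeled.append((best_speaker, seg.text))
    return labeled


if __name__ == "__main__":
    # Made-up timestamps, for illustration only.
    segments = [Segment(0.0, 2.0, "Hello"), Segment(2.0, 5.0, "Hi there")]
    turns = [Turn(0.0, 2.2, "spk0"), Turn(2.2, 6.0, "spk1")]
    for speaker, text in label_segments(segments, turns):
        print(f"[{speaker}] {text}")
```

Maximum-overlap assignment is simple and tolerant of small boundary mismatches between the two models, but it gives a whole segment to one speaker; a segment that spans a speaker change would need to be split at word level, which this sketch does not attempt.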