KanTV

KanTV ("Kan" means "watch" in English) is an open-source project focused on studying and practicing on-device AI technology in real scenarios (such as performing online-TV playback, real-time transcription, and online-TV recording at the same time) on Android phones:

Watch online TV and local media with a customized FFmpeg 6.1. This project is derived from an original project (which has not been maintained since 2021), with many enhancements and new features. The source code of the customized FFmpeg 6.1 can be found in <a href="https://github.com/kantv-ai/kantv/tree/master/external/ffmpeg-6.1"> external/ffmpeg </a>, in accordance with <a href="https://ffmpeg.org/legal.html">FFmpeg's license</a>. The source code of all libraries that FFmpeg 6.1 depends on can be found in <a href="https://github.com/kantv-ai/kantv/tree/master/external/ffmpeg-deps"> external/ffmpeg-deps </a>.
Watch online TV with a customized ExoPlayer 2.15.1. The source code of the customized ExoPlayer 2.15.1 can be found in <a href="https://github.com/kantv-ai/kantv/tree/master/android/kantvplayer-exo2"> android/kantvplayer-exo2 </a>.
Record online TV to a local file on the phone.
2D graphic performance benchmark.
AI subtitles: real-time English subtitles for English online TV (aka OTT TV) via the great & excellent & amazing <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a>.
Well-maintained, turn-key, self-contained workbench for AI experts and researchers who focus on high-value on-device AI R&D on Android. On-device AI R&D activities such as AI algorithm validation, AI model validation, and performance benchmarking with ASR/Text2Image/LLM on Android can be done easily with this project.
Well-maintained, turn-key, self-contained workbench for AI beginners to learn on-device AI technology on Android.
Built-in support for Google's Gemma3-4B (multimodal text + image), Google's Gemma3-12B (multimodal text + image), Alibaba's Qwen1.5-1.8B, Alibaba's Qwen2.5-3B, Alibaba's Qwen3-4B, Alibaba's Qwen3-8B, Nvidia's Llama-3.1-Nemotron-Nano-4B, Microsoft's Phi-4-mini-reasoning, Huggingface's SmolVLM2-256M (highly optimized multimodal model for real-time video recognition), Alibaba's Qwen2.5-Omni-3B (multimodal text + audio), DeepSeek's DeepSeek-R1-0528-Qwen3-8B, and Xiaomi's MiMo-VL-7B, all running entirely offline (no Internet required). These supported LLM models can be downloaded directly in the Android APK without manual preparation. APK users can compare the real-world experience of these LLM models on an Android phone. Developers can add other LLM models manually in the source code at KANTVAIModelMgr.java#L284.
Text2Image on Android phone via the amazing stable-diffusion.cpp.
The ggml-hexagon backend (originally named ggml-qnn) in this project is probably the first open-source reference implementation of a dedicated llama.cpp backend for the Qualcomm Hexagon NPU on Android phones.
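The built-in model list above pairs each model with a vendor and, for some entries, a modality. The following is a minimal sketch of how such a catalog could be represented and extended. `ModelEntry`, `ModelCatalog`, and their methods are hypothetical names chosen for illustration; they are not the actual structure of KANTVAIModelMgr.java, and only a subset of the listed models is shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not the API of KANTVAIModelMgr.java.
final class ModelEntry {
    final String vendor;
    final String name;
    final String modality; // e.g. "text", "text+image", "text+audio"

    ModelEntry(String vendor, String name, String modality) {
        this.vendor = vendor;
        this.name = name;
        this.modality = modality;
    }
}

final class ModelCatalog {
    private final List<ModelEntry> entries = new ArrayList<>();

    // Registers one model and returns the catalog so calls can be chained.
    ModelCatalog add(String vendor, String name, String modality) {
        entries.add(new ModelEntry(vendor, name, modality));
        return this;
    }

    // Returns the names of all registered models with the given modality,
    // in registration order.
    List<String> namesByModality(String modality) {
        List<String> names = new ArrayList<>();
        for (ModelEntry e : entries) {
            if (e.modality.equals(modality)) {
                names.add(e.name);
            }
        }
        return names;
    }

    // A small subset of the models named in the feature list.
    static ModelCatalog builtInSubset() {
        return new ModelCatalog()
            .add("Google", "Gemma3-4B", "text+image")
            .add("Google", "Gemma3-12B", "text+image")
            .add("Alibaba", "Qwen3-4B", "text")
            .add("Alibaba", "Qwen2.5-Omni-3B", "text+audio");
    }
}
```

In this sketch, adding another model amounts to one more chained `add(...)` call; the real project may organize its model metadata quite differently.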

Software architecture of KanTV Android

Building the project

Clone this repository and build locally; see how to build.
Download a pre-built Android APK from https://github.com/kantv-ai/kantv/releases
Download a pre-built Android APK from the GitHub CI build: https://github.com/kantv-ai/kantv/actions/

Run Android APK on Android phone

Android 8.0 (2017.08) through Android 15 (2024.10) and higher, with any mainstream arm64 mobile SoC.
An Android smartphone equipped with any mainstream high-end mobile SoC is highly recommended for the real-time AI-subtitle feature; otherwise unexpected behavior may occur.
An Android smartphone equipped with one of the Qualcomm mobile SoCs below is required to verify or run the ggml-hexagon backend on an Android phone (Qualcomm's state-of-the-art high-end Snapdragon 8 Gen 3 and Snapdragon 8 Elite series are highly recommended):

    Snapdragon 8 Gen 1
    Snapdragon 8+ Gen 1
    Snapdragon 8 Gen 2
    Snapdragon 8 Gen 3
    Snapdragon 8 Elite
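
The requirements above can be read as two checks: a baseline (Android 8.0 or higher on arm64) and an additional SoC check for the ggml-hexagon backend. The sketch below expresses them in plain Java. `DeviceCheck` and its methods are hypothetical and not part of the KanTV source. On a device, the first two arguments would typically come from `Build.VERSION.SDK_INT` and `Build.SUPPORTED_ABIS`; how the SoC's marketing name is obtained is left open here, and matching on marketing names is an assumption of this example.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of the KanTV source tree.
final class DeviceCheck {
    // Android 8.0 corresponds to API level 26.
    static final int MIN_SDK = 26;
    // ABI name Android uses for 64-bit ARM.
    static final String REQUIRED_ABI = "arm64-v8a";

    // SoCs listed as required for the ggml-hexagon backend.
    // "Snapdragon 8+ Gen 1" is Qualcomm's spelling of the "plus" variant of 8 Gen 1.
    static final List<String> HEXAGON_SOCS = Arrays.asList(
        "Snapdragon 8 Gen 1",
        "Snapdragon 8+ Gen 1",
        "Snapdragon 8 Gen 2",
        "Snapdragon 8 Gen 3",
        "Snapdragon 8 Elite");

    // Baseline requirement: Android 8.0+ and an arm64 ABI.
    static boolean meetsBaseline(int sdkInt, String[] supportedAbis) {
        return sdkInt >= MIN_SDK
            && Arrays.asList(supportedAbis).contains(REQUIRED_ABI);
    }

    // Additional requirement for the ggml-hexagon backend: a listed Qualcomm SoC.
    static boolean canTryHexagonBackend(int sdkInt, String[] supportedAbis, String socName) {
        return meetsBaseline(sdkInt, supportedAbis)
            && socName != null
            && HEXAGON_SOCS.contains(socName.trim());
    }
}
```

For example, `DeviceCheck.canTryHexagonBackend(34, new String[] {"arm64-v8a"}, "Snapdragon 8 Gen 3")` evaluates to true, while the same call with an SoC name outside the list evaluates to false even though the baseline is met.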

Screenshots

Here is a short video demonstrating real-time AI subtitles by running the great & excellent & amazing <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a> on an Android phone equipped with a Qualcomm Snapdragon 8 Gen 3 mobile SoC, fully offline and on-device.

https://github.com/kantv-ai/kantv/assets/6889919/2fabcb24-c00b-4289-a06e-05b98ecd22b8

A screenshot demonstrating multimodal inference by running the magic <a href="https://github.com/ggerganov/llama.cpp"> llama.cpp </a> on an Android phone equipped with a Qualcomm Snapdragon 8 Elite mobile SoC, fully offline and on-device.

A screenshot demonstrating real-time video recognition via MTMD from llama.cpp plus the lightweight multimodal model SmolVLM2-256M from Huggingface on an Android phone equipped with a Qualcomm Snapdragon 8 Elite mobile SoC, fully offline and on-device.

<details> <summary>some other screenshots</summary> <ol>

A screenshot demonstrating ASR inference by running the excellent <a href="https://github.com/ggerganov/whisper.cpp"> whisper.cpp </a> on an Android phone equipped with a Qualcomm Snapdragon 8 Gen 3 mobile SoC, fully offline and on-device.

A screenshot demonstrating Text2Image inference by running the amazing <a href="https://github.com/leejet/stable-diffusion.cpp"> stable-diffusion.cpp </a> on an Android phone equipped with a Qualcomm Snapdragon 8 Elite mobile SoC, fully offline and on-device.


A screenshot demonstrating how to download an LLM model in the APK.


</ol> </details>

Docs

Contribution

Reporting issues found on Android phones equipped with a mainstream mobile SoC, or submitting PRs to this project, is greatly welcomed.

We use GitHub issues for tracking feature requests and issue reports; please see how to submit an issue in this project.

Special Acknowledgement

<ul>AI inference framework <ul> <li> <a href="https://github.com/ggml-org/ggml">GGML</a> </li> </ul> </ul> <ul>AI application engine <ul> <li> ASR engine <a href="https://github.com/ggml-org/whisper.cpp">whisper.cpp</a> </li> <li> LLM engine <a href="https://github.com/ggml-org/llama.cpp">llama.cpp</a> </li> <li> Text2Image engine <a href="https://github.com/leejet/stable-diffusion.cpp">stable-diffusion.cpp</a> </li> <li> CV engine <a href="https://github.com/nihui/opencv-mobile">opencv-mobile</a> </li> <li> MTMD(multimodal) engine <a href="https://github.com/ggml-org/llama.cpp/blob/master/tools/mtmd/README.md">MTMD subsystem in llama.cpp</a> </li> </ul> </ul>
