FCMBench
Prompts and evaluation scripts of FCMBench

FCMBench is a multimodal benchmark for credit-risk–oriented workflows. It offers a shared testbed for collaboration between academia and industry, with standardized datasets, prompts, and evaluation scripts across multiple tracks (image, video, speech, agents, etc.).
🔥 News
- [2026.03.16] ✨ We released FCMBench-V1.1. This version adds English document images and corresponding QA pairs, expands the covered document types to 26, and increases the dataset to 5,198 images and 13,806 QA samples.
- [2026.01.01] We launched FCMBench-V1.0, which covers 18 core certificate types with 4,043 privacy-compliant images and 8,446 QA samples. It comprises 3 types of Perception tasks and 4 types of Reasoning tasks, cross-referenced with 10 categories of robustness inferences. All tasks and inferences are derived from critical real-world scenarios.
Status: Public release (v1.1).
Maintainers: 奇富科技 / Qfin Holdings
Contact: [yangyehuisw@126.com]
Tracks Overview
1) Vision-Language Track (✅ Available)
Image-based financial document understanding:
- Entry: Vision-Language Track
- Inputs: document images + text prompts (JSONL, one sample per line)
- Outputs: text responses (JSONL, one sample per line)
- Evaluation: Evaluation Script
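As a rough illustration of the JSONL input/output convention listed above, the following minimal sketch reads one sample per line and writes one response per line. The field names (`id`, `image`, `prompt`, `response`) and the `run_model` stub are assumptions made for this example only; the actual schema is defined by the track's prompts and evaluation script and may differ.

```python
import json
import os
import tempfile


def read_jsonl(path):
    """Yield one dict per non-empty line of a JSONL file."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                yield json.loads(line)


def write_jsonl(path, records):
    """Write each record as a single JSON object on its own line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def run_model(sample):
    """Hypothetical stand-in for a real vision-language model call."""
    return f"stub answer for {sample['image']}"


def predict_file(input_path, output_path):
    """Read prompts, attach a text response to each, and write the results."""
    results = [
        {"id": sample["id"], "response": run_model(sample)}
        for sample in read_jsonl(input_path)
    ]
    write_jsonl(output_path, results)
    return results


if __name__ == "__main__":
    # Build a tiny input file with assumed field names, then run the pipeline.
    workdir = tempfile.mkdtemp()
    in_path = os.path.join(workdir, "inputs.jsonl")
    out_path = os.path.join(workdir, "outputs.jsonl")
    write_jsonl(in_path, [
        {"id": "s1", "image": "id_card_001.jpg", "prompt": "Extract the name."},
    ])
    for row in predict_file(in_path, out_path):
        print(row)
```

Keeping `ensure_ascii=False` preserves any non-ASCII text in responses as readable characters rather than escape sequences, which is convenient when documents contain Chinese fields.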
Paper & Project Links
- Paper (arXiv)
- Paper (PDF)
- Project Page
- Leaderboard
- Sample Data
- Dataset (ModelScope)
- Dataset (Hugging Face)
Reference Model Demo
We also provide access to an interactive demo of our Qfin-VL-Instruct model, which achieves strong performance on FCMBench. If you are interested in trying the Gradio demo, please contact [yangyehui-jk@qifu.com] with the following information:
- Name
- Affiliation / Organization
- Intended use (e.g., research exploration, benchmarking reference)
- Contact email
Access will be granted on a case-by-case basis.
2) Video Understanding Track (🕒 Coming Soon)
3) Speech Understanding & Generation Track (🕒 Coming Soon)
4) Multi-step / Agentic Track (🕒 Coming Soon)
Citation
@misc{yang2026fcmbenchcomprehensivefinancialcredit,
title={FCMBench: A Comprehensive Financial Credit Multimodal Benchmark for Real-world Applications},
author={Yehui Yang and Dalu Yang and Wenshuo Zhou and Fangxin Shang and Yifan Liu and Jie Ren and Haojun Fei and Qing Yang and Yanwu Xu and Tao Chen},
year={2026},
eprint={2601.00150},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2601.00150},
}
