
InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.

Install / Use

/learn @OpenGVLab/InternVL
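Beyond the `/learn` command, InternVL models are commonly queried through an OpenAI-compatible chat endpoint (the "🌐 API" and "🚀 Quick Start" links above cover serving details). The sketch below only shows how a multimodal request body is typically shaped; the model name and image URL are placeholders, not official values.

```python
# Minimal sketch of an OpenAI-style multimodal chat request for an
# InternVL model served behind an OpenAI-compatible endpoint.
# "internvl2-8b" and the image URL are placeholder values.
import json


def build_chat_payload(model: str, image_url: str, question: str) -> dict:
    """Build an OpenAI-style chat request mixing an image and a text question."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": question},
                ],
            }
        ],
    }


payload = build_chat_payload(
    "internvl2-8b",                  # placeholder model name
    "https://example.com/cat.jpg",   # placeholder image URL
    "Describe this image.",
)
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the server's `/v1/chat/completions` route; consult the project's Quick Start document for the exact serving setup.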

README

<div align="center">

InternVL Family: Closing the Gap to Commercial Multimodal Models with Open-Source Suites —— A Pioneering Open-Source Alternative to GPT-5

<div align="center"> <img width="500" alt="image" src="https://github.com/user-attachments/assets/930e6814-8a9f-43e1-a284-118a5732daa4"> <br> </div>

[🆕 Blog] [🤔 FAQs] [🗨️ Chat Demo] [📖 Document] [🌐 API] [🚀 Quick Start]

[🔥 InternVL3.5 Report] [📜 InternVL3.0 Report] [📜 InternVL2.5 MPO] [📜 InternVL2.5 Report]

[📜 Mini-InternVL Paper] [📜 InternVL2 Blog] [📜 InternVL 1.5 Paper] [📜 InternVL 1.0 Paper]

[📖 2.0 中文解读] [📖 1.5 中文解读] [📖 1.0 中文解读]

Switch to the Chinese version (切换至中文版)

<a href="https://trendshift.io/repositories/9803" target="_blank"><img src="https://trendshift.io/api/badge/repositories/9803" alt="OpenGVLab%2FInternVL | Trendshift" style="width: 250px; height: 55px;" width="250" height="55"/></a> <img height="55" alt="image" src="https://github.com/user-attachments/assets/bd62ab46-f0ea-40c6-ab10-7fde671716cc">


</div>

News 🚀🚀🚀

<details> <summary>More News</summary>
  • 2024/10/21: We release the Mini-InternVL series. These models achieve impressive performance with minimal size: the 4B model achieves 90% of the performance with just 5% of the model size. For more details, please check our project page and document.
  • 2024/08/01: The Chartmimic team evaluated the InternVL2 series models on their benchmark. The InternVL2-26B and 76B models achieved the top two performances among open-source models, with the InternVL2 76B model surpassing GeminiProVision and exhibiting comparable results to Claude-3-opus.
  • 2024/08/01: InternVL2-Pro achieved the SOTA performance among open-source models on the CharXiv dataset, surpassing many closed-source models such as GPT-4V, Gemini 1.5 Flash, and Claude 3 Sonnet.
  • 2024/07/24: The MLVU team evaluated InternVL-1.5 on their benchmark. The average performance on the multiple-choice task was 50.4%, while the performance on the generative tasks was 4.02. The performance on the multiple-choice task ranked #1 among all open-source MLLMs.
  • 2024/07/18: InternVL2-40B achieved SOTA performance among open-source models on the Video-MME dataset, scoring 61.2 when inputting 16 frames and 64.4 when inputting 32 frames. It significantly outperforms other open-source models and is the closest open-source model to GPT-4o mini.
  • 2024/07/18: InternVL2-Pro achieved the SOTA performance on the DocVQA and InfoVQA benchmarks.
  • 2024/07/04: We release the InternVL2 series. InternVL2-Pro achieved a 62.0% accuracy on the MMMU benchmark, matching the performance of leading closed-source commercial models like GPT-4o.
  • 2024/06/19: We propose Needle In A Multimodal Haystack (MM-NIAH), the first benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
</details>
View on GitHub
GitHub Stars: 9.9k
Category: Content
Updated: 1h ago
Forks: 765

Languages

Python

Security Score

100/100 (audited on Mar 21, 2026; no findings)