17 skills found
LLaVA-VL / LLaVA NeXT - No description available.
Coobiw / MPP LLaVA - Personal project: MPP-Qwen14B & MPP-Qwen-Next (Multimodal Pipeline Parallel based on Qwen-LM). Supports [video/image/multi-image] {sft/conversations}. Don't let poverty limit your imagination! Train your own 8B/14B LLaVA-style MLLM on a 24 GB RTX 3090/4090.
RLHF-V / RLAIF V - [CVPR'25 highlight] RLAIF-V: Open-Source AI Feedback Leads to Super GPT-4V Trustworthiness.
xiaoachen98 / Open LLaVA NeXT - An open-source implementation for training LLaVA-NeXT.
zjysteven / Lmms Finetune - A minimal codebase for finetuning large multimodal models, supporting llava-1.5/1.6, llava-interleave, llava-next-video, llava-onevision, llama-3.2-vision, qwen-vl, qwen2-vl, phi3-v, etc.
mu-cai / Matryoshka Mm - Matryoshka Multimodal Models.
chuangchuangtan / LLaVA NeXT Image Llama3 Lora - LLaVA-NeXT-Image-Llama3-Lora, modified from https://github.com/arielnlee/LLaVA-1.6-ft.
hasanar1f / HiRED - [AAAI 2025] HiRED strategically drops visual tokens in the image-encoding stage to improve inference efficiency for high-resolution vision-language models (e.g., LLaVA-NeXT) under a fixed token budget.
Farzad-R / Finetune LLAVA NEXT - Code for fine-tuning the LLaVA-1.6-7b-mistral multimodal LLM.
justinsunyt / MultiAgent - Generative web-browsing chat agent with text + vision input. Powered by MultiOn, llama-3, llava, qwen, Next.js, FastAPI, and Supabase. Landed me an internship at MultiOn :)
Jorffy / NoteMR - [CVPR 2025] Code for "Notes-guided MLLM Reasoning: Enhancing MLLM with Knowledge and Visual Notes for Visual Question Answering".
zaiquanyang / LLaVA Next STVG - LLaVA-NeXT for STVG.
friedrichor / LLaVA NeXT Reproduced - Reproduced LLaVA-NeXT with training code and scripts.
Darren-greenhand / LLaVA Next - LLaVA_OpenVLA part 3: using LLaVA to train a stronger VLA model.
hari-huynh / ViVQA Voice Assistant - Voice assistant built on multimodal LLMs: a finetuned LLaVA-NeXT (Mistral 7B) and PhoWhisper.
alyakin314 / CNS Obsidian - CNS-Obsidian: a neurosurgical vision-language model built from scientific publications.
luxus180 / LLaVA OneVision 1.5 - 🛠️ Build and train multimodal models easily with LLaVA-OneVision 1.5, an open framework designed for seamless integration of vision and language tasks.