29 skills found
Blaizzy / Mlx Vlm: MLX-VLM is a package for inference and fine-tuning of Vision Language Models (VLMs) on your Mac using MLX.
allenai / Molmo: Code for the Molmo Vision-Language Model
instavm / Clickclickclick: A framework to enable autonomous android and computer use using any LLM (local or remote)
allenai / Molmo2: Code for the Molmo2 Vision-Language Model
allenai / Molmoweb: No description available
allenai / Molmoact: Official Repository for MolmoAct
allenai / Molmospaces: An end-to-end open ecosystem for robot learning
CY-CHENYUE / ComfyUI Molmo: Generate detailed image descriptions and analysis using Molmo models in ComfyUI.
ThetaCursed / Clean Ui: Simple UI for Llama-3.2-11B-Vision & Molmo-7B-D
SeanScripts / ComfyUI PixtralLlamaMolmoVision: For loading and running Pixtral models
2U1 / Molmo Finetune: An open-source implementation for fine-tuning Molmo-7B-D and Molmo-7B-O by allenai.
molmod / Molmod: MolMod is a collection of molecular modelling tools for python.
mbzuai-oryx / VideoMolmo: Official code of the paper "VideoMolmo: Spatio-Temporal Grounding meets Pointing"
allenai / MolmoBot: Code and website for "MolmoB0T: Large-Scale Simulation Enables Zero-Shot Manipulation".
abhaybd / GraspMolmo: Code and website for "GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation"
cyan2k / Molmo 7b Bnb 4bit: 4bit bitsandbytes quants of the best 7B vlms
sovit-123 / SAM Molmo Whisper: An integration of Segment Anything Model, Molmo, and Whisper to segment objects using voice and natural language.
ComputeCanada / Molmodsim Md Theory Lesson Novice: Some practical theoretical background needed for running MD simulations
durrantlab / Molmoda: MolModa provides a secure, accessible environment where users can perform molecular docking entirely in their web browsers.
2dameneko / Ide Cap Chan: ide-cap-chan is a utility for batch image captioning with natural language using various VL models