Results for "visual-context"

201 skills found · Page 1 of 7

thu-coai / Glyph

576

Official Repository for "Glyph: Scaling Context Windows via Visual-Text Compression"

universal

Updated 1d ago

UX-Decoder / DINOv

531

[CVPR 2024] Official implementation of the paper "Visual In-context Learning"

universal

Updated 13d ago

HumanAIGC / Omnitalker

422

[NeurIPS 2025] OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication

universal

audio-visual, diffusion-transformer, real-time, +1

Updated 5h ago

Atomic-man007 / Awesome Multimodel LLM

365

Awesome_Multimodel is a curated GitHub repository that collects resources for Multimodal Large Language Models (MLLM). It covers datasets, tuning techniques, in-context learning, visual reasoning, foundational models, and more. Stay updated with the latest advancements.

universal

chatgpt, dataset, gpt, +5

Updated 6h ago

lupantech / MathVista

358

MathVista: data, code, and evaluation for Mathematical Reasoning in Visual Contexts

universal

ai4math, large-language-models, large-multimadality-models, +5

Updated 3d ago

yosungho / LineTR

251

Line as a Visual Sentence: Context-aware Line Descriptor for Visual Localization (Line Transformer)

universal

line-descriptor, visual-localization, visual-slam

Updated 1mo ago

SOTAMak1r / VINO Code

211

A Unified Visual Generator with Interleaved OmniModal Context

universal

image-editing, image-generation, omni-model, +2

Updated 1d ago

ZhangYuanhan-AI / Visual Prompt Retrieval

183

[NeurIPS2023] Official implementation and model release of the paper "What Makes Good Examples for Visual In-Context Learning?"

universal

in-context-learning, prompt, visual-prompting

Updated 16d ago

USTCPCS / CVPR2018 Attention

179

Context Encoding for Semantic Segmentation; MegaDepth: Learning Single-View Depth Prediction from Internet Photos; LiteFlowNet: A Lightweight Convolutional Neural Network for Optical Flow Estimation; PWC-Net: CNNs for Optical Flow Using Pyramid, Warping, and Cost Volume; On the Robustness of Semantic Segmentation Models to Adversarial Attacks; SPLATNet: Sparse Lattice Networks for Point Cloud Processing; Left-Right Comparative Recurrent Model for Stereo Matching; Enhancing the Spatial Resolution of Stereo Images using a Parallax Prior; Unsupervised CCA; Discovering Point Lights with Intensity Distance Fields; CBMV: A Coalesced Bidirectional Matching Volume for Disparity Estimation; Learning a Discriminative Feature Network for Semantic Segmentation; Revisiting Dilated Convolution: A Simple Approach for Weakly- and Semi- Supervised Semantic Segmentation; Unsupervised Deep Generative Adversarial Hashing Network; Monocular Relative Depth Perception with Web Stereo Data Supervision; Single Image Reflection Separation with Perceptual Losses; Zoom and Learn: Generalizing Deep Stereo Matching to Novel Domains; EPINET: A Fully-Convolutional Neural Network for Light Field Depth Estimation by Using Epipolar Geometry; FoldingNet: Interpretable Unsupervised Learning on 3D Point Clouds; Decorrelated Batch Normalization; Unsupervised Learning of Depth and Egomotion from Monocular Video Using 3D Geometric Constraints; PU-Net: Point Cloud Upsampling Network; Real-Time Monocular Depth Estimation using Synthetic Data with Domain Adaptation via Image Style Transfer; Tell Me Where To Look: Guided Attention Inference Network; Residual Dense Network for Image Super-Resolution; Reflection Removal for Large-Scale 3D Point Clouds; PlaneNet: Piece-wise Planar Reconstruction from a Single RGB Image; Fully Convolutional Adaptation Networks for Semantic Segmentation; CRRN: Multi-Scale Guided Concurrent Reflection Removal Network; DenseASPP: Densely Connected Networks for Semantic Segmentation; SGAN: An Alternative Training of Generative Adversarial Networks; Multi-Agent Diverse Generative Adversarial Networks; Robust Depth Estimation from Auto Bracketed Images; AdaDepth: Unsupervised Content Congruent Adaptation for Depth Estimation; DeepMVS: Learning Multi-View Stereopsis; GeoNet: Unsupervised Learning of Dense Depth, Optical Flow and Camera Pose; GeoNet: Geometric Neural Network for Joint Depth and Surface Normal Estimation; Single-Image Depth Estimation Based on Fourier Domain Analysis; Single View Stereo Matching; Pyramid Stereo Matching Network; A Unifying Contrast Maximization Framework for Event Cameras, with Applications to Motion, Depth, and Optical Flow Estimation; Image Correction via Deep Reciprocating HDR Transformation; Occlusion Aware Unsupervised Learning of Optical Flow; PAD-Net: Multi-Tasks Guided Prediction-and-Distillation Network for Simultaneous Depth Estimation and Scene Parsing; Surface Networks; Structured Attention Guided Convolutional Neural Fields for Monocular Depth Estimation; TextureGAN: Controlling Deep Image Synthesis with Texture Patches; Aperture Supervision for Monocular Depth Estimation; Two-Stream Convolutional Networks for Dynamic Texture Synthesis; Unsupervised Learning of Single View Depth Estimation and Visual Odometry with Deep Feature Reconstruction; Left/Right Asymmetric Layer Skippable Networks; Learning to See in the Dark

ykdojo / Super Voice Assistant

179

macOS voice assistant with global hotkeys - transcribe speech to text with offline models (WhisperKit or Parakeet) or cloud-based Gemini API, capture and transcribe screen recordings with visual context, and read selected text aloud with Gemini Live.

gemini cli

Updated 5d ago

hjerpbakk / OpenFolderInVSCode

155

With this macOS service, you can quickly open any folder as a project in Visual Studio Code from the Finder's context menu.

vscode copilot

automator-workflow, bash-script, finder, +4

Updated 6d ago

virtualgenius / Contextflow

145

Visual DDD context mapper with Flow and Strategic views for analyzing bounded contexts and their relationships

universal

Updated 1d ago

KaihuaTang / VCTree Scene Graph Generation

123

Code for the Scene Graph Generation part of CVPR 2019 oral paper: "Learning to Compose Dynamic Tree Structures for Visual Contexts"

universal

Updated 3mo ago

gcucurull / Visual Compatibility

121

Context-Aware Visual Compatibility Prediction (https://arxiv.org/abs/1902.03646)

universal

cvpr2019, fashion, fashionai, +3

Updated 2d ago

lancopku / Livebot

121

LiveBot: Generating Live Video Comments Based on Visual and Textual Contexts (AAAI 2019)

universal

Updated 3mo ago

cdoersch / Deepcontext

119

Author's implementation of 'Unsupervised Visual Representation Learning by Context Prediction'

universal

Updated 2mo ago

IDEA-Research / DINO X MCP

117

Official DINO-X Model Context Protocol (MCP) server that empowers LLMs with real-world visual perception through image object detection, localization, and captioning APIs.

claude code, cursor

image-recognition, mcp, mcp-server, +2

Updated 9h ago

xiaozhen228 / VCP CLIP

103

(ECCV 2024) VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation

universal

Updated 4d ago

petiky / MCP Manager

A visual client tool for managing MCP (Model Context Protocol). With this tool, you can manage and operate the MCP environment without manually performing complex command-line operations.

claude code, cursor

Updated 1d ago

kevalmorabia97 / CoVA Web Object Detection

A Context-aware Visual Attention-based training pipeline for Object Detection from a Webpage screenshot!

universal

attention, computer-vision, convolutional-neural-networks, +8

Updated 1mo ago