7 repositories found
CASIA-IVA-Lab / VAST
[NeurIPS 2023] Code and model for VAST: A Vision-Audio-Subtitle-Text Omni-Modality Foundation Model and Dataset
RunpeiDong / ACT
[ICLR 2023] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?
cclaess / SPECTRE
[CVPR 2026] Code and models for SPECTRE: Self-Supervised & Cross-Modal Pretraining for CT Representation Extraction.
mshukor / VLPCook
Official implementation of VLPCook: Vision and Structured-Language Pretraining for Cross-Modal Food Retrieval
mlbio-epfl / STRUCTURE
[NeurIPS 2025] TL;DR: Aligning pretrained unimodal models with the proposed framework using limited paired data yields gains of ~52% in cross-modal zero-shot classification and ~92% in retrieval (see the sketch after this list).
chincharles / u-emo
[TIP 2025] Official code for "UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception".
Heidelberg-NLP / counting-probe
Counting dataset for Vision & Language models, introduced in the paper "Seeing Past Words: Testing the Cross-Modal Capabilities of Pretrained V&L Models" (https://arxiv.org/abs/2012.12352)
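
For context on the kind of alignment the STRUCTURE entry above describes, here is a minimal sketch, not the paper's actual method: two frozen pretrained unimodal encoders are bridged by small trainable projection heads, fit with a symmetric contrastive (CLIP-style InfoNCE) loss on a limited set of paired examples. The encoders, dimensions, and hyperparameters below are hypothetical stand-ins.

```python
# Minimal sketch (assumed setup, not STRUCTURE's algorithm): align two frozen
# unimodal encoders by training only lightweight projection heads on paired data.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
IMG_DIM, TXT_DIM, SHARED_DIM = 768, 512, 256  # hypothetical dimensions

# Frozen "pretrained" encoders, faked here as fixed random linear maps.
image_encoder = torch.nn.Linear(1024, IMG_DIM).requires_grad_(False)
text_encoder = torch.nn.Linear(300, TXT_DIM).requires_grad_(False)

# Only these projections are trained on the limited paired set.
img_proj = torch.nn.Linear(IMG_DIM, SHARED_DIM)
txt_proj = torch.nn.Linear(TXT_DIM, SHARED_DIM)
opt = torch.optim.Adam(
    list(img_proj.parameters()) + list(txt_proj.parameters()), lr=1e-3
)

# A small batch of (image, text) pairs stands in for the paired data.
images, texts = torch.randn(32, 1024), torch.randn(32, 300)

for step in range(100):
    z_img = F.normalize(img_proj(image_encoder(images)), dim=-1)
    z_txt = F.normalize(txt_proj(text_encoder(texts)), dim=-1)
    logits = z_img @ z_txt.t() / 0.07       # temperature-scaled similarities
    labels = torch.arange(len(images))      # matched pairs lie on the diagonal
    # Symmetric InfoNCE over image->text and text->image directions.
    loss = (F.cross_entropy(logits, labels)
            + F.cross_entropy(logits.t(), labels)) / 2
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After such alignment, cross-modal zero-shot classification and retrieval reduce to nearest-neighbor search between projected embeddings from the two modalities.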