rerun-io / Rerun: An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data
apache / Seatunnel: SeaTunnel is a multimodal, high-performance, distributed, massive data integration tool.
lance-format / Lance: Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, and PyTorch with more integrations coming.
Eventual-Inc / Daft: High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
NVlabs / VILA: VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.
docarray / Docarray: Represent, send, store and search multimodal data
datachain-ai / Datachain: Analytics, Versioning and ETL for multimodal data: video, audio, PDFs, images
OpenGVLab / InternVideo: [ECCV2024] Video Foundation Models & Data for Multimodal Understanding
Data Infrastructure providing a declarative, incremental approach for multimodal AI workloads.
jrzaurin / Pytorch Widedeep: A flexible package for multimodal-deep-learning to combine tabular data with text and images using Wide and Deep models in Pytorch
xtreme1-io / Xtreme1: Xtreme1 is an all-in-one data labeling and annotation platform for multimodal data training and supports 3D LiDAR point cloud, image, and LLM.
lhotse-speech / Lhotse: Tools for handling multimodal data in machine learning projects.
lupantech / ScienceQA: Data and code for NeurIPS 2022 Paper "Learn to Explain: Multimodal Reasoning via Thought Chains for Science Question Answering".
georgian-io / Multimodal Toolkit: Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
Flame-Code-VLM / Flame Code VLM: Flame is an open-source multimodal AI system designed to translate UI design mockups into high-quality React code. It leverages vision-language modeling, automated data synthesis, and structured training workflows to bridge the gap between design and front-end development.
MultimodalUniverse / MultimodalUniverse: Large-Scale Multimodal Dataset of Astronomical Data
drmuskangarg / Multimodal Datasets: This repository is built in association with our position paper, "Multimodality for NLP-Centered Applications: Resources, Advances and Frontiers". As part of this release, we share information about recent multimodal datasets that are available for research purposes. We found that although 100+ multimodal language resources are available in the literature for various NLP tasks, publicly available multimodal datasets remain under-explored for reuse in subsequent problem domains.
friedrichor / Awesome Multimodal Papers: A curated list of awesome Multimodal studies.
healthylaife / MIMIC IV Data Pipeline: A customizable pipeline for multimodal data extraction from MIMIC-IV!
ilaria-manco / Multimodal Ml Music: List of academic resources on Multimodal ML for Music