Imageinwords
Data release for the ImageInWords (IIW) paper.
Install / Use
/learn @google/ImageinwordsREADME
arXiv: https://arxiv.org/abs/2405.02793
Please visit the webpage for all the information about the IIW project, data, visualizations, and much more. The data can be downloaded directly from the datasets/ folder, as well as from Huggingface (see below).
Please reach out to iiw-dataset@google.com for thoughts/feedback/questions/collaborations.
License: CC-BY-4.0
<h3>Other resources</h3> <h4>🤗Hugging Face🤗</h4> <li><a href="https://huggingface.co/datasets/google/imageinwords">IIW-Benchmark Eval Dataset</a></li>from datasets import load_dataset
# `name` can be one of: IIW-400, DCI_Test, DOCCI_Test, CM_3600, LocNar_Eval
# refer: https://github.com/google/imageinwords/blob/main/datasets/README.md
dataset = load_dataset('google/imageinwords', token=None, name="IIW-400", trust_remote_code=True)
<li><a href="https://huggingface.co/spaces/google/imageinwords-explorer">Dataset-Explorer</a></li>
<h3>Cite</h3>
If you use our data or refer to our work, please include the following citation
@misc{garg2024imageinwords,
title={ImageInWords: Unlocking Hyper-Detailed Image Descriptions},
author={Roopal Garg and Andrea Burns and Burcu Karagol Ayan and Yonatan Bitton and Ceslee Montgomery and Yasumasa Onoe and Andrew Bunner and Ranjay Krishna and Jason Baldridge and Radu Soricut},
year={2024},
eprint={2405.02793},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
Related Skills
node-connect
345.9kDiagnose OpenClaw node connection and pairing failures for Android, iOS, and macOS companion apps
frontend-design
106.4kCreate distinctive, production-grade frontend interfaces with high design quality. Use this skill when the user asks to build web components, pages, or applications. Generates creative, polished code that avoids generic AI aesthetics.
openai-whisper-api
345.9kTranscribe audio via OpenAI Audio Transcriptions API (Whisper).
qqbot-media
345.9kQQBot 富媒体收发能力。使用 <qqmedia> 标签,系统根据文件扩展名自动识别类型(图片/语音/视频/文件)。
