Dmmcs
[ACL 2024 Findings] Distance from Median Maximum Cosine Similarity
DMMCS: A Data-driven Guided Decoding Mechanism for Diagnostic Captioning
Distance from Median Maximum Cosine Similarity (DMMCS)
This repository contains the official codebase for DMMCS (Distance from Median Maximum Cosine Similarity), our data-driven guided decoding algorithm published in Findings of ACL 2024. Our paper, "A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning", is available at https://aclanthology.org/2024.findings-acl.444.
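To give a rough intuition for the name, the following is a minimal sketch and not the repository's implementation. It assumes that each tag and each caption word is represented by an embedding vector, that the "maximum cosine similarity" is taken between one tag and all words of a caption, and that the "distance from median" compares this value with the median observed for that tag in the training data. The function names, the use of NumPy, and the choice of absolute difference as the distance are all assumptions made for illustration; the paper defines the exact formulation.

```python
import numpy as np


def max_cosine_similarity(tag_vec, word_vecs):
    """Highest cosine similarity between one tag embedding and any word embedding.

    tag_vec: array of shape (d,); word_vecs: array of shape (n_words, d).
    Both names are hypothetical and used only for this illustration.
    """
    tag_vec = np.asarray(tag_vec, dtype=float)
    word_vecs = np.asarray(word_vecs, dtype=float)
    tag_unit = tag_vec / np.linalg.norm(tag_vec)
    word_units = word_vecs / np.linalg.norm(word_vecs, axis=1, keepdims=True)
    # Dot products of unit vectors are cosine similarities.
    return float(np.max(word_units @ tag_unit))


def distance_from_median(mcs_value, training_mcs_values):
    """Absolute gap between a candidate's value and the training-set median.

    Assumption: absolute difference is used as the simplest notion of distance.
    """
    return abs(mcs_value - float(np.median(training_mcs_values)))


# Toy example with 2-dimensional embeddings.
tag = [1.0, 0.0]
caption_words = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
mcs = max_cosine_similarity(tag, caption_words)
penalty = distance_from_median(mcs, [0.2, 0.4, 0.9])
```

Under these assumptions, a candidate caption whose similarity to a tag is far from what is typical in the training data receives a larger value, which a guided decoder could use to rank candidates.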
Installation
To get started with our framework, follow these steps to clone the repository and install the required packages. We recommend using a virtual environment for package installation to ensure a clean and isolated setup.
Step 1: Clone the repository
git clone https://github.com/nlpaueb/dmmcs.git
cd dmmcs
Step 2: Create and activate a virtual environment
We have tested our framework with both Conda and Virtualenv environments.
Conda
conda create -n dmmcs_venv python=3.9
conda activate dmmcs_venv
pip install -r requirements.txt
Virtualenv
virtualenv dmmcs_venv
source dmmcs_venv/bin/activate
pip install -r requirements.txt
Usage
Step 1: Calculate your data-specific stats
First, download the en_core_web_sm model for the spaCy library.
python -m spacy download en_core_web_sm
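If you want to confirm that the download succeeded before moving on, a small check like the one below may help. It relies on the fact that spaCy models are installed as regular Python packages; the helper name is hypothetical and not part of this repository.

```python
import importlib.util


def is_package_installed(name):
    """Return True if a top-level Python package with this name can be found."""
    return importlib.util.find_spec(name) is not None


# True once the download command above has completed in the active environment.
model_ready = is_package_installed("en_core_web_sm")
```

Run it inside the same virtual environment you created during installation, since packages are resolved per environment.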
Then, run the stats_extraction.py script.
python3 utils/stats_extraction.py --config config/stats_extractor_config.json
Please make sure to adjust config/stats_extractor_config.json so that its paths match your local directories.
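A quick way to catch stale paths before running the script is to scan the config for path-like values that do not exist on disk. The sketch below makes no assumption about the key names in the config; it simply walks the JSON and treats any string containing a path separator as a candidate path. The function name is hypothetical, and output locations that the script is expected to create will also be reported, so read the result as a hint rather than an error list.

```python
import json
import os


def find_missing_paths(config_path):
    """Return (key, value) pairs for path-like strings that do not exist on disk."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    missing = []

    def walk(value, key_trail):
        if isinstance(value, dict):
            for key, child in value.items():
                walk(child, key_trail + [str(key)])
        elif isinstance(value, list):
            for index, child in enumerate(value):
                walk(child, key_trail + [str(index)])
        elif isinstance(value, str) and ("/" in value or os.sep in value):
            # Heuristic: a string with a separator is treated as a path.
            if not os.path.exists(value):
                missing.append((".".join(key_trail), value))

    walk(config, [])
    return missing
```

For example, calling find_missing_paths("config/stats_extractor_config.json") from the repository root returns an empty list when every path-like value resolves.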
This script generates four files that will be needed for our guided-decoding mechanism. These files can be found under the snapshots/artifacts directory.
Step 2: Run training and/or inference
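You can train an InstructBLIP model and/or run inference with it using the proposed guided-decoding mechanism. The relative path ../config/config.json in the command below implies that it is run from the subdirectory containing the script: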
python3 instructBLIP-ft.py --config ../config/config.json
Please make sure to adjust the config/config.json file to your own local paths and directories.
Set the do_dmmcs option to True to use the DMMCS guided-decoding mechanism during inference instead of vanilla beam search.
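If you prefer to toggle the option from a script rather than by hand, something along these lines could work. It assumes that do_dmmcs is a top-level key in the config and that the script accepts a JSON boolean; check your copy of config/config.json, since neither detail is stated here. The helper name is hypothetical.

```python
import json


def set_do_dmmcs(config_path, enabled=True):
    """Rewrite the config with the do_dmmcs flag set, keeping all other keys."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)

    # Assumption: the flag lives at the top level of the JSON object.
    config["do_dmmcs"] = enabled

    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return config
```

Calling set_do_dmmcs("config/config.json", False) switches back to vanilla beam search under the same assumptions. Note that the file is rewritten with two-space indentation, so its formatting may change.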
License
This repository is licensed under the MIT license. See LICENSE for more details.
Contact
For any questions, inquiries or suggestions, please feel free to reach out at pkaliosis@aueb.gr and/or annis@aueb.gr.
Citation
If you use our work, please cite it using the following BibTeX reference:
@inproceedings{kaliosis-etal-2024-data,
  title = "A Data-Driven Guided Decoding Mechanism for Diagnostic Captioning",
  author = "Kaliosis, Panagiotis and
    Pavlopoulos, John and
    Charalampakos, Foivos and
    Moschovis, Georgios and
    Androutsopoulos, Ion",
  editor = "Ku, Lun-Wei and
    Martins, Andre and
    Srikumar, Vivek",
  booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
  month = aug,
  year = "2024",
  address = "Bangkok, Thailand and virtual meeting",
  publisher = "Association for Computational Linguistics",
  url = "https://aclanthology.org/2024.findings-acl.444",
  pages = "7450--7466",
}
