UMGF
Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance
This repository contains the source code for the paper: UMGF: Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance
Install
- python3.7
- transformers==3.4.0
- torch==1.7.1
- pytorch-crf==0.7.2
- pillow==7.1.2
- tqdm==4.62.3
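The pins above can be checked against the active environment before training. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `check_pins` are hypothetical, and it assumes the packages were installed under the distribution names listed above. On Python 3.7, `importlib.metadata` is not in the standard library, so the sketch falls back to the `importlib_metadata` backport, which would have to be installed separately.

```python
# Pinned versions copied from the list above.
PINS = {
    "transformers": "3.4.0",
    "torch": "1.7.1",
    "pytorch-crf": "0.7.2",
    "pillow": "7.1.2",
    "tqdm": "4.62.3",
}

try:
    from importlib.metadata import version, PackageNotFoundError  # Python 3.8+
except ImportError:  # Python 3.7: requires the importlib_metadata backport
    from importlib_metadata import version, PackageNotFoundError


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {package: (expected, found)} for every pin that does not match."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # torch wheels may carry a local suffix such as "+cu110"; ignore it.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in check_pins(PINS).items():
        print(f"{name}: expected {expected}, found {found}")
```

An empty result from `check_pins(PINS)` means every pinned package is present at the listed version.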
Dataset
- You can download original data from UMT
Preprocess
Image
- Download twitter images from UMT
- To detect visual objects, follow onestage_grounding, or download them directly from twitter2015_img.tar.gz (password: l75t) and twitter2017_img.tar.gz (password: 2017)
- Extract the archives and put the images under the corresponding folder (e.g. ./data/twitter2015/image)
Text
- The preprocessed text is already provided under the ./my_data/ folder
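Once the images and text are in place, the directory layout can be checked before launching a run. This is a small sketch with hypothetical helper names (`expected_dirs`, `missing_dirs`); it assumes the layout implied by the example paths in this README, namely ./my_data/<dataset> for text and ./data/<dataset>/image for images, and that twitter2017 follows the same pattern as twitter2015.

```python
from pathlib import Path


def expected_dirs(dataset, root="."):
    """Directories the run commands read from, for one dataset name.

    The layout is an assumption based on the example paths in this README.
    """
    root = Path(root)
    return {
        "txtdir": root / "my_data" / dataset,
        "imgdir": root / "data" / dataset / "image",
    }


def missing_dirs(dataset, root="."):
    """Return the names of expected directories that are missing or empty."""
    missing = []
    for name, path in expected_dirs(dataset, root).items():
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    for dataset in ("twitter2015", "twitter2017"):
        print(dataset, "missing:", missing_dirs(dataset))
```

An empty list for a dataset means both of its directories exist and contain at least one entry; the sketch does not verify file contents.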
Run
Train
python ddp_mmner.py --do_train --txtdir=./my_data/twitter2015 --imgdir=./data/twitter2015/image --ckpt_path=./model.pt --num_train_epoch=30 --train_batch_size=16 --lr=0.0001 --seed=2019
Test
python ddp_mmner.py --do_test --txtdir=./my_data/twitter2015 --imgdir=./data/twitter2015/image --ckpt_path=./model.pt --test_batch_size=32
- A checkpoint trained on twitter2015 (password: j9ib) has been provided.
- A checkpoint trained on twitter2017 (password: 2017) has been provided.
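To switch between datasets or tweak hyperparameters without retyping the commands, the argument list can be assembled in Python. The sketch below is illustrative only: `build_command` is a hypothetical helper, it emits only the flags that appear in the train and test commands above, and it assumes the twitter2017 paths mirror the twitter2015 ones.

```python
import sys


def build_command(mode, dataset="twitter2015", ckpt_path="./model.pt", **overrides):
    """Build the argv list for ddp_mmner.py using only flags shown in this README."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    # Defaults mirror the values in the commands above.
    if mode == "train":
        options = {"num_train_epoch": 30, "train_batch_size": 16, "lr": 0.0001, "seed": 2019}
    else:
        options = {"test_batch_size": 32}
    # Reject flags this README does not document for the chosen mode.
    unknown = set(overrides) - set(options)
    if unknown:
        raise ValueError(f"flags not shown in the README for {mode}: {sorted(unknown)}")
    options.update(overrides)
    argv = [
        sys.executable, "ddp_mmner.py", f"--do_{mode}",
        f"--txtdir=./my_data/{dataset}",
        f"--imgdir=./data/{dataset}/image",
        f"--ckpt_path={ckpt_path}",
    ]
    argv += [f"--{key}={value}" for key, value in options.items()]
    return argv


if __name__ == "__main__":
    # Print the command instead of running it; pass the list to
    # subprocess.run(..., check=True) from the repository root to execute it.
    print(" ".join(build_command("train")))
```

For example, `build_command("test", dataset="twitter2017", ckpt_path="./model_2017.pt")` yields the test command pointed at the twitter2017 folders; the checkpoint filename here is a placeholder.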
Acknowledgements
- Using these two datasets means you have read and accepted the copyrights set by Twitter and dataset providers.
- Parts of the code are from:
