UMGF: Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance

This repository contains the source code for the paper: UMGF: Multi-modal Graph Fusion for Named Entity Recognition with Targeted Visual Guidance

Install

python3.7
transformers==3.4.0
torch==1.7.1
pytorch-crf==0.7.2
pillow==7.1.2
tqdm==4.62.3
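
To confirm that an environment matches the pins above, a small check along the following lines may help. This is only a sketch: the PINS mapping copies the versions listed above, while `installed_version`, `find_mismatches`, and the injectable `get_version` argument are hypothetical names and not part of this repository. On Python 3.7 it assumes the `importlib_metadata` backport is installed.

```python
try:
    from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
except ImportError:  # Python 3.7: assumes the importlib_metadata backport
    from importlib_metadata import PackageNotFoundError, version

# Versions copied from the Install list above.
PINS = {
    "transformers": "3.4.0",
    "torch": "1.7.1",
    "pytorch-crf": "0.7.2",
    "pillow": "7.1.2",
    "tqdm": "4.62.3",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins=PINS, get_version=installed_version):
    """Map each package that deviates from its pin to (expected, found)."""
    mismatches = {}
    for name, expected in pins.items():
        found = get_version(name)
        # torch wheels may carry a local build suffix such as "+cu110"; ignore it.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches
```

Calling `find_mismatches()` with no arguments returns an empty dict when every pinned package is installed at the listed version.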

Dataset

You can download the original data from UMT.

Preprocess

Image

Download the Twitter images from UMT.
To detect visual objects, follow onestage_grounding, or download the detected objects directly from twitter2015_img.tar.gz (password: l75t) and twitter2017_img.tar.gz (password: 2017).
Unzip the archives and put the images under the corresponding folder (e.g. ./data/twitter2015/image).
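
The unzip step can be scripted roughly as follows. This is a sketch under assumptions: `extract_images` and `is_safe` are hypothetical helpers, and the internal layout of the archives is not documented here, so if an archive contains a top-level folder the images may need to be moved up one level afterwards.

```python
import tarfile
from pathlib import Path


def is_safe(member):
    """Accept only regular files whose paths stay inside the destination."""
    path = Path(member.name)
    return member.isfile() and not path.is_absolute() and ".." not in path.parts


def extract_images(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir; return the number of files extracted."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        members = [m for m in tar.getmembers() if is_safe(m)]
        tar.extractall(dest, members=members)
    return len(members)
```

For example, `extract_images("twitter2015_img.tar.gz", "./data/twitter2015/image")` would unpack the first archive into the folder named above, assuming the archive has been downloaded to the working directory.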

Text

The preprocessed text is provided under the ./my_data/ folder.

Run

Train

python ddp_mmner.py --do_train --txtdir=./my_data/twitter2015 --imgdir=./data/twitter2015/image --ckpt_path=./model.pt --num_train_epoch=30 --train_batch_size=16 --lr=0.0001 --seed=2019

Test

python ddp_mmner.py --do_test --txtdir=./my_data/twitter2015 --imgdir=./data/twitter2015/image --ckpt_path=./model.pt --test_batch_size=32
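
When switching between the two datasets, it can be convenient to build the train and test commands programmatically. The sketch below reuses only the flags and values shown in the commands above; `build_command` and `run` are hypothetical helpers, and the twitter2017 paths (./my_data/twitter2017 and ./data/twitter2017/image) are assumed by analogy with the twitter2015 layout.

```python
import subprocess


def build_command(mode, dataset="twitter2015", ckpt_path="./model.pt"):
    """Return the ddp_mmner.py argument list for mode 'train' or 'test'."""
    if mode not in ("train", "test"):
        raise ValueError("mode must be 'train' or 'test'")
    cmd = [
        "python", "ddp_mmner.py", f"--do_{mode}",
        f"--txtdir=./my_data/{dataset}",      # assumed layout for twitter2017
        f"--imgdir=./data/{dataset}/image",   # assumed layout for twitter2017
        f"--ckpt_path={ckpt_path}",
    ]
    if mode == "train":
        # Hyperparameters copied from the Train command above.
        cmd += ["--num_train_epoch=30", "--train_batch_size=16",
                "--lr=0.0001", "--seed=2019"]
    else:
        cmd += ["--test_batch_size=32"]
    return cmd


def run(mode, dataset="twitter2015"):
    """Execute the command from the repository root; raises if the script fails."""
    subprocess.run(build_command(mode, dataset), check=True)
```

With these helpers, `run("train")` followed by `run("test")` would reproduce the two commands above for twitter2015.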

A checkpoint for twitter2015 (password: j9ib) has been provided.
A checkpoint for twitter2017 (password: 2017) has been provided.

Acknowledgements

Using these two datasets means you have read and accepted the copyrights set by Twitter and the dataset providers.
Parts of the code are from:
