Fnet
Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings
Publication
Please use the following BibTeX code for citing this work.
@InProceedings{abhishek-anand-awekar:2017:EACLlong,
author = {Abhishek, Abhishek and Anand, Ashish and Awekar, Amit},
title = {Fine-Grained Entity Type Classification by Jointly Learning Representations and Label Embeddings},
booktitle = {Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics: Volume 1, Long Papers},
month = {April},
year = {2017},
address = {Valencia, Spain},
publisher = {Association for Computational Linguistics},
pages = {797--807},
url = {http://www.aclweb.org/anthology/E17-1075}
}
Compatibility with TensorFlow 1.12
An updated version of the main code, compatible with TensorFlow 1.12, is available at https://github.com/abhipec/FgEC. The transfer learning experiments are not part of that code.
data
Download the necessary data as per the instructions in the data/processed/f1/README.md file.
Directory structure:
- /home/
  - EACL-2017
    - fnet
    - glove.840B.300d
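Assuming EACL-2017 lives directly under your home directory (as the ~/EACL-2017 paths used throughout this README do), one way to create the layout:

# create the expected directory layout (paths assumed from the commands in this README)
mkdir -p ~/EACL-2017/fnet
mkdir -p ~/EACL-2017/glove.840B.300d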
dependencies
The Python 3 version of the TensorFlow (0.10.0rc0) framework is used in this experiment.
pip install numpy docopt pandas plotly matplotlib scipy scikit-learn
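TensorFlow 0.10.0rc0 predates the versions published on PyPI, so it has to be installed from a wheel URL. A sketch of one way to do this, assuming the historical wheel location from the TensorFlow 0.10 documentation (CPU, Python 3.5, Linux) is still served:

# assumption: the historical TF 0.10.0rc0 wheel URL is still available
export TF_BINARY_URL=https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.10.0rc0-cp35-cp35m-linux_x86_64.whl
pip install --upgrade $TF_BINARY_URL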
Compile the C++ libraries:
cd src/lib
bash compile_gcc_5.bash
run
cd src
bash scripts/BBN.bash
bash scripts/OntoNotes.bash
bash scripts/Wiki.bash
This will create model checkpoints in the ckpt directory.
Please have a look at the scripts and modify the necessary variables.
Report results:
python report_results.py ~/EACL-2017/fnet/ckpt/
Feature level transfer learning experiment
Download the necessary data as per the instructions in the data/processed/f4/README.md file.
bash scripts/tl.bash
python report_results.py ~/EACL-2017/fnet/ckpt/
Preprocessing steps (Optional)
These steps convert the original data (https://github.com/shanzhenren/AFET) to the TFRecord format used in this code.
Download the necessary data as per the instructions in the data/AFET/dataset/README.md file.
Also download and extract the GloVe vectors (http://nlp.stanford.edu/data/glove.840B.300d.zip) into the glove.840B.300d directory.
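For example, assuming the directory layout shown above:

# download GloVe vectors and extract them into the expected directory
wget http://nlp.stanford.edu/data/glove.840B.300d.zip
unzip glove.840B.300d.zip -d ~/EACL-2017/glove.840B.300d/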
Dataset names used: BBN, Wiki and OntoNotes.
Preprocess the data and generate the train, development, and test sets:
cd data_processing/
python sanitizer.py BBN ~/EACL-2017/fnet/data/AFET/ 10 ~/EACL-2017/fnet/data/sanitized/
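The command above sanitizes BBN only. Assuming the other datasets take the same arguments (including the threshold value 10, which is an assumption, not stated in the README), all three can be processed in one loop:

# sketch: sanitize all three datasets with the same arguments
for dataset in BBN Wiki OntoNotes; do
  python sanitizer.py "$dataset" ~/EACL-2017/fnet/data/AFET/ 10 ~/EACL-2017/fnet/data/sanitized/
done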
Convert JSON to TFRecord format:
python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f1 ~/EACL-2017/fnet/data/processed/
python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f2 ~/EACL-2017/fnet/data/processed/
python data_processing/json_to_tfrecord.py BBN ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt f3 ~/EACL-2017/fnet/data/processed/
| data_format | alias | remarks |
|---|---|---|
| our | f1 | Used in our, our-NoM, our-AllC |
| Attentive | f2 | Used in Attentive |
| transfer-learning-model | f3 | Used in model level transfer learning |
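The three commands above cover BBN only. Assuming the other datasets take the same arguments, a sketch that generates every dataset/format combination from the table:

# sketch: convert every dataset to each of the three data formats
for dataset in BBN Wiki OntoNotes; do
  for fmt in f1 f2 f3; do
    python data_processing/json_to_tfrecord.py "$dataset" ~/EACL-2017/fnet/data/sanitized/ ~/EACL-2017/glove.840B.300d/glove.840B.300d.txt "$fmt" ~/EACL-2017/fnet/data/processed/
  done
done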
Transfer learning experiments
1. Train our model on the Wiki dataset.
2. Note down its uid.
3. Modify the ../ckpt/uid/checkpoint file such that it points to the best-performing checkpoint (a sketch of this file follows the list).
4. Change the fintune_directory parameter in the following scripts to include the uid noted in step 2.
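The checkpoint file written by TensorFlow is plain text. A hypothetical example, assuming a uid of Wiki_1.2 and a best-performing step of 12000 (both invented for illustration):

# contents of ../ckpt/Wiki_1.2/checkpoint; set model_checkpoint_path to the best checkpoint
model_checkpoint_path: "model.ckpt-12000"
all_model_checkpoint_paths: "model.ckpt-10000"
all_model_checkpoint_paths: "model.ckpt-12000"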
Model level transfer learning
bash scripts/transfer_learning_model.bash
Feature level transfer learning
bash scripts/transfer_learning_feature_dumping.bash
bash scripts/tl.bash
Report results
python report_results.py ~/EACL-2017/fnet/ckpt/
type-wise analysis
Please change the dataset and the path of the result file that needs to be analysed type-wise.
python class_wise_analysis.py --all_labels_file=../data/sanitized/BBN/sanitized_labels.txt --json_file=../data/sanitized/BBN/sanitized_test.json --result_file=../ckpt/Wiki_1.2/result_7.txt --dataset=Wiki
