HAABSA

Code for A Hybrid Approach for Aspect-Based Sentiment Analysis Using a Lexicalized Domain Ontology and Attentional Neural Models

All software is written in Python 3 (https://www.python.org/) and makes use of the TensorFlow framework (https://www.tensorflow.org/).

Installation Instructions (Windows):

Download ontology: https://github.com/KSchouten/Heracles/tree/master/src/main/resources/externalData
Download SemEval2015 Datasets: http://alt.qcri.org/semeval2015/task12/index.php?id=data-and-tools
Download SemEval2016 Dataset: http://alt.qcri.org/semeval2016/task5/index.php?id=data-and-tools
Download GloVe Embeddings: http://nlp.stanford.edu/data/glove.42B.300d.zip
Download Stanford CoreNLP parser: https://nlp.stanford.edu/software/stanford-parser-full-2018-02-27.zip
Download Stanford CoreNLP Language models: https://nlp.stanford.edu/software/stanford-english-corenlp-2018-02-27-models.jar
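
As a quick sanity check on the downloaded embeddings, the following minimal sketch parses GloVe's plain-text format (one word per line, followed by its space-separated vector values). The function name read_glove and its parameters are hypothetical and are not part of the HAABSA code; the sample uses 3 dimensions instead of 300 so that it runs without the real file.

```python
import io


def read_glove(lines, wanted=None, dim=300):
    """Parse GloVe text lines ("word v1 v2 ... v_dim") into a dict.

    Hypothetical helper for illustration; not taken from the HAABSA code.
    """
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        word, values = parts[0], parts[1:]
        if len(values) != dim:
            continue  # skip malformed or truncated lines
        if wanted is not None and word not in wanted:
            continue  # keep only the vocabulary that is actually needed
        vectors[word] = [float(v) for v in values]
    return vectors


# Tiny in-memory stand-in for the real file (3 dimensions instead of 300).
sample = io.StringIO("food 0.1 0.2 0.3\nservice -0.4 0.5 0.6\n")
embeddings = read_glove(sample, wanted={"food"}, dim=3)
print(embeddings)
```

For the real embeddings, pass an open file handle for the unzipped text file (assumed here to be named glove.42B.300d.txt) and leave dim at 300. Restricting the vocabulary with wanted keeps memory use manageable, since the full file is several gigabytes.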

Install chocolatey (a package manager for Windows): https://chocolatey.org/install
Open a command prompt.
Install python3 by running the following command: code(choco install python) (http://docs.python-guide.org/en/latest/starting/install3/win/).
Make sure that pip is installed and use pip to install the following packages: setuptools and virtualenv (http://docs.python-guide.org/en/latest/dev/virtualenvs/#virtualenvironments-ref).
Create a virtual environment in a desired location by running the following command: code(virtualenv ENV_NAME)
Navigate to the virtual environment source directory.
Unzip the HAABSA_software.zip file in the virtual environment directory.
Activate the virtual environment by running the following command: code(Scripts\activate.bat).
Install the required packages from the requirements.txt file by running the following command: code(pip install -r requirements.txt).
Install the required spaCy language pack by running the following command: code(python -m spacy download en)

Configure one of the three main files (main.py, main_cross.py, main_hyper.py) as required.
Run the program from the command line by the following command: code(python PROGRAM_TO_RUN.py) (where PROGRAM_TO_RUN is main/main_cross/main_hyper)

The environment contains the following main files that can be run: main.py, main_cross.py, main_hyper.py

main.py: program that performs single in-sample and out-of-sample validation runs. Each method can be activated by setting its corresponding boolean to True, e.g. to run the CABASC method set runCABASC = True.
main_cross.py: similar to main.py but runs a 10-fold cross validation procedure for each method.
main_hyper.py: program that performs hyperparameter optimization over a given space of hyperparameters for each method. To change the method, change the objective and space parameters in the run_a_trial() function.
config.py: contains parameter configurations that can be changed such as: dataset_year, batch_size, iterations.
dataReader2016.py, loadData.py: files used to read in the raw data and transform them to the required formats to be used by one of the algorithms
lcrModel.py: TensorFlow implementation for the LCR-Rot algorithm
lcrModelAlt.py: TensorFlow implementation for the LCR-Rot-hop algorithm
lcrModelInverse.py: TensorFlow implementation for the LCR-Rot-inv algorithm
cabascModel.py: TensorFlow implementation for the CABASC algorithm
OntologyReasoner.py: Python implementation for the ontology reasoner
svmModel.py: Python implementation for a BoW model using an SVM.
att_layer.py, nn_layer.py, utils.py: programs that declare additional functions used by the machine learning algorithms.
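
The 10-fold cross validation procedure mentioned for main_cross.py can be illustrated with a minimal sketch using only the standard library. This is an assumption about the general technique, not the repository's actual splitting code; the names k_fold_indices, cross_validate and dummy_evaluate are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


def cross_validate(n_samples, evaluate, k=10):
    """Average the score returned by evaluate(train_idx, test_idx) over k folds."""
    scores = [evaluate(train, test) for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


def dummy_evaluate(train_idx, test_idx):
    """Placeholder for training and testing a model; returns the held-out fraction."""
    return len(test_idx) / (len(train_idx) + len(test_idx))


mean_score = cross_validate(100, dummy_evaluate, k=10)
print(round(mean_score, 3))
```

In practice, dummy_evaluate would be replaced by a function that trains one of the methods on the training indices and returns its accuracy on the test indices.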

The following directories are necessary for the virtual environment setup: __pycache__, \Include, \Lib, \Scripts, \tcl, \venv

cross_results_2015: Results for a k-fold cross validation process for the SemEval-2015 dataset
cross_results_2016: Results for a k-fold cross validation process for the SemEval-2016 dataset
data:
- externalData: Location for the external data required by the methods
- programGeneratedData: Location for preprocessed data that is generated by the programs
hyper_results: Contains the stored results for hyperparameter optimization for each method
results: temporary store location for the hyperopt package
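
To illustrate how an objective and a search space fit together in hyperparameter optimization, here is a minimal sketch. It deliberately uses plain random search from the standard library rather than the hyperopt package, so it does not show hyperopt's API or the contents of run_a_trial(). The space, the toy objective and all names are hypothetical.

```python
import random

# Hypothetical search space: each hyperparameter maps to its candidate values.
space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
}


def objective(params):
    """Toy loss standing in for a validation error; lower is better."""
    return abs(params["learning_rate"] - 0.01) + abs(params["batch_size"] - 32) / 100


def random_search(objective, space, n_trials=50, seed=0):
    """Sample n_trials configurations and return the best one with its loss."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.choice(options) for name, options in space.items()}
        loss = objective(params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


best_params, best_loss = random_search(objective, space)
print(best_params, best_loss)
```

Switching to a different method then amounts to swapping in another objective and another space while the search loop stays unchanged, which mirrors the separation described for main_hyper.py.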

This code uses ideas and code of the following related papers:

Zheng, S. and Xia, R. (2018). Left-center-right separated neural network for aspect-based sentiment analysis with rotatory attention. arXiv preprint arXiv:1802.00892.
Schouten, K. and Frasincar, F. (2018). Ontology-driven sentiment analysis of product and service aspects. In Proceedings of the 15th Extended Semantic Web Conference (ESWC 2018). Springer. To appear
Liu, Q., Zhang, H., Zeng, Y., Huang, Z., and Wu, Z. (2018). Content attention model for aspect based sentiment analysis. In Proceedings of the 27th International World Wide Web Conference (WWW 2018). ACM Press.