SkillAgentSearch skills...

Acl2017

A Neural Model for User Geolocation and Lexical Dialectology

Install / Use

/learn @afshinrahimi/Acl2017
About this skill

Quality Score

0/100

Supported Platforms

Universal

README

A Neural Model for User Geolocation and Lexical Dialectology

Abstract

We propose a simple yet effective text- based user geolocation model based on a neural network with one hidden layer, which achieves state of the art performance over three Twitter benchmark geolocation datasets, in addition to producing word and phrase embeddings in the hidden layer that we show to be useful for detecting dialectal terms. As part of our analysis of dialectal terms, we release DAREDS, a dataset for evaluating dialect term detection methods.

DAREDS dataset

DAREDS is a dataset consisting of dialect words and their dialect region extracted from Dictionary of American Regional English (DARE) available online at http://www.daredictionary.com. It is available at the data directory.

Geolocation datasets

We experiment with three Twitter geolocation datasets available at https://github.com/utcompling/textgrounder or alternatively https://www.amazon.com/clouddrive/share/kfl0TTPDkXuFqTZ17WJSnhXT0q6fGkTlOTOLZ9VVPNu or in preprocessed pickle from from https://www.dropbox.com/sh/vofn0awjcjxhwbc/AABHekl2pmFk2Q3qdVO60JTKa?dl=0.

Usage

usage: geodare.py [-h] [-dataset str] [-model str] [-datadir DATADIR] [-tune]

optional arguments:
  -h, --help            show this help message and exit
  -dataset str, --dataset str
                        dataset name (cmu, na, world)
  -model str, --model str
                        dialectology model (mlp, lr, word2vec)
  -datadir DATADIR      directory where input datasets (cmu.pkl, na.pkl,
                        world.pkl) are located.
  -tune                 if true, tune the hyper-parameters.

The preprocessed pickle files (e.g. na.pkl) are the vectorized version of the geolocation datasets and are available upon request.

Citation

@InProceedings{rahimi2017a,
  author    = {Rahimi, Afshin  and  Cohn, Trevor  and  Baldwin, Timothy},
  title     = {A Neural Model for User Geolocation and Lexical Dialectology},
  booktitle = {Proceedings of ACL-2017 (short papers) preprint},
  year      = {2017},
  address   = {Vancouver, Canada},
  publisher = {Association for Computational Linguistics}
}

Contact

Afshin Rahimi afshinrahimi@gmail.com

View on GitHub
GitHub Stars16
CategoryDevelopment
Updated1y ago
Forks8

Languages

Python

Security Score

60/100

Audited on Jun 16, 2024

No findings