Cr5

Code and data for the WSDM '19 paper "Crosslingual Document Embedding as Reduced-Rank Ridge Regression (Cr5)"


This repository contains the code for the following paper, which proposes a novel approach to learning crosslingual word embeddings optimized for document-level aggregation.

Martin Josifoski, Ivan S. Paskov, Hristo S. Paskov, Martin Jaggi, and Robert West. "Crosslingual Document Embedding as Reduced-Rank Ridge Regression". Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining. ACM, 2019.


Pretrained crosslingual embeddings

We also publish a dataset of pretrained word embeddings in 28 languages, where words are embedded in a shared latent space. The dataset is available here.
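
To illustrate what "embedded in a shared latent space" allows, here is a minimal sketch of document-level aggregation. It is not the code from this repository. It assumes a document is embedded by summing the vectors of its words, and it uses made-up 3-dimensional toy vectors in place of the published embeddings. The names `TOY_EMBEDDINGS`, `embed_document`, and `cosine` are hypothetical, and the real file format of the dataset is not assumed here.

```python
import math

# Hypothetical toy vectors standing in for pretrained crosslingual embeddings.
# Real embeddings would be loaded from the published dataset instead.
TOY_EMBEDDINGS = {
    "en": {"cat": [0.90, 0.10, 0.00], "dog": [0.80, 0.20, 0.10], "car": [0.00, 0.10, 0.90]},
    "de": {"katze": [0.88, 0.12, 0.02], "hund": [0.79, 0.22, 0.08], "auto": [0.02, 0.08, 0.91]},
}


def embed_document(tokens, lang, embeddings=TOY_EMBEDDINGS):
    """Embed a document as the sum of its word vectors (assumed aggregation).

    Tokens missing from the vocabulary are skipped.
    """
    table = embeddings[lang]
    dim = len(next(iter(table.values())))
    doc = [0.0] * dim
    for token in tokens:
        vec = table.get(token.lower())
        if vec is None:
            continue
        doc = [d + v for d, v in zip(doc, vec)]
    return doc


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Documents in different languages are compared directly in the shared space.
en_doc = embed_document(["cat", "dog"], "en")
de_pets = embed_document(["Katze", "Hund"], "de")
de_cars = embed_document(["Auto"], "de")

print(cosine(en_doc, de_pets) > cosine(en_doc, de_cars))
```

Because all languages share one space, no translation step is needed before comparing the English and German documents. With real embeddings, the same comparison could be used for crosslingual document retrieval.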


If you find the provided resources useful, please cite the paper above. Here is a BibTeX entry you may use:

@inproceedings{josifoski-wsdm2019-cr5,
  title={Crosslingual Document Embedding as Reduced-Rank Ridge Regression},
  author={Josifoski, Martin and Paskov, Ivan S. and Paskov, Hristo S. and Jaggi, Martin and West, Robert},
  booktitle={Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining},
  organization={ACM},
  year={2019}
}

Any questions or suggestions?

Contact martin.josifoski@epfl.ch.
