Alignment of Word Embeddings

This directory provides code for learning alignments between word embeddings in different languages.

The code is in Python 3 and requires NumPy.

The script example.sh shows how to use this code to learn and evaluate a bilingual alignment of word embeddings.
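As a rough illustration of what evaluating a bilingual alignment involves, the sketch below computes precision at 1 with plain cosine nearest-neighbour retrieval. It is not the code in eval.py: the function name `precision_at_1`, the index-pair format of the lexicon, and the toy data are all assumptions made for this example, and it assumes a single gold translation per source word.

```python
import numpy as np

def precision_at_1(src_vecs, tgt_vecs, pairs):
    """Fraction of source words whose nearest target neighbour is the gold translation.

    src_vecs, tgt_vecs: arrays of shape (n_words, dim), already in a shared space.
    pairs: list of (source_index, target_index) gold translations (hypothetical format).
    """
    # Normalise rows so that the dot product equals cosine similarity.
    src = src_vecs / np.linalg.norm(src_vecs, axis=1, keepdims=True)
    tgt = tgt_vecs / np.linalg.norm(tgt_vecs, axis=1, keepdims=True)
    queries = np.array([i for i, _ in pairs])
    gold = np.array([j for _, j in pairs])
    # Similarity of every query word to every target word.
    scores = src[queries] @ tgt.T
    predicted = scores.argmax(axis=1)
    return float((predicted == gold).mean())

# Toy data: the "source" space is an exact copy of the "target" space.
rng = np.random.default_rng(0)
toy_tgt = rng.standard_normal((6, 4))
toy_src = toy_tgt.copy()
correct_pairs = [(i, i) for i in range(6)]
wrong_pairs = [(i, (i + 1) % 6) for i in range(6)]
p_correct = precision_at_1(toy_src, toy_tgt, correct_pairs)
p_wrong = precision_at_1(toy_src, toy_tgt, wrong_pairs)
```

Real lexicons often list several valid translations per source word, so a practical evaluation would count a prediction as correct if it matches any of them.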

The word embeddings used in [1] are available on the fastText project page, and the supervised bilingual lexicons are available on the MUSE project page.

Supervised alignment

The script align.py aligns word embeddings from two languages using a bilingual lexicon as supervision. The details of this approach can be found in [1].
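To make the idea of a supervised mapping concrete, here is a minimal sketch of orthogonal Procrustes alignment with NumPy. This is a simpler, classical baseline, not the retrieval criterion described in [1] and not the implementation in align.py. The function names and the synthetic data are assumptions for illustration only; rows of `X` and `Y` stand for the embeddings of word pairs taken from a bilingual lexicon.

```python
import numpy as np

def normalize_rows(M):
    """Scale every row of M to unit Euclidean norm."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)

def procrustes(X_src, Y_tgt):
    """Return the orthogonal matrix R minimising ||X_src @ R - Y_tgt||_F.

    Row i of X_src and row i of Y_tgt are assumed to be a translation pair.
    """
    U, _, Vt = np.linalg.svd(X_src.T @ Y_tgt)
    return U @ Vt

# Synthetic check: the target space is an exact rotation of the source space.
rng = np.random.default_rng(0)
d = 5
R_true, _ = np.linalg.qr(rng.standard_normal((d, d)))
X = normalize_rows(rng.standard_normal((50, d)))
Y = X @ R_true
R = procrustes(X, Y)
```

On this noise-free toy problem the recovered `R` matches `R_true`; with real embeddings the mapping is only approximate.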

Unsupervised alignment

The script unsup_align.py aligns word embeddings from two languages without requiring any supervision. The details of this approach can be found in [2].

In addition to NumPy, the unsupervised method requires the Python Optimal Transport toolbox.
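The following sketch illustrates the general idea of aligning two point clouds without a lexicon: alternate between estimating a soft matching between the two sets of words and refitting an orthogonal map from that matching. It is a simplified, full-batch illustration and not the procedure in unsup_align.py or the exact algorithm of [2]. To stay self-contained it uses a small hand-written Sinkhorn iteration in NumPy in place of the Python Optimal Transport toolbox; all function names, the regularisation value, and the random initialisation are assumptions.

```python
import numpy as np

def sinkhorn_plan(cost, reg=0.05, n_iter=200):
    """Entropy-regularised transport plan between two uniform distributions."""
    n, m = cost.shape
    a = np.full(n, 1.0 / n)
    b = np.full(m, 1.0 / m)
    # Shift the cost so the largest kernel entry is 1 (numerical stability).
    K = np.exp(-(cost - cost.min()) / reg)
    u = np.ones(n)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]

def alternating_alignment(X, Y, n_outer=10, reg=0.05, seed=0):
    """Alternate between a soft matching P and an orthogonal map Q.

    X, Y: row-normalised embeddings of shape (n, d) and (m, d).
    Returns the last orthogonal map Q and the plan P it was fitted on.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Random orthogonal initialisation (an assumption of this sketch).
    Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
    for _ in range(n_outer):
        # Matching step: a low cost means a high similarity after mapping.
        P = sinkhorn_plan(-(X @ Q) @ Y.T, reg=reg)
        # Procrustes step: best orthogonal Q for the current soft matching.
        U, _, Vt = np.linalg.svd(X.T @ P @ Y)
        Q = U @ Vt
    return Q, P

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 4))
X /= np.linalg.norm(X, axis=1, keepdims=True)
Y = rng.standard_normal((30, 4))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
Q, P = alternating_alignment(X, Y)
```

The underlying problem is non-convex, so an alternating scheme like this one is sensitive to the starting point and gives no guarantee of finding a good alignment from a random initialisation.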

Download

Wikipedia fastText embeddings aligned with our method are available for download on the fastText project website.

References

If you use the supervised alignment method, please cite:

[1] A. Joulin, P. Bojanowski, T. Mikolov, H. Jégou, E. Grave, Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 2018.

@InProceedings{joulin2018loss,
    title={Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion},
    author={Joulin, Armand and Bojanowski, Piotr and Mikolov, Tomas and J\'egou, Herv\'e and Grave, Edouard},
    year={2018},
    booktitle={Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing},
}

If you use the unsupervised alignment method, please cite:

[2] E. Grave, A. Joulin, Q. Berthet, Unsupervised Alignment of Embeddings with Wasserstein Procrustes. arXiv preprint arXiv:1805.11222, 2018.

@article{grave2018unsupervised,
    title={Unsupervised Alignment of Embeddings with Wasserstein Procrustes},
    author={Grave, Edouard and Joulin, Armand and Berthet, Quentin},
    journal={arXiv preprint arXiv:1805.11222},
    year={2018}
}