An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces

Kelly Marchisio, Youngser Park, Ali Saad-Eldin, Anton Alyakin, Kevin Duh, Carey E. Priebe, Philipp Koehn


2021


Much recent work in bilingual lexicon induction (BLI) views word embeddings as vectors in Euclidean space. As such, BLI is typically solved by finding a linear transformation that maps embeddings to a common space. Alternatively, word embeddings may be understood as nodes in a weighted graph. This framing allows us to examine a node’s graph neighborhood without assuming a linear transform, and exploits new techniques from the graph matching optimization literature. These contrasting approaches have not been compared in BLI so far. In this work, we study the behavior of Euclidean versus graph-based approaches to BLI under differing data conditions and show that they complement each other when combined. We release our code at https://github.com/kellymarchisio/euc-v-graph-bli.
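The two framings contrasted in the abstract can be illustrated with a minimal toy sketch. It is not the released implementation: the 2-D vectors, the word pairs, the seed list, and all function names (`fit_rotation`, `profile`, `profile_gap`) are hypothetical, the "Euclidean" step uses a closed-form 2-D Procrustes fit restricted to rotations, and the "graph" step is a simplified seeded neighborhood comparison rather than an algorithm from the graph matching optimization literature.

```python
import math

def rotate(v, theta):
    """Apply a 2-D rotation; stands in for the linear map between spaces."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def cosine(u, v):
    """Cosine similarity between two 2-D vectors."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))

def fit_rotation(pairs):
    """Angle of the rotation R minimizing sum ||R x - y||^2 over seed pairs (x, y)."""
    num = sum(x[0] * y[1] - x[1] * y[0] for x, y in pairs)
    den = sum(x[0] * y[0] + x[1] * y[1] for x, y in pairs)
    return math.atan2(num, den)

# Hypothetical toy data: the target space is an exact rotation of the source space.
src = {"dog": (1.0, 0.1), "cat": (0.9, 0.4), "house": (-0.2, 1.0), "car": (-1.0, -0.3)}
gold = {"dog": "Hund", "cat": "Katze", "house": "Haus", "car": "Auto"}
tgt = {gold[w]: rotate(v, math.radians(40)) for w, v in src.items()}
seeds = ["dog", "house"]  # assumed seed dictionary entries

# Euclidean framing: learn a linear map from the seeds, then retrieve nearest neighbors.
theta = fit_rotation([(src[w], tgt[gold[w]]) for w in seeds])
euclid = {
    w: max(tgt, key=lambda t: cosine(rotate(v, theta), tgt[t]))
    for w, v in src.items()
}

# Graph framing: words are nodes, edge weights are within-language similarities.
# Each unseeded word is described by its edge weights to the seed nodes,
# so no cross-lingual transform is ever estimated.
def profile(word, space, anchors):
    return [cosine(space[word], space[a]) for a in anchors]

tgt_seeds = [gold[w] for w in seeds]
unseeded_src = [w for w in src if w not in seeds]
unseeded_tgt = [t for t in tgt if t not in tgt_seeds]

def profile_gap(w, t):
    """Squared difference between the seed-edge profiles of source w and target t."""
    p, q = profile(w, src, seeds), profile(t, tgt, tgt_seeds)
    return sum((a - b) ** 2 for a, b in zip(p, q))

graph = {w: min(unseeded_tgt, key=lambda t: profile_gap(w, t)) for w in unseeded_src}

print(euclid)
print(graph)
```

In this idealized setting the two spaces differ by an exact rotation, so both routes recover the held-out pairs. Real embedding spaces are only approximately related in this way, which is the regime where the behavior of the two framings can diverge.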

Keywords:

Euclidean space
Theoretical computer science
Word embedding
Linear map
Graph (abstract data type)
Computer science
Euclidean geometry
