Learning Rare Word Representations using Semantic Bridging

Prokhorov, Victor; Pilehvar, Mohammad Taher; Kartsaklis, Dimitri; Lió, Pietro; Collier, Nigel

arXiv:1707.07554 (cs)

[Submitted on 24 Jul 2017]

Abstract: We propose a methodology that adapts graph embedding techniques (DeepWalk (Perozzi et al., 2014) and node2vec (Grover and Leskovec, 2016)) and cross-lingual vector space mapping approaches (Least Squares and Canonical Correlation Analysis) to merge corpus-based and ontological sources of lexical knowledge. We also compare these algorithms to identify the best combination for the proposed system. We then apply the system to the task of extending the vocabulary of an existing word embedding with rare and unseen words. We show that our technique can provide considerable extra coverage (over 99%), leading to a consistent performance gain on the Rare Word Similarity dataset (around 10% absolute gain with w2v-gn-500K; cf. §3.3).
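
The Least Squares mapping named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy random vectors, the dimensions, the helper name `fit_least_squares_map`, and the mapping direction (from a graph-embedding space into a corpus-embedding space, fitted on words present in both) are all assumptions made for illustration only.

```python
import numpy as np


def fit_least_squares_map(source_vecs, target_vecs):
    """Return the matrix W that minimizes ||source_vecs @ W - target_vecs||_F.

    Hypothetical helper: each row of source_vecs and target_vecs is assumed
    to describe the same word in two different embedding spaces.
    """
    W, *_ = np.linalg.lstsq(source_vecs, target_vecs, rcond=None)
    return W


rng = np.random.default_rng(0)

# Toy stand-ins (assumption): 50 words shared by both spaces,
# 8-dimensional graph embeddings, 5-dimensional corpus embeddings.
n_shared, d_graph, d_corpus = 50, 8, 5
graph_shared = rng.normal(size=(n_shared, d_graph))

# Build the corpus-side vectors from a known linear map so that the
# recovered W can be checked against it (noiseless toy setting).
true_map = rng.normal(size=(d_graph, d_corpus))
corpus_shared = graph_shared @ true_map

# Fit the mapping on the shared words.
W = fit_least_squares_map(graph_shared, corpus_shared)

# Words that exist only in the graph space (for example rare words)
# are projected into the corpus space with the learned map.
graph_rare = rng.normal(size=(3, d_graph))
induced_rare = graph_rare @ W
```

In this noiseless toy setting the fitted `W` matches `true_map` up to numerical precision. With real embeddings the fit would only be approximate, and the abstract indicates that Canonical Correlation Analysis is compared as an alternative mapping.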

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
Cite as:	arXiv:1707.07554 [cs.CL]
	(or arXiv:1707.07554v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1707.07554

Submission history

From: Victor Prokhorov
[v1] Mon, 24 Jul 2017 13:38:00 UTC (716 KB)
