LargeEA: Aligning Entities for Large-scale Knowledge Graphs

Ge, Congcong; Liu, Xiaoze; Chen, Lu; Zheng, Baihua; Gao, Yunjun

doi:10.14778/3489496.3489504

arXiv:2108.05211 (cs)

[Submitted on 11 Aug 2021 (v1), last revised 13 Dec 2021 (this version, v3)]

Abstract: Entity alignment (EA) aims to find equivalent entities in different knowledge graphs (KGs). Current EA approaches suffer from scalability issues, limiting their usage in real-world EA scenarios. To tackle this challenge, we propose LargeEA to align entities between large-scale KGs. LargeEA consists of two channels, i.e., a structure channel and a name channel. For the structure channel, we present METIS-CPS, a memory-saving mini-batch generation strategy, to partition large KGs into smaller mini-batches. LargeEA, designed as a general tool, can adopt any existing EA approach to learn entities' structural features within each mini-batch independently. For the name channel, we first introduce NFF, a name feature fusion method, to capture rich name features of entities without involving any complex training process. Then, we exploit name-based data augmentation to generate seed alignment without any human intervention. Such a design fits common real-world scenarios much better, as seed alignment is not always available. Finally, LargeEA derives the EA results by fusing the structural features and name features of entities. Since no widely acknowledged benchmark is available for large-scale EA evaluation, we also develop a large-scale EA benchmark called DBP1M, extracted from real-world KGs. Extensive experiments confirm the superiority of LargeEA against state-of-the-art competitors.
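
The final step described in the abstract, fusing structural and name features to derive the alignment, can be illustrated with a toy sketch. The abstract does not state the fusion rule, so everything here is an assumption made for illustration: the weighted sum, the `name_weight` parameter, the greedy row-wise argmax, and the function names `fuse_similarities` and `align` are all hypothetical and are not taken from the LargeEA implementation.

```python
from typing import List

Matrix = List[List[float]]


def fuse_similarities(structure_sim: Matrix, name_sim: Matrix,
                      name_weight: float = 0.5) -> Matrix:
    """Combine two same-shaped similarity matrices by a weighted sum.

    Rows are source-KG entities, columns are target-KG entities.
    The weighted sum is an assumed fusion rule, not the paper's.
    """
    if len(structure_sim) != len(name_sim):
        raise ValueError("matrices must have the same number of rows")
    fused = []
    for s_row, n_row in zip(structure_sim, name_sim):
        if len(s_row) != len(n_row):
            raise ValueError("matrices must have the same number of columns")
        fused.append([(1 - name_weight) * s + name_weight * n
                      for s, n in zip(s_row, n_row)])
    return fused


def align(sim: Matrix) -> List[int]:
    """For each source entity, return the index of its best-scoring target."""
    return [max(range(len(row)), key=row.__getitem__) for row in sim]


# Toy scores for 3 source and 3 target entities (made-up numbers).
structure_sim = [[0.9, 0.1, 0.0],
                 [0.2, 0.3, 0.5],
                 [0.0, 0.0, 0.0]]
name_sim = [[0.8, 0.1, 0.1],
            [0.1, 0.7, 0.2],
            [0.1, 0.2, 0.9]]

fused = fuse_similarities(structure_sim, name_sim)
alignment = align(fused)
print(alignment)
```

In this toy input the structure scores alone would match the second source entity to the third target, while the fused scores match it to the second, which shows how a second channel can change the outcome.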

Comments:	To appear in VLDB 2022
Subjects:	Databases (cs.DB)
Cite as:	arXiv:2108.05211 [cs.DB]
	(or arXiv:2108.05211v3 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.2108.05211
Related DOI:	https://doi.org/10.14778/3489496.3489504

Submission history

From: Congcong Ge
[v1] Wed, 11 Aug 2021 13:21:14 UTC (1,302 KB)
[v2] Sat, 14 Aug 2021 15:11:32 UTC (1,297 KB)
[v3] Mon, 13 Dec 2021 11:43:55 UTC (1,296 KB)
