Efficient Knowledge Graph Accuracy Evaluation

Gao, Junyang; Li, Xian; Xu, Yifan Ethan; Sisman, Bunyamin; Dong, Xin Luna; Yang, Jun

Computer Science > Databases

arXiv:1907.09657 (cs)

[Submitted on 23 Jul 2019]

Abstract: Estimating the accuracy of a large-scale knowledge graph (KG) often requires humans to annotate samples from the graph. Obtaining statistically meaningful accuracy estimates while keeping human annotation costs low is critical to the development cycle of a KG and to its practical applications. Surprisingly, this challenging problem has largely been ignored in prior research. To address it, this paper proposes an efficient sampling and evaluation framework that aims to provide high-quality accuracy evaluation with strong statistical guarantees while minimizing human effort. Motivated by the properties of the annotation cost function observed in practice, we propose the use of cluster sampling to reduce the overall cost. We further apply weighted and two-stage sampling, as well as stratification, for better sampling designs. We also extend our framework to enable efficient incremental evaluation on evolving KGs, introducing two solutions based on stratified sampling and a weighted variant of reservoir sampling. Extensive experiments on real-world datasets demonstrate the effectiveness and efficiency of our proposed solution. Compared to baseline approaches, our best solutions can provide up to 60% cost reduction on static KG evaluation and up to 80% cost reduction on evolving KG evaluation, without loss of evaluation quality.
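The abstract names weighted, two-stage cluster sampling but gives no formulas, so the following is only a minimal sketch under textbook assumptions, not the paper's implementation. It assumes that a cluster is the set of triples sharing one entity, that clusters are drawn with replacement with probability proportional to their size, and that at most a fixed number of triples is annotated per drawn cluster. Under those assumptions the plain mean of the per-cluster sample accuracies is an unbiased estimate of overall triple accuracy. The function name, its parameters, and the `label` callback (a stand-in for a human annotator) are hypothetical.

```python
import math
import random
from statistics import mean, stdev


def two_stage_weighted_estimate(clusters, n_clusters, m_per_cluster, label, rng, z=1.96):
    """Estimate overall triple accuracy from a two-stage weighted cluster sample.

    clusters: dict mapping a cluster key (assumed: an entity id) to its list of triples.
    n_clusters: number of first-stage draws (must be at least 2 for the margin).
    m_per_cluster: cap on annotated triples per drawn cluster.
    label: callable returning True if a triple is correct (stands in for an annotator).
    rng: a random.Random instance, passed in so runs are reproducible.
    """
    keys = list(clusters)
    sizes = [len(clusters[k]) for k in keys]
    # Stage 1: draw clusters with replacement, probability proportional to size.
    drawn = rng.choices(keys, weights=sizes, k=n_clusters)
    cluster_accuracies = []
    for key in drawn:
        triples = clusters[key]
        # Stage 2: annotate at most m_per_cluster triples, without replacement.
        picked = rng.sample(triples, min(len(triples), m_per_cluster))
        cluster_accuracies.append(mean(1.0 if label(t) else 0.0 for t in picked))
    estimate = mean(cluster_accuracies)
    # Normal-approximation margin of error across the first-stage draws.
    margin = z * stdev(cluster_accuracies) / math.sqrt(n_clusters)
    return estimate, margin


if __name__ == "__main__":
    rng = random.Random(7)
    # Toy KG (made up for illustration): two entities with 5 and 50 triples.
    toy = {
        "e1": [("e1", "p", i) for i in range(5)],
        "e2": [("e2", "p", i) for i in range(50)],
    }
    # Pretend only the triples of e2 are correct.
    est, margin = two_stage_weighted_estimate(toy, 20, 3, lambda t: t[0] == "e2", rng)
    print(f"accuracy ~ {est:.2f} +/- {margin:.2f}")
```

Capping the second stage at a few triples per entity reflects the cost intuition in the abstract: if annotating additional triples of an already-identified entity is cheaper than starting on a new one, grouping the sample by entity lowers total annotation cost for a given number of labeled triples.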

Comments:	in VLDB 2019
Subjects:	Databases (cs.DB)
Cite as:	arXiv:1907.09657 [cs.DB]
	(or arXiv:1907.09657v1 [cs.DB] for this version)
	https://doi.org/10.48550/arXiv.1907.09657

Submission history

From: Junyang Gao
[v1] Tue, 23 Jul 2019 02:06:48 UTC (894 KB)
