TiFi: Taxonomy Induction for Fictional Domains [Extended version]

Chu, Cuong Xuan; Razniewski, Simon; Weikum, Gerhard

arXiv:1901.10263 (cs)

[Submitted on 29 Jan 2019]

Abstract: Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper, we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia; other examples include enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, which identifies candidate categories that truly represent classes in the domain of interest; (ii) edge cleaning, which selects subcategory relationships that correspond to class subsumption; and (iii) top-level construction, which maps classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi constructs taxonomies for a diverse range of fictional domains, such as Lord of the Rings, The Simpsons, and Greek Mythology, with very high precision, and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
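To make the three phases named in the abstract concrete, the following is a minimal sketch of how such a pipeline could be wired together. It is not the authors' implementation: the function names, the toy category data, and the three predicate arguments (`is_class`, `is_subsumption`, `top_level_of`) are all hypothetical placeholders for whatever rules or models one would actually use, and the target labels `living_thing` and `object` are merely an assumed choice of high-level categories.

```python
from typing import Callable

Edge = tuple[str, str]  # (subcategory, parent category)


def clean_categories(categories: set[str], is_class: Callable[[str], bool]) -> set[str]:
    """Phase (i): keep only the categories judged to be real classes."""
    return {c for c in categories if is_class(c)}


def clean_edges(edges: set[Edge], classes: set[str],
                is_subsumption: Callable[[str, str], bool]) -> set[Edge]:
    """Phase (ii): keep edges between surviving classes that express class subsumption."""
    return {(c, p) for c, p in edges
            if c in classes and p in classes and is_subsumption(c, p)}


def attach_top_level(classes: set[str], edges: set[Edge],
                     top_level_of: Callable[[str], str]) -> set[Edge]:
    """Phase (iii): link every class without a parent to a high-level category."""
    has_parent = {c for c, _ in edges}
    roots = classes - has_parent
    return edges | {(r, top_level_of(r)) for r in roots}


# Toy input resembling a noisy fan-wiki category system (made-up data).
categories = {"Characters", "Hobbits", "Elves", "Rings", "Article stubs", "Third Age"}
edges = {
    ("Hobbits", "Characters"),
    ("Elves", "Characters"),
    ("Elves", "Rings"),            # topical link, not subsumption
    ("Hobbits", "Third Age"),      # parent is not a class
    ("Article stubs", "Characters"),
}

# Hypothetical stand-ins for the decisions each phase has to make.
not_classes = {"Article stubs", "Third Age"}
not_subsumption = {("Elves", "Rings")}
top_level = {"Characters": "living_thing", "Rings": "object"}

classes = clean_categories(categories, lambda c: c not in not_classes)
kept = clean_edges(edges, classes, lambda c, p: (c, p) not in not_subsumption)
taxonomy = attach_top_level(classes, kept, lambda c: top_level[c])

for child, parent in sorted(taxonomy):
    print(f"{child} -> {parent}")
```

Passing the three decisions in as functions keeps the phase order visible while leaving open how each decision is made, which the abstract does not specify.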

Comments:	Extended version of The Web Conference 2019 paper
Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
Cite as:	arXiv:1901.10263 [cs.CL]
	(or arXiv:1901.10263v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1901.10263

Submission history

From: Simon Razniewski
[v1] Tue, 29 Jan 2019 13:07:13 UTC (8,579 KB)
