Authors:
George Fotopoulos (1); Paris Koloveas (2); Paraskevi Raftopoulou (2) and Christos Tryfonopoulos (2)
Affiliations:
(1) WITSIDE (Intelligence for Business Ltd), Athens, Greece
(2) Department of Informatics & Telecommunications, University of the Peloponnese, Tripolis, Greece
Keyword(s):
Database Performance, Text Search, NoSQL Data Stores, Relational Databases, Performance Comparison.
Abstract:
The volume of textual data is growing constantly, as humans and, more recently, bots create and reproduce textual information in unprecedented quantity and variety. Storing, handling and querying such volumes of text has become more challenging than ever, and both research and industry have turned to various alternatives, ranging from typical Relational Database Management Systems to specialised text engines and NoSQL databases, in an effort to cope. However, these choices are largely based on experience or personal preference for one system over another, since no existing study compares the performance of the available solutions for full-text search and retrieval. In this work, we fill this gap in the literature by systematically comparing four popular databases in full-text search scenarios and reporting their performance across different datasets, full-text search operators and parameters.
To the best of our knowledge, our study is the first to go beyond comparing characteristics such as query-language expressiveness or popularity, and to actually compare popular relational, NoSQL, and textual data stores in terms of retrieval efficiency for full-text search. Moreover, our findings quantify the differences in full-text search performance between the examined solutions and reveal both anticipated and less anticipated results.