Efficient mining of frequent XML query patterns with repeating-siblings

LH Yang, ML Lee, W Hsu, D Huang, L Wong
Information and Software Technology, 2008, Elsevier
A recent approach to improving the performance of XML query evaluation is to cache the query results of frequent query patterns. Unfortunately, discovering these frequent query patterns is an expensive operation. In this paper, we develop a two-pass mining algorithm, 2PXMiner, that guarantees the discovery of frequent query patterns by scanning the database at most twice. By exploiting a transaction summary data structure and an enumeration tree, we are able to determine upper bounds on the frequencies of the candidate patterns and to quickly prune away the infrequent patterns. We also design an index to trace the repeating candidate subtrees generated by sibling repetition, thus avoiding redundant computations. Experimental results indicate that 2PXMiner is both efficient and scalable.