Regular expression filters for XML

Published online by Cambridge University Press: 09 June 2006

HARUO HOSOYA
Affiliation:
Department of Computer Science, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113, Japan (email: hahosoya@is.s.u-tokyo.ac.jp)

Abstract

XML data are described by types involving regular expressions. This raises the question of which language feature is convenient for manipulating such data. We previously answered this question by proposing regular expression pattern matching. However, since this construct is derived from ML pattern matching, it has no iteration functionality of its own, which makes it cumbersome to process data typed by Kleene stars. In this paper, we propose a novel programming feature, regular expression filters. This construct extends the previous proposal by permitting pattern clauses to be closed under arbitrary regular expression operators. This yields many convenient programming idioms, such as non-uniform processing of sequences and almost-copying of trees. We further develop a type inference mechanism that obtains (1) types for pattern variables that are locally precise with respect to the type of input values and (2) a type for the result of the whole filter expression that is also locally precise with respect to the types of the body expressions. We discuss how our construct is useful in the practice of XML processing and, in particular, how our type inference is crucial for avoiding changes to programs when the types of the data being processed evolve frequently.
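As a rough illustration of the "non-uniform processing of sequences" idiom mentioned above, the following Python sketch applies an alternation of clauses under a Kleene star to a flat sequence of elements. It is not the paper's syntax or semantics, and it performs no type inference. The `(tag, text)` encoding of elements, the function `star_filter`, and the sample tags are all assumptions introduced for illustration only.

```python
from typing import Callable, List, Tuple

# Hypothetical encoding: an XML element is a (tag, text) pair.
Element = Tuple[str, str]
# A clause pairs a tag to match with a body that produces output elements.
Clause = Tuple[str, Callable[[str], List[Element]]]


def star_filter(clauses: List[Clause], seq: List[Element]) -> List[Element]:
    """Apply (clause_1 | clause_2 | ...)* to seq and concatenate the results."""
    out: List[Element] = []
    for tag, text in seq:
        for want, body in clauses:
            if tag == want:
                # The first matching clause handles this element.
                out.extend(body(text))
                break
        else:
            # No clause matched; a static type system would rule this out.
            raise ValueError(f"no clause matches <{tag}>")
    return out


# Non-uniform processing: keep names, drop emails, rename tel to phone.
clauses: List[Clause] = [
    ("name", lambda t: [("name", t)]),
    ("email", lambda t: []),
    ("tel", lambda t: [("phone", t)]),
]
entry = [("name", "Alice"), ("tel", "123"), ("email", "alice@example.org")]
result = star_filter(clauses, entry)
```

Each clause treats its own kind of element differently, while the iteration over the sequence is supplied once by the star rather than by hand-written recursion.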

Type
Article
Copyright
2006 Cambridge University Press