XML Query Processing: Storage and Query Model Interplay
Ioana Manolescu
INRIA, France
Abstract
XML data management has received considerable attention from various
perspectives. In this tutorial, we attempt to systematize the existing
techniques from the "persistent database" perspective, in which documents
are stored once in a persistent repository and then queried and updated
many times.
A variety of XML storage schemes have been proposed, many of which take
advantage of existing query processing systems; the most notable family is
that of relational stores for XML, in which XML query processing is
delegated to a relational query processor. More recently, complex native
XML storage systems have been developed and enhanced with indexing
schemes, view mechanisms, and early attempts at cost-based query
optimization.
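
To make the idea of a relational store for XML concrete, here is a minimal
sketch, assuming one of the simplest possible schemes: every element becomes
one row of a single "edge" table, and a path query becomes a chain of
self-joins evaluated by the relational engine. The table name node, its
columns, the functions shred and titles, and the sample document are all
hypothetical and chosen for illustration only; they are not taken from any
particular system covered in the tutorial.

```python
import sqlite3
import xml.etree.ElementTree as ET


def shred(xml_text, conn):
    """Store each XML element as one row of a hypothetical edge table."""
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent INTEGER, tag TEXT, text TEXT)"
    )
    counter = 0

    def visit(elem, parent_id):
        nonlocal counter
        counter += 1
        node_id = counter  # identifiers follow document (pre-order) order
        conn.execute(
            "INSERT INTO node VALUES (?, ?, ?, ?)",
            (node_id, parent_id, elem.tag, (elem.text or "").strip()),
        )
        for child in elem:
            visit(child, node_id)

    visit(ET.fromstring(xml_text), None)


def titles(conn):
    """Evaluate the path /library/book/title as two self-joins."""
    rows = conn.execute(
        """
        SELECT t.text
        FROM node AS l
        JOIN node AS b ON b.parent = l.id
        JOIN node AS t ON t.parent = b.id
        WHERE l.parent IS NULL
          AND l.tag = 'library' AND b.tag = 'book' AND t.tag = 'title'
        ORDER BY t.id
        """
    )
    return [row[0] for row in rows]


# Illustrative sample document (not from the source).
DOC = (
    "<library>"
    "<book><title>A</title></book>"
    "<book><title>B</title><year>2004</year></book>"
    "</library>"
)

conn = sqlite3.connect(":memory:")
shred(DOC, conn)
result = titles(conn)
node_count = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]
```

In this sketch each additional path step costs one more self-join, so how
"hard" a query is depends directly on the storage scheme chosen; other
relational mappings would turn the same path into a different relational
plan.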
Any XML storage scheme, though, is only as good as the XML query
processing performance it can achieve. Thus, after years of effort on XML
storage as a goal in itself, newer approaches start from the current winner
among XML query languages, XQuery, and build from there storage
strategies that are likely to support XQuery processing gracefully.
The tutorial will provide a comparative, evolutionary view of these
approaches, emphasizing the interplay between XML storage and query
execution models. Each storage scheme is best suited to some execution
strategies; in turn, execution strategies established as clear winners
push their matching storage schemes into the spotlight. Understanding
this storage-processing interplay is important, as no XML query can be
deemed "complex" or "easy" without referring to the storage and query
model chosen. The tutorial will cover the fundamental elements of XML
storage and execution models proposed in the last six years, systematizing
them to encompass current and foreseeable future efforts.