56453-An Introduction To Spatial Database PDF
held biannually since 1989 (Buchmann et al., 1989; Günther and Schek, 1991; Abel
and Ooi, 1993). This term is associated with a view of a database as containing
sets of objects in space rather than images or pictures of a space. Indeed, the
requirements and techniques for dealing with objects in space that have identity
and well-defined extents, locations, and relationships are rather different from those
for dealing with raster images. It has therefore been suggested that two classes of
systems, spatial database systems and image database systems, be clearly distinguished
(Günther and Buchmann, 1990; Frank, 1991). Image database systems may include
analysis techniques to extract objects in space from images, and offer some spatial
database functionality, but they are also prepared to store, manipulate, and retrieve
raster images as discrete entities. In this survey, we discuss only spatial database
systems in the restricted sense. Several articles in this special issue address image
database problems, and so complement the survey.
What is a spatial database system? We are not aware of a generally accepted
definition. The following reflects the author's personal view:
1. A spatial database system is a database system.
2. It offers spatial data types (SDTs) in its data model and query language.
3. It supports spatial data types in its implementation, providing at least spatial
indexing and efficient algorithms for spatial join.
Let us briefly justify these requirements. The first sounds trivial, but emphasizes
the fact that spatial, or geometric, information is in practice always connected
with "non-spatial" (e.g., alphanumeric) data. It is not sufficient to have a special
purpose system that cannot handle all the standard data modeling and querying
tasks. Hence, a spatial database system is a full-fledged database system with
additional capabilities for handling spatial data. Regarding the second requirement, SDTs
(e.g., POINT, LINE, REGION) provide a fundamental abstraction for modeling the
structure of geometric entities in space, as well as their relationships (l intersects r),
properties (area(r) > 1,000), and operations (intersection(l, r), the part of l lying
within r). Which types are used may, of course, depend on the class of applications
to be supported (e.g., rectangles in VLSI design; surfaces and volumes in 3-D).
Without spatial data types, a system does not offer adequate support in modeling.
The third requirement is that a system must at least be able to retrieve from a
large collection of objects in some space those lying within a particular area without
scanning the whole set. Therefore, spatial indexing is mandatory. It should also
support connecting objects from different classes through some spatial relationship
in a better way than by filtering the cartesian product (at least for those relationships
that are important for the application).
The purpose of this survey is to coherently present some of the fundamental
problems and their solutions in spatial database systems. The focus is on describing
solutions that have been found, rather than on listing many open problems. We
consider spatial DBMSs to provide the underlying database technology for geographic
information systems (GIS) and other applications. As such, they can offer only
VLDB Journal 3 (4) Güting: An Introduction to Spatial Database Systems 359
some basic capabilities; we do not claim that a spatial DBMS is directly usable as
an application-oriented GIS.
In the following four sections we consider modeling, querying, tools for
implementation (data structures and algorithms), and system architecture for spatial
database systems.
2. Modeling
2.1 What needs to be represented?
The main application that is driving research in spatial database systems is the
technology for GISs. Hence, we consider some modeling needs in this area that are
also typical for other applications. Examples are given for 2-D space but, almost
everywhere, extension to the three- or more-dimensional case is possible. There
are two important alternative views of what needs to be represented:
1. Objects in space: We are interested in distinct entities arranged in space, each
of which has its own geometric description.
2. Space: We wish to describe space itself, that is, to say something about every
point in space.
The first view allows one to model, for example, cities, forests, or rivers. The second
view is the one of thematic maps describing, for example, land use or the partition
of a country into districts. Since raster images reveal something about every point
in space, they are also closely related to the second view. We can reconcile both
views to some extent by offering concepts for modeling single objects, and spatially
related collections of objects.
For modeling single objects, the fundamental abstractions are point, line, and
region. A point represents (the geometric aspect of) an object for which only its
location in space, but not its extent, is relevant. For example, a city may be modeled
as a point in a model describing a large geographic area (a large scale map). A line
(in this context always understood to mean a curve in space, usually represented
by a polyline, a sequence of line segments) is the basic abstraction for facilities for
moving through space, or connections in space (e.g., roads, rivers, cables for phone,
electricity). A region is the abstraction for something having an extent in 2-D space
(e.g., a country, a lake, or a national park). A region may have holes and may also
consist of several disjoint pieces. Figure 1 shows the three basic abstractions for
single objects.
The two most important instances of spatially related collections of objects are
partitions (of the plane) and networks (Figure 2). A partition can be viewed as a
set of region objects that are required to be disjoint. The adjacency relationship is
of particular interest, that is, there are often pairs of region objects with a common
boundary. Partitions can be used to represent thematic maps. A network can be
viewed as a graph embedded in the plane, consisting of a set of point objects,
Figure 2. Partitions and networks
forming its nodes, and a set of line objects describing the geometry of the edges.
Networks are ubiquitous in geography (e.g., highways, rivers, public transport, or
power supply lines).
Obviously, we have mentioned only the most fundamental abstractions to be
supported in a spatial DBMS (for GIS, in this case). Other interesting spatially
related collections of objects include nested partitions (e.g., a country partitioned
into provinces partitioned into districts) and a digital terrain (elevation) model. For
a deeper discussion of modeling requirements for GIS see Smith et al. (1987) and
Frank (1991). In the sequel, we shall consider how the basic abstractions mentioned
above can be embedded into a DBMS data model.
Figure 4. A realm
Numeric robustness problems can be treated within the geometric base layer so
that spatial data types or algebras defined on top enjoy nice closure properties not
only in theory but also in an implementation (Güting and Schneider, 1993a).
Systems of spatial data types, or spatial algebras, can capture the fundamental ab-
stractions for point, line, and region, together with relationships between them and
operations for composition (e.g., forming the intersection of regions). In Section 1
we stated that they are a mandatory part of the data model for a spatial DBMS, so
that all proposals for models and query languages, as well as prototype systems (Sec-
tion 5) offer them in some form. Spatial types and operations have been described
(e.g., Chang and Fu, 1980; Lipeck and Neumann, 1987; Güting, 1988; Joseph and
Cardenas, 1988; Orenstein and Manola, 1988; Roussopoulos et al., 1988; Svensson
and Huang, 1991). Dedicated work towards a formal definition has been reported
by Scholl and Voisard (1989), Gargano et al. (1991), and Güting and Schneider
(1993b). As an example of spatial algebra, we briefly consider the ROSE algebra
(Güting and Schneider, 1993b).
The ROSE algebra offers three data types called points, lines, and regions, whose
values are realm-based (i.e., composed from elements of a realm). To describe these
values, one needs intermediate notions of an R-block and an R-face. For a given
realm R, an R-block is a connected set of line segments of R. An R-face is essentially
a polygon with holes that can be defined over realm segments. Then, a value of
type points is a set of R-points, a value of type lines is a set of disjoint R-blocks,
and a value of type regions is a set of edge-disjoint R-faces (edge-disjoint means
that two faces may have a common vertex, but no common edge).
The type system of the ROSE algebra is based on a second-order signature
(Güting, 1993), which allows one to describe polymorphic operations by quantification
over kinds (which here can be viewed as type sets). Two such sets are EXT
= {lines, regions} and GEO = {points, lines, regions}. There are four classes of
Here the type variable geo ranges over the three types in kind GEO, so that the
inside operation can compare a points, a lines, or a regions value with a regions
value. The intersects operation can be applied to two values of the same or
different types within kind EXT. The notation regions area-disjoint is an attempt
to capture the structure of partitions in the type system. It describes a kind
for all partitions; each particular partition (thematic map) is a type within this
kind whose values are the regions within this partition. Hence, the type variable
area will pick one partition and the operation adjacent be applicable to any two
regions of that partition.
(2) Operations returning atomic spatial data type values:
∀ geo in GEO.
    lines × lines → points          intersection
    regions × regions → regions     intersection
    geo × geo → geo                 plus, minus
    regions → lines                 contour
Here plus and minus form the union and difference, respectively, of two values
of the same type.
(3) Spatial operators returning numbers:
∀ geo1 × geo2 in GEO.
    geo1 × geo2 → real              dist
    regions → real                  perimeter, area
Here sum is a spatial aggregate function. It takes a set of objects together with a
spatial attribute of the objects of type geo (given as a function mapping each object
into its attribute value) and returns the geometric union of all attribute values.
For example, one might form the union of a set of provinces to determine the
area of a country. The closest operator determines within a set of objects those
whose spatial attribute value has minimal distance from some other geometric
(query) object.
These examples may show the kinds of operations that are available in a spatial
algebra. Formal definitions of the semantics of these types and operations can be
found in Güting and Schneider (1993b). Some of the important issues related to
spatial data types or algebras are the following:
Extensibility. There is general agreement that the definition of types and, in
particular, the definition of operations is application-dependent. Hence, it
must be possible to define additional or alternative types and operations later,
which leads to the requirement of extensibility for the system architecture
(Section 5).
Completeness. Nevertheless, the question is whether there are any formal
criteria to say that a particular collection of operations is complete in some
respect. Some limited success in this direction has been obtained in the study
of topological relationships (Section 2.4).
One or more types? Is it really necessary to have several different types to
distinguish, for example, points, lines, and regions? Some authors suggest
offering only a single type geometry whose instances can be any of these or
even mixed collections of them (e.g., Gargano et al., 1991; Larue et al., 1993).
This is analogous to the question of whether a system should offer different
types integer and real, or just a single type number. One advantage of a single
type may be that closure under operations is easier to achieve. On the other
hand, several types are more expressive and allow a more precise application
of operations.
Set Operations. A spatial algebra should offer not only operations on "atomic"
SDT values (a region value is considered to be atomic, even if it has a very
large description) but also on spatially related sets of objects, for example,
a partition (thematic map, tessellation) (Güting, 1988; Scholl and Voisard,
1989; Tomlin, 1990; Svensson and Huang, 1991). Example operations are
the overlay of two partitions, fusion (merging adjacent areas in a partition
if other attributes are equal), or finding in a set of objects the one closest
to a query object. This kind of operation requires a much more intricate
interface with the DBMS data model than in the case of atomic operations
(Güting and Schneider, 1993b).
Among these, topological relationships are the most fundamental and have been
studied in some depth. A basic question is whether we can somehow enumerate
all possible relationships. A method for this was proposed by Egenhofer (1989)
and Egenhofer and Herring (1990). It was originally formulated for simple regions
(connected, no holes), called area in the sequel, and is based on comparing the
intersections of their boundaries and interiors (denoted ∂A and A°, respectively).
For two objects there are four intersection sets; each of them may be empty or
non-empty, which leads to 2^4 = 16 combinations (Table 1). Eight of these are not
valid, and two of them are symmetric so that six different relationships result, called
disjoint, in, touch, equal, cover, and overlap.
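The counting argument can be checked with a small sketch. The assignment of names to combinations below is the one commonly given for simple regions and is assumed here to agree with Table 1; the names `VALID`, `classify`, and the tuple layout are illustrative choices, not notation from the cited papers.

```python
from itertools import product

# Key layout (an assumption of this sketch):
#   (boundary(A) ∩ boundary(B), interior(A) ∩ interior(B),
#    boundary(A) ∩ interior(B), interior(A) ∩ boundary(B))
# True means the intersection set is non-empty.
VALID = {
    (False, False, False, False): "disjoint",
    (True,  False, False, False): "touch",
    (True,  True,  False, False): "equal",
    (False, True,  True,  False): "in",       # A in B
    (False, True,  False, True):  "in",       # B in A (symmetric case)
    (True,  True,  True,  False): "cover",    # B covers A
    (True,  True,  False, True):  "cover",    # A covers B (symmetric case)
    (True,  True,  True,  True):  "overlap",
}

def classify(bb, ii, bi, ib):
    """Return the relationship name, or None for a combination that is not valid."""
    return VALID.get((bb, ii, bi, ib))

# Enumerate all 2^4 combinations and separate the valid ones.
all_combinations = list(product([False, True], repeat=4))
invalid = [c for c in all_combinations if c not in VALID]
names = sorted(set(VALID.values()))
```

Of the 16 combinations, the table keeps eight; merging the two symmetric pairs leaves the six names listed in the text.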
This approach has been extended in various ways. For example, point and
line features have been added (Egenhofer and Herring, 1992; de Hoop and van
Oosterom, 1992). Egenhofer extended the original four-intersection method to a
nine-intersection method by also considering intersections with the complement A⁻¹
(Egenhofer, 1991b). Clementini et al. (1993) also considered the dimension of the
intersection (called the dimension-extended method): in 2-D space the intersection
can be empty, 0-D (point), 1-D (line), or 2-D (area). This results, in principle, in
4^4 = 256 combinations. Again, many of these are not valid, so that in total 52
relationships among point, line, and area features remain.
Since these are far too many to be named and remembered by a user, an
alternative is suggested. Five basic relationship names are introduced (touch, in,
cross, overlap, and disjoint), whose meaning is formally defined in terms of the
dimension extended method, for example:
The touch relationship applies to area/area, line/line, line/area, point/area,
and point/line, but not point/point situations. For two features A1 and A2,
it is defined by:
⟨A1, touch, A2⟩ :⇔ (A1° ∩ A2° = ∅) ∧ (A1 ∩ A2 ≠ ∅)
In addition to the five relationships, three operators are offered to obtain the
boundaries of features: operator b applied to area A yields the boundary line ∂A;
operators f and t return the end points of a line. It was proven by Clementini
et al. (1993) that the five relationships are mutually exclusive (no two different
relationships can hold between any two features) and that all situations described
by the dimension-extended method can be distinguished using the relationships and
the three boundary operators. Other work on spatial relationships includes Freksa
(1991), Frank (1992), and Cui et al. (1993). The article by Papadias and Sellis in
this special issue investigates the subject in more depth; further references can be
found there (Papadias and Sellis, 1994).
3. Querying
From one point of view, the problem of querying is to connect the operations of a
spatial algebra (including predicates to express spatial relationships) to the facilities
of a DBMS query language. But there are also other aspects that mainly have to do
with the fact that spatial data require a graphical presentation of results as well as
graphical input of queries or at least SDT values used in queries. In the following
three subsections, we consider the fundamental operations needed at the level of
manipulating sets of database objects, graphical input and output, and techniques
and requirements for extending query languages.
The last example illustrates that selection conditions can also be based on metric
relationships, and can occur in conjunction with other predicates. Query optimization
should be able to compare access plans using spatial indexes with plans using a
standard index. This will be discussed further in Section 5.
Spatial Join. Similar to a spatial selection, a spatial join is a join that compares
any two objects with a predicate according to their spatial attribute values. Some
examples:
"Combine cities with their states."
cities states join[center inside area]
"For each river, find all cities within less than 50 kms."
cities rivers join[dist(center, route) < 50]
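Semantically, such a join is a filter over the cartesian product, which the following sketch makes explicit for the first query. Everything in it is an illustrative assumption: the function names, the sample data, and the use of axis-parallel rectangles as a stand-in for region values. It shows the baseline that a real system is expected to improve upon, not an efficient join algorithm.

```python
from itertools import product

def spatial_join(left, right, predicate):
    """Naive spatial join: evaluate the predicate on every pair of objects."""
    return [(l, r) for l, r in product(left, right) if predicate(l, r)]

def center_inside_area(city, state):
    """Stand-in for 'center inside area'; the area is a rectangle here."""
    x, y = city["center"]
    xmin, ymin, xmax, ymax = state["area"]
    return xmin <= x <= xmax and ymin <= y <= ymax

# Hypothetical sample data.
cities = [
    {"cname": "A", "center": (1.0, 1.0)},
    {"cname": "B", "center": (6.0, 2.0)},
]
states = [
    {"sname": "West", "area": (0, 0, 5, 5)},
    {"sname": "East", "area": (5, 0, 10, 5)},
]

pairs = [(c["cname"], s["sname"])
         for c, s in spatial_join(cities, states, center_inside_area)]
```

The predicate is evaluated once per pair, so the cost grows with the product of the two set sizes.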
As mentioned in Section 1, spatial selection and spatial join are so important that it
is mandatory to support them with spatial indexing and with special join algorithms,
at least for the most important spatial predicates.
Spatial Function Application. How can operations of a spatial algebra that compute
new SDT values (class 2 in Section 2.3) be used in a query? In a set-oriented query,
a new SDT value is computed for each object in a set. Various object algebra
operators allow such an embedding of a function application, for example, the filter
operator of FAD (Bancilhon et al., 1987), the replace operator (Abiteboul and Beeri,
1988), or the λ or extend operator (Güting et al., 1989). The extend operator takes
an expression to be evaluated for each object and a (new) attribute name; it appends
the resulting value as a new attribute to the object. For example:
"For each river going through Bavaria, return the name, the part of its geometry lying inside
Bavaria, and the length of that part. "
rivers select[route intersects Bavaria]
extend[intersection(route, Bavaria) {part}]
extend[length(part) {plength}] project[rname, part, plength]
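The behavior of extend in this query can be imitated in a few lines. The sketch below is an assumption-laden illustration: objects are dictionaries, the operators `select`, `extend`, and `project` are hypothetical helper functions, and geometry is reduced to closed 1-D intervals so that intersects, intersection, and length stay trivial.

```python
def select(objects, pred):
    """Keep the objects satisfying the predicate."""
    return [o for o in objects if pred(o)]

def extend(objects, expr, name):
    """Evaluate expr for each object and append the result as a new attribute."""
    return [{**o, name: expr(o)} for o in objects]

def project(objects, attrs):
    """Keep only the listed attributes."""
    return [{a: o[a] for a in attrs} for o in objects]

# 1-D stand-ins for spatial values: a closed interval (lo, hi).
def intersects(a, b):
    return a[0] <= b[1] and b[0] <= a[1]

def intersection(a, b):
    return (max(a[0], b[0]), min(a[1], b[1]))

def length(a):
    return a[1] - a[0]

# Hypothetical sample data.
bavaria = (10.0, 20.0)
rivers = [
    {"rname": "R1", "route": (5.0, 14.0)},
    {"rname": "R2", "route": (30.0, 40.0)},
]

selected = select(rivers, lambda r: intersects(r["route"], bavaria))
with_part = extend(selected, lambda r: intersection(r["route"], bavaria), "part")
with_length = extend(with_part, lambda r: length(r["part"]), "plength")
result = project(with_length, ["rname", "part", "plength"])
```

Each extend step sees the attributes added by the previous one, which is why plength can be computed from part.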
Other Set Operations. Such operations manipulate whole sets of spatial objects in a
special way; they lie at the interface between a spatial algebra and the DBMS object
algebra (Section 2.3). Of particular importance are operations that manipulate
partitions (thematic maps); a collection of such operations is described by Scholl
and Voisard (1989). Closely related is the map algebra by Tomlin (1990). Some
suggested operations are the following:
Overlay. Computes the elementary regions resulting from overlaying two
partitions. It can be viewed as a special kind of spatial join (Frank, 1988;
Güting, 1988; Scholl and Voisard, 1989).
Fusion. This is a special kind of grouping. Objects are grouped by some
arbitrary attribute values. For each resulting group of objects, the union of
all values of a spatial attribute is formed. For example, given a set of region
objects with a "land-use" attribute, one can group by land-use to obtain one
object for land-use "wheat" with the associated union region, etc. (Scholl
and Voisard, 1989; Gargano et al., 1991).
Voronoi. From a set S of point objects, computes a corresponding set of
region objects (the Voronoi diagram). For each point p, the region consists
of the points of the plane closer to p than to any other point in S (Güting,
1988).
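As an illustration of fusion, the following sketch groups objects by an attribute and forms the union of their spatial values. It assumes a grid approximation in which a region is simply a set of cells, so that the geometric union becomes a set union; the function name `fusion` and the sample data are hypothetical.

```python
from collections import defaultdict

def fusion(objects, group_attr, spatial_attr):
    """Group objects by group_attr and union their spatial attribute values."""
    groups = defaultdict(set)
    for o in objects:
        groups[o[group_attr]] |= o[spatial_attr]   # union into the group's region
    return [{group_attr: key, spatial_attr: region}
            for key, region in groups.items()]

# Hypothetical parcels; each region is a set of grid cells (column, row).
parcels = [
    {"land_use": "wheat", "region": {(0, 0), (0, 1)}},
    {"land_use": "corn",  "region": {(1, 0)}},
    {"land_use": "wheat", "region": {(1, 1)}},
]

fused = fusion(parcels, "land_use", "region")
```

With exact geometry the set union would be replaced by a polygon union, which is considerably more involved.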
information to be retrieved is often not the result of a single query but rather a
combination of several queries. For example, for GIS applications, the user wants
to see a map built by graphically overlaying the results of several queries.
Requirements for spatial querying have been analyzed by Frank (1982),
Egenhofer and Frank (1988), and Egenhofer (1994). In Egenhofer (1994), the following
list is given:
1. Spatial data types.
2. Graphical display of query results.
3. Graphical combination (overlay) of several query results. It should be possible
to start a new picture, to add a layer, or to remove a layer from the current
display. (Some systems also allow the order of layers to be changed; Voisard,
1991; Vijlbrief and van Oosterom, 1992).
4. Display of context. To interpret the result of a query, for example, a point
describing the location of a city, it is necessary to show some background,
such as the boundary of a state containing it (Frank, 1982). A raster image
of the area can also nicely serve as a background.
5. A facility for checking the content of a display. When a picture (a map) has
been composed by several queries, one should be able to check which queries
have built it.
6. Extended dialog. It should be possible to use pointing devices to select objects
within a picture or subareas (zooming in), for example, by dragging a rectangle
over the picture.
7. Varying graphical representations. It should be possible to assign different
graphical representations (colors, patterns, intensity, symbols) to different
object classes in a picture, or even to distinguish objects within one class
(e.g., by using different symbols to distinguish cities by population).
8. A legend should explain the assignment of graphical representations to object
classes.
9. Label placement. It should be possible to select object attributes to be
used as labels within a graphical representation. Sophisticated ("nice") label
placement for a map is a difficult problem, however (Freeman and Alan,
1987).
10. Scale selection. At least for GIS applications, selecting subareas should be
based on commonly used map scales. The scale determines not only the size of
the graphical representation, but it also could determine what kind of symbol
is used or whether an object is shown at all (cartographic generalization).
11. Subarea for queries. It should be possible to restrict attention to a particular
area of the space for several following queries.
For part (1), the language SQL, extended by spatial types and operations, was used
by Egenhofer (1994). For parts (2) and (3), a special graphical presentation language
(GPL; Egenhofer, 1991a) was introduced, which provides specifications for most of
the requirements listed above.
it is possible to denote such values, then one can nicely decouple graphical input
and querying; the user interface allows one to draw the value and assign a name
to it, which can then be used in queries. If it is not possible, then a suggested
technique (Chang and Fu, 1980; Frank, 1982; Egenhofer, 1994) is to use a special
keyword within a query such as PICK; parsing the query will lead to an interaction
that allows the user to graphically enter the value, for example:
SELECT sname FROM cities WHERE center inside PICK
In contrast, the expression of other set operations of a spatial algebra does not fit
into the select ... from ... where (SFW) paradigm, because these are algebra operations
at the same level as projection, cartesian product, and selection, as captured by
SFW. Some syntactic facilities required in a query language to accommodate a spatial
algebra completely were described by Güting and Schneider (1993b) where a general
"object model interface" is developed.
From the point of view of the spatial algebra implementation, which is done in
some programming language (most likely the DBMS implementation language), the
representation
is a value of some programming language data type (e.g., region),
is some arbitrary data structure that is possibly quite complex,
supports efficient computational geometry algorithms for spatial algebra
operations,
is not geared to only one particular algorithm, but is balanced to adequately
support many operations.
To fulfill the DBMS requirements, the representation must be a paged data structure
compatible with the DBMS support for long fields or large attribute values. To
support efficient loading and storing on disk, it should consist of a single contiguous
byte block, as long as it is small enough to fit into one page. Otherwise, it can be
a large byte block cut into page-sized pieces. The DBMS then either may allocate
enough internal space to hold the whole value (and map pages into the right positions
of this buffer), or it may implement a more complex paging strategy to access the
value. When a value representation happens to be large, a good strategy is to split
it into a small info part, which will contain often-used summary information about
the value, and an exact geometry part, representing, for example, the long sequence
of vertices, so that it is possible to load only the info part into a DBMS buffer.
The info part might be contained in the DBMS object representation and contain
a logical pointer to a separate page sequence holding the exact geometry part. The
generic operations needed by the DBMS may concern, for example, transforming
from/to a textual or graphic representation for input/output at the user interface, or
transforming from/to an ASCII format for bulk loading or external data exchange.
More specifically, for spatial data types, generic approximations may be needed to
interface with spatial access methods: For example, each data type must provide
access to a bounding box (also called the minimum bounding rectangle (MBR)).
From both the spatial algebra and the programming language point of view,
the representation should be such that it is mapped by the compiler into a single or
perhaps a few contiguous areas (to support the DBMS loading). For example, it can
be defined as a pointer to a record with several fixed-size components and a very
large array (for the exact geometry) at the end; then one can dynamically allocate
the right amount of space for a given value. Apart from that, the representation
can support operations as follows:
Plane sweep sequence. Very often, algorithms on the exact geometry use
a plane-sweep. The sweep needs the components of the object (e.g., the
vertices) in some fixed order (e.g., x-order). It is highly advantageous to
store this order explicitly in the object so that not every sweep needs to sort
vertices first.
Approximations. The implementation of many operations starts with a rough
test on an approximation of the object. Usually this is the bounding box, but
there can also be other approximations. Hence, these should be part of the
representation.
Stored unary function values. Some operations of the spatial algebra compute
properties of a spatial value (e.g., the area or perimeter of a region). Since
these can be expensive to compute, they may be computed once after the
creation of the value and then be stored with it.
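A minimal sketch of such a representation for a simple polygon is given below. The class and field names (`InfoPart`, `RegionValue`, `x_order`) are assumptions made for illustration; the sketch only shows how a bounding box, a stored unary function value, and a precomputed plane-sweep order can sit next to the exact geometry, and it ignores paging and holes.

```python
from dataclasses import dataclass

@dataclass
class InfoPart:
    """Small summary that can be loaded without the exact geometry."""
    bbox: tuple        # approximation: (xmin, ymin, xmax, ymax)
    area: float        # stored unary function value
    n_vertices: int

@dataclass
class RegionValue:
    info: InfoPart
    vertices: list     # exact geometry: polygon vertices in boundary order
    x_order: list      # vertex indices sorted by (x, y), ready for a plane sweep

def make_region(vertices):
    """Build the representation once, when the value is created."""
    n = len(vertices)
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    # Shoelace formula for the area of a simple polygon.
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    info = InfoPart((min(xs), min(ys), max(xs), max(ys)), abs(twice_area) / 2.0, n)
    x_order = sorted(range(n), key=lambda i: vertices[i])
    return RegionValue(info, list(vertices), x_order)

square = make_region([(0, 0), (4, 0), (4, 3), (0, 3)])
```

An operation such as area then reads `square.info.area` without touching the vertex list, and a sweep can iterate over `x_order` without sorting first.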
The representation strategy described above does assume, in fact, a particular DBMS
architecture, namely, that of an extensible DBMS. Hence, some of the remarks may
not be valid for an architecture that, for example, stores its SDT values separately
indexing: (1) dedicated external spatial data structures are added to the system,
offering for spatial attributes what a B-tree does for standard attributes, and (2)
spatial objects are mapped into a 1-D space so that they can be stored within a
standard 1-D index such as a B-tree. Apart from spatial selection, spatial indexing
supports also other operations such as spatial join, finding the object closest to a
query value, etc.
A fundamental idea for spatial indexing and, in fact, for all spatial query
processing, is the use of approximations. This allows index structures to manage
an object in terms of one or more spatial keys, which are much simpler geometric
objects than the SDT value itself. A continuous approximation is based on the
coordinates of the SDT value itself. The prime example is the bounding box (the
smallest axis-parallel rectangle enclosing the SDT value). For grid approximations,
space is divided into cells by a regular grid, and the SDT value is represented by the
set of cells that it intersects. Figure 5 illustrates the two kinds of approximations.
The use of approximations leads to a filter and refine strategy for query processing
(Frank, 1981; Orenstein and Manola, 1988): First, based on the approximations,
a filtering step is executed; it returns a set of candidates that is a superset of the
objects fulfilling a predicate. Second, for each candidate (or pair of candidates
in case of spatial join) in a refinement step the exact geometry is checked. This
strategy has more recently been extended to include a second filtering step where
more precise approximations of the candidate objects are checked (Brinkhoff et al.,
1993a).
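The two steps can be sketched for a simple point query ("which polygons contain point p?"). The function names and sample data are assumptions for illustration; the filter uses bounding boxes and the refinement uses a standard ray-casting test, which stands in for whatever exact algorithm a system provides.

```python
def bbox(poly):
    """Bounding box of a polygon; a real system would store it, not recompute it."""
    xs = [x for x, _ in poly]
    ys = [y for _, y in poly]
    return (min(xs), min(ys), max(xs), max(ys))

def bbox_contains(box, p):
    return box[0] <= p[0] <= box[2] and box[1] <= p[1] <= box[3]

def point_in_polygon(p, poly):
    """Exact test by ray casting: count edge crossings of a ray going right from p."""
    x, y = p
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):                       # edge spans the ray's y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def filter_and_refine(objects, p):
    # Filter step: cheap test on approximations, yields a superset of the result.
    candidates = [name for name, poly in objects.items()
                  if bbox_contains(bbox(poly), p)]
    # Refinement step: exact geometry is checked for candidates only.
    results = [name for name in candidates if point_in_polygon(p, objects[name])]
    return candidates, results

# Hypothetical sample data.
objects = {
    "triangle": [(0, 0), (10, 0), (0, 10)],
    "square": [(20, 20), (30, 20), (30, 30), (20, 30)],
}
far_candidates, far_results = filter_and_refine(objects, (8, 8))
near_candidates, near_results = filter_and_refine(objects, (2, 2))
```

For the point (8, 8) the triangle survives the filter because its bounding box contains the point, but it is rejected in the refinement step; such false hits are the price of the cheap approximation.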
Due to the use of bounding boxes, most spatial data structures are designed to
store either a set of points (for point values) or a set of rectangles (for line or region
values). The operations offered by such structures are insert, delete, and member
(find a stored rectangle or point) to manage the set as such. Apart from that, one
or more query operations are supported. For stored points, some important types
of queries are:
Range query: Find all points within a query rectangle.
For rectangles:
Intersection query: Find all rectangles intersecting a query rectangle.
Containment query: Find all rectangles completely within a query rectangle.
A spatial index structure organizes objects within a set of buckets (which normally
correspond to pages of secondary memory--some special approaches use varying size
buckets with many pages; Dröge and Schek, 1993). Each bucket has an associated
bucket region, a part of space containing all objects stored in the bucket. Bucket
regions are usually rectangles. For point data structures, these regions are normally
disjoint and they partition the space so that each point belongs to precisely one
bucket. For some rectangle data structures, bucket regions may overlap. Figure 6
shows a partition where each bucket can hold up to 3 points.
Like index structures for standard attributes, the structure can be a clustering
or a secondary index. A clustering index stores the actual spatial objects. An entry
in a secondary index is just a spatial key (e.g., point or rectangle) together with a
logical pointer to the object in the database.
In the following three subsections, we first consider 1-D embeddings that allow
the use of standard index structures such as the B-tree. We then discuss dedicated
spatial data structures for points and for rectangles.
4.2.1 1-D Embedding of Grid Approximations. The basic idea for this is (1) to find
a linear order for the cells of the grid such that cells close together in space are also
(as far as possible) close to each other in the linear order, and (2) to define this
order recursively for a grid that is obtained by a hierarchical subdivision of space.
Figure 7 shows the most popular such order, bit interleaving, proposed by Morton
(1966) and later rediscovered several times (e.g., Abel and Smith, 1983; Gargantini,
1982). Orenstein (1986) used it as a general basis for query processing in the
PROBE system (Orenstein and Manola, 1988), and called it z-order. In Figure 7,
the left diagram shows the ordering imposed on the four quadrants of the top level
of a regular hierarchical partition. On the right side, this is continued to the next
level: within each quadrant, cells are connected in z-order, and then the groups of
cells of the four quadrants are again connected in z-order. Each cell at each level
of the hierarchy has an associated bit string whose length corresponds to the level
to which the cell belongs. For example, the top-right cell in the left diagram has bit
string 11; in the right diagram, cell 1110 is shown. The bit string 1110 is obtained by
choosing 11 at the top level, and then 10 within the top level quadrant. One can
also think of it as being composed of a 11 x-coordinate (used for the first and third
bit) and a 10 y-coordinate (used for the second and fourth bit) which has led to the
name bit interleaving. The order that is so imposed on all cells of a hierarchical
subdivision is given by the lexicographical order of the bit strings.
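The interleaving just described can be sketched in a few lines. This is a minimal illustration, not code from the PROBE system: the names `z_key` and `z_decode` are hypothetical, and cell coordinates are assumed to be non-negative integers of `levels` bits each, with the x bit placed before the y bit at every level, as in the 1110 example.

```python
def z_key(x: int, y: int, levels: int) -> str:
    """Interleave the bits of x and y (x first), most significant bit first."""
    bits = []
    for i in range(levels - 1, -1, -1):
        bits.append(str((x >> i) & 1))  # first, third, ... bit comes from x
        bits.append(str((y >> i) & 1))  # second, fourth, ... bit comes from y
    return "".join(bits)


def z_decode(key: str) -> tuple[int, int]:
    """Split an even-length, non-empty bit string back into (x, y)."""
    return int(key[0::2], 2), int(key[1::2], 2)


# The cell with x-coordinate 11 and y-coordinate 10 from the text:
print(z_key(0b11, 0b10, 2))  # 1110
```

Sorting cells by these strings gives the z-order, since the order is the lexicographical order of the bit strings.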
Any shape (set of cells) over the grid can now be decomposed into a minimal
number of cells at different levels, always using the highest possible level. It can
therefore be represented by a set of bit strings (Figure 8), called z-elements by
Orenstein (1986).
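As a rough sketch of such a decomposition, the following function splits the grid recursively and emits a bit string as soon as a block lies completely inside the shape. It rests on several assumptions that are not stated in the text: the shape is an axis-parallel rectangle given in half-open integer cell coordinates, the grid has 2^levels cells per side, and the subdivision alternates between x and y so that bit strings of odd length are allowed. The name `z_elements` is hypothetical.

```python
def z_elements(rect, levels):
    """Decompose rect = (x0, y0, x1, y1), half-open, into maximal z-elements."""
    x0, y0, x1, y1 = rect
    out = []

    def visit(prefix, bx0, by0, bx1, by1):
        # Block and rectangle are disjoint: nothing to emit.
        if bx0 >= x1 or bx1 <= x0 or by0 >= y1 or by1 <= y0:
            return
        # Block lies entirely inside the rectangle: emit it, do not split further.
        if x0 <= bx0 and bx1 <= x1 and y0 <= by0 and by1 <= y1:
            out.append(prefix)
            return
        if len(prefix) % 2 == 0:  # even depth: split along x
            mid = (bx0 + bx1) // 2
            visit(prefix + "0", bx0, by0, mid, by1)
            visit(prefix + "1", mid, by0, bx1, by1)
        else:                     # odd depth: split along y
            mid = (by0 + by1) // 2
            visit(prefix + "0", bx0, by0, bx1, mid)
            visit(prefix + "1", bx0, mid, bx1, by1)

    n = 2 ** levels
    visit("", 0, 0, n, n)
    return out  # already in lexicographical order
```

On a 4 x 4 grid, the top-right quadrant `(2, 2, 4, 4)` comes out as the single z-element `11`, while the rectangle `(1, 0, 3, 2)` needs the two elements `001` and `100`.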
For a given spatial object, one can therefore use its corresponding set of z-
elements as a set of spatial keys. To build an index for a set of objects, one can
just form the union of all these spatial keys and put them in lexicographical order
into a B-tree. Because of the proximity-preserving property of this embedding,
various types of queries can now be answered relatively efficiently through B-tree
access. For example, to answer a containment or range query with a rectangle r, this
rectangle is itself decomposed into a number of z-elements. For each z-element,
one portion of the leaf sequence of the B-tree is scanned containing all entries
having that z-element as a prefix. This returns a set of candidates, which are then
checked in the refine step to determine whether containment actually holds.
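The prefix scan can be imitated with a sorted list standing in for the leaf sequence of the B-tree. This is only a sketch under that assumption; the names `range_candidates` and `index`, and the use of string object identifiers, are illustrative choices rather than part of any system described here.

```python
import bisect


def range_candidates(index, query_elements):
    """index: list of (z_element, object_id) pairs sorted lexicographically.

    For each query z-element, scan the contiguous run of entries that have
    it as a prefix and collect the object ids as candidates for the refine step.
    """
    found = set()
    for q in query_elements:
        # First entry whose key is >= q; ids are strings, so "" sorts first.
        i = bisect.bisect_left(index, (q, ""))
        while i < len(index) and index[i][0].startswith(q):
            found.add(index[i][1])
            i += 1
    return found


index = sorted([("001", "a"), ("100", "a"), ("0010", "b"),
                ("11", "c"), ("1101", "d")])
print(sorted(range_candidates(index, ["11"])))  # ['c', 'd']
```

Because all keys sharing a prefix are adjacent in lexicographical order, each query z-element touches one contiguous portion of the sequence.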
VLDB Journal 3 (4) Güting: An Introduction to Spatial Database Systems 379
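[Figure 8: a shape represented by z-elements; surviving labels include the bit strings 10010 and 100110]
[Figure 9: grid file; surviving labels: scales (x1 to x4, y1 to y3), buckets]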
4.2.2 Spatial Index Structures for Points. Data structures for representing points in
a k-dimensional space have a much longer tradition than spatial database systems.
This is because a tuple consisting of k attributes, t = (x1, ..., xk), can be viewed as
a point in k dimensions and, therefore, such data structures can be used to support
multi-attribute retrieval. On the other hand, they also can store points with a
geometrical interpretation. Two well-known representatives of such data structures
are the grid file (Nievergelt et al., 1984) and the kd-tree (Bentley, 1975). The latter
is an internal data structure, but also has been used as a basis for external index
structures.
The grid file (Figure 9) partitions the data space into cells by an irregular
grid. It is characteristic of this partition that split lines extend through the whole
space. The split line positions are kept in scales, using one scale per dimension.
The directory is a k-dimensional array whose entries are logical pointers to buckets.
Each cell of the data space corresponds to one element of the directory array, and
all points lying within a cell are stored in the bucket pointed to by the corresponding
directory entry. Several cells may be mapped into the same bucket so that bucket
regions, in general, consist of more than one cell.
The scales are relatively small structures and can be kept in memory; the
directory resides in a set of pages on disk. To find the bucket that contains a
particular point, one would determine with the help of the scales the address of the
page containing the directory entry for the cell containing it. The second page access
already retrieves this bucket. Range queries can be answered by determining from
the directory the set of buckets containing cells intersected by the query rectangle,
and then by examining the points in these buckets. For the treatment of bucket
overflows or underflows see Nievergelt et al. (1984).
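The lookup path through scales and directory can be sketched as follows for two dimensions. This is an in-memory toy under stated assumptions, not the structure of Nievergelt et al. (1984): the class name `GridFile`, the list-of-lists directory, and the rule that a point lying exactly on a split line belongs to the upper cell are all illustrative choices, and splitting and merging of buckets are omitted.

```python
import bisect


class GridFile:
    def __init__(self, x_scale, y_scale, directory):
        self.x_scale = x_scale      # sorted split positions along x
        self.y_scale = y_scale      # sorted split positions along y
        self.directory = directory  # directory[i][j] -> bucket id

    def cell_of(self, x, y):
        # The scales turn coordinates into directory indexes.
        return (bisect.bisect_right(self.x_scale, x),
                bisect.bisect_right(self.y_scale, y))

    def bucket_of(self, x, y):
        i, j = self.cell_of(x, y)
        return self.directory[i][j]

    def buckets_for_range(self, x0, y0, x1, y1):
        # All buckets containing a cell intersected by the query rectangle.
        i0, j0 = self.cell_of(x0, y0)
        i1, j1 = self.cell_of(x1, y1)
        return {self.directory[i][j]
                for i in range(i0, i1 + 1) for j in range(j0, j1 + 1)}


# Two split lines in x, one in y; cells (0,0) and (1,0) share bucket "A".
gf = GridFile([10, 20], [5], [["A", "B"], ["A", "C"], ["D", "D"]])
print(gf.bucket_of(15, 7))  # C
```

The shared bucket "A" illustrates that a bucket region may consist of more than one cell, so a range query collects bucket ids in a set to avoid reading a bucket twice.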
The kd-tree is a binary tree where each internal node contains a key drawn from
one of the k dimensions; leaves contain the points to be stored. The key in the root
node (at level 0, counting from top to bottom) divides the data space with respect
to dimension 0, the keys in its sons, at level 1, divide the two subspaces with respect
to dimension 1, and so forth, up to dimension k-1, after which cycling through the
dimensions restarts. Figure 6 shows a kd-tree partitioning of the data space. For the
original kd-tree, the recursive splitting of space stops when each cell contains only a
single point. This has been transformed to an external data structure by letting each
cell of the partition correspond to a bucket and by also paging the binary tree itself
in the KDB-tree (Robinson, 1981), which is also a generalization of the B-tree (all
leaves are at the same level). Another variant is the LSD-tree (Henrich et al., 1989),
which abandons the strict cycling through the dimensions and makes it possible to
choose the dimension for splitting based on local criteria (therefore called local
split decision tree). The second important aspect of the LSD-tree is a clever paging
algorithm that keeps the external path length balanced, even for very unbalanced
binary trees. This allows the LSD-tree to deal rather well with skewed distributions
of points that particularly arise when extended spatial objects (k-dimensional boxes,
rectangles) are mapped into points through the transformation approach (Section
4.2.3). Other point data structures are, for example, EXCELL (Tamminen, 1982),
the buddy hash tree (Seeger and Kriegel, 1990), the BANG file (Freeston, 1987),
or the hB-tree (Lomet and Salzberg, 1989).
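A small in-memory sketch of the basic kd-tree may make the cycling through dimensions concrete. It assumes median splits and a tuple-based node layout, neither of which is prescribed by the text; the names `build_kd` and `range_search` are hypothetical, and none of the paging done by the KDB-tree or LSD-tree is modeled.

```python
def build_kd(points, depth=0):
    """Split on dimension depth mod k until each cell holds at most one point."""
    if len(points) <= 1:
        return ("leaf", points)
    k = len(points[0])
    dim = depth % k                       # cycle through the k dimensions
    pts = sorted(points, key=lambda p: p[dim])
    mid = len(pts) // 2
    key = pts[mid][dim]                   # left coords <= key <= right coords
    return ("node", dim, key,
            build_kd(pts[:mid], depth + 1),
            build_kd(pts[mid:], depth + 1))


def range_search(tree, lo, hi):
    """Return all points p with lo[d] <= p[d] <= hi[d] in every dimension d."""
    if tree[0] == "leaf":
        return [p for p in tree[1]
                if all(l <= c <= h for l, c, h in zip(lo, p, hi))]
    _, dim, key, left, right = tree
    out = []
    if lo[dim] <= key:                    # query reaches into the left subspace
        out += range_search(left, lo, hi)
    if hi[dim] >= key:                    # query reaches into the right subspace
        out += range_search(right, lo, hi)
    return out


pts = [(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)]
print(sorted(range_search(build_kd(pts), (3, 1), (8, 5))))
# [(5, 4), (7, 2), (8, 1)]
```

An LSD-tree-like variant would replace the fixed `depth % k` rule by a locally chosen split dimension.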
[Figure 11: R-tree; root node entries A, B, C; son node entries D, E, F; G, H, I; J, K, L, M]
root node contains a rectangle A, which is the bounding box of the rectangles D,
E, and F, stored in the son associated with A. Rectangles may overlap; hence, a
rectangle can intersect several bucket regions but will be represented only in one
of them. An advantage is that a spatial object can be kept in just one bucket. A
problem is that now the search needs to branch and follow several paths whenever
one is interested in a region lying in the overlap of two son regions. To keep search
efficient, it is crucial to minimize the overlap of node regions. This is determined
by the split strategy on overflow. Several strategies based on different heuristics
have been studied (Guttman, 1984; Greene, 1989; Beckmann et al., 1990); the latter
study proposed the R*-tree, which appeared to perform best in experiments.
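The branching of the search can be seen in a minimal sketch. The node layout used here (a dict with a `leaf` flag and a list of `(rectangle, child)` entries) is an assumption made for illustration, as are the function names; insertion and the split strategies discussed above are left out.

```python
def intersects(a, b):
    """Closed rectangles (x0, y0, x1, y1) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query):
    """Follow every son whose bounding box intersects the query rectangle."""
    hits = []
    for rect, child in node["entries"]:
        if intersects(rect, query):
            if node["leaf"]:
                hits.append(child)                 # child is an object id
            else:
                hits.extend(search(child, query))  # may descend several paths
    return hits


son_a = {"leaf": True, "entries": [((0, 0, 4, 4), "D"), ((3, 3, 6, 6), "E")]}
son_b = {"leaf": True, "entries": [((5, 5, 9, 9), "G"), ((8, 0, 10, 2), "H")]}
root = {"leaf": False,
        "entries": [((0, 0, 6, 6), son_a), ((5, 0, 10, 9), son_b)]}

# The query lies in the overlap of the two son regions, so both are visited.
print(search(root, (5, 5, 6, 6)))  # ['E', 'G']
```

The smaller the overlap between the regions of `son_a` and `son_b`, the fewer queries have to descend into both.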
Clipping. A variant of the R-tree, called R+-tree, was proposed by Sellis et al.
(1987) and Faloutsos et al. (1987), and was used in the PSQL database system
(Roussopoulos et al., 1988). It completely avoids overlapping regions associated
with buckets or internal nodes of the same level by clipping data rectangles, if
necessary.
In Figure 12, an R+-tree is shown for the same set of data rectangles as in Figure
11. Here the rectangles A, B, and C in the root are chosen a bit differently to keep
them, and therefore the three sons' bucket regions, disjoint. Now it is necessary
to clip rectangles D and J so that each of them is represented in two buckets.
Experimental comparisons of spatial index structures including R-tree variants can
be found in Greene (1989), Smith and Gao (1990), and Beckmann et al. (1990).
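Clipping a data rectangle against disjoint bucket regions can be sketched as follows. The regions, the half-open treatment of boundaries, and the names `clip` and `assign` are assumptions for illustration; the actual R+-tree insertion and split algorithms are not shown.

```python
def clip(rect, region):
    """Return the part of rect inside region, or None if the overlap is empty."""
    x0, y0 = max(rect[0], region[0]), max(rect[1], region[1])
    x1, y1 = min(rect[2], region[2]), min(rect[3], region[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def assign(rect, regions):
    """Map each region name to the piece of rect that falls into it."""
    pieces = {}
    for name, region in regions.items():
        piece = clip(rect, region)
        if piece is not None:
            pieces[name] = piece
    return pieces


regions = {"A": (0, 0, 5, 10), "B": (5, 0, 10, 10)}  # disjoint son regions
print(assign((3, 2, 7, 4), regions))
# {'A': (3, 2, 5, 4), 'B': (5, 2, 7, 4)}
```

A rectangle crossing the boundary is thus represented in two buckets, which is the price paid for regions that never overlap.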
There has been a tremendous amount of work on spatial index structures, and
it is not possible to cover it completely in this survey. Other directions include
quadtree variants (surveyed in Samet, 1990), which are closely related to the grid
approximation schemes of Section 4.2.1, or cell trees (Günther, 1988; Günther and
Bilmes, 1989), which do not store rectangles, but work with polygonal subdivisions
of the plane directly. An excellent survey of spatial index structures can be found
in Widmayer (1991). The article by Lin et al. in this special issue introduces the
TV-tree, a data structure for indexing sets of points in a high-dimensional space,
which is somewhat similar to an R-tree (Lin et al., 1994). It is a good example
for the design and analysis techniques needed in the development of spatial index
structures as described in this section.
It should be clear now that spatial index structures, offering a few fundamental
query operations, can support selection with many different spatial predicates through
the filter and refine strategy. For example, a query for all regions in a partition adjacent
to a given region can be answered by checking candidates from an intersection query;
to find all regions within a certain distance from a query point, one can also find
candidates with an intersection query that uses a suitable square around the point.
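The distance example can be written out as a filter step followed by a refine step. To keep the exact geometry simple, this sketch assumes that every region is a disc `(cx, cy, r)`; that choice, the linear scan standing in for an index, and the name `within_distance` are illustrative assumptions.

```python
import math


def intersects(a, b):
    """Closed rectangles (x0, y0, x1, y1) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def within_distance(discs, p, d):
    """Return (candidates, hits) for discs within distance d of point p."""
    square = (p[0] - d, p[1] - d, p[0] + d, p[1] + d)
    # Filter: bounding box of the disc against a square around the point.
    candidates = [name for name, (cx, cy, r) in discs.items()
                  if intersects((cx - r, cy - r, cx + r, cy + r), square)]
    # Refine: exact distance from the point to the disc.
    hits = [name for name in candidates
            if math.hypot(discs[name][0] - p[0], discs[name][1] - p[1])
            - discs[name][2] <= d]
    return candidates, hits


discs = {"near": (1.5, 0, 1), "corner": (2, 2, 1.2), "far": (5, 5, 1)}
print(within_distance(discs, (0, 0), 1))
# (['near', 'corner'], ['near'])
```

The disc named `corner` passes the filter because its bounding box touches the square, but the refine step rejects it as a false hit.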
The filter and refine strategy was extended by Brinkhoff et al. (1993a) to
include a second filter step with finer approximations than the bounding box; they
compared, for example, bounding ellipses, convex hulls, and convex 5-corners. These
are conservative approximations, which means they include the actual SDT values.
Better conservative approximations are able to exclude some false hits from further
consideration. In the second filter step one can also use progressive approximations,
which are contained in the actual SDT value, such as a maximum enclosed circle or
a maximum enclosed rectangle (Brinkhoff et al., 1994). These allow one to identify
hits; if two progressive approximations intersect, their SDT values are guaranteed
to intersect. The goal is to avoid, as far as possible, the expensive loading and
comparison of the exact geometries. It also was suggested to decompose very large
SDT values into several components so that checking the exact geometry can for
most queries be restricted to one of the components (Kriegel et al., 1991).
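The role of the two kinds of approximation can be summarized in a small decision function. This sketch assumes, for simplicity, that both the conservative and the progressive approximation are rectangles (an enclosing and an enclosed one); the dict keys `outer` and `inner` and the name `classify` are hypothetical.

```python
def intersects(a, b):
    """Closed rectangles (x0, y0, x1, y1) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def classify(a, b):
    """Decide a pair of objects from their approximations alone, if possible."""
    if not intersects(a["outer"], b["outer"]):
        return "false hit"   # conservative approximations are disjoint
    if intersects(a["inner"], b["inner"]):
        return "hit"         # progressive approximations already intersect
    return "candidate"       # only the exact geometries can decide


a = {"outer": (0, 0, 10, 10), "inner": (3, 3, 7, 7)}
b = {"outer": (6, 6, 16, 16), "inner": (9, 9, 13, 13)}
c = {"outer": (5, 5, 15, 15), "inner": (6, 6, 12, 12)}
print(classify(a, b), classify(a, c))  # candidate hit
```

Only pairs classified as `candidate` require loading and comparing the exact geometries.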
For grid approximations, and for an overlap predicate, Orenstein (1986) and Orenstein
and Manola (1988) described join algorithms to determine pairs of candidates.
Essentially, a parallel scan of the two sets of z-elements corresponding to the two
sets of spatial objects is performed, similar to a merge join for a < predicate.
Note that overlay, a particularly important operation for GISs, is a special case
(Orenstein, 1991). A general problem with grid approximations is that choosing
too fine a grid leads to inefficiency because too many z-elements per object are
created, whereas too coarse a grid may deliver too many "false hits" in a spatial
join (Orenstein, 1989).
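One way to picture the parallel scan is a merge of the two sorted z-element sequences that keeps, for each input, a stack of elements that are prefixes of the current one. This is a sketch in the spirit of the description above, not a transcription of Orenstein's algorithm: two z-elements are taken to overlap exactly when one is a prefix of the other, and the name `z_merge_join` is hypothetical.

```python
import heapq


def z_merge_join(a, b):
    """a, b: lists of (z_element, object_id) sorted by z_element.

    Returns the set of candidate pairs (id_a, id_b) whose z-elements overlap.
    """
    stacks = ([], [])  # per input: elements that contain the current element
    pairs = set()
    merged = heapq.merge(((k, 0, i) for k, i in a),
                         ((k, 1, i) for k, i in b))
    for key, side, obj in merged:
        # Drop elements that are no prefix of the current key; since the
        # scan is in lexicographical order they cannot match anything later.
        for stack in stacks:
            while stack and not key.startswith(stack[-1][0]):
                stack.pop()
        # Everything left on the other input's stack contains the current cell.
        for _, other in stacks[1 - side]:
            pairs.add((obj, other) if side == 0 else (other, obj))
        stacks[side].append((key, obj))
    return pairs


a = [("001", "a1"), ("100", "a1"), ("11", "a2")]
b = [("0010", "b1"), ("01", "b4"), ("10", "b2"), ("1101", "b3")]
print(sorted(z_merge_join(a, b)))
# [('a1', 'b1'), ('a1', 'b2'), ('a2', 'b3')]
```

Each input is read once and in order, which is what makes the method resemble a merge join.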
If the filter step is based on the use of bounding boxes, then the problem is
to determine for two sets of rectangles R, S, all pairs (r, s), r ∈ R, s ∈ S, such that
r intersects s. If neither operand is represented in a spatial index, a good
technique is to use a rectangle intersection algorithm from computational geometry,
which solves this problem precisely. Such an algorithm, called bb_join, has been
used in the Gral system (Güting, 1989; Becker and Güting, 1992). The basis is
an external divide-and-conquer algorithm (Becker and Güting, 1992; Güting and
Schilling, 1987), somewhat similar to external merge sorting. Note that even when
base object sets are represented in a spatial index, such a method is needed in
query processing (e.g., when the two operand sets have been determined through
other indexes, or are themselves the result of geometric set operations). This also
was emphasized by Lo and Ravishankar (1994), who suggested building an index
for one of the operands on the fly, and who described a new tree structure, seeded
trees, particularly suitable for this method.
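For illustration only, the following sketch computes the intersecting pairs of two unindexed rectangle sets with a simple in-memory plane sweep. This is a different and much simpler technique than the external divide-and-conquer algorithm behind bb_join; the name `sweep_join` and the closed-rectangle convention are assumptions.

```python
def sweep_join(R, S):
    """Return sorted index pairs (i, j) such that R[i] intersects S[j].

    Rectangles are closed and given as (x0, y0, x1, y1).
    """
    sets = (R, S)
    # Sweep over left edges in x order; side 0 is R, side 1 is S.
    events = sorted([(r[0], 0, i) for i, r in enumerate(R)] +
                    [(s[0], 1, j) for j, s in enumerate(S)])
    active = ([], [])  # rectangles of each set that the sweep line may still cut
    pairs = []
    for x, side, idx in events:
        rect = sets[side][idx]
        other = 1 - side
        # Discard rectangles of the other set that end left of the sweep line.
        active[other][:] = [k for k in active[other] if sets[other][k][2] >= x]
        for k in active[other]:
            o = sets[other][k]
            if rect[1] <= o[3] and o[1] <= rect[3]:  # y-intervals overlap
                pairs.append((idx, k) if side == 0 else (k, idx))
        active[side].append(idx)
    return sorted(pairs)


R = [(0, 0, 2, 2), (5, 5, 6, 6)]
S = [(1, 1, 3, 3), (2, 0, 4, 1), (7, 7, 8, 8)]
print(sweep_join(R, S))  # [(0, 0), (0, 1)]
```

The result is the set of candidate pairs with intersecting bounding boxes, which would then go to the refine step.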
If one operand is represented in a spatial index, then an index join or repeated
search join can be used (Becker and Güting, 1992; Lo and Ravishankar, 1994). This
is a classical technique, usually used with a B-tree index, which can be applied equally
well to spatial index structures. Hence, if the "inner" operand is represented in an
index supporting rectangle intersection queries, one can scan the "outer" operand
set; for each object, the bounding box of its SDT attribute is used as a search
argument on the index. As a result one again obtains a set of candidate pairs with
intersecting rectangles. Repeated search join is especially efficient if the outer set
is not too big (e.g., it is the result of a selection from a large set). If both sets are
large, bb_join may be more efficient. Such choices have to be made by the query
optimizer.
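A repeated search join can be sketched with any index that answers rectangle intersection queries. Here a toy uniform-grid index stands in for the spatial index of the inner operand; the class `GridIndex`, its cell size, and the function `index_join` are hypothetical names chosen for this example.

```python
from collections import defaultdict


def intersects(a, b):
    """Closed rectangles (x0, y0, x1, y1) share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class GridIndex:
    """Toy index: each rectangle is registered in every grid cell it touches."""

    def __init__(self, cell=10):
        self.cell = cell
        self.cells = defaultdict(list)

    def _cells(self, r):
        c = self.cell
        for i in range(int(r[0] // c), int(r[2] // c) + 1):
            for j in range(int(r[1] // c), int(r[3] // c) + 1):
                yield i, j

    def insert(self, key, r):
        for ij in self._cells(r):
            self.cells[ij].append((key, r))

    def search(self, q):
        found = set()
        for ij in self._cells(q):
            for key, r in self.cells.get(ij, ()):
                if intersects(r, q):
                    found.add(key)
        return sorted(found)


def index_join(outer, inner_index):
    """Scan the outer set; use each bounding box as a search argument."""
    return [(o, i) for o, box in outer.items() for i in inner_index.search(box)]


inner = GridIndex()
for key, box in {"s1": (0, 0, 5, 5), "s2": (12, 12, 25, 25),
                 "s3": (40, 40, 41, 41)}.items():
    inner.insert(key, box)
outer = {"r1": (4, 4, 13, 13), "r2": (30, 30, 35, 35)}
print(index_join(outer, inner))  # [('r1', 's1'), ('r1', 's2')]
```

The cost is one index search per outer object, which is why the method pays off mainly when the outer set is small.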
Recent research into spatial join methods has focused on the case where both
operands have a spatial index. The basic idea is to perform a synchronized traversal
of the two index structures so that pairs of cells whose respective partitions cover
the same part of space are encountered together. A parallel traversal of two grid
files was examined by Rotem (1991) and Becker et al. (1993); of R-trees by Brinkhoff
et al. (1993b). Günther (1993) studied traversal of generalization trees, which
can represent nested polygonal partitions directly but can also be viewed as a
generalization of R-trees, for example. He also derived cost formulas for several
distributions, and compared the cost of nested-loop join (i.e., filtering the cartesian
5. System Architecture
5.1 Requirements
At the system architecture level, the problem is to integrate the tools described
in Section 4 to support spatial data types--and more. In principle, the following
extensions to a standard architecture need to be accommodated:
representations for the data types of a spatial algebra,
procedures for the atomic operations,
spatial index structures,
access operations for spatial indexes,
filter and refine techniques,
spatial join algorithms,
cost functions for all these operations,
statistics for estimating selectivity of spatial selection and spatial join,
extensions of the optimizer to map queries into the specialized query processing methods,
spatial data types and operations within data definition and query language,
user interface extensions to handle graphical representation and input of SDT values.
In our view, the only clean way to accommodate these extensions is an integrated
architecture based on the use of an extensible DBMS. Nevertheless, GISs were
constructed before extensible DBMS technology was available, and we shall first
review previous approaches to GIS architecture.
The first generation of GIS was built directly on top of file systems, and did not
offer the benefits of DBMSs such as high-level data definition, flexible querying,
and transaction management. They are not further discussed here. When DBMS
technology and, in particular, relational systems, became available, attempts were
made to use them as a basis. The two main approaches are layered architecture and
dual architecture (following the terminology of Vijlbrief and Oosterom, 1992; see
also Larue et al., 1993).
[Figure: architecture diagram; surviving labels: Spatial Tools, Standard DBMS, Integration Layer]
Research into extensible database systems (e.g., POSTGRES, Stonebraker and Rowe,
1986; Probe, Dayal et al., 1987; EXODUS, Graefe and DeWitt, 1987; GENESIS,
Batory et al., 1988; Gral, Güting, 1989; Sabrina, Gardarin et al., 1989; Starburst,
Haas et al., 1989; DASDBS, Schek et al., 1990) was aimed at making precisely the
kinds of extensions required in Section 5.1 possible. The use of an extensible system
GEO-Kernel (Schek et al., 1990; Wolf, 1989), and Gral (Güting, 1989; Becker
and Güting, 1992). Possible uses of extensibility, in particular in the context of
the Starburst system, for spatial database applications were discussed by Haas and
Cody (1991). More recent prototypes are GEO++ (van Oosterom and Vijlbrief,
1991; Vijlbrief and van Oosterom, 1992), based on POSTGRES, and GéoSabrina
(Larue et al., 1993), based on Sabrina.
In the Probe system (Orenstein, 1986; Orenstein and Manola, 1988), spatial data
types can be introduced as refinements (within an object-oriented class hierarchy)
of a general POINT-SET data type. For all such types, the system provides built-in
support in the form of approximate geometry processing. This means that SDT
values are represented by sets of z-elements (Section 4.2.1) and that the filter step
for spatial selections (i.e., spatial indexing) and spatial joins is offered in the system
kernel. Recall that this work was a major proponent of the filter and refine strategy
for spatial query processing (Orenstein and Manola, 1988).
Work in the DASDBS project (Schek et al., 1990; Wolf, 1989) has focused on
external data type (EDT) support and on interfacing to generic spatial access methods.
The EDT concept is a variant of data type extensibility assuming that data structures
for an EDT and procedures working on these data structures are probably not coded
specifically for the DBMS but rather have existed in an application environment long
before. The DBMS should be able to work with these given programming language
representations by using appropriate conversion functions. This has recently been
extended to let the DBMS cooperate with a "geometric computation service" (as
an implementation of a spatial algebra) over a network within different run-time
environments (Schek and Wolf, 1993). For spatial indexing, generic access methods
partitioning the data space into cells such as the grid file or the R+-tree are assumed;
to interface with such an access method, each SDT implementation has to offer a
clip and a compose function to determine the piece of the geometry falling into
one cell and to put pieces together again, respectively.
The Gral system (Güting, 1989; Becker and Güting, 1992) emphasizes many-sorted
algebra as a formal basis for its extensible system architecture; it uses
such algebras to describe application-specific query languages and query processing
systems, and provides a rule-based optimizer which transforms a query algebra
expression to an executable expression by applying transformation rules. For spatial
indexing, LSD-trees (Section 4.2.2) are available; spatial joins are supported by
repeated search on LSD-trees or a bounding-box-join algorithm (Section 4.3). The
bounding box is the generic interface between any spatial data type and access
or join methods. The system treats spatial and non-spatial data quite uniformly;
Becker and Güting (1992) demonstrated completely integrated query optimization
and processing, as well as how filter and refine techniques are actually implemented
in the optimizer.
Note that extensibility of a system architecture is rather orthogonal to the data
model implemented by that architecture. For example, Probe offers an object-
oriented or functional data model, DASDBS offers a nested relational model, and
6. Final Remarks
In this survey, we have tried to coherently present the major technical concepts for
spatial database systems. To keep the task manageable, we treated spatial database
systems only in a restricted sense; image database systems have been excluded. Some
interesting work on image databases includes Joseph and Cardenas (1988), Chang
et al. (1991), and Gupta et al. (1991). Fortunately, several articles in this special
issue are related to image databases and therefore help to close the gap: Baumann
describes basic DBMS support for the management of raster data (Baumann, 1994);
Chu et al. show modeling and querying requirements and techniques for images
in medical applications (Chu et al., 1994); and Papadias and Sellis focus on the
management of abstractions of spatial relationships occurring in images (Papadias
and Sellis, 1994).
Another omission, perhaps, is that not much has been said about the various
kinds of applications. A good general source for case studies of GIS applications and
their requirements is the International Journal of Geographical Information Systems.
Such issues also are discussed at the biannual Symposia on Spatial Data Handling.
The SEQUOIA 2000 project (Stonebraker et al., 1993b) addressed the needs of
global change researchers, in particular the need to deal with terabytes of raster
data. Some idea of the requirements of medical applications can be gained from
the paper by Chu et al. in this issue.
There are two recent surveys related to spatial database systems that may
augment the one given here. Günther and Buchmann (1990) focus more on open
research questions. The survey by Bauzer-Medeiros and Pires (1994) is closer to
GIS applications.
Many interesting issues related to spatial database systems could not be included
in this survey, for example:
spatio-temporal modeling,
spatial objects with imprecise boundaries,
multi-scale modeling/cartographic generalization,
data lineage (maintaining information about precision, collection method, etc. of data),
spatial reasoning/deductive spatial databases,
performance benchmarks for spatial DBMS (Stonebraker et al., 1993a).
Integrating solutions to such problems with the spatial database technology described
here will remain a fascinating challenge for database researchers for quite some
time.
Acknowledgments
I thank Max Egenhofer, Andrew Frank, Hans-Jörg Schek, and Timos Sellis, who
carefully read a draft version of this survey and provided many interesting and
useful comments.
References
Abdelmoty, A.I., Williams, M.H., and Paton, N.W. Deduction and deductive databases
for geographic data handling. Proceedings of the Third International Symposium
on Large Spatial Databases, Singapore, 1993.
Abel, D.J. SIRO-DBMS: A database tool kit for geographical information systems.
International Journal of Geographical Information Systems, 3:103-116, 1989.
Abel, D.J. and Ooi, B.C., eds. Proceedings of the Third International Symposium on
Large Spatial Databases, Singapore, 1993.
Abel, D.J. and Smith, J.L. A data structure and algorithm based on a linear key for
a rectangle retrieval problem. Computer Vision, Graphics, and Image Processing,
24:1-13, 1983.
Abiteboul, S. and Beeri, C. On the power of languages for the manipulation of
complex objects. Technical Report 846, Paris: INRIA, 1988.
Agrawal, R. ALPHA: An extension of relational algebra to express a class of recursive
queries. Proceedings of the IEEE Data Engineering Conference, Los Angeles, 1987.
Aref, W. and Samet, H. Extending a DBMS with spatial operations. Proceedings of
the Second International Symposium on Large Spatial Databases, Zürich, 1991a.
Aref, W. and Samet, H. Optimization strategies for spatial query processing. Proceedings
of the Seventeenth International Conference on Very Large Data Bases, Barcelona,
1991b.
Bancilhon, F., Briggs, T., Khoshafian, S., and Valduriez, P. FAD, a powerful and
simple database language. Proceedings of the Thirteenth International Conference
on Very Large Data Bases, Brighton, 1987.
Batory, D.S., Barnett, J.R., Garza, J.F., Smith, K.P., Tsukuda, K., Twichell, B.C.,
and Wise, T.E. GENESIS: An extensible database management system. IEEE
Transactions on Software Engineering, 14:1711-1730, 1988.
Baumann, P. Management of multidimensional discrete data. VLDB Journal, 3(4):401-444,
1994.
Bauzer-Medeiros, C. and Pires, F. Databases for GIS. ACM SIGMOD Record, 23:107-115,
1994.
Dayal, U., Manola, F., Buchmann, A., Chakravarthy, U., Goldhirsch, D., Heiler, S.,
Orenstein, J., and Rosenthal, A. Simplifying complex objects: The PROBE approach
to modelling and querying them. In: Schek, H.-J. and Schlageter, G., eds.,
Proceedings GI-Fachtagung Datenbanksysteme in Büro, Technik und Wissenschaft,
Darmstadt, 1987.
Dröge, G. and Schek, H.-J. Query-adaptive data space partitioning using variable-size
storage clusters. Proceedings of the Third International Symposium on Large
Spatial Databases, Singapore, 1993.
Dröge, G., Schek, H.-J., and Wolf, A. Erweiterbarkeit in DASDBS (Extensibility
in DASDBS). Informatik Forschung und Entwicklung, 5:162-176, 1990.
Egenhofer, M. A formal definition of binary topological relationships. Proceedings
of the Third International Conference on the Foundations of Data Organization and
Algorithms, Paris, 1989.
Egenhofer, M. Extending SQL for cartographic display. Cartography and Geographic
Information Systems, 18:230-245, 1991a.
Egenhofer, M. Reasoning about binary topological relations. Proceedings of the
Second International Symposium on Large Spatial Databases, Zürich, 1991b.
Egenhofer, M. Why not SQL! International Journal of Geographical Information Systems,
6:71-85, 1992.
Egenhofer, M. Spatial SQL: A query and presentation language. IEEE Transactions
on Knowledge and Data Engineering, 6:86-95, 1994.
Egenhofer, M. and Frank, A. Towards a spatial query language: User interface considerations.
Proceedings of the Fourteenth International Conference on Very Large
Data Bases, Los Angeles, 1988.
Egenhofer, M., Frank, A., and Jackson, J.P. A topological data model for spatial
databases. Proceedings of the First International Symposium on Large Spatial
Databases, Santa Barbara, 1989.
Egenhofer, M. and Herring, J. A mathematical framework for the definition of
topological relationships. Fourth International Symposium on Spatial Data Handling,
Zürich, 1990.
Egenhofer, M. and Herring, J. Categorizing binary topological relationships between
regions, lines, and points in geographic databases. Technical Report. Department
of Surveying Engineering, University of Maine, Orono, ME, 1992.
Erwig, M. and Güting, R.H. Explicit graphs in a functional model for spatial
databases. FernUniversität Hagen, Informatik-Report 110, 1991. To appear in
IEEE Transactions on Knowledge and Data Engineering.
Faloutsos, C., Sellis, T., and Roussopoulos, N. Analysis of object-oriented spatial
access methods. Proceedings of the ACM SIGMOD Conference, San Francisco,
1987.
Frank, A. Application of DBMS to land information systems. Proceedings of the
Seventh International Conference on Very Large Data Bases, Cannes, 1981.
Frank, A. MAPQUERY: Data base query language for retrieval of geometric data
and their graphical representation. Computer Graphics, 16:199-207, 1982.
Herring, J., Larsen, R., and Shivakumar, J. Extensions to the SQL language to
support spatial analysis in a topological data base. Proceedings of GIS/LIS, San
Antonio, TX, 1988.
Hinrichs, K. The grid file system: Implementation and case studies of applications.
Doctoral thesis, ETH Zürich, 1985.
de Hoop, S. and van Oosterom, P. Storage and manipulation of topology in Postgres.
Proceedings of the Third European Conference on Geographical Information Systems,
München, 1992.
Joseph, T. and Cardenas, A. PICQUERY: A high level query language for pictorial
database management. IEEE Transactions on Software Engineering, 14:630-638,
1988.
Keating, T., Phillips, W., and Ingram, K. An integrated topological database design
for geographic information systems. Photogrammetric Engineering and Remote
Sensing, 53:1399-1402, 1987.
Kriegel, H.P., Horn, H., and Schiwietz, M. The performance of object decomposition
techniques for spatial query processing. Proceedings of the Second International
Symposium on Large Spatial Databases, Zürich, 1991.
Larue, T., Pastre, D., and Viémont, Y. Strong integration of spatial domains and
operators in a relational database system. Proceedings of the Third International
Symposium on Large Spatial Databases, Singapore, 1993.
Lin, K.I., The TV-tree: An index structure for high-dimensional data. VLDB Journal,
3(4):519-544, 1994.
Lipeck, U. and Neumann, K. Modelling and manipulating objects in geoscientific
databases. Proceedings of the Fifth International Conference on the Entity-Relationship
Approach, Dijon, France, 1987.
Lo, M.L. and Ravishankar, C.V. Spatial joins using seeded trees. Proceedings of the
ACM SIGMOD Conference, Minneapolis, MN, 1994.
Lomet, D.B. and Salzberg, B. A robust multi-attribute search structure. Proceedings
of the Fifth International Conference on Data Engineering, Los Angeles, 1989.
Lu, W. and Han, J. Distance-associated join indices for spatial range search. Proceedings
of the Ninth International Conference on Data Engineering, Vienna, 1992.
Maingenaud, M. and Portier, M. Cigales: A graphical query language for geographical
information systems. Proceedings of the Fourth International Symposium
on Spatial Data Handling, Zürich, 1990.
Mantey, P.E. and Carlson, E.D. Integrated geographic data bases: The GADS
experience. In: Blaser, A., ed., Data Base Techniques for Pictorial Applications.
Berlin: Springer, 1980, pp. 173-198.
Mehlhorn, K. Data Structures and Algorithms 3: Multidimensional Searching and Computational
Geometry. Berlin: Springer, 1984.
Meyer, B. Beyond icons: Towards new metaphors for visual query languages for
spatial information systems. In: Cooper, R., ed. Interfaces to Database Systems.
Berlin: Springer, 1992.