Exercise 2: 2. Data Storage: Digitizing and Data Structure
Introduction
This exercise deals with spatial data storage in GIS. During this course you will mainly use the relational data model
as a method of structuring data. This means that you structure data as collections of tables that are logically related
to each other by shared attributes.
The first part of this exercise deals with data storage in vector which is based on tables. You will create a new vector
dataset by digitizing points as a method of (secondary) data capture. Digitizing is the process of using a mouse to
mark geographic features on a map, so that their map positions are stored automatically as series of x, y coordinates
in a computer file or an associated table in a database. Subsequently, you will view, create and fill tables. In addition
you will also join tables.
The second part deals with data storage in raster. You will see that spatial data is structured differently in a raster
environment.
In this exercise:
Objectives
Never use blank spaces in the names of datasets! Use the underscore “_” if you want to separate words. Avoid the
use of symbols (e.g. +, -) in dataset names and try to limit the length of the name to 10-12 characters. If you only use
ESRI GIS software, 12 characters is safe; if you also use other GIS software, limit the length of the name to 10
characters to avoid interoperability problems.
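The naming advice above can be checked with a few lines of Python. This is only a minimal sketch: the function names `is_safe_name` and `make_safe_name` are made up for this illustration and are not part of ArcGIS.

```python
import re

# Assumed rule set, taken from the advice above: letters, digits and
# underscores only, and at most `max_len` characters.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_safe_name(name, max_len=10):
    """Return True if `name` follows the naming advice in this exercise."""
    return len(name) <= max_len and _NAME_PATTERN.fullmatch(name) is not None


def make_safe_name(name, max_len=10):
    """Replace blanks and symbols with underscores and cut the name to size."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name.strip())
    return cleaned[:max_len]


print(is_safe_name("Soil_pts"))       # a short name with an underscore
print(is_safe_name("Soil points"))    # contains a blank space
print(make_safe_name("Soil points"))  # blank replaced, cut to 10 characters
```

Use `max_len=12` if you only work with ESRI software; the default of 10 is the stricter limit recommended for exchanging data with other GIS software.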
If your data consist of features too small to be depicted either as lines or areas, when the features have no
dimensionality, or when you are not interested in the dimensionality or geometry of the features, then you should
create an ArcMap point dataset. Points represent discrete locations such as wells, shops, and telephone booths.
During the next exercise, you will create a dataset representing soil points sampled in the southern part of
Wageningen. First, you will have to digitize the locations of the points and subsequently fill the attribute table with
attribute values.
INSTRUCTIONS:
If you discover that a point is entered at the wrong location, you can move that point. You can always modify the
point features.
1.
In this exercise you are going to create a new point dataset by digitizing the soil point observations as presented in
Figure 3. Activate data frame ‘Wag_south’ and display the datasets ‘Soil_types’ and ‘Roads’.
a. Create a new point dataset. Choose and add your workspace directory ‘Workspace’ as Name (do not
double-click!), so that D:\IGI\workspace becomes your Feature Class Location. Give the new dataset the name
‘Soil_pts’ in the Feature Class name text box.
b. Digitize the locations of the soil point observations as drawn in Figure 2 in order of profile code: start with
point X1 and finish with X9!
c. Open the attribute table of the new point dataset. How many records does the attribute table of dataset ‘Soil_pts’
contain?
d. Save your edits.
In the previous exercise you digitized the locations of the soil points. No tabular data (attribute data) describing
certain characteristics of these points have been added yet. When you create a new vector dataset in ArcMap, an
attribute table is automatically created for this dataset. For each digitized point a record is automatically added to
the attribute table.
Initially the attribute table will contain three fields, called FID, ID and Shape. The FID field contains unique
identifiers of the features in the vector dataset. The unique identifier links the thematic (attribute) data to the
geometry of a geographic feature. These unique identifiers cannot be changed. In the ID field a user-defined
identifier can be stored. The Shape field stores the feature type of the geographic feature (Point, Line or Polygon).
This field is maintained by ArcMap and cannot be edited.
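The link between geometry and attribute table can be pictured with a small Python sketch. It is a toy model, not how ArcMap stores data internally: the class `PointDataset` is hypothetical, the coordinates are made up, and the zero-based FID numbering is an assumption (it matches the shapefile format).

```python
class PointDataset:
    """Toy stand-in for a point dataset and its attribute table."""

    def __init__(self):
        self.geometry = {}  # FID -> (x, y) map coordinates
        self.table = []     # attribute records, one per feature

    def digitize(self, x, y):
        """Store a clicked location and add the matching attribute record."""
        fid = len(self.table)  # assumption: FIDs are numbered from 0
        self.geometry[fid] = (x, y)
        self.table.append({"FID": fid, "Shape": "Point", "ID": 0})
        return fid


soil_pts = PointDataset()
# Made-up coordinates, for illustration only.
for x, y in [(174250.0, 441100.0), (174400.0, 441350.0), (174820.0, 440900.0)]:
    soil_pts.digitize(x, y)

for record in soil_pts.table:
    print(record, soil_pts.geometry[record["FID"]])
```

Each digitized point produces exactly one record, and the FID is the only thing that ties a record to its location.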
You can add new fields to this table at any time to store additional attribute data for the features. When you create a
new field in the attribute table you must select a data type for that field. The data type determines the kind of data
(e.g. text, number) that can be stored in a field. ArcMap supports several data types, but the most important ones are:
float, integer, text, and date.
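What a data type does can be sketched in Python: the same typed-in characters are accepted or rejected depending on the type of the field. The helper `parse_value` is hypothetical, and the date format (YYYY-MM-DD) is an assumption of this sketch, not an ArcMap rule.

```python
from datetime import date

# One converter per data type named above.
CONVERTERS = {
    "integer": int,
    "float": float,
    "text": str,
    "date": date.fromisoformat,  # assumption: dates are typed as YYYY-MM-DD
}


def parse_value(raw, data_type):
    """Convert a typed-in string to the data type of a field.

    Raises ValueError when the string does not fit the type.
    """
    return CONVERTERS[data_type](raw)


print(parse_value("6.5", "float"))
print(parse_value("6.5", "text"))
try:
    parse_value("6.5", "integer")
except ValueError as error:
    print("rejected:", error)
```

The value 6.5 fits a float or text field but not an integer field, which is why the data type has to be chosen before any values are entered.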
Important:
NEVER use blank spaces and symbols in attribute names, only letters, numbers and underscores.
Attribute names can be at most 10 characters long.
Adding and deleting fields should be done using ArcToolbox as described below.
When adding or deleting fields or rows in a table, make sure that you are NOT in edit mode,
otherwise it will not work.
INSTRUCTIONS:
An alternative way of adding and deleting fields is to use the Table Options button at the top of an attribute table to
add a field, and to right-click the heading of a field (the field name) to delete that field.
With the instructions listed before you can add new attributes. However, when you have a look at the table
afterwards, you will see that the fields are empty. You should now start to add the attribute values.
INSTRUCTIONS:
2.
a. Write down the meaning of the data types integer, float, text and date. Use the ArcGIS Desktop Help System to
find the definitions. Hint: use the keyword ‘add field’ in the Search field of the help.
b. Add six attributes to the table of the dataset ‘Soil_pts’: ‘Prof_code’, ‘pH’, ‘Clay’, ‘Silt’, ‘Sand’ and ‘Soilcode’.
Use the field definitions presented in Table 1.
c. When you have defined the new fields of attribute table ‘Soil_pts’ enter the attribute values according to Figure
4. Do not forget to save your edits. Then save the ArcMap document.
Important: When you have saved a field definition (i.e. completed the Add field action), it is not possible anymore
to change this field definition! If you make a mistake in the field definition, you have to delete the field and add a
new field to the attribute table.
Figure 3. The attribute table of vector dataset ‘Soil_ points’.
Polygons represent discrete areas such as houses, provinces or land uses. A polygon is two-dimensional. As a
consequence it has, in addition to location, the properties area and perimeter. In ArcMap you can calculate the area
of the polygon features of a dataset. There are three ways to do this with ArcMap: with ArcToolbox, with the Field
Calculator and with the Calculate Geometry option. This section contains instructions to make you acquainted with
the last method. First of all you have to add an extra field to the dataset. After adding an extra field to the dataset it’s
possible to calculate the areas of all polygon features.
1. Activate the data frame that contains the dataset you want to calculate the area for.
2. Add a field to the attribute table of the dataset you want to calculate the area for. Name the field for
example ‘Area’ (data type = double, precision = 10, scale = 2).
3. Right-click the field heading for this new field and click Calculate Geometry.
4. A Calculate Geometry window appears (Figure 4). Choose the geometry property you want to calculate (in
the case of a polygon: area, perimeter, x-coordinate of centroid or y-coordinate of centroid), the coordinate
system (more about this in Exercise 3) you want to use, and the units in which the geometry should be
expressed. Click OK.
Figure 4. The Calculate Geometry window.
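To get a feeling for what such a calculation involves, the sketch below computes area, perimeter and centroid of a polygon from its vertex coordinates with the standard shoelace formula. It assumes planar (projected) coordinates and a simple polygon without holes; it is an illustration, not a description of how ArcMap performs the calculation. The function name `polygon_geometry` and the example parcel are made up.

```python
import math


def polygon_geometry(vertices):
    """Return (area, perimeter, centroid) of a simple polygon.

    `vertices` is a list of (x, y) tuples in order around the boundary,
    without repeating the first vertex at the end. The polygon must
    enclose a non-zero area.
    """
    n = len(vertices)
    twice_area = 0.0
    cx = cy = 0.0
    perimeter = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # next vertex, wrapping around
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
        perimeter += math.hypot(x1 - x0, y1 - y0)
    signed_area = twice_area / 2.0
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    return abs(signed_area), perimeter, centroid


# A made-up rectangular parcel of 40 m by 30 m.
parcel = [(0.0, 0.0), (40.0, 0.0), (40.0, 30.0), (0.0, 30.0)]
area, perimeter, centroid = polygon_geometry(parcel)
print(area, perimeter, centroid)  # 1200.0 140.0 (20.0, 15.0)
```

The results are expressed in the units of the coordinates (here metres and square metres), which is why the Calculate Geometry window asks for a coordinate system and units.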
3.
Until now you have seen how to add new fields and attribute values to a dataset’s attribute table. Another way to get
your data into ArcMap is to create a new, empty table that you can fill yourself. If your tabular data is not stored on
the hard drive of your computer yet, creating a new table is a good way to get it into ArcMap. Creating a new table
is more flexible than simply adding attribute information to the attribute table of an existing dataset as you did
during the previous exercises. By putting your data in a separate table, you can work with it independent of any
particular dataset, and you can join it to any appropriate dataset whenever you want it (see next section).
INSTRUCTIONS:
To make the table visible, select the List By Source option. When the table is opened, you can see that it
contains three fields: ‘Rowid’, ‘OBJECTID’ and ‘FIELD1’. Adding and deleting fields and attribute values to this
table works the same as adding fields to the other datasets. Note that you cannot remove or edit the Rowid value
and that a table must always contain at least two fields.
4.
a. Create a new table called ‘Landscape.dbf’. Save it in your workspace directory. Add two fields to the new table:
‘LU_CODE’ and ‘LU_DESCRIP’. The field definitions for these fields are presented in the table below.
b. Enter the attribute values according to Figure 5 and save the edits.
Joining tables
You can add tabular data to an existing dataset by joining it to the dataset’s attribute table. When you join a table to
an attribute table, all fields (attributes) from the join table are appended to the attribute table of the dataset. You can
use any of these joined fields to symbolize, label, query, or analyze the dataset’s features.
A join is based on the values of an attribute that can be found in both tables. The name of the attribute does not
have to be the same in both tables, but the data type has to be the same and (at least some of) the attribute values
have to correspond (Figure 6).
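The idea of a join can be sketched in Python with two small tables stored as lists of records. The function `join_tables` and all values below are invented for this illustration (they are not the values of Figure 5 or Figure 6). Two simplifications are assumed: features without a matching record are kept with empty joined fields, and the key field of the join table is not copied a second time.

```python
def join_tables(attribute_table, join_table, input_field, join_field):
    """Append the fields of `join_table` to matching rows of `attribute_table`."""
    lookup = {row[join_field]: row for row in join_table}
    joined_names = [name for name in join_table[0] if name != join_field]
    result = []
    for row in attribute_table:
        match = lookup.get(row[input_field])
        new_row = dict(row)  # copy, so the original table is left unchanged
        for name in joined_names:
            new_row[name] = match[name] if match else None
        result.append(new_row)
    return result


# Made-up example records.
features = [
    {"FID": 0, "LU_CODE": 1},
    {"FID": 1, "LU_CODE": 2},
    {"FID": 2, "LU_CODE": 9},  # no matching record in the join table
]
landscape = [
    {"LU_CODE": 1, "LU_DESCRIP": "Forest"},
    {"LU_CODE": 2, "LU_DESCRIP": "Grassland"},
]

for row in join_tables(features, landscape, "LU_CODE", "LU_CODE"):
    print(row)
```

Note that the integer 1 and the text "1" would not match in this sketch either, which illustrates why the data type of the two join fields has to be the same.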
INSTRUCTIONS:
1. To make a join, go to ArcToolbox and click Data Management Tools > Joins > Add Join.
2. Select the dataset (Layer name) to which the table will be joined and select the field of this dataset on
which the join is based (Input join field).
3. Select the table which is going to be joined in the Join table field and select the Output join field of this
table on which the join is based. Click OK.
4. When you open the attribute table of the dataset to which a table is joined, you can see that all fields of
the join table are appended to the dataset’s attribute table. The fields appear at the right-hand side of the
table.
5. To remove the join, double-click on the dataset name, select the tab Joins & Relates, select the join and
click Remove, or Remove all in the Joins field.
5.
a. Join the fields of the table ‘Landscape’ to the attribute table of dataset ‘Soil_types’ by common field
‘LU_CODE’.
Depending on the information it represents, a raster dataset may be created out of either integer values (whole
numbers) or floating point values (numbers with decimals). In ArcGIS, a raster dataset created out of integer
(discrete) values can have an associated raster cell value attribute table. The unique attribute value combinations are
saved in this table. Raster datasets created out of floating point (continuous) values will not have associated tables.
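As a rough sketch of such a table, the Python lines below list every unique cell value of a small integer raster together with the number of cells that hold it. The grid is made up, and the value/count layout is an assumption of this illustration rather than a full description of the ArcGIS table.

```python
from collections import Counter


def value_attribute_table(raster):
    """Return one (value, count) row per unique cell value, sorted by value."""
    counts = Counter(cell for row in raster for cell in row)
    return sorted(counts.items())


# Made-up integer raster of 3 rows by 4 columns.
grid = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [1, 3, 3, 2],
]

for value, count in value_attribute_table(grid):
    print(value, count)
```

The table has one row per unique value, not one row per cell. For a floating point raster almost every cell could hold a different value, so a table like this would be nearly as long as the raster itself.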
Discrete rasters represent geographic features that have definable boundaries, sometimes referred to as categorical
or discontinuous data. Examples of discrete terrain objects are: lakes, forests, buildings, roads etc.
Continuous rasters represent geographic phenomena that vary spatially without discrete steps. Each cell value is a
measure of the concentration or level of that location. Continuous geographic phenomena, in general, do not have
distinct boundaries like discrete geographic features. A geographic feature, such as a lake, has a real and definable
boundary. However a geographic phenomenon, like lake depth, continuously changes. Potentially, each cell in a
continuous raster can have a different value. Examples of geographic phenomena include contamination levels, heat
from a fire or elevation.
Important: Rasters are always rectangular. Every cell location in a raster has a value assigned to it.
When information is insufficient or unavailable for a cell location, the location will be assigned the value of
NoData. NoData and 0 are not the same: 0 is a valid value that can be used in geoprocessing whereas NoData is
excluded from geoprocessing.
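The difference between NoData and 0 shows up as soon as you calculate something. In the minimal Python sketch below, `None` stands in for NoData (an assumption of this sketch; real raster formats use their own NoData marker) and the grid values are made up.

```python
NODATA = None  # assumption: None stands in for the NoData marker


def cell_mean(raster):
    """Mean of all cells that hold a value; NoData cells are skipped."""
    values = [cell for row in raster for cell in row if cell is not NODATA]
    return sum(values) / len(values) if values else NODATA


grid_with_nodata = [[0, 4], [NODATA, 8]]
grid_with_zero = [[0, 4], [0, 8]]

print(cell_mean(grid_with_nodata))  # 4.0: three cells take part
print(cell_mean(grid_with_zero))    # 3.0: four cells take part
```

The cell with value 0 counts as a measurement, whereas the NoData cell is left out, so the two rasters give different means.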
INSTRUCTIONS:
1. Display the raster dataset in the view window. The colors are assigned to the raster cells based on the cell
value. Each value is symbolized with one color.
2. Click on the Identify tool to identify a cell value.
3. Click on a cell in the View Window.
6.
Activate the data frame ‘Vector vs. Raster’ and display the raster dataset ‘LU_raster’.
a. Use the Identify tool to explore the raster. Explain the meaning of the attributes of the dataset ‘LU_raster’ as
they are displayed in the Identify window.
c. Open the attribute table of ‘LU_raster’ (right click on dataset). How many different values (i.e. land use classes)
does the dataset of ‘LU_raster’ contain?
Cells in a discrete raster that share the same value represent the same type of geographic object. A cluster of
contiguous raster cells with the same value is called a region. A region represents a discrete geographic object, e.g.
a building or a lake. All regions with the same value make up a zone (Figure 8). Zones represent all geographic
objects with the same value, e.g. all buildings or lakes. Thus, the zones in a thematic raster dataset “Land_use” are
land use types.
Note that a region is the raster equivalent of a vector point, line or polygon feature: a discrete object that represents
one geographic feature.
Figure 8. Raster cells belong to zones and regions. This raster contains five zones. The zone with value 4 is made up
of three regions.
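The distinction between zones and regions can be made concrete by counting the regions of each zone in a small raster. The Python sketch below does this with a simple flood fill. The grid is made up (it is not the raster of Figure 8), the function name `count_regions` is hypothetical, and cells are considered contiguous only when they share an edge, which is an assumption of this sketch.

```python
from collections import defaultdict


def count_regions(raster):
    """Return {zone value: number of regions} for an integer raster.

    Cells belong to the same region if they have the same value and share
    an edge (4-neighbour rule; an 8-neighbour rule would also link cells
    that only touch diagonally).
    """
    n_rows, n_cols = len(raster), len(raster[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    regions = defaultdict(int)
    for r in range(n_rows):
        for c in range(n_cols):
            if seen[r][c]:
                continue
            value = raster[r][c]
            regions[value] += 1  # first unseen cell of a new region
            seen[r][c] = True
            stack = [(r, c)]
            while stack:  # visit all cells connected to this one
                i, j = stack.pop()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < n_rows and 0 <= nj < n_cols
                            and not seen[ni][nj]
                            and raster[ni][nj] == value):
                        seen[ni][nj] = True
                        stack.append((ni, nj))
    return dict(regions)


# Made-up land use raster with four zones.
land_use = [
    [1, 1, 2, 2],
    [1, 3, 3, 2],
    [4, 3, 1, 1],
    [4, 4, 1, 2],
]
print(count_regions(land_use))  # {1: 2, 2: 2, 3: 1, 4: 1}
```

In this made-up raster the zones with value 1 and 2 each consist of two separate regions, while the zones with value 3 and 4 each consist of a single region.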
Of the two GIS data structures discussed in this exercise (raster and vector), the raster data structure provides the
most comprehensive modeling environment and operators for spatial analysis. ArcToolbox contains a
comprehensive toolset to perform cell-based (raster) operations. These tools can be found in the Spatial Analyst
Tools toolbox and will be discussed in Exercise 7.
7.
a. Do you select a zone or region when you select one record in the attribute table of ‘LU_raster’? Explain your
answer!
8.
a. Compare the attribute tables of the datasets ‘LU_raster’ and ‘Land_use’ (Figure 9) and write down the main
differences in data storage between the two tables.