SQL and Xquery Tutorial For Ibm Db2, Part 1
SQL and Xquery Tutorial For Ibm Db2, Part 1
Jessica Cao
Training Tools Developer
IBM
03 Aug 2006
This tutorial introduces readers to the Aroma database to apply Structured Query
Language (SQL) and XML Query (XQuery) knowledge to solve typical business
questions. Also included are detailed instructions for installing DB2® and the Aroma
database. This tutorial is Part 1 of the SQL & XQuery tutorial for IBM DB2 series.
All the examples in this document are based on Aroma, a sample database that
contains sales data for coffee and tea products sold in stores across the United
States. Each example consists of three parts:
For students using this as part of an academic class, obtain from your instructor the
instructions to connect to the Aroma database and learn about any differences
between the guide and your local set up.
This tutorial was written for DB2 Express-C 9 for UNIX®, Linux® and Windows®
(formerly known as Viper).
This tutorial also introduces the basics of SQL and suggests some typical business
questions that IBM DB2 9 can be used to answer.
System requirements
To use this tutorial to the fullest, you should have IBM DB2 9 installed. Download
DB2 Express-C, a no-charge version of DB2 Express Edition for the community that
offers the same core data features as DB2 Express Edtion and provides a solid base
to build and deploy applications. You will also need to install the Aroma database,
which can be downloaded from here.
3. Find setup.exe in the unzipped folder and double-click on it. The DB2
Setup Launchpad will appear.
4. Within the DB2 Setup Launchpad, click on Install a Product from the
panel on the left.
6. In the installer, click Next to view the License Agreement. You will need to
Accept the agreement to continue and click Next.
8. Confirm that you plan to "Install DB2 Universal Database Express Edition
on this computer and save my settings in a response file" and click Next.
11. In the "User Information" screen, set the Domain to None -- use local
user account using the pull-down menu, use db2admin as the User
name, and set a password.
(Optionally, you can select the LocalSystem account, but there are
limitations introduced if this option is used. Click the Help button to learn
more.)
12. Make sure the checkbox Use the same user name and password for
the remaining DB2 services is checked and click Next.
13. Click Next to skip the "Prepare the DB2 tools catalog" screen.
15. Verify that the Enable operating system security checkbox is checked.
Leave the DB2 administrators group as the default value DB2ADMNS and
leave DB2 users group as the default value DB2USERS. Click Next.
16. Click Finish to start the DB2 product installation when you reach the
"Start copying files and create response file" screen.
17. When the installation is complete, a window with the message "Setup is
complete" will appear. Click Finish to complete the installation procedure.
18. A window entitled "DB2 First Steps" will now appear. Click Create profile.
Your Web browser will open a page called DB2 First Steps. This tutorial
contains many useful links to information on DB2, but it will not be used
during this tutorial. Close your Web browser and continue with this
tutorial.
2. Start the Command Window from the Start menu ==> all Programs ==>
IBM DB2 ==> Command Line Tools ==> Command Window.
4. Two files, create.txt and report.txt, will be created. Edit report.txt and
compare to the expected results below:
TABLE RECORDS
--------------------- -----------
aroma.class 9
aroma.deal 9
aroma.line_items 182
aroma.market 17
aroma.orders 27
aroma.period 821
aroma.product 59
aroma.promotion 194
aroma.sales 69941
aroma.store 18
aroma.supplier 9
XML support
XML and basic XQueries will be introduced in Part 7 of this series. To support XML,
the aromadb database was created with the statement:
Note the specification of UTF-8 as the codeset for the database. The use of XML
features is restricted to a database that is defined with codeset UTF-8 and has only
one database partition. When creating your own databases in the future, don't forget
to include this parameter or your database won't support XML features!
Another set of tables contain data to understand the purchases of Aroma Coffee and
Tea Company. This set of tables is referred to as the purchasing group of tables and
can be used to understand the purchases from suppliers. This group of tables is not
used in any examples in this tutorial. It only exists in the Appendix to provide a more
complete picture of the Aroma Coffee and Tea Company.
The crow's feet in this diagram indicate a one-to-many relationship between two
tables. For example, each distinct value in the Perkey column of the Period table
can occur only once in that table but many times in the Sales table. Column names
in bold are primary-key columns. Column names in italics are foreign-key columns.
Column names in bold italics are primary and foreign-key columns.
Sample data from each table is presented in the following sections so you can see
how these primary-key to foreign-key relationships work.
A primary key is a unique key that is a part of the definition of the table. A table
cannot have more than one primary key, and the columns of a primary key cannot
contain null values. (Null values will be discussed later in detail. If a value is null, that
means there is a place to store the data, but nothing is there - this is different from a
blank.) Primary keys are optional and can be defined in CREATE TABLE or ALTER
TABLE statements. Primary keys are important because there can only be one
occurrence of each value; the same row will always be selected. There may be a
couple of cities with the same name; however, selecting the record based on the
primary key will ensure the same row is always selected.
A foreign key is specified in the definition of a referential constraint. A table can have
zero or more foreign keys. The value of the composite foreign key is null if any
component of the value is null. Foreign keys are optional and can be defined in
CREATE TABLE statements or ALTER TABLE statements.
The following table displays the first few rows of the Period table. The primary-key
column is the Perkey column:
The following table displays the first few rows of the Product table. The primary key
is a combination of the Classkey and Prodkey values:
Note: recall that a primary key is a unique key that is a part of the definition of the
table. Normally, a primary key is a single column in a table. In this case, the primary
key is a combination of two columns, which means that there can only be one
combination like this in a table.
The following table displays the first few rows of the Class table.
The following table displays the first few rows of the Store table (some columns
were truncated to fit on the page). The primary-key column is Storekey; Mktkey is a
The following table displays the first few rows of the Market table.
Promotion table
The following table displays the first few rows of the Promotion table. The
primary-key column is Promokey.
The Value column identifies the cash value of a given promotion. The user can
evaluate the cost of the promotions by summing these values.
Sales table
The following table displays the first 20 rows of the Sales table.
The last column, Comments, stores information about customer feedback using the
XML data type. XML is introduced in Part 7 of this series.
The Sales table contains a multipart primary key. Each of the key's five columns is a
foreign-key reference to the primary key of another table:
This primary key links the Sales data to the Period, Product, Store, and
Promotion tables.
Gold table
The Gold table contains information for "Gold Card" customers that allow them to
enjoy various discounts, free gifts, and e-mail reminders of promotional activities.
The Gold table will be created by the participant in Part 6 of this series, and will have
the following columns: unique card ID, store key, first name, last name, e-mail, and
status.
• SQL was devised specifically as part of the relational theory. You need a
language to talk to your database, to retrieve data from it, or to make
changes to the data.
• SQL is a non-navigational language. In other words, with SQL, you don't
have to tell DB2 to find a record, read its pointer, and follow that pointer to
a related row in the table. Simply tell DB2 what to do and DB2 figures out
how to handle the command. It's that simple!
• You can use SQL to modify aspects of a table without having to take the
table offline, add new columns to an existing table, or add business rules
to the definition of the table.
Easy
• What were the weekly sales of Lotta Latte brand coffee in San Jose
during last year?
• What were the average monthly sales of all coffee products in the West
during each month of last year?
Moderately difficult
• How do the sales of Lotta Latte in San Jose compare with its sales in Los
Angeles and New York?
• How has the monthly market share of Lotta Latte changed during the last
two years in all markets?
• Which suppliers charge the most for bulk tea products?
• What was the most successful promotion last December in California?
Very challenging business questions, which DB2 9 can be used to provide
answers.
• What were the running totals for Lotta Latte sales during each month of
last year?
• What were the ratios of monthly sales to total sales (expressed as
percentages) for Lotta Latte during the same period?
• Which ten cities had the worst coffee sales in 2004 with regard to dollar
sales and quantities sold?
• Which Aroma stores fall into the top 25 percent in terms of sales revenue
for the first quarter of 2005? Which stores fall into the middle 50 percent,
and which fall into the bottom 25 percent?
Numbers
All numbers have a sign and a precision. The precision is the number of bits or
digits, excluding the sign.
Character string
A character string is a sequence of bytes. The length of the string is the number of
bytes in the sequence. If the length is zero, the value is called the empty string.
Graphic string
A graphic string is a sequence of double-byte character data.
Binary string
A binary string is a sequence of bytes. It is used to hold nontraditional data, such as
pictures. Binary Large OBject (BLOB) is a binary string. See Large Objects (LOBs)
for more information.
Datetime values
Datetime values are representations of dates, times, and timestamps (a character
string of 14 digits that represents a valid date and time in the form
yyyyxxddhhmmss). Datetime values can be used in certain arithmetic and string
operations and are compatible with certain strings; however, they are neither strings
nor numbers.
Null value
The null value is a special value that is distinct from all non-null values. It means the
absence of any other value for that column in the row. The null value exists for all
data types.
XML
DB2 9 is the industry's first hybrid data server for managing data from both relational
and pureXML™ structures. DB2 9 has introduced an optimized data storage engine
for XML data alongside the existing relational engine.
XML is self-describing in that the meaning of the data is included in the record. The
Comments column uses the XML data type to store customer comments. See Part 7
of this series for more details.
You can run these interactively or you can save your statements in a script. To
execute these statements, you would click the Play icon in the upper-left corner of
the graphic (circled in Figure 2, below).
Statements can alternatively be run from the Command Line Processor (CLP). Open
the CLP from the Start menu ==> all Programs ==> IBM DB2 ==> Command Line
Tools ==> Command Line Processor.
Connecting to a database
You need to connect to a database before you can use SQL statements to query or
manipulate data. The CONNECT statement associates a database connection with
a user name.
To connect to the aromadb database, type the following command in the DB2
command line processor:
If you are using this guide as part of a classroom, ask your instructor for the userid
and password to connect to the database.
If no userid and password are required, or if you are using this in a self-study mode
and using your own computer, simply type the following command:
CONNECT TO aromadb
The following message tells you that you have made a successful connection:
Once you are connected, you can start using the database.
Contents of the DB2 Information Center are updated with each product or fixpak
release. Be certain you know which version of DB2 you are running to ensure that
you obtain the correct information. The DB2 Information Center becomes more
helpful the more that you take advantage of it. As you become familiar with the
options available, you can narrow your search more effectively and get the answers
to your questions more efficiently.
• You can easily access the DB2 Information Center from the Start menu
==> all Programs ==> IBM DB2 ==> Information ==> Information
Center.
• Alternatively, you could use this url:
http://publib.boulder.ibm.com/infocenter/db2luw/v9/index.jsp
Example
In Part 3 of this series you will encounter an error when you attempt to execute a
query that incorporated mismatched datatypes. You will receive the following error
message:
To better understand the meaning of this message, you may want to research the
details of the error and learn the corrective action you should take.
2. Enter the error code from the error message (SQL0401N) in the Search
box at the top left corner.
3. Click GO.
4. The result set displays two results. The first result is "100% SQL0401",
indicating that it is 100% relevant, while the second result is only 8%
relevant. Select the first one.
Section 8. Metadata
Metadata can be defined as data about the structure of the database itself. This data
includes information such as lists of the tables, indexes, columns, and more included
in the database.
You will often hear metadata called by other names, including catalog data or
system tables. Metadata is automatically collected by DB2 9 whenever a user takes
an action to create, delete, or change some aspect of the database.
The standards of storing data established for conventional SQL do not apply to the
storage of metadata. Each database provider sets up unique methods to store and
process the metadata. Knowing how to retrieve this information can be of
tremendous assistance in working with a database.
System tables
DB2 9 stores metadata in nearly 100 system tables. They are all contained in the
SYSIBM schema. This schema is created automatically each time a database is
created. System tables also follow a standard naming convention. Each table name
begins with SYS, as in SYSTABLES, SYSINDEXES, and SYSUSERAUTH. (For this
reason, user-defined tables cannot begin with the letters SYS.)
A user retrieves metadata from system tables in much the same way as data from
regular tables. SQL statements behave the same in both cases. Knowledge of the
structure of the tables is all that is needed. Given the appropriate authorizations, a
user can obtain a list of all of the system tables holding metadata by querying the
database.
Example query
SELECT name
FROM sysibm.systables
WHERE name LIKE 'SYS%';
Result
NAME
SYSATTRIBUTES
SYSBUFFERPOOLNODES
SYSBUFFERPOOLS
...
SYSTABLES
...
SYSXSROBJECTS
You may or may not be able to execute the above query, depending on permissions
you have been granted by the System Manager and/or Database Administrator.
If you execute the above query without the WHERE clause, you will obtain a list of
all tables in the database. The result set will be ordered by Creator (schema) and
alphabetically by table name within each Creator. The column name "Creator"
assumes that the schema identified for each table has the same name as its creator.
Example query
Result
NAME CREATOR
CLASS AROMA
... AROMA
SUPPLIER AROMA
... ...
SYSATTRIBUTES SYSIBM
... SYSIBM
SYSXMLSTATS SYSIBM
... ...
By constraining on the name of the schema, you can limit the result set to those
tables that are part of a given schema.
Example query
Result
NAME CREATOR
CLASS AROMA
... AROMA
SUPPLIER AROMA
Similarly, you can list information about the columns in a specific table by accessing
the sysibm.syscolumns system table.
Example query
SELECT * FROM sysibm.syscolumns WHERE tbname='CLASS' AND
tbcreator='AROMA';
Result
NAME TBNAME TBCREATOR REMARKS COLTYPE ...
CLASS_DESC CLASS AROMA CHAR ...
CLASS_TYPE CLASS AROMA CHAR ...
CLASSKEY CLASS AROMA INTEGER ...
Examine the contents of this table by executing the command. There are many more
columns in the result set than shown here.
Section 9. Summary
This tutorial introduced the retail group of tables and suggested some typical
business question that IBM DB2 9 can be used to answer. These tables were
created to show an environment where:
The most commonly used data types were described in this tutorial:
Numbers
Character String
Graphic String
Binary String
Datetime Values
Null Value
XML
Downloads
Description Name Size Download
method
Aroma Database Aroma_Data.zip 1MB HTTP
Resources
Learn
• Read the "DB2 XML evaluation guide" (developerWorks, June 2006), a
step-by-step tutorial introducing the reader to the DB2 Viper data server on
Windows platforms using the XML storage and searching (SQL/XML, XQuery)
capabilities available to support next-generation applications.
• Check out this article and "Get off to a fast start with DB2 Viper"
(developerWorks, March 2006).
• View this article series' "Appendix A" (developerWorks, August 2006).
• Learn how to "Query DB2 XML data with XQuery" (developerWorks, April
2006).
• Learn how to "Query DB2 XML data with SQL" (developerWorks, March 2006).
• Read the IBM Systems Journal and celebrate 10 years of XML.
• Refer to the SQL Reference, Vol 1 for additionanl information.
• Refer to the SQL Reference, Vol 2 for additionanl information.
• Refer to the DB2 information Center for troubleshooting.
Get products and technologies
• Now you can use DB2 for free. Download DB2 Express-C, a no-charge version
of DB2 Express Edition for the community that offers the same core data
features as DB2 Express Edtion and provides a solid base to build and deploy
applications.
Discuss
• Participate in the discussion forum for this content.
• Visit the DB2 9 On-line Support Forum.
Bruce Creighton
Bruce Creighton is a Skills Segment Planner in the Information
Management Education Planning and Development department. In this
role, he plans investment in educational content and balances the
investment between areas where IBM can attain revenue and those
where the requirement for skills development are important enough to
provide free education.
Jessica Cao
Jessica Cao is an Arts and Science and Computer Science student at
McMaster University. She expects to complete her combined honours
degree in April 2009. Jessica is working in IBM Toronto lab's DB2
Information Management Skills Channel Planning and Enablement
Program to take advantage of her interest in programming, editing, and
writing.
Trademarks
IBM, DB2, Universal Database, OS/2, and pureXML are registered trademarks of
IBM Corporation in the United States and other countries.
Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft
Corporation in the United States, other countries, or both.
UNIX is a registered trademark of The Open Group in the United States and other
countries.
Linux is a registered trademark of Linus Torvalds in the United States, other
countries, or both.