PostgreSQL 8 For Windows (Blum 2007-04-12) (9F4B1AA7)
PostgreSQL 8 For Windows (Blum 2007-04-12) (9F4B1AA7)
PostgreSQL 8 For Windows (Blum 2007-04-12) (9F4B1AA7)
for Windows
RICHARD BLUM
0-07-150949-6
The material in this eBook also appears in the print version of this title: 0-07-148562-7.
All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked
name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the
trademark. Where such designations appear in this book, they have been printed with initial caps.
McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate
training programs. For more information, please contact George Hoare, Special Sales, at [email protected] or (212)
904-4069.
TERMS OF USE
This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGraw-Hill”) and its licensors reserve all rights in and to the work.
Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one
copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, trans-
mit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the
work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be
terminated if you fail to comply with these terms.
THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS
TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK,
INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE,
AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not war-
rant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error
free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause,
in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed
through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive,
consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the pos-
sibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in
contract, tort or otherwise.
DOI: 10.1036/0071485627
Professional
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
For more information about this title, click here
CONTENTS
Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii
Part I
Installation and Administration
X 1 What Is PostgreSQL? ...................................... 3
The Open Source Movement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
The History of PostgreSQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
Comparing PostgreSQL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
PostgreSQL Versus Microsoft Access . . . . . . . . . . . . . . . . . . . . . . 8
PostgreSQL Versus Commercial DBMS Products . . . . . . . . . . . . 12
PostgreSQL Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Transaction Support ..................... . . . . . . . . . . . . . 13
ACID Compliant . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
Nested Transactions ..................... . . . . . . . . . . . . . 18
Sub-selects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Views . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Triggers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Support for Binary Large Objects (BLOBs) . . . . . . . . . . . . . . . . . 20
v
vi PostgreSQL 8 for Windows
Part II
Using PostgreSQL in Windows
X 5 The psql Program ......................................... 121
The psql Command-Line Format . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Connection Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
Feature Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
Using the Command-Line Options . . . . . . . . . . . . . . . . . . . . . . . 126
The psql Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
psql General Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . . . 128
Query Buffer Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . . 131
Input/Output Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . 132
Informational Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . . 133
Formatting Meta-commands . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
Copy and Large Object Meta-commands . . . . . . . . . . . . . . . . . . 136
The psqlrc.conf File . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . . . . . 138
Importing Data with psql . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . . . . . 139
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . . . . . 140
Part III
Windows Programming with PostgreSQL
X 12 Microsoft Access and PostgreSQL .............................. 265
Interfacing PostgreSQL with Access . . . . . . . . . . . ... . . . . . . . . . . . . . 266
Installing the ODBC Driver . . . . . . . . . . . . . ... . . . . . . . . . . . . . 266
Configuring a PostgreSQL ODBC Connection . . . . . . . . . . . . . . 269
Creating an Access Application Using PostgreSQL .. . . . . . . . . . . . . . 275
Data Type Considerations . . . . . . . . . . . . . . ... . . . . . . . . . . . . . 275
Designing an Application Database . . . . . . ... . . . . . . . . . . . . . 276
Setting Up the ODBC Session . . . . . . . . . . . ... . . . . . . . . . . . . . 277
Creating the Access Application . . . . . . . . . ... . . . . . . . . . . . . . 278
Using PostgreSQL Views in Access . . . . . . . ... . . . . . . . . . . . . . 282
Sharing the Application . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . 284
Exporting an Access Application to PostgreSQL . ... . . . . . . . . . . . . . 284
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ... . . . . . . . . . . . . . 286
x PostgreSQL 8 for Windows
X 15 Java . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 343
The Java Development Environment . . . . . . . . . . . . . . . . . . . . . . . . . . 344
Downloading the Java SDK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 345
Installing the Java SDK . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 346
Building a Java Program Using NetBeans . . . . . . . . . . . . . . . . . . 348
PostgreSQL JDBC Driver . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 350
Using JDBC in a NetBeans Application . . . . . . . . . . . . . . . . . . . . 352
Using JDBC in a Java Command-Line Application . . . . . . . . . . . 353
Java Database Connectivity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
Starting a Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 354
Sending SQL Commands .................. . . . . . . . . . . . . 356
Using Parameters and Prepared Statements . . . . . . . . . . . . . . . . 362
Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
F
irst, all glory and praise go to God, who through His Son
makes all things possible, and gives us the gift of eternal
life.
Many thanks go to the great team of people at McGraw-Hill for their outstanding
work on this project. Thanks to Lisa McClain, Sponsoring Editor, for offering me
the opportunity to write this book. Also thanks to Alex McDonald, the original
Acquisitions Coordinator for the book, and to Mandy Canales, who took over from
Alex during the production of this book, for keeping things on track and helping
make this book presentable. I am forever indebted to Mike Wessler, the Technical
Editor, for his database expertise and guidance. Thanks Mike for catching my goofs,
and making suggestions for improvements throughout the book. Any leftover
mistakes are completely my fault. I would also like to thank Carole McClendon
at Waterside Productions, Inc. for arranging this opportunity for me, and for help-
ing out in my writing career.
Finally, I would like to thank my parents, Mike and Joyce Blum, for their dedi-
cation and support while raising me, and to my wife Barbara and daughters Katie
Jane, and Jessica for their love, patience, and understanding, especially while I was
writing this book.
xi
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
This page intentionally left blank
INTRODUCTION
D
atabases have become a necessity for almost any application.
The ability to store and quickly retrieve information is a hall-
mark of the personal computer revolution. Everything from
store inventories to bowling league scores is kept in databases, often
on personal computers.
For most Windows users, the word database is synonymous with the Microsoft Access
product. Microsoft Access provides a simple graphical interface for creating data tables,
and the reports necessary to view the data. However, Access has its limitations, espe-
cially in a multi-user environment. This book shows how to overcome these limitations
by using the PostgreSQL Open Source database software.
OVERVIEW
While a mainstay in the Linux world, Open Source software is slowly starting to make
inroads into the Microsoft Windows world. Windows users and developers can now
download and install many Open Source applications compiled specifically for the Win-
dows environment. Starting with version 8.0, the PostgreSQL database server package
includes an easy-to-install Windows version. Now any Windows user and developer can
incorporate PostgreSQL’s commercial-quality database features at no cost.
This book describes the PostgreSQL database server, and how to use it in a Windows
environment. If this is your first time using a large-scale database server, you will be
amazed at how easy it is to create and manage your own database server. You will quickly
xiii
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
xiv PostgreSQL 8 for Windows
see the benefits of moving your databases from an Access database to a PostgreSQL data-
base server. You can even keep your Access applications while utilizing the PostgreSQL
database server to control your data.
If you are a seasoned Windows database administrator, you may be pleasantly sur-
prised at the features and resources available in PostgreSQL. PostgreSQL provides both
commercial-quality database features, such as transactions, triggers, and stored proce-
dures, and programming interfaces for all of the common programming languages used
in the Windows environment. This book shows detailed examples of how to create pro-
grams in several common Windows programming languages that can access a PostgreSQL
database server.
Chapter 9, “Stored Procedures and Triggers,” dives into the complicated world of
creating functions that automatically execute based on database events, such as inserting
or deleting data from a table.
Chapter 10, “Security,” covers the important aspects of protecting your database data
and tracking user access to your data.
Chapter 11, “Performance,” closes out the section by providing some information
and tips on how to monitor and possibly increase the performance of your PostgreSQL
server.
The last section of the book, “Windows Programming with PostgreSQL,” is intended
to show developers how to access and use a PostgreSQL database server in various Win-
dows programming environments.
Chapter 12, “Microsoft Access and PostgreSQL,” provides detailed instructions on
how to access a PostgreSQL database from a Microsoft Access application. Instructions are
also provided on how to covert an existing Access database application to a PostgreSQL
server, and how to use an existing Access application with a PostgreSQL database.
Chapter 13, “Microsoft .NET Framework,” demonstrates how to create .NET applica-
tions using Visual Basic .NET and C# that can access data on a PostgreSQL server. Details
on how to install and use the PostgreSQL Npgsql library are shown.
Chapter 14, “Visual C++,” helps more advanced programmers who are comfortable
with the Microsoft Visual C++ product to interface their programs with a PostgreSQL
server. The PostgreSQL libpq library is presented, showing how to install and use the
library with Visual C++ programs.
Chapter 15, “Java,” walks Java programmers through the steps required to use the
PostgreSQL JDBC driver to access a PostgreSQL server from a Java application. Both the
Java command-line interface and the Java NetBeans graphical development environ-
ment are demonstrated.
1
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
This page intentionally left blank
1
What Is PostgreSQL?
3
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
4 PostgreSQL 8 for Windows
T
here have always been a handful of different commercial database systems avail-
able for Microsoft Windows users and developers to choose from. The choices
vary widely, from simple user database systems such as Microsoft’s Access or
FoxPro to more advanced systems such as Microsoft’s SQL Server, IBM’s DB2, or the
Oracle suite of database software packages. However, now there’s yet another player in
the Microsoft database world.
If you are new to Open Source software, you may not have ever heard of the Post-
greSQL database system. It has been around in the Unix and Linux worlds for quite
some time, gathering quite a following of users and developers. Unfortunately, in earlier
versions of PostgreSQL you had to be pretty knowledgeable and computer-savvy to get
PostgreSQL to work on a Windows platform. This left PostgreSQL as an unknown for
most Windows database users. However, as of PostgreSQL version 8, installing and run-
ning PostgreSQL in Windows is a snap. Now any Windows developer and common user
can create professional databases using the high-quality, free PostgreSQL package.
This chapter introduces PostgreSQL, and explains the myriad of features available
that make it a great choice for both Windows application developers and normal Win-
dows users when creating database applications. You will see that just because a soft-
ware package is free doesn’t mean that it cannot compete with high-quality, expensive
commercial products.
There are many different types of licenses that Open Source software is released
under. The most popular is the GNU General Public License (GPL). The GNU organization
(www.gnu.org) supports Open Source software projects, and has published the GPL as a
guide for how Open Source projects should be licensed to the public. If you have had any
dealings with the popular Linux operating system, no doubt you have heard of the GPL.
The GPL stipulates that any changes made to an Open Source project’s code must be pub-
licly published and available at no cost. While this is great for hobbyists and academics, it
can cause problems for commercial organizations wanting to use Open Source code.
The developers of PostgreSQL have decided to release PostgreSQL under a slightly
different Open Source license. PostgreSQL uses the BSD license, developed at the Univer-
sity of California (UC), Berkeley for public projects. This license is less restrictive than the
GPL. It allows organizations to modify the code for internal use without being bound to
publicly release the changes. This allows corporations (and private users as well) to use
PostgreSQL however they want. This has provided a catalyst for many companies to use
the PostgreSQL database as an internal database engine for many different commercial
applications, as well as using PostgreSQL as the back-end database for some web sites.
Under the BSD license, the developers of PostgreSQL are able to provide PostgreSQL
free of charge at the same time that a few companies provide their versions of Post-
greSQL as a for-profit commercial product. If you want to use PostgreSQL as-is on your
own, you are free to download it and use it for whatever purposes you want. If you want
to use PostgreSQL for a high-visibility production application that requires 24-hour sup-
port, you are able to purchase it from a company that provides such services. This is the
best of both worlds.
Order Table
Customer ID Product ID Quantity Cost
0001 LT0001 10 5,000
being a separate row in the Customer table. Similarly, product data is stored in a sepa-
rate Product table. The Product table contains detailed information about each product,
including a unique product ID, with each product being a separate row of data in the
Product table. This is demonstrated in Figure 1-1.
As shown in Figure 1-1, to track orders, database programmers create a separate
Order table using the unique IDs from the Customer and Product tables. The Order table
relates a customer to the products that are bought. This relationship shows that a single
customer can be related to multiple product orders, but each product order belongs to a
single customer.
Ingres was one of the first database products available to handle these types of data
relationships. With its success, Ingres quickly became a commercial product, and Dr. Stone-
braker started working on another database system. Postgres was started in a similar man-
ner as Ingres, attempting to prove the academic theory of object-relational databases.
Object-relational databases take relational databases one step further. In object-
oriented programming, data can inherit properties from other data, called a parent. The
object-oriented principle of inheritance is applied in object-relational databases. Tables
can inherit fields from base tables (also called parent tables). For example, a database
table of cars can inherit properties (fields) from a parent table of vehicles. This is demon-
strated in Figure 1-2.
Chapter 1: What Is PostgreSQL? 7
Vehicle Table
Vehicle ID Doors Wheels Weight
Inherited Fields
Car Table
Make Model Engine Size Vehicle ID Doors Wheels Weight
Truck Table
Make Model Engine Size Load Capacity Vehicle ID Doors Wheels Weight
Since cars are a type of vehicle, they inherit the properties (or in this case database
fields) of their parent, the Vehicle table. When inserting data into the Car table, you can
also specify values from the Vehicle table. Querying the Car table will return fields from
both the Vehicle and Car tables. However, querying the Vehicle table only returns fields
from that table, not the Car table.
After several years of development work on Postgres, the database package came
upon a major change. A couple of Dr. Stonebraker’s students modified Postgres by add-
ing the Structured Query Language (SQL) interface (early versions of Postgres used their
own data query language). In 1995 this package was re-released as Postgres95. Due to
the rising popularity of SQL, the Postgres95 release helped Postgres migrate into the
mainstream of database products.
It was clear that they had another hit product on their hands. Instead of going com-
mercial, in 1996 the Postgres95 project team broke off from UC Berkeley and started life
as an Open Source project, open to the world to modify. At the same time, to empha-
size its newfound SQL capabilities, Postgres95 was renamed PostgreSQL (pronounced
post-gres-Q-L). Also, to emphasize its past history, the first Open Source version of Post-
greSQL was labeled as version 6.0.
Vast improvements have been made to PostgreSQL since its first release in 1996.
Many modern database features have been added to make each release of PostgreSQL
faster, more robust, and more user-friendly. For Windows users, the biggest PostgreSQL
feature appeared in 2005 with the release of version 8.0.
Prior to version 8.0, PostgreSQL lived its life primarily in the Unix world. Developers
wanting to experiment with PostgreSQL on a Windows platform had to perform some
8 PostgreSQL 8 for Windows
amazing feats of code compilation to get it to even work halfway. This prevented most
ordinary Windows users from being able to utilize PostgreSQL’s advanced features. This
all changed in version 8.0.
Starting with version 8.0, PostgreSQL has incorporated a complete version for Windows,
including an easy installation program. Suddenly, installing PostgreSQL on a Windows
workstation or server is as easy as installing any other Windows software package.
Since its release to the Windows platform, PostgreSQL has been bundled with several
Windows-based GUI administration and utility tools to help Windows developers work
with PostgreSQL. The pgAdmin program provides a fully graphical environment for
database administration. An administrator can create databases, tables, and users simply
with mouse clicks. Similarly, the pSQL program provides a command-line interface (CLI)
for users and administrators to enter SQL commands to databases, and view results.
Also, not to forget Windows developers, the PostgreSQL community has provided
programming interfaces to access PostgreSQL databases from common Windows pro-
gramming languages. Developers have produced an Open Database Connectivity
(ODBC) driver for PostgreSQL, which provides a common interface for all applications
that utilize ODBC database connectivity. Similarly, application program interfaces (APIs)
for the .NET and Java programming environments were developed to allow .NET and
Java programmers direct access to the PostgreSQL server. These features provide a wealth
of possibilities for Windows programmers wanting to work with PostgreSQL.
COMPARING POSTGRESQL
As mentioned earlier, the Windows user has a vast selection of database products to
choose from. You may be asking why you should choose PostgreSQL over any of the
other products. This section helps clarify where PostgreSQL fits into the Windows data-
base product world. Hopefully you will see how PostgreSQL competes against all of the
other Windows database products, and choose to use PostgreSQL in your next Windows
database project.
While there are different types of DBMS packages, they all basically contain the following
parts:
X A database engine
R One or more database files
R An internal data dictionary
W A query language interface
The database engine is the heart and brains of the DBMS. It controls all access to the
data, which is stored in the database files. Any application (including the DBMS itself) that
requires access to data must go through the database engine. This is shown in Figure 1-3.
As shown in Figure 1-3, queries and reports talk to the database engine to retrieve
data from the database files. The database engine is responsible for reading the query,
interpreting the query, checking the database file based on the query, and producing the
results of the query. These actions are all accomplished within the program code of the da-
tabase engine. The interaction between the database engine and database files is crucial.
The internal data dictionary is used by the database engine to define how the data-
base operates, the type of data that can be stored in the database files, and the structure
of the database. It basically defines the rules used for the DBMS. Each DBMS has its own
data dictionary.
If you are a user running a simple database on Access, you probably don’t even real-
ize you are using a database engine. Access keeps much of the DBMS work under the
hood and away from users. When you start Access, the database engine starts, and when
you stop Access, the database engine stops.
Database
Query
Database Engine
Report
Data Dictionary
Database Files
Database Management System
In PostgreSQL, the database engine runs as a service that is always running in the
background. Users run separate application programs that interface with the database
engine while it’s running. Each application can send queries to the database engine, and
process the results returned. When the application stops, the PostgreSQL database en-
gine continues to run in the background, waiting for the next application to access it.
Both Access and PostgreSQL require one or more database files to be present to hold
data. If you work with Access, no doubt you have seen the .mdb database files. These
files contain the data defined in tables created in the Access database. Each database has
its own data file. Copying a database is as easy as copying the .mdb file to another loca-
tion. Things are a little different in PostgreSQL.
In PostgreSQL the database files are tied into the database engine, and are never
handled by users. All of the database work is done behind the database engine, so sepa-
rating data files from the database engine is not recommended. To copy a PostgreSQL
database, you must perform a special action (called an export) to export the database data
to another database.
This shows a major philosophical difference between Access and PostgreSQL. The
difference between the two products becomes even more evident when you want to
share your data between multiple users.
In the Access environment, if two or more people want to share a database, the da-
tabase .mdb file must be located on a shared network drive available to all users. Each
user has a copy of the Access program running on the local workstation, which points to
the common database file. This is shown in Figure 1-4.
.mdb file
Where this model falls apart is how queries or reports are run from the separate
workstations. Since the Access database engine is part of the Access program, each user
is running a separate database engine, pointing to the same data file. This can have di-
sastrous effects, especially on the Local Area Network (LAN).
Each query and report requires the database engine to search through the database
files looking for the appropriate data. When this action occurs on a local workstation, it’s
not too big of a deal. When this action occurs across a LAN, large amounts of data are
continually passed between the database engine and database files through the network.
This can quickly clog even the most robust network configurations, especially when ten
or more users are actively querying a database, and even more so as Access databases
become large (remember, the database engine must check lots of records for the query
result, even if the query matches only one record).
In the PostgreSQL model, the database engine and database files are always on the
same computer. Queries and reports are run from a separate application program, which
may or may not be located on the same computer as the database engine. A multiuser
PostgreSQL environment is demonstrated in Figure 1-5.
Here, the PostgreSQL database engine accepts data requests from multiple users
across the network. All of the database access is still performed on the local computer
running the PostgreSQL database engine. The query and report code transmitted across
the LAN is minimal. Of course, for large data queries the results sent back across the
network can be large, but still not nearly as large as in the Access environment.
PostgreSQL Server
Database Files
Database Engine
If you are using Access in a multiuser environment, it should be easy to see that
Access will not perform as well as PostgreSQL when you get more users. You can scale
PostgreSQL to however many users you need to support. Since PostgreSQL can run on
many different platforms, you can even build your database using PostgreSQL on a
Windows workstation, then easily migrate it to use PostgreSQL running on a powerful
Unix server. The PostgreSQL databases will migrate from one server to another with
minimal effort. This allows you greater flexibility when expanding office applications.
This feature alone makes PostgreSQL a better database choice in a multiuser data-
base environment. However, with its advanced object-relational database features, Post-
greSQL can also outperform Microsoft Access even in simple single-user database proj-
ects. If you are considering a multiuser database application, I would strongly encourage
you to give PostgreSQL a try. If you are just toying around with a single-user database
project, you can still test out PostgreSQL and see if its features can help you out.
Besides the hardware limitations, some of these packages put limitations on the soft-
ware features available in the free version. For example, Microsoft SQL Server Express
does not allow you to import or export data from the database. This limitation alone
prevents it from being used as a serious production database.
In contrast, with PostgreSQL you get the complete package for free. There are no
limitations to how many CPUs, the amount of memory, or the database size you can use.
You may be thinking that there must be some catch. Perhaps the full versions of the Open
Source packages can’t compete with the free versions of the commercial packages. That
is not true.
The PostgreSQL database product has most of the same features as the commercial
products. Most users and developers won’t be able to tell the difference. In fact, Post-
greSQL has some features that the commercial packages don’t include. The next section
describes these features.
POSTGRESQL FEATURES
If you go to the PostgreSQL web site (www.postgresql.org), you will see a list of all the
database features supported by PostgreSQL. To the normal computer user, this list can
look like a course list for an advanced programming degree. This section walks through
some of the advanced features PostgreSQL supports, and explains just exactly what each
one means for the common database user.
Transaction Support
All DBMS packages allow users to enter database commands to query and manipu-
late data. What separates good DBMS packages from bad ones is the way they handle
commands.
The DBMS database engine processes commands as a single unit, called a transaction.
A transaction represents a single data operation on the database. Most simplistic DBMS
packages treat each command received, such as adding a new record to a table or modi-
fying an existing record in a table, as a separate transaction. Groups of commands create
groups of transactions. However, some DBMS packages (including PostgreSQL) allow
for more complicated transactions to be performed.
In some instances, it is necessary for an application to perform multiple commands
as a result of a single action. Remember, in relational databases tables can be related to
one another. This means that one table can contain data that is related (or tied) to the data
in another table. In the store example earlier, the Order table relied on data in both the
Customer and Product tables. While this makes organizing data easier, it makes manag-
ing transactions more difficult. A single action may require the DBMS to update several
data values in several different tables.
In our store example, if a new customer comes into the store and purchases a lap-
top computer, the DBMS must modify three tables. First, the Customer table must be
updated with the information of the new customer. Second, the Order table must be
14 PostgreSQL 8 for Windows
modified to reflect the new order for the laptop. Finally, the Product table must be modi-
fied to show that there is now one less laptop in the store inventory. In an advanced
DBMS package (such as PostgreSQL), all of these steps can be combined into a single
database transaction, which represents the activity of a customer purchasing a laptop.
Of course, with a multistep transaction there are more opportunities for things to go
wrong. The trick for any DBMS is to know how to properly handle transactions. This is
where the database ACID test comes in.
ACID Compliant
Over the years, database experts have devised rules for how databases should handle
transactions. The benchmark of all professional database systems is the ACID test. The
ACID test is actually an acronym for a set of database features defining how a professional-
quality database should support transactions. These features are as follows:
X Atomicity
R Consistency
R Isolation
W Durability
The ACID tests define a set of standards for ensuring that data is protected in all
circumstances. It is crucial for databases to protect data at all cost. Invalid or lost data
can render a database useless. The following sections describe each of the features of the
ACID test.
Atomicity
The atomicity feature states that for a transaction to be considered successful, all steps
within the transaction must complete successfully. For a single command transaction,
this is no big deal. The trick comes when handling transactions that contain multiple
commands.
In atomicity, either all of the database modification commands within the transaction
should be applied to the database, or none of them should. A transaction should not be
allowed to complete part-way.
In our store example, it would be a huge problem if the Order table is updated to
reflect a purchase without the Product table inventory field being updated to reflect the
number of items purchased. The store would have one less laptop in inventory than
what the database thought was there.
To support atomicity, PostgreSQL uses a system called commit and rollback. Database
modifications are only temporarily performed during a transaction. When it appears that
all of the modifications in a transaction would complete successfully, the transaction is
committed (all of the data in the affected tables is modified per the transaction commands).
If it appears that any of the modifications in the transaction would fail (such as an item not
being in the Product table), the transaction is rolled back (any previous steps that were suc-
cessful are reversed). This ensures that the transaction is completed as a whole.
Chapter 1: What Is PostgreSQL? 15
The two-phase commit approach allows PostgreSQL to test all transaction commands
during the prepare phase without having to modify any data in the actual tables. Table
data is not changed until the commit phase is complete.
Consistency
The concept of consistency is a little more difficult than atomicity. The consistency fea-
ture states that every transaction should leave the database in a valid state. The tricky
part here is what is considered a “valid state.” For most simple databases, this is not an
issue. Transactions that update or modify simple tables are usually not a problem.
Often this feature is used when advanced rules or triggers are present in a data-
base for defining how data is stored (we will talk more about these in the “Rules” and
“Triggers” sections later in this chapter). For now, it is sufficient to know that rules and
triggers are internal database functions that occur based on a specific activity on data in
a table.
Developers create triggers to ensure that data is entered into the database correctly,
such as ensuring that each record in the Customer table contains a valid phone number.
If a customer record is added to the Customer table without a phone number entry, a
trigger can cause the record to be rejected by the DBMS.
Consistency states that all rules and triggers are applied properly to a transaction.
If any rule or trigger fails, the transaction is not committed to the database. For our ex-
ample, if a store clerk attempts to add a new customer record without a phone number,
the trigger would prevent the record from being added, causing the transaction to fail,
thus preserving the integrity of the customer record.
Consistency can also be applied to multiple tables. For example, a developer can cre-
ate a rule for the Order table that automatically updates a Billing table with the cost of a
customer’s order. What would happen if an order was inserted into the Order table, but
the database system crashed before the rule could update the Billing table? Free products
are good for customers, but a bad way to do business for the store.
To meet the ACID consistency test, an entry into the Order table should not be made
until it is certain that the database rule creating an entry in the Billing table was com-
pleted. This ensures that the data in the two tables remains consistent.
Isolation
The isolation feature is required for multiuser databases. When there is more than one
person modifying data in a database, odd things can happen. If two people attempt
to modify the same data value at the same time, who is to say which value is the final
value?
16 PostgreSQL 8 for Windows
When more than one person attempts to access the same data, the DBMS must act as
a traffic cop, directing who gets access to the data first. Isolation ensures that each trans-
action in progress is invisible to any other transaction that is in progress. The DBMS must
allow each transaction to complete, and then decide which transaction value is the final
value for the data. This is accomplished by a technique called locking.
Locking does what it says; it locks data while a transaction is being committed to the
database. While the data is locked, other users are not able to access the data, not even for
queries. This prevents multiple users from querying or modifying the data while it is in a
locked mode. There are two basic levels of locking that can be performed on table data:
X Table-level locking
W Record-level locking
Early DBMS implementations used table-level locking. Any time a user required a
modification to a record in a table, the entire table was locked, preventing other users
from even viewing data in the table. In some database implementations the lock pro-
duces an error event, while in others, the database engine just waits its turn in line to
access the data. It’s not hard to see that this method has its limitations. In a multiuser
environment, it would be frustrating to be continually locked out of your database table
while updates were being made by other users.
To help solve the table-level locking problem, most modern DBMS packages use
record-level locking. This method allows access to most of the table; only the record that
contains the value being modified is locked. The rest of the table is available for other
users to view and even modify.
Although using record-level locking helps, it still does not solve the problem of when
two users want to modify the same data at the same time. PostgreSQL, however, takes
record locking a step further. PostgreSQL uses a technique called Multiversion Concur-
rency Control (MVCC).
MVCC uses a sophisticated locking system that, to the user, does not appear to lock
records at all. To accomplish this, PostgreSQL maintains multiple versions of records that
are being updated. If an update is made to a record that is currently in use, PostgreSQL
keeps the new (updated) version of the record on hold, allowing queries to use the cur-
rent version of the record. When the record becomes available, PostgreSQL applies the
new version to the record, updating the table. If multiple updates are being made on
a record, PostgreSQL keeps each version on hold, and applies the latest version to the
record. To users and application programs, at least some version of the record is always
available.
This feature in itself allows for other features to be included in PostgreSQL. Since
no records are ever locked, a backup copy of any table can be made without stopping
the DBMS. This technique is called online backups (also called hot backups). It ensures that
every database backup contains a copy of every record in the table, even the ones cur-
rently in use. Not having to take a database down for backups is a great feature for high-
demand production environments that do business 24 hours a day.
Chapter 1: What Is PostgreSQL? 17
Durability
The durability feature is a must for a modern-day DBMS. It states that once a transaction
is committed to the database, it must not be lost. While this sounds like a simple concept,
in reality durability is often harder to ensure than it sounds.
Durability means being able to withstand both hardware and software failures.
A database is useless if a power outage or server crash compromises the data stored in
the database.
The basic feature for durability is obviously good database backups. As was men-
tioned in the “Isolation” section, PostgreSQL allows administrators to back up databases
at any time without affecting users.
However, databases are usually only backed up once a day, so what about protecting
transactions that occur during the day? If a customer comes into the store in the morning
to order a new laptop, you wouldn’t want to lose that information if the database server
crashes that afternoon before the evening backup.
While it is impossible to account for every type of disaster, PostgreSQL does its best
to prepare for them. To solve this situation, every transaction that operates on the data-
base is placed into a separate log file as the database engine processes it. This is demon-
strated in Figure 1-6.
TRANSACTION
INSERT into Customer VALUES (“0002”, “Blum”,
“Barbara”, “123 Main St.”, “Gary”, “IN”,“46100”,
“555-1234”)
Transaction Log
INSERT into Customer ...
Customer Table
Customer ID Last Name First Name Address City State Zip Phone
0001 Blum Rich 123 Main St. Gary IN 46100 555-1234
0002 Blum Barbara 123 Main St. Gary IN 46100 555-1234
The log file only contains transactions made to the database since the last database
backup. If for some reason the database becomes corrupted before a new backup, the ad-
ministrator can restore the previous backup copy, and then apply the transactions stored
in the log file to bring the database back to where it was before the crash. When a new
backup is complete, the database engine clears the log file and starts adding any new
transactions. As the log file fills up, a new log file is started, as long as there is available
disk space on the hard drive.
Nested Transactions
Nested transactions are an advanced database concept that can further help isolate prob-
lems in transactions. While the example transactions shown so far are pretty simplistic,
in real-life databases transactions can become quite complicated. It is not uncommon to
run across applications where a single transaction must update dozens of tables.
Sometimes in these larger environments a single transaction will spawn child trans-
actions that update tables separate from the parent transaction. The child transactions
are separate from the main parent transaction, but nonetheless are part of an overall
transaction plan. In these cases the overall result of the parent transaction is not depen-
dant on the result of the child transaction. If a child transaction fails, the parent transac-
tion can continue operating.
In nested transactions, a child transaction can be separated from a parent transaction
and treated as a separate entity. If the child transaction fails, the parent transaction can
still attempt to complete successfully. PostgreSQL allows developers to use nested trans-
actions in complex table modifications.
Sub-selects
A sub-select, also called a sub-query by some DBMS packages, provides a method for
chaining queries. In a normal query, users query data contained in a single table. An
example of this would be to search for all the store customers that live in Chicago. In a
simple query, the user requests data from a table that matches a specific criterion based
on data contained in the same table.
A sub-select allows the user to query data that is a result of another query on a sep-
arate table. This provides for querying multiple tables based on complex criteria. An
example of a sub-select would be to create a query for all customers located in Chicago
who purchased a laptop in the last month. This would require performing a query on
data contained in two separate tables. The sub-select feature allows the database user to
perform these complex queries using a single query command. PostgreSQL allows users
to create complex queries, often saving additional steps in the query process.
Views
As we saw in the preceding section, developers can create complex queries to extract
data from multiple tables. However, for queries that span more than a couple of tables, a
sub-select can become overly complex.
Chapter 1: What Is PostgreSQL? 19
Order Table
Customer Product
Quantity Cost
ID ID
Customer ID Last Name First Name Product ID Product Name Quantity Cost
Rules
PostgreSQL allows you to use complex rules in the database structure. As mentioned
earlier, under the consistency test, a rule performs a function on one or more tables based
on an event occurring in a table. Developers use rules when they need to modify data
in more than one table based on a single action. The example of updating a Billing table
based on adding a record to the Order table is a good example. The rule is responsible for
adding the record to the Billing table whenever a record is added to the Order table.
In PostgreSQL there are two types of rules:
X Do rules
W Do instead rules
Do rules are commands that are performed in addition to the original command sub-
mitted by the database user. Do instead rules replace the original command submitted
by the user with a predetermined set of rules. Do instead rules provide a great tool for
the database administrator to control what users can do to data in the database. Often
rules are created to prevent users from manipulating records they shouldn’t be messing
with.
Triggers
Besides rules, PostgreSQL also supports triggers. A trigger is a set of instructions that is
preformed on data based on an event in the table that contains the data. There are three
types of table events that can cause a trigger to activate:
A trigger differs from a rule in that it can only modify data contained in the same
table that is being accessed. Triggers are most often used to check or modify data that is
being entered into a table, such as the earlier example of ensuring each customer record
contains a phone number.
PostgreSQL uses a special data type called the Binary Large Object (BLOB) to store
multimedia data. A BLOB can be entered into a table the same as any other data type.
This allows developers to include support for multimedia storage within applications.
Caution should be taken, though, when using BLOBs, as they can quickly fill a database
disk space as the BLOB images are stored in the table.
User-Defined Types
If BLOBs don’t get you what you want, PostgreSQL also allows you to roll your own
data types. Creating your own data types is not for the faint of heart. It requires creating
C language subroutines defining how PostgreSQL handles the user-defined data type.
Functions must be created for defining how data is both input into the system by the
user, and output by the system. The output function must be able to display the user-
defined data type as a string. The input function accepts string characters from the user
and converts them into the user-defined data type.
The most common example used for a user-defined data type is complex numbers.
A complex number consists of a pair of floating-point numbers, representing the X and Y
value (such as the value (3.25, 4.00)). The C language input function converts the string
representation of the value into the appropriate floating-point values. Likewise, the out-
put function converts the floating-point values into the string representation.
Roles
Of course, a huge factor in any DBMS package is security. Different tables often require
different access levels for users. Data in a DBMS is protected by requiring each user to
log into the DBMS using a specific userid. The DBMS data dictionary maintains a list of
userids, tables, and access levels. Access to data in individual tables is controlled by the
security list. As many database administrators will attest, in an organization with lots of
people coming and going, trying to maintain database security can be a full-time job.
To help database administrators perform this function, PostgreSQL uses a concept
called roles. Roles allow the database administrator to assign access privileges to a generic
entity instead of assigning table rights directly to userids. The database administrator can
create separate roles for different types of access to different tables, as shown in Figure 1-8.
In Figure 1-8, separate roles are defined for each type of access required for the tables.
The Salesperson role allows read access only to the Customer and Product tables. The
Accountants role allows read access to the Product table, plus write access to the Cus-
tomer and Billing tables. Once the roles are created, a database administrator can assign
individual user accounts to the appropriate role, depending on the access required by the
user. If user Fred is an accountant, he is added to the Accountant role. If user Barney is a
salesperson, he is added to the Salesperson role. If Barney takes night classes, then trans-
fers to become an accountant, rather than have to figure out what access rights should
be added or deleted, all the database administrator needs to do is move Barney’s user
account from the Salesperson role to the Accountant role. Barney automatically has the
appropriate accesses he needs to be an accountant. This feature makes life much easier
for database administrators.
22 PostgreSQL 8 for Windows
Customer Table
Fred
read-only
Barney
Wilma
Salesman read-only
Role Product Table
Betty
write access
read-only
Accountant
Role
Billing Table
write access
Table Partitioning
Table partitioning is a relatively new database concept that not all databases support. It
allows a database administrator to split a single large table into multiple smaller tables.
The database engine still treats the smaller tables as a single logical table, but directs
queries and updates to the appropriate smaller table that contains the pertinent data.
This allows queries to be performed quicker, since they can be performed in parallel on
several small tables, rather than having to trudge through a single large table searching
for data.
It is common to partition data based on a physical attribute of the data, such as dates.
All data for a specific time period, such as a fiscal quarter, is stored in the same partition.
Queries requesting data for a specific quarter then only need to search the appropriate
partition instead of the entire table.
Another benefit to table partitioning is table access speeds. Once the logical table is
divided into smaller physical tables, the database engine can store each table piece in a
separate location on the server. This allows the database engine to migrate sections of the
table that are not used much to slower disk resources, while keeping more active sections
of the table on quicker disk resources. This is shown in Figure 1-9.
Chapter 1: What Is PostgreSQL? 23
Disk 1
Customer Table Partition B
Disk 2
Partitions can also be migrated off of disk storage as the data on them is no longer
needed. It is common to have a rotation system where older partitions are moved to tape
for long-term storage.
Of course, creating table partitions does produce some overhead. The point at which
using a table partition outweighs the overhead is a hotly debated topic in the database
world. The rule of thumb is to partition a table when its size becomes larger than the
amount of memory available to the DBMS. At this point the DBMS can no longer load the
entire table into memory to perform operations, and must swap pieces out to the hard
disk while it works.
PostgreSQL uses the object-relational property of table inheritance to implement ta-
ble partitioning. It does this by creating child tables as table partitions of a single parent
table. The parent table contains all of the required fields for the table, but no data. Each
child table contains the same fields as the parent table, but contains a different data set.
There are two methods to partition data between the child tables:
X Range partitioning
W List partitioning
With range partitioning, data is divided into separate ranges based on a key value in
the table. Each range of data is stored in a separate child table (partition). This is extremely
24 PostgreSQL 8 for Windows
convenient for data that is date based. By setting up child tables based on specific date
ranges, partitions containing older data can easily be migrated to slower disk storage.
With list partitioning, data is divided into separate partitions not based on any order.
This can come in handy if you want to partition a table based on data groups instead of
ranges, such as partitioning customers based on their cities. Each city can have its own
table partition. A list is maintained for each table listing which key values appear in
which partition.
SUMMARY
While relatively new to the Microsoft Windows world, PostgreSQL has made quite a
name for itself in the Unix world as a robust, professional-quality database system. Now
with version 8.0, PostgreSQL has native support for the Windows platform, allowing
Windows users and developers to take advantage of its unique features. PostgreSQL
differs significantly from the popular Microsoft Access database system. PostgreSQL
provides many features not found in Microsoft Access, such as table partitioning. Post-
greSQL also provides an easy migration path, allowing you to easily migrate your da-
tabase from a Windows workstation to a Unix server. Of course, one of the best features
about PostgreSQL is that it is Open Source software and available for free.
The next chapter discusses what type of hardware you will need to run a PostgreSQL
database, as well as show how to download and install the latest version from the Post-
greSQL web site.
2
Installing PostgreSQL
on Windows
25
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
26 PostgreSQL 8 for Windows
N
ow that you have made the decision to use PostgreSQL, you will need to get
it running on your Windows system. PostgreSQL 8 supports many different
Windows platforms and hardware configurations. Your job is to determine
which platform and configuration is best for you.
This chapter walks through the decisions that you must make before installing Post-
greSQL. If you only have one Windows system available to run PostgreSQL on, you don’t
have much of a choice (other than knowing if your system can support PostgreSQL).
However, if you are in the market for purchasing a new system to run PostgreSQL on,
there are a few things you should consider before making your purchase.
After going through the system requirements for PostgreSQL, the chapter next dem-
onstrates the process of downloading and installing the PostgreSQL software package. If
you have never installed Open Source software before and are expecting the worse, you
will be pleasantly surprised at how easy it is to get your PostgreSQL system going.
SYSTEM REQUIREMENTS
Obviously, if you are reading this book, you are interested in installing PostgreSQL on
a Windows platform. You many not, however, have decided exactly which Windows
platform to use. This section describes the different Windows platforms, and the require-
ments for running PostgreSQL on each.
Back in the early days of Windows (such as versions 3.0 and 3.1) there was only one
Windows version released by Microsoft at a time. Software developers had a relatively
easy task of knowing what platform to develop software for. Now, however, there are
multiple types and versions of Windows platforms available in the marketplace, not to
mention all of the older Windows versions that some people still have lying around (and
of course still want to use).
Each platform has its own set of items for you to think about before starting the
PostgreSQL software installation. This section breaks down the PostgreSQL Windows
platform requirements into two categories:
Your PostgreSQL installation will go smoothly if you do a little work ahead of time.
Here are some tips to help you out.
Windows Workstations
One of the great features of PostgreSQL is that it is just as comfortable running on a Win-
dows workstation as it is on a Windows server system. However, there are a few require-
ments you need to know about before starting PostgreSQL on your workstation.
Chapter 2: Installing PostgreSQL on Windows 27
Of course, any future versions of Windows will also support PostgreSQL just fine.
There is another point to consider here, though. Since reparse points are only available
on NTFS-formatted hard drives, PostgreSQL will only run on workstations that have an
NTFS-formatted disk partition available. Unfortunately, when Windows 2000 first came
out, many people were still using the older File Allocation Table 32 (FAT32) hard drive
format, and even today I have seen a few Windows XP workstations formatted using the
FAT32 format, although most new systems use the NTFS format by default. If you are
not sure how your workstation hard disks are formatted, you can use the Windows Disk
Management tool to find out.
To start the Disk Management tool, right-click the My Computer icon that is located
either on your desktop or in the Start menu. From the context menu that appears, select
Manage. The Computer Management window appears, providing lots of options for you
to manage your workstation. Click the Disk Management item to start the Disk Manage-
ment tool, shown in Figure 2-1.
The Disk Management tool displays each of the hard drives installed in your work-
station, plus any removable media types, such as CD-ROMs, USB memory sticks, or
external drives. Each detected drive is shown as a separate text line item in the top-right
frame, as well as a separate graphical item in the lower-right frame.
28 PostgreSQL 8 for Windows
In both the text and graphical representations, the drive type and file system format
are shown. If you have a disk installed with an NTFS file system partition available, you
will be fine. If you don’t, you can easily convert an existing FAT32-formatted file system
into NTFS format by using the built-in Windows convert utility.
The convert.exe program is used to convert FAT- and FAT32-formatted file sys-
tems into NTFS format. All data on the disk will be preserved, but any time you mess
with your hard drive, it is always a good idea to make a clean backup copy of any impor-
tant data before starting the conversion.
To run convert.exe you must be at a command prompt. To start a command
prompt, click Start | Run. The Windows Run window appears. In the textbox, type cmd
and click OK.
In the command prompt window, type convert.exe, followed by the drive letter as-
signed to the drive you want to convert, followed by the option /fs:ntfs. The final entry
should look like this (though your drive letter may differ):
convert.exe d: /fs:ntfs
The convert program will start the conversion process. Please do not try to do any-
thing on your system while the hard drive is being converted. When it is done, you will
have an NTFS-formatted hard drive available to install PostgreSQL on.
Chapter 2: Installing PostgreSQL on Windows 29
Windows Servers
Windows servers present another type of problem. If you are planning on building a
PostgreSQL server to support multiple users, you have lots more things to worry about
than just whether your hard drive file system is formatted as NTFS.
At the time of this writing, there are currently four platform choices in the Windows
server environment:
Each of these server platforms fully supports PostgreSQL and is more than capable
of being built to handle a multiuser PostgreSQL database. For servers, the hard drive file
system formatting should not be a problem. Since the NTFS disk format provides for se-
curing data by user accounts, for security reasons all Windows servers should have their
hard drives formatted as NTFS.
Performance is usually the biggest problem in a multiuser database server environ-
ment. Customers always want faster query times for their applications. In the Windows
server environment, there are basically three items that can affect the performance of the
PostgreSQL database:
Obviously, obtaining the fastest CPU, largest amount of RAM, and fastest hard drives
is the optimal solution. However, in the real world of people on limited budgets, it is not
always possible to obtain such a server configuration.
Sometimes things must be compromised in the server configuration. There is a peck-
ing order for determining how much to spend on the CPU, RAM, and hard disks. When
you have limited funds, a small improvement in just one area can help increase the over-
all performance of the server.
For a database server, the main item you should attempt to maximize is the disk ac-
cess speed. Applications that perform lots of queries on stored data can benefit from disk
configurations with quick read speeds. Alternatively, applications that perform lots of
data inserts and deletes can benefit from disk configurations with quick write speeds.
After maximizing your disk configuration, the next item you should consider is the
amount of RAM in the system. PostgreSQL requires lots of memory to perform its func-
tions. Although it will attempt to work given whatever amount of memory you provide
it, obviously more memory will increase performance. Finally, you should look at the
CPU and how it affects your overall database system.
30 PostgreSQL 8 for Windows
The following sections break these features down and give some advice to help you
decide how you can build your PostgreSQL server.
Figure 2-2 demonstrates how the EIDE and SCSI technologies handle hard drives.
As shown in Figure 2-2, EIDE technology provides for two hard drives per channel,
and most EIDE computer systems only have two channels, allowing for a maximum of
four hard drives. As you will see shortly, this can be a limitation for larger systems. SCSI
allows up to seven hard drives per channel, with most systems capable of using multiple
channels. The downside to SCSI technology is that a separate controller card is required
EIDE
Controller
SCSI
Controller Card
for each channel. The upside to this, though, is that you can often put three or four SCSI
controller cards in a single server, allowing for lots of hard drives.
Most workstation systems use EIDE disk technology. While this is a relatively inex-
pensive disk controller technology, EIDE technology is not the fastest disk access tech-
nology available. Unfortunately, some low-end server systems also use EIDE disk tech-
nology. Most high-end server systems use SCSI disk technology.
As a whole, SCSI disks outperform EIDE disks when it comes to disk access speeds.
However, newer EIDE technology is improving the data access speeds to approach those
of SCSI drives. In a high-performance database server, though, SCSI disks are almost al-
ways preferred. The ability to easily add multiple hard drives is a necessity when consid-
ering the second feature required for a good server hard drive system, discussed next.
The second hard drive feature is the type of fault tolerance used on the hard drive
system. On a workstation system, there is often just one disk drive with no fault toler-
ance. This is fine, until something goes wrong with the disk drive. A drive failure can
mean catastrophic results for your database (remember from Chapter 1, durability is a
key feature of an ACID-compliant system). If a hard drive crashes, the transaction log file
is lost, along with all of the transactions made to the database since the last backup.
To help lessen the impact of disk problems, administrators have resorted to using a
technology called Redundant Array of Inexpensive Disks (RAID). RAID technology pro-
vides for several different techniques to safeguard stored data. Each of these techniques
requires using multiple hard drives. Because of this requirement, almost all RAID con-
figurations are implemented using SCSI technology, which easily accommodates large
numbers of disk drives.
In a RAID disk configuration, one disk in a multi-disk configuration can fail without
losing data. This is possible using a logical disk volume structure. Although there are
multiple disks on the system, the operating system (Windows) treats them as a single
logical disk. Data is put on the multiple disks in such a manner that the data contained
on a single failed drive can be recovered based on the data placed on the other active
drives.
There are multiple levels of RAID technology available. Each one configures the multi-
ple disks in a slightly different manner, providing different levels of data security. Table 2-1
shows the levels of RAID that are commonly available in modern-day Windows servers.
The trick for database administrators is to pick the RAID level that gives the best per-
formance and the most data security. In each of these standard RAID levels, performance
is traded for data redundancy. In the RAID 0 method, data write speeds are improved as
the data is spread out over multiple disks, minimizing the amount of time the disk write
heads must travel. Read speeds are also increased, as the disk head travels a shorter dis-
tance to pick up each piece of data. While RAID 0 systems improve disk access speeds,
they do not have fault tolerance. If one of the striped disks goes bad, you lose all of the
data on the system.
The RAID 1 approach solves the fault-tolerance problem, as a complete duplicate hard
drive is available at all times. However, this comes at the cost of disk access times. Since
both disks must always be in sync, data must always be written twice, once on each disk.
32 PostgreSQL 8 for Windows
The RAID 5 level attempts to lessen the impact by writing a parity bit for each block of
data striped across the disks. The thing that slows RAID 5 down is that the parity bit must
be computed for each data write, but it is still faster than RAID 1.
RAID 0+1 attempts to take the best of both the RAID 0 and RAID 1 worlds. By using
the RAID 0 striping of data across multiple disks, read and write speeds are improved.
By using RAID 1 mirroring of the striped disks, if one disk goes bad, you still have the
mirrored set to recover from. The increased speed of the RAID 0 system helps offset the
RAID 1 slowness.
RAID 5 is the most common fault-tolerance method implemented by server manu-
factures. In RAID 5 systems having fewer than six disks, the overhead in writing the
parity bit becomes a larger factor. It has been shown that this overhead is lessened in
systems that have six or more disks.
In most tests, the RAID 1+0 level has proven to be the most efficient method for da-
tabase servers. This level provides basic data security, while providing minimal delays
in data reading and writing.
As a final warning, be careful about how the RAID technology is implemented on
your server. Most RAID configurations are built into the SCSI disk controllers them-
selves. The RAID functions are performed at a hardware level, providing minimal over-
head for disk operations. Unfortunately, the Windows Disk Management tool allows you
to emulate RAID technology within the operating system software itself, using standard
Chapter 2: Installing PostgreSQL on Windows 33
hard drives (even using EIDE drives). Although Windows allows you to create RAID
configurations on standard disks, this provides a huge overhead for accessing data on
the disk. Avoid this feature, especially for large servers where you expect lots of database
traffic.
RAM Performance
PostgreSQL attempts to use as much memory as possible when processing data transac-
tions. PostgreSQL loads as much of a data table into memory as possible when process-
ing transactions. If you have your data divided into lots of small tables, PostgreSQL
must swap tables in and out of memory for each transaction. This requires fast memory
speeds to keep up with the transactions. If you have your data placed in large tables,
PostgreSQL will attempt to load as much of the table into memory as possible to speed
up transactions. This is where having lots of RAM available helps out.
The bottom line is that the more data you expect to have in your tables, the more
memory you should try to install on your server. PostgreSQL will operate to the best of
its ability with the amount of RAM you have installed on your system. You can just help
it along if possible.
CPU Performance
PostgreSQL is not too picky about CPU speeds and types. At the time of this writing, the
only CPU requirements PostgreSQL has are that the processor must be at least a Pentium
III or later Intel processor (or equivalent) and that the processor must be a 32-bit series bus.
PostgreSQL has not been fully tested on 64-bit CPUs. That does not mean that it will
not run on a 64-bit system, just that the PostgreSQL developers have not fully tested and
certified it on 64-bit systems. If you are running a standard 32-bit CPU Windows system,
you should not have any trouble getting PostgreSQL to run on your system.
DOWNLOADING POSTGRESQL
One of the best features of Open Source software is that it is freely available on the In-
ternet. You don’t have to worry about registering your name and address with a com-
pany to receive a complimentary CD in the mail (along with an endless supply of junk
mail). For most Open Source packages, all you need is an Internet connection and a web
browser.
Instead of distributing software on a CD, Open Source packages are distributed by
combining all of the necessary files into a single distribution package file. For Windows
systems, this is a zipped file that contains all of the files required for the installation. To
be able to install PostgreSQL on a Windows platform, you must have an unzip utility
handy. Fortunately, in Windows XP and 2003, this utility is built into the operating sys-
tem (called Compressed Folders). If you are using a Windows 2000 platform, you will
have to obtain a third-party unzip program.
34 PostgreSQL 8 for Windows
When you are ready to download PostgreSQL, you will probably want to use a high-
speed Internet connection. The PostgreSQL download is fairly large (about 23MB at the
time of this writing), so you would not want to download it using a dial-up modem. If
you don’t have access to a high-speed Internet connection, there are commercial versions
of PostgreSQL you can purchase. There are also some companies on the Internet that will
burn Open Source software onto a CD for you for a small fee (usually $5 per CD). Just do
some searching on the Internet and you will find them.
If you are ready to download PostgreSQL yourself, go to the PostgreSQL home page,
located at www.postgresql.org. On the home page, there is a list of the current versions
still supported by the PostgreSQL development community under the Latest Releases
section. You will see a few different versions listed in this section. Now is a good time to
explain the PostgreSQL version numbers.
Unlike other software packages, PostgreSQL does not force you to upgrade from one
version to another. The developers of PostgreSQL realize that often users run production
databases and do not need to upgrade to the latest major version (using the “if it ain’t
broke, don’t fix it” theory). Currently, there are two major releases that are supported:
7 and 8 (signified by the first digit in the release number).
Within each major release are several update releases. Each update release adds
new minor features to the major release that you may or may not need to implement in
your database system. The update release is signified by the second digit in the release
number. The PostgreSQL 8 series is currently on update release 2 (8.2), although update
releases 0 and 1 are still available. The PostgreSQL 7 series also supports two update
releases, versions 7.3 and 7.4.
Within update releases, there are patches that are released to fix bugs and security
problems. These patches are released more often than update releases, and do not add
any new functionality to the software. While you do not necessarily have to upgrade
your PostgreSQL system to the latest and greatest major or update release level, it is
recommended that you should at least keep up with the patches released for the version
you have installed.
At the time of this writing, there are five releases available for download on the Post-
greSQL home page:
Another thing to notice in the Latest Releases section on the PostgreSQL home page
is that each release has two links. The first link (associated with the release name) points
to a source code download area. The second link (associated with the word Binary next
to the release name) points to the binary application download area. While the source
code release allows you to view and modify the official PostgreSQL source code, for most
Windows users, all you will want to download is the binary version of the latest release.
Chapter 2: Installing PostgreSQL on Windows 35
This package provides everything you need to get PostgreSQL running on your Windows
system.
After clicking the binary link of the release you want to use, you are taken to a direc-
tory that contains two links. The first link, linux, points to the Linux repository, and the
second link, win32, points to the Windows repository. Click the win32 link to get to the
PostgreSQL install packages.
The Win32 repository contains several downloads for different types of install pack-
ages. You will want to download the complete package, which includes the PostgreSQL
software and the Windows install routines. At the time of this writing, this link is called
postgresql-8.2.0-1.zip.
When you click the link, you are directed to a list of the available servers from which
to download the package. When you click a server link, the download starts. Save the file
to a convenient place on your local hard drive. When the download completes, you are
ready to start the installation process.
INSTALLING POSTGRESQL
After you have the PostgreSQL Win32 zip package on your PC, you can start the instal-
lation process. The first thing to do is unzip the distribution package. If you have a Win-
dows XP, Vista, or 2003 system, this is easy, as the Windows operating system includes
support for zipped files. Just click the distribution file in Windows Explorer and extract
the files to a temporary directory. It is important that you extract the files to a temporary
directory. The installation program will not work properly if you attempt to run it di-
rectly from the zipped file. Also, remember if you are using a Windows 2000 workstation
or server, you will have to obtain an unzip program if you don’t already have one. There
are plenty of free and commercial zip packages available.
After unzipping the distribution package into a working directory, you should see a
few different files. Don’t be concerned about the name of the installation files. The files
only contain the update version names and not the complete patch names. This is to
simplify the install scripts for each version.
There is also a batch file available to automatically update an existing installation.
This batch file can only be used to install a new patch version of the same update ver-
sion (such as going from version 8.1.3 to 8.1.4). It cannot be used to install a new update
version (such as going from version 8.0 to 8.1). To upgrade to a new update version, you
must export all of your data, reinstall PostgreSQL, then import your data back (this is
covered in Chapter 4).
The installation package divides the PostgreSQL packages into two .msi files, a base
file called postgresql-x.y.msi, and an additional installation file called postgresql-
x.y-int.msi, where x is the major release number, and y is the update release number.
For a new installation, double-click the base .msi file (currently called postgresql-
8.2.msi) to start the installation.
There are several windows that appear throughout the PostgreSQL Windows instal-
lation. The first window allows you to select the language used in the installation process
36 PostgreSQL 8 for Windows
(this selection does not set the language used by the PostgreSQL system; that selection
comes later in the installation process). Also, don’t neglect the little check box at the bot-
tom of this window. It is a good idea to allow the installer to create a log file showing the
installation progress in case anything goes wrong. Click Next to go to the next window
in the installation.
The rest of the installation process can be as simple or as complicated as you want to
make it. If all you are interested in is getting a basic PostgreSQL installation, you can take
the default values for all of the prompts and quickly get through the installation process.
If you want to customize your PostgreSQL installation later, you can rerun the installa-
tion program and just select the items you need to install. Just remember that you should
deselect the Data Dictionary option if you want to keep the existing data dictionary files.
If you reinstall the Data Dictionary option, the installer will overwrite the existing data-
base files, so any databases or tables you created will be lost.
If you want to customize your PostgreSQL installation now, you can choose to se-
lect or deselect the desired options during the initial installation. The following sections
describe the choices that are available to you during the installation process.
The Installation Options window shows each component of the installation package
and allows you to choose whether or not to install it and, if so, where to install it on your
system. Each component is shown as a separate box in the window.
The top-level item is the main PostgreSQL package. Clicking the package icon acti-
vates the Browse button near the bottom of the window and displays the current location
where the PostgreSQL package will be installed. You can change the installation location
on your system by clicking the Browse button and choosing a different location. This
changes the location where all of the PostgreSQL files will be stored. The default location
is C:\Program Files\PostgreSQL\8.2 (for the 8.2 update version release).
The main installation package also has four separate icons for each of the packages
included in the installation package:
X Database Server
R User Interfaces
R Database Drivers
W Development
You can choose to install or skip each of these packages. The following sections de-
scribe the packages to help you decide if you need them.
Database Server
The Database Server is the database engine portion of the PostgreSQL package. It in-
cludes four modules that can be installed:
User Interfaces
The User Interfaces package includes two separate packages you can install:
X psql
W pgAdmin III
Both packages are selected to be installed by default. The psql program provides a
simple command-line interface for running ad hoc SQL commands in your database. It
is extremely handy to have around when working on your database.
The pgAdmin III package provides a fancy Windows GUI program for administering
your PostgreSQL databases. It makes administering a PostgreSQL database almost easy.
Both of these packages are covered later in this book. pgAdmin III is covered in Chapter 4
and psql is covered in Chapter 5. I strongly recommend installing both packages to
make your PostgreSQL experience much easier.
Database Drivers
If you are creating your PostgreSQL database in a production environment, most likely
you will need to interface it with application programs. Different application develop-
ment environments require different methods to interface with the PostgreSQL database.
PostgreSQL provides a few drivers to support the following interfaces:
X JDBC
R Npgsql
R ODBC
W OLEDB
If you plan to work in a Java development environment, the Java Database Connec-
tivity (JDBC) driver allows your Java applications to connect to the PostgreSQL database
and perform database transactions. Similarly, the Npgsql driver allows applications cre-
ated in Microsoft’s .NET environment to connect to PostgreSQL. The ODBC and Object
Linking and Embedding Database (OLEDB) drivers interface legacy Microsoft applica-
tions, including Microsoft Access, with the PostgreSQL database.
Chapter 2: Installing PostgreSQL on Windows 39
Part III, “Windows Programming with PostgreSQL,” demonstrates how to use each
of these drivers in their respective development environments. If you plan to follow
along in this section, you need to install these drivers if they are not already installed.
All of these drivers are installed by default. If you decide you do not want to use any of
these drivers, you can choose not to install them. They can be individually installed later
on if you need them.
Development
The development install packages include libraries and header files for use with various
development environments. If you plan on developing C or C++ applications for Post-
greSQL (as covered in Chapter 13), you need to include the development packages in
your installation. If you choose not to install any of the features during the initial instal-
lation process, you can always rerun the PostgreSQL installer and select just the options
you did not install the first time.
There are two methods you can use to run PostgreSQL on your Windows system:
X As a background service
W As a normal program
The PostgreSQL installation process provides an easy way for you to create a Post-
greSQL background service on your Windows system. Checking the Install as a Service
check box allows the PostgreSQL installer to create the necessary service objects on the
Windows system. This method allows PostgreSQL to automatically start when the sys-
tem is booted. This is almost always the preferred method of running PostgreSQL on
servers. It also comes in handy if you are doing development work on your workstation.
If you are just playing around with PostgreSQL on your workstation, you might not
want it to load every time you turn on your system. In that case, choose to not install
PostgreSQL as a service (clear the check box).
The Service Configuration window allows you to customize the way the service runs.
You can specify the name of the service, as well as the user account used to run the ser-
vice. If the user account does not exist on the system, the installer will attempt to create
it. The default account PostgreSQL will create is called postgres.
There is one word of caution if you run PostgreSQL as a service. To support strict se-
curity requirements, PostgreSQL will not allow itself to be started by a user account that
has administrator privileges. This blocks hackers from utilizing the PostgreSQL program
to gain uncontrolled access to the Windows system. The postgres user account created
has limited privileges on the system.
When you have completed your selections for the Services Configuration window,
click Next to move on in the installation.
accept network connections from all network interfaces on the Windows system. Even
with this check box checked, by default, PostgreSQL will not accept connections from
external network clients. You will see in Chapter 3 that there is an additional configura-
tion item you need to make to allow remote clients to access your database.
The Locale parameter is where you configure the language used on the PostgreSQL
system. There are lots of values that this parameter can be set to. If you select a non-
English Locale, you must have the national language support module selected in the
Installation Options window shown earlier in Figure 2-3.
You will notice that the default value of Locale is not set to a language. Instead, it
is set to the value C. The C stands for the ISO C standard, which allows PostgreSQL to
obtain the locale information from the host system. Theoretically, this should not be a
problem; however, some host systems do not follow the ISO C standards properly, thus
creating a problem for PostgreSQL. Fortunately for us, this is not a problem on Windows
systems, so you can leave this default value alone.
The Encoding parameter determines how data is stored in the database. This pa-
rameter depends on how the host system stores values. The default value is SQL_ASCII
for Windows systems. This encoding stores data in standard ASCII format in the data-
base. While ASCII encoding works great for English characters, it is extremely limited
for many other language character sets.
42 PostgreSQL 8 for Windows
As you can see from Figure 2-6, you have a few different choices of what program-
ming languages to use within your PostgreSQL system. However, you may also notice
that most of the choices are grayed out and not available.
By default, PostgreSQL supports only its own pgsql procedural language. If you have
the Sun Java runtime installed on your system, and its location is specified in the Win-
dows PATH environment variable, PostgreSQL will also allow you to select PL/Java as
a procedural language. The other procedural languages require you to install additional
third-party software packages.
If you are familiar with the Unix world, you may have heard about the other proce-
dural language options available in PostgreSQL. The Perl, Python, and Tcl languages are
popular Unix scripting languages that are supported by most Unix and Linux platforms.
Unfortunately, Windows does not support these languages without additional software.
There are both commercial and free Windows versions of these scripting languages.
Table 2-2 lists as an example one of several web addresses where a Windows version of
each language package can be found.
In order to have these procedural language options available, you must install the ap-
propriate package before starting the PostgreSQL installation. If the procedural language
is installed, PostgreSQL will allow you to select the language to install in the PostgreSQL
system. Remember, you can always rerun the PostgreSQL installer program again after
the initial installation to install just these additional components.
The functions created by the contrib modules are not the same as the internal Post-
greSQL functions, like the ones that are covered in Chapter 8. Instead, these are specialized
functions created by running the contrib module SQL scripts in the database. Table 2-3
describes the functions available in the PostgreSQL installation at the time of this writing.
You may notice that the Admin81 contrib module is already selected by default. This
contrib module is required if you are using the pgAdmin III application to administer the
PostgreSQL system. If you do not plan on using pgAdmin III, you can unselect this item.
Don’t worry if you cannot decide whether you want a contrib module installed. This
step only installs the contrib modules into the database. The SQL code used to create the
contrib modules is automatically stored in the PostgreSQL installation directory on your
system. You can always go back and run the SQL code to manually install the contrib
module at a later time.
RUNNING POSTGRESQL
Now that you have PostgreSQL installed, it’s time to either see if it’s running (if you
chose to install it as a background service) or get it running (if you did not choose to start
it automatically). The following sections describe both of these two methods for running
PostgreSQL on your Windows system.
Service Method
If you chose to run PostgreSQL as a background service, it should now be running on
your system. You can use the Windows Task Manager to check for yourself. The Task
Manager allows you to see what services are currently running on your system. You
must be logged in with a user account that has administrator privileges to see system
services.
The easiest way to start the Task Manager is to right-click on an empty place on the
system taskbar and select Task Manager from the menu that appears. Figure 2-8 shows
the Task Manager in action.
In the Task Manager window, click the Processes tab. This shows a list of all the
services running on the system. You can click the Image Name column heading to sort
the processes by their names. If you scroll down to the P section, you should see several
processes related to PostgreSQL.
The main PostgreSQL process is called postmaster. This process controls the
PostgreSQL system and database engine. You should also see several other postgres
processes, which handle queries to the database engine. These processes control con-
nectivity to the PostgreSQL system. The number of postgres processes created is con-
trolled by an entry in the PostgreSQL configuration file (discussed in Chapter 3).
Manual Method
If you do not want the PostgreSQL system running all the time on your Windows sys-
tem, you have to manually create the default database, and start and stop the program.
This is done using a couple of the programs included with PostgreSQL.
To be able to start PostgreSQL, it must have the Data Dictionary area created. To
manually create this, you must use the initdb program. The initdb program uses
command-line parameters to define where the Data Dictionary should be created, and
what type of encoding to use:
initdb -D datapath -E encoding
By default, the Data Dictionary is located in the data directory within the Post-
greSQL installation directory. The datapath must exist, and the user account used to run
PostgreSQL must have write access to it.
The initdb program must be run by the Windows user account that controls Post-
greSQL. This cannot be a user account with administrator privileges. To run the initdb
program as the user account, you can either log in using that account or use the Win-
dows runas command to start the program as the user. The easiest way to do this is to
start a command-prompt session as the PostgreSQL user account:
runas /user:postgres cmd
After prompting you for the postgres password, Windows starts a command-
prompt session as that user. By default, the initdb program is located in the PostgreSQL
bin directory. You can change to that directory and run the program as follows:
cd \program files\postgresql\8.2\bin
initdb -D "c:\program files\postgresql\8.2\data"
Note that you must enclose the datapath value in double quotes if it includes spaces.
After the Data Dictionary is created, you can start PostgreSQL. This is done using
the pg_ctl program. Again, to run the pg_ctl program you must be logged in as a
nonadministrator user account, or use the runas command as you did with the initdb
program.
48 PostgreSQL 8 for Windows
There are several functions you can perform with the pg_ctl program. The format
of the pg_ctl command for controlling the PostgreSQL server is
pg_ctl command -D datadir
The command parameter can be one of five basic options: start, stop, restart, reload,
and status. The datadir parameter points to the location of the Data Dictionary. If the
pathname includes spaces (such as in the default directory location) you must enclose
the pathname in double quotes. Thus, to start PostgreSQL using the default Data Direc-
tory location, you would use the following command:
Remember, you cannot start the PostgreSQL server from a user account that has
administrator privileges. If the PostgreSQL server starts, you will be able to see the
postmaster and postgres processes running in the Task Manager.
SUMMARY
This chapter discussed how to get PostgreSQL running on your Windows workstation
or server. There are a few things that need to be addressed before starting the software
installation. PostgreSQL is robust enough to work well even on a Windows workstation
platform. For workstations, you must ensure that you have an NTFS version 5.0 partition
available to install PostgreSQL on. For a multiuser server environment, there are few ad-
ditional things to worry about. The most important feature for a PostgreSQL server is the
disk access speeds. The quicker the disk access speed the better the database performance.
Servers using EIDE disk technology are poor choices for database servers. The SCSI disk
technology provides for faster data transfer speeds, as well as providing for more disk
drives to be added to the system. Servers using SCSI disk systems can also implement
RAID technology to provide a fault-tolerant disk environment for the database. Next to
fast disk speeds, PostgreSQL also prefers to have as much RAM available as possible.
When you have your workstation or server ready for PostgreSQL, you can download
the Windows binary distribution package directly from the PostgreSQL web site. The
PostgreSQL developers support several versions of PostgreSQL at the same time. For a
new database implementation, you will want to download the latest version available.
New patches are releases often to fix bugs and security problems. Updates to releases are
released less often, and major releases are released even less often.
After downloading the distribution package, you must unzip it and store it in a tem-
porary location. The PostgreSQL installation program is started by double-clicking the
.msi installer file. The installer program starts by asking a few questions regarding the
type of installation you want, as well as the features you desire for your PostgreSQL
setup. After answering the questions, the PostgreSQL software loads.
After PostgreSQL is installed, it’s time to start exploring. The next chapter walks
through the various utilities and configuration files found in the PostgreSQL package.
3
The PostgreSQL Files
and Programs
49
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
50 PostgreSQL 8 for Windows
N
ow that you have PostgreSQL installed on your system, we can take a walk
through the various pieces of it. There are lots of files associated with the
PostgreSQL install. Most of them work behind the scenes with the database
engine. The two main types of files you will have to worry about are configuration files
and utilities. This chapter shows where PostgreSQL keeps all of these files, then explains
how to customize your PostgreSQL system using the configuration files. Finally, the
chapter presents a rundown of the various PostgreSQL utilities you have available to use
on your PostgreSQL system.
C:\Program Files\PostgreSQL
Under this directory is a directory named after the update version you installed.
At the time of this writing, that directory is called 8.2. If you install additional patch
releases for the update, the files remain in the same directory. If you upgrade to a new
update release, for example version 8.3, the PostgreSQL installer will create a new direc-
tory for the new files. Patch releases are kept in the same directory as the original update
release.
If you accepted all of the default locations during installation, all of the PostgreSQL
files are located beneath the update release directory. Table 3-1 outlines the directories
you should see using a default installation.
The data directory is especially important in that it contains files and directories spe-
cific to the operation of the database engine. As mentioned in Chapter 2, you can choose
to relocate this directory during the installation phase to a separate place on the system.
The data directory is referred to as the database cluster, as this is where all of the data-
base files are located. The PostgreSQL system must have at least one database cluster to
operate. The next section takes a closer look at the database cluster directory structure.
Directory Description
bin The PostgreSQL main programs, utilities, and library files
data PostgreSQL Data Dictionary, log files, and the transaction log
doc Documentation on contrib modules, PgOleDb, and psqlODBC
include C program header files for developing C programs for
PostgreSQL (if the Development package was installed)
jdbc Java JDBC library files for developing Java programs for
PostgreSQL (if the JDBC package was installed)
lib PostgreSQL library files for the executable programs
npgsql Microsoft .NET library files for developing .NET programs for
PostgreSQL (if the npgsql package was installed)
PgAdmin III The pgAdmin III program documentation
share Contrib modules and timezone information for PostgreSQL
Directory Description
base Contains a directory for each database
global Contains system tables for the Data Dictionary
pg_clog Contains status files on transaction commits
pg_log Contains PostgreSQL system log files
pg_multixact Contains multitransaction status information used for
row locking
pg_subtrans Contains subtransaction status information
pg_tblspc Contains links to database tables
pg_twophase Contains phase files for the two-phase transaction
commit process
pg_xlog Contains the transaction log files
The base directory contains subdirectories for each database created on the Post-
greSQL system (creating databases is described in Chapter 4). PostgreSQL names the
directory after the object ID (OID) assigned to the database in the PostgreSQL Data Dic-
tionary. As a normal PostgreSQL user, you will not have to worry about these files, be-
cause PostgreSQL takes care of them behind the scenes.
As a PostgreSQL administrator, one area you should become familiar with is the
pg_log directory. This is the place where the PostgreSQL system maintains system log
files. These log files track events that occur in the database using text messages stored in
the log file. As the administrator, it is your job to watch these log files for any problems
that appear.
Each time the PostgreSQL system is started, a new log file is created. The default
format of the log filename is
postgresl-year-month-day-time.log
You can change the filename format using features in the postgresql.conf file
(discussed in “The postgresql.conf File” section later in this chapter).
Each major event that happens in the PostgreSQL system is logged in the system log
file. The default format shows a timestamp and the event that occurred. Some sample
entries from a log file are given here:
2006-06-29 20:13:45 FATAL: database "test" does not exist
2006-06-29 20:24:01 LOG: transaction ID wrap limit is 2147484148,
limited by database "postgres"
2006-06-29 20:25:05 LOG: autovacuum: processing database "Test"
2006-06-29 20:26:05 LOG: autovacuum: processing database "template1"
2006-06-29 20:27:05 LOG: autovacuum: processing database "postgres"
2006-06-29 20:28:18 NOTICE: ALTER TABLE / ADD PRIMARY KEY will create
implicit index "ItemID" for table "test"
2006-06-29 20:28:46 ERROR: syntax error at or near "connect" at character 1
2006-06-29 20:30:35 LOG: autovacuum: processing database "postgres"
2006-06-29 20:31:35 LOG: autovacuum: processing database "Test"
2006-06-29 20:32:35 LOG: autovacuum: processing database "template1"
2006-06-29 20:33:18 ERROR: duplicate key violates unique constraint "ItemID"
2006-06-29 20:33:52 ERROR: column "itemid" does not exist
2006-06-29 20:34:34 LOG: autovacuum: processing database "postgres"
2006-06-29 20:35:34 LOG: autovacuum: processing database "Test"
2006-06-29 20:35:40 ERROR: syntax error at or near "1" at character 36
You can change the format of the log file entries (this is also discussed in “The post-
gresql.conf File” section). You may notice that after the timestamp on each line is a key-
word, which describes the severity of the event, followed by a short description of the
event. There are a few different levels of information logged in the system log file, each
one signified by a special keyword. Table 3-3 describes the different information levels
stored in the log in order of severity.
Chapter 3: The PostgreSQL Files and Programs 53
As you can see from the example entries above, there are lots of events that are logged
in the log file. On a busy system, it does not take long for the log file to get rather large.
You can configure how PostgreSQL handles system logging using the PostgreSQL con-
figuration file (discussed in the next section). You have lots of options of how to handle
log files, such as setting the level at which PostgreSQL logs messages (the NOTICE level
is the default), and setting the log file to automatically roll over to a new file if it gets past
a preset size. This helps you to sift through the log file looking for problems among the
normal messages.
It is always a good idea to keep an eye on the PostgreSQL log files. If you let them go
on for too long, they will consume the entire available disk space on your system, and
cause the PostgreSQL server to stop. Most database administrators institute a policy on
when to delete old log files, and whether or not to save them before deletion.
CONFIGURATION FILES
How PostgreSQL behaves on your system is controlled by three separate configuration
files. While the PostgreSQL install process creates standard configuration file values for
the system to operate, knowing how to fine-tune your PostgreSQL configuration values
can be a necessity for large installations. There are plenty of advanced features that you
can turn on or off via the configuration files.
54 PostgreSQL 8 for Windows
The PostgreSQL configuration files are standard text files, located in the database
cluster directory, described earlier. Since they are standard text files, you can use the
Windows Notepad application to view and modify them. On a Windows PostgreSQL
installation, there is an easy way for you to edit the configuration files. The PostgreSQL
menu area on the Windows Start | Programs menu (called PostgreSQL 8.2 on my instal-
lation) contains links to edit each of the three configuration files:
X postgresql.conf
R pg_hba.conf
W pg_ident.conf
All you need to do is click the appropriate link to bring the configuration file up in a
standard Windows Notepad session.
You can change configuration file values at any time while the system is running.
However, the new changes will not take effect until either the system is restarted or you
use the reload feature located on the PostgreSQL menu (or manually use the pg_ctl
reload command, discussed later in this chapter).
When working with the configuration files, there are a few things to keep in mind.
Each entry in the configuration files is on a separate line. Lines that start with a hash
mark (#) are comments and not processed by the PostgreSQL system (a hash mark can
be added at the end of an entry to add a comment on the entry line). If a configuration
line is commented out, PostgreSQL uses the default value for that entry. You can change
the default value to another value by removing the comment symbol from the line and
reloading or restarting the PostgreSQL system.
However, if you want to revert to the default value for an entry, you cannot just put
the comment symbol back and reload the PostgreSQL system. You must stop and restart
the PostgreSQL system for the default value to take effect. This is a big “gotcha” for
many novice PostgreSQL administrators.
The following sections describe the entries in the three configuration files and walk
through the settings that can be controlled within the configuration files.
featurename = value
While this is the formal definition of a feature record, in reality, the equal sign (=) is
optional. As long as the value item is separated from the featurename item using white
space (either one or more spaces or tabs), PostgreSQL will understand what you mean.
Comments are preceded by a pound sign (#), which can be either at the start of a line to
comment the entire line, or after a valid feature definition to add a comment to a feature
line.
Chapter 3: The PostgreSQL Files and Programs 55
When you view the postgres.conf file, you will see that there are lots of feature
entries that are commented out. When a feature is commented out from the configura-
tion file, PostgreSQL assumes the default value for the feature. The configuration file
shows the default value within the commented line.
In the sample postgresql.conf file created by the PostgreSQL installer, similar
features are grouped together into sections, but this is not a requirement. Feature records
can be placed anywhere in the configuration file.
You can view and modify the contents of the postgresql.conf file by choosing
Start | Programs | PostgreSQL 8.2 | Configuration Files | Edit postgresql.conf. The con-
figuration file contains entries for all of the features available to modify, along with their
default values. The following sections describe the different configuration file sections
and the features that can be modified. In each section, the feature entries are shown with
their default values, followed by a description of what the feature is used for.
data_directory = 'ConfigDir'
hba_file = 'ConfigDir/pg_hba.conf'
This feature is used to reference the location of the pg_hba.conf configuration file.
By default, it is located in the database cluster directory. You can elect to move the file to
an alternative location on your system.
ident_file = 'ConfigDir/pg_ident.conf'
Similarly, this feature allows you to relocate the pg_ident.conf configuration file.
external_pid_file = '(none)'
By default, PostgreSQL maintains a file that contains the system process ID (PID)
of the currently running PostgreSQL system. This file is used by PostgreSQL utilities to
track the current PostgreSQL running process. You can define an additional copy of that
file by using the external_pid_file feature. The default location is always kept for
internal PostgreSQL use.
56 PostgreSQL 8 for Windows
port = 5432
The port feature allows you to set the TCP port that PostgreSQL listens for client
connections on. If you change this value, you will have to use the new value in the client
configurations.
max_connections = 100
The max_connections feature allows you to limit the number of clients that can
connect to your PostgreSQL system simultaneously. Be careful with this value, as each
connection defined consumes additional memory on your system, even when no clients
are connected.
superuser_reserved_connections = 2
On Windows and Unix systems, internal system communications can occur via tem-
porary socket files created by applications. By default, PostgreSQL creates the temporary
socket connections in the directory defined by the TEMP Windows environment variable.
You can elect to use a different directory on your system by using this feature value. By
default, the owner of the socket files is the group of the current user. You can change the
group permission for these files, although it is not recommended. The 0777 permission is
a Unix octal format that allows anyone to connect to PostgreSQL using the internal con-
nection. This is not needed in the Windows PostgreSQL installation.
bonjour_name = ''
The Bonjour server is a method for network servers to advertise their DNS names
on the network for servers and clients to recognize. By default, the Bonjour name of the
PostgreSQL is the same as the Windows computer name. You can change the value by
entering a text value in the bonjour_name feature.
The security and authentication group of features allows you to configure the user
authentication features of your PostgreSQL system:
authentication_timeout = 60
This value sets the maximum amount of time (in seconds) that a client has to authen-
ticate with the PostgreSQL server. If authentication is not complete, the connection is
terminated.
ssl = off
password_encryption = on
db_user_namespace = off
When enabled, this feature allows separate PostgreSQL user accounts to be created
for each created database. As you will see in Chapter 4, by default, each PostgreSQL
user account can be used to access any database on the system (limited by permission
restrictions created by the administrator). By enabling this feature, you can restrict user
accounts to only a single database.
58 PostgreSQL 8 for Windows
Kerberos servers are popular in the Unix environment for client authentication. Most
Windows administrators do not need to work with Kerberos servers, as Windows pro-
vides its own authentication method using Active Directory. If you do use Kerberos serv-
ers in your network, the following features can be used to define your environment:
The final group of features in this section defines advanced TCP control behavior.
These features define advanced TCP parameters for fine-tuning the PostgreSQL network
connectivity. For most normal situations, you should not have to worry about messing
with these values.
X tcp_keepalives_idle = 0 Sets the number of seconds between sending
TCP keepalive packets on an idle remote client connection. The value of 0 indi-
cates for PostgreSQL to use the Windows operating system default.
R tcp_keepalives_interval = 0 Specifies how long to wait for a response
to a keepalive before retransmitting. The default value of 0 indicates for Post-
greSQL to use the Windows operating system default.
W tcp_keepalives_count = 0 Specifies how many keepalive packets can
be lost before the client connection is considered dead. Again, the default value
of 0 allows PostgreSQL to use the Windows operating system default.
Be careful when changing the TCP features. If you are interested in changing any of
these values, please consult a TCP/IP networking book.
shared_buffers = 1000
Most PostgreSQL administrators recommend using a value of at least 2000, and working
up or down from there, depending on the performance of your particular system. For a
standard Windows workstation installation, the default value of 1000 should be fine.
temp_buffers = 1000
This feature determines the maximum number 8124 byte temporary memory buffers
used by each database session. As individual sessions access tables, temporary memory
buffers are created to store information. You can limit the amount of memory each ses-
sion is allowed to consume using this value. Again, this is a performance feature that can
be experimented with in your specific PostgreSQL environment.
max_prepared_transactions = 5
This feature limits the amount of memory used by internal sort and hashing func-
tions in the entire PostgreSQL system. This value is in kilobytes, so the default value of
1024 kilobytes is 1MB of memory space. If a sort or hash function exceeds this limit, data
is swapped out to temporary files to complete the process. This is extremely costly to
database performance, and is a crucial element in performance tuning (see Chapter 10).
maintenance_work_mem = 16384
Sets the amount of memory (in kilobytes) allowed for internal PostgreSQL database
maintenance operations. These functions include creating table indexes, removing de-
leted records, and altering existing tables. This value is in kilobytes, so the default value
is 16MB.
max_stack_depth = 2048
This feature sets the size of the PostgreSQL server’s execution stack level. This limits
the number of processes that can wait for execution. The value is in kilobytes, so the
default value is 2MB.
max_fsm_pages = 20000
max_fsm_relations = 1000
60 PostgreSQL 8 for Windows
preload_libraries = ''
This feature specifies a list of libraries that are preloaded into the PostgreSQL server
at startup. The default is not to load any libraries.
The next group of features in the Resource Usage section deals with database vacu-
uming. The PostgreSQL system utilizes a feature called vacuuming to remove deleted
records. Vacuuming the database is described in detail a little later in the “Autovacuum
Parameters” section. When the PostgreSQL system initiates a vacuum process, the sys-
tem tracks how many system resources are consumed by the vacuum. You can control
how many resources the system allows the vacuum process to take before limiting the
process by using these feature values:
The PostgreSQL background writer process ensures that memory buffers containing
transaction data are written to the hard disks as soon as possible. This group of feature
values can be fine-tuned by advanced administrators to determine exactly how often the
memory buffers are written.
fsync = on
This feature ensures that each individual transaction is written to disk immediately
instead of being stored in the WAL. Keeping this feature enabled results in decreased
database performance, but ensures data durability and security. Disabling this feature
allows multiple transactions to accumulate in the WAL transaction log files before being
written to disk.
wal_sync_method = fsync
This value determines the method PostgreSQL uses to write data to disk. This value
is set by the PostgreSQL installer and is optimized for the operating system that Post-
greSQL is installed on. The other values listed under this feature in the configuration file
are used for other operating systems that PostgreSQL can run on. You do not need to
change these values for a Windows environment.
full_page_writes = on
Determines if PostgreSQL writes data to the WAL transaction log files based on a
single database record or an entire page of records.
wal_buffers = 8
The number of 8124-byte disk page buffers allocated in shared memory for the WAL.
commit_delay = 0
62 PostgreSQL 8 for Windows
The minimum number of open concurrent transactions before starting the commit_
delay timer.
checkpoint_segments = 3
The maximum number of 16MB log file segments between writing the WAL log files
to the database. Checkpoints are points in the transaction process when PostgreSQL stores
all of the data in the WAL to disk. After the checkpoint, it is guaranteed that the data in
the WAL is safely in the database files. This value means that for every three segments
(48MB) of data, PostgreSQL will force the WAL to be saved to disk.
checkpoint_timeout = 300
The time in seconds between writing the WAL to the database. If the checkpoint_
segment value hasn’t been met within this timeframe, the WAL is written to disk.
checkpoint_warning = 30
PostgreSQL writes a message to the system log file if the WAL segment files get filled
within the defined value (in seconds).
archive_command = ''
enable_bitmapscan = on
enable_hashagg = on
enable_hashjoin = on
enable_indexscan = on
Chapter 3: The PostgreSQL Files and Programs 63
enable_mergejoin = on
enable_nestloop = on
enable_seqscan = on
enable_sort = on
enable_tidscan = on
effective_cache_size = 1000
random_page_cost = 4
cpu_tuple_cost = 0.01
cpu_index_tuple_cost = 0.001
cpu_operator_cost = 0.0025
geqo = on
geqo_threshold = 12
geqo_effort = 5
geqo_pool_size = 0
geqo_generations = 0
geqo_selection_bias = 2.0
default_statistics_target = 10
constraint_exclusion = off
from_collapse_limit = 8
join_collapse_limit = 8
log_destination = 'stderr'
This value sets the location to which PostgreSQL sends log messages. The possible
entries for this value can only be the standard error file (stderr), the standard log (sys-
log) file, or in the case of Windows, the system eventlog. The Windows eventlog can
be viewed from the Windows Computer Manager window (discussed in Chapter 2).
redirect_stderr = on
Enabling this feature allows you to redirect log messages sent to stderr to an alterna-
tive location, such as a log file. This feature is enabled by default, to create a separate log
file for PostgreSQL.
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
64 PostgreSQL 8 for Windows
The filename the log file will be redirected to. The filename can use wildcard char-
acters to use the current year (%Y), month (%m), day (%d), hour (%H), minute (%M), and
section (%S).
log_truncate_on_rotation = off
When this feature is enabled, PostgreSQL will overwrite an existing log file using the
same log filename when PostgreSQL is restarted. When disabled, existing log files will
be appended with the new log messages.
log_rotation_age = 1440
Sets the time (in minutes) when a new log file will automatically be started. Depend-
ing on the log_filename feature, the new log file may be a separate file (such as if the
filename uses a timestamp). If the new log filename is the same name, depending on the
log_truncate_on_rotation feature, it may or may not overwrite the existing log
file.
log_rotation_size = 10240
Sets the size (in kilobytes) when a new log file will be started.
syslog_facility = 'LOCAL0'
Defines the system facility used for logging log files. The LOCAL0 value instructs
PostgreSQL to use the syslog (or Windows eventlog) feature.
syslog_ident = 'postgres'
Identifies the username used when logging messages to the syslog (or the Windows
eventlog).
client_min_messages = notice
Sets the minimum level of log messages logged by client connections. All messages
with this severity or higher (see Table 3-3) are logged.
log_min_messages = notice
Sets the minimum level of log messages logged by the PostgreSQL server. All mes-
sages with this severity or higher (see Table 3-3) are logged.
log_error_verbosity = default
Sets the amount of detail logged to the system log for each message. The other values
are terse, for shortened messages, and verbose, for longer messages.
log_min_error_statement = panic
Chapter 3: The PostgreSQL Files and Programs 65
Sets the severity of errors in SQL statements that are logged. The hierarchy of error
messages was shown earlier in Table 3-3.
log_min_duration_statement = -1
Logs the SQL statement that produced the error and its duration to the log file. If set
to 0, all SQL statements are logged. If set to -1, no SQL statements are logged.
silent_mode = off
If enabled, this feature allows the PostgreSQL server to run as a background process
with no standard output or error connections. All messages sent to the logging facilities
are ignored.
debug_print_parse = off
debug_print_rewritten = off
debug_print_plan = off
debug_pretty_print = off
The debug family of features determines how debugging output is handled. These
features, described in Chapter 10, are used by developers.
log_connections = off
This feature enables PostgreSQL to log each client connection made to the system.
log_disconnections = off
This feature logs each time a client disconnects from the system.
log_duration = off
Sets the first identifier on the log line. The default value (%t) uses the timestamp
of the message as the first identifier on the log line. Other options that you can use are
username (%u), database name (%d), remote hostname or IP address and port (%r), re-
mote hostname or IP address only(%h), the process ID (%p), timestamp with milliseconds
(%m), command tag (%i), session ID (%c), log line number (%l), session start timestamp
(%s), transaction ID (%x), and no identifier (%q). Any combination of these values can be
used.
log_statement = 'none'
Determines what SQL statements are logged in the system log file. The default is to
log no SQL statements. You can also use ddl to log data definition SQL statements, such
66 PostgreSQL 8 for Windows
as CREATE, ALTER, and DROP, mod to log data definition SQL statements and data modi-
fication SQL statements such as INSERT, UPDATE, DELETE, and TRUNCATE, and all,
which logs all SQL statements.
log_hostname = off
This feature allows you to record the hostname of all client connections in the system
log file.
log_parser_stats = off
log_planner_stats = off
log_executor_stats = off
log_statement_stats = off
stats_start_collector = on
stats_command_string = off
stats_block_level = off
stats_row_level = on
stats_reset_on_server_start = off
autovacuum = on
This value enables or disables (off) the database autovacuum feature within
PostgreSQL.
autovacuum_naptime = 60
Chapter 3: The PostgreSQL Files and Programs 67
This value sets the amount of time (in seconds) that PostgreSQL will run the auto-
vacuum feature.
autovacuum_vacuum_threshold = 1000
autovacuum_analyze_threshold = 500
autovacuum_vacuum_scale_factor = 0.4
autovacuum_analyze_scale_factor = 0.2
autovacuum_vacuum_cost_delay = -1
autovacuum_vacuum_cost_limit = -1
Once the autovacuum feature is enabled, these features control when tables are au-
tovacuumed, based on the number of transactions (vacuum threshold), or when tables
are analyzed, based on the number of transactions (analyze threshold). If any of these pa-
rameters are met before the configured autovacuum_naptime value, the autovacuum
is performed.
search_path = '$user,public'
Defines the default order in which database schemas (described in Chapter 4) are
searched when an object is referenced without a schema. By default, the user’s own
schema is searched first, followed by the public schema.
default_tablespace = ''
This value specifies the default tablespace (described in Chapter 4) where Post-
greSQL creates objects. The default value is for PostgreSQL to use the tablespace where
the database is located.
check_function_bodies = on
This value instructs PostgreSQL to check new functions as they are created by users
to ensure the functions will work within the database tables.
This value defines the default isolation level used by PostgreSQL for transactions
(described in Chapter 7).
default_transaction_read_only = off
statement_timeout = 0
68 PostgreSQL 8 for Windows
This value provides the time, in milliseconds, that PostgreSQL allows an SQL state-
ment to run. The default value of 0 disables this feature, which allows SQL statements to
take as long as they need to process.
The next group of features defines how times, dates, and characters are formatted
within PostgreSQL.
datestyle = 'iso, mdy'
This value specifies the format used to display dates. The default value displays
dates using the ISO format, using the month/day/year style. Other styles available are
DMY and YMD.
timezone = unknown
This value sets the default time zone for displaying times. The default of unknown
forces PostgreSQL to use the system settings.
australian_timezones = off
When enabled, PostgreSQL interprets the ACST, CST, EST, and SAT time zones as
Australian instead of North American.
extra_float_digits = 0
This value specifies the number of extra digits displayed for floating-point
numbers.
client_encoding = sql_ascii
This value specifies the client-side data encoding method used by the PostgreSQL
client. The default value is the database encoding.
lc_messages = 'C'
lc_monetary = 'C'
lc_numeric = 'C'
lc_time = 'C'
These features set the language locale used for messages, money values, number
formatting, and time formatting. The default is set in the PostgreSQL installer program
(see Chapter 2). The C value formats all values using the ISO C format, which assumes
that the local operating system controls language formatting.
deadlock_timeout = 1000
Chapter 3: The PostgreSQL Files and Programs 69
This feature defines the time in milliseconds for PostgreSQL to wait for a record lock
before checking for a deadlock situation (when two or more processes attempt to access
the same record at the same time). If a deadlock situation occurs, PostgreSQL initiates the
Multiversion Concurrency Control (MVCC) feature to resolve the deadlock. Implement-
ing this feature too early can negatively impact database performance.
max_locks_per_transaction = 64
This value sets the number of locked records allowed per transaction per client.
When this value is enabled, tables referenced by a query are automatically added to
the FROM SQL clause if not already present. While this behavior is not standard SQL, this
feature was present in previous PostgreSQL versions.
backslash_quote = safe_encoding
This value controls how the PostgreSQL server handles backslashes in client con-
nections. This feature was recently modified due to a security exploit. When the feature
is disabled (off), statements with backslashes are always rejected. When the feature
is enabled (on), statements with backslashes are always allowed. The default value of
safe_encoding allows backslashes if the client also supports an encoding that allows
backslashes.
default_with_oids = off
This value controls whether a CREATE TABLE or CREATE TABLE AS statement in-
cludes an object ID (OID) column by default. In PostgreSQL versions previous to 8.1, this
was enabled by default. If you reference a table field as a foreign key in another table (see
Chapter 4), the referenced table must be created with an OID. However, not all tables
need to be created with an OID column.
escape_string_warning = off
When this value is enabled, PostgreSQL writes a warning message to the system log
file when a SQL statement contains a backslash character.
regex_flavor = advanced
sql_inheritance = on
70 PostgreSQL 8 for Windows
These two features control the behavior of the PostgreSQL system to match previ-
ous versions. The regex feature changed in PostgreSQL 7.4. If you need to run applica-
tions from a previous version of PostgreSQL, you may need to set this value to either
basic or extended. The table inheritance feature of PostgreSQL (discussed in Chapter 1)
changed significantly in version 7.4. If you need to run applications created in a previous
version of PostgreSQL, you will have to disable this feature.
transform_null_equals = off
This feature is extremely important if you are accessing your PostgreSQL database
from Microsoft Access (see Chapter 12). Microsoft Access uses the NULL value in queries
somewhat differently than the standard SQL specifications. If you are building queries
using Access, you will need to enable this feature.
custom_variable_classes = ''
This feature allows you to define new variables for special classes within PostgreSQL.
A comma-separated list of the new classes must be provided to this feature. Variables
within the class are identified as classname.variable. This is an advanced feature that is not
normally used by PostgreSQL users.
Each line in the pg_hba.conf file contains a separate definition (record) controlling
access to the system. Multiple records can be created to define different access types and
categories to the system. For a remote client to gain access to the PostgreSQL system, the
conditions of at least one record must be met. If multiple records match a specific client,
only the first matching record in the configuration file is used for authentication.
The format of a pg_hba.conf record is
Connection-type Field
The connection-type field defines the method used by the client to connect to the PostgreSQL
system. There are four types of connections that are supported by PostgreSQL:
Remember that to allow remote clients to connect, you must have enabled the Lis-
ten on All Addresses feature during the PostgreSQL installation (see Chapter 2). If you
did not do that, and now want to allow remote clients to connect, you must change the
listen_addresses entry in the postgresql.conf configuration file, discussed later
in this chapter.
Database Field
The database field defines which PostgreSQL databases the record controls access to.
There are several formats that can be used for this field:
The record only applies to the databases specified in this field. Clients matching this
record will not be given access to any nonmatching database in the system, unless speci-
fied in a separate record in the configuration file.
User Field
The user field defines which PostgreSQL user accounts are controlled by this record. Re-
member, this applies to the PostgreSQL username, and not to the logged-in username on
the operating system. Similar to the database field, there are a few different formats you
can use for this field:
X A single username
R A comma-separated list of usernames
72 PostgreSQL 8 for Windows
R A role name, preceded by a plus sign (+), to enable all users who are members of
the role
W A filename, preceded by an at sign (@), to specify a file containing a text list of
usernames
The user accounts listed in the record are allowed access to the databases listed in the
record. This does not override normal PostgreSQL database security on tables (discussed
in Chapter 4) once a client is connected to the database, but provides a front-end access
control method for remote clients to gain initial entry into the database.
Network-address Field
The network-address field defines which network hosts clients are allowed to connect to
the PostgreSQL system from. This is a handy way to restrict access to your system to
clients on your local network, or even to just one specific host on your network.
The network address cannot be written using a host’s domain name. The network-
address field must be entered in one of two formats:
The old method of specifying host addresses is the host address/subnet mask pair
method. This uses two separate values: a dotted-decimal host or network address, and a
dotted-decimal subnet mask address. The subnet mask defines how many bits of the ad-
dress are used to define the network address and host address. For example, the subnet
mask 255.255.255.255 is used to define a single host on the network. The subnet mask
255.255.255.0 defines a subnet of hosts that uses the fourth octet in the network address
for the host address.
The CIDR format is becoming more popular in the networking world. This method
uses a host or network address in dotted-decimal notation, along with a single integer
value to define the number of bits enabled in the subnet mask. For example, the CIDR
address 192.168.0.10/32 indicates the host at address 192.168.0.10, using all 32 bits for
the subnet mask, or 255.255.255.255. This defines a single host address on the network.
The CIDR address 192.168.0.0/27 indicates all of the hosts on subnet 192.168.0.0, using
subnet mask 255.255.255.0. This provides for any host whose address is from 192.168.0.1
to 192.168.0.254 to connect to the databases listed in the record.
Login-method Field
The login-method field defines the method the client must use to authenticate with the
PostgreSQL system. PostgreSQL supports quite a few different login methods. Unfor-
tunately, though, some of the supported methods are only available on Unix platforms.
The values available on Windows installations are as follows:
X trust
R reject
Chapter 3: The PostgreSQL Files and Programs 73
R ident
R password
R md5
R crypt
W krb5
Obviously, if you are working on an open network such as the Internet, it is best to
use one of the encrypted authentication methods (unless you are connecting using an
SSL connection). If you are just experimenting on your own home network or internal
corporate network, the password authentication method works just fine.
The krb5 authentication method uses secure Kerberos technology to send an en-
crypted password key between the client and PostgreSQL server. This is the most secure
method of authentication.
Options Field
The options field provides an area for options to be supplied when using the ident au-
thentication method. The keyword sameuser is used to bypass the pg_ident.conf
configuration file and pass the received client userid to PostgreSQL as the logged on user
account. This assumes that PostgreSQL user account names match the userids assigned
to users by the remote host.
74 PostgreSQL 8 for Windows
If this is not the case, a map name can be specified as the option to point to a specific
entry class in the pg_ident.conf configuration file. PostgreSQL matches the remote
client userid within the map name class to a PostgreSQL user account defined in the
configuration file. The upcoming section “The pg_ident.conf File” explains this process
in more detail.
This entry applies to all host connections originating from the local loopback address
(127.0.0.1) of the PostgreSQL server for all users connecting to all databases. Basically,
this applies to every time you connect to the PostgreSQL server from the local system.
This record specifies that these types of connections must use the md5 authentication
method to authenticate PostgreSQL user accounts. Thus, you must supply an appropri-
ate password when you try to log into the PostgreSQL system from the local server.
Here is another sample record that can be created:
This record only allows the postgres user account to connect to any database from
the single network host address of 192.168.0.10. The postgres user account will not be
allowed to log in from any other network address.
Finally, take a look at this example:
This record allows any client on the local 192.168.1.0 subnetwork to connect as any
user to any database using plain-text authentication. Of course, once the client connec-
tion is made, PostgreSQL database restrictions still apply for database access (discussed
in Chapter 4). Again, be very careful when using the password authentication method,
as it can be susceptible to network snooping.
The map-name field contains the name associated with the pg_hba.conf ident field
option. This allows you to set up different maps if you have users accessing the system
using the same userids from different remote host systems.
The ident-name field contains the userid that is passed from the client system to Post-
greSQL in the connection. The account is mapped to the specific PostgreSQL user ac-
count specified in the record.
As an example, assume you have the following record in your pg_hba.conf file:
host all all 192.168.0.10/32 ident testhost
All users from the host 192.168.0.10 will have access to all PostgreSQL databases. User
accounts from this host are mapped to PostgreSQL user accounts using the testhost
ident mapping. Now, we need to look at how the pg_ident.conf file maps these user
accounts. Assume you have the following records in your pg_indent.conf file:
testhost rich richard
testhost mike michael
testhost dan daniel
When the user rich connects from the host 192.168.0.10, he is automatically mapped
to the richard PostgreSQL user account on the system. The same process happens for
users mike and dan. If any user other than rich, mike, or dan attempts to connect from
this host, they will be denied access to the PostgreSQL system.
PROGRAMS
All of the PostgreSQL main program files are located in the bin directory. While most Unix
administrators live and die by these utilities, Windows administrators will want to use
the graphical tools available in the pgAdmin III application (discussed next in Chapter 4).
However, even in Windows it is sometimes easier to just use a simple command-line pro-
gram than to start up a full Windows program just to perform a single function.
To run the PostgreSQL utilities, you either need to be in the PostgreSQL bin direc-
tory or set your Windows PATH environment variable to include this directory. The easi-
est method to run these programs is to use the Command Prompt item in the PostgreSQL
menu (Start | Programs | PostgreSQL 8.2 | Command Prompt). This creates a command
prompt window that defaults to the bin directory.
This section describes some of the more popular PostgreSQL command-line utilities
that are available, and demonstrates how to use them.
pg_config
The pg_config program provides a quick way to see the current configuration val-
ues on the running PostgreSQL system. These are not the configuration values that are
defined in the three configuration files shown earlier. Instead, these are configuration
values that were used to compile and install the PostgreSQL package.
pg_ctl
The pg_ctl program is used to control the PostgreSQL system. It is used to stop, start, or
reload the configuration files for PostgreSQL. To perform one of these functions, though,
you must specify the database cluster area used by PostgreSQL using the -D command-
line option:
C:\>
Notice that if the database cluster pathname includes spaces, you must enclose the
pathname with double quotes. To start the database, you must either be logged in as a
user account without administrator privileges or use the Windows runas command to
run as another user:
C:\>runas /user:postgres "pg_ctl start -D \"c:\Program Files\PostgreSQL
\8.2\data\""
Enter the password for postgres:
Attempting to start pg_ctl start -D "c:\Program Files\PostgreSQL\8.2\data"
as user "EZEKIEL\postgres" ...
C:\>
If you use the runas command to run pg_ctl, the command line gets even more
complicated. Since the runas command line includes spaces, you must enclose the en-
tire command in double quotes. However, the pg_ctl command also requires double
quotes. As you can see in the preceding code, the solution is to use the backslash charac-
ter (\) to escape out the double quotes required for the pg_ctl command.
pg_dump
The pg_dump program provides an easy way for you to dump (or back up) the contents
of a database on the PostgreSQL system to a file.
The pg_dump program can output the dump in one of two dump file formats:
X Script
W Archived
Chapter 3: The PostgreSQL Files and Programs 77
The Script dump format creates a text file that contains SQL statements that will re-
create the database and insert the appropriate data. The Script dump file can be run on
any system that can read SQL statements, such as the psql program described in the
next section.
The Archived dump format creates a compressed binary file of the database that can
only be restored using the pg_restore program, discussed a bit later in the chapter.
There are lots of command-line options used to control how pg_dump works. The -F
option is used to specify the backup type (c for compressed binary, t for uncompressed
binary, or p for plain SQL). You can specify the filename to write the backup to using the
-f command-line option. To produce a compressed backup file of the database test,
you would use the following command:
C:\>pg_dump test -f test.backup -Fc -U postgres
Password:
C:\>dir *.backup
As you can see from the example, you can also specify the user account to use for the
backup by using the -U option. The pg_dump program does not provide any status, it
just quietly performs the backup and exits. If you want to see what it is doing, you can
use the verbose option, -v.
pg_dumpall
The pg_dumpall program is similar to the pg_dump program, except it dumps all of the
databases in a PostgreSQL database cluster to a file that can later be used to restore the
entire PostgreSQL system. This also includes all system tables and user accounts, so you
must log in as the PostgreSQL superuser account to run this program.
By default, the pg_dumpall utility produces the SQL statements necessary to re-create
the entire system, and sends them to the standard output (the console window display).
You can redirect this output to a file by using the standard Windows redirect symbol:
C:\>pg_dumpall -U postgres > backup.sql
When you run this command, it will ask several times for the postgres user pass-
word (each time it connects to a different database). This command creates the file
backup.sql, which contains SQL statements for creating the user accounts, databases,
tables, and all other database objects in your PostgreSQL system.
It is recommended that you run the pg_dump utility on a regular basis, and copy the
resulting backup file to a removable storage medium. The PostgreSQL system allows you
to run the pg_dump program without having to stop the database. It manages all user
transactions during the backup process, and applies them when the backup is complete.
The backup file produced by the pg_dumpall program is in Script format. The da-
tabase can be restored using a standard SQL command-prompt interface, such as psql.
You must also be logged into the system as the superuser account to restore database
and user objects.
78 PostgreSQL 8 for Windows
pg_restore
After using the pg_dump program to dump a database to an archived dump format
file, reason would have it that at some point you may need to restore that data. The
pg_restore program is just the tool for that job. It allows you to restore a database
from a file created by the pg_dump program.
The pg_restore program also provides command-line options that allow you to
select which parts of a total database dump you want to restore.
postmaster
The postmaster program is the main program that controls the PostgreSQL system. It
must be running in the background at all times for anyone to be able to access data in the
PostgreSQL database cluster. All applications (both local and remote) must connect to
the postmaster program to interface with the database. In the Windows implementa-
tion of PostgreSQL, the PostgreSQL installer can configure the postmaster program to
run as a Windows service to ensure that it is always running on the host system.
Each instance of the postmaster program references a single database cluster area.
It is possible to have two or more postmaster programs running on the same server,
each referencing different database cluster areas.
postgres
The postgres program is the database engine part of PostgreSQL. The postmaster
program spawns multiple copies of the postgres program to handle database queries.
As seen in Chapter 2, when you look at the running processes on your Windows system,
you will see one copy of the postmaster program and several copies of the postgres
program running.
It is possible to use the postgres program on the command line as a stand-alone
program to query the database, but this is not an easy task. With the psql program avail-
able, you should never need to use postgres from the command prompt.
Command Description
clusterdb Reclusters tables in a database
createdb Creates a new database
createlang Adds a programming language to a database
createuser Creates a new user in the PostgreSQL system
Dropdb Deletes an existing database
droplang Removes an existing programming language from a database
reindexdb Rebuilds indexes in the database
vacuumdb Removes deleted records from a database, and generates statistics
PostgreSQL Applications
Finally, there are two special applications that are included in the PostgreSQL installa-
tion. These applications help you interface with the PostgreSQL server and provide an
easy way for you to administer and use your databases.
psql
The psql application provides a command-line interface to the PostgreSQL system.
From here you can execute standard SQL commands, as well as special psql commands
used for querying the database to see what tables, indexes, and other features are avail-
able in the database. The psql program is examined in detail in Chapter 5.
pgAdmin III
The pgAdmin III application is a program that provides a fancy graphical interface for
administering a PostgreSQL system. The pgAdmin III program is a separately devel-
oped Open Source application. The home page for pgAdmin III is www.pgadmin.org.
This application allows you to perform any database function from a graphical front
end. You can add or remove databases, users, tables, and tablespaces easily using the
graphical icons. The pgAdmin III program also includes a SQL command interface, al-
lowing you to enter SQL queries and view the results. It is the Swiss Army knife for
PostgreSQL systems. Chapter 4 covers the pgAdmin III program and how to use it to
completely manage your PostgreSQL system.
80 PostgreSQL 8 for Windows
SUMMARY
This chapter covered a lot of ground. The PostgreSQL installation includes lots of files
and directories to support all of its functions. The main PostgreSQL directory is located
by default at C:\Program Files\Postgresql. Each update release has its own di-
rectory under the main directory. Within the update release directory, PostgreSQL di-
vides files into several directories. The data directory contains the database cluster files
and directories necessary for the database engine. The data directory also contains the
PostgreSQL configuration files. The postgresql.conf configuration file defines all
of the features available in the PostgreSQL system. You can edit this file to fine-tune
your PostgreSQL installation. The pg_hba.conf configuration file allows you to define
which PostgreSQL users can log in from remote workstations, as well as control which
databases they have access to. The bin directory contains the utilities and programs that
you use to interact with PostgreSQL. There are several SQL wrapper commands that pro-
vide command-prompt programs that implement standard SQL commands. This allows
you to execute SQL commands without having to use a SQL interface.
The next chapter discusses the pgAdmin III application, which allows you to per-
form all of your database administration functions from a graphical interface. You will
see that you can do all of your administration work from a single interface, making your
life much easier.
4
Managing PostgreSQL
on Windows
81
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
82 PostgreSQL 8 for Windows
T
he previous chapter showed you where all of the database, configuration, and
program files are located on your PostgreSQL system. It is now time to dive in and
start working with the actual PostgreSQL system. This chapter walks you through
the basic functions required to manage your PostgreSQL server. The pgAdmin III
program provides an easy interface for you to perform all of the management functions
required to keep your PostgreSQL server running. There are many pieces to manage
within a PostgreSQL system, and trying to keep track of them all can be a challenge. This
chapter walks you through each of the parts in the system, and shows how to manage
them using pgAdmin III.
PostgreSQL server. The top-right frame shows detailed configuration values of the object
currently selected in the left frame. The lower-right frame shows the SQL code used to
create the object currently selected in the left frame.
There are lots of objects contained in the PostgreSQL system. To help you manage
all of these objects, pgAdmin III divides the objects into categories. The next section
describes the objects in the PostgreSQL system based on the category they are displayed
under.
contains all of the files necessary to run the database. Now it is time to see what these
files contain.
There are five basic components that make up the PostgreSQL server (which are list-
ed in the directory on the left side of pgAdmin III, as shown in Figure 4-2):
X Tablespaces
R Databases
R Schemas (listed under each individual database)
R Group Roles
W Login Roles
Before you dive into managing your PostgreSQL server, you need to be familiar with
each of these components. This section walks through each of these components, de-
scribing how they interact to form the PostgreSQL server.
Chapter 4: Managing PostgreSQL on Windows 85
Tablespaces
Tablespaces are the physical locations where objects are stored. Objects can be anything
from database tables, indexes, functions, and triggers.
The database cluster area you saw while examining files in Chapter 3 is the default
location where PostgreSQL creates its tablespaces. Remember, by default this location
is c:\Program Files\Postgresql\8.2\data, but you can create the default Data
Dictionary in a different location when you install PostgreSQL (described in Chapter 2).
When you initialized the PostgreSQL system, either when you installed it from the
PostgreSQL installer or initialized it using the initdb command, it created two default
tablespaces within the database cluster area:
X pg_default
W pg_global
When you expand the Tablespaces object in the directory on the left side of pgAdmin
III, you see both of these tablespaces listed. The pg_default tablespace is the default
location for all database objects created on the PostgreSQL system. When you create new
database objects, you should use the pg_default tablespace area to store them. You can
store as many database objects as you want in a single tablespace (provided you have
enough disk space available).
The pg_global tablespace is used to hold PostgreSQL system catalogs. The system
catalogs contain internal Data Dictionary information for the PostgreSQL system to op-
erate. You should not create new database objects within this tablespace.
You can also create new tablespace areas within the PostgreSQL system. Often, da-
tabase administrators create new tablespaces on separate hard disk devices to distribute
the database load between disk systems. When new database objects are created, you
must specify which tablespace area they are stored in. It is possible (and sometimes even
beneficial) to store different objects within the same database in different tablespaces.
The pgAdmin III program provides an easy interface for creating new tablespaces.
Before you can create a new tablespace, though, you must set up its physical location on
your system.
The new tablespace must point to an empty directory on the system. If there are any
files currently located in the directory, PostgreSQL will refuse to create the tablespace.
Also, the PostgreSQL system account (called postgres by default) must have permis-
sion to write to the directory. Remember from Chapter 2 that the system account is cre-
ated as a normal user account for security purposes. Most likely you will have to specifi-
cally grant the postgres user account access to the directory. Unfortunately, Windows
XP and Vista Home Editions do not provide a method for you to grant special permis-
sions to directories for user accounts. You have to use the location specified during the
installation. The other Windows operating systems supported by PostgreSQL allow you
to use the Security property of a directory to set privileges for users on the directory. You
can view the Security property of a directory by right-clicking the directory in Windows
Explorer, and selecting Properties.
86 PostgreSQL 8 for Windows
After creating the directory and assigning the postgres account permissions, right-
click the Tablespaces object in pgAdmin III and select New Tablespace from the menu.
The New Tablespace window appears, providing an interface for you to create the new
tablespace. In the form, you must specify a name for the new tablespace, the location
(directory) where the new tablespace will be located, and the user account that will be
the owner of the tablespace. After you create the new tablespace, you can use it to store
new database objects.
Databases
Databases are the core objects in PostgreSQL. They hold all of the data objects used in
PostgreSQL. When a user account connects to a PostgreSQL server, it connects to a da-
tabase object, and has access to all the objects contained within the database, restricted
by the privileges that are granted on the objects. A user account can only access objects
within a single database in a single connection. You cannot access objects in two separate
databases in one connection. Of course, there is no restriction as to how many database
connections a single user can have within an application at the same time, but each indi-
vidual connection can only be connected to a single database.
A PostgreSQL system can (and usually does) have multiple databases defined on
the system. The default database created during the PostgreSQL installation (or when
you use the initdb command) is called postgres. The postgres database contains the
default system tables for handling the internal PostgreSQL Data Dictionary. These tables
are not shown in pgAdmin III, but can be accessed via SQL queries.
There are two additional databases that are configured by default in PostgreSQL, but
not shown in pgAdmin III: template0 and template1. These are, as their names sug-
gest, generic templates that are used to create new databases. Values assigned to these
templates (such as the tablespace location and database owner) are assigned to new da-
tabases created using the template. As you will see in the “Creating a New Database”
section later in this chapter, the pgAdmin III interface allows you to choose any database
to base a newly created database on.
The difference between template0 and template1 is that template1 can be
modified. If you need to create lots of database objects with certain features or objects,
you can modify the template1 database template to include the features and objects,
then use it as the master template to create all of your new databases. Each new data-
base will have the same features and objects as the template1 database template. The
template0 database template cannot be modified and always describes a generic Post-
greSQL database object. This template uses the pg_default tablespace, and any new
databases created with this template will always use the pg_default tablespace.
As you can see from the pgAdmin III display (shown in Figure 4-2), each database
object contains four types of objects:
X Casts
R Languages
Chapter 4: Managing PostgreSQL on Windows 87
R Replications
W Schemas
Casts allow you to specify conversions from one data type to another. Once a cast is
defined, it is active for the entire database and can be used in any table object defined in
the database. Casts are often used to create unique data types for a database. By default
there are no casts defined in the database.
Languages define procedural programming languages that can be used for functions
and triggers within the database. In Chapter 2, you had the opportunity to install new
procedural languages at installation time.
Replications define copies (or replicas) of the PostgreSQL database in a fault-tolerant
operation. PostgreSQL uses an add-on package called Slony-I to control database replicas
distributed among remote servers. This is not part of the standard PostgreSQL package.
Schemas are the most important objects within the database. As you can see from
Figure 4-2, a schema contains the tables, triggers, functions, views, and other objects for
handling data in the database. Table 4-1 describes the different objects contained within
a schema.
A database can contain several different schemas, each with its own set of tables,
triggers, functions, and views. While users can only access objects within one database
at a time, they can access all of the schemas within that database (restricted by assigned
privileges). Sometimes related applications can share the same database, but use differ-
ent schemas to hold their separate data. This makes it easier for users to find data tables
related to the applications within the database. This is especially true if tables have the
same names.
Table names must be unique within a schema, but can be duplicated between sche-
mas. Tables are referenced in SQL statements using the format:
schemaname.tablename
This format specifies exactly which table in which schema is being accessed. Depend-
ing on your naming conventions, this format can become quite cumbersome. However,
PostgreSQL provides a shortcut for us.
Much like a Windows PATH environment variable, PostgreSQL uses the search_
path variable for defining default schema names. If you specify a table name alone
within a SQL statement, PostgreSQL attempts to find the table by searching the schemas
in your search_path. In Chapter 6, you will see how you can set your search_path
variable for your particular schema configuration.
The schema objects created in a new database are based on the template used to cre-
ate the database. By default, the template0 and template1 templates contain a sche-
ma called public. By default, all user accounts that have permission to a database have
permission to create and access objects in the public schema. Most applications do not
use the public schema, but instead create their own schemas to control data used in the
application. You will see how to do that in the “Creating a New Schema” section later on
in this chapter.
Group Roles
Group Roles are used to create access permissions for groups of users. While you can grant
an individual user account access directly to a database object, the preferred method is
to use Group Roles (in fact, pgAdmin III only allows you to grant Group Roles access
to database objects). A Group Role is not allowed to log into the PostgreSQL server, but
controls access for user accounts that do log in.
Group Roles are defined at the PostgreSQL server level and used by all databases
controlled by the PostgreSQL server. By default, there is one Group Role configured in
PostgreSQL. The public group role applies to all users on the PostgreSQL system. You are
not able to remove any user account from the public Group Role. Because of this, the
public Group Role does not appear in the pgAdmin III Group Roles listing.
You can create your own Group Roles using pgAdmin III to control access to your
application schemas and tables. This process is described in detail in the “Working with
User Accounts” section later in this chapter.
Chapter 4: Managing PostgreSQL on Windows 89
Login Roles
Login Roles are roles that are allowed to log into the PostgreSQL server. They are also
known as user accounts. The Login Roles section is where you define accounts for indi-
vidual users for the PostgreSQL system.
Each database user should have an individual account for logging into the Post-
greSQL system. That account is then assigned as a member of the appropriate Group
Roles that grant privileges to the database objects required. In a large database environ-
ment, this allows you to easily change access for database objects without having to
touch hundreds (or even thousands) of individual user Login Roles.
To control access to the data, we will create two Group Roles. The Salesman Group
Role will be given write permission on the Customer and Order tables, but only read
permission on the Product table. The Accountant Group Role will be given write permis-
sion on the Product and Order tables, and read permission on the Customer table. To
round out our application, we will create two Login Roles. Our store will consist of a
salesman, called Barney, and an accountant, called Fred. The following sections create
our new store application.
The New Database window contains four tabs for entering information for the new
database. The Properties tab contains a form for entering the required information to
create the new database. The Variables tab is a generic tab used in pgAdmin III, but does
not apply to the New Database function.
By default, only the database owner will be able to create or modify objects in the
database. The Privileges tab allows you to assign privileges to other Group Roles config-
ured in the PostgreSQL server.
The SQL tab shows the SQL statement generated by your selected options in the
other tabs to create the database. Creating a new database uses the CREATE DATABASE
SQL statement. pgAdmin III fills in the appropriate statement parameters for you auto-
matically based on your entries in the configuration tabs.
Go to the Properties tab and assign a name to your new database (for this example, I
use the database name test). You must select an owner for your database. At this point,
you might not have any users other than the postgres superuser created on your server,
Chapter 4: Managing PostgreSQL on Windows 91
so you can make that user the owner of the database. Remember, the database owner
has complete control over all objects in the database. You never want to make a normal
database user owner of the entire application database.
After entering the database owner, you must select the encoding to use for the data-
base. By default, pgAdmin III uses the SQL_ASCII encoding. The ASCII character set is
an old 7-bit character set that has been around almost since the dawn of computers. This
value has worked fine for simple, English-language databases. However, if you plan on
creating tables that contain advanced data types or using languages with more compli-
cated character sets, the 7-bit ASCII character set will not work. You need to use a full
8-bit character encoding.
In the past this has been a problem with running PostgreSQL on Windows. However,
since version 8.1, PostgreSQL supports the UTF8 encoding format on Windows systems.
If you are not sure what type of data your application will ultimately hold, it is always
safest to use the UTF8 encoding.
The next setting on Properties tab is the template to use to create the new database.
If this is the first database you have created, all that will be available to you are the tem-
plate0 and template1 templates. As you create new databases, pgAdmin III provides
those to be used as templates as well. For now you can use the template1 template to
create your new database.
After the template choice comes the tablespace. In most situations, you should use
the pg_default tablespace for your new database. Although pgAdmin III offers the
pg_global tablespace as an option, you should never create new objects in the pg_
global tablespace. It is generally not a good idea to mix your application objects with
the system objects. If you had created a new tablespace on your system, it would also
appear in the list of options available.
The Comment section is another area that does not apply to creating databases. Post-
greSQL allows you to add comments to many objects when they are created, but data-
bases are not one of them.
After filling out the Properties tab, look at the SQL tab. It contains the complete SQL
statement used to create the database. This is shown in Figure 4-4.
The CREATE DATABASE SQL statement (see Figure 4-4) is what pgAdmin III runs to
create the new database. In Chapter 6, you will learn how to manually create and enter
these types of SQL commands. However, with pgAdmin III it is as easy as filling out a
form and clicking a button.
Speaking of clicking a button, go back to the Properties tab and, if it is complete, click
the OK button. PostgreSQL will create the new database object, which then appears in
your pgAdmin III main window object listing. By default, you are not connected to the
new database object, so it is shown with a red X over it. Clicking the new object connects
you to the database and expands the database object in the window.
within the public schema, we will create a new schema to hold our application data
objects. Unlike the public schema, the newly created schema will be protected by de-
fault. No users will have access to the schema until we specifically grant them access.
This helps provide better security for the application by default.
There are a couple of ways to create a new schema in the database. The first method
is to right-click the newly created database object that will contain the new schema and
select New Object | New Schema. The second method is to expand the newly created
database object, right-click the Schemas object contained within the database object, and
select New Schema from the menu. Either method opens the New Schema window,
shown in Figure 4-5.
Just like the New Database window, the New Schema window provides an easy form
(Properties tab) for you to fill out to define the new schema. You must provide a unique
name for the schema within the database in which it is created. For this example, use the
schema name store. Just as with the new database, you must also provide the owner
Chapter 4: Managing PostgreSQL on Windows 93
of the schema, which should be set to the postgres user account. Unlike the new data-
base, you do not specify the tablespace for the schema. PostgreSQL creates the schema
object in the same tablespace where the parent database object resides. Also unlike the
new database, you can supply a comment in the Comment field of the Properties tab.
Note that the OID field is grayed out. PostgreSQL automatically assigns a unique OID to
the database once it is created.
The New Schema window also allows you to assign permissions to the newly created
schema. By default, only the superuser account has permissions to create and modify
data in the schema. We will change that later after we create some user accounts.
After entering the information, click OK to create the new schema. The new schema
automatically appears within the database object in pgAdmin III. You can expand the
new schema object to see the listing of objects contained within the schema. At this point
there should not be any new objects within the schema. Now it is time to create some
tables.
94 PostgreSQL 8 for Windows
The last rule is often what gets novice database administrators in trouble. This means
that you can have a table named Customer as well as another table named customer in
the same schema. This is almost always a recipe for disaster when running SQL state-
ments, and should be avoided if at all possible. For this exercise, we will capitalize the
first character in the table names and make the rest lowercase. When creating this new
table, use the name Customer.
As with the other database objects, you must define the owner of the table. Again, we
will use the postgres user for this exercise.
When creating a new table, you must specify the tablespace where the table will be
stored. You may store tables in different tablespaces within the same database if you de-
sire. For this exercise, you should just use the pg_default tablespace.
The Has OIDs check box indicates whether PostgreSQL will assign an object ID (OID)
to the newly created table. By default, PostgreSQL does not assign an OID to user-created
tables. If you plan on using the PostgreSQL inheritance feature (discussed in Chapter 1),
you must assign an OID to the table. Since our tables will not use this feature, we can
leave this check box unchecked.
The Inherits from Tables section allows you to specify whether the newly created
table will inherit any columns (fields) from parent tables. The parent table must have an
OID assigned to it. Again, since none of our tables will do this, we will leave this section
blank.
Finally, the Comment section allows us to place a comment in the table that can
be seen from pgAdmin III when we view the properties of the table. This is extremely
handy when designing a large system with lots of tables.
After filling in the required information in the Properties tab, click the OK button
to create the table. The new table appears under the Tables list in the new schema. You
have created a new table, but there is no data defined within the table. We will do that
shortly.
Before entering new data, look at the table objects created with the new table. Expand
the new Customer table object that was created in the new schema. Each table consists
of five categories of objects:
The next step to complete your table is to define columns for the data. Each column
defines a data element that will be stored in the table. Each column must be declared as
96 PostgreSQL 8 for Windows
a specific data type, depending on the type of data you intend to store in the column.
Which data elements are placed in which tables is a subject of many relational database
theories. For the purposes of this exercise, we will try to keep things somewhat simple.
Table 4-2 shows the columns that will be used for the Customer table.
The Customer table contains the necessary contact information for the customer. It
also contains a unique value added to the customer information to uniquely identify the
customer in the database (in case you get two customers named John Smith, for exam-
ple). There are two different data types used for the columns in the Customer table. The
char data type holds a fixed-length character string. The length of the character string
must be stated when the column is created. The CustomerID column will be created as
a six-character string, the State column as a two-character string, and the Zip column as
a five-character string.
The varchar data type holds a variable-length character string. This is for fields where
you do not know the length of the data that could be entered. The columns where the
data lengths can vary widely, such as the street address, are set to use the varchar data
type.
There are lots of data types defined within PostgreSQL. Table 4-3 shows some of the
more popular data types that are used.
This is just a small sampling of the many data types available. The PostgreSQL Help
window (in pgAdmin III, choose Help | PostgreSQL Help) contains information on all
of the PostgreSQL data types available.
You can create the data columns either when you first create the table or after the
table has already been created. If you created the Customer table earlier, there are a cou-
ple of different methods you can use for adding columns. You can double-click the new
Chapter 4: Managing PostgreSQL on Windows 97
table object in pgAdmin III to bring up the standard Table Properties window or you can
right-click the Columns object under the Customer table entry and select New Column.
I prefer to double-click the new table object, as it provides the entire Table Properties
window to view. In the Table Properties window, click the Columns tab to access the
columns list.
In the Columns tab, click the Add button to add a new column. The New Column
window appears, shown in Figure 4-7.
Fill in the forms for the information to define the column. The Data Type textbox in-
cludes a handy drop-down menu that contains all of the available data types. Select the
appropriate data type for the column you are creating. For the CustomerID and Phone
columns, check the Not NULL check box. This feature requires that these fields must
be filled for the data record to be added to the table. If either of these fields is missing,
the new data record will be rejected.
As you enter columns in the table, they appear in the columns list in the Columns
tab. When you have entered all of the columns for the Customer table, click the Con-
straints tab. The Constraints tab allows you to specify further constraints on the data in
the columns.
98 PostgreSQL 8 for Windows
There will be one constraint in the Customer table. You want to ensure that no two
customers have the same CustomerID column value. To do that, you can make this col-
umn a primary key for the table. Primary keys are used to uniquely identify each record in
the table. Each table can have only one primary key assigned to it. A column (or column
combination) used for the primary key is guaranteed to be unique. The PostgreSQL sys-
tem will not allow users to enter records with a duplicate primary key.
On the Constraints tab, ensure that the Primary Key value is set in the drop-down
box, and click the Add button. The New Primary Key window appears. On the Properties
tab, set the name of the primary key (call it CustomerKey) and define which tablespace
it should be located in. You do not have to place a primary key in the same tablespace as
the data table (and, in fact, many database administrators do not for performance pur-
poses). The primary key object creates an index of the database that is accessed during
queries to help speed things up. PostgreSQL can quickly search the index looking for the
key value without having to read all of the table columns.
After filling out the information on the Properties tab, click the Columns tab. On this
tab, you must select the columns that are used for the primary key. For this exercise, select
Chapter 4: Managing PostgreSQL on Windows 99
the CustomerID column. You can choose to make a constraint deferrable, which means
PostgreSQL does not check the constraint at the end of each command, but rather at the
end of a transaction. When you have defined the primary key, click the Add button.
After filling out the forms, click OK to add the constraint to the Table Properties
window. When you have all of the columns and the constraint configured, click the SQL
tab to view the generated SQL statements required to implement your selections. In this
case, you can see how pgAdmin III saves you from lots of typing by not having to manu-
ally enter the SQL code to create the table columns. After reviewing the SQL statements,
click OK to create the columns and constraint. Back in the pgAdmin III window, you
should see the new columns and constraint added to the table objects.
Now you can create the other two tables for the application. The Product table col-
umns are shown in Table 4-4.
Create the Product table using the same procedures used to create the Customer table.
Do not forget to add the primary key constraint to the table for the ProductID column.
The last table is the Order table. This one is a little different from the other two tables. It
uses column values derived from the other two tables. Table 4-5 describes the Order table.
Create the Order table as you did the other two tables. Create the CustomerID and
ProductID columns using the same data type and length as with the associated columns
in the other tables. The difference with the Order table is in the constraints. There are two
additional constraints that must be created. You must create constraints to ensure that
the CustomerID and ProductID values entered into the Order table are valid.
Click the Constraints tab in the New Table window for the Order table. First, create
the primary key on the OrderID column as normal. After creating that, from the drop-
down menu on the Constraints tab, select Foreign Key and click Add. The New Foreign
Key window appears, shown in Figure 4-8.
For the first constraint, we will match the CustomerID column to the CustomerID
column in the Customer table, so call the new constraint Customer. In the References
textbox, click the drop-down arrow and select the store. Customer table (remember,
100 PostgreSQL 8 for Windows
Figure 4-8. New Foreign Key window for the Order table
Chapter 4: Managing PostgreSQL on Windows 101
tables are referenced by schemaname.tablename). Next, click the Columns tab. The Local
Column textbox references the column in the Order table you want to match. Select the
CustomerID column. The Referencing textbox references the column in the Customer
table you want to constrain the column to. Select the CustomerID column, click Add,
then click OK. Do the same with the ProductID column (except reference the ProductID
column in the Products table).
The Action tab allows you to specify an action to take on the related data when an
event occurs on the table. If the foreign key column value is updated or deleted from the
table, the related value in the foreign key table can also be updated or deleted. For now
we will not worry about that, so leave the settings at the default No Action. When you
are finished, click OK to create the table.
The table columns are shown as columns across the top of the Edit Data window.
Within the window are grids showing the individual records within the table. You can
enter data directly into the next available row in the window.
When you have entered data for all of the columns in a record, press enter to submit
the data. All constraints applied to the columns apply to the Edit Data window. If you
violate any of the constraints, an error message appears when you try to submit the data,
as shown in Figure 4-10.
In Figure 4-10, I attempted to enter a record with the same CustomerID column value
as an existing record. The Edit Data window produced an error message informing me of
my error. Unfortunately, when you get an error message, all of the data entered into the
record is lost, and you must start over on that record.
By default, the Edit Data window displays all of the records contained in the table
and sorts them in ascending order based on the defined primary key. You can use the
Sort/Filter option button (the funnel icon) to change this. Clicking this button produces
Chapter 4: Managing PostgreSQL on Windows 103
the View Data Options window (a slight misnaming of windows and icons), which has
two tabs:
X Data Sorting Allows you to specify how to sort data as it is displayed in the
Edit Data window.
W Filter Allows you to specify what data to view using simple expressions. You
can use standard comparison expressions within the Filter tab to filter the data
you are looking for.
After entering your expressions, click the Validate button to see if your filter expres-
sions will work.
One warning about using column names in the Filter tab (actually, this applies to all
of PostgreSQL). If you use the filter expression LastName = 'Blum' you will get an
error when you click the Validate button, as shown in Figure 4-11.
All the error message says is that the column “lastname” is invalid. This error throws
many novice administrators off, as they assume that indeed they do have a column
named lastname. What has happened is that PostgreSQL assumes all column names are
in lowercase. Unfortunately, I created my column name in mixed case. Notice that even
though I entered the filter expression correctly, PostgreSQL converted the column name
to all lowercase letters, which did not match the real column name.
To solve this problem, you should always place double quotes around column and
table names that use uppercase letters. If you use the schemaname.tablename format, make
sure that each one has quotes around each individual name instead of the whole name (for
example, "Store"."Customer" and not "Store.Customer"). Notice, however, that
character strings use single quotes. This feature can get confusing, as well as annoying.
Do not get too hung up on the filter expressions at this point. Chapter 6 discusses this
topic in greater detail.
X Execute query
R Execute query and save results to file
W Explain query
The first two are somewhat self-explanatory. When the SQL commands are executed,
the results are displayed in the lower frame in the Data Output tab. If you use the Save
Results to File button, an Export Data to File window appears, providing you with op-
tions on how you would like the data saved. You can select various options for the for-
mat the data is stored in, such as what character is used to separate record fields, and the
type of quotes used to delineate string values.
The third option, Explain Query, is somewhat different. Instead of running the query,
the Query Tool uses the PostgreSQL explain plan. The explain plan describes the steps
Chapter 4: Managing PostgreSQL on Windows 105
the PostgreSQL database engine takes to process the query, and how long the individual
steps would take to process.
The steps appear as icons in the Explain tab of the lower frame. For a simple one-step
query, the results are not too interesting. You will see just a single icon. When you click
the icon, a textbox appears, showing the statistics for processing the SQL command. For
more complicated SQL commands, you see multiple step icons within the Explain tab.
This feature enables you to determine which steps take the most amount of time when
processing the SQL commands, and possibly allow you to alter your commands to be
more efficient. These will be covered in more detail in Chapter 10.
So far you have been accessing your tables using just the postgres superuser account.
While this works okay in development, this is a major no-no in the production world.
In the real world, you want each of the database users to have their own Login Roles
(user accounts) assigned to them. This ensures that you can track database activity down
to the individual user. You also want to create Group Roles to grant permissions to tables
based on functions performed, then assign the individual Login Roles to the appropriate
Group Roles. This allows you to move user Login Roles around between tables without
having to mess with individual privileges.
This section walks through the steps required to set up a simple user environment for
the test database created.
X Salesman Has write access to the Customer and Order tables, and read access
to the Product table
W Accountant Has write access to the Product and Order tables, and read access
to the Customer table
To create a new Group Role, right-click the Group Roles entry in the pgAdmin III win-
dow and select New Group Role. The New Group Role window, shown in Figure 4-13,
appears.
By now you should be familiar with the drill. In the window, there is a simple form
on the Properties tab to fill out the information pertinent to the Group Role you want
to create. Group Roles do not have login privileges, so you do not need to create a
password for the role (although PostgreSQL allows you to if you want to). For these
Group Roles, you also do not need to check any of the check boxes in the Role Privileges
area of the Properties tab. We obviously do not want the members of this group to be
superusers, and there will not be any objects or roles that these users need to create.
You can cascade Group Roles, assigning one group as a member of another group, and
allowing it to inherit all of the permissions assigned to the parent group. We will not
get that complicated in this simple example. You also have the option of setting a date
when the group will expire. This is convenient for setting up temporary testing groups
for applications.
Go ahead and create the Salesman and Accountant roles using the New Group Role
window. You should see the new roles added to the PostgreSQL system as you create
them.
After creating the new Group Roles, you must assign the proper permissions to the
tables for them. This is done in the individual Table Properties windows.
Chapter 4: Managing PostgreSQL on Windows 107
Double-click the Customer table icon in pgAdmin III. The Table Properties window
appears for the table. Click the Privileges tab, showing the configuration window to set
permissions on the table, shown in Figure 4-14.
In the lower section, you must select the role to grant permissions for. Select the
Salesman Group Role. Next, select the permissions you want to grant for that Group
Role. For the Salesman role, we want to grant them INSERT, SELECT, UPDATE, and
DELETE privileges, so check those check boxes. The WITH GRANT OPTION check box
allows members in the Group Role to reassign the same privileges to other users. We
do not want our users going behind our backs and giving other users privileges to the
tables, so leave those check boxes empty.
When you are done, click the Add/Change button to assign the selected privileges.
You will see the Salesman Group Role name added to the list in the top frame, along with
a code showing the privileges assigned. The codes are shorthand for the privileges set.
Table 4-6 shows the codes that are used.
108 PostgreSQL 8 for Windows
Next, do the same for the Accountant Group Role, assigning them only SELECT
privileges for the table.
When you have finished, do the same for the Product table, assigning INSERT,
SELECT, UPDATE, and DELETE privileges for the Accountant role, and only SELECT priv-
ileges for the Salesman role. Finally, assign privileges for the Order table, giving both the
Salesman and Accountant roles INSERT, SELECT, UPDATE, and DELETE privileges.
Now that you have all of the table privileges created, you might think that you
are done. Unfortunately that is not the case. There is one more privilege you need to
address.
As a security feature, a newly created schema does not grant any privileges to any
groups (not even the public group). Only the superuser (postgres by default) has
privileges to modify the schema. By default, pgAdmin III does not show individual user
privileges in the listing until after you add group privileges, so you will not see the
postgres user privileges displayed. You must specify if your new Group Roles should
have access to the store schema. Double-click the store schema in pgAdmin III, then
Chapter 4: Managing PostgreSQL on Windows 109
Code Privilege
a INSERT (append)
r SELECT (read)
w UPDATE (write)
d DELETE
R RULE
x REFERENCES
t TRIGGER
X EXECUTE
U USAGE
C CREATE
T TEMPORARY
click the Privileges tab. You should not see any group privileges shown for the schema.
Select the group to add privileges for from the drop-down menu. The USAGE privilege
allows the group to use existing objects in the schema. The CREATE privilege allows
groups to create new objects in the schema.
You must add the USAGE privilege to both the Salesman and Accountant Group
Roles for those users to have access to objects in the schema. Just select the appropri-
ate Group Role from the drop-down menu, check the USAGE check box, and click the
Add/Change button. When you are done, click OK. Now your application should be all
set for user accounts.
In the Role Privileges section, you must select the Inherits Rights from Parent Roles
check box. This ensures that the Login Role will inherit the privileges assigned to the
Group Role it is assigned to.
After filling in the Properties tab, click the Role Membership tab. This is where you
assign the Login Role to one or more Group Roles. For a Salesman account, click the
Salesman group name in the left box and click the right arrows. The Salesman group
name is added to the Member In list.
To complete this example, create two Login Roles, user barney and user fred. Add
barney to the Salesman group and add fred to the Accountant group.
While it is not necessary, it is best practice to create Login Roles using all lowercase
letters. As mentioned before, PostgreSQL is case sensitive in all things, so creating a Log-
in Role Barney can result in future problems if the user does not remember to capitalize
the first letter. Using all lower case letters is a standard practice on database systems.
Chapter 4: Managing PostgreSQL on Windows 111
Testing Privileges
You can now test your new user environment by logging into the PostgreSQL server us-
ing the newly created Login Roles. The easiest way to do that is to use the psql program.
Chapter 5 describes this program in great detail, but for now we will just use it to see if
we can get to the new tables.
The easiest way to start the psql program is to choose Start | Programs | Post-
greSQL 8.2 | Command Prompt. This creates a command prompt window in the default
bin directory of PostgreSQL, where all of the utilities are located. From here you can run
the psql command-line program. There are lots of command-line options for psql (as
you will see in Chapter 5). For now we will just use a basic format:
psql databasename username
The first option specifies the database to connect to. The second option specifies the
Login Role to log in as. Assuming you have been following the example names in this
chapter, log in using the test database and the fred Login Role:
psql test fred
After entering the password for the fred Login Role, you will be greeted by the
psql welcome screen and given a command prompt, showing the database name you
are connected to.
The fred Login Role is a member of the Accountant Group Role, so he should have
write privileges for the Product table. We can test this by using a simple SQL INSERT
statement:
test=>INSERT into store."Product" VALUES ('LAP001', 'Laptop', 'TakeAlong',
'Acme', '500.00', 100);
INSERT 0 1
test=>
The details of the INSERT SQL statement will be explained in Chapter 6. For now,
all you need to know is that this command attempts to enter a new record into the Prod-
uct table (again, note the double quotes required around Product). Note that, using this
command format, you must enter a value for each column in the table, in the order the
column appears in the table. Character values must be surrounded by single quotes.
Also, be aware of the semicolon at the end of the SQL statement. This is required to let
psql know the SQL statement is finished. As you will see in Chapter 6, you can create
SQL statements that go on for several lines.
The result of the SQL statement indicates that one record was added to the Product
table, so indeed Fred has his necessary privileges. You can use a SELECT SQL statement
to ensure the record really was added to the table. Now you can see if Fred can add a
112 PostgreSQL 8 for Windows
record to the Customer table (which he should not be allowed to do, based on the Ac-
countant Group Role privileges):
test=> INSERT into store."Customer" VALUES ('BLU001', 'Blum', 'Rich',
'111 Main St', 'Gary', 'IN', '46100', '555-1234');
ERROR: permission denied for relation Customer
test=>
As expected, Fred was prevented from adding a new Customer record. You can try
the same tests for the barney Login Role. When you are ready to exit psql, just type the
command \q by itself on a line and press enter.
DATABASE MAINTENANCE
Once your application goes into production mode, there will most likely be lots of activi-
ty on the tables. When lots of records are deleted and updated, dead space appears in the
tables. This is a result of deleted or updated records being marked for deletion but not
physically removed from the table. Remember, at any time, a transaction could be rolled
back, meaning that previously deleted records need to be recovered. PostgreSQL accom-
plishes this by keeping internal information about each record, such as if it is marked for
deletion or not.
Over time, an active table could possibly contain more records marked for deletion
than actual live records. Because of this, dead space should be cleaned out at regular
intervals.
PostgreSQL provides a method called vacuuming to remove records marked for dele-
tion. Chapter 3 showed the postgresql.conf file settings for the autovacuum func-
tion. By default, PostgreSQL vacuums all database tables at a regular interval, controlled
by the autovacuum_naptime setting in the postgresql.conf file. However, you can
also manually vacuum a table. There is a special utility just for this function.
The vacuum utility can be accessed directly from the pgAdmin III main window.
There are a few different ways to vacuum records from a table:
X Set a table to automatically vacuum at preset intervals, different from the default
autovacuum settings in the postgresql.conf file
R Manually vacuum a table
W Manually vacuum all tables in a database
If there is not a lot of activity on a particular table, you can turn off the autovacuum
feature in the postgresql.conf file and manually vacuum your tables when neces-
sary. To manually vacuum a table or database, right-click the table or database object
and select Maintenance. The Maintain Table (or Database) window appears, as shown
in Figure 4-17.
The Maintain Table and Database windows allow you to
Besides vacuuming deleted records from a table, reindexing any created table in-
dexes can also speed up query performance for a table. As new records are added, the
index can become disjointed. By reindexing, PostgreSQL can reorganize the index values
to maximize query performance.
114 PostgreSQL 8 for Windows
Performing a Backup
To back up a database, right-click the database object and select Backup from the menu.
The Backup Database window appears, as shown in Figure 4-18.
From this window you can select the options for your backup, including the file-
name of the backup file, the type of backup (as discussed in the “pg_dump” section in
Chapter 3), and the parts backed up. By default, the pg_dump utility uses the COPY SQL
commands to restore table data. If you need to restore the database on a different data-
base system, you should use the INSERT commands feature by checking the check box.
If you do select the Insert Commands check box, you will not be able to store OIDs either,
since OIDs cannot be copied using standard SQL INSERT commands. After the backup
completes, click the Messages tab and view the backup results.
Restoring a Database
The other side of backing up your database is the ability to restore the database. The
backup file created by the pgAdmin III program can be restored using the pg_restore
utility, discussed in Chapter 3. This file can be used to restore the database on any Post-
greSQL platform on any server. This makes for a powerful tool in migrating databases to
larger servers on different platforms.
With the pg_restore utility, all you need to do is run the utility with the backup file.
The objects that are contained in the backup file are automatically restored. Unfortunately,
in pgAdmin III, to restore a database, schema, or table, you must have a similar skeleton
object available to perform the restore. First, create a new object using the same name as
the original object that was lost (you can easily copy an object by creating an object with
a different name and performing the restore on that object). You do not have to worry
about setting too many options, as they will be restored during the restore process. Next,
right-click the new object and select Restore from the menu. The Restore Database
window, shown in Figure 4-19, appears.
From the Restore Database window, select the appropriate backup file to restore,
and the options you need (for a complete restore, you do not need to check any of the
options). The restore process starts, and the results are shown in the Messages tab.
Please do not wait until you have lost important data to find out that you missed
something in the process. Now that you have a few sample tables and data, try doing a
backup of your test database. After you back up the database, remove it by right-clicking
the database object and selecting Delete/Drop from the menu.
After removing the database, create a new skeleton database, then perform the re-
store process using your backup file. After the restore, you will have to click the database
object and then click the Refresh icon in the toolbar. The new (or rather old) database
objects should now be present. View the data in the new tables to make sure the data has
also been restored. You should also try a few table queries using psql to ensure that the
privileges have been restored as well.
It is important to remember that this backup and restore did not affect any of the
Login or Group Roles created in the PostgreSQL system. While the backup preserved
the privileges assigned to objects, it did not create the roles themselves. To fully back up
your entire PostgreSQL system, use the pg_dumpall utility. This creates the SQL state-
ments to restore your entire PostgreSQL system, including Login and Group Roles.
SUMMARY
This chapter walked through the many features of the pgAdmin III program, and dem-
onstrated how to use it to perform all of your database administration needs. You can
use pgAdmin III to create new tablespaces, databases, schemas, tables, and user accounts
(Login Roles). From each new table, you can create new columns to hold data, as well as
define constraints, indexes, rules, procedures, and triggers. After you create new tables
and data, you can create Group Roles to assign group privileges to tables and schemas.
Finally, you use pgAdmin III to create Login Roles (also known as user accounts), and
assign them to the appropriate Group Roles.
An important part of managing a PostgreSQL system is to maintain the database
data. You can do this using the table maintenance features provided in pgAdmin III. The
vacuum and reindex functions are essential to maintaining your database performance
as you accumulate more data. Backing up the database is another important function of
the database manager. pgAdmin III provides a simple way to back up and restore indi-
vidual databases, but to back up and entire PostgreSQL system, you must still use the
pg_dumpall utility.
In this chapter, you got a taste of the other powerful PostgreSQL application, psql.
The next chapter expands on the psql program, showing all of the functions you can
perform using the simple command-line interface.
This page intentionally left blank
II
Using PostgreSQL
in Windows
119
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
This page intentionally left blank
5
The psql Program
121
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
122 PostgreSQL 8 for Windows
A
fter you have installed the PostgreSQL system and created user accounts,
databases, and tables using pgAdmin III, it is time to start interacting with your
new system. While PostgreSQL provides many interfaces to interact with the
system, none is as simple as the psql program. This simple console application provides
full access to the entire PostgreSQL system using both internal meta-commands and SQL
statements. This chapter dissects the psql program and shows how to use it for all of
your database needs.
Connection Options
The psql program uses command-line options and parameters to control its features.
The format of the psql command line is
where options can be one or more options that define additional information that controls
the various psql features. The databasename and username parameters allow you to directly
specify these values on the command line without having to use the options format.
To simply start psql and connect to a specific database on the local system as a
specific user, all you need to do is list the database name and user account name on
Chapter 5: The psql Program 123
the command line. As was demonstrated in Chapter 4, you can connect to the database
named test on the local system using the PostgreSQL user account fred with the psql
command:
psql test fred
The PostgreSQL pg_hba.conf configuration file on the server defines what authen-
tication method the psql connection needs to use to authenticate the connection (see
Chapter 3). If the entry in the pg_hba.conf file defining this connection requires pass-
word authentication, psql asks you for the password for the fred account. Once you
enter the correct password, the psql command prompt appears, ready for action.
You may have also noticed the psql link available in the PostgreSQL 8.2 Start menu.
This link provides a quick way to start psql as the postgres superuser account and
connect to the postgres database. This link is handy if you need to do any quick da-
tabase administration work, but I would not recommend using it for normal database
activity. It is always easy to enter wrong commands, and always logging in as the super-
user account can cause problems if you do the wrong things, such as accidentally delete
the wrong database. It is much safer to just log in as a normal user account unless you
absolutely have to be the superuser.
Feature Options
The psql command-line options allows you to control what features are enabled when
you start psql. The format for a feature option is
optionname [value]
The value parameter is required for some options to further define the option behav-
ior, such as specifying a filename or a variable name. There are two formats that can be
used to specify the optionname parameter:
X Long-name format Uses a common name to represent the option, preceded by
a double dash, such as —echo-queries.
W Short-name format Uses a single character to represent the option, preceded
by just a single dash, such as -e. Be careful when using the short-name options,
as they are case sensitive.
Table 5-1 describes all of the feature options that are available on the psql command
line.
You can see from this table that many of the psql command-line options focus on
controlling how table data is displayed. You will see later in this chapter in the “The psql
Meta-commands” section that many of these command-line options can also be set us-
ing psql meta-commands at the psql prompt. Sometimes it is easier to include a quick
command-line option than to use the internal meta-commands. For example, if you just
124 PostgreSQL 8 for Windows
want to execute some SQL statements stored in a file, you could use the following psql
command:
C:\Program Files\PostgreSQL\8.2\bin>psql -f test.sql test fred
Password for user fred:
Customer Query result
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+———————————+———————————+———————+———————+———————+——————
BLU001 | Blum | Rich | 123 Main | Chicago | IL | 60633 | 555-1234
(1 row)
C:\Program Files\PostgreSQL\8.2\bin>
The -f option specifies the text file to read SQL statements from. By default, psql
assumes the text file is located in the same directory you started psql from. You can also
supply a full pathname for the file. If the pathname includes spaces, you must use double
quotes around the pathname. The psql command connected to the database test using
the user account fred, then executed the SQL commands in the file test.sql.
List of databases
C:\Program Files\PostgreSQL\8.2\bin>
Chapter 5: The psql Program 127
In this example, the -l option is used to obtain a list of all the databases created in
the local PostgreSQL server, and the -U option is used to specify the user account to log
in with. If you do not specify a database name as an option, you cannot specify the user
account to log in with unless you use the -U option.
Once you have started your psql session, it greets you with a simple welcome mes-
sage, then the psql command prompt:
test=>
If this gets old, you can use the -q command-line option to disable the welcome
message.
The default psql prompt shows the database you are currently connected to. If you
are connected as a normal user, the prompt ends with a greater-than sign (>). If you are
connected to the database as a superuser (such as the postgres user), the prompt ends
with a pound sign (#):
test=#
This provides a warning to let you know of your superuser capabilities. Again, use
extreme caution when deleting objects as the superuser.
At the psql prompt you can enter psql commands to control and query objects in
the database. There are two different types of psql commands:
Chapters 6 and 7 cover the SQL commands you can use in psql. The following sec-
tion describes the special psql meta-commands.
of the available meta-commands from the psql prompt by typing \?. The psql meta-
commands are divided into the following categories based on their functions:
X General meta-commands
R Query buffer meta-commands
R Input/output meta-commands
R Informational meta-commands
R Formatting meta-commands
W Copy and large object meta-commands
Command Description
\c dbname Connect to different database
\cd dir Change to a different working directory on the local
system
\copyright Display the PostgreSQL copyright information
\encoding encoding Display or set the current psql encoding
\h statement Display help on a SQL statement
\q Exit (or quit) psql
\set name value Set variable value (same as -v command-line option)
\timing Display total time a command takes
\unset name Unset variable value
\! command Execute command on host system
you are currently connected as (which has already been authenticated) to connect to the
specified database. If you do not have privileges to connect to the database, your previ-
ous database connection is restored. Here is an example of connecting to a database from
within psql.
test=> \set
VERSION = 'PostgreSQL 8.2.0 on i686-pc-mingw32, compiled by GCC gcc.exe
(GCC) 3.4.2 (mingw-special)'
AUTOCOMMIT = 'on'
VERBOSITY = 'default'
PROMPT1 = '%/%R%# '
PROMPT2 = '%/%R%# '
PROMPT3 = '>> '
DBNAME = 'test'
USER = 'fred'
PORT = '5432'
ENCODING = 'UTF8'
test=>
The PROMPT1, PROMPT2, and PROMPT3 variables allow you to set the prompts used
in the psql session to descriptive text, giving you hints about where you are in the
database process. The reason for the three prompts is the three levels of input modes in
psql:
As you can see in the \set variable output, the prompts use substitution characters
(characters preceded by a percent sign, %) to represent variable information. Table 5-3
lists and describes the prompt variable substitution characters that can be used.
130 PostgreSQL 8 for Windows
The psql prompts can be customized to reflect many features of your database
connection:
Besides setting the default variables, you can create your own variables that are used
during the psql session. Here is an example of creating your own variable:
test=> \set cust 'store."Customer"'
test=> select * from :cust;
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+——————————-+——————————+—————————+———————+———————+————————
BLU001 | Blum | Rich | 123 Main | Chicago | IL | 60633 | 555-1234
(1 row)
test=>
The variable cust is set to the full pathname of the Customer table. The variable can
then be referenced in any SQL statement by preceding it with a colon (:).
Chapter 5: The psql Program 131
test=> \p
select * from store."Product";
test=> \g
ProductID | ProductName | Model | Manufacturer | UnitPrice | Inventory
———————————+—————————————+———————————+——————————————+———————————+———————————
LAP001 | Laptop | Takealong | Acme | 500 | 10
(1 row)
test=> \r
Query buffer reset (cleared).
test=> \p
Query buffer is empty.
test=>
Command Description
\e [file] Edit the query buffer (or file) with an external editor (Windows
Notepad).
\g [file] Send the query buffer to the PostgreSQL server. Place the results
in file if specified. This can be a full pathname, or a file located in
the directory psql was started from.
\p Display the current contents of the query buffer.
\r Reset the contents of the query buffer.
\w file Write the contents of the query buffer to file. This can be a full
pathname, or a file located in the directory psql was started from.
Notice that even though a variable is used for the SQL statement, the query buffer
contains the expanded variable text. The \g command is used to rerun the SQL state-
ment in the query buffer. The \r command resets (empties) the query buffer.
Input/Output Meta-commands
The input/output meta-commands control the way SQL statements are input to psql
and how output should be handled. Table 5-5 lists and describes these commands.
The \i command is extremely helpful. This command allows you to run SQL state-
ments stored in files on the system. This includes any files created by the pg_dump back-
up program (described in Chapter 3) that were created using the plain backup format.
By default, the file must be located in the same directory you started psql from. If it is
not, you must specify the full pathname for the file. If the pathname includes spaces, you
must enclose the filename in double quotes.
While in interactive mode the \echo command may seem silly, it is often used to insert
comments within stored SQL statements. The echoed strings appear within the standard
command output. As an example, create the file test.sql, and enter the following lines:
\echo Customer query result
select * from store."Customer";
\echo Product query result
select * from store."Product";
Now, within a psql session, use the \i meta-command to run the SQL file:
test=> \i test.sql
Customer Query result
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+———————————+——————————+—————————+———————+———————+————————
BLU001 | Blum | Rich | 123 Main | Chicago | IL | 60633 | 555-1234
(1 row)
Product Query result
ProductID | ProductName | Model | Manufacturer | UnitPrice | Inventory
———————————+—————————————+———————————+——————————————+———————————+———————————
LAP001 | Laptop | Takealong | Acme | 500 | 10
(1 row)
test=>
Command Description
\echo string Display string on the standard output
\i file Execute commands from the specified file
\o file Redirect all query output to the specified file
\qecho string Write string to the output specified in the \o command
The commands entered in the test.sql file were processed by psql, and the output
was displayed on the console. Notice that the commands themselves were not displayed
in the output. This is where the \echo command comes in handy to help produce nice
titles for the displayed data.
Informational Meta-commands
The psql informational meta-commands provide a wealth of information about the
PostgreSQL system. You can use the informational meta-commands to display the tables,
views, and users created in the connected database. These commands are listed and de-
scribed in Table 5-6.
Command Description
\d name Display detailed information about the table, index, sequence, or view name.
\da [pattern] List aggregate functions matching pattern
\db [pattern] List tablespaces matching pattern
\dc [pattern] List conversions matching pattern
\dC List casts
\dd [pattern] Show comments for objects matching pattern
\dD [pattern] List domains matching pattern
\df [pattern] List functions matching pattern
\dg [pattern] List Group Roles matching pattern
\di [pattern] List indexes matching pattern
\dn [pattern] List schemas matching pattern
\do [pattern] List operators matching pattern
\dl [pattern] List large objects matching pattern
\dp [pattern] List table, view, and sequence access privileges matching pattern
\ds [pattern] List sequences matching pattern
\dS [pattern] List system tables matching pattern
\dt [pattern] List tables matching pattern
\dT [pattern] List data types matching pattern
\du [pattern] List users matching pattern
\dv [pattern] List views matching pattern
\l List all databases
\z [pattern] List table, view, and sequence access privileges matching pattern
As you can see from Table 5-5, there are lots of informational meta-commands avail-
able for you. These are great tools for viewing the layout of a PostgreSQL database.
Be careful when using these commands, though, as the information they produce is
dependant on your schema search path (described in detail in Chapter 6). By default, the
\dt command displays a list of tables in the schemas contained in your search path. If
you want to display the tables from a schema not in your search path, you must specify
it within the option pattern:
test=> \dt store.
List of relations
Schema | Name | Type | Owner
————————+——————————+———————+———————
store | Customer | table | postgres
store | Order | table | postgres
store | Product | table | postgres
(3 rows)
test=>
The period at the end of the schema name is important. A very useful meta-command is
the \d command.
This meta-command allows you to view the columns available within a table:
test=> \d store."Customer"
Table "store.Customer"
Column | Type | Modifiers
————————————+———————————————————+——————————
CustomerID | character varying | not null
LastName | character varying |
FirstName | character varying |
Address | character varying |
City | character varying |
State | character(2) |
Zip | character(5) |
Phone | character(8) |
Indexes:
"CustomerKey" PRIMARY KEY, btree ("CustomerID")
test=>
Notice that all of the information related to the table is displayed, including any in-
dexes created.
For a quick look at what Group Roles a Login Role belongs to, the \dg meta-
command is handy:
Chapter 5: The psql Program 135
test=>\dg fred
List of roles
Role name | Superuser | Create role | Create DB | Connections | Member of
———————————+———————————+—————————————+———————————+—————————————+———————————
fred | no | no | no | no limit | {Accountant}
(1 row)
test=>
This output tells you what database privileges the Login Role has, as well as what Group
Roles it is a member of.
Formatting Meta-commands
The formatting meta-commands control how table data is displayed in psql. These com-
mands control the format of the table output when displaying the results from queries.
The formatting meta-commands are listed and described in Table 5-7.
You should have noticed by now that in the normal query output, the table column
names are displayed, followed by the record data, each record being on one row, im-
mediately followed by the data for the next record. These meta-commands allow you to
change the way the data is displayed.
For starters, if you want to see only the data rows, without the column headings, use the
\t meta-command. If you would like to create a special title for the table, use the \C meta-
command. If you do not even want the data displayed in a table format, use the \a meta-
command. This outputs only the table headings and data, without the table formatting.
Command Description
\a Toggle between aligned (table) and unaligned (raw) output
mode.
\C string Set table title to string. Unsets the title if string is empty.
\f [string] Show the field separator character, or set it to string.
\H Toggle HTML output mode.
\pset parameter Set table printing option parameter to value. Same as -P
value command-line option.
\t Toggle display of table headings.
\T string Set HTML <table> tag attribute to string.
\x Toggle expanded output.
If you would like to use a character other than the pipe symbol (|) to separate fields in the
unaligned output, use the \f meta-command and specify the character to use.
The \x command provides for a completely different style of table output. Each
record is displayed individually, with the column names on the left, and record values
on the right:
test=> \x
Expanded display is on.
test=> select * from :cust;
-[ RECORD 1 ]————
CustomerID | BLU001
LastName | Blum
FirstName | Rich
Address | 123 Main
City | Chicago
State | IL
Zip | 60633
Phone | 555-1234
test=>
This format is called extended frames. This also applies to other commands that dis-
play data, such as using the informational meta-commands such as \dt and \l.
The \pset meta-command is your one-stop-shopping place for all table formatting
commands. It provides an interface to control all of the table formatting possibilities in
one place. It can be used to fine-tune exactly how you want the table output to look.
Table 5-8 lists and describes the \pset parameters that are available.
As you can see, you can control just about every aspect of the table output using the
\pset meta-command. Also, these same controls can be set using the psql -P command-
line option. The special HTML formatting controls are great tools that allow you to easily
incorporate psql output in web pages.
Command Description
\copy Copy data from a file to a table
\lo_export LOBOID file Export the large object LOBOID to a file named file
\lo_import LOBOID file Import the large object in the file file to the large
object LOBOID
\lo_list List all large objects defined in the database
\lo_unlink LOBOID Unlink the large object LOBOID
data table that references the large object. The LOBOID points to the location of the large
object in the pg_largeobject table.
To display the value of the environment variable, place percent signs (%) around it
within the echo statement. Now that you know the value of the APPDATA environment
variable, the psqlrc.conf file is located in the following path:
%APPDATA%\postgresql\psqlrc.conf
The PostgreSQL installer doesn’t create this file automatically, so you have to manu-
ally create it using the Windows Notepad application. The psqlrc.conf file is a stan-
dard text file that contains the meta-commands and SQL statements you want to auto-
matically run. Here is an example of a psqlrc.conf file:
Now every time psql is run, these variables will automatically be set:
C:\Program Files\PostgreSQL\8.2\bin>psql -q test fred
Password for user fred:
test=> select * from :cust;
test=>
In the psql session, the cust variable was already set for us by the psqlrc.conf
file. To reference the variable, just precede it with a colon. This can be a great typing sav-
er, especially if you (or your customers) choose to have long schema and table names.
Chapter 5: The psql Program 139
The tablename value must be the full table name. In my version of psql, it appears
that you cannot use a variable name in the \copy command. The filename value must
include the full pathname for the data file if it is not located in the same directory you
started psql from.
By default, the \copy command assumes column data is separated by a tab char-
acter. If your data uses any other character as a separator, you must use the USING
DELIMITERS option to specify it. Be careful when specifying a delimiter character. You
must ensure that the character is not found within the normal data. If it is, the \copy
command will not parse the column data correctly.
Also, by default the \copy command assumes that blank column data entries are
empty strings, and not the special NULL character. To enter a NULL character, use a
\N value, or specify the NULL character to use in the WITH NULL option.
These days, it is common to receive from customers data within Microsoft Excel
spreadsheets. Using the \copy command, it is easy to import this data directly into
your PostgreSQL tables. Before using the \copy command, you must convert the Excel
spreadsheet data to a format that \copy can read. This is accomplished by saving the
spreadsheet in either the Text (tab-delimited) or CSV (comma-delimited) format. The
following is an example of a comma-delimited text file:
BLU002,Blum,Barbara,879 Oak,Gary,IN,46100,555-4321
BLU003,Blum,Katie,342 Pine,Hammond,IN,46200,555-9242
BLU004,Blum,Jessica,229 State,Whiting,IN,46300,555-0921
140 PostgreSQL 8 for Windows
Since the data is separated by commas, you must specify the proper delimiter in the
\copy command:
test=> \copy store."Customer" from data.txt using delimiters ','
\.
test=> select * from :cust;
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+———————————+———————————+————————+———————+——————+——————————
BLU001 | Blum | Rich | 123 Main | Chicago | IL | 60633 | 555-1234
BLU002 | Blum | Barbara | 879 Oak | Gary | IN | 46100 | 555-4321
BLU003 | Blum | Katie | 342 Pine | Hammond | IN | 46200 | 555-9242
BLU004 | Blum | Jessica | 229 State | Whiting | IN | 46300 | 555-0921
(4 rows)
test=>
The data converted perfectly for the import. If any of the data rows in the text file
do not convert properly (such as if they do not satisfy any constraints placed on the col-
umns), the \copy command will produce an error message:
test=> \copy store."Customer" from data.txt using delimiters ','
\.
ERROR: duplicate key violates unique constraint "CustomerKey"
CONTEXT: COPY Customer, line 1: "BLU002,Blume,Judy,111 Maple,Hobart,IN,
46700,555-5577"
test=>
Notice that both the error message and the row that produced the error are displayed.
There is another very important thing to know about errors in the \copy command. The
\copy command is processed as a single transaction by the database. This means that
the \copy command is either committed as a whole or rolled back as a whole. If you
import a 1000-row data file, and row 999 has an error, all of the previously inserted 998
rows will be rolled back and not entered into the database, and the one remaining row
will not be processed.
SUMMARY
This chapter walked through the features of the psql program, which provides a simple
console interface to your PostgreSQL system. You can perform database administration func-
tions as well as input, delete, modify, and query the data contained in the database. The psql
program uses command-line options to allow you to set features used within the application.
Besides the command-line options, psql also provides its own internal meta-commands.
These commands consist of shortcuts for performing complex SQL statements (such as copy-
ing data from a data file to a table) and commands for setting the format of query results.
The main language used to interface with the PostgreSQL database is SQL. Chapter 6
dives into the basics of the SQL language and shows you how to use psql to perform
administration and user functions using SQL statements.
6
Using Basic SQL
141
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
142 PostgreSQL 8 for Windows
C
hapter 5 showed how to use the psql program to interact with the PostgreSQL
system. One of the ways to interact with PostgreSQL is to use the standard SQL
query language. Whether you are accessing your PostgreSQL system from the
psql program interface or from a fancy Java or .NET application, knowing how to use
SQL is an important skill to have. The better your SQL skills, the better your application
will perform. This chapter shows the basics of using SQL to control your PostgreSQL
system, and how to insert, delete, and query data.
SQL History
The Structured Query Language (SQL) has been around since the early 1970s as a lan-
guage for interacting with relational database systems. As database systems became more
complex, it was clear that a simpler language was needed to interact with databases. The
first commercial SQL product, called Structured English Query Language (SEQUEL),
was released by IBM in 1974. As its name suggests, it was intended to provide a query
interface to the database system using simple English words. Over the next few years the
SEQUEL language was modified, and eventually became known as SQL (which often is
pronounced “sequel”).
As other database vendors attempted to mimic or replace SQL with their own query
languages, it became evident that SQL provided the easiest interface for both users and ad-
ministrators to interact with any type of database system. In 1986 the American National
Standards Institute (ANSI) formulated the first attempt to standardize SQL. The standard
was adopted by the United States government as a federal standard, and was named
ANSI SQL89. This SQL standard was adopted by most commercial database vendors.
As you can probably guess from there, additional updates have been made to the
ANSI SQL standard over the years, resulting in SQL92 (also called SQL2) and SQL99
(also called SQL3) versions, which are the versions PostgreSQL supports. At the time of
this writing there is also an ANSI SQL 2003 version that has been published, but Post-
greSQL does not claim compatibility with it.
SQL Format
The main feature of SQL is that it provides a standard way of defining commands that the
database engine can understand. A SQL command (also called a statement) consists of
tokens, separated by white space, and terminated by a semicolon (although PostgreSQL
Chapter 6: Using Basic SQL 143
can be configured to allow a SQL statement to be terminated by pressing the enter key,
defined in the postgresql.conf file).
The command tokens identify actions, and data used in the command. They consist of :
X Keywords
R Identifiers
W Literals
SQL Keywords
SQL keywords define the actions the database engine takes based on the SQL statement.
Table 6-1 lists and describes the standard SQL keywords supported by PostgreSQL.
These standard SQL keywords are common among all database products that follow
the ANSI SQL standards. However, besides the standard ANSI SQL commands, many
database vendors implement nonstandard commands to augment the standard com-
mands and to differentiate their product from others. Besides the standard SQL key-
words, PostgreSQL provides a few nonstandard SQL keywords, listed and described in
Table 6-2.
SQL keywords are case insensitive in PostgreSQL. You can enter keywords in any
case (including mixed case) and PostgreSQL will interpret them properly.
Besides these keywords there are also modifying keywords, such as FROM, WHERE,
HAVING, GROUP BY, and ORDER BY. These keywords are used to modify the command
defined in the main keyword. The various modifying keywords are described later in
this chapter as the main keywords are introduced.
SQL Identifiers
SQL command identifiers define database objects used in the command. This is most
often a database name, schema name, or table name. Identifiers are case sensitive in
PostgreSQL, so you must take extreme care to enter the case correctly. By default, Post-
greSQL changes all unquoted identifiers to all lowercase. Thus the identifiers Customer,
CUSTOMER, and CusTomer all become customer to PostgreSQL. To use uppercase
letters in an identifier, you must use double quotes around the individual identifi-
ers. For example, to reference the table Customer in the schema store, you must type
store."Customer" in the SQL statement. However, to reference the table Customer in
the schema Store, you must type "Store"."Customer" in the SQL statement.
Also, identifier names cannot be keywords. Thus, you cannot create a table named
select and expect to use it in a SQL statement as
This will produce an error message from PostgreSQL. However, you can get around
this rule by using a quoted identifier. PostgreSQL allows you to reference a table called
select by using the format "select". The SQL statement
SQL Literals
SQL command literals define data values referenced by the keyword command. These
are constant values, such as data that is to be inserted into tables, or data values for ref-
erencing queries.
Literals can be any type of data supported by PostgreSQL. There are hundreds
of various data types recognized by PostgreSQL (as demonstrated in Chapter 4).
Chapter 6: Using Basic SQL 145
String data types, such as characters, variable-length characters, time strings, and date
strings, must be enclosed in single quotes in the SQL statement. You can include a single
quote within a string data type by preceding it with a backslash (such as 'O\'Leary').
You can also embed special ASCII control characters into the string literal using C-style
references:
X \n for newline
R \r for carriage return
R \f for form feed
R \t for tab
W \b for backspace
Numerical literal values must be entered without quotes, otherwise PostgreSQL will
interpret them as string values. Numerical values use standard math notations for entry,
such as 10, 3.14, .005, and 10e3. PostgreSQL will interpret these values as numerical and
attempt to enter them into the database table field using the format defined in the col-
umn (such as integer, floating point, or scientific).
CREATING OBJECTS
Chapter 4 showed how to use the graphical pgAdmin III program to create database,
schema, table, Group Role, and Login Role objects. All of these objects can also be manu-
ally created using standard SQL statements. While most PostgreSQL administrators will
find the graphical tools much easier to work with, sometimes knowing how to manually
create an object comes in handy. This section walks through the SQL CREATE statement,
showing how to manually create objects in the PostgreSQL system.
Creating a Database
As mentioned in Chapter 4, the PostgreSQL installation program creates a default data-
base called postgres. It is advisable to create at least one other database for your ap-
plications to use, separate from the default database. The following is the format of the
SQL CREATE statement used for creating databases:
The parameters enclosed in brackets are optional, and not required to create the da-
tabase. The only required parameter for the CREATE DATABASE command is the name
Chapter 6: Using Basic SQL 147
of the new database, name. By default, PostgreSQL makes the database owner the Login
Role that creates the database. The Login Role must have privileges to create a new data-
base on the PostgreSQL system. In practice, only the postgres superuser should create
new databases.
The default template used to create the database will be template1 (see Chapter 4
for details on templates), and the encoding, tablespace, and connection limit values will
be the same as those defined for the template1 database. You can specify alternative
values for any or all of these parameters.
When you create a database, PostgreSQL returns either the CREATE DATABASE mes-
sage on success or an error message on failure:
The first database creation command completed successfully, but the second attempt
failed, as the proposed database name already existed. You can display a listing of all the
existing database objects by using the PostgreSQL \l meta-command:
test2=# \l
List of databases
Name | Owner | Encoding
———————————+——————————+——————————
postgres | postgres | SQL_ASCII
template0 | postgres | SQL_ASCII
template1 | postgres | SQL_ASCII
test | postgres | UTF8
test2 | postgres | SQL_ASCII
(5 rows)
test2=#
If you need to remove an existing database, use the DROP DATABASE SQL command.
This command cannot be used while anyone (including yourself) is currently in the da-
tabase. All objects contained in the database (schemas, tables, functions, and so on) are
removed from the PostgreSQL system. Be careful when using this command. The only
way to recover from the DROP DATABASE command is to restore the database from the
last backup.
148 PostgreSQL 8 for Windows
Creating a Schema
After creating a database object, next up is the schema. As described in Chapter 4, a data-
base can contain multiple schemas, which in turn contain the tables, views, and various
functions. Often a database administrator will create separate schemas for each applica-
tion within the database.
Just as with the database, normally only the superuser should create schemas for
database users. The format of the CREATE SCHEMA command is
CREATE SCHEMA [schemaname] [AUTHORIZATION username [schema elements]]
The first oddity you may notice with this command format is that the schemaname
value is optional. If it is not listed, PostgreSQL uses the Login Role value as the schema
name.
The AUTHORIZATION parameter defines the user who will be the schema owner. By
default, the schema owner is the user who creates the schema. Only the superuser can
use the AUTHORIZATION parameter to specify a different user as the schema owner.
A unique feature of the CREATE SCHEMA command is that you can add additional
SQL commands to the CREATE command to create schema objects (tables, views, in-
dexes, sequences, and triggers) and grant privileges to users for the schema.
The new schema is created in the database you are currently connected to. By default,
this is shown in the psql command prompt. A list of existing schemas in the database
can be displayed using the PostgreSQL \dn meta-command:
test2=# \dn
List of schemas
Name | Owner
————————————————————+—————————
information_schema | postgres
pg_catalog | postgres
pg_toast | postgres
public | postgres
(4 rows)
test2=#
test2=#
By default, the RESTRICT parameter is enabled. This allows the schema to be re-
moved only if it is empty. If there are any objects (such as tables, views, and triggers)
created in the schema, by default the DROP SCHEMA command will fail. Alternatively, if
you really want to remove the schema and all of the objects it contains, you must use the
CASCADE parameter.
Creating a Table
By far the most common CREATE SQL command you will need to use is the CREATE
TABLE command. Tables hold all of the application data. Advanced planning must be
used to ensure that the application tables are created properly. Often, poor table design
is the cause of many failed applications. Whenever possible, use standard relational da-
tabase techniques to match related data elements together within the same table (such as
all customer data in the Customer table, and all product data in the Product table).
Each data table must define the individual data elements (columns) contained in
the table, the data type used for each data element, whether any of the columns will be
used as a primary key for the table, whether any foreign keys need to be defined, and
whether any table constraints should be present (such as requiring a column to always
have a value). All of this information can make the CREATE TABLE command extremely
complex.
Instead of trying to include all of the information required to create a table in one
SQL CREATE command, often database administrators utilize the ALTER TABLE SQL
command. This command is used to alter the definition of an existing table. Thus, you
can create a base definition of a table using the CREATE TABLE command, then add ad-
ditional elements using ALTER TABLE commands.
The following sections go through the process of creating a new table and adding
additional elements to the base table.
150 PostgreSQL 8 for Windows
For tables with lots of columns, this can become quite a long statement. Database ad-
ministrators often split the statement into several command-line entries. Remember, by
default, PostgreSQL does not process the SQL command until it sees a semicolon. Here
is an example of creating a simple table in psql:
test2=>
Chapter 6: Using Basic SQL 151
There are lots of things to watch for in this example. First, by default, PostgreSQL
creates all new tables in the public schema for the database you are connected to. If
you want to create a table in a different schema, you must include the schema name in
the CREATE TABLE command. Next, notice that the psql command prompt changes
when additional input is required to finish the SQL statement as it is entered on multiple
lines. This helps you keep track of what is expected next when entering commands. As
you enter the definitions for each column, remember that if you want to use mixed-case
letters in the column names, you must enclose the names in double quotes.
When the CREATE TABLE command is finished, a message is displayed showing the
command was successful (if you use the -q command-line option, the command prompt
returns with no message if the command is successful).
Besides the column data type, you can also add constraints to the column data defini-
tion:
postgres=> create table Employee (
postgres(> EmployeeID int4 primary key,
postgres(> Lastname varchar,
postgres(> Firstname varchar,
postgres(> Department char(5) not null,
postgres(> StartDate date default now(),
postgres(> salary money);
NOTICE: CREATE TABLE / PRIMARY KEY will create implicit index "employee_pkey"
for table "Employee"
CREATE TABLE
postgres=>
In this example, notice that the keyword PRIMARY KEY is used to define a primary
key used for indexing the data. The PRIMARY KEY keyword forces each data entry in the
employeeid column to be both unique and not empty (called null in database terms).
An additional constraint is added to the department column, requiring an entry for
this column. The startdate column uses the PostgreSQL now function (discussed in
Chapter 8) to enter today’s date as a default value each time a record is entered.
The action parameter can be one or more SQL commands used to modify the table.
Table 6-3 lists and describes the commands available.
Be careful when altering a table that contains data. Altering column information ap-
plies to all records already present in the table. With the Customer table added earlier,
152 PostgreSQL 8 for Windows
I did not specify the primary key, and the constraint on the Phone column. I can now add
those using the ALTER TABLE command:
test2=> alter table store."Customer" add primary key ("CustomerID");
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"Customer_pkey" for table "Customer"
ALTER TABLE
test2=> alter table store."Customer" alter column "Phone" set not null;
ALTER TABLE
test2=> \d store."Customer"
Table "store.Customer"
Column | Type | Modifiers
————————————+———————————————————+——————————
CustomerID | character varying | not null
LastName | character varying |
FirstName | character varying |
Address | character varying |
City | character varying |
State | character(2) |
Zip | character(5) |
Phone | character(8) | not null
Indexes:
"Customer_pkey" PRIMARY KEY, btree ("CustomerID")
test2=>
Here are a few other examples of using the ALTER TABLE command:
postgres=> alter table employee rename to employees;
ALTER TABLE
postgres=> alter table employees add column birthday date;
ALTER TABLE
postgres=> alter table employees rename column birthday to bday;
ALTER TABLE
postgres=> alter table employees alter column bday set not null;
ALTER TABLE
postgres=> \d employees
Table "public.employees"
Column | Type | Modifiers
————————————+———————————————————+—————————————
employeeid | integer | not null
lastname | character varying |
firstname | character varying |
department | character(5) | not null
startdate | date | default now()
salary | money |
bday | date | not null
Indexes:
"employee_pkey" PRIMARY KEY, btree (employeeid)
As a final check of your work, you can start the pgAdmin III program (as demon-
strated in Chapter 4) and look at the newly created database, schema, and tables in the
graphical display.
As expected, there are lots of options that can be used when creating a role. Table 6-4
lists and describes the options that can be used.
Option Description
ADMIN rolelist Add one or more roles as administrative members of the new role.
CONNECTION Limit the number of connections the role can have to the database.
LIMIT ‘value’ The default is -1, which is unlimited connections.
CREATEDB Allow the role to create databases on the system.
CREATEROLE Allow the role to create new roles on the system.
ENCRYPTED Encrypt the role password within the PostgreSQL system tables.
IN ROLE rolelist List one or more roles that the new role will be a member of.
INHERIT Specify that the role will inherit all of the privileges of roles it is a
member of.
LOGIN Specify that the role can be used to log into the system (a Login Role).
NOLOGIN Specify that the role cannot be used to log into the system (a Group
Role).
PASSWORD passwd Specify the password for the role.
ROLE rolelist List one or more roles that will be added as members to the new role.
SUPERUSER Specify that the new role will have superuser privileges. Only the
superuser can use this option.
VALID UNTIL ‘date’ Specify a date when the role will expire.
By default, the CREATE ROLE command uses the NOLOGIN option to create a Group
Role (although it also does not hurt to specify it). Remember, you must have superuser
privileges to create new roles in the database. This is a job usually done with the post-
gres user account:
Notice that the WITH keyword is optional in the command. If it makes things easier
for you, go ahead and use it.
To create a new Login Role, the command can get somewhat complex. Just as in cre-
ating tables, you can also use the ALTER ROLE command to help break up the options.
Unfortunately, you cannot use the IN ROLE option in the ALTER TABLE command, so
that must be entered in the CREATE ROLE command:
test2=# create role wilma in role management;
CREATE ROLE
test2=# alter role wilma login password 'pebbles' inherit;
ALTER ROLE
test2=# create role betty in role technician;
CREATE ROLE
test2=# alter role betty login password 'bambam' inherit;
ALTER ROLE
test2=# \du wilma
List of roles
Role name | Superuser | Create role | Create DB | Connections | Member of
———————————+———————————+—————————————+———————————+—————————————+—————————————
wilma | no | no | no | no limit | {management}
(1 row)
test2=#
The INHERIT parameter tells PostgreSQL to allow the Login Roles to inherit any
privileges assigned to the Group Roles they belong to.
If you need to remove any Login or Group Roles, you can use the DROP ROLE com-
mand. However, PostgreSQL will not let you remove a role if it is the owner of any data-
base objects, such as schemas or tables. You must perform an ALTER command to change
the owner of the object to another role.
Assigning Privileges
Now that you have some Login and Group Roles created, it is time to assign privileges to
database objects. In the SQL language, the command to assign privileges is GRANT:
The privlist parameter contains a list of the privileges you want roles to have on the
object object.
Granting privileges can be a complex function to perform. There are two types of
GRANT commands, depending on what the object specified in the command is:
When using the GRANT command on database objects, by default the command as-
sumes you are using table objects. If you are granting privileges to any other database
object, you must specify the object type before the object name:
test2=# \z store."Customer"
Access privileges for database "test2"
Schema | Name | Type | Access privileges
————————+——————————+———————+———————————————————————————————————————
store | Customer | table | {fred=arwdRxt/fred,management=arw/fred,
technician=r/fred}
(1 row)
test2=#
The \z command shows all of the roles that have been assigned privileges for the
table, along with the privilege codes (described in Chapter 4). The Login Role listed after
the privilege is the one who granted the privileges.
To grant a privilege to all users on the system, use the special public Group Role:
Now all users on the system have the ability to query data from the Customer table.
If you try to view the privileges for the table, you may be confused by what you now see.
Instead of seeing an entry for public, you will see an entry with no name, assigned the
r privilege:
{fred=arwdRxt/fred,management=arw/fred,technician=r/fred,=r/fred}
158 PostgreSQL 8 for Windows
The empty privilege is the default for everyone on the system, which is the public
group.
To revoke privileges already granted, use the REVOKE SQL command. You can revoke
all assigned privileges on an object, or just a single assigned privilege for a single role:
HANDLING DATA
Now that you have your tables created, your user Login and Group Roles created, and
the proper privileges assigned, you probably want to start manipulating data in the da-
tabase. There are three SQL commands that are used for handling data in PostgreSQL:
X INSERT
R UPDATE
W DELETE
The following sections describe these commands, and show how to use them in your
application.
Inserting Data
The INSERT command is used to insert new data into the table. Each row of data in the
table is called a record. Sometimes in database books and publications you will see the
term tuple, which is just a fancy way of saying record.
A record consists of a single instance of data for each column (although it is possible
that one or more columns in a record can be empty, or null). If you think of the table as
a spreadsheet, the columns are the table columns, and the rows are the individual table
records.
The basic format of the INSERT command is
By default, the INSERT command attempts to load values from the valuelist into each
column in the table, in the order the columns appear in the table. You can view the col-
umn order by using the \d meta-command. To enter data into all columns, you use the
following command:
Notice that you can separate the INSERT command into separate lines. The state-
ment is not processed until the semicolon is entered. The response from psql shows two
values. The first value is the object ID (OID) of the table record if the table was defined
to use OIDs. If not, then the value is zero. The second value is the number of records that
were entered into the table from the command.
If you do not want to enter all of the values into a record, you can use the optional
columnlist parameter. This specifies the columns (and the order) that the data values will
be placed in:
test2=# insert into store."Customer" ("CustomerID", "LastName", "Phone")
test2-# values ('BLU002', 'Blum', '555-4321');
INSERT 0 1
test2=#
Remember that the table constraints defined when creating the table apply, so you
must enter data for columns created with the NOT NULL constraint. If a column was
created with a DEFAULT VALUE constraint, you can use the keyword DEFAULT in the
valuelist to assign the default value to the column data. You can also skip columns that
have default values by not listing them in the columnlist parameter. Those columns will
automatically be assigned their default values.
Modifying Data
If data entered into the table needs to be changed, all is not lost. You can modify any data
in the table, as long as your Login Role has UPDATE privileges for that table.
The UPDATE SQL command is used for updating data contained in tables. The
UPDATE command is another one of those commands that, while the idea is simple, can
easily get complex. The basic format of the UPDATE command is
The basic format of this command finds the specified column in the table and changes
the data in that column to the value specified in the command. However, using that ba-
sic format gets many database administrators into lots of trouble. Here is the way the
scenario usually plays out. Suppose you want to go back to a data record previously
inserted into the store.Customer table, and add some data that might have been left
160 PostgreSQL 8 for Windows
out in the original INSERT command, such as the customer’s first name. It is not uncom-
mon to see this erroneous SQL code:
test2=# update store."Customer" set "FirstName" = 'Barbara';
UPDATE 2
test2=#
Your first clue that something bad happened would be the output of the UPDATE
command. The value after the update message shows the number of rows affected by
the update process. Since this value is 2, you know that more than just the one row you
wanted to change has been changed. Obviously something went wrong. Doing a simple
query on the data shows what went wrong:
test2=# select * from store."Customer";
CustomerID | LastName | FirstName | Address | City |State| Zip | Phone
————————————+——————————+———————————+————————————+———————+—————+——————+————————
BLU001 | Blum | Barbara |123 Main St.|Chicago| IL | 60633|555-1234
BLU002 | Blum | Barbara | | | | |555-4321
(2 rows)
test2=#
Ouch. We managed to change the FirstName column in both records in the table. This
is an all-to-common mistake made by even the most experienced database programmers
and administrators when in a hurry. Without the WHERE clause portion of the UPDATE
command, the update is applied to every record in the table. In almost all cases, this is
not what you intend to do.
The WHERE clause allows you to restrict the records that the UPDATE command ap-
plies to. The WHERE clause is a logical statement that is evaluated by PostgreSQL. Only
records that match the condition contained within the WHERE clause are updated. Here is
an example of a simple WHERE clause:
test2=# update store."Customer" set "FirstName" = 'Rich'
test2-# WHERE "CustomerID" = 'BLU001';
UPDATE 1
test2=# select * from store."Customer";
CustomerID | LastName | FirstName | Address | City |State| Zip | Phone
———————————+——————————+———————————+————————————+———————+—————+—————+————————
BLU002 | Blum | Barbara | | | | |555-4321
BLU001 | Blum | Rich |123 Main St.|Chicago| IL |60633|555-1234
(2 rows)
test2=#
Every record in the table is examined to determine if it meets the condition defined in
the WHERE clause. In this simplistic example, there is only one record that could possibly
match the condition, as the primary key column cannot be duplicated in the table. Only
the record that met the WHERE clause condition was updated with the new information.
Chapter 6: Using Basic SQL 161
You can add as many elements within the condition to the WHERE clause as you need
to restrict the records to a specific subset within the table, such as updating employee
records for everyone making more than $100,000.
You can also update more than one column in a single UPDATE command:
The WHERE option is where things can start to get complex. You can specify expres-
sions for just about any column from any table in the database to determine which col-
umns are updated. To use columns from other tables, you must precede the WHERE option
with the FROM option, listing the tables the foreign columns come from:
Deleting Data
The last data function discussed is removing data that is no longer needed in the table.
Not surprisingly, the DELETE command is used to do this. The format for the DELETE
command is
This command is similar to the UPDATE command. Any records matching the con-
dition listed in the WHERE clause are deleted. As with the UPDATE command, extreme
caution is recommended when using the DELETE command. Another all-too-common
mistake is to quickly type the DELETE command and forget the WHERE clause. That re-
sults in an empty table, as all records are deleted.
Here are a couple of examples of using the DELETE command:
Notice that when the DELETE command fails to find any records to delete it does not
produce an error message. Instead, it just reports back that the total number of records
deleted was zero.
162 PostgreSQL 8 for Windows
Similar to the UPDATE command, you can also specify columns in other tables in the
WHERE clause. Before doing that you must precede the WHERE clause with the USING op-
tion, specifying a list of tables the columns come from. (Note that since the DELETE com-
mand already uses the FROM keyword, it uses the USING keyword to specify the tables.
This is different from the UPDATE command.)
QUERYING DATA
Quite possibly the most important function you will perform in your applications is to
query existing data in the database. While many application developers spend a great
deal of time concentrating on fancy GUI front ends to their applications, the real heart of
the application is the behind-the-scenes SQL used to query data in the database. If this
code is inefficient, it can cause huge performance problems and make an application
virtually useless for customers.
As a database application programmer, it is essential that you understand how to
write good SQL queries. The SQL command used for queries is SELECT. Because of its
importance, much work has been done on the format of the SELECT command, to make
it as versatile as possible. Unfortunately, with versatility comes complexity.
Because of the complexity of the SELECT command, the command format has be-
come somewhat unwieldy and intimidating for the beginner. To try and keep things
simple, the next few sections demonstrate how to use some of the more basic features of
the SELECT command. Some more advanced features of the SELECT command format
will be presented in Chapter 7, after you have had a chance to get somewhat familiar
with it here.
The columnlist parameter specifies the columns from the table you want displayed
in the output. It can be a comma-separated list of column names in the table, or the
wildcard character (the asterisk) to specify all columns, as was shown in the SELECT
examples used earlier in this chapter:
By default, the records are not displayed in any particular order. As records are add-
ed and removed from the table, PostgreSQL may place new records anywhere within the
table. Even if you enter data in a particular order using INSERT commands, there is still
no guarantee that the records will display in the same order as a result of a query. If you
need to specify the order of the displayed records, you must use the ORDER BY clause:
test=> select "CustomerID", "LastName", "FirstName" from store."Customer"
test-> order by "FirstName";
CustomerID | LastName | FirstName
————————————+——————————+——————————
BLU002 | Blum | Barbara
BLU004 | Blum | Jessica
BLU003 | Blum | Katie
BLU001 | Blum | Rich
(4 rows)
test=>
In this example (taken from the test database used in Chapter 5), only the columns
specified in the SELECT command are displayed, ordered by the FirstName column. The
default order used by the SELECT command is ascending order, based on the data type
of the column you selected to order by. You can change the order to descending by using
the DESC keyword after the column name in the ORDER BY clause.
test=>
Notice that the string literal Gary uses single quotes to signify that it is a string value,
while the mixed-case column names require double quotes to preserve the uppercase let-
ters. It is easy to get these two things confused.
164 PostgreSQL 8 for Windows
test=>
As you can see, it does not take long for a seemingly simple SQL command to get
fairly complex. The first part of the command defines the data columns you want to see
in the output display. Since you are using columns from two tables, you must precede
each column name with the table it comes from. As always, remember to use double
quotes around names that use mixed-case letters.
Next, you must define which tables the columns come from in the FROM clause. If
they are not located in the public schema, you must also indicate the schema where
they are located. Finally, the WHERE clause must define a query to filter out the records
you want to display. In this example, there are two conditions to the WHERE clause. The
AND operator means that both conditions must be met for a record to be displayed.
Chapter 6: Using Basic SQL 165
The first condition states that you are looking for the record in the Order table where
the OrderID column value is ORD001. The second condition states that you are looking
for records in the Customer table that have the same CustomerID column value as the re-
cords retrieved from the Order table. For each record where both of these conditions are
matched, the information in the data columns is displayed. Since only one record in the
Order table has that OrderID value, only the information for that record is displayed.
Using Joins
In the previous example, you had to write a lot of SQL code to match the appropriate
record from the Customer table to the Order record information. In a relational database,
this is a common thing to do. To help programmers out, the SQL designers came up with
an alternative way to perform this function.
A database join matches related records in relational database tables without you
having to perform all of the associated checks. The format of using the join in an SELECT
command is
SELECT columnlist FROM table1 jointype JOIN table2 ON condition
The columnlist parameter lists the columns from the tables to display in the output.
The table1 and table2 parameters define the two tables to perform the join on. The jointype
parameter determines the type of join to perform. There are three types of joins available
in PostgreSQL:
The LEFT and RIGHT JOIN join types are also commonly referred to as outer joins.
The condition parameter defines the column relation to use for the join operation.
It is common practice to use the same column name for columns in separate tables
that contain the same information (such as the CustomerID column in the Customer and
Order tables). You can use the NATURAL keyword before the join type to inform Post-
greSQL to join using the common column name. Here is the same query as used before,
but this time using a NATURAL INNER JOIN:
test=> select "Order"."OrderID", "Customer"."CustomerID",
test-> "Customer"."LastName", "Customer"."FirstName", "Customer"."Address"
test-> from store."Order" natural inner join store."Customer";
test=>
166 PostgreSQL 8 for Windows
That is a lot less typing to do! The result shows all of the records in the Order table
that have matching CustomerID records in the Customer table. To display all of the re-
cords in the Customer table with their matching records in the Order table, use a RIGHT
JOIN:
test=> select "Order"."OrderID", "Customer"."CustomerID",
test-> "Customer"."LastName","Customer"."FirstName", "Customer"."Address"
test-> from store."Order" natural right join store."Customer"
test=>
Notice in the result set that two records in the Customer table do not have any re-
cords in the Order table, but are still displayed as the result of the RIGHT JOIN.
Using Aliases
Use table aliases to help keep down the clutter in your SELECT commands. A table alias
defines a name that represents the full table name within the SELECT command. The
basic format for using aliases is
When the table name is defined as an alias, you can use the alias anywhere within
the SELECT command to reference the full table name. This is especially handy when
you have to use schema names and double quotes for all of your table names. Here is a
typical example:
test=> select a."OrderID", b."CustomerID", b."LastName", b."FirstName",
test-> b."Address" from store."Order" as a, store."Customer" as b
test-> where a."OrderID" = 'ORD001' and a."CustomerID" = b."CustomerID";
OrderID | CustomerID | LastName | FirstName | Address
—————————+————————————+——————————+———————————+—————————
ORD001 | BLU001 | Blum | Rich | 123 Main
(1 row)
test=>
The alias a is used to represent the store."Order" table, and the alias b is used to
represent the store."Customer" table. Notice that you can use the aliases anywhere in
the SELECT statement, even before they are defined.
Chapter 6: Using Basic SQL 167
SUMMARY
SQL is at the heart of the PostgreSQL system. All interaction with PostgreSQL, whether
you are using a fancy GUI or a simplistic command-line interface, is via SQL commands.
This chapter introduced the basic SQL commands necessary to create database objects,
insert and modify data, and query data.
The CREATE SQL family of commands allows you to create new database objects.
The CREATE DATABASE command is used to start a new database area separate from
the default area set up for system tables. The CREATE SCHEMA command allows you to
provide a separate working area for each application within a single database. And, of
course, the CREATE TABLE command is used for creating tables to hold the application
data. The CREATE TABLE command is by far the most complex, as you are required to
define all of the data elements and constraints contained within the application.
After creating the database environment, you must use the CREATE ROLE command
to create Group and Login Roles. These roles are used for allowing users access to da-
tabase objects. The GRANT and REVOKE commands are used to assign the appropriate
privileges to the Group Roles.
The next step in the process is managing data within the tables. The INSERT com-
mand is used to insert new data records into a table. Table columns can be set to provide
default values when a data element is not provided. The UPDATE command is used to
modify data contained in existing tables. The WHERE clause of the UPDATE command is
crucial in restricting the update to specific records within the table. Likewise, the DE-
LETE command is almost always modified with the WHERE clause when used for remov-
ing data elements from the table.
Possibly the most important SQL command available is the SELECT command. This
command allows you to create complex queries on the data stored within database ta-
bles. The SELECT command has lots of features, which require lots of command-line
parameters used to fine-tune the query.
The next chapter continues coverage of the SELECT command. Advanced uses for
SELECT are covered, as well as tips on how to maximize data retrieval using SELECT
commands. The chapter also covers how to create views, combining data elements from
multiple tables into a single object that can be used in queries, and triggers, which allow
you to perform table functions based on database events. The chapter closes out on how
to define SQL command transactions, grouping multiple SQL commands into a single
transaction that is either accepted or denied by PostgreSQL.
This page intentionally left blank
7
Using Advanced SQL
169
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
170 PostgreSQL 8 for Windows
C
hapter 6 covered the basics of interacting with your PostgreSQL system using
SQL. This chapter extends that thought, presenting some more advanced SQL
topics that will help you handle data within your PostgreSQL system. As was
mentioned in Chapter 6, the SELECT command is possibly the most complex of the SQL
commands. This chapter picks up on the discussion of the SELECT command and shows
all of the options that can be used when querying a database. After that, table views are
discussed. Views allow you to group data elements from multiple tables into a single
virtual table, making querying data much easier. Following views, indexes are covered.
Creating indexes on heavily queried columns can greatly speed up the query process.
Next, the idea of transactions is demonstrated. Transactions allow you to group SQL
commands together in a single operation that is processed by the database engine. The
chapter finishes by discussing cursors. Cursors are used to help maneuver around a
result set produced by a SELECT command. They can be used to control how you view
the result set.
This is about as complex of a command as you can possibly get in the SQL command
world. The best way to look at this is to walk through the various parameters one by one
and discuss each feature individually. The following sections do just that.
Chapter 7: Using Advanced SQL 171
The DISTINCT clause section defines how the SELECT command handles duplicate
records in the table. The ALL parameter defines that all records that are returned in the
result set will be displayed in the SELECT command output, even if there are duplicates.
This is the default behavior if neither of these parameters is specified in the DISTINCT
clause.
The DISTINCT parameter specifies that when more than one record in a result set
has the same values, only the first record is displayed in the output. The duplicate re-
cords are suppressed. This can be beneficial if you have tables that may contain duplicate
information.
By default DISTINCT only eliminates records that are complete duplicates (all the
column values match). You can use the ON option to define which column (or a comma-
separated list of columns) to compare for duplicates.
The most common use for the DISTINCT clause is to display a list of the distinct
number of values for a specific data column. For example, if you need to produce a
report on all the cities you have customers in, you would want only one occurrence of
each individual city, not one for each customer. To do this you could use the following
SQL command:
test=> select distinct on ("City") "City", "State" from store."Customer";
City | State
—————————+——————
Chicago | IL
Gary | IN
Hammond | IN
(3 rows)
test=>
Each city is displayed only once, no matter how many customers are located in that
city.
The SELECT list is the portion of the SELECT command between the SELECT and
FROM keywords. The SELECT list section lists the columns you want displayed in the
result set. The columns can all be from the same table, or they can be columns from mul-
tiple tables, preceded by the table name. The asterisk is used to indicate that all columns
in a table should be displayed in the output.
172 PostgreSQL 8 for Windows
The AS option allows you to change the column heading label in the output to a value
different from the column name. For example:
test=> select "CustomerID" as "ID", "LastName" as "Family",
test-> "FirstName" as "Person" from store."Customer";
ID | Family | Person
————————+————————+————————
BLU004 | Blum | Jessica
BLU001 | Blum | Rich
BLU002 | Blum | Barbara
BLU003 | Blum | Katie
(4 rows)
test=>
Instead of the generic column names, the output display now shows the alternative
column headings specified in the AS option.
Despite its innocent looks, the FROM clause can be the most complex part of the
SELECT command, as the from_list parameter has several different formats all of its own.
Here are the different formats, each one on its own line:
[ONLY ] table_name [ * ] [ [ AS ] alias [ (column_alias [,...] ) ] ]
( select ) [ AS ] alias [ (column_alias [,...] ) ]
function_name ( [argument [,...] ]) [ AS ] alias [ (column_alias [,...] |
column_definition [,...] ) ]
function_name ( [ argument [,...]) AS (column_definition [,...] )
from_item [ NATURAL ] join_type from_item [ ON join_condition | USING
(join_column [,...]) ]
Just as with the SELECT command, it is easier to break the FROM clause options down
one by one to discuss them.
First, by now you should be familiar with most of the first line of parameters. The
table_name parameter specifies the tables to find the requested columns in. The ONLY op-
tion directs PostgreSQL to search only the table specified, and not any tables that inherit
the specified table (child tables, often created for partitioning, discussed in Chapter 1).
An asterisk after the table name directs PostgreSQL to search all child tables of the speci-
fied table (this is the default behavior in PostgreSQL version 7.1 and later). As seen in
Chapter 7: Using Advanced SQL 173
Chapter 6, the AS parameter allows you to define an alias for the table name that can be
used anywhere within the SELECT command.
The Sub-select
( select ) [ AS ] alias [ (column_alias [,...] ) ]
The second format shows features that allow you to define a sub-query (called a sub-
select) to extract data from. The sub-select is evaluated first, then the result set is used in
the original select.
The result set from the sub-select acts as a temporary table, which is queried using
the original SELECT command. It is required that the sub-select be enclosed in paren-
theses and assigned an alias name. You can optionally provide aliases to the result set
columns for the sub-select:
test=> select * from (select "CustomerID", "FirstName" from store."Customer")
test-> as test ("ID", "Name");
ID | Name
————————+————————
BLU004 | Jessica
BLU001 | Rich
BLU002 | Barbara
BLU003 | Katie
(4 rows)
test=>
Functions
The next two formats describe the use for querying the result of a PostgreSQL function:
function_name ( [argument [,...] ]) [ AS ] alias [ (column_alias [,...] |
column_definition [,...] ) ]
function_name ( [ argument [,...]) AS (column_definition [,...] )
The result set of the declared function is used as the input to the first SELECT com-
mand. Just as with the sub-select, you must define an alias name for the function result
set, and optionally declare column names for multicolumn function output. The SELECT
command queries the function output just as if it were a normal database table.
Using PostgreSQL functions is discussed in detail in Chapter 8.
Joins
from_item [ NATURAL ] join_type from_item [ ON join_condition | USING
(join_column [,...]) ]
174 PostgreSQL 8 for Windows
The last format of FROM parameters handles joins. As discussed in Chapter 6, joins
allow you to easily match relational data between tables in the database. The NATURAL
keyword is used to join tables on common column names. Alternatively, you can use the
ON keyword to define a join condition, or the USING keyword to define specific matching
column names in both tables.
test=> select "Customer"."LastName", "Customer"."FirstName",
test-> "Product"."ProductName", "Order"."TotalCost" from
test-> store."Order" natural inner join store."Customer"
test-> natural inner join store."Product";
LastName | FirstName | ProductName | TotalCost
——————————+———————————+—————————————+——————————
Blum | Rich | Laptop | $5,000.00
Blum | Barbara | Laptop | $1,000.00
Blum | Katie | Desktop | $300.00
(3 rows)
test=>
This example shows the use of two inner joins to obtain data from tables related to
the Order table via defined foreign keys. The Customer and Order tables are joined on
the CustomerID column, while the Order and Product tables are joined on the ProductID
column. The Customer table matches the LastName and FirstName values related to the
CustomerID value stored in the Order table. The Product table matches the Product-
Name value related to the ProductID stored in the Order table.
The elements of the WHERE clause were discussed in detail in Chapter 6. The WHERE
clause specifies one or more conditions that filter data from the result set. While the con-
cept is simple, in reality the WHERE clause conditions can get somewhat complex, speci-
fying multiple conditions joined using Boolean operators, such as AND, OR, or NOT.
The GROUP BY clause specifies one or more columns of data to group together in the
result set. Unlike the DISTINCT clause, the GROUP BY clause allows you to aggregate
values of the grouped records. This is almost always used with a PostgreSQL function
that aggregates values from similar records. This example shows how to produce a re-
port showing the total number of products currently ordered by customers:
Chapter 7: Using Advanced SQL 175
test=>
The GROUP BY clause must declare which column is used to sort the records. The
sum() function adds the values of the Quantity column of each record (This will be
discussed further in Chapter 8). Be careful with the GROUP BY and ORDER BY clauses.
It is important to remember that the GROUP BY clause groups similar records before the
rest of the SELECT command is evaluated, while the ORDER BY clause orders records
after the SELECT commands are processed. The sum() function would not work prop-
erly using the ORDER BY clause, since the records would not be ordered until after the
sum() function executes.
The HAVING clause is similar to the WHERE clause, in that it is used to define a filter
condition to limit records used in the GROUP BY clause. Records that do not satisfy the
WHERE conditions are not processed by the GROUP BY clause. The HAVING clause filters
the records contained in the result set after the GROUP BY clause groups the records.
The Set Operation clauses use mathematical set theory to determine the result set of
two separate SELECT commands. The Set Operation clause types are:
X UNION Display all result set records in both select1 and select2
R INTERSECT Display only result set records that are in both select1 and select2
W EXCEPT Display only result set records that are in select1 but not in select2
By default, duplicate records in the output set are not displayed. The ALL keyword is
used to display all records from the output, including duplicate records.
The set operation is performed on the result sets of the SELECT commands. The re-
sult sets must have the same number of columns, and the columns must have the same
data types.
176 PostgreSQL 8 for Windows
When using the LIMIT and OFFSET parameters, you must be careful that you do not
overlap the records in the listing.
By default, the SELECT command does not lock any records during its operation.
The FOR clause allows you to change that behavior. It is not uncommon during a transac-
tion (discussed later in this chapter) to perform a SELECT command to obtain the current
data values, then immediately perform an INSERT, DELETE, or UPDATE command to
alter the values in the database. Between those operations, you may not want the current
values to be changed by other database users. This is where the FOR clause comes in.
The FOR UPDATE clause causes the records returned in the result set to be locked
in the table. This prevents other users from viewing, deleting, or modifying any re-
cord returned in the result set until the transaction that contains the SELECT command
finishes.
Using the FOR UPDATE clause causes the SELECT command to wait until any other
table locks set by other users are completed. If you specify the NOWAIT parameter, the
SELECT command does not wait, but instead exits with an error stating that the records
are locked.
The FOR SHARE clause behaves similarly to the FOR UPDATE clause, except it allows
other users to view the records while they are locked. However, they will not be able to
delete or modify any of the records.
By default, the FOR clause locks all records that are returned in the result set. If you
do not want to lock all of the records returned in the result set, you combine the FOR
clause with the LIMIT clause to limit the number of records locked in the transaction.
TABLE VIEWS
As described in Chapter 1, table views allow you to combine columns from multiple ta-
bles into a “virtual table” for querying. The combination of columns is performed as the
result of a query using the SELECT command. This feature allows you to create a result
set and use it just as if it were a physical table. This is demonstrated in Figure 7-1.
Views are often used to help simplify complex sub-selects within SELECT commands.
Instead of constantly having to type a complex sub-select, you can assign it to a view,
then use the view in your SELECT commands.
Views are created using the CREATE VIEW SQL command. The format of this com-
mand is
Orders table
Customer ID Product ID Quantity Cost
Customer ID Last name First name Product ID Product name Quantity Cost
After creating the view (remember, you need the proper privileges to do this) and as-
signing privileges to allow Group Roles to use it, other users can query the view just as if
it were a table. PostgreSQL only allows you to query views. As of the time of this writing,
you are not able to insert, remove, or modify data directly from a view.
To display the views available in a database, use the \dv meta-command (plus the
schema name if the view was created in specific schema):
test=>
Once created, users can query the view just as if it were a normal table:
test=>
If you no longer need the view, the view owner can remove it using the DROP VIEW
command.
TABLE INDEXES
Querying tables is the most computer-intensive part of using a database. For each query,
PostgreSQL must compare each record contained in the table with any conditions de-
fined in the SELECT command. This can consume both time and processor power, espe-
cially when working with large amounts of data.
180 PostgreSQL 8 for Windows
Fowler SNE001
WIL001 Williams Frank 9582 Oak St. Granger IN 46100 555-4342
Gary BLU002
WIS001 Wisecarver Emma 4332 Pine Ave. Valparaiso IN 46500 555-4352
WIS002 Wisecarver Ian 4938 Rose Terr. Hammond IN 46500 555-4322 Granger WIL001
SNE001 Snell Haley 731 Maple Ln. Fowler IN 46302 555-9184 Granger WIL003
WIL002 Williams Melanie 5439 Evergreen way Hammond IN 46500 555-6832 Hammond WIS002
BLU002 Blum Barbara 219 Dogwood Ave. Gary IN 46200 555-6728 Hammond WIL002
WIL003 Williams Nick 2819 Cherry St. Granger IN 46100 555-3298 Valparaiso WIS001
Creating an Index
Chapter 6 showed how to create one type of index when using the CREATE TABLE com-
mand. Part of creating a table is often defining a column to use as the primary key. The
primary key is a special index that causes PostgreSQL to automatically create an index for
the primary key column. You can see the index created using the \d meta-command:
Chapter 7: Using Advanced SQL 181
test=> \d store."Customer"
Table "store.Customer"
Column | Type | Modifiers
————————————+———————————————————+——————————
CustomerID | character varying | not null
LastName | character varying |
FirstName | character varying |
Address | character varying |
City | character varying |
State | character(2) |
Zip | character(5) |
Phone | character(8) |
Indexes:
"CustomerKey" PRIMARY KEY, btree ("CustomerID")
test=>
The CustomerKey index was automatically created for the CustomerID column
when the primary key was defined.
You can manually create indexes for any column within the table that might be used
in a query. To create an additional index, you must use the CREATE INDEX command:
When creating an index, you have the option of specifying several things. First off, if
you specify the UNIQUE keyword, each entry in the index file must be unique, so every
record must have unique data for the specified column. While using a UNIQUE index is
allowed in PostgreSQL, the preferred method of forcing a column value to be unique is
to add a constraint to the column (see “Creating a Table” in Chapter 6).
One or more columns or expressions that use columns are specified as the value to cre-
ate the index on. You can specify as an index value not only a column, but also a function
that uses a column. The classic example of this is the lower() string function (described
in Chapter 8). This function converts a string to all lowercase letters. By default, the in-
dex on a string value is case sensitive. String values in uppercase are considered different
from string values in lowercase. In some applications this can be a problem, as you never
know when a customer will enter data values such as names in upper- or lowercase. By
creating an index using the lower() function, you can eliminate this problem by con-
verting all string data values to lowercase for the index.
Picking which columns to create indexes for is often a science in itself. For each index
you define, there is overhead associated with it. Every time a new record is added to
the table, PostgreSQL must automatically determine where in the index file to place the
indexed column value. The trick is balancing which indexes will improve your query
182 PostgreSQL 8 for Windows
performance without having too much of a negative impact on data entry. For appli-
cations that do a lot of data input and little querying, it is often best to not mess with
indexes.
An example of creating a simple index follows:
test=# create index "Customer_City_Index" on store."Customer" ("City");
CREATE INDEX
test=# \di store.
List of relations
Schema | Name | Type | Owner | Table
————————+—————————————————————+———————+——————————+—————————
store | CustomerKey | index | postgres | Customer
store | Customer_City_Index | index | postgres | Customer
store | OrderKey | index | postgres | Order
store | ProductKey | index | postgres | Product
(4 rows)
test=# \d store."Customer_City_Index"
Index "store.Customer_City_Index"
Column | Type
————————+——————————————————
City | character varying
btree, for table "store.Customer"
test=#
Notice that the naming rules apply to indexes as well. If you use uppercase letters in
the index name, remember to use double quotes. Also, notice that by default the index
was created in the same schema as the data table it references.
You can also create indexes based on more than one column:
test=#
PostgreSQL will automatically use the resulting index on queries where the WHERE
condition references both the City and State columns.
Chapter 7: Using Advanced SQL 183
The TABLESPACE parameter allows you to create the index file on a tablespace dif-
ferent from where the original table is located. By default, the index is placed on the de-
fault system tablespace, usually pg_default. On a heavily loaded system, it is common
for database administrators to place table indexes on a separate hard drive from the data
table. This helps minimize the head movement on a specific hard drive as PostgreSQL
searches the index file and the table data.
The WHERE parameter allows you to create a partial index, which indexes only a sub-
set of the full table data. Only the data that meets the WHERE condition is placed in the
index. As mentioned, creating an index does have a performance cost. If there is a par-
ticular subset of data that is queried more so than other data, you can create the index
using only this data.
test=#
If you determine that the created index is detrimental to your database performance,
you can remove the index by using the DROP INDEX command. If lots of data is added
to the table, you may consider using the REINDEX command. The REINDEX command
is used to, obviously, reindex the index files associated with the table. In very active
tables, often the index can get somewhat out of wack, and reindexing can help speed up
performance of the index. The REINDEX command is part of the table maintenance tools
provided by pgAdmin III.
X btree
R rtree
R hash
W gist
184 PostgreSQL 8 for Windows
The btree option is the default value used by PostgreSQL. It uses the balanced-tree
(B-Tree) sorting algorithm to create links between index values. The B-Tree sorts the data
based on the index value, then references index values with links. Each index value links
to values both less than and greater than the value.
The B-Tree index contains a root value, which is usually in the middle of the data
range. The B-Tree index compares the desired result value to the root value and deter-
mines if it is less than, greater than, or equal to the root value. Obviously, if the desired
value is equal to the root value, the search is complete. If the value is less than the root
value, a link to the midpoint of the lower range is given. In a similar manner, if the de-
sired value is greater than the root value a link to the midpoint of the higher range is
given. Each of the resulting sides is then split in half, and so on until the desired value is
found in the index. This is demonstrated in Figure 7-3.
The B-Tree links allow PostgreSQL to quickly scan through the index looking for the
appropriate values. The City index starts with the middle value, Gary. If the city that is
being searched is greater than Gary, the right-side link is followed to Hammond. If the
desired value is less than Hammond (but greater than Gary), the left-side link is followed
to Granger. Thus, instead of having to read every record in the table, PostgreSQL was
able to find the proper record in just a couple of reads in the City index file.
The B-Tree index method is useful for indexing numerical and string values. If you
are querying a table for values equal to, less than, or greater than a value, the B-Tree in-
dex can quickly produce results, as this is the core basis of the underlying data sort in the
index. If you are in doubt about what indexing method to use, it is safe to use the default
B-Tree method.
The rtree option uses an R-Tree (real tree) index, and is often used for data ranges (such
as television programs that span a timeframe) or multidimensional data (such as spatial
data or combinations of data). It uses a recursion tree-indexing method. The layout of
the R-Tree index is similar to the B-Tree, but the benefit of the R-Tree is that it can group
data elements together within the individual nodes. An example of an R-Tree index is
shown in Figure 7-4.
The spatial points are first grouped into two groups, R1 and R2. The points are then
further grouped into smaller groups, R3, R4, R5, and R6. The R-Tree index is organized by
the individual groups, and shows which spatial groups are subsets of other spatial groups.
Gary
Fowler Hammond
Spatial data
R1 R2
R3
5
R5
R-Tree index
4 R1 R2
R6 R3 R4 R5 R6
3
R4
2
1 2 3 4 5 6
If you are specifically working with multidimensional point values, use the R-Tree index
method. Otherwise, it is not any benefit over the standard B-Tree method.
The hash option uses a hash index, and is the oldest and least-used indexing algorithm
available in PostgreSQL. It uses simplified hashing techniques to sort data elements. The
hash index is only useful for comparing value equalities, and is less useful than the stan-
dard B-Tree method. PostgreSQL recommends using the default B-Tree method instead
of the hash index method. The hash index method is mainly available for compatibility
purposes with older systems.
The gist option uses a Generalized Search Tree (GiST) index. This is a relatively new
indexing method that utilizes several different indexing techniques in a conglomeration
of algorithms. This method is supported by PostgreSQL starting in version 7.3.
The GiST index method allows administrators to define indexes for many differ-
ent types of data, indexing on many different user-defined features. Some examples of
advanced GiST indexes are indexing three-dimensional objects and full-text indexing of
complete text articles.
TRANSACTIONS
As described in Chapter 1, PostgreSQL supports the database concept of transactions.
Transactions allow you to bundle multiple SQL commands together into one event that
186 PostgreSQL 8 for Windows
is either fully entered into the database or fully denied by the database. If any one com-
mand in the transaction fails, the entire transaction fails.
A typical example of using transactions is when related data in multiple tables is
being inserted into the tables. In the store example, the Order table contains a column
to indicate the quantity of a product a customer purchases. Similarly, the Product table
contains a column to indicate the number of items of a product currently in inventory.
Ideally, if an order is processed, your application must subtract the quantity purchased
in an order from the product inventory count. If either one of these transactions fails, the
inventory count will be inaccurate.
If any of the SQL commands that were run within the transaction fails, all of the com-
mands executed before the failed command must be undone, called a rollback. PostgreSQL
must keep track of all data entered or modified in the database during the transaction.
The following sections describe how to perform transactions in the PostgreSQL
system.
Basic Transactions
PostgreSQL supports transactions with the BEGIN and COMMIT SQL commands. As you
can probably guess, the BEGIN command indicates the start of a transaction block of
commands, and the COMMIT command indicates the end of the transaction block.
By default, each SQL command processed by PostgreSQL is handled as a separate
transaction. If you are using psql and enter a SQL INSERT command to enter a new
record in the Order table, then a SQL UPDATE command to update the Product table,
PostgreSQL handles each command as a separate transaction. If the INSERT command
fails because the data entry clerk forgot to enter the OrderID column data, but the
UPDATE command succeeds on the Product table, you will have a problem.
To force PostgreSQL to handle the two commands as a single transaction, use the
BEGIN and COMMIT commands. Here is an example of a failed transaction:
test=> begin;
BEGIN
test=> insert into store."Order" values ('BLU001', 'LAP001', 2, '1000.00',
test(> 'ORD004');
INSERT 0 1
test=> update store."Product" set "Inventory" = 9 where "ProductID" = 'LAP001';
ERROR: permission denied for relation Product
test=> commit;
ROLLBACK
test=> select * from store."Order";
CustomerID | ProductID | Quantity | TotalCost | OrderID
————————————+———————————+——————————+———————————+————————
BLU001 | LAP001 | 10 | $5,000.00 | ORD001
BLU002 | LAP001 | 2 | $1,000.00 | ORD002
BLU003 | DES001 | 1 | $300.00 | ORD003
(3 rows)
test=>
Chapter 7: Using Advanced SQL 187
The INSERT command to add the new record to the Order table was processed with-
in the transaction, but the UPDATE command to change the inventory total failed, as the
user did not have permission to update the Product table. When the COMMIT statement
was processed to commit the transaction, it produced the ROLLBACK message, which
indicates the transaction was rolled back and not processed. Looking at the Order table,
you can see that the record that was inserted during the transaction was removed (rolled
back) from the table.
Advanced Transactions
Besides the auto-rollback feature, PostgreSQL also allows you to perform a manual roll-
back at any point within the transaction using the ROLLBACK command:
test=> begin;
BEGIN
test=> insert into store."Order" values ('BLU001', 'LAP001', 2,
test(> '1000.00', 'ORD004');
INSERT 0 1
test=> select * from store."Order" where "OrderID" = 'ORD004';
CustomerID | ProductID | Quantity | TotalCost | OrderID
————————————+———————————+——————————+———————————+————————
BLU001 | LAP001 | 2 | $1,000.00 | ORD004
(1 row)
test=> rollback;
ROLLBACK
test=> select * from store."Order" where "OrderID" = 'ORD004';
CustomerID | ProductID | Quantity | TotalCost | OrderID
————————————+———————————+——————————+———————————+————————
(0 rows)
test=> commit;
WARNING: there is no transaction in progress
COMMIT
test=>
When the INSERT command is processed, the record is added to the table, but only
within the confines of the transaction. PostgreSQL hides the changes from other users
using the Multi-version Concurrency Control (MVCC) feature, discussed in Chapter 1. If
another user should query the table at this point in time, the record would not be visible
to them. After inserting the new record, the ROLLBACK command is executed. This re-
moves any previously executed commands since the start of the transaction. The newly
added record is removed from the table, and the transaction is aborted.
In this example, once the ROLLBACK command is executed, the entire transaction
is aborted. Sometimes in extremely long and complex transactions, you may need to
188 PostgreSQL 8 for Windows
perform sub-transactions—that is, little mini transactions within the larger transaction.
In a normal transaction, if any of the commands within the transaction fail, the entire
transaction is doomed for failure. To demonstrate this, look at this example:
test=> begin;
BEGIN
test=> update store."Product" set "Inventory" = 9
test-> where "ProductID" = 'LAP001';
ERROR: permission denied for relation Product
test=> insert into store."Order" values ('BLU001', 'LAP001', 2, '1000.00',
test(> 'ORD004');
ERROR: current transaction is aborted, commands ignored until end of
transaction block
test=> commit;
ROLLBACK
test=>
In this example, the failed command was the first command in the transaction. When
the second command was processed, even though it should have succeeded, PostgreSQL
indicated that it was ignored, as the transaction as a whole was already on its way to be-
ing rolled back. No other commands will be processed by PostgreSQL until the COMMIT
command is entered, completing the rollback of the transaction.
In some cases, though, you may want the parts of a transaction that succeed to be
committed, even if commands farther down in the transaction fail. PostgreSQL allows
you to create mini transactions within the larger transaction. If any of these mini transac-
tions fails, PostgreSQL can still process the previous commands within the larger trans-
action.
The SAVEPOINT command allows you to set places within the transaction block
where PostgreSQL can roll back to instead of aborting the entire transaction. This allows
you to save steps along the transaction process, and roll back to those steps if a future
command fails, thus salvaging the work that was completed.
You can set multiple savepoints within the transaction. You must assign each save-
point a name, and then use that name in the ROLLBACK command. Here is an example of
using a SAVEPOINT command to salvage a failed transaction:
test=> begin;
BEGIN
test=> insert into store."Order" values ('BLU001', 'LAP001', 2,
test(> '1000.00', 'ORD004');
INSERT 0 1
test=> SAVEPOINT first_point;
SAVEPOINT
test=> update store."Product" set "Inventory" = 9
test-> where "ProductID" = 'LAP001';
ERROR: permission denied for relation Product
Chapter 7: Using Advanced SQL 189
test=>
In this example, a savepoint named first_point was declared after the INSERT
command. When the UPDATE command failed, the ROLLBACK command was entered,
using the specific savepoint name. The transaction was rolled back to that point, and the
COMMIT command was entered. The transaction was processed just as if the command
after the SAVEPOINT was never entered.
CURSORS
Earlier, the “Revisiting the SELECT Command” section of this chapter showed how
to use the LIMIT and OFFSET parameters to walk your way through large result sets.
However, even these features can get cumbersome when working with large result sets.
Fortunately for us, there is an easier solution available.
SQL cursors are used as pointers within a result set returned from a SELECT com-
mand. The cursor points to a specific row of data within the result set. There is a subset
of SQL commands available that allows you to work with the cursor, return data from a
result set based on where the cursor is located, and move the cursor around within the
result set.
A cursor is a temporary object and, by default, only exists within a transaction,
although you can specify that the cursor stay active within an entire session. When the
transaction is terminated (either from a rollback or a commit) or the user terminates the
session, PostgreSQL automatically deletes the cursor object. You can also manually close
a cursor object within a transaction if you need to.
The following sections show how to create and use a cursor in your transactions.
Creating a Cursor
The DECLARE SQL command is used to create a new cursor. The format of the DECLARE
command in PostgreSQL is
DECLARE cursorname [BINARY] [ [NO] SCROLL]
CURSOR [ (WITH | WITHOUT) HOLD ] FOR query
190 PostgreSQL 8 for Windows
As you can tell from the command format, there are a few options available when
you create the cursor:
X BINARY Allows you to use the result set as binary data instead of the default
text data.
R SCROLL Allows you to move backward within the result set (specifying NO
SCROLL may increase performance if you do not need this feature).
W WITH HOLD Allows the cursor to stay active for the entire session (the default
is WITHOUT HOLD, which deletes the cursor at the end of the transaction).
The query parameter defines the SELECT command used to produce the result set.
Any valid SELECT command (including sub-selects) can be used to produce the result
set for the cursor.
If you create a cursor without the hold property (which is the default), you must start
a transaction using the BEGIN command before you can create the cursor:
test=> begin;
BEGIN
test=> declare all_customers cursor for select * from store."Customer";
DECLARE CURSOR
test=>
Notice that the result set for the SELECT command is not displayed automatically.
The cursor all_customers contains the data from the result set. The next step is to use
the cursor to view the data.
Using a Cursor
When the cursor is created, it points to the first record in the result set. You can now
display records from the result set based on the location of the cursor. As records are
displayed, the cursor moves on to the next record. There are a few commands that can be
used for handling active cursors:
X FETCH Retrieves (displays) records from the result set and sets the cursor to
the next record in the result set
R MOVE Moves the cursor to another record without displaying records
W CLOSE Removes the cursor before the end of the transaction
The FETCH command is the workhorse for cursors. This is the command you use to
maneuver around the result set and view the data. The basic format for the FETCH com-
mand is
Direction Description
NEXT Fetch the next record in the result set (the default if no
direction is specified)
PRIOR Fetch the previous record in the result set
FIRST Fetch the first record of the result set
LAST Fetch the last record of the result set
ABSOLUTE count Fetch the count’th record in the result set
RELATIVE count Fetch the count’th record from the current location in the
result set
Count Fetch the next count records in the result set
ALL Fetch all remaining records in the result set
FORWARD Fetch the next record in the result set (same as NEXT)
FORWARD count Fetch the next count records in the result set
FORWARD ALL Fetch all of the remaining records in the result set
BACKWARD Fetch the previous record in the result set (same as PRIOR)
BACKWARD count Fetch the previous count records in the result set
BACKWARD ALL Fetch all of the previous records in the result set
Obviously cursor is the cursor name used in the DECLARE command to open the cur-
sor. The direction parameter specifies the direction within the result set to get the next
record from. Remember, if the cursor was created using the NO SCROLL option, you can
only move forward in the result set. Table 7-1 shows the various direction values you
can use.
The MOVE command uses the exact same format and parameters as the FETCH com-
mand, but only moves the cursor to the new location without displaying any of the result
set records.
To test out the cursor functions, you should insert some records into one of your test
tables and play around with the various direction parameters. Here are a few examples
of using the FETCH and MOVE commands in a result set:
test=> begin;
BEGIN
test=> declare all_customers cursor for select "CustomerID",
192 PostgreSQL 8 for Windows
When the DECLARE command is executed, the cursor points to the first record in
the result set. The FETCH command displays all of the records in the result set, leaving
the cursor at the end of the result set. If you try to fetch the next record, no error will be
produced, but you will not get any records returned:
To retrieve the first record in the result set, you can use the FIRST direction:
Now the cursor points to the first record in the result set. When you use another
FETCH command, you will retrieve records relative to the first record, where the cursor
points to:
The three records after the first record are displayed, and the cursor is set to the last
record that is displayed.
So far so good, but look at what happens when you display the previous two records:
The previous two records were displayed, but in reverse order from how they were
in the result set. This is an important thing to realize when moving backward in the
result set. The records are displayed in the order based on the direction the cursor is
moving. When the cursor is moving backward, the records are displayed in reverse
order. When the cursor is moving forward, the records are displayed in normal order.
Another common mistake happens when using the MOVE command. The MOVE com-
mand moves the cursor to a particular record within the result set without displaying
any records. Often you will see a beginner try to move the cursor to the first record of the
result set, then attempt to use a FETCH command to display the record:
test=>
Notice that the FETCH command retrieved the second record in the result set, not the
first. Most likely this is not the result that was intended. The MOVE FIRST command
moves the pointer to the first record in the result set, but the FETCH NEXT command
retrieves the next record, which is actually the second record in the result set.
If you need to retrieve the record that the cursor currently points to, you can use the
FETCH 0 command:
test=>
194 PostgreSQL 8 for Windows
Now the FETCH command displays the first record in the result set.
When the transaction is ended (using the COMMIT command), the all_customers
cursor goes away. You can also close the cursor before the transaction ends by using the
CLOSE command.
Be careful when using cursors in transactions where you insert, delete, or modify
data. Remember, the cursor only returns the records (and their values) at the time of the
query in the DECLARE command (sort of a snapshot of the table). Any modifications to
the data after that time, even if the cursor is still open, do not appear in the cursor results.
This can cause huge problems if you do not remember this rule.
The best way to get comfortable with cursors is to play around with them. It is usu-
ally not until you make a few mistakes that you will get a true understanding of how
cursors work within the result set.
SUMMARY
This chapter continued the journey through SQL commands. First, a recap of the SELECT
command was presented, along with the full command format and descriptions. There
are lots of features involved in the SELECT command for producing queries. Next, using
views was discussed. Views allow you to create temporary tables combining columns
from multiple tables. This can help simplify complex queries, and also give your custom-
ers a single place to look for data, rather than having to hunt through several tables in
the database.
After views, helping speed up queries using table indexes was shown. Just as a book
index helps you to find topics in a book, table indexes help PostgreSQL to find data in
tables quicker and easier. Indexes on columns that are often used for queries can often
speed up the query process. However, overusing indexes can be a detriment to database
performance. As each new record is inserted into a table, all of the associated table in-
dexes must be updated as well.
Transactions can be used to help keep the data in tables consistent. A transaction
groups a set of SQL commands into a single block that is either processed or rejected
by PostgreSQL. If a single command within the block fails, PostgreSQL undoes, or rolls
back, the results of all the previous commands in the transaction. Savepoints can be
set within long transactions to provide places for PostgreSQL to roll back to within the
transaction. This allows some commands within the transaction to succeed, even if fu-
ture commands fail.
Finally, cursors were discussed. Cursors allow you to walk through the result set
from a SELECT command. They allow you to move both forward and backward through
the result set to examine the records contained in the result set.
The next chapter continues showing how to use SQL in PostgreSQL. PostgreSQL
provides many built-in functions that can be used in your SQL commands. Chapter 8
discusses these functions, and shows how to use them to simplify your applications.
8
PostgreSQL Functions
195
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
196 PostgreSQL 8 for Windows
W
hen handling data in a table, applications are often required to perform
functions with the data, such as obtaining the square root of a value, totaling
the values in a column, or converting a text value to all upper- or lowercase
letters. Instead of having to write code in your application to do these types of functions,
PostgreSQL provides many commonly used functions for you. All you need to do is
include the PostgreSQL functions directly in your SQL commands. This chapter walks
through the built-in PostgreSQL data functions and shows how to use them in your
applications.
WHAT IS A FUNCTION?
Functions are used by the database engine to manipulate data contained in the tables.
The database engine provides functions to help database programmers minimize the
amount of code required in an application program to manipulate data contained in
tables. Instead of having to write code for basic calculations, such as adding numbers to
a column data value, summing column data values, determining averages, or convert-
ing strings to integers, the programmer can utilize functions already provided by the
database engine.
While the ANSI SQL standards include some generic functions that should be imple-
mented, they do not specify the exact format of the functions. Thus, any functions avail-
able within a particular database engine are designed purely by the database engine de-
velopers. There is no guarantee that a function present in one database engine is portable
to other database engines. That said, many functions available within current database
engines are somewhat consistent between vendors.
The PostgreSQL developers have included many functions within the PostgreSQL
database engine to assist you with your database programming needs. The following
sections describe the basics of functions in the PostgreSQL database engine.
Operators
Operators are the most basic form of data functions. They include the standard math-
ematical operators you are familiar with (such as addition, subtraction, multiplication,
division, and Boolean operators), along with some you might not be familiar with (such
as cubed roots and factorials). The \do meta-command in psql displays all of the opera-
tors available in the PostgreSQL database engine. You can also specify a single operator
with the \do meta-command to see the format for just that operator:
test=> \do &
List of operators
Schema | Name | Left arg type | Right arg type | Result type | Description
————————————+——————+———————————————+————————————————+—————————————+————————————
pg_catalog | & | bigint | bigint | bigint | binary and
pg_catalog | & | bit | bit | bit | bitwise and
Chapter 8: PostgreSQL Functions 197
test=>
Notice that there are several entries for the Boolean AND operator, based on the input
value data type. It is crucial that you use the proper data types when using operators.
Each operator specifies a set of data types that can be used for the input, and the data
type of the resulting output. You must follow these rules. You cannot mix and match data
types, such as add a character string to an integer data type.
There are three basic groups of data type operators:
X Numeric values
R String values
W NULL values
The following sections describe how to use operators within each of these data
types.
test=>
One thing you should notice in these simple examples is that if you are not using data
from a table, you do not need a FROM clause when working with operators and functions.
The result of the function appears within the result set, just as with a SELECT statement
that queries tables.
198 PostgreSQL 8 for Windows
While these are cute examples, in real life they are not too practical. A common use
for operators is within the UPDATE command to update values in the table. For example,
if you need to increase the price of all the products in the store by 3 percent, you would
do the following:
test=> update store."Product" set "UnitPrice" = "UnitPrice" * 1.03;
UPDATE 2
test=> select * from store."Product";
ProductID | ProductName | Model | Manufacturer | UnitPrice | Inventory
———————————+—————————————+———————————+——————————————+———————————+——————————
DES001 | Desktop | Standard | Acme | $309.00 | 25
LAP001 | Laptop | Takealong | Acme | $515.00 | 10
(2 rows)
test=>
Since no WHERE clause was specified, the expression within the UPDATE command
changed the UnitPrice data values in all records in the table. Notice that you can use col-
umn names within the expressions as well as constant values. This provides for a wealth
of possibilities for manipulating data within the tables using SQL commands, without
having to write any separate code in your application!
Besides the standard numerical operators, PostgreSQL also provides a few operators
that you might not be familiar with. A complete list of numerical operators supported in
PostgreSQL is shown in Table 8-1.
The concatenation operator (||) allows you to splice multiple string values together
into a single string value:
test=>
Chapter 8: PostgreSQL Functions 199
The concatenation operator merged the two column string values along with a con-
stant string value to produce the desired output in the result set. Again, it is important
to notice that the constant string value uses a single quote, and the column names use a
double quote. The concatenation operator is an extremely useful tool to have available
when creating reports from a query.
The most popular type of string operators available in PostgreSQL are comparison op-
erators. Comparison operators are used to (obviously) compare string values. This allows
you to do such things as query a table looking for a specific name or address.
There are three categories of string comparison operators available in PostgreSQL:
Each type uses a slightly different method for comparing strings, as discussed in the
following sections.
test=>
test=>
The regular expression used in the WHERE clause compares the FirstName column
value to the regular expression [RB]. This expression matches any string that contains
either an R or a B anywhere within the string.
Here is another example of using a regular expression on a string:
test=> select "LastName", "FirstName" from store."Customer"
test-> where "FirstName" ~ '[A-K]';
LastName | FirstName
——————————+———————————
Blum | Jessica
Blum | Barbara
Blum | Katie
(3 rows)
test=>
This regular expression checks for strings that contain any letter between capital A
and capital K.
The ANSI SQL LIKE Operator The LIKE operator is an ANSI SQL standard operator that
compares string values using a limited version of a regular expression. The LIKE regular
expression allows character matching, but uses only two special symbols within the
expression:
Thus the expression 'c%' matches any string that begins with the letter c, the expres-
sion 'c_t' matches three-character strings that start with the character c and end with
the letter t (such as cat, cot, or cut), and the expression '%tion' matches any string that
ends with the characters tion.
Here is an example of using the LIKE operator:
test=>
The LIKE parameter uses the expression '_a%' to filter out records where the
FirstName value has a lowercase a in the second character position (anything can be in
the first position), followed be zero or more characters.
Chapter 8: PostgreSQL Functions 203
You can use the NOT LIKE operator to filter strings that do not match the indicated
string pattern. PostgreSQL also supports the ILIKE operator to compare strings without
case sensitivity.
The ANSI SQL SIMILAR TO Operator The SIMILAR TO operator is a relatively new addition
to the ANSI SQL standard. It uses the underscore and percent symbols used in the LIKE
operator, along with a few symbols found in the Unix-style regular expressions discussed
earlier. The operators allowed in the SIMILAR TO operator are listed and described in
Table 8-4.
As an example, if you want to display the records where the FirstName column value
starts with either a J or a K, you would use the following expression:
test=> select "LastName", "FirstName" from store."Customer"
test-> where "FirstName" similar to '(J|K)%';
LastName | FirstName
——————————+——————————
Blum | Jessica
Blum | Katie
(2 rows)
test=>
The parentheses in the regular expression group the expression to represent one
character. The percent sign at the end of the regular expression matches zero or more
characters in the string.
Operator Description
_ Match any one character at that position
% Match zero or more characters
* Match repetitions of the previous character zero or more times
+ Match repetitions of the previous character one or more times
[abc] Match one or more characters listed between the brackets
() Group items into a single logical item
Testing if a column data value is NULL can be tricky. You cannot use standard math-
ematical operators to check if a value is NULL, as you cannot compare two unknown
values. Another thing to remember is that the NULL value is not the same thing as either
an empty string or the numeric value of zero. This confuses and traps many a novice
database programmer.
To add insult to injury, there is some controversy between the various database en-
gine vendors as to exactly how to handle NULL values (see Chapter 12). This can often
make porting a database application from one vendor’s database engine to another’s
extremely complicated.
The ANSI SQL standard provides the IS NULL and IS NOT NULL operators to help
you determine when a column value is NULL. PostgreSQL fully supports the IS NULL op-
erator. The IS NULL operator is the only way to determine if a column value is empty.
Here is an example of using the IS NULL operator to check a column value. First,
you must create a record that contains a NULL value for a column value. This can be
done by inserting a new record and not assigning a value to one or more data columns
(remember, some of the columns were created so that they cannot be NULL):
test=> insert into store."Customer" ("CustomerID", "LastName", "Phone")
test-> values ('MUL001', 'Mullen', '555-4829');
INSERT 0 1
test=>
Now you can perform a query on the table, looking for records that contain a NULL
value for one of the missing data columns:
test=> select * from store."Customer" where "FirstName" IS NULL;
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+———————————+—————————+——————+———————+—————+————————
MUL001 | Mullen | | | | | | 555-4829
(1 row)
test=>
The preceding SQL command queries the table, looking for records that do not have
a value in the FirstName column. Sure enough, the new record appears in the result set.
To see how easy it is to confuse NULL values with empty strings, we can create a re-
cord containing an empty string value, then try to use the IS NULL operator to find it:
test=> update store."Customer" set "FirstName" = ''
test-> where "CustomerID" = 'MUL001';
UPDATE 1
test=> select * from store."Customer" where "FirstName" IS NULL;
CustomerID | LastName | FirstName | Address | City | State | Zip | Phone
————————————+——————————+———————————+—————————+——————+———————+—————+——————
(0 rows)
test=>
Chapter 8: PostgreSQL Functions 205
Since the FirstName column has a value now, even though it is an empty string, the
IS NULL operator does not filter the record out. You can also set a column value to NULL
by using the NULL keyword:
test=> update store."Customer" set "FirstName" = NULL
test-> where "CustomerID" = 'MUL001';
UPDATE 1
test=> select * from store."Customer" where "FirstName" IS NULL;
CustomerID | LastName |FirstName| Address | City | State | Zip | Phone
————————————+——————————+—————————+—————————+——————+———————+—————+—————————
MUL001 | Mullen | | | | | | 555-4829
(1 row)
test=>
Setting the FirstName column value to NULL has the same effect as creating the new
record with a column value.
Built-in Functions
Besides all of the standard operators, the PostgreSQL developers also included many
useful functions for use with various data types. As with operators, functions can be
used in any expression value within the SQL command.
To get an idea of the number of functions available in PostgreSQL, try the following
\df meta-command from psql. The output goes on for quite a long time. Just as with
the operators listing, you can specify a specific function to view information on only that
function. Also just as with operators, notice that within the functions available, there are
often several versions of the same function, depending on the input data types.
test=>
The area() function has three different versions, depending on the data type of the
input data. Even though the three formats use different input data types, they all use the
same data type for the result.
Just as with operators, it is imperative that you use the correct input data types with
the correct functions. There is not much sense in trying to calculate the square root of
206 PostgreSQL 8 for Windows
a string. The PostgreSQL functions can be divided into groups based on the data types
they handle:
X String functions
R Date and time functions
R Math functions
R Data conversion functions
W Aggregate functions
The following sections describe each of these groups of functions and show a few
basic examples of how to use the different types of functions.
STRING FUNCTIONS
Along with the standard string comparison operators discussed earlier, PostgreSQL pro-
vides a wealth of functions that can be used to manipulate string values. The ANSI SQL
standard specifies some standard string functions that all database vendors must imple-
ment to be ANSI standard. Table 8-5 lists and describes the ANSI SQL string functions.
The upper() and lower() functions are commonly used to convert strings to all
one case:
test=> select upper("LastName"), lower("FirstName") from store."Customer";
upper | lower
————————+————————
BLUM | jessica
BLUM | rich
BLUM | barbara
BLUM | katie
MULLEN |
(5 rows)
test=>
Notice that by default the column names in the result set change to the function name
used (upper and lower), and not the table column name of the data used in the function
(LastName and FirstName).
Besides the standard ANSI SQL string functions, PostgreSQL provides some of its
own string functions, listed and described in Table 8-6.
Using string functions within SQL commands is fairly simple. Just provide the re-
quired number of parameters for the function. You should also be careful of what the
function returns. Some functions return the result as a string, and some return the result
as an integer value. Remember that an expression within a WHERE clause must compute
Chapter 8: PostgreSQL Functions 207
to a Boolean TRUE/FALSE result, so you cannot use string functions by themselves with-
in a WHERE clause. You can however use them within the expression:
test=>
208 PostgreSQL 8 for Windows
test=>
210 PostgreSQL 8 for Windows
Besides these functions, there is also the now() and timeofday() functions:
test=>
Chapter 8: PostgreSQL Functions 211
The date_part() and extract() functions allow you to extract individual date
or time values from larger timestamp values. This is handy when trying to combine or
compare items by date, such as looking for all items purchased in September. These
functions use common words such as year, day, month, hour, minute, second, and
millisecond, to extract the desired value:
test=> select date_part('month', current_timestamp);
date_part
——————————
8
(1 row)
test=>
MATH FUNCTIONS
It is not surprising that PostgreSQL includes lots of math functions. There are math func-
tions for just about every common math utility you could require in your application.
Table 8-8 lists and describes the various math functions supported by PostgreSQL.
All of the math functions return numeric values. If you need to use a math function
within a WHERE clause, remember to place it in an expression that evaluates to TRUE or
FALSE.
The random() function is often used to produce random numbers for applications.
The random() function only returns a random number between 0 and 1. It is your job
to extrapolate that value to the range you require for your application. If you need to
produce a random integer value between 10 and 50, you would use
test=>
By multiplying the random number times 40, you are producing a random floating-
point number between 0 and 40. The round() function is used to round the result to
the nearest integer value. Adding 10 makes the range go from 10 (if the random number
was 0) through 50 (if the random number was 1).
AGGREGATE FUNCTIONS
The functions covered so far in this chapter work on a single value within a record. Often
there are times when you need to extract information based on the data contained in all
of the records in a table. An aggregate function is a function that PostgreSQL performs on
all values contained within a specified column for all records.
212 PostgreSQL 8 for Windows
These functions are helpful when you need to perform a statistical analysis of data
contained within a table. Instead of having to read every record in the table and perform
calculations within your application, the PostgreSQL aggregate functions can produce
the statistical results automatically for you (and usually much quicker).
The aggregate functions supported in PostgreSQL are listed and described in Table 8-9.
These functions are used within the SELECT list portion of the SELECT SQL command
and are displayed in the result set as a separate column, similar to the other functions:
test=> select * from store."Order";
CustomerID | ProductID | Quantity | TotalCost | OrderID
————————————+———————————+——————————+———————————+————————
BLU001 | LAP001 | 10 | $5,000.00 | ORD001
BLU002 | LAP001 | 2 | $1,000.00 | ORD002
BLU003 | DES001 | 1 | $300.00 | ORD003
BLU001 | LAP001 | 2 | $1,000.00 | ORD004
(4 rows)
test=>
The sum() aggregate function automatically added the TotalCost column values
in one easy command. For more advanced calculations, you can use the GROUP BY
clause to sum individual groups of data values found in a column:
test=> select "ProductID", sum("Quantity")
test-> from store."Order" group by "ProductID";
ProductID | sum
———————————+————
LAP001 | 14
DES001 | 1
(2 rows)
test=>
This produces two records, one for each type of ProductID value that appears in the
table. If you are working with aggregate functions on column data, the GROUP BY clause
is almost always needed to produce the correct aggregate values. Be careful that you use
the correct column to group the data by, or you may not get the results you intend.
SUMMARY
This chapter discussed the functions that PostgreSQL provides for programmers to
help make their lives easier. PostgreSQL provides lots of common functions for all data
types for you to use in your SQL programming. The simplest type of functions are called
operators. Operators allow you to easily add, subtract, and multiply numeric values
within your SQL commands. There are comparison operators provided that allow you
to compare complex string values. PostgreSQL provides three methods for comparing
string values. The Unix-style regular expressions use wildcard symbols to match one or
more characters within a string value. The ANSI SQL LIKE operator is used for compar-
ing a string value to a basic simplified regular expression. The ANSI SQL SIMILAR TO
operator is a combination of the simplicity of the LIKE operator with some of the ad-
vanced features of the Unix-style regular expression.
Finally, the chapter discussed the math, string, date and time, and aggregate func-
tions. Date and time functions allow you to easily manipulate and compare date and
time values in your SQL commands. The aggregate functions allow you to perform sta-
tistical analysis, such as sums, standard deviations, and averages, on all or a subset of
records in a table.
The next chapter discusses how to use stored procedures and triggers in your tables.
Both stored procedures and triggers allow you to create complex table environments that
can be utilized in your database applications.
9
Stored Procedures
and Triggers
215
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
216 PostgreSQL 8 for Windows
C
hapter 8 showed how to use the built-in functions provided by PostgreSQL in SQL
commands, making the life of database programmers much easier. This chapter
goes a step further in the process by showing how to create your own functions.
PostgreSQL supports a number of different programming languages that can be used to
create functions. This chapter concentrates on using the PL/pgSQL procedural language.
After walking through how to use the PL/pgSQL language, the chapter demonstrates
how to use it to create functions that are unique to your specific applications and available
for all of the application users. The chapter discusses how to create and use both stored
procedure functions, which allow you to create functions for commonly used actions
within your own specific applications, and trigger functions, special functions that are
performed when a predefined event occurs in a table.
X PL/pgSQL
R PL/Tcl
R PL/Perl
R PL/Python
W PL/Java
languages are also similar. Very little effort is required to port functions created in
PL/SQL to PL/pgSQL, and visa versa.
The function examples in this chapter all use the PL/pgSQL procedural language. If
you want to run these examples within your PostgreSQL database, you must have opted
to install the PL/pgSQL procedural language during the PostgreSQL install. If you did
not, you can still install it using the normal PostgreSQL installation, this time just select
the PL/pgSQL procedural language to install (just remember to uncheck the PostgreSQL
database installation, or you will overwrite your existing database structure).
TYPES OF FUNCTIONS
There are two types of functions that can be created using procedural languages:
X Stored procedures
W Triggers
Stored procedures are probably the thing you would normally think of when thinking
of creating a new function. They behave exactly as do the built-in PostgreSQL functions
within common SQL command expressions. They appear in the \df meta-command list-
ing of functions, and can be used by other database users in their SQL commands to eas-
ily manipulate data within the database tables.
The benefit of creating stored procedures is that instead of each individual user hav-
ing to create a lot of SQL commands to perform an action on table data, you can create
a specialized stored procedure that performs multiple SQL commands as one function.
Each user can then execute the stored procedure as a single function within a SQL com-
mand. When the function executes, the entire block of SQL commands contained in it is
processed by the database engine.
Triggers are functions that are executed directly by the database engine, based on an
event that occurs in a table. PostgreSQL monitors each action performed on tables (such
as inserts, updates, and deletes). If an action is performed on a table that has a trigger
linked to it, the database engine automatically performs the trigger function assigned to
the trigger. Trigger functions are often used to update related data in tables automatically
when a table value is inserted or updated in a single table. This allows you to maintain
data relations automatically, without having to worry about that in your application
code.
Besides SQL commands, the PL/pgSQL language also uses standard programming
statements to define and evaluate variables, accept input values for the function, supply
output values from the function, and use control logic, such as conditional loops and
IF/THEN statements, to control execution flow within the program. The PL/pgSQL lan-
guage is an interpreted programming language. This means that it is not compiled into a
binary form, but rather the database engine executes the text program lines one by one
as the function is executed. That said, there is a catch to the way PostgreSQL handles
PL/pgSQL code, and you should be aware of it.
The first time a PL/pgSQL program is run in the database engine, an execution plan
is prepared (see Chapter 10). The execution plan is the roadmap PostgreSQL uses to
interact with tables, indexes, and result sets. Once the execution plan is created for a
function, PostgreSQL keeps the plan and uses it every time the function is called (during
the lifetime of the PostgreSQL server session). This feature increases performance, as the
PL/pgSQL code does not have to be interpreted each time the function is run.
The downside to this is that you must be careful when making dynamic changes to
functions. Normally PostgreSQL detects if a function’s code has changed, and creates a
new execution plan when necessary. What it does not know, though, is whether a func-
tion calls another function, and whether that function’s code has been changed. This
can often cause problems when working with stored procedures, as the old execution
plan for the called function will be used. It is up to you to know when this situation
may occur. The only way to solve the problem is to re-create the original function (even
though it has not changed) so that PostgreSQL knows to create a new execution plan, or
manually stop and restart the database server session (see the discussion of the pg_ctl
program in Chapter 3).
The following sections describe the PL/pgSQL procedural language and show how
to use it to create your own functions for your applications.
DECLARE
<variable declarations>
BEGIN
<code section>
END;
$$ LANGUAGE languagename
[ IMMUTABLE | STABLE | VOLATILE
| CALLED ON NULL INPUT| RETURNS NULL ON NULL INPUT | STRICT
| SECURITY INVOKER | SECURITY DEFINER ]
The languagename parameter defines the procedural language used to create the func-
tion. For functions created using PL/pgSQL, this value is plpgsql. The first line of the
function declares the functionname parameter, which defines the unique name used to
identify the function. By default, the function will be created in the public schema. If
you need to create the function for a different schema, you must specify a schema name
along with the function name.
You can define for the function both input and output arguments that are used to
pass input and output variables between the function and the SQL code that uses the
function. There are three components used to define a function argument:
The argmode value defines the direction of the argument, IN for input variables, OUT
for variables returned by the function, or INOUT for variables that can be used for both.
The argname value defines the name of the variable. In PL/pgSQL this variable name can
be used within the program code of the function. The argtype value defines the data type
of the argument. You may define as many input and output variables as you need (each
separated by a comma).
For compatibility with previous versions of PostgreSQL, PostgreSQL 8 also supports
defining a single output variable using the RETURNS keyword in the command. The
returntype parameter defines only the data type of the returned output variable. The last
line of code in the function must be a RETURN statement. This statement specifies which
variable is used to return a value to the calling program.
The AS section defines the program code used in the function. Since PL/pgSQL is an
interpreted language, all of the lines of program code must be in text format. The double
dollar signs ($$) signify the start and end of the function code text.
PostgreSQL interprets the function code as a single literal text string. If you remem-
ber from Chapter 6, literal text strings are normally enclosed with single quotes, and
that is certainly permissible in this case as well. However, when creating a function, this
could be a problem, as you often need to use single quotes to define other literal strings
used within the function.
Working with literal strings embedded within literal strings can quickly become con-
fusing, as quotes for embedded literal strings must be identified with multiple single
quotes. PostgreSQL has provided the double dollar sign since version 7.3 as an alterna-
tive method for signifying literal strings. This allows the programmer to use the double
220 PostgreSQL 8 for Windows
dollar sign to delineate the function code literal string, then use single quotes within the
BEGIN code section to delineate other literal strings.
The program code is divided into two sections. The DECLARE section of the code is
used to declare variables used within the function. Each variable declaration statement
uses the following format:
variablename datatype [ := value];
The variablename identifies the variable and can be used anywhere within the func-
tion. The datatype identifies the data type of the variable. Any of the available data types
used in table columns can be used here. Optionally, you can also assign a default value to
the variable. Notice that the symbol for assigning a value is := and not just an equal sign.
Be careful to note that in PL/pgSQL equal signs are preceded by a colon. The equal sign
is used for a comparison, and not for an assignment. Also make sure that each statement
in the DECLARE section ends with a semicolon, another syntax requirement.
The BEGIN and END keywords are used to delineate the code section of the function.
This is where you enter the SQL commands and programming statements to manipulate
table data and variables. If the function uses the RETURN parameter, this statement must
be the last statement before the END keyword, indicating the variable to return.
There are two methods for returning values from functions. The old-fashioned
RETURN statement can be used within the function, or you can specify an output argu-
ment using the OUT keyword. Do not let this confuse you; both methods perform the
same action, to return an answer from the function. As discussed later in this chapter,
the pgAdmin III program makes a distinction between functions that use the RETURN
keyword and those that use the OUT keyword (calling functions that use the RETURN
keyword “functions” and functions that use the OUT keyword “procedures”). In reality,
there is no difference between the two methods.
After the LANGUAGE keyword, there are several parameters that can be used to further
define the behavior of the function. These parameters are listed and described in Table 9-1.
The easiest way to get the hang of using stored procedure functions is to watch one in ac-
tion. Here is a simple example of defining and using a stored procedure function in psql:
test=# create function store.test(IN val1 int4, IN val2 int4, out result int4)
test-# AS $$
test$# DECLARE
test$# multiplier int4 := 3;
test$# BEGIN
test$# result := val1 * multiplier + val2;
test$# END;
test$# $$ language plpgsql;
CREATE FUNCTION
test=# select store.test(10, 5);
test
-----
35
(1 row)
test=#
Chapter 9: Stored Procedures and Triggers 221
This stored procedure function defines two input arguments of int4 data type, and
one output argument, also of int4 data type. The DECLARE section also declares an ad-
ditional variable, multiplier, and assigns it a default value of 3. The actual code sec-
tion performs a simple assignment, assigning a simple mathematical equation using the
variables to the variable result. Since this variable was identified as the OUT argument,
this is the value the function returns. Remember, you could just as easily use the RETURN
keyword to return this value.
After creating a stored procedure function, you most likely want to share it with
other database users. You must use the GRANT SQL command and grant the EXECUTE
privilege to Group or Login Roles before other users can use the new stored procedure
function.
Notice that in this example I entered each line of the program one at a time directly
at the psql command prompt. Any typing mistakes would have ruined the entire func-
tion, and I would have had to start entering the function from the start. That is not a good
thing if you are working with a 100-line function (unless you are an impeccable typist).
222 PostgreSQL 8 for Windows
There are two recommended methods for creating complex functions. The first method
is to create a text file that contains the complete CREATE FUNCTION command. If you are
in development mode and need to experiment with various program code, you often have
to constantly rewrite a function. It is advised to use the CREATE OR REPLACE FUNCTION
format when creating your function code in the text file, so any code updates will replace
the function without you having to manually delete the function. Also, remember you
must end the CREATE FUNCTION command with a semicolon, or psql will not process it.
After the text file is created, you can load it into the database using the psql -f option:
C:\Program Files\PostgreSQL\8.2\bin>psql -f test.sql test postgres
Password for user postgres:
CREATE FUNCTION
C:\Program Files\PostgreSQL\8.2\bin>
The function code was loaded into the database engine and the new function was
created.
The other recommended method for creating a function is to use a graphical de-
velopment environment that allows you to alter program code within an editing win-
dow. It just so happens that our friend pgAdmin III contains a graphical environment
for creating functions. The next section describes how to use this to create and modify
functions.
Notice that pgAdmin III makes a distinction between functions and stored proce-
dures. This is a new feature within pgAdmin III. It considers functions that were cre-
ated using the RETURNS keyword as functions, and functions created using the OUT pa-
rameter as stored procedures. This is somewhat important to know when looking for a
function that was manually created in psql. Remember, in reality, both functions and
procedures are the same thing, just with different methods of returning values to the
calling program.
Chapter 9: Stored Procedures and Triggers 223
If you created the example function in the previous section, it should appear in the
Procedures section of the store schema, since it uses the OUT parameter to return a
value. Right-clicking the test procedure name and then selecting Properties produces
the Procedure window, shown in Figure 9-1.
The Properties tab shows all of the CREATE FUNCTION options that were used to
create the function. Notice that check boxes are used to define the STRICT and SECU-
RITY DEFINER parameters. Clicking the Parameters tab shows the parameters defined
for the function, as well as the function code used to define the function, as shown in
Figure 9-2.
The function code appears in a windowed textbox. You can add new lines, remove
existing lines, and even move lines around using standard Windows cut-and-paste tech-
niques. This makes the job of modifying functions almost easy.
You can create a new procedure or function by right-clicking the appropriate object
type in the main window and selecting New Procedure or New Function from the menu.
The New Function or New Procedure window appears, which is similar to the Procedure
window shown in Figure 9-1, except all of the textboxes are empty.
You must fill in the appropriate values for your function in the textboxes. If you are
creating a function instead of a stored procedure, you must set the data type of the return
variable in the Return Type drop-down box. For a PL/pgSQL function, you must also
change the Language entry to plpgsql.
If the function will return the result from a query, you must check the Result Set check
box. If you want the function to return a NULL value if any of the input values are NULL,
check the Strict check box. By default, the function is run using the privileges of the user
executing the function. Checking the Security of Definer check box forces the function to al-
ways run using the privileges of the function owner, no matter who executes the function.
The Parameter tab allows you to define the input arguments for the function. If you
are creating a new stored procedure, you must also select whether the argument is IN,
OUT, or INOUT. The Definition textbox is where the program code goes. You can use stan-
dard Windows cut-and-paste functions to move code lines around the textbox.
Chapter 9: Stored Procedures and Triggers 225
The Privileges tab allows you to define what Group Roles are allowed to execute the
function. By default, only the function owner is allowed to use it. If you want everyone to be
able to use the function, you must grant execute privileges for the public Group Role.
variable := expression;
The expression can be any normal mathematical expression that uses constant values
as well as other variables. The result of the expression is assigned to variable.
Since functions often work with table data, PL/pgSQL provides a method for as-
signing table column data values to variables. The SELECT INTO statement is used to
perform a typical SELECT command against a table, assigning the output to a variable.
The format of the SELECT INTO statement is
One or more variables can be used to receive column data from the SELECT INTO
statement. Each variable is matched against a column from the column list. The variables
used in the SELECT INTO statement must have the same data types as the columns
listed. The clause parameter defines a normal SELECT clause used to filter data into a
result set.
You can create a single variable that holds all of the data columns of a table. The vari-
able must be declared using the table’s %ROWTYPE table attribute:
customervar store."Customer"%ROWTYPE
SELECT INTO customervar * FROM store."Customer" where "CustomerID" = 'BLU001';
This statement assigns the contents of the result set record to the customervar vari-
able. Each individual column data value can then be retrieved as an individual element
of the variable:
Of course, this method assumes that only one record of data will be returned in the
result set. Later, this chapter discusses the FOR loop, showing how to deal with multiple
records returned in the result set.
226 PostgreSQL 8 for Windows
Condition Statements
Just like any other language, PL/pgSQL supports the use of IF/THEN/ELSE statements
to allow program code branching based on conditions:
IF condition THEN
<statements1>
ELSE
<statements2>
END IF;
Notice that the IF and ELSE statements do not contain a semicolon. The condition can
be any expression that evaluates to a Boolean result. You can also combine additional IF
statements after the ELSE statement to continue the comparisons:
IF condition1 THEN
<statements1>
ELSE IF condition2 THEN
<statements2>
ELSE
<statements3>
END IF;
This series can continue to check each set of conditions. If none of the conditions
evaluates to a TRUE value, the ELSE statements are performed.
Loop Statements
PL/pgSQL provides three ways to loop through data:
LOOP
<statements>
EXIT [ WHEN expression];
END LOOP;
The statements inside the LOOP structure are executed continuously. To exit this
loop, you must execute an EXIT statement. The EXIT statement may also include a WHEN
clause. The loop is followed until the WHEN clause expression returns a TRUE value.
Chapter 9: Stored Procedures and Triggers 227
The WHILE loop works the opposite way. The WHILE loop continues looping until an
expression is FALSE:
The FOR loop allows you to iterate through a set range of values:
The variable value must be defined with the %ROWTYPE attribute of the table used in
the select_clause. An example of this would be
customervar store."Customer"%ROWTYPE;
FOR customervar IN SELECT * FROM store."Customer" LOOP
<statements>
END LOOP;
In this example, the customervar variable would contain the values of each record
for each FOR loop iteration. This is an extremely handy tool to have for evaluating data
when a result set returns multiple records.
TRIGGERS
A trigger defines an operation that is performed when a specific event occurs on a table.
A trigger can be defined for when a user inserts a new record, updates an existing record,
or deletes a record. The trigger is configured to execute a specific function when the
event occurs. The function executed as a result of a trigger is called a trigger function. This
section describes what trigger functions are and demonstrates how to use them.
228 PostgreSQL 8 for Windows
X Trigger functions do not use input arguments in the function, but rather are
passed arguments from a trigger event.
W Trigger functions have access to special variables from the database engine.
Since trigger functions are not executed by SQL commands, you are not able to send
input arguments to the function through the function definition. Instead, arguments are
passed from the trigger to the trigger function using special variables. Similarly, since the
trigger function does not send output to the calling program, there are no output data
type arguments to define. The trigger returns either a NULL value or a record having the
same columns as the table the trigger was defined for.
When a trigger function is called, the database engine passes a group of special vari-
ables to the trigger function to define the environment the trigger function is working in.
The special variables can be used within the trigger function to determine environment
features, such as how the function was called, what data is present when the trigger was
fired, and when the trigger was fired. Table 9-2 lists and describes the trigger special
variables available for use in a trigger function.
After the trigger function is created, you must define the actual trigger event that
executes the function. This is done using the CREATE TRIGGER command:
CREATE TRIGGER name { BEFORE | AFTER } {event [OR ... ] }
ON table [ FOR [ EACH ] { ROW | STATEMENT }]
EXECUTE PROCEDURE function (arguments)
The trigger can be defined to occur either before or after the event occurs. The event can
be an INSERT, UPDATE, or DELETE SQL command on the table the trigger is defined for.
There are two methods for firing triggers. You can select to fire the trigger once for
each row that is affected by the event (such as multiple rows in an UPDATE or DELETE
command) by specifying the ROW parameter. Alternatively, you can select to fire the trig-
ger only once for each statement that triggers the event, no matter how many rows are
returned (even if no rows are returned) by specifying the STATEMENT parameter.
Finally, the function to execute when the trigger is fired is declared. This defines the
trigger function created using the CREATE FUNCTION SQL command. The arguments
listed in the function parameter are not passed to the function within the normal func-
tion definition (remember, trigger functions do not define input arguments). Instead, the
arguments in the CREATE TRIGGER command are passed using the TG_ARGV special
variable.
Chapter 9: Stored Procedures and Triggers 229
In the Trigger Function object within the store schema in the test database created
in Chapter 4, create a new trigger function called update_inventory. Set the Owner
value to postgres, set the Language to plpgsql, keep the Volatility set to VOLATILE,
and check the Security of Definer check box (this allows anyone with access to the Order
table to run the trigger, which will modify the Product table). On the Parameter tab, type
the following function code:
DECLARE
onhand int4;
BEGIN
SELECT into onhand "Inventory" from store."Product" where
"Product"."ProductID" = NEW."ProductID";
IF TG_OP = 'INSERT' THEN
update store."Product" set "Inventory" = onhand - NEW."Quantity"
where "ProductID" = NEW."ProductID";
END IF;
END;
Chapter 9: Stored Procedures and Triggers 231
When a new record is inserted into the Order table, the NEW special variable will be
set by the database engine to the data values of the inserted record. The individual col-
umn values are referenced as if the NEW keyword were the table name.
This function code uses the SELECT INTO command to extract the current inventory
value from the Product table that matches the same ProductID value as the new record’s
value, and places it into the onhand variable.
Next, the TG_OP special variable is checked to ensure that the event was indeed an
INSERT (this is a common practice, so you can use the same trigger function for mul-
tiple trigger events). If the event is an INSERT event, the UPDATE command is executed.
Again, the NEW special variable is used to retrieve the Quantity column data value in-
serted into the Order table, and to ensure the proper Product record is updated.
Now that the trigger function is created, you must define a trigger event for the Or-
der table that automatically executes the trigger function whenever someone inserts a
new record.
Under the Order table object is the Triggers object. Right-click Triggers and select
New Trigger. The New Trigger window appears, shown in Figure 9-4.
The textboxes and check boxes represent the parameters available for the CREATE
TRIGGER function. After entering the trigger name, you must check the Row Trigger
check box so that the trigger is fired for each record inserted into the Order table. Next,
select the store.update_inventory trigger function from the drop-down box. Since
we do not want to update the product inventory until after we are sure the order has
been entered, click the AFTER radio button, and then check the INSERT check box. If the
INSERT command for the Order table fails, the trigger will not fire. That is all there is.
Click OK to create the trigger.
This command inserts a record showing there are ten laptops in inventory. Once you
know you have valid data, try entering a new record in the Order table (remember, you
must be logged in as someone in the Salesman Group Role, such as the barney user):
test=> insert into store."Order" values ('ORD001', 'BLU001', 'LAP001',
test(> 2, '2,000');
INSERT 0 1
test=> select * from store."Product";
ProductID | ProductName | Model | Manufacturer | UnitPrice | Inventory
———————————+—————————————+————————————+——————————————+———————————+——————————
LAP001 | Laptop | Takealong | Acme | $1,000.00 | 8
(1 row)
test=>
After the order for two laptops was added to the Order table, the inventory value of
the product in the Product table automatically decreased by the same amount. The trig-
ger function worked as expected!
SUMMARY
This chapter discussed creating your own functions for your database. PostgreSQL sup-
ports several procedural languages for creating functions. PL/pgSQL is the default pro-
cedural language provided by PostgreSQL. It allows you to create function code using
standard SQL commands as well as normal programming statements.
Chapter 9: Stored Procedures and Triggers 233
There are two types of functions: stored procedures and trigger functions. Stored
procedures are used just like the PostgreSQL built-in functions within SQL commands.
You can create a stored procedure in a schema and make it available for any database
user to use. Trigger functions are special functions that manipulate table data based on a
table event. A trigger is defined for a specific table event, such as inserting a new record,
updating an existing record, or deleting a record. When the trigger event occurs, the da-
tabase engine automatically executes the trigger function.
The next chapter discusses the security and performance issues that often occur in
PostgreSQL systems. It is important that you exercise proper security procedures to en-
sure the safety of your application data. It is also important that database customers have
the quickest response time as possible for the application.
This page intentionally left blank
10
Security
235
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
236 PostgreSQL 8 for Windows
S
ecurity is an important, but often misunderstood, concept for database
administrators. These days database security is not an option, it is a requirement,
especially if your PostgreSQL database is running on a system attached to a network.
Knowing how to allow your customers into the database while blocking the bad guys is
something that all database administrators must know how to do. This chapter shows
the basic security pieces required for putting your PostgreSQL database on a network for
multiple customers to access.
Back in the early days of network computing, as a system administrator, you could
put a database system onto a network and safely assume that only your customers would
connect to it. In today’s environment, though, every database server connected to the
network becomes a target for unscrupulous individuals, both from outside the network
and, unfortunately, sometimes even from inside your own network.
This section discusses three methods that can be used in multi-user database systems
to help control the security environment of the database server:
X Controlling network users
R Encrypting network sessions
W Monitoring database users
Each method provides a different level of security for the database, and each is in-
valuable in your arsenal in protecting your application data.
The server firewall is the front door of the system on the network. It can block at-
tempts to access individual services on the server before the packets even get passed to
the server applications. This is the first line of defense in protecting your PostgreSQL
databases from attack and intrusion.
There are many different network firewall products available on the market today,
both free and commercial. Microsoft Windows XP Home and Professional workstations
with Service Pack 2 installed, as well as Microsoft 2003 servers, include a basic firewall
application built into the operating system.
When the Windows Firewall application is enabled, it blocks all access to the Post-
greSQL server service by default. If you want network clients to be able to access your
PostgreSQL server, you must manually configure the firewall application to allow access
to the PostgreSQL server TCP port.
To start the Windows Firewall application on a Windows XP workstation, choose
Start | Control Panel, click Network and Internet Connections, and then click Windows
Firewall. On a Windows 2003 server, choose Start | Control Panel and click the Windows
Firewall icon.
When the Windows Firewall application is started, the main window appears, shown
in Figure 10-1.
To block all unauthorized access, enable the firewall. Once the firewall is enabled, all
but the basic network TCP ports are blocked. You must configure an exception to allow
network traffic onto the server. Enable exceptions by unchecking the Don’t Allow Excep-
tions check box.
When exceptions are allowed, you can then click the Exceptions tab to see the pro-
grams and services that are allowed access through the firewall to the system.
To enable remote clients access to your PostgreSQL server, you must add a new port
exception to the firewall for PostgreSQL. Click the Add Port button to open the Add a
Port window, shown in Figure 10-2.
Select a name to call the new port (such as “PostgreSQL”), then enter the PostgreSQL
TCP port number that is configured in the postgresql.conf configuration file (dis-
cussed in Chapter 3). The default TCP port for PostgreSQL is 5432. After entering the
port number, ensure that the TCP radio button is selected, to indicate that this is a TCP
port.
Next you must define which remote network clients will be allowed through the fire-
wall to this TCP port. Click the Change Scope button to open the Change Scope window,
shown in Figure 10-3.
The Change Scope window offers you the following three options for defining the
network clients that are allowed access to the TCP port. Which you should choose de-
pends on your network requirements.
X Any Computer (Including Those on the Internet) Select this option if you
want any computer from anywhere to be able to access your PostgreSQL server.
R My Network (Subnet) Only If the customers accessing your PostgreSQL server
are all located on the same subnetwork as the PostgreSQL server, you can select
this option. If your company network is split into multiple subnets, you cannot
use this option, because it allows only one subnet access to the service.
W Custom List Most likely, you do not want everyone to be able to access the
server. In this scenario, you must add your customer client address ranges in-
dividually by using this option. In the host list, you can add either individual
IP addresses, such as 192.168.10.21, or an IP address/subnet mask combination,
such as 192.168.10.0/255.255.255.0, which allows all hosts on the 192.168.10.0
network. You may add multiple entries to the list, separated by commas. This
allows you to configure subnet ranges that are allowed access to the PostgreSQL
TCP port.
Once you have chosen the appropriate option for your network environment, click
the OK button to accept the new scope, then click the OK button on the Add a Port win-
dow. This enters the newly defined exception in the Exceptions list. Now any remote
customer whose network address is accepted by the scope will be allowed through the
firewall to your PostgreSQL server. This finishes the firewall configuration. Next you
must configure the PostgreSQL configuration files to enable customers to connect.
To enable remote network clients to connect to the PostgreSQL server, you must ei-
ther enter the IP addresses of the network interfaces to activate or enter the asterisk (*)
wildcard character to enable all network ports, as follows:
listen_addresses = '*'
After setting the listen addresses in the postgresql.conf file, the next step is to
use the pg_hba.conf file to define which clients can connect to which database, using
which Login Role.
As described in Chapter 3, the pg_hba.conf configuration file defines access rules
for allowing remote customers access to the PostgreSQL server. Each entry in the con-
figuration file defines an individual access rule. If a client meets any one of the rules, it is
allowed access to the specified resources on the server. If a client fails all of the rules, it is
prevented from accessing the databases on the server.
As shown in Chapter 3, the format of each rule entry is
The connection-type field defines the method used by the remote client to connect to
the server. For network hosts, there are three possible values:
X host Allows a remote client to connect using either a plain-text TCP connec-
tion or an SSL-encrypted TCP connection
R hostssl Requires that a remote client connect to the server using an SSL-
encrypted TCP connection
W hostnossl Requires that a remote client connect to the server using a plain-
text TCP connection
After the connection type, you must define the database the network client is allowed
to connect to, along with the user Login Role allowed to connect. You can either specify
individual database and Login Role names or use the all keyword to allow the net-
work address defined in the rule to connect to any database using any Login Role on the
server.
The network-address field specifies which network clients the rule applies to. It can
specify a single network client IP address, or an IP address and subnet mask combi-
nation to define an entire subnet of clients. When a client connects to the PostgreSQL
server, its network address is compared with the network-address field in the pg_hba
.conf file. Only rules that match that network address are applied to the client.
Finally, the login-method field specifies the method the remote client must use to au-
thenticate the Login Role name sent by the client to the PostgreSQL server. The most
common methods are password and md5. The password method passes a plain-text
password, which, while not recommended in plain-text TCP sessions, is just fine if used
Chapter 10: Security 241
in an SSL-encrypted TCP session. The md5 method allows for encrypting the password
before sending it to the server, which is recommended if using a plain-text TCP session.
Here are a few sample entries in the pg_hba.conf file:
This first example allows any client on the 10.0.1.0 subnetwork to connect to any
database using any Login Role. The client must connect using SSL encryption. In the
second example, the 192.168.120.5 network client can only log into the test database
using the Login Role barney (and using an encrypted password). The third example,
however, lets the barney Login Role from the same network client also log into the
test2 database as well. The final example allows any client from the 10.0.2.0 subnet to
log in using any Login Role to the test database. This allows multiple customers from
that network to connect to the test database.
After editing the configuration files, the PostgreSQL server must be either stopped
and started or reloaded for the changes to become active. You can do this either from the
pgAdmin III interface or from the pg_ctl command-line program.
test=>
Once the connection to the remote host is established, you can perform all of the nor-
mal psql meta-commands and SQL commands just as if you were connected to a local
PostgreSQL server.
242 PostgreSQL 8 for Windows
sent out on the network. The PostgreSQL server and the remote client pass an encryption
key that is used to encrypt the data. Theoretically, only those two devices should be able
to decrypt the data that is encrypted and sent out on the network. Figure 10-5 shows a
network trace of a SQL command contained in an SSL-encrypted TCP session.
The text SQL command has been encrypted and is unintelligible to the naked eye
while it is on the network. The PostgreSQL server is able to receive the encrypted packet,
decrypt the text, and process the SQL command just as normal. The process is then re-
versed when the PostgreSQL server sends the result set back to the client. While encrypt-
ing data is not 100 percent foolproof (the SSL encryption can be broken using a brute-
force technique, which could take a few days), it does provide a basic level of security
for clients connecting to the PostgreSQL server across public networks. By the time the
encryption key is cracked, the database connection is most likely no longer active, and
the key is useless.
The following sections describe the process of how to set up your Windows Post-
greSQL server to accept SSL-encrypted TCP sessions.
When PostgreSQL is restarted, it will recognize the configuration change and enable
SSL connections. The PostgreSQL server now listens for both normal TCP sessions and
secure SSL TCP sessions on the same TCP port (port 5432 by default). However, once
SSL is enabled, the PostgreSQL server will look to make sure that the encryption key
and certificate files are available in the PostgreSQL data directory (see Chapter 3), and
will not start unless it finds them. Next you must obtain the proper encryption key and
certificate files.
There are several companies that provide certificates for commercial use. These com-
panies have agreements with Microsoft to include their corporate certificates within the
Windows operating system. When you register to purchase a certificate from the com-
pany, it verifies that you are who you are, and then validates a certificate for you to use
for your encryption key. The certificate belongs to you, and only you are allowed to use
it for encrypting data. When a Windows workstation or server sees an SSL encryption
key signed by one of these commercial companies, it assumes the key is authentic and
accepts the encryption key for use in decrypting the TCP session.
There are several companies that provide certificate files for use with Windows. The
most popular is the VeriSign corporation. In a production database environment, using
a commercially assigned certificate is recommended.
Chapter 10: Security 245
Instead of purchasing a certificate from a well-known, trusted company, you can cre-
ate your own certificate, validating that the encryption key came from you. This is called
a self-signed certificate. This certificate is not automatically validated by the Windows sys-
tem. It is up to your clients to determine whether they can trust your certificate, and ac-
cept any connection encrypted using the encryption key provided with the certificate.
This method is not recommended for production environments where you expect
external customers to connect to your database, as it requires your customers to accept
your certificate on faith. However, if you are just testing your PostgreSQL server, or pro-
viding a database that will only be used by a few internal customers, it works just fine.
By self-signing a certificate, the server validates that the encryption key was sent by the
specific host that generated it. If the encryption key is received from any other host on
the network, it is assumed to be invalid.
Once you obtain a valid encryption key and certificate pair, you must place them in
the PostgreSQL data directory. The encryption key file must be named server.key,
and the certificate file must be named server.crt (these are the only filenames Post-
greSQL recognizes). After this is done, the PostgreSQL server can be started.
The following section describes how to create your own server.key and server
.crt files.
The openssl.exe program is used to create the encryption key and self-signed cer-
tificate files. By default, it is located in the c:\openssl\bin directory. Since this is a
command-line program, it is best to work from a command prompt window. Start a com-
mand prompt window and navigate to this directory using the cd command:
C:\Documents and Settings\RICH>cd \openssl\bin
C:\OpenSSL\bin>
There are three steps required to create the encryption key and self-signed certificate
files:
The first step is to create the encryption key used by PostgreSQL for encrypting SSL
sessions. This is done using the req openssl option:
OpenSSL will respond by asking lots of questions. Each of these questions is present
in the encryption key and certificate, identifying you to your clients. Make sure you enter
as much information as necessary to enable your clients to properly identify you:
C:\OpenSSL\bin>openssl req -new -text -out server.req
Loading 'screen' into random state - done
Generating a 1024 bit RSA private key
................................................++++++
......++++++
writing new private key to 'privkey.pem'
Enter PEM pass phrase:
Verifying - Enter PEM pass phrase:
----
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter '.', the field will be left blank.
----
Country Name (2 letter code) [AU]:US
State or Province Name (full name) [Some-State]:Indiana
Locality Name (eg, city) []:Indianapolis
Chapter 10: Security 247
C:\OpenSSL\bin>
When answering the questions, you can choose to include the necessary information
to identify yourself to your customers. The pass phrase is the encryption key’s password.
It ensures that only you can offer the encryption key to remote clients. You must enter at
least a four-character pass phrase.
Another important item is the Common Name field. You must use the hostname of
your PostgreSQL server as the Common Name. This identifies the host that issues the
certificate. The hostname of your system can be determined by using the ipconfig
Windows command in a command prompt window. You do not need to include the do-
main name, unless you want to use it to further identify your system to remote custom-
ers. It would not be a self-signed certificate if you were to use a different hostname from
the one that will offer the certificate to remote clients. Finally, you do not have to enter a
challenge password when prompted; just hit enter.
This process creates the file privkey.pem, which contains the encryption key, and
the file server.req, which contains a basic certificate. Unfortunately, this key file is
protected by the pass phrase, which you entered when the encryption key was created.
This pass phrase must be entered every time the encryption key is used. This poses a
problem if you automatically start PostgreSQL as a Windows service. Since the service is
started by the operating system, you can not enter the pass phrase to enable the encryp-
tion key.
To solve this problem, you can remove the pass phrase from the encryption key by
using another openssl option:
When this command is executed, OpenSSL asks for the pass phrase for the encryption
key (hopefully you remember what you used to create it). It then creates a new encryp-
tion key, called server.key, which does not require the pass phrase to be entered.
Now you have an encryption key and a basic certificate. All that is left to do is to con-
vert the certificate to a standard X.509 format (required by the SSL protocol) and self-sign
it using the encryption key. This is done using the following command:
openssl req -x509 -in server.req -text -key server.key -out server.crt
248 PostgreSQL 8 for Windows
The certificate is created in text mode using the standard X.509 format and is saved
in the file server.crt. Since the certificate was created in text mode, you can look at it
by using the type command:
C:\OpenSSL\bin>type server.crt | more
Certificate:
Data:
Version: 3 (0x2)
Serial Number:
94:c0:32:bb:ef:23:d9:d8
Signature Algorithm: sha1WithRSAEncryption
Issuer: C=US, ST=Indiana, L=Indianapolis, O=Blum,
CN=CIN-exekiel/[email protected]
Validity
Not Before: Aug 28 20:02:26 2006 GMT
Not After : Sep 27 20:02:26 2006 GMT
Subject: C=US, ST=Indiana, L=Indianapolis, O=Blum,
CN=ezekiel/[email protected]
Subject Public Key Info:
Public Key Algorithm: rsaEncryption
RSA Public Key: (1024 bit)
Modulus (1024 bit):
00:c7:e2:0f:b3:0e:12:84:bb:72:e3:5d:0a:ac:03:
The first part of the certificate file contains the information you entered when creat-
ing the encryption key. If you continue to scroll down through the text file, you will see
the actual certificate section that contains the encrypted certificate.
Now copy the server.key and server.crt files to the PostgreSQL data direc-
tory (at the time of this writing, located at C:\Program Files\PostgreSQL\8.2\
data), then restart the PostgreSQL server service.
You can easily stop and restart the PostgreSQL server service from the pgAdmin III
program. After starting the program, right-click the PostgreSQL server name and select
Stop Service. After waiting a few seconds, you can right-click the server name and select
Start Service. You should then be able to log into the PostgreSQL server from pgAdmin III
as normal.
test=>
The message shows that the SSL connection has been established using the encryp-
tion key. Now your queries are safer from prying eyes on the network.
MONITORING USERS
The third method used for increasing server security is to monitor which users are con-
nected to your PostgreSQL server. It always helps to know which users are connected
from what network clients. Fortunately, the pgAdmin III application provides an excel-
lent interface for this.
The Server Status tool is a window available from the pgAdmin III menu bar. After
starting pgAdmin III and logging into the server, choose Tools | Server Status to open
the pgAdmin III Server Status window, shown in Figure 10-6.
The Server Status window contains four tabs:
Each of these tabs provides valuable information about the current status of the Post-
greSQL databases. The Status tab provides a wealth of information about each Login
Role that is connected to the server. The refresh rate textbox at the bottom of the Server
Status window allows you to set how often the status information is updated in the win-
dow. This allows you to constantly monitor the Login Roles connected to the database
and watch for problems.
Besides monitoring user connections, you can also monitor the transactions submit-
ted by users. As described in Chapter 3, you can enable full transaction logging in the
PostgreSQL server log file. Take care when doing this, though, because the log file can
quickly fill up with logging transactions.
250 PostgreSQL 8 for Windows
SUMMARY
This chapter dove into the world of database security. Database security has become a
hot topic, and implementing a security plan for a database is a must. PostgreSQL allows
remote clients to connect to the server using TCP communications. You can restrict which
clients can connect to the PostgreSQL server by using both a firewall application on the
system and the pg_hba.conf configuration file. The firewall application is used to block
packets from unknown clients from entering the server. The pg_hba.conf configuration
file is used to restrict what databases and Login Roles can be used from a network client.
Another aspect of security you must be concerned about is network security. When
remote clients send data to the PostgreSQL server, that data is contained in TCP pack-
ets that can be intercepted. To protect data from remote clients, you should enable SSL
encryption support on the PostgreSQL server. You can either purchase or generate an
encryption key and certificate file that are used to communicate with remote clients
using the SSL protocol. Data transported using SSL is encrypted and is more secure on
the network because it cannot be read with a network sniffer.
The next chapter tackles another important issue for database administrators, the
topic of database performance.
11
Performance
251
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
252 PostgreSQL 8 for Windows
D
atabase performance is often a difficult concept for new database administrators
to grasp, and even more difficult to implement. There are lots of variables
involved in trying to troubleshoot and solve database performance issues. This
chapter shows some tips that can be useful in trying to get the most performance out of
your PostgreSQL database.
test=>
Chapter 11: Performance 253
The SELECT query is not actually run in the database engine. The database engine
goes through the motions of processing the query, and produces statistics on how it
thinks it would handle the query. The EXPLAIN command output displays three sets of
statistics:
The cost estimate is provided as two numbers. The first value estimates how much
time it would take the database engine to return the first record of data for the query. In
this example, the first value estimates it would take 0.00 time units to produce one record
of the result set. The second value estimates how much time it will take the database
engine to return the entire result set. In this example, PostgreSQL estimates it would take
1.04 time units to return the entire result set.
While these values represent time, they do not use standard time units, such as mil-
liseconds. These time values are generic time references. While this sounds confusing,
there is a reason for this.
Time within the system processor (or processors, if the system is a server with more
than one processor) is inexact. Each application is provided time slices of the processor
as the processor switches between multiple applications. At any given time, an applica-
tion does not necessarily know how many time slices it will have for using the processor.
Attempting to estimate the exact real time a query would take on any processor on any
given system running multiple applications would be nearly impossible.
Instead, the PostgreSQL developers chose to provide values that represent the amount
of “time” the database engine takes to work on the query. These values show how much
actual work time the query components take, not necessarily how much real time. While
these values are not exact, they are consistent. You can use these values to compare one
query to another.
Besides the estimated time, you can also choose to view the actual time it takes the
database engine to process a query. To see the actual time a query takes, use the ANALYZE
parameter after EXPLAIN:
test=> explain analyze select * from store."Customer";
QUERY PLAN
------------------------------------------------------------------
Seq Scan on "Customer" (cost=0.00..1.04 rows=4 width=232)
(actual time=0.082..0.087 rows=5 loops=1)
Total runtime: 0.135 ms
(2 rows)
test=>
254 PostgreSQL 8 for Windows
This produces two sets of values. The actual time shown in the output is still de-
fined using relative units and not actual time. The real time the query took to complete
is shown at the end of the statistics. Although the command actually runs the query to
produce the time results, the output from the query is suppressed and not displayed.
This example showed that to produce the result set for this SELECT command, the
database engine performed a sequential scan of the Customer table, looking for records
that match the query. The entire query was processed in a single step. Things get more
interesting with more complicated queries:
test=> explain analyze select * from store."Customer" order by "LastName";
QUERY PLAN
--------------------------------------------------------------------------
Sort (cost=1.08..1.09 rows=4 width=232)
(actual time=0.148..0.150 rows=5 loops=1)
Sort Key: "LastName"
-> Seq Scan on "Customer" (cost=0.00..1.04 rows=4 width=232)
(actual time=0.010..0.015 rows=5 loops=1)
Total runtime: 0.258 ms
(4 rows)
test=>
By using the ORDER BY clause, we force the database engine to perform two steps.
One thing that you need to get used to when analyzing query explain plans is that you
must work from the bottom up. The first step the database engine performs is the last
item in the list.
Just as with the first example, the first step in this query was to scan the Customer
table for the records that match the query. This process did not take all that long (rela-
tively speaking).
The second step the database engine took was to sort the data records in the result set
from the scan. This component is what slowed down the query. While the query explain
plan estimated that it would not take much time, the actual numbers show that it took
over 0.150 time units to complete sorting the data. Compared to the 0.015 time units for
the scan, you can see how sorting your result set consumes more database engine time.
Now we can compare this query to a query that sorts the data based on a table key:
test=> explain analyze select * from store."Customer" order by "CustomerID";
QUERY PLAN
---------------------------------------------------------------------------
Sort (cost=1.08..1.09 rows=4 width=232)
(actual time=0.032..0.034 rows=5 loops=1)
Sort Key: "CustomerID"
-> Seq Scan on "Customer" (cost=0.00..1.04 rows=4 width=232)
(actual time=0.006..0.011 rows=5 loops=1)
Total runtime: 0.077 ms
(4 rows)
test=>
Chapter 11: Performance 255
By examining the query explain plan for this query, you can see the benefit of using
indexes in queries. The query on the CustomerID column only took 0.077 millisecond,
compared to the 0.258 millisecond for the query sorting the result set on a non-key value.
While these values do not look all too much different, remember this example was for
a table with only five records. If instead this table had five million records, these values
would be multiplied many times over, creating quite a performance hit.
As you use the EXPLAIN command, you may notice that sometimes the database
engine does not do things you expect it to do. Usually, the database engine keeps track of
new database objects, such as table indexes, and utilizes them in the query explain plan.
Sometimes, however, you must force the database engine to re-evaluate how to execute
a query.
The ANALYZE SQL command (different from the ANALYZE parameter in the EX-
PLAIN command) tells the database engine to re-examine all database objects. The table
owner can use the ANALYZE command along with the table name to force PostgreSQL to
re-examine the table and reset any query explain plans set for the table:
test=# analyze store."Customer";
ANALYZE
test=#
The normal output of the ANALYZE command is not too exciting. To get some infor-
mation on what is happening, you can use the VERBOSE keyword:
test=# analyze verbose store."Product";
INFO: analyzing "store.Product"
INFO: "Product": scanned 1 of 1 pages, containing 2 live rows and 6 dead rows;
2 rows in sample, 2 estimated total rows
ANALYZE
test=#
The output now shows some information about the table. In this example, there are
two records that contain live data, and six records that had been either deleted or updated.
By performing a VACUUM command on the table, you can free up six records’ worth of
storage space. If your application performs lots of record updates and deletions, it is a
good idea to regularly analyze the status of the table, and vacuum when necessary.
This is the basic idea behind using the EXPLAIN command. As your customers start
using more complex queries, and your application starts containing more data, query
performance can become a problem. Using the EXPLAIN command gives you (and the
database programmers) a simple tool for examining how individual queries affect ap-
plication performance.
As discussed in Chapter 4, the pgAdmin III Query tool can be used to view the query
explain plan in a graphical mode. Start the Query tool by clicking the SQL icon in the
pgAdmin III toolbar.
In the top window of the Query tool, enter your SQL query command just as normal
(without the EXPLAIN keyword). After entering the query, click the Explain Query icon
on the toolbar. The lower window contains the result as if you used the EXPLAIN com-
mand for the query.
The Data Output tab shows the normal output you are used to seeing from the
EXPLAIN command. The Explain tab shows a graphical representation of the EXPLAIN
output. Each query component is shown as a separate icon in the diagram. If you click
an individual icon, it displays the statistics for that component. Figure 11-1 shows a more
complex query example displayed in the Explain tab.
There are lots of components that were generated from this query. The SQL query
used to produce this output contained two inner joins to join three tables:
By looking at the graphical output, it is much easier to see how each of the compo-
nents is used to process the query.
Just as with the EXPLAIN SQL command, you can set the Query tool query explain
function to include actual performance data. Choosing Query | Explain Options | Ana-
lyze enables the ANALYZE keyword for the EXPLAIN command. Now when you click
an icon in the graphical output, you can view the actual times as well as the estimated
times.
Query Tuning
As shown earlier in the “Enhancing Query Performance” section, PostgreSQL creates a que-
ry explain plan for each query executed by the database engine. The postgresql.conf
258 PostgreSQL 8 for Windows
file contains parameters that can be used to control what components PostgreSQL uses when
executing the query explain plan. Modifying these parameters can alter the way the database
engine processes your queries, and sometimes can help increase the performance of your
queries.
The individual components used by the database engine to process queries are called
planner methods. Planner methods are the building blocks used within the query explain
plan to obtain the result set, as we saw graphically in Chapter 10, Figure 10-7. The data-
base engine puts these building blocks together to produce the result set for the query.
Each icon in the explain query output is a separate planner method. The various planner
methods used by the PostgreSQL database engine are described in Table 11-1.
You can select to disable individual planner methods in the database engine to force
the database engine to utilize a different planner method combination to produce the re-
sult set. The Query Tuning section of the postgresql.conf configuration file contains
entries for each of the planner methods (see Chapter 3). To disable a particular planner
method, alter the appropriate setting in the postgresql.conf configuration file:
enable_bitmapscan = off
By setting a parameter to off you are disabling that planner method in the database
engine. You should use this feature with caution. Remember that disabling a planner
method disables that method for all queries on the server. While this may increase the
performance of one query, it may have severe adverse effects on other queries. This fea-
ture is not intended as a permanent method of altering the PostgreSQL query planner.
It is most often used as a debugging tool. You can disable an individual planner method
and see how your queries behave, then re-enable the planner method and determine
how to rewrite the queries to work around that planner method.
Resource Usage
As described in Chapter 3, the Resource Usage section of the postgresql.conf file
contains parameters that affect how the PostgreSQL server utilizes system memory.
Memory usage is one of the biggest performance bottlenecks on a server. PostgreSQL
provides lots of different configuration parameters that allow you to control how the
server uses system memory.
There are a few parameters, described next, that are key for controlling how the Post-
greSQL server performs in your particular environment.
shared_buffers = 1000
The shared_buffers parameter allows you to set how much memory (in 8KB
blocks) the PostgreSQL server will use on the system. This is a somewhat controversial
topic, and different administrators have different opinions on what this should be set
to. The default setting of 1000 (8MB of memory) is often considered just fine for most
normal Windows server environments. If you are working in a high-volume application
with lots of customers connecting to the server and processing queries and transactions,
you can increase this value to see if it helps performance in your specific environment.
Remember, increasing this value may have adverse effects on other processes running
on the system. I do not recommend making large increases to this parameter all at once,
as it could severely affect other applications running on the server.
work_mem = 1024
This parameter limits the amount of working memory (in kilobytes) available for
each PostgreSQL sort and hash function. If you are using an application that constantly
sorts data, you can try to increase this value to increase performance of the sorts. Be care-
ful when increasing this value, though, as it applies to each individual sort or hash func-
tion, not the system as a whole. If your application performs simultaneous sorts, each
one is granted this amount of memory. The default value of 1MB of memory per sort or
hash function is fine for most normal database environments.
max_fsm_pages = 20000
The max_fsm_pages parameter sets the number of free-space pages tracked by Post-
greSQL. When a record is updated or deleted from a table, PostgreSQL does not remove
260 PostgreSQL 8 for Windows
the record immediately, it is just tagged internally as being deleted. The free-space map
tracks these deleted records. The more free-space pages available, the more records Post-
greSQL can track before requiring a vacuum of the table to remove deleted records. In
applications that perform lots of deletes and record updates, the free-space pages fill up
quickly, decreasing performance. Increasing this value means providing more free-space
pages and having to perform fewer vacuum operations to clear out the free-space pages.
Runtime Statistics
The last set of parameters do not directly affect the server performance, but allow you
to configure the server so that it logs performance data for you to analyze. The Runtime
Statistics section of the postgresql.conf configuration file allows you to enable sev-
eral different performance logs, as described next.
log_statement_stats = on
Setting this parameter to on enables logging statistics for each SQL command pro-
cessed by the database engine. It is not hard to see that, for busy databases, this could
lead to an enormous amount of data. The logging entries are placed in the standard
PostgreSQL log file in the pg_log directory (see Chapter 3). A statement entry looks
like this:
These statistics show how much time the query took, how much data was read and
written, and how much of the data was already contained in buffers. The buffer hit rate is
an important factor in query performance. The more information the PostgreSQL server
can retrieve from memory buffers, the faster the query will perform.
While the log_statement_stats parameter logs statistics for an entire SQL com-
mand, there are three other parameters that can log statistics for individual components
within the database engine process:
log_parser_stats = off
log_planner_stats = off
log_executor_stats = off
If any of these three parameters is enabled, you must disable the log_statement_
stats parameter, or an error condition will occur. Each of these parameters can be used to
log query statistics during the parsing, planning, and executing phases of the database en-
gine. The statistics produced are placed in the currently active PostgreSQL server log file.
Chapter 11: Performance 261
Instead of changing the run-time statistics settings in the configuration file and re-
starting the PostgreSQL server, you can change the values on-the-fly in psql using the
set_config command:
The config parameter is the name of the configuration parameter, and value is the
parameter value, either on or off. The is_local parameter defines whether the setting
applies to the current transaction (true) or the entire psql session (false):
This command enables statement statistics logging for the remainder of the psql
session. Using this feature, you can enable logging for a single transaction, then turn
logging off. This prevents you from having to pour over thousands of lines of logging
information looking for the transaction stats.
SUMMARY
Database performance is an important topic for administrators. The key to database per-
formance is often the queries used to produce reports. The ability to analyze queries and
see how they are processed by the database engine is a vital tool in evaluating database
performance. PostgreSQL supports the EXPLAIN SQL command to allow you to view
how the database engine processes a query. You can view each step taken to produce
the result set, and see the time required to complete the step. Using this information, it
is often possible to either create objects, such as index files, or rewrite queries to utilize
other factors to increase performance.
Finally, the chapter discussed some of the parameters available in the postgresql
.conf configuration file that affect database performance. There are several parameters
that control how the database engine processes queries, along with several parameters
that control how the PostgreSQL server handles system resources, such as memory.
Knowing how to fine-tune these parameters can often help increase the performance of
the PostgreSQL server in your database environment.
The next (and final) part of this book deals with how to use PostgreSQL in your Win-
dows programming environment. There are several different programming languages
that PostgreSQL supports. The final chapters in this book provide details on how to use
PostgreSQL as a back-end database in four different Windows programming environ-
ments. Chapter 12 discusses how to migrate your Microsoft Access databases to Post-
greSQL, and how to use Microsoft Access as a front end to a PostgreSQL database.
This page intentionally left blank
III
Windows Programming
with PostgreSQL
263
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
This page intentionally left blank
12
Microsoft Access
and PostgreSQL
265
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
266 PostgreSQL 8 for Windows
M
icrosoft Access is one of, if not the, most popular database packages available
for the Windows platform. It provides an excellent, user-friendly graphical
environment for creating tables, queries, data entry forms, and customized
reports. The downside is that Access often gets used inappropriately in corporate
environments. While Access is perfect for single-user database applications, it suffers
from performance problems when used in a multi-user environment. The problem is that
it is not uncommon to see users become proficient in Access on their local workstations,
then port an Access application to a network environment and have dozens of other
people using the application. This can create a problem, both in application performance
and network load. However, trying to convert a multi-user application completely from
Access to PostgreSQL can be time-consuming. This chapter shows how you can leverage
your Access programming skills so that you can use your existing front-end Access
applications while utilizing the PostgreSQL database on the back end to manage your
data in a multi-user-friendly environment.
PostgreSQL server using ODBC must have the psqlODBC package installed and config-
ured. There are two ways to obtain and install psqlODBC:
The standard PostgreSQL installation package includes the psqlODBC package, along
with all of the other parts of a PostgreSQL server. To install psqlODBC, you must make
sure that the ODBC option is enabled in the PostgreSQL installation window, shown in
Figure 12-1.
The ODBC drivers are a set of Windows .dll files installed in the Windows system
directories. The ODBC documentation files are placed under the PostgreSQL installation
directory (C:\Program Files\PostgreSQL\8.2\doc\psqlODBC).
For a remote client, however, all you need to install is the ODBC driver; you obvi-
ously do not need to install the entire PostgreSQL server package. This makes using
the standard PostgreSQL installation package somewhat overkill for installing just the
ODBC driver. To compensate for that, there is another installation package that only in-
cludes the psqlODBC package for Windows. This is obviously a much smaller package,
and installing it is much easier for customers than extracting the ODBC drivers from the
standard PostgreSQL installation package.
The psqlODBC driver package can be found by following the links on the main Post-
greSQL web site (www.postgresql.org). From the main web page, click the Downloads
link. On the Downloads page, click the FTP Browser link to go to the PostgreSQL FTP
Browser mirror site, www.postgresql.org/ftp/. Click the odbc folder, and then click
the versions folder. This folder contains different binary installation packages for the
psqlODBC program. The msi folder contains the full Windows installation packages, so
click that link to open the folder.
The msi folder contains several versions of the psqlODBC package in zip format.
Select the latest version (at the time of this writing it is 8_02_0200) to download. This link
takes you to a list of available download sites for the file. Select a site, and download the
zip file to your hard drive.
After the psqlODBC package is downloaded, you must extract the Windows installer
file (psqlodbc.msi) from the zip file. Double-click the zip file from Windows Explorer
and extract the file. Start the psqlODBC installer by double-clicking the .msi package
from Windows Explorer. You must have administrative rights on the system you are
installing this on. The main installation window appears. Click the Next button to walk
through the welcome and license windows. The Feature Selection window shows the
components available to install, as shown in Figure 12-2.
Notice that there are only two options to install: the ODBC driver and the psqlODBC
documentation. By default, the installer will place the ODBC driver files and documen-
tation in the C:\Program Files\psqlODBC directory. You can change that location by
clicking the Browse button. Select the components you want to install, and click the Next
button. After clicking another Next button to confirm the install, the installation process
starts. Click the Finish button to complete the installation.
Tab Description
User DSN User-level data sources available only to the user
System DSN System-wide data sources available to all users, including
services, on the system
File DSN System-wide file data sources shared by users
Drivers The ODBC drivers (and versions) currently installed on the system
Tracing Start and stop an ODBC trace
Connection Enable or disable connection pools, allowing applications to reuse
Pooling a single open session to the ODBC driver
About Information about the system ODBC environment
After installing the PostgreSQL psqlODBC driver package (through either the main
PostgreSQL installer or the separate psqlODBC installation package), you should see
two PostgreSQL driver entries in the Drivers tab:
Figure 12-4. The PostgreSQL (Unicode or ANSI) ODBC Driver Setup configuration window
name you will see when searching through the User or System DSN list. It is a good idea to
use a descriptive name, especially if you are creating multiple ODBC sessions for multiple
PostgreSQL databases (remember, each session can connect to only one database).
The Description textbox is for information purposes. You can put a longer text phrase
in the Description textbox to help identify the ODBC session.
The rest of the textboxes define the properties of the PostgreSQL database you need
to connect to. These should be fairly self-explanatory. Put the name of the database that
the ODBC session should connect to in the Database textbox, the Login Role used to con-
nect to the database in the User Name textbox, and the password used for the Login Role
in the Password textbox.
The value of the Server textbox defines the hostname or IP address of the PostgreSQL
server. If the PostgreSQL server is running on the same system you are creating the ODBC
session on, you can use the hostname localhost. Otherwise, enter the appropriate
IP address or hostname of the system where the PostgreSQL server is running. In the
Port textbox, enter the TCP port used by the PostgreSQL server to listen for connections
(port 5432 by default).
The SSL Mode drop-down box allows you to determine the encryption method the
ODBC session will attempt to use to connect to the PostgreSQL server (see Chapter 10).
The SSL Mode options available are the following:
X Prefer Attempt an SSL encryption session first. If that fails, attempt a plain-text
TCP session.
R Allow Attempt a plain-text TCP session first. If the server requires an SSL-
encrypted session, then attempt to connect using SSL.
272 PostgreSQL 8 for Windows
R Require Attempt an SSL encryption session first. If that fails, do not attempt
any other connections.
W Disable Attempt a plain-text TCP session first. If that fails, do not attempt any
other connections.
These are the basic features required to configure an ODBC session to communicate
with a PostgreSQL server. In most applications, this is all you should have to configure.
However, for more advanced users, there are two additional configuration buttons avail-
able in the configuration window.
These buttons are used to fine-tune the ODBC session parameters by telling the
ODBC driver how to handle specific types of data, and whether or not to set up log files
to log ODBC connection activity. Each of these additional configuration areas contains
default values that work fine for most database situations.
Clicking the Datasource button opens the Advanced Options window, shown in
Figure 12-5.
There are two pages of advanced options you can configure. Clicking the Page 2 but-
ton goes to the second page of options, and clicking the Page 1 button returns to the first
page. Table 12-2 lists the advanced options available, along with a short description of
their purpose.
The first group of options, up to the Read Only option, are located on page 1. The
remaining options are on page 2. By default, these settings only apply to the specific
ODBC session they are defined in. For testing, you can create multiple ODBC sessions
pointing to the same database using different configuration settings. The Global button
allows you to set the log files to be used for all psqlODBC sessions. This enables logging
on all psqlODBC sessions, across all users on the system.
To easily execute the SQL, copy the commands into a text file called auction.sql.
Then you can use the psql program to load the commands into the database to create
the objects:
As psql processes the commands, you will see the output messages. This SQL code
creates a new schema called auction, then the four tables in the auction schema (you
may need to click the database server name, then click the Refresh button on the top
menu bar to see the new items). A sample Login Role to use with the application (called
earl) is created, along with a sample Group Role (auctioneer).
The auctioneer Group Role is assigned all privileges to all of the tables created,
along with privileges to the schema. In a real application, you would probably have
different levels of access, such as a group to control adding new bidders and donors,
another group to update item information, and another group to update the Item table
with the winning bids and insert bidder payments to the Payment table.
on or from a remote client system, start the ODBC Data Source Administrator program.
From the ODBC Data Source Administrative window, click the System DSN tab, then
click the Add button. From the Create New Data Source window, select either the ANSI
or Unicode PostgreSQL driver, then enter the information for your new database in the
PostgreSQL (Unicode or ANSI) ODBC Driver Setup dialog box, as shown in Figure 12-6.
Remember to set the SSL Mode settings as required in the pg_hba.conf configura-
tion file of the PostgreSQL server for the client (see Chapter 3). In the Data Source text-
box, enter auction. This value will be used to reference the auction schema tables in the
Microsoft Access programs.
Unfortunately, there is no way to test the ODBC DSN configuration. Fortunately, if
you try to connect using the ODBC source and it fails, an fairly detailed error message
appears, giving you a good idea what value(s) needs to be changed.
The blank database does not contain any tables, queries, forms, reports, or macros.
The next step is to create new tables that map to the PostgreSQL server tables. This is
done by linking tables in Access.
Linking tables allows Access to retrieve table data directly from the remote table
without having to actually have the data in the .mdb file on the system. Multiple sys-
tems can link to the same data sources. This enables you to have only one copy of the
data on the data source and still allow multiple users to accessing the single data copy.
Access retrieves the data using standard SQL commands. This greatly reduces network
load, because the contents of an entire .mdb data file do not need to be moved across the
network.
To create a new linked table, choose File | Get External Data | Link Tables. The Link
window appears, pointing you to a default directory. Since this application will link to
an ODBC connection instead of to an .mdb file, click the Files of Type drop-down arrow
and select ODBC Databases. The Select Data Source window appears. Select the Machine
Data Source tab. This shows a list of all the DSNs (both User and System) configured on
the system.
The auction data source that was created in the previous step is available. Click
that entry, then click OK. This produces the Link Tables window, shown in Figure 12-7.
If you get an error message at this point, check your ODBC session configuration set-
tings, especially the user and password values. Remember to use the earl Login Role,
along with the auction password assigned to the Login Role. It is also important to
ensure that your PostgreSQL server’s pg_hba.conf file is configured to allow access
from the network address of the remote client (see Chapter 3).
If the ODBC session connected to the PostgreSQL server properly, you will see a list
of all tables available on the PostgreSQL database. Notice that there are lots of hidden
tables in the database that appear in the table list, along with the schema tables you cre-
ated earlier.
Click the four auction schema tables (you can select more than one table at a time
to link to) and then click OK. The four tables (auction.bidder, auction.donor,
auction.item, auction.payment) now appear in the Access tables list, as shown in
Figure 12-8.
You can now treat these tables as any other Access tables. Double-clicking a table
produces the Access Datasheet view, allowing you to easily enter new data into the table.
You can also use the Access graphical Design View mode to view the properties of the
table columns, although you will not be able to alter any of these properties.
Use the Access Datasheet view to add some sample data to your tables. Create a few
sample bidders, one or two donors, then a few sample items to track. As you enter data,
notice that all of the constraints defined in the tables apply to the Access Datasheet view.
If you try adding an item with an invalid donorkey value, you get an error for the un-
matched foreign key value, shown in Figure 12-9.
The errors returned by the ODBC driver are very descriptive, allowing you to easily
identify problems.
After you have entered some sample data to work with, you are ready to create que-
ries, forms, and reports in Access using the data tables from the PostgreSQL server. If
you are familiar with how to use the Access environment, this should be a snap. All of
the programming functions and features in Access behave exactly the same when using
the linked PostgreSQL tables.
As an example, click the Reports tab, then click the New button to create a new report.
Select the Design View mode to create the new report. In the Choose the Table or Query
Where the Object’s Data Comes From drop-down box, select the auction_bidder table
created from the link to the PostgreSQL auction.bidder table. A blank report appears
in the Design View window, linked to the bidder table.
282 PostgreSQL 8 for Windows
In the new report, you can now add bidder table fields within the Detail section of
the report. Select a textbox control and drag it into the Detail report area. Double-click
the new textbox to produce the properties for the control. In the Properties window, the
Control Source parameter contains all of the columns (fields) of the table. You can select
any of the columns to include in the report, shown in Figure 12-10.
After adding the desired fields to the report, save the report and run it. The resulting
report shows the data retrieved from the bidder table on the PostgreSQL server, shown
in Figure 12-11.
Besides the standard arsenal of Access programming tools, you now can utilize Post-
greSQL tools to help simplify your Access application. You now have triggers, functions,
and views available for your PostgreSQL tables. You can use these features to help sim-
plify and speed up your database applications.
of the PostgreSQL database at your disposal. Instead of using the old Access queries,
you can create a PostgreSQL view that replicates the query by combining columns of
multiple tables.
The difference is that once a view is created in the PostgreSQL database, it is available
along with the regular tables as an external link. When the view is accessed by users, the
query is processed internally on the PostgreSQL server, and only the result set data is
returned. This greatly reduces the network overhead, and reduces the time required to
process the query.
As an example, create the view itemswon in the auction schema:
This view produces a result set that contains only the items that had successful win-
ning bidders, and lists the names of the bidders instead of the bidder number.
284 PostgreSQL 8 for Windows
Once the view is created in the PostgreSQL database, you can link a new Access table
to the view by using the Get External Data feature described in the previous section. The
new view appears in the table list along with the other database tables. You can now use
the view as a regular table, using the Datasheet view to see the values in the view, and
using the view in reports similar to the original query. The one drawback is that, whereas
Access allows you to use queries to insert data, you cannot insert new data into a Post-
greSQL view. The PostgreSQL developers are still working on that functionality.
After you have some sample data in the Access table, you can test exporting it to a
PostgreSQL table. Right-click the Access table and select Save As/Export. In the Save As
dialog box, select To an External File or Database, then click OK. Another dialog box ap-
pears, asking you for the filename and file type to save the data as. In the Save as Type
textbox, select Text Files, choose a location and a filename, and click the Export button.
The Export Text Wizard dialog box then guides you through creating an export file
of the table data. In the first wizard window, select that you want a delimited text file,
then click Next. In the second wizard window, select the delimiter character you want to
use. Remember, you should choose a delimiter character that is not present in any of the
text data in the table. Also, select the Text Qualifier character in the drop-down box. In
most cases, you can set this to none, as the \copy meta-command does not require text
qualifiers. When done, click the Finish button to start the export.
After creating the export text file, you can copy it to the PostgreSQL server and use
the \copy meta-command in psql to import the data into the new table:
If any of the data items does not import properly, psql will produce an error mes-
sage saying which line it is located in. Also, since the \copy command is treated as a
single transaction, if any of the lines fails, the entire import will fail.
SUMMARY
While the Microsoft Access program is a popular database application, it does not scale
well to large multi-user databases. To solve this problem, you can leverage your existing
Access applications, but migrate the Access data tables to a PostgreSQL server. The exist-
ing Access applications can connect to the PostgreSQL tables using ODBC connectivity.
The PostgreSQL ODBC driver must be loaded on each system that requires access to the
tables. The PostgreSQL ODBC driver is included in the standard PostgreSQL installation
package, or you can download it separately in the psqlODBC package.
After installing the PostgreSQL ODBC driver, you can create an ODBC session using
the Windows ODBC Data Source Administrator program that is included in Windows
XP Home and Professional, Windows 2000 Workstation and Server, and Windows 2003
Server. Once the ODBC session is created, you can create linked tables in Access that link
directly to the PostgreSQL tables. Each linked table behaves exactly as a normal table in
Access. Access queries, forms, and reports can all be used as normal using the linked
tables.
When migrating Access applications to PostgreSQL, you may have to also migrate
data. You can use the standard Access export feature to export table data to a text file. Af-
ter creating the duplicate table in PostgreSQL, use the \copy meta-command included
in psql to import the text data file into the new PostgreSQL table.
The next chapter describes how to access a PostgreSQL database from your Micro-
soft .NET applications. The .NET platform is the latest programming environment intro-
duced by Microsoft. PostgreSQL includes a library that can be used by .NET applications
to directly access a PostgreSQL database across a network.
13
Microsoft .NET
Framework
287
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
288 PostgreSQL 8 for Windows
T
he Microsoft .NET Framework is the preferred programming environment for
Microsoft Windows platforms. Microsoft has developed the .NET platform to
allow multiple programming languages to interface using the same library files
and development environment. PostgreSQL also provides a development library for
.NET programmers. You can create .NET programs in Visual Basic .NET, C#, and Visual
C++ that can communicate with a PostgreSQL server. This chapter describes the .NET
programming environment and shows how to use the PostgreSQL .NET library to create
Windows programs that interface with your PostgreSQL server.
C++.NET
C#.NET
Visual Basic.NET
C++
The Visual Studio .NET suite is the flagship development environment for Microsoft.
This integrated development environment (IDE) offers many advanced features to as-
sist your Windows application programming tasks. Microsoft describes the Visual Stu-
dio package as “a rapid application development (RAD) tool, enabling programmers to
quickly code and debug .NET applications.” It includes a complete graphical environ-
ment for creating Windows forms, console applications, and web applications, as well as
providing a complete debugging environment for programmers.
290 PostgreSQL 8 for Windows
The .NET Framework SDK offers an inexpensive way to get started with .NET pro-
gramming. The SDK is available for free from Microsoft, and contains command-line
tools for compiling and debugging .NET programs. This application allows you to get
the feel for .NET programming without investing in an expensive IDE.
There are two components that make up the .NET Framework SDK development
environment:
The .NET redistributable package provides the CLR libraries necessary to run .NET
applications on a Windows system. This package must be installed on all Windows sys-
tems that run .NET applications. If you provide your application for other workstations
to use, they must each have at least the .NET redistributable package installed.
The .NET Framework SDK package contains the compilers and library files neces-
sary to create .NET applications from the command line. The .NET Framework SDK
contains the Visual Basic .NET and C# compilers. This provides a single development
environment for using the .NET programming language of your choice! This package
needs to be installed only on the systems you use to develop your applications.
You start the .NET Framework SDK package by double-clicking the setup.exe file.
Similar to the redistributable package, the Framework SDK package is a compressed
file that is automatically extracted into a temporary directory for installation. After the
usual introduction and license windows, the Framework SDK installation options are
displayed, shown in Figure 13-2.
The Framework SDK installation package contains three options:
X QuickStart Samples Installs lots of Visual Basic .NET and C# program exam-
ples to help you learn and understand .NET programming
R Tools and Debugger Installs the compilers, library files, and debugging tools
necessary for developing .NET programs
W Product Documentation Installs HTML documentation for each of the .NET
namespaces, as well as general information about the .NET platform
Select the options you want to install, then click Next. The next window asks for the
location to install the Framework SDK. All of the library and executable files are stored
in default directories on the system. You can, however, change the location where the
product documentation and sample programs are stored.
After you select the installation location, the installation starts. The Framework SDK
installation takes a long time to install, even on super-fast systems. When the installation
is finished, you have a complete .NET development environment on your system. To test
this, choose Start | Microsoft .NET Framework v2.0 | SDK Command Prompt. A com-
mand prompt window opens, using the default location of the Framework directory. At
the command prompt, run the C# compiler program (csc) with no options, as shown in
Figure 13-3.
The compiler header is shown, along with an error that no source code file was speci-
fied to compile. In the header, you can see the .NET Framework version that is loaded
(in this example, version 2.0.50727). If you get this result, you are ready to start creating
.NET applications.
The Linux Mono project provides a development environment for creating and run-
ning .NET applications on Linux systems. This is the first attempt at cross-platform
development using the .NET technology. While not 100 percent compatible with the
Microsoft .NET environment, great strides have been made in Mono to make most .NET
applications work seamlessly on a Linux system.
Npgsql supports two different Microsoft .NET platforms. The original .NET version 1.0
platform has been officially deprecated by Microsoft, and is not recommended for new
development work. Customers who are using the .NET version 1.0 platform are recom-
mended to upgrade to version 1.1. Recently, Microsoft has released the updated .NET
version 2.0 platform. This platform contains a new CLR environment that supports many
new features. It is recommended for all new .NET development to utilize this platform.
Downloading Npgsql
The Npgsql library download can be found from the PostgreSQL main web site (www
.postgresql.org). From the main web page, click the Downloads link. On the Downloads
page, click the FTP Browser link to go to the list of available downloads.
On the FTP Browser page, click the projects folder link, then click the pgFoundry
folder link. This is the repository for projects that are not officially part of the PostgreSQL
package, but are obviously related to PostgreSQL. In the pgFoundry area, click the
npgsql folder link.
The npgsql folder contains all of the available Npgsql library formats currently sup-
ported. The filename for each package includes the npgsql version along with the pack-
age format. The Npgsql library can be download both as a source code package that
must be compiled and as a binary distribution that is precompiled to run on a specific
platform. Source code packages are identified by the filename tag src. Binary distribu-
tion packages use the filename tag bin, along with the platform they are compiled for.
For the Microsoft .NET platform, it is by far the easiest to download the binary dis-
tribution package. Look for the latest version available, and download the binary distri-
bution for the .NET platform you have installed. For example, to download the binary
Npgsql library for the .NET version 2.0 platform, get the following file:
Npgsql1.0-bin-ms2.0.zip
At the time of this writing the official 1.0 version of Npgsql had just been released. By
the time you read this, there may be updates that have been released to fix any bugs or
security issues. Please use the latest version available for download. After downloading
the binary distribution package, you must use a zip program to unzip the contents into
a directory. Windows XP Home and Professional, as well as Windows 2003 Server, can
automatically detect and handle .zip files. For Windows 2000 workstations and serv-
ers, you must install a third-party .zip file handler.
294 PostgreSQL 8 for Windows
C:\>
Do the same for the Mono.Security.dll file included in the Npgsql binary distri-
bution package. You are now ready to run a .NET PostgreSQL application.
automatically included. When using the command-prompt compilers, though, you must
use the /r option to directly reference the npgsql.dll file in the compiler command
each time you compile a program.
The format for compiling a C# application is
This command adds the npgsql.dll library file as a resource for the compiler. The
npgsql.dll file must be in a directory contained in the PATH environment variable
of the current command-prompt session, or it must be in the same directory where the
program source code file is located. To add the directory to the PATH environment vari-
able, right-click My Computer, select Properties, click the Advanced tab, and then click
the Environment Variables button, which takes you to the Environment Variable editor,
where you can modify the PATH setting.
For a Visual Basic .NET program, you must use the /r option to reference the
npgsql.dll file, and also reference the System.Data.dll file, which is standard in
the .NET Framework SDK:
Again, the npgsql.dll file must be located in either the same directory as the source
code file or in a directory listed within the PATH environment variable. The System
.Data.dll file is already located in the standard .NET library directory.
Besides referencing the Npgsql library file in the compiler command, you must also
include the Npgsql and System.Data namespaces in the Visual Basic .NET or C# pro-
grams that use the Npgsql library. By including the namespaces, you can directly refer-
ence any of the classes contained in the namespaces. In a C# program, this is done with
the using statement at the start of a program:
using System.Data;
using Npgsql;
Now you can reference any class in the Npgsql namespace within the C# program. In
a Visual Basic .NET program, this is done with the Imports statement:
Imports System.Data
Imports Npgsql
The next section walks through the classes included in the Npgsql library, and shows
how to use them in both C# and Visual Basic .NET programs.
Class Description
NpgsqlCommand Contains a SQL statement or function to be
executed on a PostgreSQL database
NpgsqlCommandBuilder Builds commands for automatically inserting,
updating, and deleting table records
NpgsqlConnection Creates a connection to a PostgreSQL server
NpgsqlDataAdapter Contains record data for SELECT, INSERT,
UPDATE, and DELETE SQL commands
NpgsqlDataReader Reads a result set from a query and makes the
data values available
NpgsqlError Contains error and notice messages returned by
the PostgreSQL server
NpgsqlEventLog Handles Npgsql event and debug logging
NpgsqlException The exception that is thrown if an Npgsql error
event occurs
NpgsqlNoticeEventArgs Notice events for less important errors
NpgsqlNotificationEventArgs Parameters for notification events
NpgsqlParameter Contains parameters used for Npgsql commands
NpgsqlParameterCollection A collection of parameters used in Npgsql
commands and mappings to columns in a dataset
NpgsqlTransaction Contains a set of SQL commands that are bundled
into a complete transaction for the database
ServerVersion Contains the version of the PostgreSQL server
The classes contain members and properties that are used to interact with the
PostgreSQL server to help you perform standard SQL commands and functions in your
application code. The following sections describe these classes in more detail.
NpgsqlConnection(connectionstring)
Chapter 13: Microsoft .NET Framework 297
The connectionstring parameter sets the values used to define the PostgreSQL server
connection. This is stored in the ConnectionString property of the class. The
ConnectionString property is a text string that can be built using standard string
techniques. The ConnectionString property uses identifiers and values to define the
connection properties. The identifiers and values are defined in the string using the fol-
lowing format:
ident1=value1;ident2=value2;ident3=value3...
Identifier Description
Server The hostname or IP address of the PostgreSQL server.
Port The TCP port of the PostgreSQL server.
Protocol The protocol to use if not automatic. May be 2 for pre-version 7.3
PostgreSQL servers or 3 for version 7.3 or later servers.
Database The name of the database to connect to. The default is the username.
User Id The Login Role to connect to the server as.
Password The plain-text password for the Login Role specified.
SSL Determines whether to use SSL (True or False), similar to
Sslmode. The default is False.
Pooling Controls connection pooling (True or False). The default is False.
MinPoolSize Sets the minimum size of the connection pool when using pooling.
MaxPoolSize Sets the maximum size of the connection pool when using pooling.
Encoding Sets the encoding to use for the connection (ASCII or UNICODE).
The default is ASCII.
Timeout Sets the time to wait for the connection to open (in seconds). The
default is 15 seconds.
CommandTimeout Sets the time to wait for a command to finish executing (in seconds).
The default is 20 seconds.
Sslmode Sets the mode for the SSL connection (Prefer, Require, Allow,
or Disable). The default is Disable.
ConnectionLifeTime Sets the time to wait before closing unused connections if a
connection pool is used (in seconds). The default is 15 seconds.
SyncNotification Specifies whether Npgsql should use synchronous notifications.
Server=127.0.0.1;Port=5432;User Id=earl;Password=auction;Database=test
The identifiers and values define the typical connection parameters used to connect
to a PostgreSQL server database, using a specific Login Role. Remember, the PostgreSQL
server must be configured to accept connections, and the pg_hba.conf configuration
file must be configured to allow the specific Login Role to connect to the specified data-
base (see Chapter 3).
After an NpgsqlConnection class instance is created, there are two class meth-
ods that are used to control the connection. The Open() class method is used to start a
session with the PostgreSQL server. If the connection fails, the Open() method throws
an NpgsqlException exception. This should be caught within a try-catch block and
handled accordingly. If the exception is not caught by the program, the CLR will abort
the application and produce an error message to the user—not exactly user-friendly.
When you are finished with the PostgreSQL session, use the Close() class method
to stop the session. This disconnects from the PostgreSQL server and closes all of your
session objects. Likewise, if the Close() method fails, it throws an NpgsqlException
exception.
There are a few NpgsqlConnection class properties that come in handy after the
connection is established:
{
conn.Open();
ServerVersion sv = conn.ServerVersion;
String vers = sv.ToString();
Console.WriteLine("Connected to server, version: {0}", vers);
} catch( NpgsqlException e)
{
Console.WriteLine("problem connecting to server: {0}", e.Message);
} finally
{
conn.Close();
}
}
}
As you can see from the version.cs program, the ConnectionString property
for the NpgsqlConnection object can be built as any type of String value, including
a String that has been created from several pieces. Remember to include a semico-
lon between each value in the ConnectionString. This code example establishes a
connection to the PostgreSQL server running on the local system (using the loopback
network address 127.0.0.1), logging into the test database using the earl Login Role
created in Chapter 12. You may have to change some of these values if your PostgreSQL
environment is different. The ConnectionString value is long, so to split it up I built
the string from two parts.
The NpgsqlConnection instance is created using the ConnectionString param-
eter. This does not need to be placed within a try-catch block, as the connection is not
attempted until the Open() method is called.
Within the try-catch block section, after opening the connection using the Open()
method, the program uses the ServerVersion property of the connection to retrieve
the version of the PostgreSQL server. The ServerVersion property uses the Server-
Version class to hold the version value.
This class contains a ToString() method that converts the version to a simple string
value that can be displayed. You can also extract the individual major, minor, and patch
values of the version as integer values for comparison. After displaying the version, the
program uses the Close() method to stop the PostgreSQL session.
The try-catch block catches any NpgsqlException exceptions that are thrown by
the methods, and displays the error message to the user. It is always a good idea to catch
exceptions rather than letting the .NET CLR produce its own ugly error messages and
abruptly terminate the program.
To compile the version.cs program from the command prompt, type
csc /r:npgsql.dll version.cs
Make sure that the version.cs and npgsql.dll files are available for the com-
piler. The resulting executable program, version.exe, can be run on any worksta-
tion that has the .NET redistributable package installed (of course, if it connects to a
remote PostgreSQL server, you must use the correct server hostname or IP address in the
300 PostgreSQL 8 for Windows
ConnectionString parameter). Here is an example of what you should see when run-
ning the program:
C:\>version
Connected to server, version: 8.2.0
C:\>
C:\>version
problem connecting to server: FATAL: 28000: password authentication
failed for
user "earl"
C:\>
Property Description
CommandText Gets or sets the SQL command or function to execute on
the PostgreSQL server
CommandTimeout Gets or sets the time to wait (in seconds) for a command
to complete
CommandType Gets or sets the format of the CommandText property
(either Text, the default, or StoredProcedure)
Connection Gets or sets the NpgsqlConnection instance used by
the command instance
LastInsertedIOD Returns the OID of the last inserted row
Parameters Gets the NpgsqlParametersCollection instance
used for the command
Transaction Gets or sets the NpgsqlTransaction instance within
which the command executes
The constructor for an NpgsqlCommand instance can define the CommandText and
Connection properties:
NpgsqlCommand(CommandText, Connection)
The SQL command is entered into the CommandText property of the NpgsqlCom-
mand class instance. By default, the CommandText property is a text SQL command. You
can also specify a stored procedure by setting the CommandType property to Stored-
Procedure.
The Connection property contains an NpgsqlConnection instance that may or
may not be already opened using the Open() method. These properties can also be set
separately after an instance of the NpgsqlCommand class is created:
NpgsqlConnection conn = new NpgsqlConnection(connstring);
NpgsqlCommand command = new NplgsqlCommand();
command.Connection = conn;
command.CommandText = "DELETE from auction.bidder WHERE bidderid = 100";
After the NpgsqlCommand object is created, there are several methods that can be
used to handle the command, listed and described in Table 13-4.
As you can see from Table 13-4, there are three different methods used to execute
commands stored in an NpgsqlCommand object. The proper method to use depends on
the expected results from the stored SQL command:
Method Description
Cancel Attempts to cancel the execution of a command
Clone Creates a new command based on this one
ExecuteNonQuery Executes a SQL command that does not return a value
ExecuteReader Executes a SQL command that returns a result set
ExecuteScalar Executes a SQL command that returns a single value
Prepare Sends the command to the PostgreSQL server before
execution so it can be prepared by the server
If anything goes wrong with the SQL command on the PostgreSQL server, an Npg-
sqlException exception is thrown by the method, which can be caught in a try-catch
section.
The update.vb program demonstrates using the ExecuteNonQuery() method to
add a record to the Bidder table:
Imports System
Imports System.Data
Imports Npgsql
Try
conn.Open()
result = comm.ExecuteNonQuery()
Console.WriteLine("Inserted {0} records", result)
Catch e as NpgsqlException
Console.WriteLine("Problem: {0}", e.Message)
Finally
conn.Close()
End Try
End Sub
End Class
The update.vb program declares the necessary variables using the Dim statement,
then creates a connection to the local PostgreSQL server using the earl Login Role from
Chapter 12. The CommandText string is built on three separate lines by concatenating a
string value. You can also place the command string directly in the CommandText prop-
erty of the NpgsqlCommand constructor.
The SQL command is not executed until the ExecuteNonQuery() method is exe-
cuted in the program. This is done in a try-catch block to catch any possible errors with-
out blowing up the program. The ExecuteNonQuery() method returns the number of
records affected by the command. This value is displayed on the screen to verify that the
INSERT command worked.
Compiling and running the example should look like this:
C:\>update
Inserted 1 records
C:\>
The compile finished successfully, and the program indicated that the record was
added with no problems. You can use pgAdmin III or psql to view the Bidder table to
see if the record was indeed added. You can also try to run the update program again
and see what happens:
C:\>update
Problem: ERROR: 23505: duplicate key violates unique constraint "bidder_pkey"
C:\>
This time an NpgsqlException exception was thrown. The exception indicates that
the new record could not be added because it would violate the primary key constraint,
which is what you would expect when trying to insert the same record again.
304 PostgreSQL 8 for Windows
comm.Connection = conn;
comm.CommandText = "Select timeofday();";
try
{
conn.Open();
String servertime = (String)comm.ExecuteScalar();
Console.WriteLine("The time on the server is: {0}", servertime);
C:\>getvals
The time on the server is: Mon Sep 18 19:20:01.343000 2006 EDT
The value of pi on the server is: 3.14159265358979
C:\>
The NpgsqlDataReader class contains several properties that can be used to ana-
lyze the result set returned. These are listed and described in Table 13-5.
Besides these properties, there are also several class methods available to use in the
NpgsqlDataReader class, listed and described in Table 13-6.
The columns in each record in the result set are referenced by their index value. You
can extract the record column values into variables using the Getxxx methods. You must
know the column data type in the table, and extract the column data into the proper vari-
able data type. Any mismatches are marked as errors when the program runs.
306 PostgreSQL 8 for Windows
Property Description
FieldCount Indicates the number of columns in the record returned
HasRows Indicates if the NpgsqlDataReader object has rows to
be read
IsClosed Indicates if the NpgsqlDataReader object is closed
Item Gets the value of a column using the PostgreSQL
data type
RecordsAffected Indicates the number of rows changed, inserted, or
deleted by the SQL command
Method Description
Close() Closes the NpgsqlDataReader object
GetBoolean(int) Gets the value of column int as a Boolean data type
GetDataTypeName(int) Gets the data type name of column int
GetDateTime(int) Gets the value of column int as a DateTime
data type
GetDecimal(int) Gets the value of column int as a Decimal data type
GetDouble(int) Gets the value of column int as a Double data type
GetFloat(int) Gets the value of column int as a Float data type
GetInt16(int) Gets the value of column int as an int16 data type
GetInt32(int) Gets the value of column int as an int32 data type
GetName(int) Gets the column name of column int
GetString(int) Gets the value of column int as a String data type
NextResult() Gets the next result set when multiple result sets
are returned
Read() Reads the next record in the result set
try
{
conn.Open();
comm = new NpgsqlCommand("select * from auction.itemswon", conn);
data = comm.ExecuteReader();
if (data.HasRows)
{
while(data.Read())
{
String itemkey = data.GetString(0);
String lastname = data.GetString(1);
String firstname = data.GetString(2);
float finalbid = data.GetFloat(3);
Console.WriteLine("{0} - {1},{2}\t${3}", itemkey, lastname,
firstname, finalbid);
}
} else
Console.WriteLine("There were no rows in the result");
} catch (NpgsqlException e)
{
Console.WriteLine("Problem: {0}", e.Message);
} finally
{
conn.Close();
}
}
}
308 PostgreSQL 8 for Windows
In the getitemswon.cs program, the SQL command performs a SELECT query on the
itemswon view created in Chapter 12. When executing the ExecuteReader() method
for the NpgsqlCommand class, the results are assigned to an NpgsqlDataReader object
(data). The HasRows property is used to determine if any records were returned in the
result set. If so, the Read() method is used to step through each record in the result set.
For each record, the individual column data values are extracted using the appropriate
data type variables and column index value. When the view was defined in Chapter 12,
the first column was a varchar data type representing the item key, the second column
was a varchar data type representing the bidder’s last name, and the third column was a
varchar data type representing the bidder’s first name. The data from each of these col-
umns can be stored in a String variable. The fourth column contains the final bid value,
which is a Double data type, and stored as a Double variable.
The values are displayed for each record returned in the result set:
C:\>getitemswon
SC001 - Blum,Barbara $450
SC002 - Blum,Rich $240
TR001 - Blum,Barbara $575
C:\>
The Prepare() method allows the PostgreSQL server to determine the optimal way
to execute the query before the ExecuteReader() method is executed.
Chapter 13: Microsoft .NET Framework 309
The :bidder identifier is a parameter placeholder that can be replaced with a value
when the SQL command is executed. The next step is to add the parameter to the Npg-
sqlParameterCollection object defined in the NpgsqlCommand object:
comm.Parameters[0].Value = 100;
While(True)
Console.Write("Enter bidder: ")
bidder = Console.ReadLine()
bidderval = Convert.ToInt32(bidder)
If (bidderval = 0)
Exit While
End If
Try
comm.Parameters(0).Value = bidderval
data = comm.ExecuteReader()
If (data.HasRows)
While(data.Read())
bidderkey = data.GetInt32(0)
lastname = data.GetString(1)
firstname = data.GetString(2)
Console.WriteLine("{0} - {1},{2}", bidderkey,lastname,firstname)
End While
Else
Console.WriteLine(" No bidder found")
End If
Catch e2 as NpgsqlException
Console.WriteLine("Problem: {0}", e2.Message)
Exit While
End Try
End While
conn.Close()
End Sub
End Class
Chapter 13: Microsoft .NET Framework 311
The first part of the paramtest.vb program should look familiar. A connection
is established with the PostgreSQL server, and an NpgsqlCommand object is created,
using a parameter placeholder in the query. The parameter is defined using the Add()
method, and set as an Int32 data type.
Next, a WHILE loop is started. In the WHILE loop a value is retrieved from the user
input, converted to an int32 data type, then assigned to the parameter Value property.
After the value is assigned, the ExecuteReader() method is executed and the result set
is stored in the NpgsqlDataReader object. The values are extracted using the Getxxx
methods, and the record is displayed. The output of the program should look something
like this:
C:\>paramtest
Enter bidder: 110
No bidder found
Enter bidder: 100
100 - Blum,Rich
Enter bidder: 103
103 - Blum,Jessica
Enter bidder: 150
150 - Test,Ima
Enter bidder: 0
C:\>
SUMMARY
The Microsoft .NET technology has become a popular programming platform for both
Microsoft Windows and web development. The Npgsql library allows .NET program-
mers to create applications that can access a PostgreSQL server. The Npgsql library can
be downloaded and installed in a standard .NET development environment using either
the Visual Studio suite or the .NET Framework SDK package.
The Npgsql library provides several classes for working with a PostgreSQL server.
The NpgsqlConnection class allows you to establish a connection to a local or remote
PostgreSQL server, using a Login Role and a database name. The NpgsqlCommand class
allows you to specify a SQL command to send to the server. There are several methods
contained in the NpgsqlCommand class for processing the returned result set.
The next chapter discusses how to interface your non-.NET C++ applications with a
PostgreSQL server. The libqxx library is used to connect a Win32 C++ application to the
PostgreSQL server, send SQL commands, and process the result sets.
This page intentionally left blank
14
Visual C++
313
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
314 PostgreSQL 8 for Windows
C
hapter 13 demonstrated how to use the Npgsql library to incorporate a
PostgreSQL database into your Microsoft .NET applications. However, not all
applications for the Microsoft world are written using .NET technology. Many
programmers still prefer to use the older Win32 programming platform. The workhorse
of the Win32 programming platform is the Microsoft Visual C++ programming language,
commonly used by professional application programmers. Fortunately, there is a library
package that can be used to support PostgreSQL databases in Visual C++ applications.
This chapter describes the libpq library package. This library package enables Visual
C++ programmers to utilize C function calls to access PostgreSQL databases in their
applications, without having to upgrade to the .NET technology.
The downside to the Visual Studio platform is cost. For hobbyists and programming
students looking for low-cost ways of programming, there is an alternative choice. Mi-
crosoft has recently released the Visual Studio Express Editions of each of its compiler
lines. Instead of purchasing the expensive Visual Studio package that includes all of the
compilers in one package, Microsoft has produced separate free versions of the indi-
vidual compiler packages. The Visual C++ Express Edition package is a free download
that can be used to create C++ applications.
Of course, there are some limitations to the Express Edition packages. They do not
incorporate all of the libraries and advanced features that are included in the Visual Stu-
dio package, but they do contain much of the same functionality.
To augment the Visual C++ Express Edition package, Microsoft also provides the
Windows Platform Software Development Kit (SDK). This SDK contains C++ libraries
for many commonly used Windows Win32 functions, such as creating Window Forms
and using network connectivity. The Platform SDK can also be downloaded for free and
incorporated into the Visual C++ Express Edition installation. This SDK and some cre-
ative configuring allow you to create regular Win32 Windows applications using the
Visual C++ Express Edition software.
If you click the Download link on the web page, you are directed to the interactive
network installation. If you are installing Visual C++ Express Edition on a system with
a high-speed Internet connection and want to install the package in real time, select the
316 PostgreSQL 8 for Windows
Interactive network installation option. This option downloads the installation compo-
nents as needed across the Internet during the installation. When the installation is fin-
ished, the installation files are removed from your system.
If you prefer to burn a CD that can be used any time for installation, click the Manual
installation instructions link under the Download link. This link takes you to a web page
where you can manually create an installation CD from either an .img or .iso CD im-
age file downloaded from the web site. You must still manually download the image file
to your system, so a high-speed Internet connection is required for these options as well.
However, once you download the appropriate image file for your CD-burning software,
you can use the created CD to install as many copies of Visual C++ Express Edition as
you want.
After creating the installation CD, you are ready to install Visual C++ Express Edi-
tion. The installation CD automatically starts to the installation program, and you see the
Welcome to Setup screen, shown in Figure 14-1.
To continue the installation, click Next. The next window displays the license agree-
ment. If you agree to its terms, accept the agreement and click Next. The following win-
dow is the Installations Options window, shown in Figure 14-2.
Figure 14-1. The Visual C++ Express Edition installation Welcome to Setup screen
Chapter 14: Visual C++ 317
The installation package also allows you to install the Microsoft MSDN Express Edi-
tion library, which includes documentation for the Express Edition products. If you de-
sire, you can also elect to install Microsoft SQL Server 2005 Express Edition. This is the
free version of the popular Microsoft SQL Server 2005 database package (as described in
Chapter 1). If you plan to use only PostgreSQL, you do not need to install the Microsoft
SQL Server package.
After selecting the packages to install, click Next. Continue clicking Next until the
Install button appears. Clicking this button starts the software installation. After a long
process of installing files, the installation program completes.
After completing the installation, start the Visual C++ Express Edition graphical IDE
by choosing Start | All Programs | Visual C++ 2005 Express Edition | Microsoft Visual
C++ 2005 Express Edition. The main Visual C++ window appears, shown in Figure 14-3.
From this window you can start a new project, or select an existing project from the
list in the window. The Visual C++ Express Edition installation also provides a command
prompt compiler environment. You can use the command prompt compiler to quick-
ly compile and run applications. There is a special link created that starts a command
318 PostgreSQL 8 for Windows
prompt with the proper environment variables set for compiling applications. Choose
Start | All Programs | Visual C++ Express Edition | Visual Studio Tools | Visual Studio
2005 Command Prompt. In the command prompt window, you can type cl (the compiler
program) to see if it is available:
C:\Program Files\Microsoft Visual Studio 8\VC>cl
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
The cl command started the compiler, which complained that there were not any
files to compile. You are now ready for the next step in the installation process.
Chapter 14: Visual C++ 319
Next, select the Library Files option in the Show Directories For drop-down list and
click the New Folder icon. Enter the location for the Platform SDK library files:
After entering the new library location, click OK to return to the main window. You
are now ready for the next step in the process.
You can edit this file using the standard Notepad application. Look for the following
line in the file:
AdditionalDependencies="kernel32.lib"
The actual line in the file should be a single line, with no carriage returns. After modi-
fying the line, save the file under the same filename. Just one more step to go!
Again, you can use the standard Notepad editor to edit this file.
The lines to change are line numbers 441 through 444. If you are using the Notepad
application to edit this file, choose Edit | Go To and then enter the line number (441).
These lines must be commented out by placing two forward slashes (//) at the front of
the line like so:
// WIN_APP.disabled = true;
// WIN_APP_LABEL.disabled = true;
// DLL_APP.disabled = true;
// DLL_APP_LABEL.disabled = true;
This comments out the configuration files. After commenting out the lines, save the
file using the same filename.
Now you should be able to create Win32 Windows applications using your Visual
C++ Express Edition installation. To start a new project, choose File | New | Project.
Under the Visual C++ section, there are three options, CLR, Win32, and General. In the
Win32 section, you will still not see an option for creating Win32 Windows applications.
However, if you select to create a new Win32 Console application, the Win32 Application
Wizard now gives you the option to create a Win32 Windows application (after assigning
a project name), as shown in Figure 14-5.
Your Visual C++ development environment is almost complete. All that is left is to
configure the libpq library package. This is demonstrated in the next section.
322 PostgreSQL 8 for Windows
The libpq.dll file is precompiled for Microsoft Visual C++ applications. It contains
the libpq functions that are used by Visual C++ applications to access the PostgreSQL
server. This file is installed in the PostgreSQL Windows installation.
Chapter 14: Visual C++ 323
The libpq include files are used to define the libpq functions for C and C++ pro-
grams. They are referenced in a Visual C++ program by using the #include directive to
reference the lead header file, libpq-fe.h:
#include "libpq-fe.h"
This header file references all other libpq header files necessary to compile the pro-
grams. Likewise, the libpq library file provides the function code used to compile C and
C++ programs that utilize the libpq functions. The library file must also be included
when compiling the programs that use these functions.
The libpq include and library files must be installed as part of the PostgreSQL pack-
age installation (see Chapter 2). During the Installation Options section, the Development
options allow you to install the include files and library files. You can check if you have
already installed them by looking in the c:\program files\postgresql\8.2\lib
folder. There should be an ms folder containing the library files.
If you did not install these options when you originally installed your PostgreSQL
software, you can rerun the installation program and modify the installation to include
them. Just click the X in the Development box and select the option to enable the instal-
lation. This is shown in Figure 14-6.
After your Visual C++ environment is ready and you have libpq installed on your sys-
tem, you can start creating C++ programs that access your PostgreSQL database server.
There are a few things you need to remember when creating libpq Visual C++ programs.
First, if you are developing new applications using the Visual C++ IDE, you need
to include the PostgreSQL include and library directories in the IDE search path. This
is the same procedure as performed earlier when configuring the Platform SDK fea-
tures. Choose Tools | Options, expand the Projects and Solutions section, and then select
VC++ Directories. In the Show Directories For drop-down list box, select Include Files
and then add the PostgreSQL include directory:
c:\program files\postgresql\8.2\include
Next, select Library Files and then add the PostgreSQL library directory:
c:\program files\postgresql\8.2\lib\ms
When you build a new Visual C++ project in the IDE, you must include the libpq
.lib file in the files referenced in the project. To do that, right-click the project name in
the left window frame, choose Add | Existing Item, and then find the libpq.lib file
in the c:\program files\postgresql\8.2\lib\ms directory. The libpq.lib file
appears as a resource file in the project file listing.
If you are adventurous and build Visual C++ applications from the command prompt,
you must use the cl.exe command prompt compiler program. To do this, you must
have the proper environment variables set for compiling. This is already done for you if
you open a command prompt window by using choosing Start | All Programs | Visual
C++ 2005 Express Edition | Visual Studio Tools | Visual Studio 2005 Command Prompt.
At the command-prompt line, you must use the /I parameter to manually specify
the location of the include files, and include the libpq.lib file on the command line.
The compiler must be able to locate the libpq.lib file. The easiest way to do this is to
copy the libpq.lib file from the c:\program files\postgresql\8.2\lib\ms
directory to the directory where you maintain your source code. A complete compile
command should look like this:
C:\>cl /I"c:\program files\postgresql\8.2\include" test.c libpq.lib
Microsoft (R) 32-bit C/C++ Optimizing Compiler Version 14.00.50727.42 for 80x86
Copyright (C) Microsoft Corporation. All rights reserved.
test.c
Microsoft (R) Incremental Linker Version 8.00.50727.42
Copyright (C) Microsoft Corporation. All rights reserved.
/out:test.exe
test.obj
libpq.lib
C:\>
By default, the compiler creates a Win32 application named test.exe. To run the
new application, there is one more thing you must do. As discussed in Chapter 3, to run
a PostgreSQL application, the application must have access to the PostgreSQL DLL files
in the bin directory. This can be accomplished either by moving your new application
program to the PostgreSQL bin directory or by adding the PostgreSQL bin directory to
your PATH environment variable.
Now your PostgreSQL Visual C++ development environment is complete. It is time
to start looking at how to produce some programs. The following sections describe the
libpq functions and demonstrate how to use them in Visual C++ applications.
Chapter 14: Visual C++ 325
The PQconnectdb() function returns a pointer to a PGconn data type. This value is
used in all subsequent functions that send commands to the server using this connection.
The conninfo constant character pointer defines the values used to connect to the
PostgreSQL server. These values are not much different from what you are already used to
for connecting to a PostgreSQL server. As expected, there are several standard parameters
that can be used to define the connection. These are listed and described in Table 14-2.
Function Description
PQconnectdb(const char Start a connection to a PostgreSQL server using
*conninfo) parameters in conninfo and wait for an answer.
PQconnectStart(const char Start a connection to a PostgreSQL server in
*conninfo) nonblocking mode using parameters in conninfo.
PQconnectPoll(PGconn Check the status of a pending nonblocking
*conn) connection attempt conn.
PQfinish(PGconn *conn) Close (end) an established PostgreSQL server
session conn.
PQreset(PGconn *conn) Reset a PostgreSQL server session by closing a
previous session conn and starting a new session
using the same parameters. This command waits
for an answer from the server.
When listing multiple parameters in the character string, each parameter pair must
be separated by one or more spaces. Following is an example of creating a new connec-
tion:
const char *conninfo;
Pconn *conn;
conninfo = "host = 127.0.0.1 dbname = test user = earl password = auction";
conn = PQconnectdb(conninfo);
This example connects to the PostgreSQL server running on the local system using
the loopback address (127.0.0.1). It attempts to establish the session with the database
test using the earl Login Role.
The PQconnectdb() function is a blocking function. A blocking function stops
(blocks) execution of the program until the function completes. If the PostgreSQL server
specified in the connection string is unavailable, you will have to wait for connect_
timeout seconds before control will return to your program. For Windows applica-
tions that are event-driven, sometimes this can cause problems. While the program is
Chapter 14: Visual C++ 327
If the connection has not succeeded or failed, you must poll the connection again.
This should continue until either the connection is established or has ultimately failed.
After you have established a connection to a PostgreSQL server, there are several
functions you can use that allow you to check on the status of the connection, as well
as the connection parameters used to establish the connection. These are listed and de-
scribed in Table 14-3.
All of the status functions use the PGconn data value returned by the PQconnect-
db() function to identify the connection. Most of the status functions return a pointer to
a string that contains the status information. An example of this is
char *user;
user = PQuser(conn);
The user character pointer points to the Login Role value used to establish the con-
nection. The exception is the PQstatus() function. This function returns a ConnSta-
tusType data type, which has two defined values:
Function Description
PQdb(PGconn *conn) Returns the database name of the
connection
PQuser(PGconn *conn) Returns the Login Role used for the
connection
PQpass(PGconn *conn) Returns the password of the Login Role
used for the connection
PQhost(PGconn *conn) Returns the server hostname of the
connection
PQport(PGconn *conn) Returns the TCP port of the connection
PQoptions(PGconn *conn) Returns any command-line options used
to establish the connection
PQstatus(PGconn *conn) Returns the status of the server
connection
PQtransactionStatus(PGconn *conn) Returns the transaction status of the
server
PQparameterStatus(PGconn *conn, Returns the current setting of a parameter
const char *param) param on the server
PQprotocolVersion(PGconn *conn) Returns the PostgreSQL backend protocol
version
PQserverVersion(PGconn *conn) Returns the server version as an integer
value
PQerrorMessage(PGconn *conn) Returns the most recently generated
server error message
PQsocket(PGconn *conn) Returns the server file descriptor of the
socket connection
PQbackendPID(PGconn *conn) Returns the server process ID (PID) of the
PostgreSQL process
PQgetssl(PGconn *conn) Returns NULL if SSL is not used, or an SSL
structure if it is used for the connection
conn = PQconnectdb(conninfo);
if (PQstatus(conn) != CONNECTION_OK)
{
printf("Unable to establish connection: %s",
PQerrorMessage(conn));
return 1;
} else
{
printf("Connection established!\n");
serverversion = PQparameterStatus(conn, paramtext);
printf("Server Version: %s\n", serverversion);
}
PQfinish(conn);
return 0;
}
This small example demonstrates all the basics of connecting to and interacting with
a PostgreSQL server in Visual C++. It connects to the test database on the localhost
using the earl Login Role created in Chapter 12. If you have not created these database
objects, you will have to use other values for these parameters on your system.
After attempting the connection, the PQstatus() function is used to test the status
of the connection. If the connection is OK, the PQparameterStatus() function is used
to obtain the value of the server_version parameter from the PostgreSQL server.
If you are compiling the program on the command prompt, the command should
look like:
Remember the libpq.lib file must be available in the same directory as the ver-
sion.c file. After compiling the application, the program version.exe is generated
330 PostgreSQL 8 for Windows
(assuming you do not have any typographical errors in your code). However, if you try
to run this program, you may have a problem. You may get a Windows error message
that a component necessary to run the application was not found.
In order for libpq programs to run, they must have access to the PostgreSQL DLL
files, normally located in the bin directory of the PostgreSQL installation. To solve this
problem, you must either include the PostgreSQL bin directory in your PATH environ-
ment variable or copy your applications to the PostgreSQL bin directory along with the
other PostgreSQL applications.
After solving the DLL problem, the application should run with no trouble:
C:\Program Files\PostgreSQL\8.2\bin>version
Connection established!
Server Version: 8.2.0
C:\Program Files\PostgreSQL\8.2\bin>
The program worked as expected. The next section moves on to show more advanced
functions that can be executed on the server.
The PQexec() function requires two parameters. The first parameter is the PGconn
object created when connecting to the server. The second parameter is a character string
that contains the SQL command to execute on the server.
When the PQexec() function is executed in the program, the command is sent to
the server and the program waits for a response. The response is placed in an PGresult
data object. Since there are many types of output associated with different SQL com-
mands, this data object must be capable of handling lots of possibilities.
This data object must first be checked using the PQresultStatus() function to de-
termine the status of the command and the type of output available from the command.
The result status can be
Function Description
PQexec(PGconn *conn, const char Submits a string command to the server
*command) and wait for the result.
PQexecParams(PGconn *conn, const Submits a command that includes
char *command, int nParams, const Oid parameters to the server and wait for
*paramTypes, const char *paramValues, the result. The parameters are defined,
const int *paramLengths, const int along with lengths and formats, in the
*paramFormats, int resultFormat) other parameters. This command also
specifies the result format (0 for text or
1 for binary).
PQprepare(PGconn *conn, const char Submits a command query to be prepared
*name, const char *query, int nParams, on the server using connection conn. The
const Oid *paramTypes) command may use parameters, defined
in the PQexecPrepared() function. The
prepared statement is referenced by name.
PQexecPrepared(Pgconn *conn, Submits a request to execute a previously
const char *name, int nParms, prepared statement name. You may
const char *paramValues, const specify parameters and the result format
int paramLengths, const int (0 for text or 1 for binary).
paramFormats, int resultFormat)
PQresultStatus(Pgresult *result) Returns the result status of an executed
command.
PQresStatus(ExecStatuwsType status) Returns a string result status given a
PQresultStatus value.
PQresultErrorMessage(PGresult Returns an error message string from an
*result) executed command, or a NULL value if no
error occurred.
PQclear(PGresult *result) Clears (frees) the storage memory of
result status result.
These values are defined in the libpq library and can be directly checked in your
code:
PGresult *result;
result = PQexec(conn, "SELECT lastname from auction.bidder");
if (PQresultStatus(result) != PGRES_TUPLES_OK)
{
...
If the result returned by the command is a result set, there are several “helper” func-
tions that can be used to determine what the returned data looks like, and how to handle
it. These functions are listed and described in Table 14-5.
These functions are invaluable in sorting through the result set data returned within
the PGresult object. The way your program handles the result set data depends on
what type of data (if any) is returned. The following sections demonstrate how to handle
different types of data returned from executing SQL commands in libpq.
Function Description
PQntuples(PGresult *res) Returns the number of records (tuples) in a
result set res.
PQnfields(PGresult *res) Returns the number of columns (fields) in a
result set res.
PQfname(PGresult *res, int column) Returns a column name given a result set res
and a column number column.
PQfnumber(PGresult *res, const Returns a column number given a result set
char *colname) res and a column name colname.
PQftable(PGresult *res, int column) Returns the OID of the table a given column
column was fetched from.
PQgetvalue(PGresult *res, int rec, Returns the value of column column of a
int column) single record rec in result set res.
PQgetisnull(PGresult *res, int rec, intTests if a value in column column of record rec
column) in result set res is NULL. Returns 1 if the value
is NULL, and 0 if it is not.
PQgetlength(PGresult *res, int rec, Returns the length (bytes) of a value in
int column) column column in record rec in a result set res.
PQclear() function is used to clear the memory used by the PGresult object. It is
always a good idea to clear this memory, especially if you reuse the PGresult object
for another command.
After compiling the update.c program, execute it on your PostgreSQL system. If
the INSERT command succeeds, nothing should display (not too exciting). You can then
look at the auction.bidders table and see the newly added record. If you run the
program a second time, you should get an error message, indicating that the Bidder table
key value already exists:
C:\Program Files\PostgreSQL\8.2\bin>update
Problem with command: ERROR: duplicate key violates unique
constraint "bidder_pkey"
C:\Program Files\PostgreSQL\8.2\bin>
In the function, you must specify the result set result, the desired record number
record within the result set, and also the column number column within the record to
retrieve a value from. Records and column numbers both start at 0. For commands that
produce only one record, that is always record 0.
The getvals.c program demonstrates how to retrieve and convert data from a
result set of a SQL function:
#include <stdio.h>
#include <stdlib.h>
#include "libpq-fe.h"
PQfinish(conn);
return 0;
}
After establishing the connection, the PQexec() function is used to send a standard
SELECT SQL command to the server. Notice that to check the status of the result set,
you must use the PGRES_TUPLES_OK value, since the query returns a result set. The
PQgetvalue() function is used to retrieve the result set, using record 0 and column
0 since there is only one data value returned. In the first instance, the timeofday()
PostgreSQL function returns a string value, which can be directly placed in a character
string variable.
After using the PQclear() function to reset the PGresult value, another query
is made with the PQexec() function. Again, the PQgetvalue() function is used to
retrieve the text result set from the query. This time, since the desired data type is a
floating-point value, the atof() C function is used to convert the string value to a float-
ing point variable.
336 PostgreSQL 8 for Windows
C:\Program Files\PostgreSQL\8.2\bin>getvals
Time of day: Sun Oct 01 16:37:29.092000 2006 EDT
The value of pi is: 3.141593
C:\Program Files\PostgreSQL\8.2\bin>
int recs;
int cols;
result = PQexec(conn, "SELECT * from auction.items");
recs = PQntuples(result);
cols = PQnfields(result);
Once you know the number of records and columns in a result set, it is a snap to loop
through the result set, extracting individual data items:
There is one thing you must be careful about, though, when extracting the column
data. Remember, the PQexec() function returns all data items in text format. This means
that if you intend to use the table data within your application program as another data
type, you must use standard C functions to convert the string to the appropriate data
type:
conn = PQconnectdb(conninfo);
if (PQstatus(conn) != CONNECTION_OK)
{
printf("Unable to establish connection: %s",
PQerrorMessage(conn));
return 1;
} else
{
result = PQexec(conn, "Select * from auction.itemswon");
if (PQresultStatus(result) != PGRES_TUPLES_OK)
{
printf("Problem with query: %s",
PQerrorMessage(conn));
return 1;
} else
{
printf("Item Won by Final Bid\n");
printf("----------------------------\n");
for(i = 0; i < PQntuples(result); i++)
{
itemkey = PQgetvalue(result, i, 0);
lastname = PQgetvalue(result, i, 1);
firstname = PQgetvalue(result, i, 2);
finalbid = PQgetvalue(result, i, 3);
floatbid = atof(finalbid);
printf("%s - %s,%s $%.2lf\n", itemkey, lastname,
firstname, floatbid);
}
PQclear(result);
}
}
PQfinish(conn);
return 0;
}
The getitemswon.c program starts off as usual, connecting to the sample database
and using the PQexec() function to send a simple query. The number of records in the
result set is obtained using the PQntuples() function, and a for loop is started to loop
through the result set records.
338 PostgreSQL 8 for Windows
Within the for loop, each iteration is a separate record in the result set. For each
record, the individual column data values are extracted using the PQgetvalue() func-
tion. The function references the result set, the record number (controlled by the for
loop value), and the individual column number for each data element.
Of course, this requires that you know what columns are produced by the query, and
in what order. If you do not know this information, you can use the PQfname() function
to find out which column number is which column name.
Each individual column data value is assigned to a character string variable, since
PQexec() returns only string values. Since the finalbid column value is a floating-
point data type, the atof() C function is used to convert the string representation to a
floating-point value:
C:\Program Files\PostgreSQL\8.2\bin>getitemswon
SC001 - Blum,Katie $120.00
TR001 - Blum,Rich $3400.00
RR003 - Blum,Rich $10.00
CC023 - Blum,Barbara $15.00
C:\Program Files\PostgreSQL\8.2\bin>
Using Parameters
In many cases, it is necessary to perform multiple queries using the same SQL command
but different data elements. Instead of having to rewrite each PQexec() command, you
can use the PQexecParams() function and use a feature called parameters. Parameters
allow you to use variables in the place of normal data values in the SQL command. Each
variable is preceded by a dollar sign ($) and numbered (1 for the first variable, 2 for the
second, and so on). An example of using parameters is
Variables can only be used for data values. You cannot use a variable for the table or
column names. The format of the complete PQexecParams() function is
As expected, you must declare the values used in the variables within the function.
After specifying the connection to use and the string command to send, you must define
the variable values. The first value is nparams, which specifies the number of variables
used. After that is the paramTypes value. This value is an array containing the OID of
Chapter 14: Visual C++ 339
each variable type. You can also assign this value a NULL value. This value forces the Post-
greSQL server to use the default data type for the column the data value is used for.
The next three values specify the variable values (paramValues), their lengths (param-
Lengths), and the format the variable is specified in (paramFormats). Each of these values
is an array. The first element in the array references variable $1, the second variable $2,
and so on.
The format value of each variable can be either text mode (0) or binary mode (1). If
the value is NULL, PostgreSQL assumes all of the variables are in text mode. If the format
of the variable is text mode, PostgreSQL assumes the variable is specified as a text (or
character) data type and converts the variable value to the data type required for the SQL
command specified.
Thus, you can specify an integer or floating-point value as a text string and set the
format value to 0, and PostgreSQL automatically does the conversion for you. If you use
text mode for variables, you can also set the length value to NULL, as PostgreSQL auto-
matically determines the length of the text string variable.
Alternatively, if you set the value format type to binary mode (1), you must specify
the variable values using the appropriate data type (such as float or int) and specify
the length as the byte length of the value.
The result format value (resultFormat) is also interesting. It allows you to set the data
type of the result value. It can be either text mode (0) or binary mode (1). In text mode,
the result set is returned as text values, just as with the PQexec() function. The ap-
propriate data type conversion must be performed for binary values. In binary mode,
the result set data is returned in the appropriate binary format, either integer or floating
point. At this time, the result format value can only be one or the other; you cannot mix
and match text and binary mode data elements.
Setting the PQexecParams() function can be confusing. For beginners, the easiest
thing to do is set both the input and output formats as text mode. This cuts down on the
amount of information you must collect and pass to the function.
The bidderresult.c program demonstrates using parameters with text mode for
both input and output data values:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include "libpq-fe.h"
conn = PQconnectdb(conninfo);
if (PQstatus(conn) != CONNECTION_OK)
{
printf("Unable to establish connection: %s",
PQerrorMessage(conn));
return 1;
} else
{
query = "SELECT * from auction.item where bidderkey = $1";
while(1)
{
totalbid = 0;
printf("\nEnter bidder: ");
scanf_s("%3s", &bidder, 4);
if (!strcmp(bidder,"0"))
return 0;
PQfinish(conn);
return 0;
}
The paramValues pointer is declared as an array. Since there is only one parameter
used in the SQL command, there needs to be only one value in the array. The string
Chapter 14: Visual C++ 341
used for the query is declared, using a parameter variable $1 for the data element in the
WHERE clause.
Next, an endless While loop is used to continually loop through values until a stop-
ping value is entered by the user. The C scanf_s function is used to allow the user to
enter a bidder value to look up in the database. The scanf_s function is buffer-overflow
safe, to prevent the user from trying any buffer-overflow exploits with the character buf-
fer bidder.
After the desired variable value is entered, it is assigned to the paramValues array,
and the PQexecParams() function is declared. Since the input value is in text mode,
the paramLengths and paramFormats values can be set to NULL. Also, the result-
Format value is set to 0, indicating that we want the output result set data in text mode
as well.
After the PQexecParams() function is executed, the data in the result set is extract-
ed within the for loop. Since the output data format is set to text, the finabid value
must be converted to floating point using the C atof() function.
After compiling the program, you can run it to see the winning bids for a specific
bidder. A value of 0 should be entered to stop the program:
C:\Program Files\PostgreSQL\8.2\bin>bidderresult
Enter bidder: 0
C:\Program Files\PostgreSQL\8.2\bin>
342 PostgreSQL 8 for Windows
SUMMARY
The libpq library provides functions for C and C++ programs to access a PostgreSQL
server. The most popular C++ development environment in the Windows world is Mi-
crosoft Visual C++. This development environment can be created using either the Visual
Studio software suite or the free Visual C++ Express Edition. To create Windows applica-
tions in Visual C++ Express Edition, you must also download and install the Microsoft
Platform SDK package.
The libpq package can be installed from the standard PostgreSQL installation program.
The libpq package includes the library files, C header files, and DLL run-time files neces-
sary for C and C++ applications. After installing these components, they must be config-
ured in the Visual Studio or Visual C++ Express Edition development environment.
The libpq library contains functions to connect to a PostgreSQL server, send SQL
commands, and retrieve the result set of the commands. Once the result set is retrieved,
you can extract individual column data elements.
The last chapter in this book addresses yet another popular Windows development
environment. The Java programming language has taken the world by storm. Many pro-
grammers program portable database applications using Java. Just as with .NET and
Win32, PostgreSQL also includes a development package for Java programmers. The
next chapter demonstrates how to use a PostgreSQL server from a Java development
environment.
15
Java
343
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
344 PostgreSQL 8 for Windows
T
he Java programming language has quickly become the favorite of programmers
who must create applications for multiple platforms. A Java application can run
on Windows, Unix, and Linux operating system platforms without having to
recompile new executables. Because of this feature, many programmers are learning Java
and converting their applications to the Java environment. PostgreSQL supports the Java
programming environment by providing a driver for the Java Database Connectivity
(JDBC) programming interface. This chapter describes the Java development environ-
ment, JDBC, and how to use the PostgreSQL JDBC driver to create Java applications that
can access data on a PostgreSQL server.
X J2SE Java version 2 Standard Edition, for developing workstation and server
applications
R J2EE Java version 2 Enterprise Edition, for developing server-side web appli-
cations
W J2ME Java version 2 Micro Edition, for developing mobile device applications
The Java Runtime Edition (JRE) package is not a development platform, but must be
loaded on a system for Java applications to execute. It provides the Java Virtual Machine
(JVM) core libraries required for Java applications (both desktop and web based) to ex-
ecute. There are different versions of the JRE for each operating system platform. Once
a Java application is compiled, it can run on any JVM on any operating system platform
without having to be recompiled for the operating system.
You cannot develop new applications with just the JRE, though. To develop Java
applications, you must have the J2SE, J2EE, or J2ME environment installed. All three of
these environments include the Java compiler for creating Java applications, along with
the standard Java library files for their specific programming environments. All of these
development environments support JDBC for connecting to database servers.
JDBC is the Java layer that allows Java applications to connect and interact with da-
tabases. JDBC is a standard interface for all Java applications. Database vendors must
supply a compatible JDBC library file for Java applications to interface with their indi-
vidual database products. Java programmers can write code that interfaces with JDBC.
Chapter 15: Java 345
PostgreSQL PostgreSQL
JDBC driver database
This code can then be used to access any database that provides a JDBC driver. This is
demonstrated in Figure 15-1.
After connecting to the database using JDBC, the Java application can access tables
and data using standard JDBC methods and properties. These methods and properties
remain consistent for all databases using the JDBC driver.
To develop Java applications for PostgreSQL, you must have a Java development
environment and the PostgreSQL JDBC driver. The following sections describe how to
download and install a Java development environment to start your Java programming.
Following that, the PostgreSQL JDBC driver is discussed.
Finally, if you are developing applications intended for mobile devices, such as cell
phones and personal digital assistants (PDAs), you need the J2ME SDK. This package
provides a development environment for creating mobile applications for multiple plat-
forms. The development environment includes an emulator that allows you to test and
debug mobile applications on your workstation.
Each of these development environments can be downloaded for free directly from
the Sun Java web site (java.sun.com). The Download link on the main page takes you
to the Downloads web page, which provides you with a wealth of choices for download-
ing the various developer packages.
The examples in this chapter use the command-line compiler available in the J2SE
SDK package. If you choose to download the J2SE SDK package, you have an additional
choice to make. By default, the J2SE SDK package provides only a simple command-
line Java compiler for you to use to compile your Java applications. You can still create
and compile GUI applications, but you must create the Java code manually by using the
command-line utilities. If you prefer to develop applications using fancy graphical inte-
grated development environment (IDE) packages, then the NetBeans program is for you.
The NetBeans application is a free IDE that is combined with the J2SE SDK package
to provide a single graphical development environment for creating, debugging, and
running Java applications. The graphical environment allows you to create Java GUI ap-
plications by dragging and dropping windows controls into a work area that resembles
the completed window form. This allows you to visually see what you are creating.
The NetBeans package can be downloaded as a single bundle with the J2SE SDK. At
the time of this writing, the current production version of this bundle includes the J2SE
SDK version 5.0 Update 9 package and the NetBeans 5.0 package in a single download
file from the Sun Java web site. There are different versions of the download package
for different operating system platforms. The Windows development version is in the
following file:
jdk-1_5_0_09-nb-5_0-win-ml.exe
This file is a self-extracting installation package. There is no need to extract the com-
ponent files into a temporary directory before starting the installation. After download-
ing the package, run it to start the installation.
Figure 15-2. The J2SE SDK and NetBeans installation welcome window
After installing the J2SE SDK, you should include the Java bin directory in your
Windows PATH environment variable. This allows you to compile Java applications from
any directory on the system. To access the PATH environment variable, right-click My
Computer, select Properties, and then click the Advanced tab. Clicking the Environment
Variables button takes you to the Environment Variables editor. The PATH variable is a
Figure 15-3. The J2SE SDK and NetBeans installation completion window
348 PostgreSQL 8 for Windows
system variable, and is shown in the System Variables list box. Just add the path for the
Java bin directory to the end of the existing value (after a semicolon):
C:\Program Files\Java\jdk1.5.0_09\bin
Adding this value to the existing PATH variable allows you to run the Java command-
line compiler program (javac) from any directory on the system. You can now create
Java applications on your system.
To start a new programming project, click File | New Project. The New Project wiz-
ard starts, shown in Figure 15-5.
The New Project wizard walks you through selecting the type of Java application
to build, the location of the project files, and the name of the main Java class in the
application.
After you enter the desired values, the graphical development environment starts.
The right-side window contains the code for the application, shown in Figure 15-6.
The New Project wizard creates a blank class template for your application. By de-
fault, the blank template creates a Java namespace using the project name, and a Main
class that is used to contain the application code. You are now ready to start building
your new Java application.
350 PostgreSQL 8 for Windows
The standard PostgreSQL server installation provides an option to install the JDBC
driver files in the Database Drivers section of the Install Options window, shown in
Figure 15-7.
When choosing to install the PostgreSQL JDBC drivers from the server installation,
they are placed in the jdbc directory under the standard PostgreSQL install directory
(see Chapter 3).
Chapter 15: Java 351
The other way to install the PostgreSQL JDBC drivers is to download them sepa-
rately. The jdbc.postgresql.org web site provides the latest PostgreSQL JDBC drivers for
download. Click the Download link on the left side of the main web page to go to the
Download web page. You can download either precompiled binary versions of the driv-
er or the complete Java source code to compile your own.
Since Java applications run on any operating system, there are no separate down-
loads for separate operating systems. You may notice, however, that there are still three
different JDBC drivers to download, jdbc2, jdbc2ee, and jdbc3.
The different drivers relate to different versions of the Java environment. You must
download the driver associated with the Java JVM version you are using. Different Java
versions use different JDBC versions. Table 15-1 lists the JDBC driver versions and the
Java versions they are used for.
The way Java handles versions can be somewhat confusing. Java versions 1.0 and
1.1 are referred to as the Java 1 platform, which is not supported by Sun anymore. Post-
greSQL does not supply a JDBC driver for these versions. Java version 1.2 introduced
many new concepts and is popularly referred to as the Java 2 platform. Version 1.3 is
known as Java 2 version 3.0, version 1.4 as Java 2 version 4.0, and version 1.5 as Java 2
version 5.0. If you have downloaded and installed the latest J2SE 5.0 version, that is
really Java version 1.5, which requires the jdbc3 driver. If this is confusing, you can al-
ways tell the installed Java JVM version on a system by running the java command-line
command with the -version option:
C:\>java -version
java version "1.5.0_09"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_09-b01)
Java HotSpot(TM) Client VM (build 1.5.0_09-b01, mixed mode, sharing)
C:\>
This system shows that it is using version 1.5.0_09, which is the J2SE 5.0. The _09
part shows that it is update (or patch) 9.
The PostgreSQL JDBC driver file, whether downloaded from the PostgreSQL JDBC
web site or installed from the PostgreSQL server installation, uses a specific filename
format to delineate which version it is. The format of the PostgreSQL JDBC filename is
postgresql-version.format.jar
where version is the driver version, and format is the JDBC type. At the time of this writ-
ing, the latest PostgreSQL JDBC driver version is 8.1-407. Thus, the JDBC driver to use for
the J2SE 5.0 SDK is postgresql-8.1-407.jdbc3.jar. Note that the version avail-
able on the PostgreSQL JDBC web site is often newer than the JDBC drivers supplied in
the PostgreSQL server installation package. You can still use the version supplied with
your server installation package, but there may be bugs that have been fixed in the later
releases.
The PostgreSQL JDBC drivers are created as a Java .jar file. A .jar file is a collec-
tion of Java classes that are grouped together into a single Java namespace and can be
referenced by Java applications. Because the classes in the .jar file are called directly
by Java applications, the .jar file must be present when the application executes, and
the application must know where to find it. Because of this, you must distribute the ap-
propriate PostgreSQL JDBC .jar file with your PostgreSQL Java application (or require
your customers to download the PostgreSQL JDBC driver themselves). The following
sections describe how to include the JDBC driver file in your application.
created .jar file information, showing you the Java command to use to execute the
application.
The NetBeans IDE also allows you to include library files within the project. The
library files can include any .jar files required for the application to run. When your
project is opened in the IDE, the Projects tab in the left window lists the files associated
with the project. The Libraries section should include any .jar files that are required
by the application. By default, the Libraries section includes just the standard JDK 1.5
libraries.
To include the PostgreSQL JDBC driver file, right-click the Libraries heading and se-
lect Add JAR/folder. From the window, select the appropriate PostgreSQL JDBC driver
file required for your Java environment, either from the PostgreSQL jdbc folder or from
the download from the jdbc.postgresql.org web site.
Adding the PostgreSQL JDBC driver file to the project library creates a copy of the
driver file in the project’s lib folder. If you distribute the application’s .jar files to
customers, the lib folder (and the PostgreSQL JDBC driver in it) is automatically in-
cluded in the application .jar file. You do not need to include a separate copy of the
PostgreSQL JDBC driver with your application.
C:\javatest>
The Java runtime engine uses the CLASSPATH environment variable to find the
JavaTest.class class file necessary to run the application. If the application requires
a .jar file (such as the PostgreSQL driver file), you must specifically add it to the
CLASSPATH variable, even if it is in a directory already referenced:
SET CLASSPATH=C:\javatest;C:\javatest\postgresql-8.1-405.jdbc3.jar
The downside to using this method is that if you have multiple Java applications on
your system, you would have to either place them all into the same directory or add each
application’s directory and libraries to the CLASSPATH environment variable.
354 PostgreSQL 8 for Windows
To solve this problem, you can use the -classpath parameter in the Java
command-line run-time engine:
C:\javatest> SET CLASSPATH=C:\nonsense
C:\javatest> java JavaTest
Exception in thread "main" java.lang.NoClassDefFoundError: JavaTest
C:\javatest>
Starting a Connection
Before you can send SQL commands to the PostgreSQL server, you must establish a con-
nection. In JDBC, this is a two-step process.
First, you must declare the JDBC driver your application will be using. This is done
using the Class.forName() method. Each JDBC driver has a different driver name
that is used in this method. For PostgreSQL, the driver name is opt.postgresql
.Driver. You should also place this command within a try-catch block to catch any Java
exceptions that might be thrown if an error occurs:
try {
Class.forName("org.postgresql.Driver");
} catch (Exception e) {
System.out.println("problem with driver: " + e.getMessage());
System.exit(1);
}
After defining the JDBC driver used, you can establish the connection. A JDBC data-
base connection is handled by a Connection object. The Connection object is then
referenced by other JDBC classes to send SQL commands to the server.
Chapter 15: Java 355
The getConnection() method takes three string parameters. The first parameter,
url, specifies the protocol to use, the location of the database server, and what database
to connect to. The format of the parameter is
jdbc:protocol://host/database
For a PostgreSQL server, protocol is set to postgresql. The host value is either the
hostname or numeric IP address of the PostgreSQL server (it can be localhost if the
application is running on the same system as the server). The database value is the name
of the PostgreSQL database you are connecting to.
A complete example of connecting to a PostgreSQL database is shown in the Con-
nect.java program:
import java.sql.*;
The Connect.java program uses the Java import statement to include the java
.sql namespace, which includes all of the necessary JDBC classes to interact with the
database. The program then defines the strings necessary to connect to the sample test
database on the local host, using the sample earl Login Role and password.
356 PostgreSQL 8 for Windows
C:\test>
You do not need to include the JDBC driver file when compiling the application, only
when running it. If there are no errors, the compiler creates the Java class file Connect
.class. To run a Java application, you must use the JVM run-time engine and use the
-classpath parameter to define the location of both the Connect.class file and the
PostgreSQL JDBC driver .jar file.
To run the example in NetBeans, create a console application and place the code in
the main() section into the main() section of the project class template (do not copy
the main() or class lines into the template; those are already created for you by Net-
Beans). You must also include the import statement in the class template before the
Main class constructor. This is shown in Figure 15-8.
Remember to include the PostgreSQL JDBC driver in the Libraries section of the proj-
ect. When you are finished, click the Build icon. The application should build with no
errors. If so, click the Run icon to start the application. The output should appear in the
output window, as shown in Figure 15-8.
Once the Statement class object is created, you can use it to send SQL commands to
the server. There are two methods that are used to do this:
X executeUpdate()
W executeQuery()
Chapter 15: Java 357
Processing Nonqueries
Performing SQL commands that do not return a result set (such as INSERT, DELETE,
and UPDATE) requires using the executeUpdate() method of the Statement class.
The executeUpdate() method has one parameter, the SQL command as a String
value, and returns an integer value indicating the number of records affected by the
command:
The Update.java program uses the standard JDBC methods to load the driver
and connect to the database. Next, a Statement object is created and the executeUp-
date() method is used with a normal String value that contains a SQL INSERT com-
mand. Compiling and running the program produces the following results:
C:\test> javac Update.java
C:\test> java -classpath C:\test;C:\test\postgresql-8.1-405.jdbc3.jar Update
connected to database
added 1 records
C:\test>
The first time the Update.java program is run, the record is added successfully.
You can use the standard pgAdmin III program to check the inserted value. If you try
Chapter 15: Java 359
to run the program again to perform the same INSERT command, a Java exception
will be thrown. The error message produced specifies the cause of the exception. As
expected, the new record was rejected because the primary key value already existed in
the table.
Processing Queries
Performing SQL commands that return either a single data value (such as functions) or
a multirecord result set (such as SELECT queries) requires that you use the execute-
Query() method of the Statement class. In both situations, the method returns data
in a ResultSet object:
The ResultSet object contains the result set data retrieved from the executed SQL
command. You can move back and forth between records in the result set using the
next() method to go forward and the previous() method to go backward in the
ResultSet object. You can also read a specific record in the result set using the abso-
lute() method.
One oddity of the ResultSet object is that when it returns with result set data, it
does not point to the first record in the result set, even if there is only one record in the
result set. Instead it points to “just before” the first record. When the result set returns,
you must perform a next() method to point to the first record in the result set. This is
demonstrated in the GetVals.java program:
import java.sql.*;
try {
Class.forName("org.postgresql.Driver");
sqlcon = DriverManager.getConnection(URL, user, passwd);
System.out.println("connected to database");
stmt = sqlcon.createStatement();
rset = stmt.executeQuery("SELECT timeofday()");
rset.next();
String day = rset.getString(1);
System.out.println("time: " + day);
360 PostgreSQL 8 for Windows
rset.close();
stmt.close();
sqlcon.close();
} catch (Exception e) {
System.out.println("Problem: " + e.getMessage());
}
}
}
C:\test>
the end of the result set, the next() method will return a FALSE value. The easiest way
to loop through result set data is to use a while() loop:
while(rset.next())
{
...
}
When the last record has been processed, the while() loop exits. This is demon-
strated in the GetBidders.java program:
import java.sql.*;
public class GetBidders {
private String URL = "jdbc:postgresql://127.0.0.1/test";
private String user = "earl";
private String passwd = "auction";
public GetBidders() {
String output, lastname, firstname;
int bidderid;
try {
Class.forName("org.postgresql.Driver");
sqlcon = DriverManager.getConnection(URL, user, passwd);
} catch (Exception e) {
System.out.println("Problem: " + e.getMessage());
System.exit(1);
}
try {
sqlstmt = sqlcon.createStatement();
sqlrset = sqlstmt.executeQuery("Select * from auction.bidder");
while (sqlrset.next())
{
bidderid = sqlrset.getInt("bidderkey");
lastname = sqlrset.getString("lastname");
firstname = sqlrset.getString("firstname");
} catch (Exception e) {
System.out.println("Problem: " + e.getMessage());
}
}
public GetBidders {
...
}
Since the program is in the class constructor, the main() method (which is automati-
cally run by the JVM by default) must create an instance of the class:
When the new instance is created, the program code in the constructor is executed.
The program uses the standard methods for loading the JDBC driver and connecting
to the PostgreSQL server. Next, the SELECT query is sent using the executeQuery()
method. Data in the ResultSet object is processed within the while() loop. For each
record, the bidderkey, lastname, and firstname columns are extracted, based on
their column names.
The three variables are each represented using question marks as placeholders. The actu-
al variables are defined using the setxxx() methods of the PreparedStatement class.
Each variable must be set using the proper data type format of the setxxx() method:
ps.setInt(1, biddervalue);
The first parameter is an integer representing the order in which the variable appears in
the PreparedStatement, starting at 1. The second parameter is the value that is assigned
to the variable. This value must be in the same data type as the setxxx() method.
The UpdateItem.java program demonstrates using a PreparedStatement object
with variables to update the Items table with the winning bidder and final bid values:
import java.io.*;
import java.sql.*;
try {
Class.forName("org.postgresql.Driver");
sqlcon = DriverManager.getConnection(URL, user, passwd);
System.out.println("connected to database");
ps = sqlcon.prepareStatement(command);
while (true)
{
System.out.print("Enter item key: ");
itemkey = in.readLine();
if (itemkey.equals("exit"))
break;
364 PostgreSQL 8 for Windows
result = ps.executeUpdate();
System.out.println("updated " + result + " record");
}
ps.close();
sqlcon.close();
} catch (Exception e) {
System.out.println("Problem: " + e.getMessage());
}
}
}
By now you should recognize most of the UpdateItem.java program. After the
usual connectivity stuff, a PreparedStatement object is created. The SQL command
contains three variables: one to identify the updated bidderkey column value, one to
identify the updated finalbid column value, and one to be used to search for the prop-
er itemkey value in the table so that the proper record is updated.
After the PreparedStatement object is created, a while() loop is used to con-
tinually ask the user to enter new information. If the user enters the word exit for the
itemkey value, the while() loop is exited and the program stops.
After retrieving the values input by the user, the program converts the values to the
appropriate data types for insertion into the Item table, assigns them to the appropriate
variables for the PreparedStatement object, then executes the PreparedStatement
object using the executeUpdate() method.
Similar to a normal Statement object, the PreparedStatement object contains
both the executeUpdate() and executeQuery() methods. Since this SQL command
is an UPDATE command, the executeUpdate() method is used.
After compiling the program, enter some data into the Item table using the pgAdmin
III data editor (see Chapter 4). Be sure to leave empty the winning bidder and final bid
column values. Run the UpdateItem program and enter a few values. After entering a
few valid values, try entering values for an invalid itemkey. To end the program, enter
exit as the next itemkey:
C:\test>java -classpath c:\test;c:\test\postgresql-8.1-405.jdbc3.jar UpdateItem
connected to database
Enter item key: SC002
Enter winning bidder: 100
Enter winning bid: 240.00
updated 1 record
Chapter 15: Java 365
C:\test>
As the program executes, enter the appropriate Item itemkey value, the winning
bidder bidderkey value, then the winning bid finalbid value. As seen in the ex-
ample, if the itemkey value is valid, the output indicates that the record was updated.
If the itemkey value is invalid, no records are updated. After entering data for a few
items, type exit to quit the program. You can use the pgAdmin III program to view the
data in the table to ensure that the proper data was updated.
SUMMARY
The JDBC library provides a standard set of classes for Java applications to access data-
bases. PostgreSQL provides the necessary JDBC driver for Java applications to access a
PostgreSQL server.
The PostgreSQL JDBC driver comes in three versions, one for older Java 2 Standard
Edition environments (versions 1.2 and 1.3), one for newer Java 2 Standard Edition envi-
ronments (versions 1.4 and 1.5), and one for the Java 2 Enterprise Edition environment.
All of these JDBC drivers can be downloaded from the PostgreSQL JDBC web site or
installed with the PostgreSQL server installation package.
Once the PostgreSQL JDBC driver is included with a Java application, you can use
standard JDBC database classes to connect to the PostgreSQL server, insert, update, or
delete data, or retrieve query information. Query information can be read as a single re-
sult set returned from the server. Individual columns can be accessed from each result set
record using either the column name or the column index value. SQL commands can also
be sent to the PostgreSQL server as prepared statements. The server will prepare an ex-
ecution plan for the statement before the statement is actually executed. This can greatly
increase the performance of applications. This feature also allows you to create SQL com-
mands with variables that can be replaced at run time as the program is executed.
At this point you should be comfortable installing, configuring, and programming
a PostgreSQL server in a Microsoft Windows environment. Many different topics have
been discussed in the last 15 chapters, but there is always more to learn. The Open Source
database world in general, and PostgreSQL in particular, is constantly evolving. New
features are being added with each new update and version. I strongly encourage you to
frequently visit the PostgreSQL web site (www.postgresql.org) and actively participate
in the many user forums available.
This page intentionally left blank
INDEX
367
Copyright © 2007 by The McGraw-Hill Companies. Click here for terms of use.
368 PostgreSQL 8 for Windows
X C
—command statement option, 124
commandText object, 309
CommandText property, 300, 301, 303, 308
C# programming language, 288, 298, 314 CommandTimeout property, 300
Cache Size option, 273 CommandType property, 300, 301
CALLED ON NULL INPUT parameter, 221 COMMENT keyword, 145
Cancel as FreeStmt (Exp) option, 273 Comment section, 91, 95
Cancel method, 301 commercial database management system
Cascade Group Roles, 106 (DBMS) products, vs. PostgreSQL, 12–13
CASCADE parameter, 149 COMMIT command, 186, 188, 189, 194
cbrt(x) function, 212 COMMIT keyword, 143
C code, 109 CommLog option, 273
\c dbname command, 128 Common Language Runtime (CLR),
\cd dir command, 128 288, 289, 293, 315
ceiling(x) function, 212 condition statements, PL/pgSQL language, 226
ceil(x) function, 212 configuration files, 53–75
Central Processing Unit (CPU) performance, 33 controlling access via, 239–241
certificates, 244–245 pg_ hba. conf file, 70–74
C function, 314, 335, 338 connection-type field, 71
Change Scope button, 238 database field, 71
char_length(string) function, 207 example pg_ hba. conf records, 74
Character values, 111, 201 login-method field, 72–73
char data type, 97 network-address field, 72
CHECKPOINT keyword, 145 options field, 73–74
Chkpass module, 45 user field, 71–72
chr(int) function, 208 pg_ident.conf file, 74–75
CIDR format, 72 postgresql. conf file, 54–70
City column, 96 Autovacuum Parameters
CLASSPATH environment, 353, 354 section, 66–67
clause section Client Connection Defaults
FROM SELECT command, 172–174 section, 67–68
FOR SELECT command, 177 Connections and Authentication
FROM SELECT command section, 56–58
functions format, 173 Customized Options section, 70
joins format, 173–174 Error Reporting and Logging
standard table names section, 63–66
format, 172–173 File Locations section, 55
sub-select format, 173 Lock Management section, 68–69
Client Connection Defaults section, Query Tuning section, 62–63
postgresql. conf file, 67–68 Resource Usage section, 58–61
Clone method, 301 Runtime Statistics section, 66
CLOSE command, 194, 273 Version/Platform Compatibility
CLOSE keyword, 143 section, 69–70
Close( ) method, 306 Write Ahead Log (WAL)
CLR (Common Language Runtime), section, 61–62
288, 289, 293, 315 connect_timeout function, 326
clusterdb command, 79 Connection class, 356
CLUSTER keyword, 145 ConnectionLifeTime value, 297
Columns object, 97 CONNECTION LIMIT ‘value’ option, 154
Columns tab, 97, 98, 101 Connection object, 354, 356
\! command, 128 Connection property, 300, 301
command-line format, psql program, 122–127 Connections and Authentication section,
connection options, 122–123 postgresql. conf file, 56–58
feature options, 123–126 ConnectionString parameter, 299, 300
using command-line options, 126–127 ConnectionString property, 297, 299
370 PostgreSQL 8 for Windows
X F
FTP Browser link, 267, 293
FTP Browser page, 293
Full Text Index module, 45
\f [string] command, 135 function code, PL/pgSQL language, 225–227
FATAL level, 53 assigning values to variables, 225
features, PostgreSQL, 13–24 condition statements, 226
ACID compliant, 14–18 LOOP statements, 226–227
atomicity, 14–15 FUNCTION function, 218
consistency, 15 functions, 195–214
durability, 17–18 aggregate functions, 211–214
isolation, 15–16 built-in functions, 205–206
Generalized Search Tree ( GiST), 24 date and time functions, 209–211
nested transactions, 18 math functions, 211
roles, 21–22 operators, 196–205
rules, 20 NULL operators, 203–205
sub-selects, 18 numerical operators, 197–198
support for Binary Large Objects string operators, 198–203
(BLOBs), 20–21 PL/pgSQL language, creating, 218–222
table partitioning, 22–24 string functions, 206–209
transaction support, 13–14 types of, 217
triggers, 20 Functions object, 87
user-defined types, 21 Fuzzy String Match module, 45
views, 18–19
Feature Selection window, 268
FETCH command, 190, 192–194
FETCH keyword, 145
FETCH NEXT command, 193 X G
FieldCount property, 306 \g [file] command, 131
—field-separator separator option, 124 Generalized Search Tree (GiST), 24
Fieldsep character parameter, 137 Geographic Information Systems (GIS), 38
File button, 104 GetBoolean(int) method, 306
File DSN tab, 270 GetDataTypeName(int) method, 306
—file filename option, 124 GetDateTime(int) method, 306
File Locations section, postgresql. conf file, 55 GetDecimal(int) method, 306
File window, 104 GetDouble(int) method, 306
filtering output data, 163 Get External Data feature, 284
Finish button, 268, 270, 285 GetFloat(int) method, 306
firewalls, 236–239 GetInt16(int) method, 306
FirstName column, 96, 160, 163, 202–205 GetInt32(int) method, 306
FIRST option, 191 GetName(int) method, 306
float4 data type, 97 GetString(int) method, 306
float8 data type, 97 GetVals.java program, 359, 360
floor(x) function, 212 GIS (Geographic Information Systems), 38
F option, 77, 126, 222 GiST (Generalized Search Tree), 24
FOR clause, 177 GiST method, 24
FOR loop, 225–227, 337, 338, 341 Global button, 272
Format parameter, 137 global directory, 51
formatting meta-commands, 135–136 GRANT command, 156–158
FOR SHARE clause, 177 GRANT keyword, 143
FOR UPDATE clause, 177 GROUP BY clause section, SELECT
FORWARD ALL option, 191 command, 174–175
FORWARD option, 191 GROUP BY expression, 170, 174
FROM clause, 164, 172, 197 Group Roles, 84, 88–90, 106–110, 114, 117,
FROM keyword, 162, 171 133–135, 154–158, 167, 178, 179, 225
FROM option, 161 GUI applications, 345, 346
FROM SQL clause, 69
Index 373
X H
development packages, 39
User Interfaces package, 38
Install button, 317
handling data, 158–162 installing
deleting data, 161–162 Microsoft Platform Software
inserting data, 158–159 Development Kit (SDK), 319–322
modifying data, 159–161 modifying Visual C++ library
hard drive performance, 30–33 dependencies, 320–321
HasRows property, 306, 308 modifying Visual C++ search
HAVING clause section, SELECT command, 175 directories, 319–320
\H command, 135 removing Visual C++ Express
—help option, 125 Edition build restrictions,
high-level applications, 288 321–322
host, PQconnectdb( ) Connection Parameter . NET packages, 290–292
function, 326 Open Database Connectivity (ODBC)
hostaddr function, 326 driver, 266–268
—host hostname option, 124 Visual C++ Express Edition, 315–318
\h statement command, 128 installing PostgreSQL on Windows, 25–48
HTML format, 137 downloading PostgreSQL, 33–35
HTML mode, 125 installation process, 35–45
—html option, 124 Enable Contrib Modules
window, 43–44
Enable Procedural Languages
X I
window, 42–43
finishing install, 44–45
\i file command, 132 Initialise Database Cluster window,
ILIKE operator, 203 40–42
IMMUTABLE parameter, 221 Installation Options window, 36–39
importing data, with psql, 139–140 Service Configuration
Imports statement, 295 window, 39–40
include directory, 51 running PostgreSQL, 46–48
Index method, 183–185 manual method, 47–48
INFO level, 53 service method, 46–47
informational meta-commands, 128, system requirements, 26–33
133–135, 136, 139 Windows servers, 29–33
INHERIT option, 154 Windows workstations, 26–28
initcap(string) function, 208 int2 data type, 97
Initialise Database Cluster window, 40–42 int4 data type, 97
input/output meta-commands, 128, 132–133 Int8 As option, 274
IN ROLE option, 155 int8 data type, 97
IN ROLE rolelist option, 154 Integer Aggregator module, 45
INSERT check box, 232 Integer Array module, 45
INSERT command, 158–160, 167, 186, 187, 189, Inventory column, 99
232, 303, 334, 358, 359 /I parameter, 324
Insert commands check box, 115 Ipconfig Windows command, 247
INSERT event, 231 IS A FUNCTION operator, 196
inserting data, 158–159 ISBN module, 45
INSERT keyword, 143 IsClosed property, 306
INSERT privilege, 156 isfinite(interval) function, 210
Insert Username module, 45 isfinite(timestamp) function, 210
Install as a Service check box, 40 IS NOT NULL operator, 204
Installation Options section, 323 IS NULL operator, 204, 205
Installation Options window, 36–39, 41, 316 ISO C format, 68
Database Drivers package, 38–39 ISO format, 68
Database Server package, 37–38 isolation, 15–16
374 PostgreSQL 8 for Windows
X
log(b, x) function, 212
K Logging section, 63
login-method field, pg_ hba. conf file, 72–73
KSQO option, 273 LOGIN option, 154
Login Role, 134
Login Roles, 89, 109–110, 135, 146–148, 154–155,
X L
157, 159, 178, 240, 241, 249, 266, 271, 277, 280,
284, 297–299, 303, 311, 326–329, 333, 355
LOG level, 53
LAN (Local Area Network), 10, 11
log(x) function, 212
LANGUAGE keyword, 220
LOOP statements, PL/pgSQL language,
languagename parameter, 219
226–227
Language textbox, 229
Index 375
.NET redistributable package, 290, 294, 299, 315 NpgsqlNotificationEventArgs class, 296
network-address field, pg_ hba. conf file, 72 NpgsqlParameter class, 296
network sessions, encrypting, 242–249 NpgsqlParameterCollection class, 296, 309–311
creating SSL encryption key, 245–248 NpgsqlParameterCollection object, 309
enabling SSL, 244 NpgsqlParametersCollection class, 309
encryption keys and certificates, 244–245 NpgsqlTransaction class, 296
testing SSL encryption, 248–249 NTFS format, 27, 28, 29
network users, controlling, 236–241 NULL character, 139
controlling access via configuration files, NULL check box, 97
239–241 NULL field, 137
controlling access via firewalls, 236–239 NULL keyword, 205
testing remote connectivity, 241 NULL operators, 203–205
New button, 281 Null parameter, 137
New Folder icon, 319, 320 numerical operators, 197–198
New Group Role window, 106, 107 numeric data type, 97
New Login Role window, 109, 110, 155
NEW variable, 229
NEXT option, 191
NextResult( ) method, 306
—no-align option, 124
X O
NOLOGIN option, 154 objects, SQL, 146–158
Nonquery SQL command, 302, 330 assigning privileges, 155–158
—nopsqlrc option, 125 granting privileges to database
Notepad application, 54, 138, 321 objects, 156–158
NOTICE level, 53 granting privileges to roles, 158
NOTIFY command, 145 creating Group and Login Roles, 154–155
NOTIFY keyword, 145 database object, 146–147
NOWAIT parameter, 177 schema object, 148–149
now( ) function, 210 tables, 149–154
Npgsql binary package, 294 adding additional table elements,
NpgsqlCommandBuilder class, 296 151–154
NpgsqlCommand class, 296, 300–308, 311 defining base table, 150–151
ExecuteNonQuery( ) method, 302–303 octet_length(string) function, 207
ExecuteReader( ) method, 305–308 ODBC (Open Database Connectivity)
ExecuteScalar( ) method, 304–305 connection, configuring, 269–274
Prepare( ) method, 308 ODBC (Open Database Connectivity) driver, 8,
NpgsqlCommand object, 301, 309, 311 266–272, 278, 281, 286
NpgsqlConnection class, 296–300, 311 ODBC (Open Database Connectivity) session,
NpgsqlConnection object, 299 setting up, 277–278
NpgsqlDataAdapter class, 296 ODBC Data Source Administrative window, 278
NpgsqlDataReader class, 296, 305 ODBC Data Source Administrator window,
NpgsqlDataReader object, 305, 306, 307, 308, 311 269, 270
npgsql directory, 51 ODBC Driver Setup configuration window,
Npgsql driver, 38 270, 271
NpgsqlError class, 296 ODBC Driver Setup dialog box, 278
Npgsql error event, 296 ODBC option, 267
NpgsqlEventLog class, 296 OFFSET parameter, 176, 177, 189
NpgsqlException class, 296 \o file command, 132
Npgsql library, 289, 309, 311, 314, 330 OID column, 69
creating . NET applications with OID field, 93
Npgsql, 294–295 OID Options option, 274
downloading, 293 OLD variable, 229
installing, 294 ON keyword, 174
Npgsql namespace, 295 Open Database Connectivity (ODBC)
NpgsqlNoticeEventArgs class, 296 connection, configuring, 269–274
Index 377
X
using command-line options,
126–127 R
importing data with psql, 139–140
meta-commands, 127–138 Radians(x) function, 212
copy and large object RAID functions, 32
meta-commands, 136–138 RAID Levels, 32
formatting meta-commands, RAM (Random Access Memory)
135–136 performance, 33
informational meta-commands, Random( ) function, 212
133–135 R code, 109
input/output meta-commands, r code, 109
132–133 \r command, 131
380 PostgreSQL 8 for Windows
X
creating new applications, 89–104
creating new database, 89–91 Z
creating new schema, 91–93
creating tables, 94–101 \z [pattern] command, 133
entering and viewing data, 101–104 Zip column, 96
database maintenance, 112–114