Writing SQL Queries: Basics
Jackie Goldstein
November 2005
Summary: Learn to be more productive with SQL Server 2005 Express Edition with this quick
introduction to the T-SQL language and the basics of getting information from the database
using the SELECT statement.
Introduction
Fetching Data: SQL SELECT Queries
Conclusion
References
Introduction
With the availability of ever more powerful programming tools and environments such as
Visual Basic and Visual Studio.NET, as well as the availability of powerful database engines
such as the free SQL Server 2005 Express Edition, more and more people find themselves
having to learn the basics of SQL queries and statements. Sometimes they are professional
developers who are experienced in other types of programming, and sometimes they are
individuals whose expertise lies in other areas, but they suddenly find themselves
programming database applications for fun and/or profit. If you fall into one of these
categories, or are just curious about database programming, then this article is for you.
SQL Server 2005 Express offers you the opportunity to dive deeply into advanced databases
and database applications, while still being free of charge. It uses the same core database
engine as the other editions in the SQL Server 2005 family, but it allows for easier setup and
distribution, all at no cost. It supports advanced database features including views,
stored procedures, triggers, functions, native XML support, and full T-SQL support, and it
delivers high performance.
The purpose of this article is to lay out the basic structure and use of SQL SELECT queries
and statements. These statements are part of the Transact-SQL (T-SQL) language specification
and are central to the use of Microsoft SQL Server. T-SQL is an extension to the ANSI SQL
standard and adds improvements and capabilities, making T-SQL an efficient, robust, and
secure language for data access and manipulation.
Although many tools are available for designing your queries visually, such as the Visual
Database Tools that are available with Microsoft Visual Studio, it is still worthwhile and
important to understand the SQL language. There is a real benefit to understanding what the
visual tools are doing and why. There are also times when manually writing the necessary
SQL statement is the only, or simply the fastest, way to achieve what you want. It is also an
ideal way to learn how to use the full power of a relational database such as SQL Express.
Although there exist many different types of database, we will focus on the most common
type—the relational database. A relational database consists of one or more tables, where
each table consists of zero or more records, or rows, of data. The data for each row is organized
into discrete units of information, known as fields or columns. When we want to show the
fields of a table, let's say the Customers table, we will often show it like this:
Many of the tables in a database will have relationships, or links, between them, either in a
one-to-one or a one-to-many relationship. The connection between the tables is made by
a Primary Key – Foreign Key pair, where the Foreign Key field (or fields) in one table is the
Primary Key of another table. As a typical example, there is a one-to-many relationship between
Customers and Orders. Both tables have a CustID field, which is the Primary Key of the
Customers table and a Foreign Key of the Orders table. The related fields do not need to
have the identical name, but it is a good practice to keep them the same.
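The Customers and Orders relationship described above could be declared roughly as follows. This is only a sketch: CustID is the one field named in the text, and every other column name and data type here is a hypothetical choice made for illustration.

```sql
-- Hypothetical table definitions; only CustID comes from the text above.
CREATE TABLE Customers (
    CustID      INT          NOT NULL PRIMARY KEY,  -- Primary Key of Customers
    CompanyName NVARCHAR(40) NOT NULL               -- hypothetical column
);

CREATE TABLE Orders (
    OrderID   INT      NOT NULL PRIMARY KEY,        -- hypothetical column
    CustID    INT      NOT NULL
        REFERENCES Customers (CustID),              -- Foreign Key to Customers
    OrderDate DATETIME NULL                         -- hypothetical column
);
```

Because many Orders rows can carry the same CustID value while each CustID appears only once in Customers, the relationship is one-to-many.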
Fetching Data: SQL SELECT Queries
A SQL SELECT statement can be broken down into numerous elements, each beginning with
a keyword. Although it is not necessary, common convention is to write these keywords in all
capital letters. In this article, we will focus on the most fundamental and common elements
of a SELECT statement, namely:
SELECT
FROM
WHERE
ORDER BY
The most basic SELECT statement has only 2 parts: (1) what columns you want to return and
(2) what table(s) those columns come from.
If we want to retrieve all of the information about all of the employees in the Employees
table, we could use the asterisk (*) as a shortcut for all of the columns, and our query looks
like
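A minimal sketch of this query, assuming the Employees table used throughout this article:

```sql
-- Return every column of every row in the Employees table.
SELECT *
FROM Employees
```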
If we want only specific columns (as is usually the case), we can/should explicitly specify them
in a comma-separated list, as in
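One way to write such a query, assuming the Employees table has the columns that appear in the other queries in this article:

```sql
-- Return only the listed columns, in the order given.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
```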
This returns the specified fields of data for all of the rows in the table.
Explicitly specifying the desired fields also allows us to control the order in which the fields
are returned, so that if we wanted the last name to appear before the first name, we could
write
SELECT EmployeeID, LastName, FirstName, HireDate, City FROM Employees
The WHERE Clause
The next thing we want to do is to start limiting, or filtering, the data we fetch from the
database. By adding a WHERE clause to the SELECT statement, we add one (or more)
conditions that must be met by the selected data. This will limit the number of rows that
answer the query and are fetched. In many cases, this is where most of the "action" of a
query takes place.
We can continue with our previous query, and limit it to only those employees living in
London:
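A sketch of the filtered query, assuming the City column stores the city name as plain text:

```sql
-- Keep only the rows whose City value is exactly 'London'.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE City = 'London'
```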
If you wanted to get the opposite, the employees who do not live in London, you would
write
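One possible form of this query, using the T-SQL not-equal operator <> on the same assumed Employees table:

```sql
-- Keep only the rows whose City value is anything other than 'London'.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE City <> 'London'
```

Note that rows in which City is NULL satisfy neither this condition nor the previous one, because a comparison with NULL is never true.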
You are not limited to testing for equality; you can also use the other standard comparison
operators (<, >, <=, >=). For example, to get a list of employees who were hired on
or after a given date, you would write
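A sketch of such a query follows. The date is an arbitrary example value, and the literal uses the same day-month-year style as the other queries in this article, which assumes an English language setting on the server.

```sql
-- Keep only employees hired on or after 1 July 1993 (example date).
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE HireDate >= '1-july-1993'
```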
Of course, we can write more complex conditions. The obvious way to do this is by having
multiple conditions in the WHERE clause. If we want to know which employees were hired
between two given dates, we could write
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE (HireDate >= '1-june-1992') AND (HireDate <= '15-december-1993')
Note that SQL also has a special BETWEEN operator that checks to see if a value is between
two values (including equality on both ends). This allows us to rewrite the previous query as
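A sketch of the rewritten query, reusing the two dates from the previous example:

```sql
-- BETWEEN is inclusive: both boundary dates satisfy the condition.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE HireDate BETWEEN '1-june-1992' AND '15-december-1993'
```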
We could also use the NOT operator, to fetch those rows that are not between the specified
dates:
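One way to write this, again assuming the same Employees table and date range:

```sql
-- Keep only employees hired before 1 June 1992 or after 15 December 1993.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE HireDate NOT BETWEEN '1-june-1992' AND '15-december-1993'
```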
Let us finish this section on the WHERE clause by looking at two additional, slightly more
sophisticated, comparison operators.
What if we want to check if a column value is equal to more than one value? If it is only 2
values, then it is easy enough to test for each of those values, combining them with
the OR operator and writing something like
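A sketch of the two-value test; the city names London and Seattle are chosen here purely for illustration:

```sql
-- A row qualifies if either comparison is true.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE City = 'London' OR City = 'Seattle'
```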
However, if there are three, four, or more values that we want to compare against, the above
approach quickly becomes messy. In such cases, we can use the IN operator to test against a
set of values. If we wanted to see if the City was either Seattle, Tacoma, or Redmond, we
would write
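A sketch of the IN form for the three cities named above:

```sql
-- A row qualifies if City matches any value in the list.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE City IN ('Seattle', 'Tacoma', 'Redmond')
```

Placing NOT before IN reverses the test, so that only rows whose City is none of the listed values qualify.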
The LIKE operator compares a column value against a pattern, which can include the
following wildcard characters:
Wildcard         Description
%                matches any string of zero or more characters
_ (underscore)   matches any single character
[]               matches any single character within the specified range (e.g. [a-f]) or set
[^]              matches any single character not within the specified range (e.g. [^a-f]) or set
For example:
WHERE FirstName LIKE '_im' finds all three-letter first names that end with 'im' (e.g. Jim, Tim).
WHERE LastName LIKE '%stein' finds all employees whose last name ends with 'stein'.
WHERE LastName LIKE '%stein%' finds all employees whose last name includes 'stein' anywhere in the name.
WHERE FirstName LIKE '[JT]im' finds three-letter first names that end with 'im' and begin with either 'J' or 'T' (that is, only Jim and Tim).
WHERE LastName LIKE 'm[^c]%' finds all last names beginning with 'm' where the following (second) letter is not 'c'.
Here too, we can opt to use the NOT operator: to find all of the employees whose first name
does not start with 'M' or 'A', we would write
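One possible way to write this query, combining two NOT LIKE conditions on the assumed Employees table:

```sql
-- Exclude first names that begin with 'M' and first names that begin with 'A'.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
WHERE (FirstName NOT LIKE 'M%') AND (FirstName NOT LIKE 'A%')
```

The same filter could probably be expressed more compactly with a character set, as in FirstName NOT LIKE '[MA]%'.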
The ORDER BY Clause
Until now, we have been discussing filtering the data: that is, defining the conditions that
determine which rows will be included in the final set of rows to be fetched and returned
from the database. Once we have determined which columns and rows will be included in
the results of our SELECT query, we may want to control the order in which the rows appear
—sorting the data.
To sort the data rows, we include the ORDER BY clause. The ORDER BY clause includes one
or more column names that specify the sort order. If we return to one of our
first SELECT statements, we can sort its results by City with the following statement:
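A sketch of the sorted query, assuming the same column list as the earlier examples:

```sql
-- Return all employees, sorted by City in the default (ascending) order.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
ORDER BY City
```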
By default, the sort order for a column is ascending (from lowest value to highest value).
If we want the sort order for a column to be descending, we can include the DESC keyword
after the column name.
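For instance, the previous City sort could be reversed like this (a sketch on the same assumed table):

```sql
-- DESC applies only to the column it follows.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
ORDER BY City DESC
```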
The ORDER BY clause is not limited to a single column. You can include a comma-delimited
list of columns to sort by—the rows will all be sorted by the first column specified and then
by the next column specified. If we add the Country field to the SELECT clause and want to
sort by Country and City, we would write:
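A sketch of the two-column sort, assuming the Employees table also has a Country column:

```sql
-- Rows are ordered by Country first; rows that share a Country
-- are then ordered by City.
SELECT EmployeeID, FirstName, LastName, HireDate, Country, City
FROM Employees
ORDER BY Country, City
```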
The ASC keyword can be written after a column name to request ascending order explicitly,
but this is not necessary and is rarely done.
It is important to note that a column does not need to be included in the list of selected
(returned) columns in order to be used in the ORDER BY clause. If we don't need to see/use
the Country values, but are only interested in them as the primary sorting field we could
write the query as
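One way to write it, with Country used for sorting but left out of the column list:

```sql
-- Country drives the primary sort even though it is not returned.
SELECT EmployeeID, FirstName, LastName, HireDate, City
FROM Employees
ORDER BY Country, City
```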
Conclusion
In this article we have taken a look at the most basic elements of a SQL SELECT statement
used for common database querying tasks. This includes how to specify and filter both the
columns and the rows to be returned by the query. We also looked at how to control the
order of rows that are returned.
Although the elements discussed here allow you to accomplish many data access / querying
tasks, the SQL SELECT statement has many more options and additional functionality. This
additional functionality includes grouping and aggregating data (summarizing, counting, and
analyzing data, e.g. minimum, maximum, average values). This article has also not addressed
another fundamental aspect of fetching data from a relational database—selecting data from
multiple tables.
References
Additional and more detailed information on writing SQL queries and statements can be
found in these two books:
McManus, Jeffrey P. and Goldstein, Jackie, Database Access with Visual Basic.NET (Third
Edition), Addison-Wesley, 2003.
Hernandez, Michael J. and Viescas, John L., SQL Queries for Mere Mortals, Addison-Wesley,
2000.
Jackie Goldstein is the principal of Renaissance Computer Systems, specializing in consulting,
training, and development with Microsoft tools and technologies. Jackie is a Microsoft Regional
Director and MVP, founder of the Israel VB User Group, and a featured speaker at international
developer events including TechEd, VSLive!, Developer Days, and Microsoft PDC. He is also the
author of Database Access with Visual Basic.NET (Addison-Wesley, ISBN 0-67232-3435) and a
member of the INETA Speakers Bureau. In December 2003, Microsoft designated Jackie as a
.NET Software Legend.