SQL Subquery PDF

Download as pdf or txt
Download as pdf or txt
You are on page 1of 15

SQL

SERIES

How to Write
SUBQUERY
Subqueries are a powerful SQL resource, allowing us to combine data
from multiple tables in a single query.

the simplest definition of a SQL subquery is “A query inside a query”.

Although subqueries are generally used in the WHERE clause, you


can use them in other clauses, such as FROM, HAVING, and SELECT.

Basic Subqueries by Example


Before going into subqueries, we need to explain our database
tables. I’ll use examples related to beautiful and relaxing places.

Our sample database will have two tables. The first table is called
best_10_places and it stores the 10 best places for different kinds of
activities (like snorkeling, skiing, and trekking).

The table has columns for the place name, the activity we can do
there, the ranking of this place, and the closest city. Have a look:
If you want to travel to any of these beautiful places, you’ll need a
ticket; the one_way_ticket table has one record for any pair of cities
that are connected by any kind of transportation.

We will use this table to determine how to get from one city to
another.

The columns contain info on the origin city, destination city, ticket
price, travel time, and transportation type (e.g. rail, air, etc.). Below is
a subset of this table:

linkedin.com/in/ileonjose
Now we’re ready for the first example.

Let’s suppose a person in Paris wants to go to the #1 place in the


world for snorkeling.

What type of transportation goes from Paris to this place?

As you probably know, the simplest SQL query is formed by a


SELECT, a FROM, and (optionally) a WHERE clause.

And as we previously mentioned, a subquery is a query inside a


query.

So, in the next example you’ll see two queries: the main query (also
called the outer query) and the subquery (in red):

SELECT city_destination, transportation, ticket_price, travel_time


FROM one_way_ticket
WHERE city_destination = (
SELECT closest_city
FROM best_10_places
WHERE activity_type = 'snorkeling'
AND ranking_position = 1
)
AND city_origin = 'Paris'
The subquery is executed first, returning the closest_city to the best
snorkeling destination (the city of Florianopolis in Brazil). Then the
main query is executed, replacing the subquery with its result
(Florianopolis). The final result is:

When using subqueries:


You must always enclose the subquery in parenthesis.
Pay attention to the operator used to compare the subquery
result. In our previous example, we used “=”; however, this
operator should be used with subqueries that return only one
row and only one column (also known as “scalar” subqueries).

Scalar or Non-Scalar Subqueries


So, a scalar subquery returns only one column with only one row.
What’s a non-scalar subquery? A subquery that returns multiple
rows.

There are many operators we can use to compare a column with a


subquery.

However, a few of them can only be used with scalar subqueries: =, >,
>=, < and <=. If you use one of these operators, your subquery must
be scalar.
Let’s see an example with a scalar subquery.

Suppose you have a customer who wants to go from Paris to


Bariloche. Before buying the ticket, the customer wants to see if
there are any places with a cheaper ticket. The query below will find
those cities:

SELECT city_destination, ticket_price, travel_time, transportation


FROM one_way_ticket
WHERE ticket_price < (
SELECT ticket_price
FROM one_way_ticket
WHERE city_destination = 'Bariloche'
AND city_origin = 'Paris'
)
AND city_origin = 'Paris'

Again, the subquery is executed first; its result (the price of a Paris-
Bariloche ticket, or $970) is compared with the column ticket_price
in the outer query.

This obtains all the records in one_way_ticket with a ticket_price


value less than $970. The result of the query is shown below:
Other operators, like IN, EXISTS or NOT EXISTS, > ALL, = ANY, can be
used with scalar or non-scalar subqueries.

Our next example uses the IN operator.

Let’s suppose the person who asked about the best place for
snorkeling wants to explore other destinations; in fact, they’d like to
see the top three snorkeling places.

The change in our subquery is clear: we only need to change


“ranking_position = 1” with “ranking_position <= 3”. However, our
subquery will return three records and will not be scalar anymore.
We’ll use the IN operator, like so:

SELECT city_destination, transportation, ticket_price, travel_time


FROM one_way_ticket
WHERE city_destination IN (
SELECT closest_city
FROM best_10_places
WHERE activity_type = 'snorkeling'
AND ranking_position <= 3
)
AND city_origin = 'Paris'
As in the previous example, the database first executes the subquery,
which returns a list of three cities (the closest cities to the top 3
snorkeling destinations: Florianopolis, Honolulu, and Melbourne).

Then the outer query is executed with these conditions:

city_destination IN ('Florianopolis', 'Honolulu', 'Melbourne')

The IN operator returns TRUE when the value of city_destination is


one of these three cities.

Thus, the main query returns the following result:


Advanced Subqueries
The subquery concept is easy to understand. But because of SQL’s
flexibility, subqueries can be used in many different forms. Covering
all the possible usages is beyond the scope of this post. What we will
do instead is demonstrate some of the most important uses.

How many different places can you put a subquery?

Subqueries can be used in different places in a SQL query, including


the WHERE, FROM, HAVING, and SELECT clauses; moreover, a
subquery can also be used as part of an UPDATE, DELETE, or INSERT
statement.

In the next example, we will see how to use a subquery in the FROM
clause.

Suppose the travel agency’s owner wants to show every city along
with the ticket cost and the number of “best places” near this city.

To obtain the quantity of “best places” for each city, we will use a
subquery (shown in red) in the FROM clause to create a pseudo table.

Then the outer query will JOIN with one_way_ticket and the pseudo
table.
SELECT city_destination, ticket_price, pseudo_table.quantity
FROM one_way_ticket
JOIN (
SELECT closest_city AS city, count(*) AS quantity
FROM best_10_places
GROUP BY 1
) pseudo_table
ON one_way_ticket.pseudo_table.city

The result of this query is:


EXISTS: A subquery-oriented operator

One of the most powerful operators you can use with subqueries is
the EXISTS operator.

As we can see in the example below, the EXISTS operator must come
before the subquery.

It will return TRUE if the subquery returns at least one row – no


matter what the row content is.

If the subquery returns 0 rows, EXISTS will return FALSE.

For the next example, let’s suppose our customer from Paris wants to
travel to a place where they can do both trekking and snorkeling.

The query below can be used to reply to this customer:

linkedin.com/in/ileonjose
SELECT city_destination, transportation, ticket_price, travel_time
FROM one_way_ticket
WHERE EXISTS (
SELECT closest_city
FROM best_10_places
WHERE activity_type = 'snorkeling'
AND closest_city = one_way_ticket.city_destination
)
AND EXISTS (
SELECT closest_city
FROM best_10_places
WHERE activity_type = 'trekking'
AND closest_city = one_way_ticket.city_destination
)
AND city_origin = 'Paris'

The result shows the records related to cities with trekking and
snorkeling activities:
One interesting point in the previous subquery is the reference to the
one_way_ticket.city_destination column in the outer query.

Subqueries that reference columns in the outer query are called


“correlated subqueries” and have some specific behaviors.

As in the previous example, correlated subqueries tend to be used


with the EXISTS and NOT EXISTS subquery operators.

The ALL and ANY operators

This pair of operators works in conjunction with the =, <>, >, >=, < and
<= operators, adding more expressivity to the language.

Due the high number of possible combinations with ALL and ANY,
I’ve included a table with the most common uses of these operators:
Let’s apply this operator to a real-life example.

Suppose we want to promote all the “world’s best places” you can
visit with a ticket under $1,000.

Each “best place” in the table best_10_places can have many


possible tickets; we are only interested in those places where at least
one ticket costs less than $1000. The query is as follows:

SELECT Place_name, Activity, Ranking_position


FROM best_10_places
WHERE 1000 > ANY (
SELECT ticket_price
FROM one_way_ticket
WHERE city_destination = best_10_places.closest_city
)

The results of the previous query are shown below. You can go to the
best place for any activity (snorkeling, skiing, and trekking) for less
than $1,000!
If you find this helpful, Repost

for more content.

linkedin.com/in/ileonjose

You might also like