CSIS 3300 W11 QueryOptimization


CSIS 3300 – Database - II

Week 12: Advanced Joins and Query Optimization


Nikhil Bhardwaj
Agenda

 Different types of joins


 Query Optimization
Joins
Inner Join

 Returns only the rows that have a match in both tables.


 Example: SELECT * FROM productdb.City INNER JOIN Province ON City.provinceId = Province.provinceId;
 Note that provinceId is the column common to both tables.
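To try the join examples without the course database, a tiny stand-in schema can be created. This is only a sketch: the database name join_demo, the cityId and name columns, and the sample rows are assumptions for illustration. Only City.provinceId and Province.provinceId come from the slides.

```sql
-- Hypothetical scratch database; names and rows are illustrative only.
CREATE DATABASE IF NOT EXISTS join_demo;
USE join_demo;

CREATE TABLE Province (
    provinceId INT PRIMARY KEY,
    name       VARCHAR(50) NOT NULL
);

CREATE TABLE City (
    cityId     INT PRIMARY KEY,
    name       VARCHAR(50) NOT NULL,
    provinceId INT NULL            -- NULL means "province unknown"
);

INSERT INTO Province (provinceId, name) VALUES
    (1, 'British Columbia'),
    (2, 'Alberta'),
    (3, 'Yukon');                  -- no city below refers to Yukon

INSERT INTO City (cityId, name, provinceId) VALUES
    (10, 'Vancouver', 1),
    (11, 'Victoria',  1),
    (12, 'Calgary',   2),
    (13, 'Atlantis',  NULL);       -- matches no province

-- Inner join: only cities that have a matching province are returned.
SELECT c.name AS city, p.name AS province
FROM City AS c
INNER JOIN Province AS p ON c.provinceId = p.provinceId;
```

With these sample rows the inner join returns three rows (Vancouver, Victoria and Calgary). Atlantis and Yukon are left out because neither has a partner in the other table.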

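Cross Join (Cartesian Product)

 A cross join returns the Cartesian product of the two tables. In MySQL, JOIN, INNER JOIN and CROSS JOIN are syntactically equivalent, so a JOIN without an ON clause behaves as a cross join. A cross join is not the same thing as a full outer join.
 Example: SELECT * FROM City CROSS JOIN Province;
 Every city is listed with every province.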
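One way to check the "every city with every province" claim is to compare row counts. The query below is a small sketch that assumes only the City and Province tables from the slides. The column aliases are made up for readability.

```sql
-- The cross join row count should equal city_rows * province_rows.
SELECT
    (SELECT COUNT(*) FROM City)                     AS city_rows,
    (SELECT COUNT(*) FROM Province)                 AS province_rows,
    (SELECT COUNT(*) FROM City CROSS JOIN Province) AS cross_join_rows;
```

Because the result grows multiplicatively, a cross join on two large tables can become very expensive, which is worth keeping in mind before running it on production data.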
Left Join
 All the rows from the left table are included.
 Where a left-table row has no match, the columns from the right table are NULL.
 In MySQL, LEFT JOIN is the same as LEFT OUTER JOIN.
 Example: SELECT * FROM City LEFT JOIN Province ON City.provinceId = Province.provinceId;
vs.
SELECT * FROM Province LEFT JOIN City ON City.provinceId = Province.provinceId;
 The first query keeps every city; the second keeps every province.
Left Excluding Join

 Only the rows from the left table that have no match in the right table are included.
 Example: SELECT * FROM Province LEFT OUTER JOIN City ON City.provinceId = Province.provinceId WHERE City.provinceId IS NULL;
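The same "provinces without any city" question can also be written as an anti-join with NOT EXISTS. This is an alternative formulation, not something the slides prescribe; it assumes only the City and Province tables and the provinceId column already used above.

```sql
-- Provinces for which no city row exists.
-- Unlike the LEFT JOIN version with SELECT *, this returns Province columns only.
SELECT p.*
FROM Province AS p
WHERE NOT EXISTS (
    SELECT 1
    FROM City AS c
    WHERE c.provinceId = p.provinceId
);
```

Both forms express the same intent. Comparing their EXPLAIN output is a reasonable way to see which plan the optimizer picks on a given data set.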
Right Join

 All the rows from the right table are included.
 Where a right-table row has no match, the columns from the left table are NULL.
 In MySQL, RIGHT JOIN is the same as RIGHT OUTER JOIN.
 Example: SELECT * FROM City RIGHT JOIN Province ON City.provinceId = Province.provinceId;
Right Excluding Join

 Only the rows from the right table that have no match in the left table are included.
 Example: SELECT * FROM City RIGHT OUTER JOIN Province ON City.provinceId = Province.provinceId WHERE City.provinceId IS NULL;
Outer Join

 FULL OUTER JOIN is not supported in MySQL.
 It can be emulated with a UNION (not UNION ALL, which would list the matching rows twice) of a left join and a right join.
 Example: SELECT * FROM City LEFT OUTER JOIN Province ON City.provinceId = Province.provinceId
UNION
SELECT * FROM City RIGHT OUTER JOIN Province ON City.provinceId = Province.provinceId;
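UNION removes duplicates, so it also collapses rows that are genuinely identical in the source tables. A common alternative, sketched below under the assumption that the same City and Province tables are used, is to combine a full left join with a right excluding join using UNION ALL, so that each matched pair appears exactly once and no deduplication pass is needed.

```sql
-- Part 1: every city, with its province when one matches.
SELECT c.*, p.*
FROM City AS c
LEFT JOIN Province AS p ON c.provinceId = p.provinceId

UNION ALL

-- Part 2: only the provinces that matched no city (right excluding join).
SELECT c.*, p.*
FROM City AS c
RIGHT JOIN Province AS p ON c.provinceId = p.provinceId
WHERE c.provinceId IS NULL;
```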
Outer Excluding Join

 Not supported in MySQL
 Can be emulated with a union of the left and right excluding joins
 Example: SELECT * FROM City LEFT OUTER JOIN Province ON City.provinceId = Province.provinceId WHERE Province.provinceId IS NULL
UNION
SELECT * FROM City RIGHT OUTER JOIN Province ON City.provinceId = Province.provinceId WHERE City.provinceId IS NULL;
Self Join

 MySQL allows a table to be joined with itself.
 Usually done when a record refers to another record in the same table.
 Example: if an employees table has an attribute reportsTo that holds the id of another employee, you can use a self join. The syntax is the same as INNER JOIN, but the same table name appears on both sides, with a different alias each time.
 SELECT m.lastName, m.firstName, e.lastName, e.firstName
FROM employees e
INNER JOIN employees m ON m.employeeNumber = e.reportsTo;

Ref: http://www.mysqltutorial.org/mysql-self-join/
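The inner self join drops any employee whose reportsTo is NULL (for example, the person at the top of the hierarchy). A sketch of a LEFT JOIN variant that keeps those rows is shown below. It assumes the same employees table and column names as the example above; the 'Top manager' label is an arbitrary placeholder.

```sql
-- Every employee is listed; employees with no manager get a placeholder label.
SELECT
    CONCAT(e.lastName, ', ', e.firstName)                        AS employee,
    IFNULL(CONCAT(m.lastName, ', ', m.firstName), 'Top manager') AS manager
FROM employees AS e
LEFT JOIN employees AS m ON m.employeeNumber = e.reportsTo
ORDER BY manager;
```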
Lab
 Can you do a similar exercise with the Province and Country tables?

 How about three tables?

 When using a left excluding join, what would you add in the WHERE clause? Would you use AND or OR to combine multiple conditions?
Sub Queries

 Used when a query depends on another query's output.
 Format
   SELECT * FROM t1 WHERE column1 = (SELECT column1 FROM t2);
   SELECT ... FROM (subquery) [AS] tbl_name ...
 E.g. find how many people live in the area where the customer who spent the most money lives.
   We need to find the top-spending customer's postal code and THEN find all the customers in that postal code.
Query with a sub query

select count(UserId) as UserCount
from User
where postalCodeFSA = (select postalCodeFSA
                       from OrderItem as OI
                       inner join `Order` as O ON OI.orderId = O.orderId
                       inner join Product as P ON OI.productId = P.productId
                       inner join User as U ON O.userId = U.userId
                       group by O.userId
                       order by sum(quantity * price) DESC
                       limit 1);
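The "find the postal code, THEN count the customers" reasoning can also be run as two separate statements, which makes each step easy to inspect. The sketch below is one possible way to do that with a user-defined variable. The variable name @topFSA is made up, and the guess that quantity belongs to OrderItem and price to Product is an assumption, since the slides leave those columns unqualified.

```sql
-- Step 1: postal code of the customer who spent the most.
-- postalCodeFSA is added to GROUP BY so the query also runs under ONLY_FULL_GROUP_BY.
SET @topFSA = (
    SELECT U.postalCodeFSA
    FROM OrderItem AS OI
    INNER JOIN `Order` AS O ON OI.orderId = O.orderId
    INNER JOIN Product AS P ON OI.productId = P.productId
    INNER JOIN User AS U ON O.userId = U.userId
    GROUP BY O.userId, U.postalCodeFSA
    ORDER BY SUM(OI.quantity * P.price) DESC
    LIMIT 1
);

-- Step 2: how many users share that postal code.
SELECT COUNT(UserId) AS UserCount
FROM User
WHERE postalCodeFSA = @topFSA;
```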
Sub Queries
 A subquery compared with = must return at most one row, and MySQL does not support LIMIT inside an IN/ALL/ANY/SOME subquery. So the previous approach breaks if we change the question to:
   Find how many people live in the areas where the top 3 customers who spent the most money live.
 We need to find the postal codes of the top 3 spending customers and THEN find all the customers in those postal codes.
 To do this we update the previous subquery to
select postalCodeFSA
from OrderItem as OI
inner join `Order` as O ON OI.orderId = O.orderId
inner join Product as P ON OI.productId = P.productId
inner join User as U ON O.userId = U.userId
group by O.userId
order by sum(quantity * price) DESC
limit 3

 Since this subquery can return up to three rows, it cannot follow = in the WHERE clause. Instead, we can join against it as a derived table:
Query with a sub query

select count(DISTINCT UserId) as UserCount
from User
inner join (select postalCodeFSA
            from OrderItem as OI
            inner join `Order` as O ON OI.orderId = O.orderId
            inner join Product as P ON OI.productId = P.productId
            inner join User as U ON O.userId = U.userId
            group by O.userId
            order by sum(quantity * price) DESC
            limit 3) as t1 ON User.postalCodeFSA = t1.postalCodeFSA;
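On MySQL 8.0 or later, the same derived table can be written as a common table expression, which some readers find easier to follow. This is a sketch of an alternative, not the approach from the slides. The CTE name TopSpenders is invented, and qualifying quantity with OrderItem and price with Product is an assumption.

```sql
-- Requires MySQL 8.0+ (WITH is not available in earlier versions).
WITH TopSpenders AS (
    SELECT U.postalCodeFSA
    FROM OrderItem AS OI
    INNER JOIN `Order` AS O ON OI.orderId = O.orderId
    INNER JOIN Product AS P ON OI.productId = P.productId
    INNER JOIN User AS U ON O.userId = U.userId
    GROUP BY O.userId, U.postalCodeFSA
    ORDER BY SUM(OI.quantity * P.price) DESC
    LIMIT 3
)
-- The LIMIT lives inside the CTE, not directly inside the IN subquery.
SELECT COUNT(*) AS UserCount
FROM User
WHERE postalCodeFSA IN (SELECT postalCodeFSA FROM TopSpenders);
```

Using IN here means a user is counted once even if two of the top spenders share a postal code, so DISTINCT is not needed.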
Query Optimization

 Explain
 Select query optimization
 Tips
Explain Statement
mysql> EXPLAIN SELECT first_name, last_name FROM employees.employees WHERE first_name LIKE 'S%';
+----+-------------+-----------+-------+---------------+--------------+---------+------+-------+--------------------------+
| id | select_type | table     | type  | possible_keys | key          | key_len | ref  | rows  | Extra                    |
+----+-------------+-----------+-------+---------------+--------------+---------+------+-------+--------------------------+
|  1 | SIMPLE      | employees | range | fullname_ind  | fullname_ind | 16      | NULL | 70858 | Using where; Using index |
+----+-------------+-----------+-------+---------------+--------------+---------+------+-------+--------------------------+

 More details on http://www.sitepoint.com/using-explain-to-write-better-mysql-queries/


Explain Statement
 id – a sequential identifier for each SELECT within the query (e.g. nested subqueries)
 select_type – the type of SELECT query. Possible values are:
   SIMPLE – a simple SELECT with no subqueries or UNIONs
   PRIMARY – the outermost SELECT
   DERIVED – the SELECT is part of a subquery within a FROM clause (a derived table)
   SUBQUERY – the first SELECT in a subquery
   DEPENDENT SUBQUERY – the first SELECT in a subquery that depends on the outer query
   UNCACHEABLE SUBQUERY – a subquery whose result cannot be cached and must be re-evaluated for each row of the outer query
   UNION – the SELECT is the second or later statement of a UNION
   DEPENDENT UNION – the second or later SELECT of a UNION that depends on an outer query
   UNION RESULT – the result of a UNION
 table – the table referred to by the row
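To see several select_type values at once, EXPLAIN can be run on queries that contain a derived table or a UNION. The statements below are a sketch that assumes the employees.employees table from the earlier EXPLAIN example. The threshold 100 is arbitrary, and the exact plan depends on the MySQL version and the data.

```sql
-- A grouped derived table cannot be merged into the outer query,
-- so this usually shows one PRIMARY row and one DERIVED row.
EXPLAIN
SELECT t.first_name, t.cnt
FROM (
    SELECT first_name, COUNT(*) AS cnt
    FROM employees.employees
    GROUP BY first_name
) AS t
WHERE t.cnt > 100;

-- A UNION usually shows PRIMARY, UNION and UNION RESULT rows.
EXPLAIN
SELECT first_name FROM employees.employees WHERE first_name LIKE 'S%'
UNION
SELECT last_name FROM employees.employees WHERE last_name LIKE 'S%';
```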
Explain Statement
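 type – the join (access) type MySQL uses for the table; often the most insightful column. Listed from best to worst:
   system – the table has only one row (a special case of const)
   const – the table has at most one matching row, which is read once at the start of the query; this is the fastest type of join
   eq_ref – all parts of a PRIMARY KEY or UNIQUE NOT NULL index are used by the join, so one row is read for each combination of rows from the previous tables
   ref – all of the matching rows of an indexed column are read for each combination of rows from the previous table
   fulltext – the join uses a FULLTEXT index
   ref_or_null – like ref, but MySQL also searches for rows that contain NULL values
   index_merge – several indexes are merged; the key column lists the indexes used
   unique_subquery – an IN subquery that returns values from a unique index (e.g. the PK)
   index_subquery – like unique_subquery, but for a nonunique index in the subquery
   range – the key column is compared to a constant using operators like BETWEEN, IN, >, >=, etc.
   index – the entire index tree is scanned to find matching rows
   ALL – the entire table is scanned; the worst join type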
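A few queries that commonly produce different access types are sketched below. They assume the standard employees sample database, where emp_no is taken to be the primary key and hire_date is taken to have no index. Both are assumptions about the schema, and the type actually reported depends on the indexes and statistics in your copy.

```sql
-- Equality on the full primary key: typically type = const.
EXPLAIN SELECT * FROM employees.employees WHERE emp_no = 10001;

-- Prefix match on an indexed column: typically type = range
-- (as in the EXPLAIN output shown earlier).
EXPLAIN SELECT first_name, last_name
FROM employees.employees
WHERE first_name LIKE 'S%';

-- Predicate on a column with no index: typically type = ALL (full table scan).
EXPLAIN SELECT * FROM employees.employees WHERE hire_date > '1999-01-01';
```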
Explain Statement

 possible_keys – the indexes MySQL could choose from for this table
   They might or might not be used
 key – the index MySQL actually decided to use
   May contain an index that is not listed in the possible_keys column
 key_len – the length (in bytes) of the key that the query optimizer chose to use
 ref – the columns or constants that are compared to the index named in the key column
 rows – an estimate of the number of rows MySQL expects to examine to produce the output
 Extra – contains additional information regarding the query execution plan. Values such as “Using temporary”, “Using filesort”, etc. in this column may indicate a troublesome query
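As a small illustration of the Extra column, sorting on a column that no index covers is a frequent cause of “Using filesort”. The sketch below assumes the employees sample database and that hire_date has no index, which may not hold for your copy. EXPLAIN FORMAT=JSON is a standard MySQL option that prints the same plan with more detail.

```sql
-- Likely to report "Using filesort" in Extra if hire_date is not indexed.
EXPLAIN
SELECT first_name, last_name
FROM employees.employees
ORDER BY hire_date;

-- Same plan in JSON form, including cost estimates on recent MySQL versions.
EXPLAIN FORMAT=JSON
SELECT first_name, last_name
FROM employees.employees
ORDER BY hire_date;
```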
Select Query Optimization

 Where clause
   Use the minimum number of conditions needed to get the specific result set you want
   Remove unnecessary parentheses
     ((a AND b) AND c OR (((a AND b) AND (c AND d))))
     → (a AND b AND c) OR (a AND b AND c AND d)
     → (a AND b AND c)
   Are all three conditions equivalent? Yes. Think about why.

 Constant folding
   (a<b AND b=c) AND a=5
   → b>5 AND b=c AND a=5
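MySQL applies rewrites like these itself, and the rewritten statement can be inspected by running SHOW WARNINGS right after EXPLAIN. The sketch below uses a hypothetical table fold_demo with columns a, b and c, created only for this experiment. The exact text of the rewritten query varies between MySQL versions.

```sql
-- Hypothetical scratch table for observing the optimizer's rewrite.
CREATE TABLE IF NOT EXISTS fold_demo (a INT, b INT, c INT);
INSERT INTO fold_demo (a, b, c) VALUES (5, 7, 7), (5, 3, 3), (4, 9, 9);

-- The condition as a person might write it.
EXPLAIN SELECT * FROM fold_demo WHERE (a < b AND b = c) AND a = 5;

-- The Note returned here contains the statement after the optimizer's rewrite.
SHOW WARNINGS;

-- The hand-simplified form from the slide selects the same rows.
SELECT * FROM fold_demo WHERE b > 5 AND b = c AND a = 5;
```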


Tips

 Use LIMIT 1 (or another limit) when you only need a subset of the rows.
 Index the columns that are used in the WHERE clause.
 An index helps a text search only when the pattern fixes at least the first character, such as “where first_name like 'S%'”. A search like “where first_name like '%S%'” has a leading wildcard, so it cannot use the index to narrow the search; consider a FULLTEXT index for that kind of search.
 Avoid SELECT *; it's almost always better to specify column names.
 Use NOT NULL where you can (during schema design).
 Fixed-length (static) tables are faster to query (this applies to MyISAM tables).
   A fixed-length table is one that does not use any of VARCHAR, TEXT or BLOB for its column types.
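The difference between a prefix pattern and a leading-wildcard pattern can be checked with EXPLAIN. The sketch below assumes the employees.employees table used earlier. The index name idx_first_name is hypothetical and is created only if you do not already have an index that starts with first_name.

```sql
-- Hypothetical single-column index on the searched column.
CREATE INDEX idx_first_name ON employees.employees (first_name);

-- Prefix pattern: the optimizer can use a range scan on the index.
EXPLAIN SELECT first_name FROM employees.employees
WHERE first_name LIKE 'S%';

-- Leading wildcard: no range scan is possible, so expect a full
-- index scan (type = index) or a full table scan (type = ALL).
EXPLAIN SELECT first_name FROM employees.employees
WHERE first_name LIKE '%S%';
```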
Tips

 Use vertical partitioning
   If a table has columns that are not read often, they can be moved to another table by splitting the main table in two. E.g. if the user table has an address field which is not read often, you can split the user table into user and userDetail (both using userId as the PK, thus making a 1-1 relationship) and keep the less accessed details in the userDetail table.
   If you have columns which get updated very frequently (such as lastLogin), you can keep them in a separate table so that the less frequently updated data stays in the cache.
 Use the smallest data type which fulfills the foreseeable future data needs
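A minimal sketch of the user / userDetail split is shown below. The scratch database name partition_demo, the email column, the column sizes and the foreign key name are all assumptions made for illustration; only the table names, userId and address come from the slide.

```sql
-- Hypothetical scratch database so the example does not touch real tables.
CREATE DATABASE IF NOT EXISTS partition_demo;
USE partition_demo;

-- Frequently read columns stay in the main table.
CREATE TABLE user (
    userId INT UNSIGNED NOT NULL AUTO_INCREMENT,
    email  VARCHAR(255) NOT NULL,
    PRIMARY KEY (userId)
) ENGINE = InnoDB;

-- Rarely read columns move to a 1-1 side table that shares the same key.
CREATE TABLE userDetail (
    userId  INT UNSIGNED NOT NULL,
    address VARCHAR(255) NULL,
    PRIMARY KEY (userId),
    CONSTRAINT fk_userDetail_user
        FOREIGN KEY (userId) REFERENCES user (userId)
        ON DELETE CASCADE
) ENGINE = InnoDB;

-- The details are joined in only when they are actually needed.
SELECT u.userId, u.email, d.address
FROM user AS u
LEFT JOIN userDetail AS d ON d.userId = u.userId
WHERE u.userId = 1;
```

The LEFT JOIN keeps users that have no userDetail row yet, which is a design choice: an INNER JOIN would silently hide them.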
Tips

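 Storage engine
   MyISAM
     Good for read-heavy workloads (analytics-type applications)
     Doesn't scale well for write-heavy apps (it uses table-level locking)
     Doesn't support foreign keys (referential integrity) or transactions

   InnoDB
     Historically slower than MyISAM for some read-only workloads, but it supports transactions, foreign keys and row-level locking, so it handles writes more efficiently
     It has improved a lot and has been the default storage engine since MySQL 5.5, so we should use InnoDB most of the time.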
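Checking and changing a table's storage engine takes only a few statements. The sketch below uses a hypothetical table named legacy_log that is created as MyISAM just so there is something to convert. On a real system, converting a large table rebuilds it and can take a while.

```sql
-- Hypothetical MyISAM table used only for this demonstration.
CREATE TABLE legacy_log (
    id      INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    message VARCHAR(255) NOT NULL
) ENGINE = MyISAM;

-- List the engine of every table in the current database.
SELECT TABLE_NAME, ENGINE
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE();

-- Convert the table to InnoDB (this rebuilds the table).
ALTER TABLE legacy_log ENGINE = InnoDB;
```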
References

 SQL join diagram from: http://www.codeproject.com/Articles/33052/Visual-Representation-of-SQL-Joins

 Graphics from: http://blog.codinghorror.com/a-visual-explanation-of-sql-joins/

 EXPLAIN statement details: http://www.sitepoint.com/using-explain-to-write-better-mysql-queries/
