SQL For Data Science
Vincent ISOZ
V6.0 r13
2018-12-09
oUUID 1839
Vincent ISOZ Structured Query Language/SQL
Table Of Contents
1 Useful Links
2 Introduction
2.1 History
2.2 Syntax
2.3 Procedural extensions
2.4 Standardization
2.5 Well Known RDBMS using SQL
2.6 Why IBM Oracle at University?
2.7 Recommended References
3 Leszynski-Reddick Naming convention
4 SQL for DML (Data Manipulation Language)
4.1 Comments in SQL
4.2 SQL Version
4.3 SQL SELECT Statement
4.4 SQL USE Statement
4.4.1 SQL DESCRIBE
4.4.2 SQL Aliases
4.4.3 SQL COLLATION Statement
4.4.4 SQL random sample
4.5 SQL UNION
4.6 SQL SELECT DISTINCT and DISTINCTROW Statement
4.7 SQL WHERE Clause
4.7.1 WHERE with interactive parameters
4.7.2 WHERE using COLLATION
4.7.3 WHERE using IS NULL or IS NOT NULL
4.8 SQL AND & OR Operators
4.9 SQL ORDER BY Keyword
4.10 SQL INSERT INTO Statement
4.10.1 Insert a Null value
4.10.2 Copy the rows of a table into another one
4.11 SQL UPDATE Statement
4.12 SQL DELETE Statement
4.13 SQL SELECT TOP (and aka BOTTOM) Clause
4.14 SQL LIKE Operator
4.14.1 SQL Wildcards
4.14.2 SQL REGEX
4.15 SQL IN Operator
4.16 SQL BETWEEN and NOT BETWEEN Operators
4.17 SQL Cartesian Product
4.18 SQL JOIN
4.18.1 SQL INNER JOIN statement
4.18.1.1 INNER JOIN with 2 tables
4.18.1.2 INNER JOIN with 4 tables
4.18.2 SQL LEFT JOIN statement (OUTER JOIN family)
4.18.3 SQL RIGHT JOIN statement (OUTER JOIN family)
4.18.4 SQL FULL OUTER JOIN statement (OUTER JOIN family)
4.18.5 SQL SELF JOIN (circular join) like syntax
1 Useful Links
The most important link is the first one, as it lets you use Oracle Enterprise online for free to
train your skills:
http://www.oracle.com/technetwork/database/application-development/livesql/index.html
http://www.google.com
http://www.youtube.com
http://sqlformat.org/
http://www.oracle.com/technetwork/documentation/index.html#database
http://www.dba-oracle.com
http://www.sqlines.com/online
1 B1: first year of Bachelor, B2: second year of Bachelor, B3: third year of Bachelor, M1: first year of Master,
M2: second year of Master, PhD: "Philosophiæ doctor" level (= M2 + [1;4])
2 Introduction
The purpose of this PDF is to introduce the basics of SQL for data scientists in a 5-day
training course. The most important chapter for data scientists is the last one, at page
314. Database files are given only to the people who attend my courses.
SQL (Structured Query Language) is a special-purpose data query language designed for managing
data held in a relational database management system (RDBMS). There are obviously (and sadly…)
other data query languages, for example (for a more exhaustive list refer to Wikipedia):
• XPath
• DAX
• M
• dplyr
• data.table
• pandas
• jQuery
• …
Originally based upon relational algebra and tuple relational calculus, SQL consists mainly of a data
definition language and a data manipulation language. The scope of SQL includes data insert,
query, update and delete, schema creation and modification, and data access control. Although
SQL is often described as, and to a great extent is, a declarative language (4GL), it also includes
procedural elements.
SQL was one of the first commercial languages for Edgar F. Codd's relational model, as described in
his influential 1970 paper "A Relational Model of Data for Large Shared Data Banks". Despite not
entirely adhering to the relational model as described by Codd, it became the most widely used
database language.
SQL became a standard of the American National Standards Institute (ANSI) in 1986, and of the
International Organization for Standardization (ISO) in 1987. Since then, the standard has been
enhanced several times with added features. Despite these standards, code is not completely
portable among different database systems, which can lead to vendor lock-in. The different makers
do not perfectly adhere to the standard, for instance by adding extensions, and the standard itself
is sometimes ambiguous.
2.1 History
SQL was initially developed at IBM by Donald D. Chamberlin and Raymond F. Boyce in the early
1970s when IBM created the first databases (on the basis of a paper written by the mathematician
Edgar Frank Codd). This version, initially called SEQUEL (Structured English Query Language),
was designed to manipulate and retrieve data stored in IBM's original quasi-relational database
management system, System R, which a group at IBM San Jose Research Laboratory had
developed during the 1970s. The acronym SEQUEL was later changed to SQL because "SEQUEL"
was a trademark of the UK-based Hawker Siddeley aircraft company.
In the late 1970s, Relational Software, Inc. (now Oracle Corporation) saw the potential of the
concepts described by Codd, Chamberlin, and Boyce and developed their own SQL-based RDBMS
with aspirations of selling it to the U.S. Navy, Central Intelligence Agency, and other U.S.
government agencies. In June 1979, Relational Software, Inc. introduced the first commercially
available implementation of SQL, Oracle V2 (Version2) for VAX computers.
After testing SQL at customer test sites to determine the usefulness and practicality of the system,
IBM began developing commercial products based on their System R prototype including
System/38, SQL/DS, and DB2, which were commercially available in 1979, 1981, and 1983,
respectively.
2.2 Syntax
The SQL language is subdivided into several language elements, including:
• Clauses, which are constituent components of statements and queries. (In some cases,
these are optional.)
• Predicates, which specify conditions that can be evaluated to SQL three-valued logic
(3VL) (true/false/unknown) or Boolean truth values and which are used to limit the effects
of statements and queries, or to change program flow.
• Queries, which retrieve the data based on specific criteria. This is an important element
of SQL.
• Statements, which may have a persistent effect on schemata and data, or which may
control transactions, program flow, connections, sessions, or diagnostics.
• SQL statements also include the semicolon (";") statement terminator. Though not required
on every platform, it is defined as a standard part of the SQL grammar.
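As a rough illustration of these elements, the two statements below can be read piece by piece. This is only a sketch: it assumes the Northwind "Customers" table used later in this document.

```sql
-- A statement with a persistent effect on the data, closed by the ";" terminator.
UPDATE Customers                              -- UPDATE clause
SET City = 'Hamburg'                          -- SET clause
WHERE CustomerName = 'Alfreds Futterkiste';   -- WHERE clause holding a predicate

-- A query: it only retrieves data.
SELECT CustomerName, City                     -- SELECT clause
FROM Customers                                -- FROM clause
WHERE Country = 'Germany';                    -- predicate, true/false/unknown for each row
```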
In addition to the standard SQL/PSM extensions and proprietary SQL extensions, procedural
and object-oriented programmability is available on many SQL platforms via DBMS integration with
other languages. The SQL standard defines SQL/JRT extensions (SQL Routines and Types for the
Java Programming Language) to support Java code in SQL databases. SQL Server 2005 uses
the SQLCLR (SQL Server Common Language Runtime) to host managed .NET assemblies in the
database, while prior versions of SQL Server were restricted to using unmanaged extended stored
procedures that were primarily written in C. PostgreSQL allows functions to be written in a wide
variety of languages including Perl, Python, Tcl, and C.
2.4 Standardization
SQL was adopted as a standard by the American National Standards Institute (ANSI) in 1986 as
SQL-86 and the International Organization for Standardization (ISO) in 1987. Nowadays the
standard is subject to continuous improvement by the Joint Technical Committee ISO/IEC JTC 1,
Information technology, Subcommittee SC 32, Data management and interchange, which is affiliated
with ISO as well as IEC. It is commonly denoted by the pattern: ISO/IEC 9075-n:yyyy Part n: title,
or, as a shortcut, ISO/IEC 9075.
Until 1996, the National Institute of Standards and Technology (NIST) data management standards
program certified SQL DBMS compliance with the SQL standard. Vendors now self-certify the
compliance of their products.
The original SQL standard declared that the official pronunciation for SQL is "es queue el". Many
English-speaking database professionals still use the original pronunciation /ˈsiːkwəl/ (like the word
"sequel"), including Donald Chamberlin himself.
• HyperFileSQL
• Ingres
• Oracle
• PostgreSQL
• H2
All these systems have particularities, some of which are not found in the others.
Moreover, it is always worth consulting the reference manual of the RDBMS when writing special
or complex queries, as well as when optimizing them.
• Oracle has index choices that are much more interesting for advanced data management
• The SQL language of Oracle has graduate-level statistical functions that others don't have
B1-B3 Level
http://en.wikipedia.org/wiki/Leszynski_naming_convention
http://www.w3schools.com/sql/
Our SQL tutorial will teach you how to use SQL to access and manipulate data in: MySQL,
SQL Server, Access, Oracle, Sybase, DB2, and other database systems.
With our online SQL editor, you can edit the SQL statements, and click on a button to view the
result.
Example:
SELECT * FROM Customers;
In this tutorial we will use the well-known Northwind sample database (included in
MS Access and MS SQL Server).
Keep in mind that SQL keywords are NOT case sensitive: "SELECT" is the same as "select".
Some database systems require a semicolon at the end of each SQL statement. The semicolon is
the standard way to separate SQL statements in database systems that allow more than
one SQL statement to be executed in the same call to the server. In this tutorial, we will use
a semicolon at the end of each SQL statement.
Comments either run from -- to the end of the line, or are enclosed in /*...*/, as in C.
Thus:
-- This is a comment
SELECT * /* and so is this */
FROM R;
To get the version of the server on MySQL or SQL Server:
SELECT @@version;
and in Oracle:
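The original Oracle statement is not reproduced here. One common way to get the same information, given as a sketch, is to query the V$VERSION dynamic view:

```sql
-- Oracle: returns one row per component (database, PL/SQL, ...)
SELECT banner FROM v$version;
```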
And on Oracle:
Still on Oracle, to get all the tables of the current database, use:
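A possible way to list the tables, assuming you only need the standard Oracle data dictionary views (the exact query used in the course may differ):

```sql
-- Tables owned by the connected user
SELECT table_name
FROM user_tables
ORDER BY table_name;

-- Tables the connected user is allowed to read, whoever owns them
SELECT owner, table_name
FROM all_tables
ORDER BY owner, table_name;
```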
SELECT column_name,column_name
FROM table_name;
and:
SELECT * FROM table_name;
When you select only a few columns, we say we're using an "SQL projection"...
4 Around the Horn Thomas Hardy 120 Hanover Sq. London WA1 1DP UK
The following SQL statement selects the "CustomerName" and "City" columns from the
"Customers" table:
The following SQL statement selects all the columns from the "Customers" table:
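The two statements just described can be sketched as follows, assuming the Northwind "Customers" table and its column names as used throughout this tutorial:

```sql
-- Projection: only two columns
SELECT CustomerName, City
FROM Customers;

-- All the columns
SELECT *
FROM Customers;
```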
USE dbNorthwind
Or, depending on the technology, here is an example of the beginning of a query using
two tables from two different databases (SQL Server):
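The original example is not reproduced here. A minimal sketch of the idea, in which dbSales is a hypothetical second database and other_user a hypothetical second user (schema):

```sql
-- SQL Server: three-part names, database.schema.table
SELECT c.CustomerName, o.OrderID
FROM dbNorthwind.dbo.Customers AS c
INNER JOIN dbSales.dbo.Orders AS o
        ON o.CustomerID = c.CustomerID;

-- Two-part names, schema.table (also the Oracle way): other_user owns the table
SELECT *
FROM other_user.Customers;
```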
where other user is the name (without quotes!) of another user who has access to another
schema.
SELECT column_name(s)
FROM table_name AS alias_name;
....
10643 1 6 1997-08-25 1
10644 88 3 1997-08-25 2
10645 34 4 1997-08-26 1
...
The following SQL statement specifies two aliases, one for the CustomerName column and one for
the ContactName column.
Tip: An alias requires double quotation marks or square brackets if it contains spaces:
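A minimal sketch of such a statement; the alias names Customer and "Contact Person" are chosen here for illustration:

```sql
SELECT CustomerName AS Customer,
       ContactName  AS "Contact Person"   -- [Contact Person] in MS Access and SQL Server
FROM Customers;
```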
It will give:
...
In the following SQL statement, we combine four columns (Address, City, PostalCode, and Country)
and create an alias named "Address":
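One way to write this, assuming the standard || concatenation operator (Oracle, PostgreSQL); other systems use a different operator, as noted in the comments:

```sql
SELECT CustomerName,
       Address || ', ' || City || ', ' || PostalCode || ', ' || Country AS Address
FROM Customers;

-- SQL Server: replace || with +
-- MS Access:  replace || with &
-- MySQL:      CONCAT(Address, ', ', City, ', ', PostalCode, ', ', Country)
```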
it will give:
CustomerName Address
Ana Trujillo Emparedados y helados Avda. de la Constitución 2222, México D.F., 05021, Mexico
...
The following SQL statement selects all the orders from the customer "Alfreds Futterkiste". We use
the "Customers" and "Orders" tables, and give them the table aliases of "c" and "o" respectively
(Here we have used aliases to make the SQL shorter):
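A sketch of this query, assuming the Northwind column names. The table aliases are written without AS because Oracle does not accept AS in front of a table alias, while the other systems accept both forms:

```sql
SELECT o.OrderID, o.OrderDate, c.CustomerName
FROM Customers c, Orders o
WHERE c.CustomerName = 'Alfreds Futterkiste'
  AND c.CustomerID = o.CustomerID;
```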
Aliases are useful, for instance, when more than one table is involved in a query (see JOINs later).
To see what collation is, we will focus on Oracle. First create the following table:
As you can see, the rows are returned in the order of the binary (ASCII) codes of the characters. To get
a result better suited to your language, you have to specify your collation using a National
Language Support (NLS) statement:
You can see all the available collations by running the following query:
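The collation steps described above can be sketched as follows on Oracle. The table demo_collation, its column and its rows are hypothetical; only the NLS functions and views are Oracle's own:

```sql
-- Hypothetical demo table
CREATE TABLE demo_collation (word VARCHAR2(30));
INSERT INTO demo_collation VALUES ('zebra');
INSERT INTO demo_collation VALUES ('Zurich');
INSERT INTO demo_collation VALUES ('été');
INSERT INTO demo_collation VALUES ('eau');

-- Default binary sort: ordered by character code
SELECT word FROM demo_collation ORDER BY word;

-- Linguistic sort for one query only
SELECT word FROM demo_collation ORDER BY NLSSORT(word, 'NLS_SORT = FRENCH');

-- Or for the whole session
ALTER SESSION SET NLS_SORT = FRENCH;

-- List of the available sort collations
SELECT value FROM v$nls_valid_values WHERE parameter = 'SORT';
```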
Here is the syntax to take a random sample of 10 rows in Oracle (on SQL Server, the usual
equivalent combines SELECT TOP with ORDER BY NEWID()):
SELECT *
FROM (
    SELECT *
    FROM DEMO_ORDER_ITEMS
    ORDER BY dbms_random.value
)
WHERE rownum <= 10;
Here is the syntax to take a random sample of about 25 percent of the total number of rows in Oracle:
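A possible form, assuming Oracle's SAMPLE clause is what is meant here. The percentage is a probability applied to each row, so the number of rows returned is only approximately 25% of the table:

```sql
SELECT *
FROM DEMO_ORDER_ITEMS SAMPLE (25);
```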
Notice that each SELECT statement within the UNION must have the same number of columns. The
columns must also have similar data types. Also, the columns in each SELECT statement must be
in the same order.
Note: The UNION operator selects only distinct values by default. To allow duplicate values, use
the ALL keyword with UNION.
Note: The column names in the result-set of a UNION are usually equal to the column names in
the first SELECT statement in the UNION.
2 New Orleans Cajun Delights Shelley Burke P.O. Box 78934 New Orleans 70117 USA
3 Grandma Kelly's Homestead Regina Murphy 707 Oxford Rd. Ann Arbor 48104 USA
The following SQL statement selects all the different cities (only distinct values) from the
"Customers" and the "Suppliers" tables:
Note: UNION cannot be used to list ALL cities from the two tables. If several customers and
suppliers share the same city, each city will only be listed once. UNION selects only distinct values.
Use UNION ALL to also select duplicate values!
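The UNION query on distinct cities described above can be sketched as follows, assuming the Northwind "Customers" and "Suppliers" tables:

```sql
SELECT City FROM Customers
UNION
SELECT City FROM Suppliers
ORDER BY City;
```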
The following SQL statement uses UNION ALL to select all (duplicate values also) cities from the
"Customers" and "Suppliers" tables:
The following SQL statement uses UNION ALL to select all (duplicate values also) German cities
from the "Customers" and "Suppliers" tables:
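The two UNION ALL statements just described could look like this (same assumptions on table and column names):

```sql
-- All cities, duplicates included
SELECT City FROM Customers
UNION ALL
SELECT City FROM Suppliers
ORDER BY City;

-- German cities only, duplicates included
SELECT City, Country FROM Customers
WHERE Country = 'Germany'
UNION ALL
SELECT City, Country FROM Suppliers
WHERE Country = 'Germany'
ORDER BY City;
```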
With Oracle you can sometimes do something interesting: adding a separation line, which seems
to work only with UNION ALL:
The DISTINCT keyword can be used to return only distinct (different) values.
The following SQL statement selects only the distinct values from the "City" column of the
"Customers" table:
The following SQL statement, which seems to exist only in Microsoft Access, selects the "City"
column of the "Customers" table while keeping only the records that are distinct over the whole
table (that is, also taking into account the columns not listed in the SELECT statement):
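The DISTINCT and DISTINCTROW statements described above can be sketched as follows, assuming the Northwind "Customers" table:

```sql
-- Each city listed once
SELECT DISTINCT City
FROM Customers;

-- MS Access only: duplicates are judged on the entire record, not just on City
SELECT DISTINCTROW City
FROM Customers;
```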
The following SQL statement selects all the customers from the country "Mexico", in the
"Customers" table:
SQL requires single quotes around text values (some database systems, such as MySQL and MS Access,
also allow double quotes).
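A minimal sketch of the Mexico query, followed by a numeric comparison to show that numbers take no quotes:

```sql
-- Text value: single quotes
SELECT * FROM Customers
WHERE Country = 'Mexico';

-- Numeric value: no quotes
SELECT * FROM Customers
WHERE CustomerID = 1;
```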
Operator Description
= Equal
<> Not equal. Note: In some versions of SQL this operator may be written as !=
:OneWord
As you can see, the system is now case insensitive (CI) but still sensitive to accents!
To make the query both case insensitive and accent insensitive, just write:
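The original statements are not reproduced here. On Oracle, one way to obtain this behaviour, given as a sketch, is to combine the NLS_COMP and NLS_SORT session parameters:

```sql
ALTER SESSION SET NLS_COMP = LINGUISTIC;  -- compare text using the NLS_SORT rules
ALTER SESSION SET NLS_SORT = BINARY_AI;   -- AI: accent and case insensitive (BINARY_CI: case only)

-- With these settings, this also matches 'México D.F.'
SELECT * FROM Customers
WHERE City = 'mexico d.f.';
```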
As you can see, there are no results. If you try the following you will get the same problem:
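The statements of this page are not reproduced here. A likely illustration of the problem, assuming that some rows of "Customers" have no PostalCode: any comparison with NULL evaluates to unknown, so only IS NULL and IS NOT NULL behave as expected.

```sql
SELECT * FROM Customers WHERE PostalCode = NULL;     -- never returns a row
SELECT * FROM Customers WHERE PostalCode <> NULL;    -- never returns a row either

SELECT * FROM Customers WHERE PostalCode IS NULL;      -- rows without a postal code
SELECT * FROM Customers WHERE PostalCode IS NOT NULL;  -- rows with a postal code
```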
The OR operator displays a record if either the first condition OR the second condition is true.
Remark: A direct XOR operator does not exist in standard SQL (MySQL does provide one)! Elsewhere
you must use a logical workaround to get it.
The following SQL statement selects all customers from the country "Germany" AND the city
"Berlin", in the "Customers" table:
The following SQL statement selects all customers from the city "Berlin" OR "München", in the
"Customers" table:
You can also combine AND and OR (use parenthesis to form complex expressions).
The following SQL statement selects all customers from the country "Germany" AND the city must
be equal to "Berlin" OR "München", in the "Customers" table:
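The three statements above, plus one possible XOR workaround, can be sketched as follows. The pair of conditions in the XOR example is arbitrary and chosen only for illustration:

```sql
SELECT * FROM Customers
WHERE Country = 'Germany' AND City = 'Berlin';

SELECT * FROM Customers
WHERE City = 'Berlin' OR City = 'München';

SELECT * FROM Customers
WHERE Country = 'Germany'
  AND (City = 'Berlin' OR City = 'München');

-- XOR workaround: one condition or the other, but not both
SELECT * FROM Customers
WHERE (Country = 'Germany' OR City = 'London')
  AND NOT (Country = 'Germany' AND City = 'London');
```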
The ORDER BY keyword sorts the records in ascending order by default. To sort the records in a
descending order, you can use the DESC keyword.
SELECT column_name,column_name
FROM table_name
ORDER BY column_name,column_name ASC|DESC;
The following SQL statement selects all customers from the "Customers" table, sorted by the
"Country" column:
The following SQL statement selects all customers from the "Customers" table, sorted
DESCENDING by the "Country" column:
The following SQL statement selects all customers from the "Customers" table, sorted by the
"Country" and the "CustomerName" column:
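The three ORDER BY statements described above can be sketched as follows, assuming the Northwind "Customers" table:

```sql
-- Ascending (default) by country
SELECT * FROM Customers
ORDER BY Country;

-- Descending by country
SELECT * FROM Customers
ORDER BY Country DESC;

-- By country, then by customer name within each country
SELECT * FROM Customers
ORDER BY Country, CustomerName;
```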
The first form does not specify the column names where the data will be inserted, only their
values:
The second form specifies both the column names and the values to be inserted:
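As a syntax sketch of these two forms (table_name, column1 and value1 are placeholders, not real objects):

```sql
-- First form: values only, given in the column order of the table definition
INSERT INTO table_name
VALUES (value1, value2, value3);

-- Second form: explicit column list, so the order and the omitted columns are under your control
INSERT INTO table_name (column1, column2, column3)
VALUES (value1, value2, value3);
```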
We will see later the INSERT ALL statement to insert multiple rows at once!
Assume we wish to insert a new row in the "Customers" table. We can use the following SQL
statement:
The CustomerID column is automatically filled with a unique number for each record in the table
when you use the INSERT INTO statement, provided it is defined as an auto-increment column.
The following SQL statement will insert a new row, but only insert data in the "CustomerName",
"City", and "Country" columns (and the CustomerID field will of course also be updated
automatically):
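The two inserts described above could look like this. The customer values are made up for illustration, and CustomerID is assumed to be generated automatically:

```sql
-- Full row (every column except the automatic CustomerID)
INSERT INTO Customers (CustomerName, ContactName, Address, City, PostalCode, Country)
VALUES ('Cardinal', 'Tom B. Erichsen', 'Skagen 21', 'Stavanger', '4006', 'Norway');

-- Only three columns; the other columns receive NULL (or their default value)
INSERT INTO Customers (CustomerName, City, Country)
VALUES ('Cardinal', 'Stavanger', 'Norway');
```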
then you can copy some or all of the rows of the original into the new one:
To create a copy of a table with both its structure and its data, you can simply use:
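A sketch of both operations; CustomersBackup and CustomersCopy are hypothetical table names. Note that the copy generally does not carry over indexes and most constraints:

```sql
-- Copy some rows into an existing table that has the same columns
INSERT INTO CustomersBackup
SELECT * FROM Customers
WHERE Country = 'Germany';

-- Create a new table with the structure and the data (Oracle, MySQL, PostgreSQL)
CREATE TABLE CustomersCopy AS
SELECT * FROM Customers;

-- SQL Server and MS Access use instead:
-- SELECT * INTO CustomersCopy FROM Customers;
```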
Notice the WHERE clause in the SQL UPDATE statement! The WHERE clause specifies which
record or records should be updated. If you omit the WHERE clause, all records will be
updated!
Assume we wish to update the customer "Alfreds Futterkiste" with a new contact person and city.
UPDATE Customers
SET ContactName='Alfred Schmidt', City='Hamburg'
WHERE CustomerName='Alfreds Futterkiste';
Notice the WHERE clause in the SQL DELETE statement! The WHERE clause specifies which
record or records should be deleted. If you omit the WHERE clause, all records will be deleted!
Assume we wish to delete the customer "Alfreds Futterkiste" from the "Customers" table.
It is possible to delete all rows in a table without deleting the table. This means that the table
structure, attributes, and indexes will be intact:
Note: Be very careful when deleting records. Once the deletion is committed, you cannot undo it!
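The two deletions described above can be sketched as follows. Running a SELECT with the same WHERE clause first is a habit suggested here, not something the original text prescribes:

```sql
-- Preview the rows that would be deleted
SELECT * FROM Customers
WHERE CustomerName = 'Alfreds Futterkiste';

-- Delete one customer
DELETE FROM Customers
WHERE CustomerName = 'Alfreds Futterkiste';

-- Delete all the rows; the table itself, its columns and its indexes remain
DELETE FROM Customers;
```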
The SELECT TOP clause can be very useful on large tables with thousands of records. Returning a
large number of records can impact on performance.
Note: Not all database systems support the SELECT TOP clause.
On SQL Server and MS Access, the following statement selects the first two records from the "Customers" table
or, with PERCENT, the first 50% of the records from the "Customers" table:
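A minimal sketch of both variants (SQL Server and MS Access syntax):

```sql
-- First two records
SELECT TOP 2 *
FROM Customers;

-- First half of the records
SELECT TOP 50 PERCENT *
FROM Customers;
```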
MySQL Syntax:
SELECT column_name(s)
FROM table_name
LIMIT number;
SELECT *
FROM Persons
LIMIT 5;
Oracle Syntax:
SELECT column_name(s)
FROM table_name
WHERE ROWNUM <= number;
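One point worth illustrating on Oracle: ROWNUM is assigned before ORDER BY is applied, so a "top" or "bottom" selection needs the sort inside a subquery. The sketch below assumes the Northwind "Customers" table; the last statement requires Oracle 12c or later.

```sql
-- "Top 5" by name: sort first, then cut
SELECT *
FROM (SELECT * FROM Customers ORDER BY CustomerName)
WHERE ROWNUM <= 5;

-- "Bottom 5": same idea with a descending sort
SELECT *
FROM (SELECT * FROM Customers ORDER BY CustomerName DESC)
WHERE ROWNUM <= 5;

-- Standard row-limiting clause (Oracle 12c and later)
SELECT *
FROM Customers
ORDER BY CustomerName
FETCH FIRST 5 ROWS ONLY;
```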
The following SQL statement selects all customers with a City starting with the letter "s":
or on MS Access:
The following SQL statement selects all customers with a Country containing the pattern "land":
Using the NOT keyword allows you to select records that do NOT match the pattern.
The following SQL statement selects all customers with a Country NOT containing the pattern
"land":
Wildcard Description
The following SQL statement selects all customers with a City starting with any character, followed
by "erlin":
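The LIKE patterns described so far can be sketched as follows, assuming the Northwind "Customers" table. Whether 's%' also matches "Stavanger" depends on the system: LIKE is case sensitive on Oracle by default, while the default collations of MySQL and SQL Server are usually case insensitive.

```sql
-- City starting with "s" (MS Access in its default mode: LIKE "s*")
SELECT * FROM Customers
WHERE City LIKE 's%';

-- Country containing "land"
SELECT * FROM Customers
WHERE Country LIKE '%land%';

-- Country NOT containing "land"
SELECT * FROM Customers
WHERE Country NOT LIKE '%land%';

-- Any single character followed by "erlin"
SELECT * FROM Customers
WHERE City LIKE '_erlin';
```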
In MS Access the following statement works for exactly one letter (but that's not standard SQL):
The following SQL statement selects all customers with a City starting with "b", "s", or "p":
The following SQL statement selects all customers with a City starting with "a", "b", or "c":
The following SQL statement selects all customers with a City NOT starting with "b", "s", or "p":
Or the equivalent:
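The character-list patterns above can be sketched as follows. They are extensions of SQL Server and MS Access, not standard SQL, and they do not work with LIKE on MySQL or Oracle:

```sql
-- MS Access, one arbitrary letter: ? instead of _
-- SELECT * FROM Customers WHERE City LIKE "?erlin";

-- City starting with "b", "s", or "p"
SELECT * FROM Customers
WHERE City LIKE '[bsp]%';

-- City starting with "a", "b", or "c"
SELECT * FROM Customers
WHERE City LIKE '[a-c]%';

-- City NOT starting with "b", "s", or "p" (MS Access; SQL Server writes '[^bsp]%')
SELECT * FROM Customers
WHERE City LIKE '[!bsp]%';

-- Equivalent form
SELECT * FROM Customers
WHERE City NOT LIKE '[bsp]%';
```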
Will be written:
And:
Will be written:
SQL IN Syntax:
SELECT column_name(s)
FROM table_name
WHERE column_name IN (value1,value2,...);
The following SQL statement selects all customers with a City of "Paris" or "London":
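A minimal sketch of this statement, assuming the Northwind "Customers" table:

```sql
SELECT * FROM Customers
WHERE City IN ('Paris', 'London');
```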
MS Access will automatically rewrite the same statement as follows (but the previous syntax will still
work):
SELECT * FROM Customers
WHERE (City="Paris") OR (City="London");
The following SQL statement selects all products with a price BETWEEN 10 and 20:
To display the products outside the range of the previous example, use NOT BETWEEN:
The following SQL statement selects all products with a price BETWEEN 10 and 20, but products
with a CategoryID of 1,2, or 3 should not be displayed:
The following SQL statement selects all products with a ProductName beginning with any of the
letters BETWEEN 'C' and 'M':
The following SQL statement selects all products with a ProductName beginning with any of the
letters NOT BETWEEN 'C' and 'M':
The following SQL statement selects all orders with an OrderDate BETWEEN '04-July-1996' and
'09-July-1996':
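The BETWEEN statements of this section can be sketched as follows, assuming the Northwind "Products" and "Orders" tables. The date literals use the ANSI form DATE 'yyyy-mm-dd' (Oracle, MySQL, PostgreSQL); MS Access writes dates between # signs instead.

```sql
SELECT * FROM Products
WHERE Price BETWEEN 10 AND 20;

SELECT * FROM Products
WHERE Price NOT BETWEEN 10 AND 20;

SELECT * FROM Products
WHERE (Price BETWEEN 10 AND 20)
  AND CategoryID NOT IN (1, 2, 3);

-- Caution: a name such as 'Mascarpone' sorts after 'M' and is therefore excluded
SELECT * FROM Products
WHERE ProductName BETWEEN 'C' AND 'M';

SELECT * FROM Products
WHERE ProductName NOT BETWEEN 'C' AND 'M';

SELECT * FROM Orders
WHERE OrderDate BETWEEN DATE '1996-07-04' AND DATE '1996-07-09';
```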
You will then get the cartesian product of all the combinations... which is surely not what you
are expecting. See next what the JOIN operator does.
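As a reminder of what produces this cartesian product, here is a sketch assuming the Northwind "Customers" and "Orders" tables: without any join condition, every customer is paired with every order, so the result has (number of customers) x (number of orders) rows.

```sql
-- Implicit cartesian product: two tables, no condition linking them
SELECT c.CustomerName, o.OrderID
FROM Customers c, Orders o;

-- Same result, written explicitly (not available in MS Access)
SELECT c.CustomerName, o.OrderID
FROM Customers c
CROSS JOIN Orders o;
```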
SELECT column_name(s)
FROM table1
INNER JOIN table2
ON table1.column_name=table2.column_name;
or:
SELECT column_name(s)
FROM table1
JOIN table2
ON table1.column_name=table2.column_name;
PS! INNER JOIN is the same as JOIN.
...
10308 2 7 1996-09-18 3
10309 37 3 1996-09-19 1
10310 77 8 1996-09-20 2
...
The following SQL statement will return all customers with orders:
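A sketch of this query, assuming the Northwind column names. Customers without any order do not appear in the result:

```sql
SELECT c.CustomerName, o.OrderID
FROM Customers c
INNER JOIN Orders o
        ON c.CustomerID = o.CustomerID
ORDER BY c.CustomerName;
```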
and compare this query with the example of the cartesian product:
The following SQL statement will return all customers with orders and the name of the salesperson:
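The original query, which joins four tables, is not reproduced here. A reduced sketch of the same idea with three Northwind tables, chaining one INNER JOIN per additional table:

```sql
SELECT c.CustomerName, o.OrderID, e.LastName, e.FirstName
FROM Customers c
INNER JOIN Orders o
        ON o.CustomerID = c.CustomerID
INNER JOIN Employees e
        ON e.EmployeeID = o.EmployeeID
ORDER BY c.CustomerName;
```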
We will use this query later for the study of the CROSS JOIN statement.
It can be seen that the SQL generated by Microsoft Access is far from ideal... even if this code
works perfectly when copied straight into MySQL or another system (you just need to adapt the name
of one table and of one of the fields!).
Conversely, the opposite does not hold! Copying the SQL given at the beginning into Microsoft
Access will not work (even after adapting the small differences in names)!
Remarks: Starting with Oracle9i, the confusing outer join syntax using the '(+)' notation has been
superseded by the ISO 1999 outer join syntax. As we know, there are three types of outer joins: left,
right, and full outer join. The purpose of an outer join is to include non-matching rows, and the
outer join returns the missing column values as NULL.
SELECT column_name(s)
FROM table1
LEFT JOIN table2
ON table1.column_name=table2.column_name;
or:
SELECT column_name(s)
FROM table1
LEFT OUTER JOIN table2
ON table1.column_name=table2.column_name;
...
10308 2 7 1996-09-18 3
10309 37 3 1996-09-19 1
10310 77 8 1996-09-20 2
...
The following SQL statement will return all customers, and any orders they might have:
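A sketch of this query, assuming the Northwind column names, followed by its equivalent in the old Oracle '(+)' notation mentioned in the remarks above:

```sql
-- ISO syntax
SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT JOIN Orders o
       ON c.CustomerID = o.CustomerID
ORDER BY c.CustomerName;

-- Old Oracle syntax: (+) marks the side that may have no matching row
SELECT c.CustomerName, o.OrderID
FROM Customers c, Orders o
WHERE c.CustomerID = o.CustomerID (+)
ORDER BY c.CustomerName;
```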
The LEFT JOIN keyword will then return all the rows from the left table (Customers), even if there
are no matches in the right table (Orders).
CustomerName OrderID
...
10308 2 7 1996-09-18 3
10309 37 3 1996-09-19 1
10310 77 8 1996-09-20 2
...
...
The following SQL statement will return all employees, and any orders they have sold:
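A sketch of this query, assuming the Northwind "Orders" and "Employees" tables:

```sql
SELECT o.OrderID, e.FirstName
FROM Orders o
RIGHT JOIN Employees e
        ON o.EmployeeID = e.EmployeeID
ORDER BY o.OrderID;
```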
OrderID FirstName
Adam
10248 Steven
10249 Michael
10250 Margaret
10251 Janet
...
As you can see, Adam never sold anything but is still visible. Now try the same query with an
INNER JOIN instead of the RIGHT JOIN and you will see that you then get only the employees who
did sell something (Adam will not be visible anymore).
You seem to be asking, "If I can rewrite a RIGHT OUTER JOIN using LEFT
OUTER JOIN syntax then why have a RIGHT OUTER JOIN syntax at all?" I think
the answer to this question is, because the designers of the language
didn't want to place such a restriction on users (and I think they would
have been criticized if they did), which would force users to change the
order of tables in the FROM clause in some circumstances when merely
changing the join type.
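To make the last two points concrete, here is a sketch (same Northwind assumptions) of the RIGHT JOIN rewritten as a LEFT JOIN by swapping the tables, and of the INNER JOIN variant in which Adam disappears:

```sql
-- Same result as the RIGHT JOIN: the tables are swapped in the FROM clause
SELECT o.OrderID, e.FirstName
FROM Employees e
LEFT JOIN Orders o
       ON o.EmployeeID = e.EmployeeID
ORDER BY o.OrderID;

-- INNER JOIN: only the employees with at least one order
SELECT o.OrderID, e.FirstName
FROM Orders o
INNER JOIN Employees e
        ON o.EmployeeID = e.EmployeeID
ORDER BY o.OrderID;
```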
The FULL OUTER JOIN keyword combines the result of both LEFT and RIGHT joins.
MySQL & Microsoft Access lack support for FULL OUTER JOIN!!!
Below is a selection from the "Customers" table:
...
10308 2 7 1996-09-18 3
10309 37 3 1996-09-19 1
10310 77 8 1996-09-20 2
The following SQL statement selects all customers, and all orders:
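A sketch of this query on a system that supports FULL OUTER JOIN (Oracle, SQL Server, PostgreSQL), assuming the Northwind column names:

```sql
SELECT c.CustomerName, o.OrderID
FROM Customers c
FULL OUTER JOIN Orders o
             ON c.CustomerID = o.CustomerID
ORDER BY c.CustomerName;
```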
CustomerName OrderID
Alfreds Futterkiste
10382
10351
...
To do the same in MySQL you will need (enjoy not being on Oracle)...:
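The usual emulation, given as a sketch, is the UNION of a LEFT JOIN and a RIGHT JOIN. Be aware that UNION also merges rows that are genuinely identical in the result:

```sql
SELECT c.CustomerName, o.OrderID
FROM Customers c
LEFT JOIN Orders o
       ON c.CustomerID = o.CustomerID
UNION
SELECT c.CustomerName, o.OrderID
FROM Customers c
RIGHT JOIN Orders o
        ON c.CustomerID = o.CustomerID;
```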
I let you imagine what dealing with multiple FULL JOINs in MySQL looks like…
For this example, first create the following table structure in Oracle:
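The original CREATE TABLE statement is not reproduced here. A plausible structure, deduced from the column list of the INSERT statements that follow; the data types and sizes are assumptions:

```sql
CREATE TABLE hier_employees (
    employeeNumber NUMBER(10)    PRIMARY KEY,
    lastName       VARCHAR2(50)  NOT NULL,
    firstName      VARCHAR2(50)  NOT NULL,
    extension      VARCHAR2(10),
    email          VARCHAR2(100),
    officeCode     VARCHAR2(10),
    reportsTo      NUMBER(10),   -- employeeNumber of the manager, NULL for the president
    jobTitle       VARCHAR2(50)
);
```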
INSERT ALL
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1002,'Murphy','Diane','x5800','dmurphy@classicmodelcars.com','1',NULL,'President')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1056,'Patterson','Mary','x4611','mpatterso@classicmodelcars.com','1',1002,'VP Sales')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1076,'Firrelli','Jeff','x9273','jfirrelli@classicmodelcars.com','1',1002,'VP Marketing')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1088,'Patterson','William','x4871','wpatterson@classicmodelcars.com','6',1056,'Sales Manager (APAC)')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1102,'Bondur','Gerard','x5408','gbondur@classicmodelcars.com','4',1056,'Sale Manager (EMEA)')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1143,'Bow','Anthony','x5428','abow@classicmodelcars.com','1',1056,'Sales Manager (NA)')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1165,'Jennings','Leslie','x3291','ljennings@classicmodelcars.com','1',1143,'Sales Rep')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1166,'Thompson','Leslie','x4065','lthompson@classicmodelcars.com','1',1143,'Sales Rep')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1188,'Firrelli','Julie','x2173','jfirrelli@classicmodelcars.com','2',1143,'Sales Rep')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1216,'Patterson','Steve','x4334','spatterson@classicmodelcars.com','2',1143,'Sales Rep')
INTO hier_employees(employeeNumber,lastName,firstName,extension,email,officeCode,reportsTo,jobTitle) values
(1286,'Tseng','Foon Yue','x2248','ftseng@classicmodelcars.com','3',1143,'Sales Rep')
SELECT * FROM dual;
In the employees table, we store not only employee data but also the organization structure.
The REPORTSTO column is used to determine the manager ID of an employee.
In order to get the whole organization structure, we can join the HIER_EMPLOYEES table to itself
using the EMPLOYEENUMBER and REPORTSTO columns.
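Such a self join can be sketched as follows (the aliases e and m are arbitrary):

```sql
-- Self join: e is the employee, m is the same table read as "manager".
-- LEFT JOIN keeps the president, whose reportsTo is NULL.
SELECT e.employeeNumber,
       e.lastName  AS employeeLastName,
       e.jobTitle,
       m.lastName  AS managerLastName
FROM hier_employees e
LEFT JOIN hier_employees m
  ON e.reportsTo = m.employeeNumber
ORDER BY e.employeeNumber;
```

This only resolves one level of the hierarchy (each employee and the direct manager).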
• Orgcharts!
• MindMaps!
• Forum threads!
• ...
The following query builds the complete employee hierarchy, from the president down to the
lowest-level employees:
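A hierarchical query of this kind can be sketched in Oracle as follows; the selected columns are a choice made for illustration:

```sql
-- Oracle hierarchical query: start at the president (no manager)
-- and walk down by matching each parent's employeeNumber to the children's reportsTo.
SELECT LEVEL, employeeNumber, lastName, firstName, jobTitle, reportsTo
FROM hier_employees
START WITH reportsTo IS NULL
CONNECT BY PRIOR employeeNumber = reportsTo
ORDER SIBLINGS BY lastName;
```

LEVEL is 1 for the root row and increases by one at each step down the tree.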
or in a prettier way:
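One possible way to make the output more readable, sketched here as an assumption about what "prettier" means, is to indent each name according to its LEVEL:

```sql
-- Indent each employee by two spaces per hierarchy level.
SELECT LPAD(' ', 2 * (LEVEL - 1)) || lastName || ' ' || firstName AS org_chart,
       jobTitle
FROM hier_employees
START WITH reportsTo IS NULL
CONNECT BY PRIOR employeeNumber = reportsTo;
```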
There are other CONNECT BY options available in Oracle; search online for more.
You won't find very interesting examples of this query in books for non-statisticians, but remember
that we saw in the Statistics course how to run a chi-squared test of independence using a cross
table, and this is a case where such a query is very useful for linking the resulting view to
statistical software.
This type of query can also be used to generate a table with all combinations of vendor names and
sales dates, in order to do statistical forecasting for each vendor over all existing dates (see the
Quantitative Finance course).
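A minimal sketch of the vendor/date combination idea, assuming two hypothetical tables Vendors(VendorName) and Sales(VendorName, SaleDate, Amount):

```sql
-- Hypothetical tables: Vendors(VendorName) and Sales(VendorName, SaleDate, Amount).
-- The CROSS JOIN builds every (vendor, date) pair, even when no sale exists for it.
SELECT v.VendorName, d.SaleDate
FROM Vendors v
CROSS JOIN (SELECT DISTINCT SaleDate FROM Sales) d
ORDER BY v.VendorName, d.SaleDate;
```

The result can then be LEFT JOINed back to Sales so that missing combinations show up as empty amounts.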
As an example, first consider the following query on the W3Schools website:
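SELECT Customers.CustomerName, Employees.LastName, SUM(OrderDetails.Quantity) AS Quantity
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
INNER JOIN Employees
ON Employees.EmployeeID = Orders.EmployeeID
INNER JOIN OrderDetails
ON OrderDetails.OrderID = Orders.OrderID
GROUP BY Customers.CustomerName, Employees.LastName
ORDER BY Customers.CustomerName;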
And now to make a contingency table of Customers with Employees and Quantity we have:
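One way to build such a contingency table is conditional aggregation, sketched below. The two last names are assumed to exist in the W3Schools Employees table; in practice you add one SUM(CASE ...) column per employee.

```sql
-- Sketch of a contingency (cross) table: one row per customer, one column per employee.
-- 'Davolio' and 'Fuller' are example last names assumed to exist in Employees.
SELECT Customers.CustomerName,
       SUM(CASE WHEN Employees.LastName = 'Davolio' THEN OrderDetails.Quantity ELSE 0 END) AS Davolio,
       SUM(CASE WHEN Employees.LastName = 'Fuller' THEN OrderDetails.Quantity ELSE 0 END) AS Fuller
FROM Customers
INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID
INNER JOIN Employees ON Employees.EmployeeID = Orders.EmployeeID
INNER JOIN OrderDetails ON OrderDetails.OrderID = Orders.OrderID
GROUP BY Customers.CustomerName
ORDER BY Customers.CustomerName;
```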
Each SQL statement within the SQL INTERSECT query must have the same number of fields in the
result sets with similar data types.
As an example, on the W3Schools website we can obtain the IDs of all customers that have placed an
order:
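A minimal sketch, assuming both tables expose a CustomerID column of the same type:

```sql
-- Keep only the CustomerID values present in both result sets.
SELECT CustomerID FROM Customers
INTERSECT
SELECT CustomerID FROM Orders;
```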
In other words: if a CustomerID appears in both the Customers and Orders tables, it appears in your
result set.
This is equivalent to an INNER JOIN with a GROUP BY, but the INNER JOIN solution is more flexible
because you can select any columns you want:
SELECT Customers.CustomerID
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID=Orders.CustomerID
GROUP BY Customers.CustomerID;
Each SQL SELECT statement within the SQL MINUS query must have the same number of fields in
the result sets with similar data types.
We can't use the MINUS statement on the W3Schools website, so we will work through a small example
on Oracle.
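A small Oracle sketch, assuming Customers and Orders tables that share a CustomerID column (these table names are borrowed from the W3Schools examples, not from the original Oracle example):

```sql
-- Oracle: customers that have never placed an order.
-- MINUS returns the rows of the first query that are absent from the second.
SELECT CustomerID FROM Customers
MINUS
SELECT CustomerID FROM Orders;
```

Standard SQL and several other RDBMS call the same operator EXCEPT.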
This can also be done with a LEFT OUTER JOIN (useful for MS Access, for example), without the
limitation of the MINUS statement (you can select any columns you want):
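The LEFT OUTER JOIN variant can be sketched as follows, again assuming Customers and Orders tables linked by CustomerID:

```sql
-- Anti-join: keep the customers for which no matching order row was found.
SELECT Customers.CustomerID, Customers.CustomerName
FROM Customers
LEFT OUTER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.CustomerID IS NULL;
```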
A subquery is a SELECT statement nested within another SQL statement. It can appear in a SELECT
list, a WHERE clause, a FROM clause or a JOIN, in an INSERT, UPDATE, DELETE, SET or DO statement,
or inside another subquery.
The query that contains the subquery is normally called the outer query, and the subquery itself is
called the inner query.
If the subquery returns only one value, we speak of a "single value subquery" or "scalar
subquery".
• Subqueries structure a complex query into isolated parts so that a complex query can be
broken down into a series of logical steps for easy understanding and code maintenance.
• Subqueries allow you to use the results of another query in the outer query.
• In some cases, subqueries can replace complex joins and unions, and are easier to
understand.
When a subquery is used, the database server (actually the query optimizer) may need to perform
additional steps, such as sorting, before the results of the subquery are used. If a query that
contains subqueries can be rewritten as a join, you should prefer the join, because a join
typically allows the query optimizer to retrieve data in the most efficient way. In particular, the
MySQL optimizer is more mature for joins than for subqueries, so in many cases a statement that
uses a subquery executes more efficiently once it is rewritten as a join.
This query returns the orders (with their customer IDs) placed on the most recent recorded
order date.
SELECT OrderID, CustomerID
FROM Orders
WHERE OrderDate=(SELECT MAX(OrderDate) FROM ORDERS);
This query returns all products whose unit price is greater than average unit price.
SELECT DISTINCT ProductName, Price
FROM Products
WHERE Price > (SELECT AVG(Price) FROM Products)
ORDER BY Price DESC;
This query retrieves a list of customers that made purchases after the date 1997-02-05.
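A sketch of such a query, assuming the W3Schools Customers and Orders tables:

```sql
-- The subquery returns the IDs of customers with at least one order after 1997-02-05.
-- The date literal is written as accepted by MySQL and the W3Schools editor;
-- Oracle would need DATE '1997-02-05'.
SELECT *
FROM Customers
WHERE CustomerID IN (
  SELECT CustomerID FROM Orders WHERE OrderDate > '1997-02-05'
);
```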
The query below returns the same result as the query above (on the W3Schools website!) because an
explicit list of CustomerIDs is used in place of the subquery:
Of course, the same result can be obtained using a JOIN (often, a query that contains
subqueries can be rewritten as a join). Using an inner join allows the query optimizer to retrieve
data in the most efficient way:
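A possible join version, assuming the W3Schools Customers and Orders tables:

```sql
-- DISTINCT avoids listing a customer once per qualifying order.
SELECT DISTINCT Customers.*
FROM Customers
INNER JOIN Orders
ON Customers.CustomerID = Orders.CustomerID
WHERE Orders.OrderDate > '1997-02-05';
```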
SELECT *
FROM Customers
WHERE (City,Country) IN (
SELECT City, Country FROM Suppliers
)
This example won't work on the W3Schools website because of an implementation limitation of the web
interface (see the EXISTS alternative below). We also won't spend time importing a database to test
it in Oracle.
SELECT *
FROM Customers
WHERE ROW(City,Country) IN (
SELECT City, Country FROM Suppliers
)
If none of the above works, you can use the EXISTS condition (see later) with the following syntax
(this will work on the W3Schools website):
SELECT *
FROM Customers
WHERE EXISTS (
SELECT * FROM Suppliers
WHERE Customers.City= Suppliers.City AND Customers.Country =
Suppliers.Country
)
A correlated subquery can usually be rewritten as a join. Using joins enables the database engine
to choose the most efficient execution plan. The query optimizer is often more mature for joins
than for subqueries, so in many cases a statement that uses a subquery should be rephrased as a
join to gain speed.
Note that aliases (or full table names) must be used to qualify columns in a query that contains
correlated subqueries, so that outer-query and subquery columns can be told apart.
SELECT *
FROM Customers
WHERE EXISTS (
SELECT * FROM Suppliers
WHERE Customers.City= Suppliers.City AND Customers.Country =
Suppliers.Country
)
belongs to the family of correlated subqueries because the subquery uses the Customers.City and
Customers.Country attributes of the outer query.
The query below finds the orders, and their customers, in which more than 20 items of ProductID 6
were ordered on a single order.
SELECT a.OrderID,
a.CustomerID
FROM Orders AS a
WHERE
(
SELECT Quantity
FROM OrderDetails as b
WHERE a.OrderID = b.OrderID and b.ProductID = 6
) > 20;
Because EXISTS is used with correlated subqueries, the subquery is logically evaluated once for
every row of the outer query. In other words, for each row of the outer query, the subquery uses
information from that row to return TRUE or FALSE, and that value is handed back to the outer
query.
Remember we already saw such an example (all Customers that have a Supplier in the same City
and Country as their home address):
SELECT *
FROM Customers
WHERE EXISTS (
SELECT * FROM Suppliers
WHERE Customers.City= Suppliers.City AND Customers.Country =
Suppliers.Country
)
But don't forget that this can also be done with a JOIN statement!
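A possible JOIN formulation, assuming the W3Schools Customers and Suppliers tables:

```sql
-- DISTINCT is needed because several suppliers may share the same City and Country.
SELECT DISTINCT Customers.*
FROM Customers
INNER JOIN Suppliers
ON Customers.City = Suppliers.City
AND Customers.Country = Suppliers.Country;
```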
Because NOT EXISTS is used with correlated subqueries, the subquery is logically evaluated once for
every row of the outer query. In other words, for each row of the outer query, the subquery uses
information from that row to return TRUE or FALSE, and that value is handed back to the outer
query.
SELECT *
FROM Customers
WHERE NOT EXISTS (
SELECT * FROM Suppliers
WHERE Customers.City= Suppliers.City AND Customers.Country =
Suppliers.Country
)
For the examples below we will use the following EMP Oracle table:
4.19.7.1 ALL
The ALL comparison condition is used to compare a value to a list or subquery. It must be
preceded by =, !=, >, <, <=, >= and followed by a list or subquery.
When the ALL condition is followed by a list, the optimizer expands the initial condition to all
elements of the list and strings them together with AND operators, as shown below.
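The expansion can be sketched as follows, assuming the classic Oracle EMP table with columns empno and sal (the threshold values are arbitrary):

```sql
-- ALL followed by a list.
SELECT empno, sal
FROM emp
WHERE sal > ALL (2000, 3000, 4000);

-- Expanded form: the same condition written with AND.
SELECT empno, sal
FROM emp
WHERE sal > 2000 AND sal > 3000 AND sal > 4000;
```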
When the ALL condition is followed by a subquery, the optimizer performs a two-step
transformation as shown below.
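The two steps can be sketched as follows, assuming the classic EMP table with columns empno, sal and deptno, and assuming sal contains no NULL values (with NULLs the three forms are not strictly equivalent):

```sql
-- Original condition: ALL followed by a subquery.
SELECT e1.empno, e1.sal
FROM emp e1
WHERE e1.sal > ALL (SELECT e2.sal FROM emp e2 WHERE e2.deptno = 20);

-- Step 1: rewrite ALL as a negated ANY with the opposite comparison.
SELECT e1.empno, e1.sal
FROM emp e1
WHERE NOT (e1.sal <= ANY (SELECT e2.sal FROM emp e2 WHERE e2.deptno = 20));

-- Step 2: rewrite ANY as a correlated EXISTS.
SELECT e1.empno, e1.sal
FROM emp e1
WHERE NOT EXISTS (SELECT 1 FROM emp e2 WHERE e2.deptno = 20 AND e1.sal <= e2.sal);
```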
4.19.7.2 ANY
The ANY comparison condition is used to compare a value to a list or subquery. It must be
preceded by =, !=, >, <, <=, >= and followed by a list or subquery.
When the ANY condition is followed by a list, the optimizer expands the initial condition to all
elements of the list and strings them together with OR operators, as shown below.
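The expansion can be sketched as follows, assuming the classic Oracle EMP table with columns empno and sal (the threshold values are arbitrary):

```sql
-- ANY followed by a list.
SELECT empno, sal
FROM emp
WHERE sal > ANY (2000, 3000, 4000);

-- Expanded form: the same condition written with OR.
SELECT empno, sal
FROM emp
WHERE sal > 2000 OR sal > 3000 OR sal > 4000;
```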
When the ANY condition is followed by a subquery, the optimizer performs a single transformation
as shown below:
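A sketch of that single transformation, assuming the classic EMP table with columns empno, sal and deptno:

```sql
-- ANY followed by a subquery.
SELECT e1.empno, e1.sal
FROM emp e1
WHERE e1.sal > ANY (SELECT e2.sal FROM emp e2 WHERE e2.deptno = 10);

-- Rewritten as a correlated EXISTS.
SELECT e1.empno, e1.sal
FROM emp e1
WHERE EXISTS (SELECT 1 FROM emp e2 WHERE e2.deptno = 10 AND e1.sal > e2.sal);
```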
4.19.7.3 SOME
The SOME and ANY comparison conditions do exactly the same thing and are completely
interchangeable!
The SELECT INTO statement copies data from one table and inserts it into a new table.
Examples:
SELECT *
INTO newtable [IN externaldb]
FROM table1;
Or we can copy only the columns we want into the new table:
SELECT column_name(s)
INTO newtable [IN externaldb]
FROM table1;
Tip: The new table will be created with the column-names and types as defined in the SELECT
statement. You can apply new names using the AS clause.
The examples below won't work on the W3Schools website, in MySQL, or even in Oracle (see the last
queries in the screenshots for how to do this in Oracle), but will work directly in MS Access:
SELECT *
INTO CustomersBackup2013
FROM Customers;
SELECT *
INTO CustomersBackup2013 IN 'Backup.mdb'
FROM Customers;
SELECT *
INTO CustomersBackup2013
FROM Customers
WHERE Country='Germany';
Copy data from more than one table into the new table:
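A sketch in the same MS Access style, assuming the Customers and Orders demo tables (the name of the new table is arbitrary):

```sql
-- Create a new table from the result of a join.
SELECT Customers.CustomerName, Orders.OrderID
INTO CustomersOrderBackup2013
FROM Customers
LEFT JOIN Orders
ON Customers.CustomerID = Orders.CustomerID;
```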
Tip: The SELECT INTO statement can also be used to create a new, empty table using the schema
of another. Just add a WHERE clause that causes the query to return no data:
SELECT *
INTO newtable
FROM table1
WHERE 1=0;
In Oracle you will have to run if the table does not already exist:
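In Oracle the usual equivalent is CREATE TABLE ... AS SELECT, sketched here with assumed table names:

```sql
-- Oracle: create and fill a new table in one statement.
CREATE TABLE CustomersBackup2013 AS
SELECT * FROM Customers;

-- Empty copy with the same columns (the WHERE clause matches no row).
CREATE TABLE CustomersEmpty AS
SELECT * FROM Customers WHERE 1 = 0;
```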
We can copy all columns from one table to another, existing table:
Or we can copy only the columns we want to into another, existing table:
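Both forms can be sketched as follows; table1 and table2 are placeholders, and the second statement assumes the W3Schools Customers and Suppliers tables:

```sql
-- Copy all columns (both tables must have compatible column lists).
INSERT INTO table2
SELECT * FROM table1;

-- Copy only selected columns, e.g. add the suppliers to the Customers table.
INSERT INTO Customers (CustomerName, Country)
SELECT SupplierName, Country FROM Suppliers;
```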
2 New Orleans Cajun Delights Shelley Burke P.O. Box 78934 New Orleans 70117 USA (100) 555-4822
It is not possible to create a database in Oracle Express, because the first and only database is
created during installation with the CREATE DATABASE statement. In MS Access and on W3Schools
you cannot use the CREATE DATABASE statement either.
This example query creates a database named db_DemoTest. Here the PRIMARY option is omitted, so
the first file is assumed to be the primary file. The logical name of this file is
DB_DemoTestData, as given in the query. The FILENAME parameter specifies the physical location of
the database file (*.mdf), here on the local disk C:\.
The initial size of this file is 20MB; the system may allocate an additional 20MB from disk each
time it is needed (FILEGROWTH).
If the MAXSIZE option is not specified, or is set to UNLIMITED, the file will keep growing until
the disk is full.
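A statement matching this description can be sketched in T-SQL as follows; the exact file path is an assumption:

```sql
-- T-SQL sketch for SQL Server. The file path is a hypothetical example.
CREATE DATABASE db_DemoTest
ON
(
  NAME = DB_DemoTestData,           -- logical file name
  FILENAME = 'C:\db_DemoTest.mdf',  -- physical location (assumed path)
  SIZE = 20MB,                      -- initial size
  MAXSIZE = UNLIMITED,              -- let the file grow until the disk is full
  FILEGROWTH = 20MB                 -- growth increment
);
```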
Close and reopen Microsoft SQL Server Management Studio and you will see:
5.3.2 On mySQL
For the example we will download and install XAMPP:
http://www.apachefriends.org/fr/xampp.html
After installation:
Click on phpMyAdmin:
Click on SQL:
Now type:
Tables are organized into rows and columns; and each table must have a name.
The column_name parameters specify the names of the columns of the table.
The data_type parameter specifies what type of data the column can hold (e.g. varchar, integer,
decimal, date, etc.). See the tables after the query examples for the data types of the various
databases.
The size parameter specifies the maximum length of the column of the table.
Now we want to create an empty table called "Persons" that contains five columns: PersonID,
LastName, FirstName, Address, and City. On the W3 School website type:
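A statement for this table can be sketched as follows; the column sizes are assumptions:

```sql
-- Five columns; 255 is an arbitrary maximum length for the text columns.
CREATE TABLE Persons
(
PersonID int,
LastName varchar(255),
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);
```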
SQL developers have to decide what types of data will be stored inside each and every table
column when creating a SQL table. The data type is a label and a guideline for SQL to understand
what type of data is expected inside of each column, and it also identifies how SQL will interact
with the stored data.
TIMESTAMP Stores year, month, day, hour, minute, and second values
nchar(size): Where size is the number of characters to store. Fixed-length NLS string. Space padded. Maximum size of 2000 bytes.
nvarchar2(size): Where size is the number of characters to store. Variable-length NLS string. Maximum size of 4000 bytes.
varchar2(size): Where size is the number of characters to store. Variable-length string. Maximum size of 4000 bytes. Maximum size of 32KB in PLSQL.
Number types:
Date types:
For example:
timestamp(5) with time zone
timestamp (fractional seconds precision) with local time zone: Includes year, month, day, hour, minute, and seconds; with a time zone expressed as the session time zone. fractional seconds precision must be a number between 0 and 9 (default is 6).
For example:
timestamp(4) with local time zone
interval year (year precision) to month: Time period stored in years and months. year precision is the number of digits in the year (default is 2).
For example:
interval year(4) to month
interval day (day precision) to second (fractional seconds precision): Time period stored in days, hours, minutes, and seconds. day precision must be a number between 0 and 9 (default is 2). fractional seconds precision must be a number between 0 and 9 (default is 6).
For example:
interval day(2) to second(6)
Row ID Datatypes:
Text: Use for text or combinations of text and numbers. 255 characters maximum.
Currency: Use for currency. Holds up to 15 digits of whole dollars, plus 4 decimal places. Tip: You can choose which country's currency to use. Storage: 8 bytes.
AutoNumber: AutoNumber fields automatically give each record its own number, usually starting at 1. Storage: 4 bytes.
Ole Object: Can store pictures, audio, video, or other BLOBs (Binary Large OBjects). Storage: up to 1GB.
Lookup Wizard: Lets you type a list of options, which can then be chosen from a drop-down list. Storage: 4 bytes.
Table 11 Microsoft Access Data Types
Text types:
CHAR(size) Holds a fixed length string (can contain letters, numbers, and special
characters). The fixed size is specified in parenthesis. Can store up to 255
characters
VARCHAR(size) Holds a variable length string (can contain letters, numbers, and special
characters). The maximum size is specified in parenthesis. Can store up to
255 characters. Note: If you put a greater value than 255 it will be converted
to a TEXT type
BLOB For BLOBs (Binary Large OBjects). Holds up to 65,535 bytes of data
MEDIUMBLOB For BLOBs (Binary Large OBjects). Holds up to 16,777,215 bytes of data
LONGBLOB For BLOBs (Binary Large OBjects). Holds up to 4,294,967,295 bytes of data
ENUM(x,y,z,etc.) Let you enter a list of possible values. You can list up to 65535 values in an
ENUM list. If a value is inserted that is not in the list, a blank value will be
inserted.
Note: The values are sorted in the order you enter them.
SET Similar to ENUM except that SET may contain up to 64 list items and can
Number types:
TINYINT(size) -128 to 127 normal. 0 to 255 UNSIGNED*. The maximum number of digits
may be specified in parenthesis
FLOAT(size,d) A small number with a floating decimal point. The maximum number of digits
may be specified in the size parameter. The maximum number of digits to the
right of the decimal point is specified in the d parameter
DOUBLE(size,d) A large number with a floating decimal point. The maximum number of digits
may be specified in the size parameter. The maximum number of digits to the
right of the decimal point is specified in the d parameter
DECIMAL(size,d) A DOUBLE stored as a string, allowing for a fixed decimal point. The
maximum number of digits may be specified in the size parameter. The
maximum number of digits to the right of the decimal point is specified in the
d parameter
*The integer types have an extra option called UNSIGNED. Normally, an integer ranges from a
negative to a positive value. Adding the UNSIGNED attribute moves that range up so that it starts
at zero instead of a negative number.
Date types:
TIMESTAMP() *A timestamp. TIMESTAMP values are stored as the number of seconds since
the Unix epoch ('1970-01-01 00:00:00' UTC). Format: YYYY-MM-DD
HH:MM:SS
*Even if DATETIME and TIMESTAMP return the same format, they work very differently. In an
INSERT or UPDATE query, a TIMESTAMP column automatically sets itself to the current date and time.
TIMESTAMP also accepts various formats, like YYYYMMDDHHMMSS, YYMMDDHHMMSS, YYYYMMDD,
or YYMMDD.
char(n): Fixed width character string. Maximum 8,000 characters. Storage: defined width.
text: Variable width character string. Maximum 2GB of text data. Storage: 4 bytes + number of chars.
nchar: Fixed width Unicode string. Maximum 4,000 characters. Storage: defined width x 2.
Number types:
float(n): Floating precision number data from -1.79E + 308 to 1.79E + 308. The n parameter indicates whether the field should hold 4 or 8 bytes. float(24) holds a 4-byte field and float(53) holds an 8-byte field. Default value of n is 53. Storage: 4 or 8 bytes.
Date types:
datetime: From January 1, 1753 to December 31, 9999 with an accuracy of 3.33 milliseconds. Storage: 8 bytes.
datetime2: From January 1, 0001 to December 31, 9999 with an accuracy of 100 nanoseconds. Storage: 6-8 bytes.
smalldatetime: From January 1, 1900 to June 6, 2079 with an accuracy of 1 minute. Storage: 4 bytes.
date: Store a date only. From January 1, 0001 to December 31, 9999. Storage: 3 bytes.
datetimeoffset: The same as datetime2 with the addition of a time zone offset. Storage: 8-10 bytes.
timestamp: Stores a unique number that gets updated every time a row gets created or modified. The timestamp value is based upon an internal clock and does not correspond to real time. Each table may have only one timestamp variable.
sql_variant Stores up to 8,000 bytes of data of various data types, except text, ntext,
and timestamp
The following table shows some of the common names of data types between the various database
platforms:
If there is any violation between the constraint and the data action, the action is aborted by the
constraint.
Constraints can be specified when the table is created (inside the CREATE TABLE statement) or
after the table is created (inside the ALTER TABLE statement).
• UNIQUE - Ensures that each row for a column must have a unique value
• PRIMARY KEY - A combination of NOT NULL and UNIQUE. Ensures that a column (or a
combination of two or more columns) has a unique identity, which helps to find a
particular record in a table more easily and quickly
• FOREIGN KEY - Ensures the referential integrity of the data in one table by matching values in
another table
• DEFAULT - Specifies the value to use for a column when none is given
The best way to study all these options is to use a real RDBMS. Here again we will use Oracle.
The following SQL enforces the "P_Id" column and the "LastName" column to not accept NULL
values:
If you try then to insert a row without the LastName you will receive an error:
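Both steps can be sketched as follows; the table name PersonsNotNull and the column sizes are assumptions:

```sql
-- P_Id and LastName refuse NULL values.
CREATE TABLE PersonsNotNull
(
P_Id int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);

-- Rejected in Oracle: LastName is not supplied, so it would be NULL.
INSERT INTO PersonsNotNull (P_Id, FirstName) VALUES (1, 'Lars');
```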
The UNIQUE and PRIMARY KEY constraints both provide a guarantee for uniqueness for a column
or set of columns.
Note that you can have many UNIQUE constraints per table, but only one PRIMARY KEY constraint
per table.
If the creation of a UNIQUE constraint fails, it is because duplicate values already exist in the
chosen field of your table.
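A named UNIQUE constraint can be sketched as follows; the constraint names are arbitrary, and the ALTER TABLE form assumes the table already exists:

```sql
-- Named UNIQUE constraint on a pair of columns, declared at table creation.
CREATE TABLE Persons
(
P_Id int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
CONSTRAINT uc_PersonID UNIQUE (P_Id, LastName)
);

-- Adding a UNIQUE constraint to an existing table.
ALTER TABLE Persons ADD CONSTRAINT uc_Address UNIQUE (Address);
```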
MySQL:
MySQL:
Primary keys must contain unique values, and a primary key column cannot contain NULL
values. Each table should have a primary key, and can have only one.
If the creation of a PRIMARY KEY fails, it is because duplicate values already exist in the
chosen field of your table.
As you can see, this results in a horrible index name. It is better to use:
CREATE TABLE Persons
(
P_Id int NOT NULL,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
CONSTRAINT pkPerson PRIMARY KEY (P_Id)
)
MySQL:
MySQL:
FirstName varchar(255),
Address varchar(255),
City varchar(255),
PRIMARY KEY (LastName,FirstName)
)
MySQL:
With Oracle you cannot disable multiple constraints at once without PL/SQL. With SQL Server there
is a handy query to disable them all at once (search online).
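Disabling a single constraint in Oracle can be sketched as follows, assuming a Persons table with a constraint named pkPerson:

```sql
-- Oracle: switch one named constraint off, then on again.
ALTER TABLE Persons DISABLE CONSTRAINT pkPerson;
ALTER TABLE Persons ENABLE CONSTRAINT pkPerson;
```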
To do this we will run the following SQL in Oracle (this code should also work for MySQL, Access
and others...):
with:
and:
on MySQL:
A foreign key with a cascade deletion can be defined in either a CREATE TABLE statement or an
ALTER TABLE statement.
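A minimal sketch of a cascading foreign key. The table and column names below are hypothetical and only illustrate the syntax:

```sql
-- Hypothetical parent table.
CREATE TABLE demo_customers (
  customer_id INT PRIMARY KEY,
  customer_name VARCHAR(100) NOT NULL
);

-- Hypothetical child table: each card belongs to one customer.
CREATE TABLE demo_fidelity_cards (
  card_number INT PRIMARY KEY,
  fk_customer INT NOT NULL,
  CONSTRAINT fkCardCustomer FOREIGN KEY (fk_customer)
    REFERENCES demo_customers (customer_id)
    ON DELETE CASCADE
);

-- Deleting the parent row also deletes its cards.
DELETE FROM demo_customers WHERE customer_id = 1;
```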
Then you can try... If you delete a customer, the related FidelityCard will be removed. Same thing if
you remove only the sale!
If you define a CHECK constraint on a single column it allows only certain values for this column.
If you define a CHECK constraint on a table it can limit the values in certain columns based on
values in other columns in the row.
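Both kinds of CHECK constraint can be sketched as follows; the table name, constraint name and rules are assumptions chosen for illustration:

```sql
CREATE TABLE PersonsCheck
(
P_Id int NOT NULL CHECK (P_Id > 0),  -- column-level check
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
-- table-level check combining two columns of the same row
CONSTRAINT chkPersonCity CHECK (P_Id < 1000 OR City = 'Sandnes')
);
```

Note that MySQL versions before 8.0.16 parse CHECK constraints but do not enforce them.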
And:
with success!
and when you run the SQL code you will see the Check removed:
The default value will be added to all new records, if no other value is specified.
where you can see the important SYSDATE function, which is used a lot, sometimes together with the
USER function.
Note: SYSDATE is Oracle syntax. On SQL Server use GETDATE(), on MySQL use NOW() or
CURRENT_TIMESTAMP, and on MS Access use Date() or Now().
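An Oracle sketch of DEFAULT values; the table name and columns are assumptions:

```sql
-- Oracle: constant, date and user defaults.
CREATE TABLE OrdersDefault
(
O_Id int NOT NULL,
OrderNo int NOT NULL,
City varchar(255) DEFAULT 'Sandnes',
OrderDate date DEFAULT SYSDATE,
CreatedBy varchar(30) DEFAULT USER
);

-- City, OrderDate and CreatedBy receive their default values.
INSERT INTO OrdersDefault (O_Id, OrderNo) VALUES (1, 77895);
```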
But if we use the GUI to insert rows, the standard values do not appear:
we get:
... it works!
MySQL:
MySQL:
Oracle:
http://docs.oracle.com/cd/B19306_01/server.102/b14231/indexes.htm
Indexes allow the database application to find data fast, without reading the whole table.
Users cannot see the indexes; they are just used to speed up searches/queries.
Indexes are normally created only when users report that the database has become too slow at
retrieving information. Create them after table creation and on user request; otherwise you
use disk space for nothing!
If the creation of a UNIQUE INDEX fails, it is because duplicate values already exist in the
chosen field of your table.
Note: Updating a table with indexes takes more time than updating a table without (because the
indexes also need an update). So you should only create indexes on columns (and tables) that will
be frequently searched against.
Note: The syntax for creating indexes varies amongst different databases. Therefore: Check the
syntax for creating indexes in your database.
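The most widely accepted form can be sketched as follows, assuming a Persons table with no existing index on these columns (the index names are arbitrary):

```sql
-- Non-unique index: duplicate last names remain allowed.
CREATE INDEX idxPersonLastName ON Persons (LastName);

-- Unique index: two rows cannot share the same (LastName, FirstName) pair.
CREATE UNIQUE INDEX idxPersonFullName ON Persons (LastName, FirstName);
```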
It is easier to manage than creating a non-unique INDEX and then adding a UNIQUE CONSTRAINT
afterwards.
Note: On MS Access, when you create a primary key, a unique index is automatically created on
the primary key column.
Depending on the scenario, the available storage and the update frequency of the table, you can
have a composite index on the pair ('CardNumber','fkSaler') plus two single-column indexes on the
same two fields.
The best solution is not always obvious. The best approach is to study usage statistics and compare
results using statistical tools (typically Student's t-test).
To create a multi-column (composite) non-unique index in Oracle on an existing table, use the following:
and of course you can also create a multi-column UNIQUE INDEX.
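A sketch of a multi-column index, using the CardNumber and fkSaler columns mentioned above on a hypothetical table demo_sales:

```sql
-- Hypothetical table demo_sales(CardNumber, fkSaler, ...).
CREATE INDEX idxSaleCardSaler ON demo_sales (CardNumber, fkSaler);

-- Unique variant: rejects two rows with the same (CardNumber, fkSaler) pair.
-- Oracle does not allow two indexes on the same column list, so use one or the other.
-- CREATE UNIQUE INDEX uqSaleCardSaler ON demo_sales (CardNumber, fkSaler);
```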
You don't need to specify the table because index names are unique within a schema.
We get:
as you can see the virtual column is not visible in the table structure but if we look in the SQL
structure, we can see TaxAmount:
it works!
ALTER TABLE
Employees
MODIFY
(
FirstName varchar(30),
LastName varchar(30)
);
ALTER TABLE
Employees
RENAME CONSTRAINT
pkPerson TO pkPersonId;
And now if you try to run any DML query you will get an error:
and if you switch it back to READ WRITE you will be able to run the DML again:
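The two switches can be sketched in Oracle (11g or later) as follows, assuming the Employees table used above:

```sql
-- Freeze the table: INSERT, UPDATE and DELETE are rejected from now on.
ALTER TABLE Employees READ ONLY;

-- Back to normal: DML is accepted again.
ALTER TABLE Employees READ WRITE;
```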
Then to remove a database, when you have the rights and the possibility, the syntax is simply:
DROP DATABASE database_name
ALTER TABLE
table_name
DROP
(col_name1, col_name2);
ALTER TABLE
table_name
SET UNUSED
(col_name1, col_name2);
You can later remove columns that are marked as unused by issuing an ALTER TABLE...DROP
UNUSED COLUMNS statement. Unused columns are also removed from the target table whenever
an explicit drop of any particular column or columns of the table is issued.
ALTER TABLE
table_name
DROP UNUSED COLUMNS;
Once columns are marked as unused, they can no longer be restored to make them operational
again. The only operation allowed on such columns is DROP UNUSED COLUMNS, which removes all
the columns of a table that are marked for deletion.
If you know how to remove a constraint, you know how to remove a NOT NULL. For
this you just type:
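In Oracle this can be sketched as follows, assuming a Persons table whose LastName column is currently NOT NULL:

```sql
-- Oracle: allow NULL values again in the LastName column.
ALTER TABLE Persons MODIFY (LastName NULL);
```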
DROP INDEX syntax for DB2/Oracle (you do not need to specify the table name because index names
are unique within a schema):
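A one-line sketch, where idxPersonLastName is a hypothetical index name:

```sql
-- DB2/Oracle: the index name alone identifies the index.
DROP INDEX idxPersonLastName;
```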
By default, the starting value for AUTO_INCREMENT is 1, and it will increment by 1 for each new
record.
To let the AUTO_INCREMENT sequence start with another value, use the following SQL statement:
To insert a new record into the "Persons" table, we will NOT have to specify a value for the "ID"
column (a unique value will be added automatically):
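The MySQL steps described above can be sketched together as follows; the column sizes and the start value 100 are assumptions:

```sql
-- MySQL: ID is generated automatically for each new row.
CREATE TABLE Persons
(
ID int NOT NULL AUTO_INCREMENT,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255),
PRIMARY KEY (ID)
);

-- Optional: make the next generated value 100 instead of 1.
ALTER TABLE Persons AUTO_INCREMENT = 100;

-- No value is given for ID.
INSERT INTO Persons (FirstName, LastName) VALUES ('Lars', 'Monsen');
```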
The SQL statement above would insert a new record into the "Persons" table. The "ID" column
would be assigned a unique value. The "FirstName" column would be set to "Lars" and the
"LastName" column would be set to "Monsen".
The MS SQL Server uses the IDENTITY keyword to perform an auto-increment feature.
In the example above, the starting value for IDENTITY is 1, and it will increment by 1 for each new
record.
Tip: To specify that the "ID" column should start at value 10 and increment by 5, change it to
IDENTITY(10,5).
To insert a new record into the "Persons" table, we will NOT have to specify a value for the "ID"
column (a unique value will be added automatically):
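The SQL Server version can be sketched as follows; the column sizes are assumptions:

```sql
-- SQL Server: IDENTITY(seed, increment) generates the ID values.
CREATE TABLE Persons
(
ID int IDENTITY(1,1) PRIMARY KEY,
LastName varchar(255) NOT NULL,
FirstName varchar(255),
Address varchar(255),
City varchar(255)
);

-- No value is given for ID.
INSERT INTO Persons (FirstName, LastName) VALUES ('Lars', 'Monsen');
```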
The SQL statement above would insert a new record into the "Persons" table. The "ID" column
would be assigned a unique value. The "FirstName" column would be set to "Lars" and the
"LastName" column would be set to "Monsen".
By default, the starting value for AUTOINCREMENT is 1, and it will increment by 1 for each new
record.
Tip: To specify that the "ID" column should start at value 10 and increment by 5, change the
autoincrement to AUTOINCREMENT(10,5).
To insert a new record into the "Persons" table, we will NOT have to specify a value for the "ID"
column (a unique value will be added automatically):
The SQL statement above would insert a new record into the "Persons" table. The "P_Id" column
would be assigned a unique value. The "FirstName" column would be set to "Lars" and the
"LastName" column would be set to "Monsen".
You will have to create an auto-increment field with the sequence object (this object generates a
number sequence).
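A sequence matching the description that follows can be sketched as:

```sql
-- Oracle: sequence starting at 1, incrementing by 1, caching 10 values.
CREATE SEQUENCE seq_person
MINVALUE 1
START WITH 1
INCREMENT BY 1
CACHE 10;
```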
The code above creates a sequence object called seq_person, that starts with 1 and will increment
by 1. It will also cache up to 10 values for performance. The cache option specifies how many
sequence values will be stored in memory for faster access.
To insert a new record into the "Persons" table, we will have to use the nextval function (this
function retrieves the next value from the seq_person sequence):
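A minimal sketch of that insert, assuming the "Persons" table and the seq_person sequence already exist:

```sql
-- Oracle: the ID is taken from the sequence at insert time.
INSERT INTO Persons (ID, FirstName, LastName)
VALUES (seq_person.NEXTVAL, 'Vincent', 'ISOZ');
```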
The SQL statement above would insert a new record into the "Persons" table. The "ID" column
would be assigned the next number from the seq_person sequence. The "FirstName" column would
be set to "Vincent" and the "LastName" column would be set to "ISOZ".
6 SQL VIEWS
In SQL, a view is a virtual table based on the result-set of an SQL statement.
A view contains rows and columns, just like a real table. The fields in a view are fields from one or
more real tables in the database.
You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the
data were coming from one single table.
If a view contains the primary key and all other NOT NULL columns, the view can be used to insert
data or even to override the original table constraints (by adding complementary constraints to the
view). Here we will focus only on basic read-only views because this is the most common case for
end-users (and we have only one week to study SQL...).
Note: A view always shows up-to-date data! The database engine recreates the data, using the
view's SQL statement, every time a user queries a view.
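As an illustration, a basic read-only view could look like the sketch below. The view name v_customers_ca, the cust_last_name column and the 'CA' filter value are assumptions made for the example; demo_customers, customer_id, cust_first_name and cust_state appear elsewhere in this course.

```sql
-- Oracle: a read-only view on the demo table (name and filter are hypothetical).
CREATE OR REPLACE VIEW v_customers_ca AS
    SELECT customer_id, cust_first_name, cust_last_name, cust_state
    FROM   demo_customers
    WHERE  cust_state = 'CA'
WITH READ ONLY;

-- Each query against the view re-runs the SELECT above on the current data.
SELECT * FROM v_customers_ca;
```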
No, you can't use ALTER VIEW to add or remove columns! The view has to be redefined instead. The
syntax is the following (we don't want the cust_first_name column anymore):
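A hedged sketch of such a redefinition in Oracle. The view name v_customers_ca, the cust_last_name column and the filter value are hypothetical; the point is only that the whole SELECT is restated without cust_first_name.

```sql
-- Oracle: redefine the (hypothetical) view without the cust_first_name column.
CREATE OR REPLACE VIEW v_customers_ca AS
    SELECT customer_id, cust_last_name, cust_state
    FROM   demo_customers
    WHERE  cust_state = 'CA';
```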
7 SQL Functions
SQL has many built-in functions (about 150 for Oracle) for performing calculations on data. We
will see here only the 19 functions that undergraduate students have to know.
For more:
http://docs.oracle.com/cd/B19306_01/server.102/b14200/functions001.htm
http://www.techonthenet.com/oracle/functions/index.php
For this let us consider the following demo table of Oracle Express:
Obviously we can also convert to INT (integer), to VARCHAR(…) (string) and so on, for all
standard column types in Oracle.
An important use of CAST is the ratio of two integers. Some databases perform an integer division
in that case and truncate the result (SQL Server, for example, returns 3 for 7/2). This is why, to be
safe, we will always write something like:
Instead of:
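Both forms can be sketched as follows. SQL Server syntax is used here because the truncation is easy to observe there; the literal values are purely illustrative.

```sql
-- Recommended: cast one operand first, so that a decimal division is performed.
SELECT CAST(7 AS DECIMAL(10, 2)) / 2 AS ratio;   -- 3.5 as a decimal value

-- To avoid: both operands are integers, so SQL Server truncates the result.
SELECT 7 / 2 AS ratio;                           -- 3
```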
For example:
or for fun:
Always put the WHERE clause before the GROUP BY clause, otherwise you may filter on a column that
doesn't exist anymore because of the grouping!
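A small sketch of that ordering on the W3Schools Orders table. The date used in the filter is arbitrary, and the DATE literal syntax is the Oracle/ANSI one.

```sql
-- Rows are filtered first (WHERE), then the remaining rows are grouped (GROUP BY).
SELECT   ShipperID,
         COUNT(OrderID) AS NumberOfOrders
FROM     Orders
WHERE    OrderDate >= DATE '1996-07-05'
GROUP BY ShipperID;
```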
OrderID CustomerID EmployeeID OrderDate ShipperID
10248 90 5 1996-07-04 3
10249 81 6 1996-07-05 1
10250 34 4 1996-07-08 2
...
ShipperName NumberOfOrders
Federal Shipping 68
Speedy Express 54
United Package 74
We can also use the GROUP BY statement on more than one column, like this:
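A sketch of a two-column grouping, assuming the W3Schools Orders, Shippers and Employees tables and their usual key columns:

```sql
-- One result row per (shipper, employee) pair.
SELECT   Shippers.ShipperName,
         Employees.LastName,
         COUNT(Orders.OrderID) AS NumberOfOrders
FROM     Orders
         INNER JOIN Shippers  ON Orders.ShipperID  = Shippers.ShipperID
         INNER JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
GROUP BY Shippers.ShipperName, Employees.LastName;
```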
....
OrderID CustomerID EmployeeID OrderDate ShipperID
10248 90 5 1996-07-04 3
10249 81 6 1996-07-05 1
10250 34 4 1996-07-08 2
...
The following SQL statement finds if any of the employees has registered more than 10 orders:
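A sketch of that statement, assuming the W3Schools Orders and Employees tables:

```sql
-- HAVING filters on the aggregate, after the grouping has been done.
SELECT   Employees.LastName,
         COUNT(Orders.OrderID) AS NumberOfOrders
FROM     Orders
         INNER JOIN Employees ON Orders.EmployeeID = Employees.EmployeeID
GROUP BY Employees.LastName
HAVING   COUNT(Orders.OrderID) > 10;
```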
LastName NumberOfOrders
Buchanan 11
Callahan 27
Davolio 29
Fuller 20
King 14
Leverling 31
Peacock 40
Suyama 18
Now we want to find if the employees "Davolio" or "Fuller" have more than 25 orders:
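A sketch of such a query, written with the join condition in the WHERE clause so that the role of the parentheses is visible (the exact original statement is an assumption):

```sql
-- Without the parentheses, AND would bind tighter than OR and the join
-- condition would no longer apply to 'Fuller'.
SELECT   Employees.LastName,
         COUNT(Orders.OrderID) AS NumberOfOrders
FROM     Orders, Employees
WHERE    Orders.EmployeeID = Employees.EmployeeID
AND      (Employees.LastName = 'Davolio' OR Employees.LastName = 'Fuller')
GROUP BY Employees.LastName
HAVING   COUNT(Orders.OrderID) > 25;
```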
and don't forget the parentheses after the AND logical operator, otherwise the result won't be the
same.
LastName NumberOfOrders
Davolio 29
To see how GROUP BY ROLLUP works, we will focus once again on Oracle.
We make this query a little more complex by adding a GROUP BY and an ORDER BY statement and a
sum( ) on the order total:
As you can see, this does the same as a GROUP BY but adds subtotal rows and, at the end, a grand
total! This is especially useful for invoice automation purposes.
And if you have the time, you can combine everything we have seen until now:
SELECT state,
       round( sum( mens ), 2 ) "Mens",
       round( sum( womens ), 0 ) "Womens",
       round( sum( accessories ), 0 ) "Accessories"
FROM ( SELECT demo_customers.cust_state state,
              CASE
                 WHEN demo_product_info.category = 'Mens'
                 THEN demo_order_items.quantity * demo_order_items.unit_price
                 ELSE 0
              END mens,
              CASE
                 WHEN demo_product_info.category = 'Womens'
                 THEN demo_order_items.quantity * demo_order_items.unit_price
                 ELSE 0
              END womens,
              CASE
                 WHEN demo_product_info.category = 'Accessories'
                 THEN demo_order_items.quantity * demo_order_items.unit_price
                 ELSE 0
              END accessories
       FROM demo_order_items,
            demo_product_info,
            demo_customers,
            demo_orders
       WHERE demo_order_items.product_id = demo_product_info.product_id
         AND demo_order_items.order_id = demo_orders.order_id
         AND demo_orders.customer_id = demo_customers.customer_id )
GROUP BY ROLLUP( state )
to compare with:
Nice trap... this is why triple-checking is important when you manage billions of dollars!
In SQL Server the function is a little more interesting because you can choose what to return if all
arguments are NULL:
The descriptive statistics functions can of course be combined with GROUP BY, WHERE, JOIN, ...
SQL statements and especially with subqueries!
The following SQL statement finds the sum of all the Quantity fields for the Order Items table:
etc...
Or an interesting one to get the Total Number of Records in ALL TABLES of a schema (see page
273 for other metadata queries):
The following SQL statement gets the average value of the price column from the products table:
The following SQL statement selects the Product Name and Price records that have an above
average price:
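The two statements described above can be sketched as follows, assuming the W3Schools Products table with its ProductName and Price columns:

```sql
-- Average price over all products.
SELECT AVG(Price) AS PriceAverage
FROM   Products;

-- Products priced above that average (the average is computed in a subquery).
SELECT ProductName, Price
FROM   Products
WHERE  Price > (SELECT AVG(Price) FROM Products);
```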
The COUNT(DISTINCT column_name) function returns the number of distinct values of the
specified column:
Note: COUNT(DISTINCT) works with ORACLE and Microsoft SQL Server, but not with Microsoft
Access.
The following SQL statement counts the number of orders from CustomerID 7 from the Orders
table:
The following SQL statement counts the total number of orders in the Orders table:
The following SQL statement counts the number of unique customers in the Orders table:
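The three counting statements described above can be sketched like this, assuming the W3Schools Orders table (the column aliases are illustrative):

```sql
-- Number of orders placed by customer 7.
SELECT COUNT(CustomerID) AS OrdersFromCustomerID7
FROM   Orders
WHERE  CustomerID = 7;

-- Total number of orders.
SELECT COUNT(*) AS NumberOfOrders
FROM   Orders;

-- Number of distinct customers that appear in the Orders table.
SELECT COUNT(DISTINCT CustomerID) AS NumberOfCustomers
FROM   Orders;
```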
The following SQL statement gets the largest value of the Price column from the Products table:
Just replace MAX( ) with MIN( ) in the above queries to see how MIN( ) works.
The following SQL statement gets the median value of the price column from the products table:
The following SQL statement selects the Product Name and Price records that have an above
median price:
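A sketch of both median statements in Oracle, assuming a Products table with ProductName and Price columns:

```sql
-- Oracle: MEDIAN is an aggregate function, used like AVG.
SELECT MEDIAN(Price) AS PriceMedian
FROM   Products;

-- Products priced above the median.
SELECT ProductName, Price
FROM   Products
WHERE  Price > (SELECT MEDIAN(Price) FROM Products);
```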
It is interesting to take for example the previous one but now with the discrete percentile:
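A sketch of the discrete percentile in Oracle, assuming a Products table with a Price column and taking the 50th percentile as the example:

```sql
-- PERCENTILE_DISC returns a value that actually exists in the column
-- (no interpolation between two neighbouring values).
SELECT PERCENTILE_DISC(0.5) WITHIN GROUP (ORDER BY Price) AS PriceMedianDisc
FROM   Products;
```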
This is the same as a simple subquery, as shown in the next screenshot:
The following SQL statement gets the modal value (the mode) of the price column from the products
table:
The following SQL statement selects the Product Name and Price records that have an above modal
price:
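A sketch of both statements using Oracle's STATS_MODE function, assuming a Products table with ProductName and Price columns:

```sql
-- Oracle: STATS_MODE returns the most frequent value
-- (one of them if several values share the highest frequency).
SELECT STATS_MODE(Price) AS PriceMode
FROM   Products;

-- Products priced above the mode.
SELECT ProductName, Price
FROM   Products
WHERE  Price > (SELECT STATS_MODE(Price) FROM Products);
```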
Note: In statistics we know (see Statistics course) that the regression matrix is more interesting
but the variance-covariance matrix is still useful in financial modeling.
To see an example, we will first create a two-column table with the following script:
and here is the corresponding text (for copy/paste purposes during the training):
This gives us a part of the table used in the Minitab, Tanagra, SPSS and R trainings:
Everything is fine!
The table we will use here is the same as the one for the Sample Covariance (see above). Then we
just run the following query:
Everything is fine!
and here is the corresponding text (for copy/paste purposes during the training):
To see how regression functions work with Oracle, we will first create a table with the following
script:
and here is the corresponding text (for copy/paste purposes during the training):
This gives us the table used in the Minitab, Tanagra, SPSS and R trainings:
Of course, statistical software gives more results, but otherwise what we get back from Oracle
seems OK!
To see how the binomial test works with Oracle, we will first create a table with the following
script:
and here is the corresponding text (for copy/paste purposes during the training):
This is correct. It gives the exact probability of having 5 men under the hypothesis that finding a
man or a woman is equally likely (=50%). This corresponds to our calculation made with Microsoft
Office Excel in the Statistics course:
This is correct. It gives the exact probability of having 5 men or fewer under the hypothesis
that finding a man or a woman is equally likely (=50%). This corresponds to our calculation made
with Microsoft Office Excel in the Statistics course:
and we see that the result does not correspond to that of statistical software such as
Minitab:
What happened? It seems that Oracle makes the following mistake or choice, as you can see
below on the Microsoft Excel screenshot:
As you can see, it does not take into account the case where Mens=7... to be continued.
and here is the corresponding text (for copy/paste purpose during the training):
to compare with the result obtained with Minitab during the Statistics course:
The following example determines the significance of the difference between the average Pipeline1
and Pipeline2 flow where the distributions are assumed to have similar (pooled) variances:
To do such a test we first need to create a table with the following script:
and here is the corresponding full text (for copy/paste purposes during the training):
and then run the following query (the 1 in the third argument specifies which Pipeline is the
reference for the calculation!):
to compare with the result obtained with Minitab during the Statistics course:
Because Crosstabs creates a row for each value in one variable and a column for each value in the
other, the procedure is not suitable for continuous variables that assume many values.
To see how the crosstab test works with Oracle, we will first create a table with the following
script:
and here is the corresponding full text (for copy/paste purposes during the training):
corresponding to the following crosstab table, which is what we used for the Minitab, Tanagra,
SPSS and R trainings:
The CASE ... WHEN statement can be used to simplify multiple IF conditions. Here is an example
with the W3Schools database:
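A minimal sketch of such a CASE expression, assuming the W3Schools Products table. The price thresholds and the labels are arbitrary and only serve the illustration.

```sql
-- The WHEN branches are tested in order; the first one that is true wins.
SELECT ProductName,
       Price,
       CASE
           WHEN Price < 10 THEN 'Low'
           WHEN Price < 50 THEN 'Medium'
           ELSE 'High'
       END AS PriceClass
FROM   Products;
```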
.....
Or consider the more complete example with Oracle mixing different tables and SQL statements
and functions:
SELECT state,
       sum(mens) "Mens",
       sum(womens) "Womens",
       sum(accessories) "Accessories"
FROM (SELECT demo_customers.cust_state state,
             CASE
                WHEN demo_product_info.category = 'Mens'
                THEN demo_order_items.quantity * demo_order_items.unit_price
                ELSE 0
             END mens,
             CASE
                WHEN demo_product_info.category = 'Womens'
                THEN demo_order_items.quantity * demo_order_items.unit_price
                ELSE 0
             END womens,
             CASE
                WHEN demo_product_info.category = 'Accessories'
                THEN demo_order_items.quantity * demo_order_items.unit_price
                ELSE 0
             END accessories
      FROM demo_order_items,
           demo_product_info,
           demo_customers,
           demo_orders
      WHERE demo_order_items.product_id = demo_product_info.product_id
        AND demo_order_items.order_id = demo_orders.order_id
        AND demo_orders.customer_id = demo_customers.customer_id )
GROUP BY ROLLUP( state );
CASE:
• Can work with logical operators other than '='
• Can work with predicates and searchable queries
DECODE:
• Works only with the '=' / like operator
• Expressions are scalar values only
• Data consistency is not needed
Another example that highlights the fact that one cares about consistency and the other does not:
To study MERGE INTO we will first use the default EMP table available in Oracle:
and also create a Bonus table (now using a script instead of INSERT ALL, for fun):
As you can see, there are two more rows and the existing ones have their bonus updated.
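A hedged sketch of a MERGE INTO statement of this kind. The emp_bonus table, its columns, the department filter and the 10% rule are hypothetical; only the EMP table (with its empno, sal and deptno columns) is the Oracle default.

```sql
-- Hypothetical bonus table keyed by employee number.
CREATE TABLE emp_bonus (
    empno NUMBER(4) PRIMARY KEY,
    bonus NUMBER(7,2)
);

-- Existing rows are updated, missing rows are inserted, in one statement.
MERGE INTO emp_bonus b
USING (SELECT empno, sal FROM emp WHERE deptno = 30) e
ON    (b.empno = e.empno)
WHEN MATCHED THEN
    UPDATE SET b.bonus = b.bonus + e.sal * 0.10
WHEN NOT MATCHED THEN
    INSERT (empno, bonus)
    VALUES (e.empno, e.sal * 0.10);
```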
• MySQL: CONCAT( )
• Oracle: CONCAT( ) or ||
• SQL Server: +
In Oracle CONCAT takes only two arguments. Then if you need three you have at least two
choices:
Or:
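Both choices can be sketched as follows on the Oracle demo table (the cust_last_name column is an assumption):

```sql
-- Choice 1: nested CONCAT calls, since CONCAT accepts exactly two arguments in Oracle.
SELECT CONCAT(CONCAT(cust_first_name, ' '), cust_last_name) AS full_name
FROM   demo_customers;

-- Choice 2: the || operator, which chains any number of strings.
SELECT cust_first_name || ' ' || cust_last_name AS full_name
FROM   demo_customers;
```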
Oracle doesn't have some of the handy shorthand functions that Microsoft has embedded into its
VB programming languages and into SQL Server but, of course, provides a similar way to return
the same result.
In Microsoft's SQL Server, and in Visual Basic, you have the following:
LEFT(YourStringHere,NumCharsToGrab)
LEFT("birthday",5) = "birth"
LEFT("birthday",1) = "b"
RIGHT(YourStringHere,NumCharsToGrab)
RIGHT("birthday",3) = "day"
RIGHT("birthday",1) = "y"
Oracle's SUBSTR function works much the same as the MID function:
SUBSTR(YourStringHere,StartFrom,NumCharsToGrab)
SUBSTR("birthday",1,2) = "bi"
SUBSTR("birthday",-2,2) = "ay" (the -2 means that we start two characters from the end of the word)
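Written as an actual Oracle query (string literals take single quotes in SQL), the LEFT and RIGHT equivalents can be sketched like this:

```sql
SELECT SUBSTR('birthday', 1, 5)  AS left_5,    -- 'birth', like LEFT("birthday",5)
       SUBSTR('birthday', -3)    AS right_3,   -- 'day', like RIGHT("birthday",3)
       SUBSTR('birthday', -2, 2) AS last_2     -- 'ay'
FROM   dual;
```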
and the same again with number formatting (which will cause the numbers to be in text format in
Microsoft Excel!) to obtain thousands separators using the American representation:
If we take the previous example but now want thousands separators using the European
representation, we get:
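A sketch of both representations with Oracle's TO_CHAR. The literal value is illustrative, and "European" is assumed here to mean a decimal comma with a dot as thousands separator.

```sql
-- G = group (thousands) separator, D = decimal separator in the format model.
-- NLS_NUMERIC_CHARACTERS lists the decimal character first, then the group character.
SELECT TO_CHAR(1234567.89, '9G999G999D99', 'NLS_NUMERIC_CHARACTERS = ''.,''') AS american,
       TO_CHAR(1234567.89, '9G999G999D99', 'NLS_NUMERIC_CHARACTERS = '',.''') AS european
FROM   dual;
```

Both columns are returned as text, which is why a spreadsheet will not treat them as numbers.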
Remember the chapter about CONNECT BY on page 71 with the following org chart ;-)
It may be interesting to see how to insert such a timestamp in a database! For this purpose, let us
first create a table:
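A minimal Oracle sketch of such a table and of two ways to fill it. The table name demo_events and its columns are hypothetical.

```sql
-- Hypothetical table with a TIMESTAMP column.
CREATE TABLE demo_events (
    event_id   NUMBER PRIMARY KEY,
    created_at TIMESTAMP
);

-- Current timestamp of the database server.
INSERT INTO demo_events (event_id, created_at)
VALUES (1, SYSTIMESTAMP);

-- Explicit timestamp, parsed with a format model (FF3 = milliseconds).
INSERT INTO demo_events (event_id, created_at)
VALUES (2, TO_TIMESTAMP('2018-12-09 14:30:00.123', 'YYYY-MM-DD HH24:MI:SS.FF3'));
```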
Interval → group number:
−∞ to 0 → 0
0 to 1'000 → 1
1'000 to 2'000 → 2
2'000 to 3'000 → 3
3'000 to 4'000 → 4
4'000 to 5'000 → 5
5'000 to 6'000 → 6
6'000 to +∞ → 7
This gives us, for each employee, the number of the group he or she belongs to:
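Assuming this passage describes Oracle's WIDTH_BUCKET function applied to the salaries of the default EMP table (both are assumptions, since the original query is not reproduced here), the grouping can be sketched as:

```sql
-- Six equal-width buckets between 0 and 6000: bucket 0 is below the lower bound,
-- buckets 1 to 6 cover the range, bucket 7 is at or above the upper bound.
SELECT   ename,
         sal,
         WIDTH_BUCKET(sal, 0, 6000, 6) AS salary_group
FROM     emp
ORDER BY sal;
```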
and now run the following query, which does the same but without using the GROUP BY statement:
In the absence of any PARTITION BY or <window_clause> inside the OVER( ) portion, the function
acts on the entire record set returned by the WHERE clause.
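The difference can be sketched on the default EMP table (the choice of COUNT as the function is an assumption):

```sql
-- GROUP BY version: one row per department.
SELECT   deptno, COUNT(*) AS dept_count
FROM     emp
GROUP BY deptno;

-- Analytic version: every employee row is kept.
SELECT empno, ename, deptno,
       COUNT(*) OVER (PARTITION BY deptno) AS dept_count,   -- per department
       COUNT(*) OVER ()                    AS total_count   -- empty OVER(): whole result set
FROM   emp;
```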
This is especially useful for the Mann-Whitney and Wilcoxon statistical rank tests!
(see graduate training)
where:
• <offset> is the index of the leading row relative to the current row (positive integer with
default 1)
• <default> is the value to return if the <offset> points to a row outside the partition range.
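A sketch of these two parameters with the LEAD function on the default EMP table. The ordering column and the default value 0 are illustrative choices; LAG is shown alongside as the mirror function.

```sql
-- offset = 1 (next row), default = 0 when there is no next row in the partition.
SELECT ename, hiredate, sal,
       LEAD(sal, 1, 0) OVER (ORDER BY hiredate) AS next_hire_sal,
       LAG(sal, 1, 0)  OVER (ORDER BY hiredate) AS prev_hire_sal
FROM   emp;
```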
The following example selects, for each employee in all departments, the name of the employee
with the lowest salary:
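Assuming the function illustrated here is FIRST_VALUE and that the default EMP table is used, the query can be sketched as:

```sql
-- Every employee row shows the name of the employee with the lowest salary.
-- Ordering by sal DESC instead would give the employee with the highest salary.
SELECT deptno, ename, sal,
       FIRST_VALUE(ename) OVER (ORDER BY sal) AS lowest_paid
FROM   emp;
```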
or the opposite:
Now we can run the following non-trivial query to get the growth in percentage:
we get:
with the famous time consistent yield used a lot in finance (we just have to change the sign but
this is a detail)!
For the moment, if CODD tries to query our Demo_Customers table from ISOZ database he will
get:
And Codd will be able to query ISOZ tables using only SELECT statement:
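A sketch of the corresponding grant, assuming two Oracle users named ISOZ and CODD as in the text:

```sql
-- Run as ISOZ (owner of the table): give CODD read-only access.
GRANT SELECT ON demo_customers TO codd;

-- Run as CODD: the table has to be prefixed with the owner's schema.
SELECT * FROM isoz.demo_customers;
```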
You can then try an UPDATE SQL statement in the Codd session, for example:
There are roughly a hundred privileges that can be granted to a user. To see those available in
Oracle, the reader can refer to:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/statements_9013.htm
Privilege Description
SELECT Ability to query the table with a select statement.
INSERT Ability to add new rows to the table with the insert statement.
UPDATE Ability to update rows in the table with the update statement.
DELETE Ability to delete rows from the table with the delete statement.
REFERENCES Ability to create a constraint that refers to the table.
ALTER Ability to change the table definition with the alter table statement.
INDEX Ability to create an index on the table with the create index statement.
EXECUTE Ability to compile the function/procedure. Ability to execute the function/procedure directly.
As you can see in the example below, it is possible to create some nice procedures!:
9 PL-SQL
We will study PL-SQL (Procedural Language/Structured Query Language) in the next
course... but we can still see the basics!
Then create the following procedure, whose only purpose is to insert a single row given two
input values:
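A minimal PL/SQL sketch of such a procedure. The table demo_notes, the procedure name add_note and the parameter names are hypothetical.

```sql
-- Hypothetical target table.
CREATE TABLE demo_notes (
    note_id NUMBER PRIMARY KEY,
    label   VARCHAR2(50)
);

-- Procedure with two input values that inserts exactly one row.
CREATE OR REPLACE PROCEDURE add_note (
    p_id    IN NUMBER,
    p_label IN VARCHAR2
) AS
BEGIN
    INSERT INTO demo_notes (note_id, label)
    VALUES (p_id, p_label);
END add_note;
/

-- Calling the procedure from an anonymous block.
BEGIN
    add_note(1, 'first row');
END;
/
```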
When this PL-SQL code is run you will get in the Object Browser of Oracle in the category
Procedures the following:
If you click on Save & Compile you can then check if there is an error or not before using the
procedure:
and after that you just use your imagination to create the procedures you want, using all the SQL
statements and functions that we have studied until now.
• C stands for Consistency: a transaction must leave the database in a consistent state, whether
it succeeds or is rolled back.
• I is for Isolation: two database transactions happening at the same time should not affect
each other, and each should have a consistent view of the database. This is achieved by using
isolation levels in the database.
What is a Transaction?
A transaction is a set of SQL statements which Oracle treats as a single unit, i.e. either all the
statements execute successfully or none of them executes.
To control transactions, Oracle does not make any DML statement permanent unless you commit
it. If you don’t commit the transaction and the power goes off or the system crashes, then the
transaction is rolled back.
We then give a name to the script in the field Script Name and we write our script (it is slightly
incorrect because it is simplified for pedagogical reasons!):
It works for sure but in reality, a lot of troubles can occur! During the deletion for example:
To make the changes done in a transaction permanent use the COMMIT statement.
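A hedged PL/SQL sketch of a deletion treated as one transaction. It assumes the Oracle demo tables and customer 5 from the surrounding example; the actual script of the course may differ.

```sql
BEGIN
    -- Child rows first, then the parent rows.
    DELETE FROM demo_order_items
    WHERE  order_id IN (SELECT order_id FROM demo_orders WHERE customer_id = 5);
    DELETE FROM demo_orders    WHERE customer_id = 5;
    DELETE FROM demo_customers WHERE customer_id = 5;

    COMMIT;        -- the three deletions become permanent together
EXCEPTION
    WHEN OTHERS THEN
        ROLLBACK;  -- any failure undoes every deletion of this transaction
        RAISE;
END;
/
```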
As you can see below, if there is for example an error in the script, the beginning will be committed
(row 2) but the rest won't be executed:
but because of the ROLLBACK, everything that is related to the customer 5 is still here in reality:
Locks may be used to emulate transactions or to get more speed when updating tables. This is
explained in more detail later in this section.
LOCK TABLES explicitly acquires table locks for the current client session!
For the example, remember that on page 277 we created two sessions:
And:
lock_mode Explanation
ROW SHARE Allows concurrent access to the table, but users are prevented from locking the entire table for exclusive access.
SHARE ROW EXCLUSIVE Users can view records in table, but are prevented from updating the table or from locking the table in SHARE mode.
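A sketch of an explicit table lock in Oracle, using the demo_customers table as an example target:

```sql
-- NOWAIT returns an error immediately if another session already holds
-- a conflicting lock, instead of waiting for it to be released.
LOCK TABLE demo_customers IN SHARE ROW EXCLUSIVE MODE NOWAIT;

-- ... statements that need the lock ...

COMMIT;   -- ending the transaction releases the lock
```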
9.4 Triggers
Like a stored procedure, a trigger is a named PL/SQL unit that is stored in the database and can be
invoked repeatedly. Unlike a stored procedure, you can enable and disable a trigger, but you
cannot explicitly invoke it. While a trigger is enabled, the database automatically invokes it—that
is, the trigger fires—whenever its triggering event occurs. While a trigger is disabled, it does not
fire.
You create a trigger with the CREATE TRIGGER statement. You specify the triggering event in
terms of triggering statements and the item on which they act. The trigger is said to be created on
or defined on the item, which is either a table, a view, a schema, or the database. You also specify
the timing point, which determines whether the trigger fires before or after the triggering
statement runs and whether it fires for each row that the triggering statement affects.
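A minimal sketch of a row-level trigger in Oracle. The audit table demo_customers_log and the trigger name are hypothetical; demo_customers is the demo table used elsewhere in this course.

```sql
-- Hypothetical audit table.
CREATE TABLE demo_customers_log (
    customer_id NUMBER,
    changed_at  TIMESTAMP
);

-- Timing point: AFTER; triggering statement: UPDATE; item: the demo_customers table.
CREATE OR REPLACE TRIGGER trg_customers_update
AFTER UPDATE ON demo_customers
FOR EACH ROW
BEGIN
    INSERT INTO demo_customers_log (customer_id, changed_at)
    VALUES (:OLD.customer_id, SYSTIMESTAMP);
END;
/
```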
Then fire the trigger with, for example, the following code:
User Name:
Password:
sql = 'SELECT * FROM Users WHERE Name ="' + uName + '" AND Pass ="' + uPass + '"'
A smart hacker might get access to user names and passwords in a database by simply inserting "
or ""=" into the user name or password text box.
The code at the server will create a valid SQL statement like this:
SELECT * FROM Users WHERE Name ="" or ""="" AND Pass ="" or ""=""
The resulting SQL is valid. It will return all rows from the Users table, since OR ""="" is always
true.
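One common protection is to pass user input as bind variables instead of concatenating it into the statement. A hedged PL/SQL sketch, assuming a Users table with Name and Pass columns as in the example above (the variable names are illustrative):

```sql
DECLARE
    v_name  VARCHAR2(50) := '" or ""="';   -- hostile input, treated as plain data
    v_pass  VARCHAR2(50) := '" or ""="';
    v_count NUMBER;
BEGIN
    -- :n and :p are placeholders; the input never becomes part of the SQL text.
    EXECUTE IMMEDIATE
        'SELECT COUNT(*) FROM Users WHERE Name = :n AND Pass = :p'
        INTO  v_count
        USING v_name, v_pass;

    DBMS_OUTPUT.PUT_LINE('matching users: ' || v_count);
END;
/
```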
Let's start:
we get:
We can reproduce almost all results using the R statistical software (except the last one, for
which I was not able to find how Oracle calculates it…):
As you can see, this has nothing to do with a real binomial test:
In fact, whatever the table or the data used, Oracle always returns 0 for Cohen's kappa. This
means that there is almost surely an issue or a bug with this parameter.
We can obtain a value of Z close to the one returned by Oracle with the following command:
12 List of Figures
Figure 1 Northwind Database "star schema" ........................................................................... 22
13 List of Tables
Table 1 Common Databases Technologies ........................................................................... 13
14 Index
ACID properties ....................................297 CREATE DATABASE ........................ 105
ADD ......................................................147 CREATE FUNCTION.......................... 295
ADD_MONTHS( ) .......... 259
Aliases .......... 29
ALL .......... 93
ALL_USERS .......... 282
ALTER
    ADD .......... 147
    ALTER INDEX .......... 151
    MODIFY .......... 150
    READ ONLY .......... 151
    RENAME COLUMN .......... 150
    RENAME CONSTRAINT .......... 151
    RENAME TO .......... 146
ALTER INDEX .......... 151
ALTER TABLE .......... 146
ALTER VIEW .......... 165
AND .......... 46
ANY .......... 96
Auto-increment column .......... 158
AVG( ) .......... 192
AVG_ROW_LEN .......... 273
BEGIN...END .......... 290
BETWEEN .......... 59
Binomial probability .......... 320
Binomial test .......... 217
BOTTOM .......... 53
Cartesian Product .......... 60
CASE WHEN .......... 229
CAST( ) .......... 167
CHECK .......... 120
CHECK Constraint .......... 134
    single or multiple CHECK Constraint .......... 134
Chi-2 crosstab test .......... 226
Chi-square adequation test .......... 324
COALESCE( ) .......... 187
Cohen's Kappa .......... 327
COLLATION .......... 31
COLUMN_NAME .......... 273
Comments .......... 23
COMMIT .......... 297
CONCAT( ) .......... 241
CONNECT BY .......... 74
CORR( ) .......... 210
COUNT( ) .......... 194
COVAR_SAMP( ) .......... 208
CREATE INDEX .......... 141
CREATE PROCEDURE .......... 289
CREATE SEQUENCE .......... 158
CREATE TABLE .......... 110
    CHECK .......... 120
    Data Types .......... 110
    DEFAULT .......... 120
    FOREIGN KEY .......... 120
    NOT NULL .......... 120
    PRIMARY KEY .......... 120
    UNIQUE .......... 120
CREATE TRIGGER .......... 309
CREATE VIEW .......... 162
CROSS JOIN .......... 78
crosstab queries .......... 178, 182
CURRENT ROW .......... 212
data query language .......... 9
Data Science .......... 314
Data Types .......... 110
DATABASE( ) .......... 25
DECODE .......... 233, 235
DEFAULT .......... 120, 137
DELETE .......... 52, 133
DELETE CASCADE .......... 133
DENSE_RANK( ) .......... 267
DISABLE single or multiple PRIMARY KEY Constraint .......... 128
DISTINCT .......... 38
DISTINCTROW .......... 38
DROP
    DROP CONSTRAINT .......... 154
    DROP DATABASE .......... 153
    DROP INDEX .......... 155
DROP CHECK Constraint .......... 135
DROP constraints .......... 124
DROP FOREIGN KEY Constraint .......... 132
DROP single or multiple PRIMARY KEY Constraint .......... 128
DROP UNUSED .......... 153
DROP USER .......... 278
DROP VIEW .......... 166
DUAL .......... 170
EXCEPTION .......... 293, 307
EXIST .......... 90