Database 2


UNIT II

Relational Model
• The relational model represents how data is stored
in Relational Databases. A relational database
consists of a collection of tables, each of which is
assigned a unique name.
• The relational model for database management is an
approach to logically represent and manage the data
stored in a database. In this model, the data is
organized into a collection of two-dimensional
inter-related tables, also known as relations. Each
relation is a collection of columns and rows, where
the columns represent the attributes of an entity and
the rows (or tuples) represent the records.
Basic Structure
• Given sets D1, D2, …, Dn, a relation r is a subset of
D1 × D2 × … × Dn
Thus a relation is a set of n-tuples (a1, a2, …, an) where
each ai ∈ Di
• Ex: if customer-name = {Jones, Smith, Curry, Lindsay}
customer-street = {Main, North, Park}
customer-city = {Harrison, Rye, Pittsfield}
Then r = { (Jones, Main, Harrison),
(Smith, North, Rye),
(Curry, North, Rye),
(Lindsay, Park, Pittsfield)}
is a relation over customer-name × customer-street ×
customer-city
customer-name   customer-street   customer-city
Jones           Main              Harrison
Smith           North             Rye
Curry           North             Rye
Lindsay         Park              Pittsfield
(The columns are the attributes; the rows are the tuples.)
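• A minimal sketch in Python (not part of the original slides) of the idea that a relation is a subset of the Cartesian product of its domains. It uses the three domains of the customer example, with each distinct value listed once; the variable names are only illustrative.

```python
from itertools import product

# The three domains of the customer example, each distinct value once.
customer_name = {"Jones", "Smith", "Curry", "Lindsay"}
customer_street = {"Main", "North", "Park"}
customer_city = {"Harrison", "Rye", "Pittsfield"}

# The Cartesian product holds every possible (name, street, city) combination.
cartesian = set(product(customer_name, customer_street, customer_city))

# The relation r is one particular subset of that product.
r = {
    ("Jones", "Main", "Harrison"),
    ("Smith", "North", "Rye"),
    ("Curry", "North", "Rye"),
    ("Lindsay", "Park", "Pittsfield"),
}

print(len(cartesian))  # number of possible combinations (4 * 3 * 3)
print(r <= cartesian)  # True when every tuple of r is drawn from the domains
```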
Schema
• Schema is the overall description of the database. The
basic structure of how the data will be stored in the
database is called schema.
• Database systems consist of complex data structures.
Thus, to make the system efficient for retrieval of data
and reduce the complexity of the users, developers
use the method of Data Abstraction.
• There are mainly three levels of data abstraction:
• Internal or Physical level: Describes how the
database is designed at the physical level.
• Conceptual or Logical level: Structure and
constraints for the entire database.
• External or View level: Describes various user views.
• Internal or Physical Level/Schema
• The internal schema is the lowest level of data
abstraction.
• It keeps information about the actual
representation of the entire database, such as the actual
storage of the data on the disk in the form of records.
• The internal view tells us what data is stored in the
database and how.
• It never deals with the physical devices.
• Conceptual schema:
• Defines all database entities, their attributes, and
their relationships.
• Security and integrity information.
• At the conceptual level, the data available to a user must
be contained in or derivable from the physical level.
• External schema:
• The external level is concerned only with the data that is
viewed by specific end users.
• This level can include several external schemas.
• The external schema level is nearest to the user.
• An external schema describes the segment of the database
that a certain user group needs and hides the remaining
details of the database from that group.
• Objectives of using Three schema Architecture:
• Every user should be able to access the same data but
see a customized view of it.
• Users need not deal directly with physical database storage
details.
• The DBA should be able to change the database storage
structure without disturbing the users' views.
• The internal structure of the database should remain
unaffected when changes are made to the physical aspects of
storage.
Table Student

ROLL_NO  NAME    ADDRESS  PHONE       AGE
1        RAM     DELHI    9455123451  18
2        RAMESH  GURGAON  9652431543  18
3        SUJIT   ROHTAK   9156253131  20
4        SURESH  DELHI                18
• Consider a relation STUDENT with attributes (Columns)
ROLL_NO, NAME, ADDRESS, PHONE, and AGE
shown in the table.
• Important Terminologies
• Attribute: Attributes are the properties that define an
entity. e.g.; ROLL_NO, NAME, ADDRESS etc
• Tuple: Each row in the relation is known as a tuple. The
above relation contains 4 tuples.
• Degree: The number of attributes in the relation is known
as the degree of the relation.
• The STUDENT relation defined above has degree 5.
• Relation Schema: A relation schema defines the
structure of the relation and gives the name of the
relation with its attributes. e.g. STUDENT (ROLL_NO,
NAME, ADDRESS, PHONE, AGE) is the relation
schema for STUDENT. A schema that contains more than
one relation is called a relational schema.
• Relation Instance: The set of tuples of a relation at a
particular instance of time is called a relation instance.
Table shows the relation instance of STUDENT at a
particular time. It can change whenever there is an
insertion, deletion, or update in the database.
• Cardinality: The number of tuples in a relation is known
as cardinality. The STUDENT relation defined above has
cardinality 4.
• Column: The column represents the set of values for a
particular attribute. The column ROLL_NO is extracted
from the relation STUDENT.
• NULL Values: The value which is not known or
unavailable is called a NULL value. It is represented
by blank space. e.g.; PHONE of STUDENT having
ROLL_NO 4 is NULL.
• Tables – In the relational model, relations are
saved in table format, each stored along with its
entities. A table has two properties: rows and
columns. Rows represent records and columns
represent attributes.
• Attribute domain – Every attribute has a predefined
set of permitted values, known as the attribute
domain.
• Column: The column represents the set of values
for a specific attribute.
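• The terms above can be checked on the STUDENT table. This is a small sketch using Python's built-in sqlite3 module with an in-memory database; the column types are assumptions, since the slides do not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE STUDENT "
    "(ROLL_NO INTEGER, NAME TEXT, ADDRESS TEXT, PHONE TEXT, AGE INTEGER)"
)
conn.executemany(
    "INSERT INTO STUDENT VALUES (?, ?, ?, ?, ?)",
    [
        (1, "RAM", "DELHI", "9455123451", 18),
        (2, "RAMESH", "GURGAON", "9652431543", 18),
        (3, "SUJIT", "ROHTAK", "9156253131", 20),
        (4, "SURESH", "DELHI", None, 18),  # None is stored as NULL
    ],
)

cursor = conn.execute("SELECT * FROM STUDENT")
degree = len(cursor.description)      # number of attributes (columns)
cardinality = len(cursor.fetchall())  # number of tuples (rows)

# Rows whose PHONE is unknown are found with IS NULL.
null_phones = conn.execute(
    "SELECT ROLL_NO FROM STUDENT WHERE PHONE IS NULL"
).fetchall()

print(degree, cardinality, null_phones)
```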
KEYS
• Keys are one of the basic requirements of
a relational database model. They are widely used to
identify the tuples (rows) uniquely in a table.
We also use keys to set up relationships amongst
various columns and tables of a relational
database.
• Different Types of Keys in the Relational
Model
• Super Key, Candidate Key, Primary Key,
Alternate Key, Foreign Key, Composite Key
• Candidate Key: A minimal set of attributes
that can uniquely identify a tuple is known as a
candidate key. For example, STUD_NO in the
STUDENT relation.
• It is a minimal super key: a super key from
which no attribute can be removed.
• In other words, a super key with no redundant
attributes is called a candidate key.
• It must contain unique values.
• It can contain NULL values (unlike the primary key).
• Candidate Key cont…
• Every table must have at least one candidate key.
• A table can have multiple candidate keys but only one
primary key (the primary key cannot have a NULL value,
so a candidate key with a NULL value can't be the
primary key).
• The value of a candidate key is unique and may be null
for a tuple.
• There can be more than one candidate key in a relation.
• Consider the following Student schema-
• Student ( roll ,name ,sex ,age ,address ,class ,section )
• Given below are the examples of candidate keys-
• ( class , section , roll )
• ( name , address )
• Super Key: A set of attributes that can uniquely identify a
tuple is known as a super key. For example, STUD_NO,
(STUD_NO, STUD_NAME), etc. A super key may consist of a
single attribute or of several attributes that together
identify rows in a table. Its attributes may contain NULL values.
• Adding zero or more attributes to a candidate key generates
a super key.
• Every candidate key is a super key, but the converse is not true.
• Consider the following Student schema-
• Student ( roll , name , sex , age , address , class , section )
• Given below are the examples of super keys since each set can
uniquely identify each student in the Student table-
• ( roll , name , sex , age , address , class , section )
• ( class , section , roll )
• (class , section , roll , sex )
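• The difference between a super key and a candidate key can be sketched in Python. The rows below are a hypothetical instance of the Student schema (reduced to four attributes), and the two helper functions are illustrative names, not a library API. Note that a key is really a constraint on every possible instance; testing one instance can only show that an attribute set is consistent with being a key.

```python
from itertools import combinations

# A small hypothetical instance of Student(roll, name, class, section).
rows = [
    {"roll": 1, "name": "Asha", "class": 10, "section": "A"},
    {"roll": 2, "name": "Ravi", "class": 10, "section": "A"},
    {"roll": 1, "name": "Ravi", "class": 10, "section": "B"},
    {"roll": 1, "name": "Asha", "class": 9, "section": "A"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on all attributes in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def is_candidate_key(attrs, rows):
    """A super key is a candidate key if no proper subset is a super key."""
    if not is_super_key(attrs, rows):
        return False
    return not any(
        is_super_key(subset, rows)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_super_key(("class", "section", "roll"), rows))
print(is_candidate_key(("class", "section", "roll"), rows))
# Adding name keeps it a super key but it is no longer minimal.
print(is_candidate_key(("class", "section", "roll", "name"), rows))
```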
• Primary Key: There can be more than one candidate key
in a relation, out of which one is chosen as the primary
key. For example, STUD_NO as well as
STUD_PHONE are candidate keys for the relation
STUDENT, but STUD_NO can be chosen as the primary
key (only one out of many candidate keys).
• It is a unique key.
• Each of its values identifies exactly one tuple (record).
• It has no duplicate values.
• It cannot be NULL.
• A primary key need not be a single column;
a combination of columns can also be the primary key of a
table.
• Alternate Key: A candidate key other than
the primary key is called an alternate key.
• All the candidate keys which are not chosen as the
primary key are called alternate keys.
• It is also called a secondary key.
• It may consist of one or more fields.
• Like any candidate key, its values are not repeated.
• E.g. (SNAME, ADDRESS) is an alternate key.
• Foreign Key: A foreign key is a column of a table used to
point to the primary key of another table.
• Every employee works in a specific department
in a company, and employee and department are
two different entities. So we should not store the
department's information in the employee table.
That's why we link these two tables through the
primary key of one table.
• We add the primary key of the DEPARTMENT
table, Department_Id, as a new attribute in the
EMPLOYEE table.
• In the EMPLOYEE table, Department_Id is the
foreign key, and both the tables are related.
(Figure: the Employee and Department tables.)
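• A minimal sketch of the EMPLOYEE and DEPARTMENT link, run with Python's sqlite3 module. Only Department_Id comes from the slides; the other column names and the sample values are assumptions. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON, so that line is specific to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.execute(
    "CREATE TABLE DEPARTMENT (Department_Id INTEGER PRIMARY KEY, Dept_Name TEXT)"
)
conn.execute(
    """CREATE TABLE EMPLOYEE (
           Emp_Id INTEGER PRIMARY KEY,
           Emp_Name TEXT,
           Department_Id INTEGER REFERENCES DEPARTMENT(Department_Id)
       )"""
)
conn.execute("INSERT INTO DEPARTMENT VALUES (10, 'Sales')")
conn.execute("INSERT INTO EMPLOYEE VALUES (1, 'Asha', 10)")  # department 10 exists

try:
    # Department 99 does not exist, so the foreign key rejects this row.
    conn.execute("INSERT INTO EMPLOYEE VALUES (2, 'Ravi', 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

employee_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(rejected, employee_count)
```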
• Composite Key: Sometimes a table might not have a
single column/attribute that uniquely identifies all the
records of a table. To uniquely identify rows of a table,
a combination of two or more columns/attributes can
be used. A poorly chosen combination can still give
duplicate values, so we need to find a minimal set of
attributes that uniquely identifies the rows in a table.
• It can act as the primary key when no single attribute
is suitable.
• Two or more attributes are used together to make a
composite key.
• Different combinations of attributes may differ in how
well they identify the rows uniquely.
• Composite Key cont…
• For example, in an employee relation we assume that an
employee may be assigned multiple roles and may work on
multiple projects simultaneously. So the primary key will
be composed of all three attributes, namely Emp_ID,
Emp_role, and Proj_ID, in combination. These attributes
act as a composite key since the primary key comprises
more than one attribute.
Primary Key | Foreign Key
Helps you to uniquely identify a record in the table. | It is a field in the table that is the primary key of another table.
A primary key never accepts null values. | A foreign key may accept multiple null values.
In some DBMSs (for example SQL Server, by default) the primary key is a clustered index, and data in the table are physically organized in the sequence of the clustered index. | In such systems a foreign key does not automatically create an index, clustered or non-clustered. However, you can manually create an index on the foreign key.
You can have only a single primary key in a table. | You can have multiple foreign keys in a table.
Super Key | Primary Key
A super key is a set of one or more attributes used to uniquely identify all tuples in a relation. | A primary key is a minimal set of attributes used to uniquely identify all tuples in a relation.
Not all super keys can be primary keys. | The primary key is a minimal super key.
The super keys together form the pool from which candidate keys are selected. | Any one of the candidate keys can be chosen to be the primary key.
There are more super keys than primary keys. | There are fewer primary keys than super keys.
A super key's attributes can contain NULL values. | A primary key's attributes cannot contain NULL values.
Super Key | Candidate Key
A super key is used to identify all the records in a relation. | A candidate key is a minimal super key.
Not all super keys are candidate keys. | All candidate keys are super keys.
Candidate keys are selected from the super keys. | The primary key is selected from the candidate keys.
There are more super keys than candidate keys. | There are fewer candidate keys than super keys.
Schema Diagrams
• A database schema, along with primary key and
foreign key dependencies, can be represented by
schema diagrams. Figure shows the schema
diagram for our university organization. Each
relation appears as a box, with the relation name
at the top in blue, and the attributes listed inside
the box. Primary key attributes are shown
underlined. Foreign key dependencies appear as
arrows from the foreign key attributes of the
referencing relation to the primary key of the
referenced relation.
Relational Query Languages
• A query language is a language in which a user requests
information from the database. These languages are usually
on a level higher than that of a standard programming
language. Query languages can be categorized as either
procedural or nonprocedural.
• In a procedural language, the user instructs the system to
perform a sequence of operations on the database to
compute the desired result. The instructions must be
written in the order in which they are to be carried out.
• In a nonprocedural language, the user describes the
desired information without giving a specific procedure for
obtaining that information. The user does not specify the
order of the steps.
• Basic Types
• The SQL standard supports a variety of built-in
types, including:
• char(n): The char data type is used to store
character values. It is a fixed-length data type, i.e.
every value occupies the declared size n, and shorter
values are padded with spaces. Hence, it is also called a
static data type.
• varchar(n): The varchar data type is used to
store character values. It is a variable-length
data type, i.e. each value occupies only as many
characters as it needs, up to the maximum n. Hence, it is
also called a dynamic data type.
• VARCHAR2: VARCHAR2 is the same as VARCHAR in the
Oracle database. The main difference is that
VARCHAR is the ANSI (American National
Standards Institute) standard and VARCHAR2 is the
Oracle standard. The VARCHAR2 data type is used to
store character values. It is a variable-length
data type, i.e. each value occupies only as many
characters as it needs. Hence, it is
also called a dynamic data type.
• Example: a = 'rahul', b = 'krishna', c = 'loki'
• declared as a char(10), b varchar(10), c varchar2(10)
• The stored lengths are 10, 7 and 4 respectively.
Char | VarChar/VarChar2
Char stands for "Character". | VarChar/VarChar2 stands for "Variable Character".
It is used to store character strings of fixed length. | It is used to store character strings of variable length.
It has a maximum size of 2000 bytes. | It has a maximum size of 4000 bytes.
Char pads spaces on the right side to fill the length specified in the declaration. | VarChar does not pad spaces on the right side to fill the length specified in the declaration.
It is not required to specify the size at the time of declaration; it takes 1 byte by default. | It is required to specify the size at the time of declaration.
It is a static data type (i.e. fixed length). | It is a dynamic data type (i.e. variable length).
It can lead to memory wastage. | It manages memory efficiently.
It is about 50% faster than VarChar/VarChar2. | It is relatively slower as compared to Char.
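• The padding behaviour can be modelled in a few lines of Python. This is only an illustration of fixed-length versus variable-length storage, not how any particular DBMS implements it; store_char and store_varchar are hypothetical helper names.

```python
def store_char(value: str, n: int) -> str:
    """Model char(n): right-pad the value with spaces to exactly n characters."""
    if len(value) > n:
        raise ValueError("value longer than declared size")
    return value.ljust(n)

def store_varchar(value: str, n: int) -> str:
    """Model varchar(n): store the value as it is, up to n characters."""
    if len(value) > n:
        raise ValueError("value longer than declared size")
    return value

a = store_char("rahul", 10)       # fixed length: always 10 characters
b = store_varchar("krishna", 10)  # variable length: 7 characters
c = store_varchar("loki", 10)     # variable length: 4 characters

print(len(a), len(b), len(c))
```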
• INT: ANSI specific integer type with maximum
precision of 38 decimal digits.
• INTEGER: ANSI and IBM specific integer type with maximum
precision of 38 decimal digits.
• smallint: A small integer (a machine-dependent
subset of the integer type).
• numeric(p, d): A fixed-point number with user-
specified precision. The number consists of p digits
(plus a sign), and d of the p digits are to the right of
the decimal point. Thus numeric(3,1) allows 44.5 to
be stored exactly, but neither 444.5 nor 0.32 can be
stored exactly in a field of this type.
• REAL: Floating-point type with maximum
precision of 63 binary digits (approximately 18
decimal digits).
• FLOAT: ANSI and IBM specific floating-point
type with maximum precision of 126 binary
digits (approximately 38 decimal digits).
• DOUBLE PRECISION: ANSI specific
floating-point type with maximum precision of
126 binary digits (approximately 38 decimal
digits).
• Boolean Type: The data type boolean comprises
the distinct truth values true and false.
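• The numeric(p, d) rule can be checked with a short Python sketch. fits_numeric is a hypothetical helper written for this illustration; it tests whether a value has at most d digits after the decimal point and at most p digits in total.

```python
from decimal import Decimal

def fits_numeric(value: str, p: int, d: int) -> bool:
    """True if value can be stored exactly in numeric(p, d)."""
    number = Decimal(value)
    # Shift the decimal point d places; an exact value becomes a whole number.
    scaled = number * (Decimal(10) ** d)
    if scaled != scaled.to_integral_value():
        return False  # more than d digits after the decimal point
    # The shifted whole number must have at most p digits.
    return abs(scaled) < Decimal(10) ** p

for text in ("44.5", "444.5", "0.32"):
    print(text, fits_numeric(text, 3, 1))
```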
The Relational Algebra
• Relational Algebra is a procedural query
language. Relational algebra mainly provides a
theoretical foundation for relational databases
and SQL. The main purpose of using Relational
Algebra is to define operators that transform one
or more input relations into an output relation.
Given that these operators accept relations as
input and produce relations as output, they can be
combined and used to express potentially complex
queries that transform potentially many input
relations into a single output relation.
Fundamental Operators
• Selection (σ) sigma
• Projection (π) Pi
• Union (U)
• Set Difference (-)
• Set Intersection (∩)
• Rename (ρ) Rho
• Cartesian Product (X)
Selection (σ) sigma
• Selection (σ): It is used to select the required tuples of
a relation. The select operation selects tuples that
satisfy a given predicate.
• It is denoted by sigma (σ). Notation: σ p(r)
• Where:
• σ denotes the selection operator
• r is the relation
• p is the predicate, a propositional logic formula which
may use connectives like AND, OR and NOT, and
relational operators
like =, ≠, ≥, <, >, ≤.
• For the relation R below, σ C>3 (R)
will select the tuples which have a
value of C greater than 3.

R
A  B  C
1  2  4
2  2  3
3  2  3
4  3  4

• σ BRANCH_NAME = "Pune" (LOAN)
• Select rows from LOAN where
BRANCH_NAME is Pune.

LOAN
BRANCH_NAME  LOAN_NO  AMOUNT
Downtown     L-17     1000
Redwood      L-23     2000
Pune         L-15     1500
Downtown     L-14     1500
Mianus       L-13     500
Roundhill    L-11     900
Pune         L-16     1300
• σ topic = "Database" and author = "guru"( Tutorials)
• Output – Selects tuples from Tutorials
where the topic is ‘Database’ and author
is ‘guru’.
• σ sales > 50000 (Customers)
• Output – Selects tuples from
Customers where sales is greater than
50000
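• Selection can be sketched in Python as a filter over the rows of the LOAN relation. The select function is an illustrative helper, not a library call, and the second predicate (an amount above 1400) is an added example to show AND.

```python
loan = [
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-17", "AMOUNT": 1000},
    {"BRANCH_NAME": "Redwood", "LOAN_NO": "L-23", "AMOUNT": 2000},
    {"BRANCH_NAME": "Pune", "LOAN_NO": "L-15", "AMOUNT": 1500},
    {"BRANCH_NAME": "Downtown", "LOAN_NO": "L-14", "AMOUNT": 1500},
    {"BRANCH_NAME": "Mianus", "LOAN_NO": "L-13", "AMOUNT": 500},
    {"BRANCH_NAME": "Roundhill", "LOAN_NO": "L-11", "AMOUNT": 900},
    {"BRANCH_NAME": "Pune", "LOAN_NO": "L-16", "AMOUNT": 1300},
]

def select(relation, predicate):
    """σ predicate (relation): keep only the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]

# σ BRANCH_NAME = "Pune" (LOAN)
pune_loans = select(loan, lambda row: row["BRANCH_NAME"] == "Pune")

# σ BRANCH_NAME = "Pune" AND AMOUNT > 1400 (LOAN)
big_pune_loans = select(
    loan, lambda row: row["BRANCH_NAME"] == "Pune" and row["AMOUNT"] > 1400
)

print([row["LOAN_NO"] for row in pune_loans])
print([row["LOAN_NO"] for row in big_pune_loans])
```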
Project Operation ∏ (Pi).
• This operation lists the attributes
that we wish to appear in the result. The rest of the
attributes are eliminated from the table.
• It is denoted by ∏ (Pi).
• Notation: ∏ A1, A2, …, An (r)
• Where
• A1, A2, …, An are attribute names of
relation r.
• Note: By default, projection removes duplicate
data.
R
A  B  C
1  2  4
2  2  3
3  2  3
4  3  4

∏ B, C (R)
B  C
2  4
2  3
3  4
• The tuple (2, 3) occurs twice in the B and C columns of R but only
once in the result, because projection removes duplicate data.
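• A small Python sketch of projection on the relation R. Building the result as a set is what removes the duplicate (2, 3); the project function and its parameters are illustrative names only.

```python
R = [(1, 2, 4), (2, 2, 3), (3, 2, 3), (4, 3, 4)]
columns = ("A", "B", "C")

def project(relation, columns, wanted):
    """∏ wanted (relation): keep the wanted attributes and drop duplicates."""
    indexes = [columns.index(name) for name in wanted]
    # A set keeps each resulting tuple only once.
    return {tuple(row[i] for i in indexes) for row in relation}

result = project(R, columns, ("B", "C"))
print(sorted(result))
```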
Union Operation (∪)
• Suppose there are two relations R and S. The union
operation contains all the tuples that are either in
R or in S or in both.
• A union operation must satisfy the following
condition:
• R and S must have the same number of attributes,
with compatible domains.
• Duplicate tuples are eliminated automatically.
Example

Consider the following tables.

Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3

A ∪ B gives

Table A ∪ B
column 1  column 2
1         1
1         2
1         3
Intersection (∩)
• An intersection is defined as all tuples (rows) that
are present in both of two union-compatible (same
columns and same types) relations A and B.
• The following definitions are equivalent:
• all elements of A that also belong to B
• all elements of B that also belong to A
Example

Consider the following tables.

Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3

A ∩ B gives

Table A ∩ B
column 1  column 2
1         1
Rename Operation rho (ρ)
• The rename operation is used to rename the output
relation or its attributes. It is denoted by rho (ρ).
• In relational algebra, a rename is a unary
operation written as ρ a/b (R) where:
• R is a relation, and a and b are attribute names; the
result is identical to R except that attribute b is renamed to a.

Employee
Name   EmployeeId
Harry  3415
Sally  2241

ρ EmployeeName/Name (Employee)
EmployeeName  EmployeeId
Harry         3415
Sally         2241
Cartesian Product (X)
• The Cartesian product (cross product) combines two
relations. For relations A and B, the cross product
A X B has all the attributes of
A followed by all the attributes of B. Each record of
A is paired with every record of B.
• If A has 'n' tuples and B has 'm' tuples then A X
B will have 'n*m' tuples.
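• The n*m rule can be seen in a short Python sketch. The two small relations A and B are sample data chosen for illustration.

```python
from itertools import product

A = [(1, 1), (1, 2)]  # n = 2 tuples
B = [(1, 1), (1, 3)]  # m = 2 tuples

# Each tuple of A is paired with every tuple of B; the attributes of A
# come first, followed by the attributes of B.
cross = [a + b for a, b in product(A, B)]

for row in cross:
    print(row)
print(len(cross) == len(A) * len(B))
```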
Set Difference (–)
• It is denoted by the – symbol. The result of A – B is a
relation which includes all tuples that are in A
but not in B.
• The attribute names of A have to match the
attribute names of B.
• The two operand relations A and B must be
union compatible.
• The result is a relation consisting of the
tuples that are in relation A but not in B.
Example

Consider the following tables.

Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3

A – B gives

Table A – B
column 1  column 2
1         2
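• Union, intersection and set difference map directly onto Python's set operators, which gives a quick way to check the three examples. This is a sketch with relations modelled as sets of tuples; both relations have the same two columns, so they are union compatible.

```python
# Tables A and B from the examples, as sets of (column 1, column 2) tuples.
A = {(1, 1), (1, 2)}
B = {(1, 1), (1, 3)}

union = A | B         # tuples in A or in B; duplicates appear once
intersection = A & B  # tuples in both A and B
difference = A - B    # tuples in A but not in B

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```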
Structured Query Language (SQL)
• SQL was the first commercial language introduced for E. F.
Codd's relational model of databases.
• Structured Query Language is a standard database
language which is used to create, maintain and retrieve
data in relational databases.
• SQL keywords are case insensitive. But it is a recommended
practice to write keywords (like SELECT, UPDATE, CREATE, etc.)
in capital letters and user-defined names (like table
names, column names, etc.) in small letters.
• SQL is the language for relational databases
like MySQL, Oracle, Sybase, SQL Server, PostgreSQL, etc.
Non-relational databases (also called NoSQL
databases) like MongoDB, DynamoDB, etc. do not use SQL.
Data Definition Language(DDL)
• DDL or Data Definition Language actually
consists of the SQL commands that can be used to
define the database schema. It simply deals with
descriptions of the database schema and is used to
create and modify the structure of database objects
in the database. DDL is a set of SQL commands
used to create, modify, and delete database
structures but not data. These commands are
normally not used by a general user, who should
be accessing the database via an application.
• Following are the six DDL
commands in SQL:
• CREATE Command
• DROP Command
• ALTER Command
• TRUNCATE Command
• RENAME Command
• COMMENT Command
CREATE Command
• CREATE is a DDL command used to create
databases, tables, triggers and other database
objects.
• Syntax to create a database:
• CREATE DATABASE Database_Name;
• Suppose you want to create a Books database.
To do this, you have to write
the following DDL command:
• CREATE DATABASE Books;
• The following describes how to create a new table using
the CREATE DDL command.
• CREATE TABLE table_name (column_Name1 data_type(size of the column), column_Name2 data_type(size of the column), …, column_NameN data_type(size of the column));
• Suppose you want to create a Student table with five
columns in the SQL database. DDL command:
• CREATE TABLE Student (Roll_No INT, First_Name VARCHAR(20), Last_Name VARCHAR(20), Age INT, Marks INT);

• DESCRIBE: As the name suggests, DESCRIBE is used
to describe something. Since in a database we have tables,
we use DESCRIBE or DESC (both are the same) to describe
the structure of a table.
• Restrictions / Rules for creating a table :
• 1. Table names and column names must begin with a
letter.
• 2. Table names and column names can be 1 to 30
characters long.
• 3. Table names must contain only the characters A -
Z , a - z , 0 - 9 , underscore _, $ and #
• 4. A table name should not be the same as that of another
database object.
• 5. Table name must not be an ORACLE reserved
word.
• 6. Column names should not be duplicate within a
table definition.
• CREATE TABLE product (id INT
PRIMARY KEY, name
VARCHAR(70) NOT NULL, producer
VARCHAR(100) NOT NULL, price
DECIMAL(7,2));
DROP Command
• DROP is a DDL command used to delete/remove
the database objects from the SQL database. We
can easily remove the entire table, view, or index
from the database using this DDL command.
• DROP DATABASE Database_Name;
• DROP DATABASE Books;
• DROP TABLE Table_Name;
• DROP TABLE Student;
ALTER Command
• ALTER is a DDL command which changes or modifies the existing
structure of the database, and it also changes the schema of database
objects.
• We can also add and drop constraints of the table using the ALTER
command.
• ALTER TABLE name_of_table ADD column_name column_definition;
• ALTER TABLE Student ADD Father_Name VARCHAR(60);
• ALTER TABLE name_of_table DROP (Column_Name_1, column_Name_2, …, column_Name_N);
• ALTER TABLE Student DROP (Age, Marks);
• ALTER TABLE table_name MODIFY (column_name column_datatype(size));
• ALTER TABLE Student MODIFY (Last_Name VARCHAR(25));
TRUNCATE Command
• TRUNCATE is another DDL command which
deletes or removes all the records from the table.
• This command also removes the space allocated
for storing the table records.
• TRUNCATE TABLE Student;
RENAME Command
• RENAME is a DDL command which is used to
change the name of a database table.
• RENAME TABLE Old_Table_Name TO New_Table_Name;
• RENAME TABLE Student TO Student_Details;
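• The DDL commands can be tried out with Python's built-in sqlite3 module. This is a sketch in the SQLite dialect, which differs from the syntax above in places: it writes ADD COLUMN, renames a table with ALTER TABLE ... RENAME TO, and has no TRUNCATE command. The Father_Name column is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# CREATE: define the table structure.
conn.execute(
    "CREATE TABLE Student (Roll_No INT, First_Name VARCHAR(20), "
    "Last_Name VARCHAR(20), Age INT, Marks INT)"
)

# ALTER: add a column, then rename the table (SQLite syntax).
conn.execute("ALTER TABLE Student ADD COLUMN Father_Name VARCHAR(60)")
conn.execute("ALTER TABLE Student RENAME TO Student_Details")

def column_names(table):
    # PRAGMA table_info returns one row per column; index 1 is the name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

columns = column_names("Student_Details")
print(columns)

# DROP: remove the table and its structure.
conn.execute("DROP TABLE Student_Details")
remaining = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()
print(remaining)
```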
SQL Comments
• SQL comments are used to explain sections
of SQL statements, and to prevent the
execution of SQL statements.
• There are three types of comments, which are
given below:
• Single line comments.
• Multi-line comments
• Inline comments
• Single Line Comment
• Comments that start and end on a single line are
called single-line comments. A line which
starts with '--' is a single-line comment, and that
particular line is not executed.
• The text between -- and the end of the line is ignored
and is not executed.
• --SELECT * FROM Employees;
• SELECT * FROM Customers -- WHERE City='London';
• Multi-line Comments
• Comments that start on one line and end on a different
line are called multi-line comments. The text
between /* and */ is ignored.
• The '/*' marks the starting
point of the comment, which is terminated by the
closing '*/'.
• /*SELECT * FROM Customers;
• SELECT * FROM Products;
• SELECT * FROM Orders;
• SELECT * FROM Categories;*/
• SELECT * FROM Suppliers;
• Inline comments:
• Inline comments are an extension of multi-line
comments: they can be placed within a
statement and are enclosed between '/*' and
'*/'.
• Syntax:
• SELECT * FROM /*Employees; */
• SELECT * /*Selects all columns and rows*/
FROM customer;
DML (Data Manipulation
Language)
• The DML commands in Structured Query
Language change the data present in the SQL
database. We can easily access, store, modify,
update and delete the existing records from the
database using DML commands.
• Following are the four main DML commands
in SQL:
• INSERT Command
• SELECT Command
• UPDATE Command
• DELETE Command
• The INSERT command is used to insert data into a
row of a table. INSERT INTO inserts the values
that are given. Syntax:
• INSERT INTO NAME_OF_TABLE (column1, column2, column3, …, columnN) VALUES (value1, value2, value3, …, valueN); OR
• INSERT INTO NAME_OF_TABLE VALUES (value1, value2, value3, …, valueN);
• Example:
• INSERT INTO Student (Stu_Name, DOB, Phone, Mail) VALUES ('Ram', '1998-05-26', 7812865845, '[email protected]');
• INSERT INTO Student VALUES ('Ram', '1998-05-26', 7812865845, '[email protected]');
• SELECT is the most important data manipulation
command in Structured Query Language. The
SELECT command shows the records of the
specified table. With the WHERE clause it shows only
the records that satisfy a condition.
• SELECT * FROM table_name;
• SELECT * FROM Student;
• SELECT Emp_Id, Emp_Salary FROM Employee;
• SELECT * FROM Student WHERE Stu_Marks = 80;
• SELECT * FROM Student WHERE Stu_Marks > 75;
• The UPDATE command allows users to update or
modify the existing data in database tables.
• UPDATE Table_name SET [column_name1 = value_1, …, column_nameN = value_N] WHERE CONDITION;
• UPDATE Product SET Product_Price = 80
WHERE Product_Id = 'P102';
• UPDATE Student SET Stu_Marks = 80, Stu_Age
= 21 WHERE Stu_Id = 103 OR Stu_Id = 202;
• DELETE is a DML command which allows
SQL users to remove single or multiple existing
records from the database tables.
• DELETE FROM Table_Name WHERE condition;
• DELETE FROM Product WHERE Product_Id
= 'P202' ;
• DELETE FROM Student WHERE Stu_Marks
> 70 ;
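• The four DML commands can be run end to end with Python's sqlite3 module. This is a minimal sketch; the Student rows are made-up sample data, and only the column names Stu_Id, Stu_Name and Stu_Marks follow the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Student (Stu_Id INT, Stu_Name TEXT, Stu_Marks INT)")

# INSERT: add three rows.
conn.executemany(
    "INSERT INTO Student (Stu_Id, Stu_Name, Stu_Marks) VALUES (?, ?, ?)",
    [(101, "Ram", 65), (102, "Sita", 80), (103, "Mohan", 90)],
)

# UPDATE: change the marks of one student.
conn.execute("UPDATE Student SET Stu_Marks = 75 WHERE Stu_Id = 101")

# DELETE: remove every student with marks above 85.
conn.execute("DELETE FROM Student WHERE Stu_Marks > 85")

# SELECT: read back what is left.
rows = conn.execute(
    "SELECT Stu_Id, Stu_Marks FROM Student ORDER BY Stu_Id"
).fetchall()
print(rows)
```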
SQL Set Operation
• Types of Set Operation
• 1) Union, 2) Union All, 3) Intersect, 4) Minus
• 1. Union
• The SQL Union operation is used to combine the
results of two or more SQL SELECT queries.
• In the union operation, the number of columns
and their data types must be the same in both the queries on
which the UNION operation is being applied.
• The union operation eliminates the duplicate
rows from its result set.
• Syntax
• SELECT column_name FROM table1 UNION
SELECT column_name FROM table2;
• SELECT * FROM First UNION SELECT * FROM
Second;
• Union All is similar to the Union operation, but
it returns the result without removing duplicates and
without sorting the data.
• SELECT * FROM First UNION ALL SELECT *
FROM Second;
• Intersect: It is used to combine two SELECT
statements. The Intersect operation returns the
common rows from both the SELECT statements.
• In the Intersect operation, the number of columns
and their data types must be the same.
• The result has no duplicates. The rows may come back in
ascending order, but only an ORDER BY clause guarantees the order.
• SELECT * FROM First INTERSECT SELECT
* FROM Second;
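• Minus: The Minus operator is used to display the rows
which are present in the first query but absent in
the second query. (MINUS is Oracle's keyword; standard
SQL calls this operation EXCEPT.)
• The result has no duplicates. The rows may come back in
ascending order, but only an ORDER BY clause guarantees the order.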
• SELECT * FROM First MINUS SELECT *
FROM Second;
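• A sketch of the four set operations using Python's sqlite3 module. The tables First_Table and Second_Table stand in for First and Second, and their rows are made-up sample data. SQLite has no MINUS keyword, so the last query uses the standard EXCEPT instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE First_Table (id INT, name TEXT)")
conn.execute("CREATE TABLE Second_Table (id INT, name TEXT)")
conn.executemany(
    "INSERT INTO First_Table VALUES (?, ?)",
    [(1, "Jack"), (2, "Harry"), (3, "Jackson")],
)
conn.executemany(
    "INSERT INTO Second_Table VALUES (?, ?)",
    [(3, "Jackson"), (4, "Stephan"), (5, "David")],
)

def run(operator):
    # ORDER BY makes the row order explicit instead of relying on a default.
    sql = (
        f"SELECT * FROM First_Table {operator} "
        "SELECT * FROM Second_Table ORDER BY id"
    )
    return conn.execute(sql).fetchall()

union_rows = run("UNION")          # duplicates removed
union_all_rows = run("UNION ALL")  # duplicates kept
intersect_rows = run("INTERSECT")  # rows common to both
minus_rows = run("EXCEPT")         # rows only in the first table

print(len(union_rows), len(union_all_rows), intersect_rows, minus_rows)
```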
Null Values
• The term NULL in SQL is used to specify that a data
value does not exist in the database. It is not the same as
an empty string or a value of zero, and it signifies the
absence of a value or the unknown value of a data field.
• Some common reasons why a value may be NULL
• The value may not be provided during the data entry.
• The value is not yet known.
• CREATE TABLE CUSTOMERS( ID INT NOT NULL,
NAME VARCHAR (20) NOT NULL, AGE INT NOT
NULL, ADDRESS CHAR (25) , SALARY DECIMAL
(18, 2));
• NOT NULL signifies that the column must always be given
an explicit value of the given data type.
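• The behaviour of NULL and NOT NULL can be checked with Python's sqlite3 module. This is a sketch using the CUSTOMERS table above; the inserted names and values are sample data. It also shows that a comparison with = NULL matches nothing, so IS NULL must be used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE CUSTOMERS (ID INT NOT NULL, NAME VARCHAR(20) NOT NULL, "
    "AGE INT NOT NULL, ADDRESS CHAR(25), SALARY DECIMAL(18, 2))"
)

# ADDRESS and SALARY are not given, so they are stored as NULL.
conn.execute("INSERT INTO CUSTOMERS (ID, NAME, AGE) VALUES (1, 'Ramesh', 32)")

try:
    # AGE is declared NOT NULL, so leaving it out is rejected.
    conn.execute("INSERT INTO CUSTOMERS (ID, NAME) VALUES (2, 'Khilan')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

with_equals = conn.execute(
    "SELECT COUNT(*) FROM CUSTOMERS WHERE SALARY = NULL"
).fetchone()[0]
with_is_null = conn.execute(
    "SELECT COUNT(*) FROM CUSTOMERS WHERE SALARY IS NULL"
).fetchone()[0]

print(rejected, with_equals, with_is_null)
```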
Aggregate Functions
• An aggregate function performs a calculation on
multiple values and returns the result as a single value, like
the average of all values, the sum of all values, or the
maximum and minimum value among certain groups of
values. Aggregate functions are used with SELECT
statements. Syntax:
• function_name (DISTINCT | ALL expression)
• First, we specify the name of the aggregate function.
• Second, we use the DISTINCT modifier when we want to
calculate the result based on distinct values, or the
ALL modifier when we want to include all values, including
duplicates. The default is ALL.
• Third, we specify the expression, which involves
columns and arithmetic operators.
Function        Description
count()         It returns the number of rows in a group, including rows with NULL values.
sum()           It returns the total summed values in a set.
avg()           It returns the average value of an expression.
min()           It returns the minimum value in a set.
max()           It returns the maximum value in a set.
group_concat()  It returns a concatenated string.
first()         It returns the first value of an expression.
last()          It returns the last value of an expression.
• count()
This COUNT() function is used to count the number of
rows in a table or a result set. It can also be used with
a specific column to count the number of non-null
values in that column.
• Example of COUNT() Function
• SELECT COUNT(*) FROM TABLE;
• SELECT COUNT(*) FROM TABLE WHERE
RATE>=20;
• SELECT COUNT(DISTINCT COMPANY) FROM
TABLE;
• SELECT COMPANY, COUNT(*) FROM TABLE
GROUP BY COMPANY;
• sum()
• The SUM() function accepts a column name
as input and returns the total of all non-NULL
values in that column. It is intended for numeric fields
(i.e. columns that contain only numeric values); how
non-numeric values such as strings are treated depends on
the DBMS. If there are no non-NULL values to add,
standard SQL returns NULL rather than 0.
• SELECT SUM(COST) FROM TABLE;
• SELECT SUM(COST) FROM TABLE WHERE
QTY>3;
• SELECT SUM(COST) FROM TABLE WHERE QTY>3
GROUP BY COMPANY;
• avg()
• The AVG() aggregate function in DBMS takes the
column name as an input and returns the average of
all non-NULL values in that column. It only works on
numeric fields (i.e the columns contain only numeric
values).
• SELECT AVG(COST) FROM PREP_TABLE;
• min()
• The MIN() function accepts the column name as a
parameter and returns the minimum value in the
column. When no row is selected, the MIN() function
returns NULL as the result.
• SELECT MIN(RATE) FROM TABLE;
• max()
• The MAX() function accepts the column name
as a parameter and returns the maximum value in
the column. When no row is selected, the MAX()
function returns NULL.
• SELECT MAX(RATE) FROM TABLE;
• first()
• FIRST(): The FIRST() function returns the first
value of the selected column.
• SELECT FIRST(MARKS) FROM Students;
• SELECT FIRST(AGE) FROM Students;
• last()
• LAST(): The LAST() function returns the
last value of the selected column.
• SELECT LAST(MARKS) FROM Students;
• group_concat()
• The GROUP_CONCAT() function is used to
concatenate string from multiple rows into a
single string using various clauses. If the group
contains at least one non-null value, it always
returns a string value. Otherwise, you will get a
null value.
• SELECT emp_id, fname, lname, dept_id, GROUP_CONCAT(designation) FROM employee GROUP BY emp_id;
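• The aggregate functions can be compared side by side with Python's sqlite3 module. This is a sketch on a hypothetical product table with one NULL cost; the table and its rows are sample data. FIRST() and LAST() are left out because SQLite does not provide them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (company TEXT, rate INT, cost INT)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("Com1", 10, 100), ("Com1", 20, None), ("Com2", 30, 300)],
)

def scalar(sql):
    """Run a query that returns a single value."""
    return conn.execute(sql).fetchone()[0]

count_rows = scalar("SELECT COUNT(*) FROM product")     # counts every row
count_cost = scalar("SELECT COUNT(cost) FROM product")  # skips the NULL cost
total_cost = scalar("SELECT SUM(cost) FROM product")    # NULL is ignored
average_cost = scalar("SELECT AVG(cost) FROM product")  # average of non-NULL values
lowest_rate = scalar("SELECT MIN(rate) FROM product")
highest_rate = scalar("SELECT MAX(rate) FROM product")

# With no matching rows, SUM returns NULL (None in Python), not 0.
empty_sum = scalar("SELECT SUM(cost) FROM product WHERE rate > 100")

# GROUP_CONCAT joins the values of each group into one string.
rates_by_company = dict(
    conn.execute(
        "SELECT company, GROUP_CONCAT(rate) FROM product GROUP BY company"
    ).fetchall()
)

print(count_rows, count_cost, total_cost, average_cost)
print(lowest_rate, highest_rate, empty_sum, rates_by_company)
```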
Nested Subqueries
• A nested query is a query that has another query
embedded within it. The embedded query is called a
subquery.
• A subquery typically appears within the WHERE
clause of a query. It can sometimes appear in the
FROM clause or HAVING clause.
• SELECT AVG(noofstudents) FROM class WHERE teacherID
IN (SELECT id FROM teacher WHERE subject = 'science'
OR subject = 'maths');
• SELECT * FROM student WHERE classID =
(SELECT id FROM class WHERE noofstudents =
(SELECT MAX(noofstudents) FROM class));
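• Both nested queries can be run with Python's sqlite3 module. This is a sketch; the table layouts and rows are assumptions made up for the illustration, since the slides show only the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teacher (id INT, subject TEXT)")
conn.execute("CREATE TABLE class (id INT, noofstudents INT, teacherID INT)")
conn.execute("CREATE TABLE student (name TEXT, classID INT)")
conn.executemany(
    "INSERT INTO teacher VALUES (?, ?)",
    [(1, "science"), (2, "maths"), (3, "history")],
)
conn.executemany(
    "INSERT INTO class VALUES (?, ?, ?)",
    [(1, 30, 1), (2, 45, 2), (3, 25, 3)],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [("Anu", 1), ("Bala", 2), ("Chitra", 2)],
)

# The subquery picks the science and maths teachers; the outer query
# averages the size of their classes.
average_size = conn.execute(
    "SELECT AVG(noofstudents) FROM class WHERE teacherID IN "
    "(SELECT id FROM teacher WHERE subject = 'science' OR subject = 'maths')"
).fetchone()[0]

# Two levels of nesting: largest class size, then that class, then its students.
largest_class_students = conn.execute(
    "SELECT name FROM student WHERE classID = "
    "(SELECT id FROM class WHERE noofstudents = "
    "(SELECT MAX(noofstudents) FROM class)) ORDER BY name"
).fetchall()

print(average_size, largest_class_students)
```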
JOIN
• As the name shows, JOIN means to combine
something. A join in DBMS is a binary operation
which allows you to combine a Cartesian product and a
selection in one single statement. The goal of
a join condition is to help you
combine the data from two or more DBMS tables.
The tables in a DBMS are associated using
primary keys and foreign keys.
• Types of Join
• There are mainly two types of joins in DBMS:
• Inner Joins: Theta, Natural, EQUI
• Outer Join: Left, Right, Full
INNER JOIN: 1. Theta Join
• Theta Join allows you to merge two tables based on the
condition represented by theta. Theta joins work for all
comparison operators. It is denoted by symbol θ. The
general case of JOIN operation is called a Theta join.
• Syntax: A ⋈θ B

Table A               Table B
column 1  column 2    column 1  column 2
1         1           1         1
1         2           1         3

A ⋈ A.column 2 > B.column 2 (B)
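• A theta join is a Cartesian product followed by a selection, which a short Python sketch makes explicit. theta_join is an illustrative helper name; the condition is the one above, A.column 2 > B.column 2.

```python
A = [(1, 1), (1, 2)]
B = [(1, 1), (1, 3)]

def theta_join(left, right, condition):
    """Pair every left tuple with every right tuple, keeping pairs that pass."""
    return [l + r for l in left for r in right if condition(l, r)]

# Index 1 is "column 2" of each table.
result = theta_join(A, B, lambda a, b: a[1] > b[1])
print(result)
```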
