Module 1 Sudha


Database Management Systems

Module 1
Introduction to Database
Modelling and Relational
Algebra
Contents
•Introduction
•Simplified database system environment
•Typical DBMS Functionality
•Main Characteristics of the Database Approach
•Database Users
•Advantages of Using the Database Approach
•Data Models
•History of Data Models
•Three-Schema Architecture
•DBMS Languages
•DBMS Interfaces
•Database system environment

Basic Definitions
• Database: A collection of related data.
• Data: Known facts that can be recorded and have an implicit
meaning.
• Mini-world: Some part of the real world about which data is
stored in a database. For example, student grades and
transcripts at a university.
• Database Management System (DBMS): A software package/
system to facilitate the creation and maintenance of a
computerized database.
• Database System: The DBMS software together with the data
itself. Sometimes, the applications are also included.

What is a File system?
• A file system is a way of arranging files on a storage medium such as
a hard disk, pen drive, or DVD. It organizes the data and allows easy
retrieval of files when they are required. It mostly consists of files of
different types, such as mp3, mp4, txt, and doc, grouped into directories.
• A file system controls how data is read from and written to the storage
medium. It is installed on the computer as part of the operating
system, such as Windows or Linux.

KEY DIFFERENCES
• A file system is software that manages and organizes the files on a
storage medium, whereas a DBMS is a software application used for
creating, accessing, and managing databases.
• A file system has no crash recovery mechanism, whereas a DBMS
provides one.
• Data inconsistency is higher in a file system and lower in a
database management system.
• A file system does not support complicated transactions, while in a
DBMS it is easy to implement complicated transactions using SQL.
• A file system does not offer concurrency control, whereas a DBMS
provides a concurrency facility.
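
The transaction and recovery points above can be sketched with Python's built-in sqlite3 module standing in for a DBMS. The `account` table, the balances, and the simulated failure are assumptions made only for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in DBMS (illustrative assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

def transfer(conn, src, dst, amount, fail_midway=False):
    """Move `amount` between two accounts as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
            if fail_midway:
                raise RuntimeError("simulated crash between the two updates")
            conn.execute("UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
    except RuntimeError:
        pass  # the partial debit has already been rolled back

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM account"))

transfer(conn, 1, 2, 30, fail_midway=True)
after_failure = balances(conn)   # no money has left account 1
transfer(conn, 1, 2, 30)
after_success = balances(conn)   # both updates applied together
```

With plain files, the program itself would have to detect and undo the half-finished update.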

Simplified database system environment

Typical DBMS Functionality
• Defining a database in terms of its data types, structures, and
constraints
• Constructing or loading the database on a secondary storage medium
• Manipulating the database: querying, generating reports,
insertions, deletions, and modifications to its content
• Concurrent processing and sharing by a set of users and
programs, while keeping all data valid and consistent
Other features:
– Protection or Security measures to prevent
unauthorized access
– “Active” processing to take internal actions on data
– Presentation and Visualization of data
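
A minimal sketch of the define, construct, and manipulate steps, using Python's sqlite3 module as the DBMS. The `student` table, its columns, and the rows are hypothetical and chosen only to make the three steps visible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure, and constraints.
conn.execute("""
    CREATE TABLE student (
        id            INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        year_of_study INTEGER CHECK (year_of_study BETWEEN 1 AND 5)
    )
""")

# Construct/load: populate the database.
conn.executemany("INSERT INTO student VALUES (?, ?, ?)",
                 [(17, "Smith", 1), (8, "Brown", 2)])

# Manipulate: query, then modify.
names = [row[0] for row in conn.execute("SELECT name FROM student ORDER BY id")]
conn.execute("UPDATE student SET year_of_study = 2 WHERE id = 17")
smith_year = conn.execute("SELECT year_of_study FROM student WHERE id = 17").fetchone()[0]

# Constraints keep the data valid: this insert violates the CHECK and is rejected.
try:
    conn.execute("INSERT INTO student VALUES (99, 'Nobody', 9)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```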

Example of a Database (with a Conceptual Data Model)
• Mini-world for the example: Part of a UNIVERSITY
environment.
• Some mini-world entities:
– STUDENTs
– COURSEs
– SECTIONs (of COURSEs)
– (academic) DEPARTMENTs
– INSTRUCTORs
Note: The above could be expressed in the ENTITY-RELATIONSHIP
data model.

Example of a Database (with a Conceptual Data Model)
• Some mini-world relationships:
– SECTIONs are of specific COURSEs
– STUDENTs take SECTIONs
– COURSEs have prerequisite COURSEs
– INSTRUCTORs teach SECTIONs
– COURSEs are offered by DEPARTMENTs
– STUDENTs major in DEPARTMENTs
Note: The above could be expressed in the ENTITY-RELATIONSHIP
data model.

Main Characteristics of the Database Approach

• Self-describing nature of a database system: A DBMS catalog stores
the description of the database. The description is called meta-data.
This allows the DBMS software to work with different databases.
• Insulation between programs and data: Called program-data
independence. Allows changing data storage structures and
operations without having to change the DBMS access programs.
• Data Abstraction: A data model is used to hide storage details and
present the users with a conceptual view of the database.
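
The self-describing property can be observed directly in SQLite, where the catalog is an ordinary queryable table named `sqlite_master`. The `course` table below is a hypothetical example; other DBMSs expose their catalogs under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (cid INTEGER PRIMARY KEY, cname TEXT, credits INTEGER)")

# The catalog is itself queryable: sqlite_master lists every schema object.
catalog = conn.execute("SELECT type, name FROM sqlite_master").fetchall()

# PRAGMA table_info returns one row per column:
# (position, name, declared type, notnull flag, default value, pk flag).
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(course)")]
```

A generic tool can read `catalog` and `columns` to work with any database without knowing its structure in advance.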

Main Characteristics of the Database Approach

• Support of multiple views of the data: Each user may see a
different view of the database, which describes only
the data of interest to that user.
• Sharing of data and multiuser transaction processing:
allowing a set of concurrent users to retrieve and to
update the database. Concurrency control within the
DBMS guarantees that each transaction is correctly
executed or completely aborted. OLTP (Online Transaction
Processing) is a major part of database applications.

Types of Databases and Database Applications
• Numeric and Textual Databases
• Multimedia Databases
• Geographic Information Systems (GIS)
• Data Warehouses
• Real-time and Active Databases

Database Users
Users may be divided into those who actually use and control the content
(called “Actors on the Scene”) and those who enable the database to be
developed and the DBMS software to be designed and implemented (called
“Workers Behind the Scene”).
Actors on the scene
– Database administrators: responsible for authorizing access to the
database, coordinating and monitoring its use, acquiring software and
hardware resources, controlling its use, and monitoring efficiency of
operations.
– Database designers: responsible for defining the content, the structure, the
constraints, and functions or transactions against the database. They must
communicate with the end-users and understand their needs.
– End-users: they use the data for queries and reports, and some of them
actually update the database content.

Categories of End-users
• Casual : access database occasionally when needed
• Naïve or Parametric : they make up a large section of the end-user
population. They use previously well-defined functions in the form of
“canned transactions” against the database. Examples are bank-tellers or
reservation clerks who do this activity for an entire shift of operations.
• Sophisticated : these include business analysts, scientists, engineers,
others thoroughly familiar with the system capabilities. Many use tools in
the form of software packages that work closely with the stored
database.
• Stand-alone : mostly maintain personal databases using ready-to-use
packaged applications. An example is a tax program user that creates his
or her own internal database.

Workers behind the Scene
• DBMS system designers and implementers design and
implement the DBMS modules and interfaces as a software
package.
• Tool developers design and implement tools—the software
packages that facilitate database modeling and design,
database system design, and improved performance.
• Operators and maintenance personnel (system
administration personnel) are responsible for the actual
running and maintenance of the hardware and software
environment for the database system.

Advantages of Using the Database Approach
• Controlling redundancy in data storage and in development and
maintenance efforts.
• Sharing of data among multiple users.
• Restricting unauthorized access to data.
• Providing persistent storage for program objects.
• Providing storage structures for efficient query processing.
• Providing backup and recovery services.
• Providing multiple interfaces to different classes of users.
• Representing complex relationships among data.
• Enforcing integrity constraints on the database.
• Drawing Inferences and Actions using rules

Additional Implications of Using the Database Approach
• Potential for enforcing standards: this is crucial for the success of
database applications in large organizations. Standards refer to data item
names, display formats, screens, report structures, meta-data
(description of data), etc.
• Reduced application development time: incremental time to add each
new application is reduced.
• Flexibility to change data structures: database structure may evolve as
new requirements are defined.
• Availability of up-to-date information – very important for on-line
transaction systems such as airline, hotel, car reservations.
• Economies of scale: by consolidating data and applications across
departments, wasteful overlap of resources and personnel can be
avoided.

Extending Database Capabilities
• New functionality is being added to DBMSs in the
following areas:
– Scientific Applications
– Image Storage and Management
– Audio and Video data management
– Data Mining
– Spatial data management
– Time Series and Historical Data
Management

When not to use a DBMS
• Main inhibitors (costs) of using a DBMS:
– High initial investment and possible need for additional hardware.
– Overhead for providing generality, security, concurrency control, recovery, and
integrity functions.
• When a DBMS may be unnecessary:
– If the database and applications are simple, well defined, and not expected to change
– If there are stringent real-time requirements that may not be met because of DBMS
overhead.
– If access to data by multiple users is not required.
• When no DBMS may suffice:
– If the database system is not able to handle the complexity of data because of modeling
limitations
– If the database users need special operations not supported by the DBMS.

Data Models
• Data Model: A set of concepts to describe the structure of a
database, and certain constraints that the database should
obey.
• Data Model Operations: Operations for specifying database
retrievals and updates by referring to the concepts of the data
model. Operations on the data model may include basic
operations and user-defined operations.

Categories of data models
• Conceptual (high-level, semantic) data models: Provide
concepts that are close to the way many users perceive
data. (Also called entity-based or object-based data
models.)
• Physical (low-level, internal) data models: Provide
concepts that describe details of how data is stored in the
computer.
• Implementation (representational) data models: Provide
concepts that fall between the above two, balancing user
views with some computer storage details.

History of Data Models
Relational Model:
• proposed in 1970 by E.F. Codd
(IBM), first commercial system in
1981-82. Now in several
commercial products (DB2, ORACLE,
SQL Server, SYBASE, INFORMIX).
• The relational model represents the
database as a collection of
relations. A relation is nothing but a
table of values. Every row in the
table represents a collection of
related data values. These rows in
the table denote a real-world entity
or relationship.

Network model
• Is a database model that is designed as
a flexible approach to representing
objects and their relationships. A
unique feature of the network model is
its schema, which is viewed as a graph
where relationship types are arcs and
object types are nodes.
• The first network DBMS was implemented by
Honeywell in 1964-65 (IDS System). The
model was adopted heavily due to the support
by CODASYL (CODASYL - DBTG report of
1971). It was later implemented in a large
variety of systems: IDMS (Cullinet,
now CA), DMS 1100 (Unisys), IMAGE
(H.P.), VAX-DBMS (Digital Equipment
Corp.).

Network Model
• ADVANTAGES:
• Network Model is able to model complex relationships and represents
semantics of add/delete on the relationships.
• Can handle most situations for modeling using record types and
relationship types.
• Language is navigational; uses constructs like FIND, FIND member, FIND
owner, FIND NEXT within set, GET etc. Programmers can do optimal
navigation through the database.
• DISADVANTAGES:
• Navigational and procedural nature of processing
• Database contains a complex array of pointers that thread through a set of
records. Little scope for automated "query optimization".

Hierarchical database model
• is a data model in which the data are
organized into a tree-like structure.
The data are stored as records which
are connected to one another
through links. A record is a collection of
fields, with each field containing only
one value. The type of a record defines
which fields the record contains.
• implemented in a joint effort by IBM
and North American Rockwell
around 1965. Resulted in the IMS
family of systems. The most popular
model.
• The hierarchical database model
mandates that each child record has
only one parent, whereas each parent
record can have one or more child
records. In order to retrieve data from
a hierarchical database the whole tree
needs to be traversed starting from the
root node.
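
The single-parent rule and root-first retrieval described above can be sketched in a few lines. The `Record` class, the `find` helper, and the sample DEPARTMENT/EMPLOYEE/DEPENDENT data are hypothetical illustrations, not the API of IMS or any real hierarchical DBMS.

```python
from dataclasses import dataclass, field

@dataclass
class Record:
    """A record in a hierarchy: named fields plus child records (one parent each)."""
    record_type: str
    fields: dict
    children: list = field(default_factory=list)

def find(root, record_type, **criteria):
    """Depth-first search from the root; returns the path to the first match or None."""
    stack = [(root, [root])]
    while stack:
        node, path = stack.pop()
        if node.record_type == record_type and all(
                node.fields.get(k) == v for k, v in criteria.items()):
            return path
        # Push children in reverse so the leftmost child is visited first.
        for child in reversed(node.children):
            stack.append((child, path + [child]))
    return None

root = Record("DEPARTMENT", {"name": "Research"}, [
    Record("EMPLOYEE", {"name": "Smith"}, [Record("DEPENDENT", {"name": "Alice"})]),
    Record("EMPLOYEE", {"name": "Wong"}),
])

path = find(root, "DEPENDENT", name="Alice")
path_types = [r.record_type for r in path]
missing = find(root, "EMPLOYEE", name="Nobody")
```

Every lookup has to know and follow the hierarchical path, which is the navigational cost listed among the disadvantages of this model.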

Hierarchical Model
• ADVANTAGES:
• It promotes data sharing.
• Parent/child relationship promotes conceptual simplicity.
• Database security is provided and enforced by DBMS.
• Parent/child relationship promotes data integrity.
• It is efficient with 1:M relationships.
DISADVANTAGES:
• Complex implementation requires knowledge of physical data storage characteristics.
• Navigational system yields complex application development, management, and use;
requires knowledge of hierarchical path.
• Changes in structure require changes in all application programs.
• There are implementation limitations (no multiparent or M:N relationships).
• There is no data definition or data manipulation language in the DBMS.
• There is a lack of standards.

Object Oriented (OO) Data Model
• Increasingly complex real-world
problems demonstrated a need
for a data model that more
closely represented the real
world. In the object
oriented data model (OODM),
both data and their
relationships are contained in a
single structure known as
an object.

Object relational model
• is a combination of an object-oriented database model and a relational
database model. It supports objects, classes, inheritance, etc., just
like object-oriented models, and has support for data types, tabular
structures, etc., like the relational data model.
• One of the major goals of the object-relational data model is to close the
gap between relational databases and the object-oriented practices
frequently used in many programming languages such as C++, C#, and
Java.
• Both relational and object-oriented data models are very
useful, but each was felt to be lacking in some
characteristics, so work was started on a model that
combines the two. The object-relational data model was
created as a result of research carried out in the 1990s.

Schemas versus Instances
• Database Schema: The description of a database. It includes descriptions of the
database structure and the constraints that should hold on the database. A schema is
the skeleton structure that represents the logical view of the entire database: it
defines how the data is organized, how the relations among the data are associated,
and which constraints are to be applied on the data.
• Schema Diagram: A diagrammatic display of (some aspects of) a database schema.
• Schema Construct: A component of the schema or an object within the schema,
e.g., STUDENT, COURSE.
• Database Instance: The actual data stored in a database at a particular moment in
time. Also called database state (or occurrence).

Database Schema Vs. Database State
• Database State: Refers to the content of a database at a moment in time.
• Initial Database State: Refers to the database when it is loaded
• Valid State: A state that satisfies the structure and constraints of the
database.
• Distinction
• The database schema changes very infrequently. The database state
changes every time the database is updated.
• Schema is also called intension, whereas state is called extension.
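
The distinction can be made concrete with a small sketch using Python's sqlite3 module: the stored definition (intension) stays the same while the rows (extension) change with every update. The `student` table and the helper names `schema` and `state` are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

def schema(conn):
    """Intension: the stored table definitions."""
    return [row[0] for row in conn.execute(
        "SELECT sql FROM sqlite_master WHERE type = 'table'")]

def state(conn):
    """Extension: the rows present right now."""
    return conn.execute("SELECT id, name FROM student ORDER BY id").fetchall()

schema_before, initial_state = schema(conn), state(conn)   # empty state after definition
conn.execute("INSERT INTO student VALUES (1, 'Smith')")
schema_after, later_state = schema(conn), state(conn)      # new state, same schema
```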

Example of a Database Schema

Example of a database state

Three-Schema Architecture
• Proposed to support DBMS characteristics of:

• Program-data independence.
Data independence is the property of a DBMS that
lets you change the database schema at
one level of a database system without
requiring a change to the schema at the next
higher level.

• Support of multiple views of the data.

Three-Schema Architecture

Ex:
Type of Schema       Implementation

External Schema      View 1: Course_info(cid: int, cname: string)
                     View 2: Student_info(id: int, name: string)

Conceptual Schema    Students(id: int, name: string, login: string, age: integer)
                     Courses(id: int, cname: string, credits: integer)
                     Enrolled(id: int, grade: string)

Physical Schema      • Relations stored as unordered files.
                     • Index on the first column of Students.
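
The three levels in the example above can be sketched in SQLite through Python's sqlite3 module. The view names `course_info` and `student_info`, the index name, and the sample rows are assumptions; SQLite also chooses its own file layout, so the index stands in for the physical level only loosely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Conceptual schema: the base relations.
    CREATE TABLE students (id INTEGER, name TEXT, login TEXT, age INTEGER);
    CREATE TABLE courses  (id INTEGER, cname TEXT, credits INTEGER);
    CREATE TABLE enrolled (id INTEGER, grade TEXT);

    -- Physical schema: an access path on the first column of students.
    CREATE INDEX idx_students_id ON students (id);

    -- External schema: views exposing only what each user group needs.
    CREATE VIEW course_info  AS SELECT id AS cid, cname FROM courses;
    CREATE VIEW student_info AS SELECT id, name FROM students;

    INSERT INTO students VALUES (1, 'Smith', 'smith@cs', 19);
    INSERT INTO courses  VALUES (10, 'Databases', 4);
""")

# Users of the external schema never see login, age, or credits.
student_view = conn.execute("SELECT * FROM student_info").fetchall()
course_view = conn.execute("SELECT * FROM course_info").fetchall()
```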

Three-Schema Architecture
• Defines DBMS schemas at three levels:
• Internal schema at the internal level to describe
physical storage structures and access paths. Typically
uses a physical data model.
• Conceptual schema at the conceptual level to describe
the structure and constraints for the whole database for
a community of users. Uses a conceptual or an
implementation data model.
• External schemas at the external level to describe the
various user views. Usually uses the same data model as
the conceptual level.

Three-Schema Architecture
Mappings among schema levels are needed to transform requests
and data. Programs refer to an external schema, and are mapped
by the DBMS to the internal schema for execution.

Data Independence:
• Logical Data Independence: The capacity to change the
conceptual schema without having to change the external
schemas and their application programs.
• Physical Data Independence: The capacity to change the internal
schema without having to change the conceptual schema.

Data Independence

When a schema at a lower level is changed, only the mappings
between this schema and higher-level schemas need to be
changed in a DBMS that fully supports data independence.
The higher-level schemas themselves are unchanged. Hence,
the application programs need not be changed since they
refer to the external schemas.
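
A small sketch of both kinds of independence, again with SQLite via Python's sqlite3 module. The table, view, index, and the `gpa` column are hypothetical; `QUERY` plays the role of an application program written against the external schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER, name TEXT, login TEXT, age INTEGER);
    INSERT INTO students VALUES (1, 'Smith', 'smith@cs', 19), (2, 'Brown', 'brown@cs', 20);
    CREATE VIEW student_info AS SELECT id, name FROM students;
""")

QUERY = "SELECT name FROM student_info WHERE id = 2"   # the "application program"
before = conn.execute(QUERY).fetchall()

# Physical change: add an access path. No schema above it is touched.
conn.execute("CREATE INDEX idx_students_id ON students (id)")
after_physical = conn.execute(QUERY).fetchall()

# Conceptual change: add a column. The view still presents the same external schema.
conn.execute("ALTER TABLE students ADD COLUMN gpa REAL")
after_logical = conn.execute(QUERY).fetchall()
```

The query text never changes; only the lower levels do.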

DBMS Languages
• Data Definition Language (DDL): Used by the DBA and database
designers to specify the conceptual schema of a database. In many
DBMSs, the DDL is also used to define internal and external schemas
(views). In some DBMSs, separate storage definition language (SDL)
and view definition language (VDL) are used to define internal and
external schemas.
• Data Manipulation Language (DML): Used to specify database
retrievals and updates.
• DML commands (data sublanguage) can be embedded in
a general-purpose programming language (host
language), such as COBOL, C or an Assembly Language.
• Alternatively, stand-alone DML commands can be
applied directly (query language).

DBMS Languages

DBMS Interfaces
• Stand-alone query language interfaces.
• Programmer interfaces for embedding DML in programming languages:
• Pre-compiler Approach
• Procedure (Subroutine) Call Approach
• User-friendly interfaces:
• Menu-based, popular for browsing on the web
• Forms-based, designed for naïve users
• Graphics-based (Point and Click, Drag and Drop etc.)
• Natural language: requests in written English
• Combinations of the above

Other DBMS Interfaces

• Speech Input and Output
• Web Browser as an interface
• Parametric interfaces (e.g., bank tellers) using
function keys.
• Interfaces for the DBA:
• Creating accounts, granting authorizations
• Setting system parameters
• Changing schemas or access paths

Database System Utilities
• To perform certain functions such as:
• Loading data stored in files into a database.
Includes data conversion tools.
• Backing up the database periodically on tape.
• Reorganizing database file structures.
• Report generation utilities.
• Performance monitoring utilities.
• Other functions, such as sorting, user monitoring,
data compression, etc.

Database system environment

1) DBMS Component Modules
2) Centralized and Client-Server Architectures

Typical DBMS Component Modules

Centralized and Client-Server Architectures

• Centralized DBMS: combines everything into a single system,
including DBMS software, hardware, application programs,
and user interface processing software.

•Basic Client-Server Architectures:
• Specialized Servers with Specialized functions
• Clients
• DBMS Server

Specialized Servers with Specialized functions:
• File Servers
• Printer Servers
• Web Servers
• E-mail Servers

Clients:
•Provide appropriate interfaces and a client-version of the system to
access and utilize the server resources.
•Clients may be diskless machines, or PCs or workstations with disks
that have only the client software installed.
•Connected to the servers via some form of a network
(LAN: local area network, wireless network, etc.)
DBMS Server
• Provides database query and transaction services to the
clients
• Sometimes called query and transaction servers

Two Tier Client-Server Architecture
•User Interface Programs and Application Programs run on the
client side
•An interface called ODBC (Open Database Connectivity; see Ch 9)
provides an application program interface (API) that allows client-side
programs to call the DBMS. Most DBMS vendors provide ODBC
drivers.
• A client program may connect to several DBMSs.
• Other variations of clients are possible: e.g., in some DBMSs, more
functionality is transferred to clients including data dictionary
functions, optimization and recovery across multiple servers, etc.
In such situations the server may be called the Data Server.

Three Tier Client-Server Architecture
• Common for Web applications
• Intermediate Layer called Application Server or Web Server:
• stores the web connectivity software and the rules and
business logic (constraints) part of the application used to
access the right amount of data from the database server
• acts like a conduit for sending partially processed data
between the database server and the client.
• Additional Features- Security:
• encrypt the data at the server before transmission
• decrypt data at the client

Classification of DBMSs
• Based on the data model used:
• Traditional: Relational, Network, Hierarchical.
• Emerging: Object-oriented, Object-relational.
• Other classifications:
• Single-user (typically used with microcomputers) vs. multi-
user (most DBMSs).
• Centralized (uses a single computer with one database) vs.
distributed (uses multiple computers, multiple databases)
– Distributed Database Systems have now come to be known
as client server based database systems because they do not
support a totally distributed environment, but rather a set of
database servers supporting a set of clients.

Data Modelling using Entities
and Relationships

ER Diagram
• The ER (Entity-Relationship) model is a high-level conceptual data model.
It is based on the notion of real-world entities and the relationships
between them.
• Helps you to define terms related to entity-relationship modeling
• Provides a preview of how all your tables should connect and what fields are
going to be on each table
• Helps to describe entities, attributes, and relationships
• ER diagrams are translatable into relational tables, which allows you to build
databases quickly
• ER diagrams can be used by database designers as a blueprint for implementing
data in specific software applications
• The database designer gains a better understanding of the information to be
contained in the database with the help of the ER diagram
• An ERD allows you to communicate the logical structure of the database to
users

Components of the ER Diagram
This model is based on three basic concepts:
• Entities
• Attributes
• Relationships

Example COMPANY Database
We need to create a database schema design based on the following (simplified)
Requirements of the COMPANY Database:
•The company is organized into DEPARTMENTs. Each department has a name,
number and an employee who manages the department. We keep track of the
start date of the department manager. A department may have several
locations.
• Each department controls a number of PROJECTs. Each project has a unique
name, unique number and is located at a single location.
•We store each EMPLOYEE’s social security number, address, salary, sex, and
birthdate.
– Each employee works for one department but may work on several projects.
– We keep track of the number of hours per week that an employee currently works
on each project.
– We also keep track of the direct supervisor of each employee.

• Each employee may have a number of DEPENDENTs.
– For each dependent, we keep track of their name, sex,
birthdate, and relationship to the employee.
Entities and Attributes
• Entities are specific objects or things in the mini-world that are represented in the
database.
– For example the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX PROJECT
• Attributes are properties used to describe an entity.
– For example an EMPLOYEE entity may have the attributes Name, SSN, Address, Sex, BirthDate
• A specific entity will have a value for each of its attributes.
– For example a specific employee entity may have Name='John Smith', SSN='123456789',
Address='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55'
• Each attribute has a value set (or data type) associated with it – e.g. integer, string,
subrange, enumerated type, …
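
One way to picture these terms is a Python dataclass: the class plays the role of the attribute list, an instance is one entity, and the type annotations are the value sets. The class and field names below are assumptions; the values are the ones given in the example above.

```python
from dataclasses import dataclass, fields
from enum import Enum

class Sex(Enum):
    """An enumerated value set."""
    M = "M"
    F = "F"

@dataclass(frozen=True)
class Employee:
    """The attributes that every EMPLOYEE entity has, with their value sets."""
    name: str
    ssn: str
    address: str
    sex: Sex
    birth_date: str

# One specific entity: a value for each attribute.
john = Employee("John Smith", "123456789", "731, Fondren, Houston, TX", Sex.M, "09-JAN-55")
attribute_names = [f.name for f in fields(Employee)]
```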

Types of Attributes

Types of Attributes Examples
• Simple
– Each entity has a single atomic value for the attribute. For example,
SSN or Sex.
• Composite
– The attribute may be composed of several components. For
example, Address (Apt#, House#, Street, City, State, ZipCode,
Country) or Name (FirstName, MiddleName, LastName).
Composition may form a hierarchy where some components are
themselves composite.
• Multi-valued
– An entity may have multiple values for that attribute. For example,
Color of a CAR or PreviousDegrees of a STUDENT. Denoted as {Color}
or {PreviousDegrees}.

Types of Attributes (2)

• In general, composite and multi-valued attributes may be
nested arbitrarily to any number of levels, although this is
rare.
• For example, PreviousDegrees of a STUDENT is a composite
multi-valued attribute denoted by {PreviousDegrees
(College, Year, Degree, Field)}.
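
The nesting can be sketched with Python dataclasses: a composite attribute becomes a nested record and a multi-valued attribute becomes a list. All class names and sample values here are hypothetical; `field_of_study` stands for the Field component and is renamed only to avoid clashing with `dataclasses.field`.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Name:
    """Composite attribute: Name(FirstName, MiddleName, LastName)."""
    first: str
    middle: str
    last: str

@dataclass(frozen=True)
class Degree:
    """One value of the composite multi-valued attribute PreviousDegrees."""
    college: str
    year: int
    degree: str
    field_of_study: str

@dataclass
class Student:
    ssn: str                                               # simple attribute
    name: Name                                             # composite attribute
    previous_degrees: list = field(default_factory=list)   # {PreviousDegrees(...)}

s = Student("123456789", Name("John", "B", "Smith"))
s.previous_degrees.append(Degree("Rice", 1985, "BS", "Mathematics"))
s.previous_degrees.append(Degree("Rice", 1987, "MS", "Computer Science"))
degree_years = [d.year for d in s.previous_degrees]
```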

Entity Types and Key Attributes
• Entities with the same basic attributes are grouped or typed into an
entity type.
– For example, the EMPLOYEE entity type or the PROJECT entity type.
• An attribute of an entity type for which each entity must have a unique
value is called a key attribute of the entity type.
– For example, SSN of EMPLOYEE.
• A key attribute may be composite.
– For example, VehicleTagNumber is a key of the CAR entity type with
components (Number, State).
• An entity type may have more than one key.
– For example, the CAR entity type may have two keys:
– VehicleIdentificationNumber (popularly called VIN) and
– VehicleTagNumber (Number, State), also known as license_plate number.
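
A short sketch of the uniqueness test behind a key, applied to a hypothetical CAR entity set (the attribute names and values are invented). A key is a constraint on every possible state, so checking one entity set can only refute a candidate key, never prove it.

```python
def is_key(entity_set, attributes):
    """True if no two entities in this set agree on all the given attributes."""
    seen = set()
    for entity in entity_set:
        value = tuple(entity[a] for a in attributes)
        if value in seen:
            return False
        seen.add(value)
    return True

cars = [
    {"vin": "VIN-001", "tag_number": "ABC 123", "tag_state": "TX", "make": "Ford"},
    {"vin": "VIN-002", "tag_number": "ABC 123", "tag_state": "NY", "make": "Ford"},
    {"vin": "VIN-003", "tag_number": "XYZ 789", "tag_state": "TX", "make": "Nissan"},
]

vin_unique = is_key(cars, ["vin"])                         # simple key
tag_unique = is_key(cars, ["tag_number", "tag_state"])     # composite key
number_alone = is_key(cars, ["tag_number"])                # one component is not enough
make_unique = is_key(cars, ["make"])
```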

Entity Type CAR with two keys and a corresponding Entity Set

Entity Set
• Each entity type will have a collection of
entities stored in the database
– Called the entity set
• Previous slide shows three CAR entity
instances in the entity set for CAR
• Same name (CAR) used to refer to both the
entity type and the entity set
• Entity set is the current state of the entities of
that type that are stored in the database
Initial Design of Entity Types for the COMPANY Database
Schema
• Based on the requirements, we can identify four initial entity types
in the COMPANY database:
– DEPARTMENT
– PROJECT
– EMPLOYEE
– DEPENDENT
• Their initial design is shown on the following slide
• The initial attributes shown are derived from the requirements
description

Initial Design of Entity Types:
EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT

Refining the initial design by introducing relationships
• ER model has three main concepts:
– Entities (and their entity types and entity sets)
– Attributes (simple, composite, multivalued)
– Relationships (and their relationship types and relationship sets)
• A relationship relates two or more distinct entities
with a specific meaning.
– For example, EMPLOYEE John Smith works on the ProductX PROJECT, or EMPLOYEE
Franklin Wong manages the Research DEPARTMENT.
• Relationships of the same type are grouped or typed
into a relationship type.
– For example, the WORKS_ON relationship type in which EMPLOYEEs and PROJECTs
participate, or the MANAGES relationship type in which EMPLOYEEs and DEPARTMENTs
participate.

Relationship instances of the WORKS_FOR N:1 relationship between
EMPLOYEE and DEPARTMENT

Relationship instances of the M:N WORKS_ON
relationship between EMPLOYEE and PROJECT

• Relationship Type:
– Is the schema description of a relationship
– Identifies the relationship name and the participating entity
types
– Also identifies certain relationship constraints
• Relationship Set:
– The current set of relationship instances represented in the
database
– The current state of a relationship type

• In ER diagrams, we represent the relationship type as follows:
– Diamond-shaped box is used to display a relationship type
– Connected to the participating entity types via straight lines

Refining the COMPANY database schema by introducing
relationships
• By examining the requirements, six relationship types are identified
• All are binary relationships (degree 2)
• Listed below with their participating entity types:
– WORKS_FOR (between EMPLOYEE, DEPARTMENT)
– MANAGES (also between EMPLOYEE, DEPARTMENT)
– CONTROLS (between DEPARTMENT, PROJECT)
– WORKS_ON (between EMPLOYEE, PROJECT)
– SUPERVISION (between EMPLOYEE (as subordinate),
EMPLOYEE (as supervisor))
– DEPENDENTS_OF (between EMPLOYEE, DEPENDENT)

ER DIAGRAM

Recursive Relationship Type
• A relationship type in which the same entity type
participates more than once, in distinct roles
– Example: the SUPERVISION relationship
• EMPLOYEE participates twice in two distinct roles:
– supervisor (or boss) role
– supervisee (or subordinate) role
• Each relationship instance relates two distinct
EMPLOYEE entities:
– One employee in supervisor role
– One employee in supervisee role

Weak Entity Types
• An entity type that does not have a key attribute of its own
• A weak entity must participate in an identifying relationship type
with an owner or identifying entity type
• Entities are identified by the combination of:
– A partial key of the weak entity type
– The particular entity they are related to in the identifying entity type
• Example:
– A DEPENDENT entity is identified by the dependent’s first name, and the specific
EMPLOYEE with whom the dependent is related
– Name of DEPENDENT is the partial key
– DEPENDENT is a weak entity type
– EMPLOYEE is its identifying entity type via the identifying relationship type
DEPENDENTS_OF
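
In a relational implementation, a weak entity type is commonly mapped to a table whose primary key combines the owner's key with the partial key. The sketch below shows this in SQLite through Python's sqlite3 module; the table and column names and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE dependent (
        essn TEXT NOT NULL REFERENCES employee (ssn) ON DELETE CASCADE,
        name TEXT NOT NULL,                 -- partial key
        relationship TEXT,
        PRIMARY KEY (essn, name)            -- owner key + partial key
    );
    INSERT INTO employee VALUES ('123456789', 'John Smith'), ('333445555', 'Franklin Wong');
    INSERT INTO dependent VALUES ('123456789', 'Alice', 'Daughter');
    INSERT INTO dependent VALUES ('333445555', 'Alice', 'Daughter');
""")

# The same dependent name under two owners is fine; twice under one owner is not.
try:
    conn.execute("INSERT INTO dependent VALUES ('123456789', 'Alice', 'Niece')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Existence dependency: removing the owner removes its dependents.
conn.execute("DELETE FROM employee WHERE ssn = '123456789'")
remaining = conn.execute("SELECT essn, name FROM dependent").fetchall()
```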

Constraints on Relationships
• Constraints on Relationship Types (also known as ratio constraints)
• Cardinality Ratio (specifies maximum participation)
– One-to-one (1:1)
– One-to-many (1:N) or Many-to-one (N:1)
– Many-to-many (M:N)
• Existence Dependency Constraint (specifies minimum
participation; also called participation constraint)
– zero (optional participation, not existence-dependent)
– one or more (mandatory participation, existence-dependent)

Many-to-one (N:1) RELATIONSHIP

Many-to-many (M:N) RELATIONSHIP

A RECURSIVE RELATIONSHIP SUPERVISION

© The Benjamin/Cummings Publishing Company, Inc. 1994, Elmasri/Navathe, Fundamentals of Database Systems, Second Edition

Alternative (min, max) notation for relationship structural constraints:
• Specified on each participation of an entity type E in a relationship type R
• Specifies that each entity e in E participates in at least min and at most max relationship
instances in R
• Default (no constraint): min=0, max=n (signifying no limit)
• Must have min ≤ max, min ≥ 0, max ≥ 1
• Derived from the knowledge of mini-world constraints
• Examples
– A department has exactly one manager and an employee can manage at most one
department.
• Specify (0,1) for participation of EMPLOYEE in MANAGES
• Specify (1,1) for participation of DEPARTMENT in MANAGES
– An employee can work for exactly one department but a department can have any
number of employees.
• Specify (1,1) for participation of EMPLOYEE in WORKS_FOR
• Specify (0,n) for participation of DEPARTMENT in WORKS_FOR
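
The (min, max) rule can be checked mechanically against a set of relationship instances. The sketch below is illustrative only: the function name `violations`, the sample identifiers, and the use of `None` for the unbounded max (the slide's n) are assumptions.

```python
from collections import Counter

def violations(entities, instances, position, min_count, max_count=None):
    """Return the entities whose participation count breaks (min, max).

    instances is a list of tuples (relationship instances); position selects
    the component of each tuple that belongs to the entity type being checked.
    max_count=None stands for 'n' (no upper limit).
    """
    counts = Counter(inst[position] for inst in instances)
    bad = []
    for e in entities:
        c = counts[e]  # a Counter yields 0 for entities that never participate
        if c < min_count or (max_count is not None and c > max_count):
            bad.append(e)
    return bad

# Hypothetical mini-world: WORKS_FOR instances are (employee, department) pairs.
employees = ["e1", "e2", "e3"]
departments = ["d1", "d2"]
works_for = [("e1", "d1"), ("e2", "d1")]

# EMPLOYEE participates with (1,1): e3 works for no department, so it violates.
employee_violations = violations(employees, works_for, 0, 1, 1)
# DEPARTMENT participates with (0,n): no count can violate this.
department_violations = violations(departments, works_for, 1, 0, None)
```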

78
The (min,max) notation for relationship constraints

79
COMPANY ER Schema Diagram using (min, max) notation

80
Summary of notation for ER diagrams

81
82
83
ER DIAGRAM FOR A BANK
DATABASE

Chapter 3-84
Relationships of Higher Degree
• Relationship types of degree 2 are called binary
• Relationship types of degree 3 are called ternary, and those of degree n are called n-ary
• In general, an n-ary relationship is not equivalent to n binary relationships
• Higher-order relationships are discussed further in Chapter 4

Chapter 3-85
Example of a ternary relationship

86
Some of the Currently Available Automated Database
Design Tools

87
Query Languages
• Language in which user requests information from the
database.
• Categories of languages
• Procedural
• Non-procedural, or declarative
• “Pure” languages:
• Relational algebra
• Tuple relational calculus
• Domain relational calculus
• Pure languages form underlying basis of query languages
that people use.

88
Relational Algebra
• The relational algebra is a procedural query language. Relational
algebra is the basic set of operations for the relational model
• It consists of a set of operations that take one or two relations as
input and produce a new relation as their result.
• Six basic operators
– select: σ
– project: π
– union: ∪
– set difference: –
– Cartesian product: ×
– rename: ρ

89
Relational Algebra Overview
• Relational Algebra consists of several groups of operations
• Unary Relational Operations
– SELECT (symbol: σ (sigma))
– PROJECT (symbol: π (pi))
– RENAME (symbol: ρ (rho))
• Relational Algebra Operations From Set Theory
– UNION (∪), INTERSECTION (∩), DIFFERENCE (or MINUS, –)
– CARTESIAN PRODUCT (×)
• Binary Relational Operations
– JOIN (several variations of JOIN exist)
– DIVISION
• Additional Relational Operations
– OUTER JOINS, OUTER UNION
– AGGREGATE FUNCTIONS (these compute summary information, for example SUM, COUNT, AVG, MIN, MAX)

90
Select Operation
• The select operation selects tuples that satisfy a given predicate.
• Notation: σ_p(r)
• p is called the selection predicate
• The selection condition acts as a filter
• Comparisons are done using =, ≠, <, ≤, >, and ≥ in the selection
predicate.
• We can combine several predicates into a larger predicate by using the connectives and
(∧), or (∨), and not (¬)
Examples of selection:
1. To select the tuples of instructor in the “Physics” department:
σ_{dept_name = “Physics”}(instructor)
2. To find all instructors with a salary greater than $90,000:
σ_{salary > 90000}(instructor)
3. To find the instructors in Physics with a salary greater than $90,000:
σ_{dept_name = “Physics” ∧ salary > 90000}(instructor)
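
The three selections can be mimicked in a few lines of Python. This is only a sketch: modelling a relation as a list of dicts, the helper name `select`, and the sample rows are all assumptions made for illustration.

```python
# A relation is modelled as a list of dicts; the rows are made-up sample data.
instructor = [
    {"ID": 1, "name": "Ada", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Ben", "dept_name": "Physics", "salary": 87000},
    {"ID": 3, "name": "Cy", "dept_name": "Music", "salary": 92000},
]

def select(relation, predicate):
    """sigma_p(r): keep only the tuples for which the predicate holds."""
    return [t for t in relation if predicate(t)]

physics = select(instructor, lambda t: t["dept_name"] == "Physics")
high_paid = select(instructor, lambda t: t["salary"] > 90000)
# The connective 'and' plays the role of the ∧ symbol.
both = select(
    instructor,
    lambda t: t["dept_name"] == "Physics" and t["salary"] > 90000,
)
```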

91
Project Operation
• Project is used to display the required attributes from a relation.
• Notation: π_{A1, A2, …, Ak}(r)
where A1, A2, …, Ak are attribute names and r is a relation name.
• The result is defined as the relation of k columns obtained by erasing the columns
that are not listed
• Duplicate rows are removed from the result, since relations are sets
Example:
• To list the ID, name, and salary attributes of all instructors:
π_{ID, name, salary}(instructor)
• To find the names of all instructors in the Physics department:
π_{name}(σ_{dept_name = “Physics”}(instructor))
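
A small Python sketch of projection, including the duplicate elimination mentioned above. The helper name `project` and the sample rows are assumptions for illustration.

```python
# Made-up sample rows for the instructor relation.
instructor = [
    {"ID": 1, "name": "Ada", "dept_name": "Physics", "salary": 95000},
    {"ID": 2, "name": "Ben", "dept_name": "Physics", "salary": 87000},
    {"ID": 3, "name": "Cy", "dept_name": "Music", "salary": 92000},
]

def project(relation, attributes):
    """pi_{A1..Ak}(r): keep the listed columns and drop duplicate rows."""
    seen = set()
    result = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:  # relations are sets, so duplicates vanish
            seen.add(row)
            result.append(dict(zip(attributes, row)))
    return result

# Two instructors share "Physics", but the result lists it once.
departments = project(instructor, ["dept_name"])
# Projection applied to the output of a selection.
physics_names = project(
    [t for t in instructor if t["dept_name"] == "Physics"], ["name"]
)
```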

92
RENAME
• The RENAME operator is denoted by ρ (rho)
• The general RENAME operation can be expressed
by any of the following forms:
– ρ_{S(B1, B2, …, Bn)}(R) changes both the relation name to S and the column (attribute) names to B1, B2, …, Bn
– ρ_S(R) changes the relation name only, to S
– ρ_{(B1, B2, …, Bn)}(R) changes the column (attribute) names only, to B1, B2, …, Bn
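
A sketch of the attribute-renaming form in Python, assuming a relation is a list of dicts whose keys appear in a fixed order. The helper name `rename` and the sample data are hypothetical.

```python
def rename(relation, new_names):
    """rho_{(B1..Bn)}(R): rename the columns positionally, keeping the rows.

    Relies on every row dict listing its attributes in the same order.
    """
    return [dict(zip(new_names, t.values())) for t in relation]

R = [{"A1": 1, "A2": "x"}, {"A1": 2, "A2": "y"}]
# Binding the result to a new variable S plays the role of renaming the relation.
S = rename(R, ["B1", "B2"])
```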

93
Union Operation
• Notation: r ∪ s
• Defined as:
r ∪ s = {t | t ∈ r or t ∈ s}
• The result of this operation, denoted by r ∪ s, is a relation that includes all tuples that are
either in r or in s or in both r and s. Duplicate tuples are eliminated.
• For r ∪ s to be valid, r and s must be union compatible:
1. r, s must have the same arity (same number of attributes)
2. The attribute domains must be compatible (example: the 2nd column
of r deals with the same type of values as does the 2nd
column of s)
• Example: to find all courses taught in the Fall 2009 semester, or in the Spring 2010
semester, or in both
π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) ∪
π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))

94
Set Difference Operation
• Notation: r – s
• Defined as:
r – s = {t | t ∈ r and t ∉ s}
• Set differences must be taken between compatible relations:
– r and s must have the same arity
– attribute domains of r and s must be compatible
• Example: to find all courses taught in the Fall 2009 semester, but not in the
Spring 2010 semester
π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) −
π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))

95
Set-Intersection Operation

• Notation: r ∩ s
• Defined as:
r ∩ s = {t | t ∈ r and t ∈ s}
• The result of this operation, denoted by r ∩ s, is a relation that
includes all tuples that are in both r and s.
• Assume:
– r, s have the same arity
– attributes of r and s are compatible
• Example: to find all courses taught in both the Fall 2009 semester and
the Spring 2010 semester
π_{course_id}(σ_{semester = “Fall” ∧ year = 2009}(section)) ∩
π_{course_id}(σ_{semester = “Spring” ∧ year = 2010}(section))
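
Union, set difference, and intersection map directly onto Python's set operators when a relation is held as a set of tuples. The sketch below uses made-up section rows, and the helper `check_compatible` is an assumed name that only checks arity.

```python
def check_compatible(r_attrs, s_attrs):
    # Only arity is checked here; domain compatibility is assumed.
    if len(r_attrs) != len(s_attrs):
        raise ValueError("relations are not union compatible")

# Relations as sets of tuples; made-up (course_id, semester, year) rows.
section = {
    ("CS-101", "Fall", 2009),
    ("CS-347", "Fall", 2009),
    ("CS-101", "Spring", 2010),
    ("FIN-201", "Spring", 2010),
}

# Selection followed by projection onto course_id (one-column tuples).
fall_2009 = {(c,) for (c, sem, yr) in section if sem == "Fall" and yr == 2009}
spring_2010 = {(c,) for (c, sem, yr) in section if sem == "Spring" and yr == 2010}

check_compatible(("course_id",), ("course_id",))
either = fall_2009 | spring_2010     # union
only_fall = fall_2009 - spring_2010  # set difference
both = fall_2009 & spring_2010       # intersection
```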

96
Set operations example

97
Cartesian-Product Operation

The Cartesian-product operation, denoted by a cross (×), allows us
to combine information from any two relations.
The Cartesian product of relations r1 and r2 is written as r1 × r2.
Ex: to find the names of all instructors in the Physics department
together with the course_id of all courses they taught.

98
Join / Cartesian Product
• Binary operation between two relations A and B
• The operator generates all possible combinations of the tuples of A and B
• Denoted by ‘×’
• Also known as a ‘cross join’, e.g.

A:
P   Q
p1  q1
p2  q2

B:
M   N
m1  n1
m2  n2
m3  n3

A × B:
P   Q   M   N
p1  q1  m1  n1
p1  q1  m2  n2
p1  q1  m3  n3
p2  q2  m1  n1
p2  q2  m2  n2
p2  q2  m3  n3

99
100
Join / Cartesian Product

Properties of the cross join operation
Given two relations A and B with:
degree(A) = m, degree(B) = n
cardinality(A) = c1, cardinality(B) = c2
Then:
degree(A × B) = degree(A) + degree(B) = m + n
cardinality(A × B) = cardinality(A) × cardinality(B) = c1 × c2
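
The degree and cardinality properties can be confirmed on the small A and B relations from the earlier slide. This is an illustrative sketch that models each relation as a list of tuples.

```python
from itertools import product

A = [("p1", "q1"), ("p2", "q2")]                # degree 2, cardinality 2
B = [("m1", "n1"), ("m2", "n2"), ("m3", "n3")]  # degree 2, cardinality 3

# Concatenate every tuple of A with every tuple of B.
AxB = [a + b for a, b in product(A, B)]

degree = len(AxB[0])    # m + n
cardinality = len(AxB)  # c1 * c2
```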

101
Join in relational Algebra
• Join is a combination of a Cartesian product followed by a selection process.
• A join operation pairs two tuples from different relations if and only if a given join condition is satisfied.
• Various forms of join operation are:
– Inner joins: theta join, equi join, natural join
– Outer joins: left outer join, right outer join, full outer join
• Inner join: only those tuples that satisfy the matching criteria are included, while the rest are excluded.
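
The "product followed by selection" definition translates almost literally into code. In this sketch the helper name `theta_join`, the prefixed attribute names, and the sample rows are assumptions for illustration.

```python
from itertools import product

def theta_join(r, s, condition):
    """Cartesian product followed by a selection on the join condition."""
    return [{**a, **b} for a, b in product(r, s) if condition(a, b)]

# Made-up sample rows; attribute names are prefixed to keep them distinct.
instructor = [
    {"i_ID": 1, "name": "Ada"},
    {"i_ID": 2, "name": "Ben"},
]
teaches = [
    {"t_ID": 1, "course_id": "PHY-101"},
    {"t_ID": 1, "course_id": "PHY-201"},
    {"t_ID": 3, "course_id": "MU-199"},
]

# Equi join: the join condition uses only equality.
taught = theta_join(instructor, teaches, lambda a, b: a["i_ID"] == b["t_ID"])
```

Instructor 2 and the course taught by ID 3 have no partner, so an inner join drops both.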

102
103
104
105
106
107
OUTER JOIN

Table A:
Id  Name
10  Jay
20  Veer
30  John

Table B:
Name   Marks
Rohan  20
Veer   18
John   14
Sam    13

A ⋈ B (inner join on Name):
Id  Name  Marks
20  Veer  18
30  John  14

• In an outer join, along with tuples that satisfy the matching criteria, we also include some
or all tuples that do not match the criteria.
Outer join:
• Left Outer Join (A ⟕ B)
• Right Outer Join (A ⟖ B)
• Full Outer Join (A ⟗ B)

108
LEFT JOIN (⟕)
• This join returns all the rows of the table on the left side of
the join and the matching rows of the table on the right side
of the join.
• For rows that have no matching row on the right side,
the result set contains NULL.
• LEFT JOIN is also known as LEFT OUTER JOIN

Id  Name  Marks
10  Jay   NULL
20  Veer  18
30  John  14

109
RIGHT JOIN (⟖)
• RIGHT JOIN is similar to LEFT JOIN.
• This join returns all the rows of the table on the right side
of the join and the matching rows of the table on the left side
of the join.
• For rows that have no matching row on the left side,
the result set contains NULL.
• RIGHT JOIN is also known as RIGHT OUTER JOIN

Id    Name   Marks
NULL  Rohan  20
20    Veer   18
30    John   14
NULL  Sam    13

110
FULL JOIN (⟗)
• FULL OUTER JOIN creates the result set by combining the
results of both LEFT JOIN and RIGHT JOIN.
• The result set contains all the rows from both
tables.
• For rows that have no match, the result set
contains NULL values.

Id    Name   Marks
10    Jay    NULL
20    Veer   18
30    John   14
NULL  Rohan  20
NULL  Sam    13
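
The three outer joins can be sketched in Python using the Id/Name and Name/Marks rows that appear in the result tables above. The function names are assumptions, `None` stands in for NULL, and the join is hard-coded on the shared `Name` attribute.

```python
A = [
    {"Id": 10, "Name": "Jay"},
    {"Id": 20, "Name": "Veer"},
    {"Id": 30, "Name": "John"},
]
B = [
    {"Name": "Rohan", "Marks": 20},
    {"Name": "Veer", "Marks": 18},
    {"Name": "John", "Marks": 14},
    {"Name": "Sam", "Marks": 13},
]

def inner_join(a_rows, b_rows):
    return [{**a, **b} for a in a_rows for b in b_rows if a["Name"] == b["Name"]]

def left_join(a_rows, b_rows):
    b_names = {b["Name"] for b in b_rows}
    # Unmatched rows of A are padded with None (NULL) for the B-only column.
    padded = [{**a, "Marks": None} for a in a_rows if a["Name"] not in b_names]
    return inner_join(a_rows, b_rows) + padded

def right_join(a_rows, b_rows):
    a_names = {a["Name"] for a in a_rows}
    # Unmatched rows of B are padded with None (NULL) for the A-only column.
    padded = [{"Id": None, **b} for b in b_rows if b["Name"] not in a_names]
    return inner_join(a_rows, b_rows) + padded

def full_join(a_rows, b_rows):
    a_names = {a["Name"] for a in a_rows}
    right_only = [{"Id": None, **b} for b in b_rows if b["Name"] not in a_names]
    return left_join(a_rows, b_rows) + right_only

def as_set(rows):
    """Order-independent view of a result: a set of (Id, Name, Marks) tuples."""
    return {(r["Id"], r["Name"], r["Marks"]) for r in rows}
```

Comparing results through `as_set` avoids depending on row order, which relational algebra leaves unspecified.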

111
OUTER UNION Operations
• The outer union operation was developed to take the union of tuples
from two relations if the relations are not type compatible.
• This operation will take the union of tuples in two relations R(X, Y)
and S(X, Z) that are partially compatible, meaning that only some of
their attributes, say X, are type compatible.
• The attributes that are type compatible are represented only once in the
result, and those attributes that are not type compatible from either
relation are also kept in the result relation T(X, Y, Z).

112
Division ÷
• Binary operation between two relations C and B
• Implicitly, C is A × B, where A is any relation
• C ÷ B => (A × B) ÷ B
• The operator ‘÷’ splits B from C and produces A
• e.g. C ÷ B =>

C = A × B:
P   Q   M   N
p1  q1  m1  n1
p1  q1  m2  n2
p1  q1  m3  n3
p2  q2  m1  n1
p2  q2  m2  n2
p2  q2  m3  n3

B:
M   N
m1  n1
m2  n2
m3  n3

A = C ÷ B:
P   Q
p1  q1
p2  q2

113
Division ÷
Student: 1, 2, .., 60
Subject: DBMS, COA
TestQP (Student × Subject): 120 combinations

TestQP (Student × Subject)    Student    Subject (TestQP ÷ Student)
1,DBMS                        1          DBMS
1,COA                         2          COA
2,DBMS                        3
2,COA                         ..
….                            60
60,DBMS
60,COA

114
Division ÷

TestQP (Student × Subject)    Subject    Student (TestQP ÷ Subject)
1,DBMS                        DBMS       1
1,COA                         COA        2
2,DBMS                                   3
2,COA                                    ..
….                                       60
60,DBMS
60,COA

115
Formal examples: Division ÷
Cases:
Case 1:
Given that A, B, C are relations and X, Y are attributes:
C(X, Y) ÷ A(X) => B(Y)
C(X, Y) ÷ A(Y) => B(X)
Case 2:
X   Y        Y        X
X1  Y1   ÷   Y1   =   X1
X2  Y2       Y2
X1  Y2
X4  Y4

116
Division ÷
Formal examples:
Case 3:
X   Y        X        Y
X1  Y1   ÷   X1   =   Y1
X2  Y2                Y2
X1  Y2
X4  Y4
Case 4:
X   Y        Y        X
X1  Y1   ÷   Y1   =   Null
X2  Y2       Y2
X1  Y2       Y3
X4  Y4       Y4

117
Division ÷
Formal examples:
Case 5:
X   Y        Y        X
X1  Y1   ÷   Y1   =   X1
X2  Y1                X2
X3  Y1                X3
X4  Y1                X4
Case 6:
X   Y        Y        X
X1  Y1   ÷   Y1   =   X1
X2  Y1       Y2       X2
X2  Y2
X1  Y2
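
The division cases can be reproduced with a short Python function. This is a sketch: the name `divide` is an assumption, relations are modelled as sets, and treating an empty divisor as matching every X value is a convention chosen here, not something the slides state.

```python
def divide(c, b):
    """C(X, Y) ÷ B(Y): the X values paired in C with every Y value in B.

    c is a set of (x, y) pairs and b is a set of y values. An empty b is
    treated as matching every x in c (an assumed convention).
    """
    xs = {x for x, _ in c}
    return {x for x in xs if all((x, y) in c for y in b)}

C = {("X1", "Y1"), ("X2", "Y2"), ("X1", "Y2"), ("X4", "Y4")}

case2 = divide(C, {"Y1", "Y2"})              # only X1 pairs with both Y1 and Y2
case4 = divide(C, {"Y1", "Y2", "Y3", "Y4"})  # no X pairs with all four: empty
case6 = divide(
    {("X1", "Y1"), ("X2", "Y1"), ("X2", "Y2"), ("X1", "Y2")},
    {"Y1", "Y2"},
)
```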

118
Recap of Relational Algebra Operations

119
Aggregate Function Operation
• Use of the Aggregate Function operation ℱ
– ℱ_{MAX Salary}(EMPLOYEE) retrieves the maximum Salary value from the EMPLOYEE relation
– ℱ_{MIN Salary}(EMPLOYEE) retrieves the minimum Salary value from the EMPLOYEE relation
– ℱ_{SUM Salary}(EMPLOYEE) retrieves the sum of the Salary values from the EMPLOYEE relation
– ℱ_{COUNT SSN, AVERAGE Salary}(EMPLOYEE) computes the count (number) of employees and their average salary
• Note: COUNT just counts the number of rows, without removing duplicates
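
The aggregate functions correspond to Python built-ins applied to one column. The EMPLOYEE rows below are made up, and only the SSN and Salary attributes are modelled.

```python
# Made-up EMPLOYEE rows; only SSN and Salary are modelled.
EMPLOYEE = [
    {"SSN": "111", "Salary": 30000},
    {"SSN": "222", "Salary": 40000},
    {"SSN": "333", "Salary": 40000},
]

salaries = [e["Salary"] for e in EMPLOYEE]

max_salary = max(salaries)
min_salary = min(salaries)
sum_salary = sum(salaries)
count_ssn = len([e["SSN"] for e in EMPLOYEE])  # counts rows
average_salary = sum_salary / count_ssn
# COUNT keeps duplicates: the repeated 40000 is counted twice.
count_salary = len(salaries)
```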

120
Examples of applying aggregate functions and grouping

121
Examples of Queries in Relational Algebra

• Query 1. Retrieve the name and address of all
employees who work for the ‘Research’
department.

122
• Query 2. For every project located in
‘Stafford’, list the project number, the
controlling department number, and the
department manager’s last name, address,
and birth date.

123
Query 3. Find the names of employees who work on all
the projects controlled by department number 5.

124
Query 4. Make a list of project numbers for projects
that involve an employee whose last name is ‘Smith’,
either as a worker or as a manager of the
department that controls the project.

125
Query 5. List the names of all employees with
two or more dependents.

126
Query 6. Retrieve the names of employees who have
no dependents.

127
Query 7. List the names of managers who have
at least one dependent.

128
Thank YOU
129
