Chp2 Database

A data model is a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints.

The relational model describes data at the logical and view levels, abstracting away low-level details of data storage.

Vertical partitioning involves breaking a relation into sub-relations by selecting a subset of the attributes (columns) to form a new relation, while horizontal partitioning involves breaking a relation into sub-relations by selecting a subset of the tuples (rows) to form a new relation.
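
The two kinds of partitioning can be sketched in Python. This is only an illustration: the sample rows and the helper names `vertical_partition` and `horizontal_partition` are assumptions made for the example, not part of the notes.

```python
# Hypothetical sample data for the instructor relation (values are made up).
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 65000},
    {"ID": "12121", "name": "Wu", "dept_name": "Finance", "salary": 90000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 95000},
]


def vertical_partition(relation, attributes):
    """Keep only the named attributes (columns) of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]


def horizontal_partition(relation, predicate):
    """Keep only the tuples (rows) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


names_only = vertical_partition(instructor, ["ID", "name"])
high_paid = horizontal_partition(instructor, lambda row: row["salary"] >= 90000)
```

Here `names_only` keeps every row but only two columns, while `high_paid` keeps every column but only the rows that pass the test.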

2.1 Structure of Relational Databases


Three types of key:
1. Primary key
2. Super key
3. Candidate key

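Degree: the number of attributes in a relation.

A tuple is simply a sequence (or list) of values; it corresponds to a row of a table.

An attribute refers to a column of a table.

Domain: the set of permitted values of an attribute. For example, the domain of the salary attribute of the instructor relation is the set of all possible salary values.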

Atomic

A domain is atomic if its elements are treated as indivisible units. If we split the value of the phone number attribute into a country code, an area code, and a local number, we would be treating it as a non-atomic value.

Null value

The null value is a special value that signifies that a value is unknown or does not exist. For example, if an instructor has no phone number, we would have to use the null value for that attribute.

2.2 Database Schema
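A relation schema consists of a list of attributes and their corresponding domains.
A relation instance is the set of tuples the relation holds at a given time. We often use the same name, such as instructor, to refer to both the schema and the instance.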

2.3 Keys
No two tuples in a relation are allowed to have exactly the same value for all attributes.

A superkey is a set of one or more attributes that, taken collectively, allow us to identify uniquely a tuple in the relation.

Minimal superkeys, those for which no proper subset is also a superkey, are called candidate keys.

Candidate keys left over after the primary key is chosen → alternate keys
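
The difference between a superkey and a candidate key can be checked mechanically on a small instance. The sketch below is an illustration under assumptions: the sample rows and the function names `is_superkey` and `candidate_keys` are made up for the example. A key is really a property of the schema (it must hold for every legal instance), so a single instance can only show that a set of attributes is not a key; it cannot prove that it is one.

```python
from itertools import combinations

# Hypothetical instance; each row is a dict keyed by attribute name.
rows = [
    {"ID": "10101", "name": "Kim", "dept_name": "Physics"},
    {"ID": "12121", "name": "Kim", "dept_name": "Finance"},
    {"ID": "22222", "name": "Wu", "dept_name": "Physics"},
]


def is_superkey(rows, attrs):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)


def candidate_keys(rows):
    """Minimal superkeys of this instance, smallest sets first."""
    attributes = sorted(rows[0])
    keys = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found key is a superkey but not minimal.
            if any(set(k) <= set(attrs) for k in keys):
                continue
            if is_superkey(rows, attrs):
                keys.append(attrs)
    return keys


keys = candidate_keys(rows)
```

On this instance `keys` contains `("ID",)` and `("dept_name", "name")`. The second one is an accident of the sample data: two instructors with the same name in the same department would break it, which is why the designer, not the data, decides which sets are keys.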

Primary key: the candidate key that is chosen by the database designer as the principal means of identifying tuples within a relation.

Primary keys are also referred to as primary key constraints.

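Foreign-key constraint

A foreign-key constraint from attribute(s) A of the referencing relation to the primary key of the referenced relation requires that the value of A in every tuple of the referencing relation also appears in some tuple of the referenced relation.

In a foreign-key constraint, the referenced attribute(s) must be the primary key of the referenced relation.

Database systems typically support foreign-key constraints, but they do not support referential integrity constraints where the referenced attribute is not a primary key.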
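
A foreign-key constraint can be seen in action with Python's built-in `sqlite3` module. This is a minimal sketch, not the notes' own example: the `department` table and the sample rows are assumptions. SQLite only enforces foreign keys after `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# department is the referenced relation; instructor is the referencing relation.
conn.execute("CREATE TABLE department (dept_name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE instructor ("
    " ID TEXT PRIMARY KEY,"
    " name TEXT,"
    " dept_name TEXT REFERENCES department(dept_name))"
)
conn.execute("INSERT INTO department VALUES ('Physics')")
conn.execute("INSERT INTO instructor VALUES ('22222', 'Einstein', 'Physics')")

try:
    # 'Music' does not appear in department, so this row violates the constraint.
    conn.execute("INSERT INTO instructor VALUES ('33333', 'Gold', 'Music')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM instructor").fetchone()[0]
```

The second insert is refused, so only the tuple whose `dept_name` exists in the referenced relation is stored.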

2.4 Schema Diagram


Composite key → a key made up of two or more columns that, taken together, identify each row uniquely. Even though no single column in the combination can identify a row on its own, the combination of columns can uniquely identify any record.

Alternate keys are those candidate keys that are not the primary key. A table can have only one primary key, so all the remaining candidate keys are known as alternate or secondary keys.

An artificial key is an extra attribute added to the table that is visible to the user. It does not exist in the external reality, but it can be verified for syntax or check digits within itself. Example: the open codes in the UPC/EAN scheme that users can assign to their own items.

An association table shows the relationship between two tables by means of foreign keys that reference their primary keys.
Example: Library Management System (figure not reproduced).

In a schema diagram, two-headed arrows are used for referential integrity constraints that are not foreign-key constraints.

2.5 Relational query language


The functional language was designed to allow for direct translation of the functional forms into first-order logical forms, which are used to query the target database. The functional query language uses the same basic constructs as the original query language to represent objects in the target database.

Declarative query languages let users express what data to retrieve, letting the engine underneath take care of seamlessly retrieving it. They function in a more general manner and involve giving broad instructions about what task is to be completed, rather than the specifics of how to complete it.

2.6 The Relational Algebra


- The relational algebra consists of a set of operations that take one or two relations as input and produce a new relation as their result.

- The select, project, and rename operations are called unary operations because they operate on one relation.

- The union, Cartesian product, and set difference operations operate on pairs of relations and are, therefore, called binary operations.

We must ensure that the input relations to the union operation have the same
number of attributes; the number of attributes of a relation is referred to as its
arity.

When the attributes have associated types, the types of the ith attributes of both input relations must be the same, for each i.
Such relations are referred to as compatible relations.
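
The compatibility check for union can be sketched as follows. The representation of a relation as a `(schema, tuples)` pair, the function names, and the sample data are all assumptions chosen for the illustration.

```python
# A relation is modelled as (schema, tuples): schema is a list of
# (attribute, type) pairs and tuples is a set of Python tuples.
def compatible(r, s):
    """Same arity, and the i-th attribute types match for each i."""
    r_types = [t for _, t in r[0]]
    s_types = [t for _, t in s[0]]
    return r_types == s_types  # list equality also compares the lengths


def union(r, s):
    """Set union of two compatible relations; keeps the schema of r."""
    if not compatible(r, s):
        raise ValueError("relations are not compatible")
    return (r[0], r[1] | s[1])


# Hypothetical sample relations.
fall = ([("course_id", str)], {("CS-101",), ("PHY-101",)})
spring = ([("course_id", str)], {("CS-101",), ("FIN-201",)})
teaches = ([("ID", str), ("year", int)], {("10101", 2017)})

all_courses = union(fall, spring)
```

`fall` and `spring` both have arity 1 with matching types, so their union is defined; `fall` and `teaches` differ in arity, so `union(fall, teaches)` would raise `ValueError`.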

The select operation selects tuples that satisfy a given predicate.

σ_{dept_name = "Physics" ∧ salary > 90000}(instructor)

Suppose we want to list all instructors' ID, name, and salary, but we do not care about the dept_name. The project operation allows us to produce this relation.

Π_{ID, name, salary}(instructor)

Π_{ID, name, salary/12}(instructor) gives the monthly salary of each instructor.
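
Select and project can be imitated in a few lines of Python. This is a rough sketch under assumptions: the rows are invented (salaries chosen to divide evenly by 12), and `select` and `project` are illustrative helper names.

```python
# Hypothetical instance of the instructor relation.
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci.", "salary": 66000},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics", "salary": 96000},
    {"ID": "33456", "name": "Gold", "dept_name": "Physics", "salary": 84000},
]


def select(relation, predicate):
    """Select: keep the tuples that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, columns):
    """Project: columns maps each output attribute to a function of the row."""
    result = []
    for row in relation:
        out = {name: f(row) for name, f in columns.items()}
        if out not in result:  # a relation is a set, so duplicates are dropped
            result.append(out)
    return result


# Physics instructors earning more than 90000.
physics_rich = select(
    instructor,
    lambda r: r["dept_name"] == "Physics" and r["salary"] > 90000,
)

# ID, name, and monthly salary (salary / 12) of every instructor.
monthly = project(
    instructor,
    {
        "ID": lambda r: r["ID"],
        "name": lambda r: r["name"],
        "monthly_salary": lambda r: r["salary"] / 12,
    },
)
```

Letting each output column be a function of the row is what allows an expression such as salary/12 to appear in the projection list.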

Composition of Relational Operations

"Find the names of all instructors in the Physics department." We write:

Π_{name}(σ_{dept_name = "Physics"}(instructor))

The Cartesian-product operation, denoted by a cross (×), allows us to combine information from any two relations.

The relation schema for r = instructor × teaches is:

(instructor.ID, instructor.name, instructor.dept_name, instructor.salary, teaches.ID, teaches.course_id, teaches.sec_id, teaches.semester, teaches.year)

The join operation allows us to combine a selection and a Cartesian product into a single operation.

The join operation r ⋈_θ s is defined as follows:

r ⋈_θ s = σ_θ(r × s)

σ_{instructor.ID = teaches.ID}(instructor × teaches) can equivalently be written as instructor ⋈_{instructor.ID = teaches.ID} teaches.
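
The definition of the join as a selection over a Cartesian product translates almost literally into code. The sketch below assumes invented sample rows and illustrative function names (`product`, `theta_join`); a real system would not materialise the full product.

```python
def product(r, s, r_name, s_name):
    """Cartesian product; each attribute is prefixed with its relation name."""
    return [
        {
            **{f"{r_name}.{k}": v for k, v in a.items()},
            **{f"{s_name}.{k}": v for k, v in b.items()},
        }
        for a in r
        for b in s
    ]


def theta_join(r, s, r_name, s_name, theta):
    """Theta join computed literally as a selection over the product."""
    return [row for row in product(r, s, r_name, s_name) if theta(row)]


# Hypothetical sample data with only the attributes needed here.
instructor = [
    {"ID": "10101", "name": "Srinivasan"},
    {"ID": "22222", "name": "Einstein"},
]
teaches = [
    {"ID": "10101", "course_id": "CS-101"},
    {"ID": "10101", "course_id": "CS-315"},
    {"ID": "22222", "course_id": "PHY-101"},
]

pairs = product(instructor, teaches, "instructor", "teaches")
joined = theta_join(
    instructor,
    teaches,
    "instructor",
    "teaches",
    lambda row: row["instructor.ID"] == row["teaches.ID"],
)
```

The product pairs every instructor with every teaches tuple (2 × 3 rows); the join keeps only the pairs whose IDs match.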
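Intersection

Courses taught in both the Fall 2017 and the Spring 2018 semesters:
Π_{course_id}(σ_{semester = "Fall" ∧ year = 2017}(section)) ∩ Π_{course_id}(σ_{semester = "Spring" ∧ year = 2018}(section))

Set-difference operation

Courses taught in the Fall 2017 semester but not in the Spring 2018 semester:
Π_{course_id}(σ_{semester = "Fall" ∧ year = 2017}(section)) − Π_{course_id}(σ_{semester = "Spring" ∧ year = 2018}(section))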
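
Because a relation is a set of tuples, both queries map directly onto Python's set operators. The `section` rows below are hypothetical, and `course_ids` is a helper name introduced only for this sketch.

```python
# Hypothetical instance of the section relation.
section = [
    {"course_id": "CS-101", "semester": "Fall", "year": 2017},
    {"course_id": "CS-101", "semester": "Spring", "year": 2018},
    {"course_id": "PHY-101", "semester": "Fall", "year": 2017},
    {"course_id": "FIN-201", "semester": "Spring", "year": 2018},
]


def course_ids(semester, year):
    """Select on semester and year, then project onto course_id."""
    return {
        row["course_id"]
        for row in section
        if row["semester"] == semester and row["year"] == year
    }


fall_2017 = course_ids("Fall", 2017)
spring_2018 = course_ids("Spring", 2018)

both = fall_2017 & spring_2018       # intersection
only_fall = fall_2017 - spring_2018  # set difference
```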

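Assignment operation (←)

The assignment operation stores the result of an expression in a temporary relation variable, so that a query can be written in steps:

c2016 ← Π_{course_id}(σ_{year = 2016 ∧ dept = "physics"}(semester))

c2017 ← Π_{course_id}(σ_{year = 2017 ∧ dept = "English"}(semester))

c2016 ∪ c2017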
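Rename operator

ρ_x(E) returns the result of expression E under the name x.

Renaming attributes

ρ_{x(A1, A2, …, An)}(E) returns the result of E under the name x, with the attributes renamed to A1, A2, …, An.

Find the ID and name of the instructors who earn more than the instructor whose ID is 12121:
Π_{i.ID, i.name}(σ_{i.salary > w.salary}(ρ_i(instructor) × σ_{w.ID = 12121}(ρ_w(instructor))))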
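
The rename query compares a relation with itself, which is why two names (i and w) are needed. A minimal sketch, assuming invented salary values and an illustrative `rename` helper:

```python
# Hypothetical instance of the instructor relation.
instructor = [
    {"ID": "10101", "name": "Srinivasan", "salary": 65000},
    {"ID": "12121", "name": "Wu", "salary": 90000},
    {"ID": "22222", "name": "Einstein", "salary": 95000},
]


def rename(relation, alias):
    """Return a copy of the relation whose attributes are prefixed with alias."""
    return [{f"{alias}.{k}": v for k, v in row.items()} for row in relation]


i = rename(instructor, "i")
# Second copy, restricted to the instructor with ID 12121.
w = [row for row in rename(instructor, "w") if row["w.ID"] == "12121"]

# Product of the two copies, selection on salary, projection on ID and name.
result = [
    {"i.ID": a["i.ID"], "i.name": a["i.name"]}
    for a in i
    for b in w
    if a["i.salary"] > b["w.salary"]
]
```

Without the two aliases the attribute names of the two copies would clash and the salary comparison could not be written.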

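Equivalent Queries

Both expressions below find the Physics instructors together with the courses they teach. The second applies the selection before the join, so the join handles fewer tuples:

σ_{dept_name = "Physics"}(instructor ⋈_{instructor.ID = teaches.ID} teaches)

(σ_{dept_name = "Physics"}(instructor)) ⋈_{instructor.ID = teaches.ID} teaches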
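
That the two expressions give the same answer can be checked on a small instance. The data and the helper `join_on_id` are assumptions for this sketch; agreement on one instance illustrates the equivalence but does not prove it.

```python
# Hypothetical sample data.
instructor = [
    {"ID": "10101", "name": "Srinivasan", "dept_name": "Comp. Sci."},
    {"ID": "22222", "name": "Einstein", "dept_name": "Physics"},
]
teaches = [
    {"ID": "10101", "course_id": "CS-101"},
    {"ID": "22222", "course_id": "PHY-101"},
]


def join_on_id(instructors, teaches_rows):
    """Join on instructor.ID = teaches.ID, keeping a single ID column."""
    return [
        {**i, "course_id": t["course_id"]}
        for i in instructors
        for t in teaches_rows
        if i["ID"] == t["ID"]
    ]


def is_physics(row):
    return row["dept_name"] == "Physics"


# Plan A: join first, then select.
plan_a = [row for row in join_on_id(instructor, teaches) if is_physics(row)]

# Plan B: select first, then join the smaller relation.
plan_b = join_on_id([row for row in instructor if is_physics(row)], teaches)
```

Both plans return the same single tuple here, but plan B examines fewer pairs, which is the kind of difference a query optimizer exploits.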

2.7 Summary
• The relational data model is based on a collection of tables. The user of the database system may query these tables, insert new tuples, delete tuples, and update (modify) tuples. There are several languages for expressing these operations.
• The schema of a relation refers to its logical design, while an instance of the relation refers to its contents at a point in time. The schema of a database and an instance of a database are similarly defined. The schema of a relation includes its attributes, and optionally the types of the attributes and constraints on the relation such as primary and foreign-key constraints.
• A superkey of a relation is a set of one or more attributes whose values are guaranteed to identify tuples in the relation uniquely. A candidate key is a minimal superkey, that is, a set of attributes that forms a superkey, but none of whose subsets is a superkey. One of the candidate keys of a relation is chosen as its primary key.
• A foreign-key constraint from attribute(s) A of relation r1 to the primary-key B of relation r2 states that the value of A for each tuple in r1 must also be the value of B for some tuple in r2. The relation r1 is called the referencing relation, and r2 is called the referenced relation.
• A schema diagram is a pictorial depiction of the schema of a database that shows the relations in the database, their attributes, and primary keys and foreign keys.
• The relational query languages define a set of operations that operate on tables and output tables as their results. These operations can be combined to get expressions that express desired queries.
• The relational algebra provides a set of operations that take one or more relations as input and return a relation as an output. Practical query languages such as SQL are based on the relational algebra, but they add a number of useful syntactic features.
• The relational algebra defines a set of algebraic operations that operate on tables, and output tables as their results. These operations can be combined to get expressions that express desired queries. The algebra defines the basic operations used within relational query languages like SQL.

imperative, functional, declarative

Imperative, functional, and declarative programming languages are all used in database management systems (DBMS) for querying and manipulating data. Each programming paradigm has its own strengths and weaknesses when it comes to working with databases. Here are some relative merits of each of these paradigms in DBMS:

Imperative Languages:
In an imperative query language, the user tells the system to perform a specific sequence of operations on the database to compute the desired result. This gives precise control over how the data is processed, which suits tasks that need step-by-step logic. However, the code can be difficult to read and maintain, and because every step is spelled out by hand it leaves more room for errors. SQL itself is not primarily imperative, although the procedural extensions that many systems add to it are.

Functional Languages:
In a functional query language, a query is expressed as the evaluation of functions that operate on data in the database or on the results of other functions, without side effects. The relational algebra is of this kind: each operation takes relations as input and returns a relation. By focusing on what the user wants to do with the data, functional languages can make code more modular, reusable, and easier to test, which helps with complex queries. However, functional programming languages such as Haskell are not as widely used for database work, and there may be a steeper learning curve for users who are not familiar with the functional programming paradigm.

Declarative Languages:
Declarative languages allow users to write queries that specify the desired outcome or result, rather than how to achieve it; the system decides how to obtain it. The core of SQL works this way, which is one reason it is so widely used and well supported, and so do logic-based languages such as Prolog. This makes declarative languages particularly well suited to complex rule-based systems and other data-intensive applications, and to expressing complex relationships between data elements. However, the user has less control over how a query is carried out than in an imperative language, so declarative languages may not be suitable for all types of data management tasks.

In summary, the choice of programming paradigm in DBMSs depends on the specific needs of the project. Imperative code gives precise control over processing, functional languages can make code more modular and reusable, and declarative languages can be useful for complex rule-based systems and other data-intensive applications. Query languages used in practice, such as SQL, include elements of all three approaches.

Need for null values

Null values may be introduced into a database for a variety of reasons. Here
are two common reasons:

Missing or Unknown Data: Null values can be used to indicate missing or
unknown data. For example, if a person's middle name is not known or not
applicable, the corresponding field in the database can be left blank or filled
with a null value. Similarly, if a customer does not provide their email address
when creating an account, the email field can be set to null.

Optional Data: Sometimes, certain data fields in a database may be optional. For example, a customer may choose to provide their phone number, but it is not required to complete the registration process. In this case, if the customer chooses not to provide their phone number, the corresponding field in the database can be set to null to indicate that the data is missing, rather than leaving it blank or filling it with a default value.
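
The optional phone number can be tried out with Python's built-in `sqlite3` module, where Python's `None` is stored as SQL null. The table and the customer rows are assumptions made for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# name is required, phone is optional (it may hold null).
conn.execute("CREATE TABLE customer (name TEXT NOT NULL, phone TEXT)")
conn.execute("INSERT INTO customer VALUES (?, ?)", ("Ann", "555-0100"))
conn.execute("INSERT INTO customer VALUES (?, ?)", ("Bob", None))  # None is stored as null

# IS NULL is the test for a missing value.
missing = conn.execute("SELECT name FROM customer WHERE phone IS NULL").fetchall()

# Comparing with = never matches a null: the comparison is unknown, not true.
equals_null = conn.execute("SELECT name FROM customer WHERE phone = NULL").fetchall()
```

Storing null rather than an empty string or a default keeps "no phone number given" distinguishable from any real value, at the cost of needing `IS NULL` instead of `=` when querying for it.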

relation vs relation schema

In the context of databases, a "relation" refers to a table that contains data organized into rows and columns. A relation consists of a set of tuples, where each tuple represents a single record or row, and each attribute or column represents a particular field or property of that record. The term "relation" is synonymous with the term "table" in database terminology.

On the other hand, a "relation schema" refers to the structure or design of a relation. It defines the names of the attributes or columns, the data types of the attributes, and any constraints or rules that apply to the data. A relation schema describes the metadata of a relation, but it does not contain any actual data.

In summary, a "relation" is a set of data organized into rows and columns, while a "relation schema" is the structure or design of that data, including the names, types, and constraints of the columns.
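
The distinction can be made concrete with a small sketch: the schema is fixed metadata, while the instance is a set of tuples that changes over time. The names `instructor_schema`, `conforms`, and the sample tuples are assumptions for illustration only.

```python
# Schema: attribute names and types only, no data.
instructor_schema = (("ID", str), ("name", str), ("salary", int))


def conforms(row, schema):
    """True if the tuple has the schema's arity and attribute types."""
    return len(row) == len(schema) and all(
        isinstance(value, attr_type) for value, (_, attr_type) in zip(row, schema)
    )


# Two instances of the same schema at different points in time.
instance_before = {("10101", "Srinivasan", 65000)}
instance_after = instance_before | {("22222", "Einstein", 95000)}
```

The schema stays the same while the instance grows from one tuple to two; a tuple with a missing attribute, such as `("10101", "Srinivasan")`, does not conform to it.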
