ATTRIBUTES:
Attributes are the characteristics of entities. Some entities have
many attributes, while others have only a few. Attributes are
classified into five categories. The following simple table is used
to explain how an attribute can belong to each category:
Example: Student (stu_LastName, stu_MiddleName, stu_FirstName,
stu_Age, stu_Phone, stu_Email).
Required or Optional Attributes:
A required attribute is an attribute that must have a value, while
an optional attribute may be left blank. An attribute is made
required to emphasize what is important in that entity and what
distinguishes it from other entities.
Example: Consider the entity Student above.
stu_LastName and stu_FirstName would be required attributes, since
they identify a student and we assume all students have a first and
last name. Optional attributes in the table Student could
be stu_MiddleName, stu_Email, and stu_Phone, since some students
may not have a middle name, a phone number, or an email address.
Key and Non-key Attributes:
In every entity, an attribute or a group of attributes uniquely
identifies that entity. These are the key attributes, and they range
from a primary key (single-attribute identifier) to a composite key
(multi-attribute identifier). The remaining attributes are the
non-key attributes, or descriptors, which simply describe the entity.
Example: In the table Student above there is only one unique
identifier, stu_LastName, which is the primary key of the table. The rest
of the attributes are descriptors.
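The required/optional and key/non-key distinctions above can be sketched with SQLite through Python's sqlite3 module. This is only a minimal illustration under some assumptions: the sample names are made up, stu_Age is left optional because the text does not classify it, and stu_LastName is the primary key only because the example above says so.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Student (
        stu_LastName   TEXT NOT NULL PRIMARY KEY,  -- key attribute (per the example)
        stu_FirstName  TEXT NOT NULL,              -- required
        stu_MiddleName TEXT,                       -- optional
        stu_Age        INTEGER,                    -- assumption: treated as optional
        stu_Phone      TEXT,                       -- optional
        stu_Email      TEXT                        -- optional
    )
    """
)


def try_insert(last_name, first_name):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute(
            "INSERT INTO Student (stu_LastName, stu_FirstName) VALUES (?, ?)",
            (last_name, first_name),
        )
        return True
    except sqlite3.IntegrityError:
        return False


# Optional attributes may be omitted; they are stored as NULL.
accepted = try_insert("Smith", "Ann")
# A required attribute left blank is rejected.
missing_required = try_insert("Jones", None)
# A key value that is already in use is rejected.
duplicate_key = try_insert("Smith", "Bob")
```

The duplicate-key case shows why a last name is a fragile identifier in practice: two students named Smith cannot both be stored.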
Single and Composite Attributes:
Attributes can be classified as either having many parts or being a
single, indivisible attribute. A composite attribute is an attribute
that can be broken down into smaller parts.
... something in order to reduce it to a set of essential
characteristics.
DBMS:
A database is an organized collection of data. It is the collection of
schemas, tables, queries, reports, views and other objects. The data is
typically organized to model aspects of reality in a way that
supports processes requiring information.
A database management system (DBMS) is a computer ...
... database management system has promising potential ...
... private files, which cannot be shared between multiple ...
It is clear from the above file systems that some common student data
has to be recorded in each application, such as
Rollno, Name, Class, Phone_No, Address, etc. This causes
redundancy, which wastes storage space and makes the data
difficult to maintain. With a centralized database, however, data can
be shared by a number of applications, and the whole college can
maintain its computerized data with the following database:
It is clear in the above database that Rollno, Name, Class,
Father_Name, Address, Phone_No and Date_of_birth, which are stored
repeatedly in each application of the file system, need not be stored
repeatedly in a database, because every other application can access
this information by joining relations on the common column, i.e.
Rollno. Suppose a user of the Library system needs the Name and
Address of a particular student. By joining the Library and General
Office relations on the column Rollno, he/she can easily retrieve
this information.
Thus, we can say that a centralized DBMS reduces the redundancy of
data to a great extent but cannot eliminate it, because Rollno is
still repeated in all the relations.
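The join on the common column Rollno described above might look like the following sketch, again using SQLite from Python. The table and column names General_Office, Library, Rollno, Name and Address come from the text; the Book_Title column and all sample values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE General_Office (
        Rollno  INTEGER PRIMARY KEY,
        Name    TEXT,
        Address TEXT
    );
    CREATE TABLE Library (
        Rollno     INTEGER,  -- the common column, still repeated in every relation
        Book_Title TEXT      -- hypothetical column, for illustration only
    );
    INSERT INTO General_Office VALUES (1, 'Asha', '12 Hill Road');
    INSERT INTO Library VALUES (1, 'Database Basics');
    """
)

# The Library application stores no Name or Address of its own;
# it retrieves them from General_Office through the shared Rollno.
rows = conn.execute(
    """
    SELECT g.Name, g.Address, l.Book_Title
    FROM Library AS l
    JOIN General_Office AS g ON g.Rollno = l.Rollno
    WHERE l.Rollno = ?
    """,
    (1,),
).fetchall()
```

Name and Address are stored once, in General_Office, while Rollno appears in both tables, which is the residual redundancy the text mentions.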
2. Integrity can be enforced: Integrity of data means that the data in
the database is always accurate, such that incorrect information
cannot be stored in the database. In order to maintain the integrity
of data, some ...
... inconsistent database is capable of supplying incorrect or ...
... can be shared: The data about Name, Class, and ...
... level, National level or International level.
... The Enterprise Requirement than Individual ...
Disadvantages of DBMS
The disadvantages of the database approach are summarized as
follows:
1. Complexity: The provision of the functionality that is expected of a
good DBMS makes the DBMS an extremely complex piece of software.
Database designers, developers, database administrators and end-users
must understand this functionality to take full advantage of it.
Failure to understand the system can lead to bad design decisions,
which can have serious consequences for an organization.
2. Size: The complexity and breadth of functionality make the DBMS
an extremely large piece of software, occupying many megabytes of
disk space and requiring substantial amounts of memory to run
efficiently.
3. Performance: Typically, a file-based system is written for a specific
application, such as invoicing. As a result, performance is generally very
good. However, the DBMS is written to be more general, to cater for
many applications rather than just one. The effect is that some
applications may not run as fast as they used to.
4. Higher impact of a failure: The centralization of resources
increases the vulnerability of the system. Since all users and
applications rely on the availability of the DBMS, the failure of any
component can bring operations to a halt.
5. Cost of DBMS: The cost of DBMS varies significantly, depending on
the environment and functionality provided. There is also the recurrent
annual maintenance cost.
The functional components of a database system can be divided into
the storage manager and the query processor components.
Storage Manager
Query Processor
Storage Manager
The storage manager is important because databases typically require a
large amount of storage space. It is therefore very important to use
storage efficiently and to minimize the movement of data to and from disk.
A storage manager is a program module that provides the interface
between the low-level data stored in the database and the application
programs and the queries submitted to the system. The storage
manager is responsible for the interaction with the file manager. It
translates the various DML statements into low-level
file system commands. Thus the storage manager is responsible for
storing, retrieving, and updating data in the database. The storage
manager components include the following:
Authorization and Integrity Manager
Transaction Manager
File Manager
Buffer Manager
The authorization and integrity manager tests for the satisfaction of
integrity constraints and checks the authority of users to access data.
The transaction manager ensures that the database remains in a
consistent state and allows concurrent transactions to proceed
without conflicting. The file manager manages the allocation of space
on disk storage and the data structures used to represent information
stored on disk. The buffer manager is responsible for fetching data
from disk storage into main memory and deciding what data to cache
in main memory.
The storage manager implements the following data structures as part
of the physical system implementation: data files, the data
dictionary, and indices. Data files store the database itself. The
data dictionary stores metadata about the structure of the database,
in particular the schema of the database. Indices provide fast access
to data items.
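To make the buffer manager's role more concrete, here is a small sketch of a page cache in Python. The text does not say which replacement policy a buffer manager uses; least-recently-used (LRU) eviction is assumed here purely for illustration, and the class name, the capacity parameter and the dictionary standing in for disk storage are all hypothetical.

```python
from collections import OrderedDict


class BufferManager:
    """Keeps a bounded number of disk pages in memory (assumed policy: LRU)."""

    def __init__(self, read_page_from_disk, capacity=2):
        self.read_page_from_disk = read_page_from_disk  # callable: page_id -> contents
        self.capacity = capacity
        self.pages = OrderedDict()  # page_id -> contents, least recently used first
        self.disk_reads = 0

    def get_page(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # cache hit: mark as most recently used
            return self.pages[page_id]
        self.disk_reads += 1  # cache miss: fetch the page from disk
        contents = self.read_page_from_disk(page_id)
        self.pages[page_id] = contents
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return contents


# A dictionary stands in for disk storage in this sketch.
disk = {1: "page-1", 2: "page-2", 3: "page-3"}
buffers = BufferManager(disk.__getitem__, capacity=2)
for page_id in (1, 2, 1, 3, 2):
    buffers.get_page(page_id)
```

With room for two pages, the second request for page 1 is served from memory, while page 2 has been evicted by the time it is requested again and must be read from disk a second time.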
The Query Processor
The Query Processor simplifies and facilitates access to data. The
Query processor includes the following components:
DDL Interpreter
DML Compiler
Query Evaluation Engine
The DDL interpreter interprets DDL statements and records the
definition in the data dictionary. The DML compiler translates DML
Relational Algebra
Relational algebra is a procedural query language, which takes
instances of relations as input and yields instances of relations as
output. It uses operators to perform queries. An operator can be
either unary or binary. They accept relations as their input and yield
relations as their output. Relational algebra is performed recursively
on a relation and intermediate results are also considered relations.
The fundamental operations of relational algebra are as follows
Select
Project
Union
Set difference
Cartesian product
Rename
Select Operation ($\sigma$)
It selects tuples that satisfy the given predicate from a relation.
Notation: $\sigma_{p}(r)$
Where $\sigma$ stands for the selection predicate and r stands for the
relation. p is a propositional logic formula which may use connectors
like and, or, and not. These terms may use relational operators like
$=$, $\neq$, $\geq$, $<$, $>$, $\leq$.
For example
$\sigma_{subject = \text{"database"}}(Books)$
Project Operation ($\Pi$)
It projects the listed column(s) of a relation.
Notation: $\Pi_{A_1, A_2, \ldots, A_n}(r)$
Where $A_1, A_2, \ldots, A_n$ are attribute names of relation r.
Duplicate rows are automatically eliminated, as a relation is a set.
For example
$\Pi_{subject, author}(Books)$
Selects and projects the columns named subject and author from the
relation Books.
Union Operation ($\cup$)
It performs binary union between two given relations and is defined
as
$r \cup s = \{\, t \mid t \in r \text{ or } t \in s \,\}$
Notation: $r \cup s$
Where r and s are either database relations or relation result sets
(temporary relations).
For a union operation to be valid, the following conditions must hold:
r and s must have the same number of attributes.
Attribute domains must be compatible.
For example
$\Pi_{author}(Books) \cup \Pi_{author}(Articles)$
Output: Projects the names of the authors who have either written
a book or an article or both.
Set Difference ($-$)
The result of the set difference operation is the tuples which are
present in one relation but not in the second relation.
Notation: $r - s$
Finds all the tuples that are present in r but not in s.
$\Pi_{author}(Books) - \Pi_{author}(Articles)$
Output: Provides the names of authors who have written books but
not articles.
Cartesian Product ($\times$)
Combines information of two different relations into one.
Notation: $r \times s$
Where r and s are relations and their output will be defined as
$r \times s = \{\, q\,t \mid q \in r \text{ and } t \in s \,\}$
$\sigma_{author = \text{'tutorialspoint'}}(Books \times Articles)$
Output: Yields a relation which shows all the books and articles
written by tutorialspoint.
Rename Operation ($\rho$)
The results of relational algebra are also relations but without any
name. The rename operation allows us to rename the output relation.
The rename operation is denoted by the small Greek letter rho ($\rho$).
Notation: $\rho_{x}(E)$
Where the result of expression E is saved under the name x.
Additional operations are:
Set intersection
Assignment
Natural join
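The fundamental operations can be tried out with a few lines of Python. The sketch below is one possible encoding, not a standard library: a relation is modelled as a set of rows, each row a frozenset of (attribute, value) pairs, so duplicates disappear automatically. The function names, the qualified attribute names produced by the Cartesian product, and the sample Books and Articles data are all assumptions made for illustration.

```python
def relation(*rows):
    """Build a relation: a set of rows, each a frozenset of (attribute, value) pairs."""
    return {frozenset(row.items()) for row in rows}


def select(predicate, r):
    """Select: keep the rows for which the predicate holds."""
    return {row for row in r if predicate(dict(row))}


def project(attributes, r):
    """Project: keep only the named attributes; duplicates vanish because r is a set."""
    return {frozenset((a, v) for a, v in row if a in attributes) for row in r}


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def product(r, s, r_name, s_name):
    """Cartesian product: pair every row of r with every row of s.

    Attribute names are qualified with the relation name to keep them distinct.
    """
    def qualify(row, name):
        return frozenset((f"{name}.{a}", v) for a, v in row)

    return {qualify(q, r_name) | qualify(t, s_name) for q in r for t in s}


def authors(r):
    """Helper for display: the sorted author values of a relation."""
    return sorted(dict(row)["author"] for row in r)


# Hypothetical sample data.
books = relation(
    {"subject": "database", "author": "Alice"},
    {"subject": "networks", "author": "Bob"},
)
articles = relation(
    {"subject": "database", "author": "Bob"},
    {"subject": "compilers", "author": "Carol"},
)

database_books = select(lambda t: t["subject"] == "database", books)
book_authors = project({"author"}, books)
article_authors = project({"author"}, articles)
either = union(book_authors, article_authors)            # books or articles or both
books_only = difference(book_authors, article_authors)   # books but not articles
pairs = product(books, articles, "Books", "Articles")    # every book paired with every article
```

Every function takes relations and returns a relation, which is why intermediate results such as book_authors can be fed straight into union or difference.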
Relational Calculus
In contrast to relational algebra, relational calculus is a
non-procedural query language, that is, it tells what to do but never
explains how to do it.
Relational calculus exists in two forms: Tuple Relational Calculus
(TRC) and Domain Relational Calculus (DRC).
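In tuple relational calculus (TRC), for example
$\{\, T.name \mid Author(T) \text{ AND } T.article = \text{'database'} \,\}$
Output: Returns tuples with 'name' from Author who has written an
article on 'database'.
TRC can be quantified. We can use existential ($\exists$) and
universal ($\forall$) quantifiers.
For example
$\{\, R \mid \exists T \in Authors\,(T.article = \text{'database'} \text{ AND } R.name = T.name) \,\}$
Output: The above query will yield the same result as the previous
one.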
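In domain relational calculus (DRC), a query has the form
$\{\, \langle a_1, a_2, \ldots, a_n \rangle \mid P(a_1, a_2, \ldots, a_n) \,\}$
Where $a_1, a_2, \ldots, a_n$ are attributes and P stands for formulae
built from these attributes.
For example
$\{\, \langle article, page, subject \rangle \mid \langle article, page, subject \rangle \in TutorialsPoint \land subject = \text{'database'} \,\}$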
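A calculus query states a condition on the result rather than a sequence of operations, and a Python set comprehension reads in much the same way. The sketch below mirrors the quantified TRC query on Authors given above; the contents of the relation are hypothetical.

```python
# Hypothetical contents of the Authors relation; each dict is one tuple T.
authors = [
    {"name": "Alice", "article": "database"},
    {"name": "Bob", "article": "networks"},
    {"name": "Carol", "article": "database"},
]

# "The names R.name for which some tuple T in Authors has
#  T.article = 'database' and R.name = T.name."
result = {t["name"] for t in authors if t["article"] == "database"}
```

The comprehension names the tuples wanted and leaves the order of evaluation unspecified, whereas the algebra version would spell out a select followed by a project.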
The DBMS creates and manages the complex structures required for
data storage, thus relieving you from the difficult task of defining and
programming the physical data characteristics.
A modern DBMS provides storage not only for the data, but also for
related data entry forms or screen definitions, report definitions, data ...
The DBMS provides backup and data recovery to ensure data safety
and integrity. Current DBMSs provide special utilities that allow
the DBA to perform routine and special backup and restore
procedures. Recovery management deals with the recovery of the
database after a failure, such as a bad sector in the disk or a power
failure. Such capability is critical to preserving the database's integrity.
The DBMS promotes and enforces integrity rules, thus minimizing data
redundancy and maximizing data consistency. The data relationships
stored in the data dictionary are used to enforce data integrity.
Ensuring data integrity is especially important in transaction-oriented
database systems.
SQL VIEW:
A view is nothing more than a SQL statement that is stored in the
database with an associated name. A view is actually a composition of
a table in the form of a predefined SQL query.
A view can contain all rows of a table or selected rows from a table. A
view can be created from one or many tables, depending on the
SQL query written to create the view.
Views, which are a kind of virtual table, allow users to do the
following:
Structure data in a way that users or classes of users find natural or intuitive.
Restrict access to the data such that a user can see and (sometimes) modify
exactly what they need and no more.
Summarize data from various tables, which can be used to generate reports.
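A minimal sketch of a view, using SQLite from Python, is given below. The Customers table, its columns and the sample rows are hypothetical; the point is only that the view stores a query under a name and can expose fewer columns than the underlying table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT, salary REAL);
    INSERT INTO Customers VALUES (1, 'Asha', 'Pune', 2000.0);
    INSERT INTO Customers VALUES (2, 'Ravi', 'Delhi', 1500.0);

    -- The view stores the query, not a copy of the rows,
    -- and hides the city and salary columns.
    CREATE VIEW Customer_Names AS
        SELECT id, name FROM Customers;
    """
)

before = conn.execute("SELECT * FROM Customer_Names ORDER BY id").fetchall()

# A later change to the base table is visible through the view.
conn.execute("INSERT INTO Customers VALUES (3, 'Meera', 'Chennai', 1800.0)")
after = conn.execute("SELECT * FROM Customer_Names ORDER BY id").fetchall()
```

A user who is only given access to Customer_Names can list customers but cannot read salaries, which is the "restrict access" use mentioned above.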
Integrity constraints:
FUNCTIONAL DEPENDENCY:
Functional dependency is a relationship that exists when one attribute
uniquely determines another attribute.
Employee ID   Employee Name   Department ID   Department Name
0001          John Doe                        Human Resources
0002          Jane Doe                        Marketing
0003          John Smith                      Human Resources
0004          Jane Goodall                    Sales
This case represents an example where multiple functional dependencies are embedded in a single
representation of data. Note that because an employee can only be a member of one department, the
unique ID of that employee determines the department.
Employee ID $\rightarrow$ Department ID
In addition to this relationship, the table also has a functional dependency through a non-key attribute:
Department ID $\rightarrow$ Department Name
This example demonstrates that even though there exists an FD Employee ID $\rightarrow$ Department ID, the
employee ID would not be a logical key for determination of the department ID. The process of
normalization of the data would recognize all FDs and allow the designer to construct tables and
relationships that are more logical based on the data.
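Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below uses the employee rows from the table above (without the Department ID column, whose values are not shown); the helper name holds is made up for this illustration.

```python
def holds(rows, determinant, dependent):
    """Return True if each value of `determinant` maps to exactly one value of `dependent`."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in determinant)
        value = tuple(row[a] for a in dependent)
        if seen.setdefault(key, value) != value:
            return False  # same determinant value, different dependent value
    return True


employees = [
    {"Employee ID": "0001", "Employee Name": "John Doe", "Department Name": "Human Resources"},
    {"Employee ID": "0002", "Employee Name": "Jane Doe", "Department Name": "Marketing"},
    {"Employee ID": "0003", "Employee Name": "John Smith", "Department Name": "Human Resources"},
    {"Employee ID": "0004", "Employee Name": "Jane Goodall", "Department Name": "Sales"},
]

# Each employee ID appears with exactly one department.
id_determines_department = holds(employees, ["Employee ID"], ["Department Name"])
# Human Resources appears with two different employee IDs, so the reverse fails.
department_determines_id = holds(employees, ["Department Name"], ["Employee ID"])
```

Note that such a check can only refute a dependency on sample data. Whether an FD really holds is a statement about the rules of the organization, for example that an employee belongs to only one department.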
CHARACTERISTICS OF SQL:
High Speed:
SQL queries can be used to retrieve large amounts of
records from a database quickly and efficiently.
... standard, Non-SQL ... standard.
No Coding Required:
Using standard SQL it is easier to manage database
systems without having to write a substantial amount of code.
Emergence of ORDBMS:
Previously SQL databases were synonymous with relational
databases. With the emergence of Object Oriented DBMS, object
storage capabilities are extended to relational databases.
CONSTRAINTS:
Constraints are the rules enforced on the data columns of a table. They
are used to limit the type of data that can go into a table. This
ensures the accuracy and reliability of the data in the database.
Constraints can be column level or table level. Column-level
constraints apply to only one column, whereas table-level
constraints apply to the whole table.
The following are commonly used constraints available in SQL. These
constraints have already been discussed in the SQL - RDBMS
Concepts chapter, but it is worth revising them at this point.
NOT NULL Constraint: Ensures that a column cannot have a NULL value.
CHECK Constraint: Ensures that all values in a column
satisfy certain conditions.
INDEX: Used to create and retrieve data from the database very quickly.
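The three items listed above can be sketched with SQLite from Python as follows. The Customers table, the age limit of 18 and the index name are hypothetical choices made only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL,               -- NOT NULL: a value is required
        age  INTEGER CHECK (age >= 18)    -- CHECK: every value must satisfy the condition
    );
    -- INDEX: lets the database find rows by name without scanning the whole table.
    CREATE INDEX idx_customers_name ON Customers (name);
    """
)


def accepted(name, age):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Customers (name, age) VALUES (?, ?)", (name, age))
        return True
    except sqlite3.IntegrityError:
        return False


results = [
    accepted("Asha", 30),  # satisfies both constraints
    accepted(None, 30),    # violates NOT NULL
    accepted("Ravi", 12),  # violates CHECK
]
```

Both constraints here are column-level. A table-level constraint would be written as a separate clause in the CREATE TABLE statement and could refer to several columns at once.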
Example: foreign key (FK) constraints have several benefits:
You can get nice "on delete cascade" behavior, automatically cleaning up tables.
Knowing about the relationships between tables in the database helps the optimizer plan your queries for the most efficient execution, since it is able to get better estimates on join cardinality.
FKs give a pretty big hint on what statistics are most important to collect on the database, which in turn leads to better performance.
They enable all kinds of auto-generated support: ORMs can generate themselves, visualization tools will be able to create nice schema layouts for you, etc.
Someone new to the project will get into the flow of things faster, since otherwise implicit relationships are explicitly documented.
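The "on delete cascade" behavior mentioned above can be seen in a short SQLite sketch. The Department and Employee tables and their rows are hypothetical. One SQLite-specific detail matters: foreign keys are only enforced after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.executescript(
    """
    CREATE TABLE Department (
        dept_id INTEGER PRIMARY KEY,
        name    TEXT NOT NULL
    );
    CREATE TABLE Employee (
        emp_id  INTEGER PRIMARY KEY,
        name    TEXT NOT NULL,
        dept_id INTEGER NOT NULL
                REFERENCES Department (dept_id) ON DELETE CASCADE
    );
    INSERT INTO Department VALUES (1, 'Sales');
    INSERT INTO Employee VALUES (10, 'Asha', 1);
    """
)


def insert_employee(emp_id, name, dept_id):
    """Return True if the row is stored, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Employee VALUES (?, ?, ?)", (emp_id, name, dept_id))
        return True
    except sqlite3.IntegrityError:
        return False


# Department 99 does not exist, so the foreign key rejects this row.
orphan_accepted = insert_employee(11, "Ravi", 99)

# Deleting the department automatically removes its employees.
conn.execute("DELETE FROM Department WHERE dept_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM Employee").fetchone()[0]
```

The REFERENCES clause also documents the relationship between the two tables in the schema itself, which is what tools and newcomers to a project rely on.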
OBJECT ORIENTATION:
EXAMPLE:
For example, object-oriented programming (OOP) refers to a special
type of programming that combines data structures with functions to
create reusable objects. Object-oriented graphics is the same
as vector graphics.
NESTED RELATIONS:
Object-relational data models extend the relational data model by providing a
richer type system including complex data types and object orientation.
The nested relational model is an extension of the relational model in which
domains may be either atomic or relation valued. Thus the value of a tuple
on an attribute may be a relation, and relations may be contained
within relations. A complex object thus can be represented by a single tuple of
a nested relation.
Example: For nested relation: ... atomic.
Non-1NF books relation, books.
Books a, 1NF version of the non-1NF relation books.
The redundancy in the Books a relation can be removed if we assume that the following
multivalued dependencies hold:
1. title $\twoheadrightarrow$ author
2. title $\twoheadrightarrow$ keyword
3. title $\rightarrow$ pub-name, pub-branch
Thus, we can decompose the relation into 4NF using the schemas:
1. Authors (title, author)
2. Keywords (title, keyword)
3. Books4 (title, pub-name, pub-branch)
Although the book database can be adequately expressed without using nested relations, the
use of nested relations leads to an easier-to-understand model. The 4NF design
would require users to include joins in their queries, thereby complicating
interaction with the system.
The non-nested relation (e.g. the Books a relation) eliminates the need for users to
write joins in their queries, but it loses the one-to-one correspondence between
tuples and books.
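The three designs discussed above (nested, flat 1NF, and 4NF) can be compared on a single book. In the sketch below the book's title, authors, keywords and publisher are hypothetical sample values; the attribute names follow the schemas in the text.

```python
from itertools import product

# Nested (non-1NF) relation: one tuple per book, with set-valued attributes.
books = [
    {
        "title": "Compilers",
        "authors": {"Smith", "Jones"},
        "keywords": {"parsing", "analysis"},
        "pub_name": "Example Press",
        "pub_branch": "New York",
    },
]

# 1NF version: one flat row for every (author, keyword) combination.
flat_books = [
    (b["title"], author, b["pub_name"], b["pub_branch"], keyword)
    for b in books
    for author, keyword in product(sorted(b["authors"]), sorted(b["keywords"]))
]

# 4NF decomposition, following the three schemas in the text.
authors = {(title, author) for title, author, _, _, _ in flat_books}
keywords = {(title, keyword) for title, _, _, _, keyword in flat_books}
books4 = {(title, name, branch) for title, _, name, branch, _ in flat_books}

# Getting the flat rows back from the 4NF relations requires joining on title.
rejoined = {
    (t, author, name, branch, keyword)
    for (t, author) in authors
    for (t2, keyword) in keywords
    for (t3, name, branch) in books4
    if t == t2 == t3
}
```

One book is a single tuple in the nested relation but four rows in the flat relation, which is the lost one-to-one correspondence. The 4NF relations avoid that repetition but need the join shown at the end.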
Shared disk architecture
Where each node has its own main memory, but all nodes share mass storage, usually
a storage area network. In practice, each node usually also has multiple processors.
Shared nothing architecture
Where each node has its own mass storage as well as main memory.
The other architecture group is called hybrid architecture, which includes:
Non-Uniform Memory Architecture (NUMA), which involves non-uniform memory access.
Cluster (shared nothing + shared disk: SAN/NAS), which is formed by a group of connected
computers.