Dbms Data Base Management System Is A: Features of Database
Features of Database:
Types of Database
A 3-tier architecture separates its tiers from each other based on the
complexity of the users and how they use the data present in the
database. It is the most widely used architecture to design a DBMS.
Physical Level: It is the lowest level of abstraction and describes how
the data are actually stored, along with the complex low-level data
structures, in detail.
Logical Level: It is the next higher level of abstraction and
describes what data are stored and what relationships exist among
those data. At the logical level, each record is described by a
type definition and the interrelationship of these record types is
defined as well. Database administrators usually work at this level
of abstraction.
View Level: It is the highest level of abstraction and describes only
part of the entire database and hides the details of the logical level.
Schema:
Data model:
A data model is a plan for building a database. Data models define how
data items are connected to each other and how they are processed and
stored inside the system.
Two widely used data models are:
Object-based logical model
Record based logical model
Entity :
Attributes
Keys
ER Modeling:
Notations/Shapes in ER Modeling:
The overall logical structure of a database can be expressed
graphically by an E-R diagram. The diagram consists of the following
major components.
Rectangles: represent entity sets.
Ellipses: represent attributes.
Relational Algebra:
Integrity Constraints:
Decomposition
The join of the sub-relations should not create any additional (spurious)
tuples, i.e., the join R1 ⋈ R2 should not contain more tuples than R.
R ⊂ R1 ⋈ R2 (Lossy)
R = R1 ⋈ R2 (Lossless)
Dependency Preservation: The decomposition must not lose any
functional dependency.
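The lossless-join condition can be tried out on a small instance. The sketch below is only an illustration: the relation R(A, B, C), its rows, and the helper names project, natural_join and is_lossless are all assumptions, not part of the notes. It tests one particular instance, which is weaker than proving the property for the schema.

```python
def project(rows, attrs):
    """Project a relation (list of dicts) onto attrs, dropping duplicates."""
    return {tuple((a, r[a]) for a in attrs) for r in rows}


def natural_join(r1, r2):
    """Natural join of two projections on the attributes they share."""
    out = set()
    for t1 in r1:
        d1 = dict(t1)
        for t2 in r2:
            d2 = dict(t2)
            common = set(d1) & set(d2)
            if all(d1[a] == d2[a] for a in common):
                merged = {**d1, **d2}
                out.add(tuple(sorted(merged.items())))
    return out


def is_lossless(rows, attrs1, attrs2):
    """True if joining the two projections gives back exactly the original rows."""
    original = {tuple(sorted(r.items())) for r in rows}
    joined = natural_join(project(rows, attrs1), project(rows, attrs2))
    return joined == original


# Hypothetical instance of R(A, B, C)
rows = [
    {"A": 1, "B": "x", "C": "p"},
    {"A": 2, "B": "x", "C": "q"},
]

lossless = is_lossless(rows, ["A", "B"], ["A", "C"])  # join on A: no extra tuples
lossy = is_lossless(rows, ["A", "B"], ["B", "C"])     # join on B: spurious tuples
```

Splitting on B produces four tuples after the join instead of two, which is the lossy case R ⊂ R1 ⋈ R2.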
Functional Dependency (FD): A dependency between attributes is
known as a functional dependency. Let R be a relational schema, X and
Y be non-empty sets of attributes, and t1, t2, ..., tn be the tuples of
relation R. X → Y {values for X functionally determine values for Y}:
whenever two tuples agree on X, they must also agree on Y.
Trivial Functional Dependency: If Y ⊆ X, then X → Y is a trivial FD.
Case of semi-trivial FD
S_id → S_id S_name (semi-trivial)
Because on decomposition, we will get
S_id → S_id (trivial FD) and
S_id → S_name (non-trivial FD)
Properties of Functional Dependency (FD)
Reflexivity: If Y ⊆ X, then X → Y (trivial)
Transitivity: If X → Y and Y → Z, then X → Z
Augmentation: If X → Y, then XZ → YZ
Splitting or Decomposition: If X → YZ, then X → Y and X → Z
Union: If X → Y and X → Z, then X → YZ
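A common way to apply these properties mechanically is to compute the closure of an attribute set: X → Y follows from a set of FDs exactly when Y is contained in the closure of X. The sketch below assumes FDs are given as (left side, right side) pairs of Python sets; the function names closure and implies and the sample FDs are illustrative only.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already have the whole left side, we gain the right side.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, x, y):
    """True if X -> Y follows from fds."""
    return set(y) <= closure(x, fds)


# Hypothetical FDs: A -> B and B -> C
fds = [({"A"}, {"B"}), ({"B"}, {"C"})]

a_plus = closure({"A"}, fds)                          # {'A', 'B', 'C'}
transitive = implies(fds, {"A"}, {"C"})               # transitivity: A -> C
augmented = implies(fds, {"A", "Z"}, {"B", "Z"})      # augmentation: AZ -> BZ
not_implied = implies(fds, {"C"}, {"A"})              # C -> A does not follow
```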
Normal Forms/Normalization:
If either of the above two conditions fails, then Y → A will also become a
fully functional dependency.
Full Functional Dependency: A functional dependency P → Q is said to
be a full functional dependency if removal of any attribute S from P
means that the dependency no longer holds.
(Student_Name, College_Name) → College_Address
Suppose the above functional dependency is a full functional
dependency; then we must ensure that there are no FDs as below.
Student_Name → College_Address
or College_Name → College_Address
Third Normal Form (3NF): Let R be a relational schema. R is in 3NF if,
for every non-trivial FD X → Y over R, X is a candidate key or super
key, or Y is a prime attribute.
At least one of the above conditions should be true; both may be
true.
R should not contain any transitive dependency of a non-prime attribute
on a key.
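The 3NF condition can be checked FD by FD once the candidate keys are known. The sketch below assumes the keys are supplied by hand rather than computed. The schema, the two FDs and the function name third_nf_violations are hypothetical, chosen only to show one FD that passes and one that fails.

```python
def third_nf_violations(fds, candidate_keys):
    """Return the FDs X -> Y that break 3NF, given the candidate keys."""
    prime = set().union(*candidate_keys)   # attributes that occur in some key
    bad = []
    for lhs, rhs in fds:
        if rhs <= lhs:
            continue                       # trivial FD, always allowed
        if any(key <= lhs for key in candidate_keys):
            continue                       # X is a candidate key or super key
        if (rhs - lhs) <= prime:
            continue                       # Y consists of prime attributes
        bad.append((lhs, rhs))
    return bad


# Hypothetical relation with one candidate key
keys = [{"Student_Name", "College_Name"}]
fds = [
    ({"Student_Name", "College_Name"}, {"College_Address"}),  # X is a key: fine
    ({"College_Name"}, {"College_Address"}),                  # violates 3NF
]

violations = third_nf_violations(fds, keys)
```

Here College_Name alone is not a key and College_Address is not a prime attribute, so the second FD is reported.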
SQL:
Structured Query Language (SQL) is a language that provides an
interface to relational database systems. SQL was developed by IBM in
the 1970s for use in System R and is a de facto standard, as well as an
ISO and ANSI standard.
To deal with the above database objects, we need a programming
language and that programming language is known as SQL.
Type of SQL Statement              SQL Keyword    Function
Data manipulation language (DML)   SELECT         Used to enter, modify,
                                   INSERT INTO    delete and retrieve
                                   UPDATE         data from a table
                                   DELETE FROM
DML Commands
SELECT A1, A2, A3, ..., An    -- what to return
FROM R1, R2, R3, ..., Rm      -- relations or tables
WHERE condition               -- filter condition, i.e., on what basis we
                              -- want to restrict the outcome/result
If we want to write the above SQL script in the form of relational calculus,
we use the following syntax
Comparison operators which we can use in the filter condition are (=, >, <,
>=, <=, <>); <> means not equal to.
INSERT Statement: Used to add row(s) to the tables in a database.
INSERT INTO Employee (F_Name, L_Name) VALUES ('Atal', 'Bihari');
UPDATE Statement: It is used to modify or change existing data
in a single row, a group of rows or all the rows in a table.
Example:
-- Updates some rows in a table.
UPDATE Employee
SET City = 'LUCKNOW'
WHERE Emp_Id BETWEEN 9 AND 15;
-- Updates the City column for all the rows.
UPDATE Employee SET City = 'LUCKNOW';
DELETE Statement: This is used to delete rows from a table.
Example:
-- The following query will delete only the row(s) with Emp_Id = 7.
DELETE FROM Employee
WHERE Emp_Id = 7;
-- The following query will delete all the rows from the Employee table.
DELETE FROM Employee;
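The INSERT, UPDATE and DELETE statements above can be tried against an in-memory SQLite database from Python. This is a minimal sketch: the column list of Employee and the sample rows are assumptions made for the demonstration, and only the table name and the statements themselves come from the notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed schema: the notes never give the full definition of Employee.
conn.execute(
    "CREATE TABLE Employee (Emp_Id INTEGER PRIMARY KEY, F_Name TEXT, City TEXT)"
)

# INSERT: add rows (sample data is made up)
conn.executemany(
    "INSERT INTO Employee (Emp_Id, F_Name, City) VALUES (?, ?, ?)",
    [(7, "Atal", "DELHI"), (9, "Asha", "KANPUR"),
     (12, "Ravi", "AGRA"), (20, "Meena", "PUNE")],
)

# UPDATE: change only the rows selected by the WHERE clause
cur = conn.execute(
    "UPDATE Employee SET City = 'LUCKNOW' WHERE Emp_Id BETWEEN 9 AND 15"
)
updated = cur.rowcount

# DELETE: remove only the row with Emp_Id = 7
cur = conn.execute("DELETE FROM Employee WHERE Emp_Id = 7")
deleted = cur.rowcount

remaining = conn.execute(
    "SELECT Emp_Id, City FROM Employee ORDER BY Emp_Id"
).fetchall()
conn.close()
```

Leaving out the WHERE clause in either statement would affect every row of the table, which is why the filtered and unfiltered forms are shown separately in the notes.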
ORDER BY Clause: This clause is used to sort the result of a query in a
specific order (ascending or descending); by default, the sorting order is
ascending.
SELECT Emp_Id, Emp_Name, City FROM Employee
WHERE City = 'LUCKNOW'
ORDER BY Emp_Id DESC;
GROUP BY Clause: It is used to divide the result set into groups.
Grouping can be done by a column name or by the results of computed
columns when using numeric data types.
The HAVING clause can be used to set conditions for the
GROUP BY clause.
The HAVING clause is similar to the WHERE clause, but HAVING puts
conditions on groups.
The WHERE clause places conditions on rows.
The WHERE clause cannot include aggregate functions, while HAVING
conditions can.
Example:
SELECT Emp_Id, AVG (Salary)
FROM Employee
GROUP BY Emp_Id
HAVING AVG (Salary) > 25000;
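The difference between WHERE (filters rows before grouping) and HAVING (filters groups after aggregation) is easier to see when both appear in one query. The sketch below runs such a query on an in-memory SQLite database. It groups by City rather than Emp_Id so that groups hold more than one row, and the table contents are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed columns and made-up rows, for illustration only.
conn.execute("CREATE TABLE Employee (Emp_Id INTEGER, City TEXT, Salary REAL)")
conn.executemany(
    "INSERT INTO Employee VALUES (?, ?, ?)",
    [
        (1, "LUCKNOW", 30000),
        (2, "LUCKNOW", 40000),
        (3, "KANPUR", 20000),
        (4, "KANPUR", 24000),
        (5, "AGRA", 50000),
    ],
)

rows = conn.execute(
    """
    SELECT City, AVG(Salary)
    FROM Employee
    WHERE Salary > 22000          -- row-level filter, applied before grouping
    GROUP BY City
    HAVING AVG(Salary) > 25000    -- group-level filter, applied after grouping
    ORDER BY City
    """
).fetchall()
conn.close()
```

The WHERE clause first drops the 20000 row; KANPUR then has an average of 24000 and is removed by HAVING, leaving AGRA (50000) and LUCKNOW (35000).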
Aggregate Functions
Joins: Joins are needed to retrieve data from related rows of two tables on
the basis of some condition that involves both tables. The mandatory
condition for a join is that at least one set of column(s) should take
values from the same domain in each table.
Inner Join: Inner join is the most common join operation used in
applications and can be regarded as the default join-type. Inner join
creates a new result table by combining column values of two tables (A
and B) based upon the join-predicate. These may be further divided into
three parts.
1. Equi Join (satisfies equality condition)
2. Non-Equi Join (satisfies non-equality condition)
3. Self Join (one or more column assumes the same domain of
values).
Outer Join: An outer join does not require each record in the two joined
tables to have a matching record. The joined table retains each record,
even if no other matching record exists.
It also considers the rows from the table(s) that don't satisfy the
joining condition.
(i) Right outer join (ii) Left outer join (iii) Full outer join
Left Outer Join: The result of a left outer join for table A and B always
contains all records of the left table (A), even if the join condition does
not find any matching record in the right table (B).
Result set of T1 and T2
Right Outer Join: A right outer join closely resembles a left outer join,
except with the treatment of the tables reversed. Every row from the right
table will appear in the joined table at least once. If no matching row
exists in the left table, NULL will appear in the columns taken from the
left table.
Cross Join (Cartesian product): Cross join returns the Cartesian product
of rows from the tables in the join. It will produce rows which combine each
row from the first table with each row from the second table.
SELECT * FROM T1, T2;
Number of rows in result set = (Number of rows in table 1 × Number of
rows in table 2)
Result set of T1 and T2 (Using previous tables T1 and T2)
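The join types described above can be compared side by side on two tiny tables. Since the original T1 and T2 are not reproduced in these notes, the tables below (columns id, a, b and their rows) are hypothetical. The right outer join is written as a left outer join with the tables swapped, because older SQLite versions do not accept RIGHT JOIN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (id INTEGER, a TEXT)")
conn.execute("CREATE TABLE T2 (id INTEGER, b TEXT)")
conn.executemany("INSERT INTO T1 VALUES (?, ?)", [(1, "a1"), (2, "a2"), (3, "a3")])
conn.executemany("INSERT INTO T2 VALUES (?, ?)", [(2, "b2"), (3, "b3"), (4, "b4")])

# Inner (equi) join: only rows that match in both tables
inner = conn.execute(
    "SELECT T1.id, T1.a, T2.b FROM T1 INNER JOIN T2 ON T1.id = T2.id "
    "ORDER BY T1.id"
).fetchall()

# Left outer join: every row of T1, NULL (None) where T2 has no match
left = conn.execute(
    "SELECT T1.id, T1.a, T2.b FROM T1 LEFT OUTER JOIN T2 ON T1.id = T2.id "
    "ORDER BY T1.id"
).fetchall()

# Right outer join of T1 and T2, expressed as T2 LEFT OUTER JOIN T1
right = conn.execute(
    "SELECT T2.id, T1.a, T2.b FROM T2 LEFT OUTER JOIN T1 ON T1.id = T2.id "
    "ORDER BY T2.id"
).fetchall()

# Cross join: every row of T1 paired with every row of T2
cross = conn.execute("SELECT * FROM T1, T2").fetchall()
conn.close()
```

With three rows in each table, the cross join returns 3 × 3 = 9 rows, while the inner join keeps only the two ids present in both tables.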
Storage Structure:
The storage structure can be divided into two categories:
Volatile storage: As the name suggests, volatile storage cannot
survive system crashes. Volatile storage devices are placed very close to
the CPU; normally they are embedded onto the chipset itself. Main
memory and cache memory are examples of volatile storage. They are
fast but can store only a small amount of information.
Non-volatile storage: These memories are made to survive system
crashes. They are huge in data storage capacity, but slower in
accessibility. Examples may include hard-disks, magnetic tapes, flash
memory, and non-volatile (battery backed up) RAM.
File Organisation:
Index files are typically much smaller than the original file because only
the values for search key and pointer are stored. The most prevalent
types of indexes are based on ordered files (single-level indexes) and
tree data structures (multilevel indexes).
Types of Single Level Ordered Indexes: In an ordered index file, index
entries are stored sorted by the search key value. There are several
types of ordered indexes.
Primary Index: A primary index is an ordered file whose records are of
fixed length with two fields. The first field is of the same data type as the
ordering key field called the primary key of the data file and the second
field is a pointer to a disk block (a block address).
There is one index entry in the index file for each block in the data
file.
Indexes can also be characterised as dense or sparse.
Dense index: A dense index has an index entry for every search
key value in the data file.
Sparse index: A sparse (non-dense) index, on the other hand, has
index entries for only some of the search values.
A primary index is a non-dense (sparse) index, since it includes an
entry for each disk block of the data file rather than for every search
value.
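A lookup through a sparse primary index can be sketched in a few lines. The model below is a simplification made for illustration: the data file is a list of sorted blocks held in memory, the index keeps the first key of each block, and all names and sample records are assumptions.

```python
from bisect import bisect_right

# Hypothetical data file: blocks of (key, record) pairs, sorted on the key
blocks = [
    [(1, "r1"), (3, "r3"), (5, "r5")],
    [(6, "r6"), (7, "r7"), (8, "r8")],
    [(9, "r9"), (12, "r12")],
]

# Sparse primary index: one entry per block (the first key of that block)
index_keys = [block[0][0] for block in blocks]

# Dense index for comparison: one entry per search key value
dense_index = {key: rec for block in blocks for key, rec in block}


def lookup(key):
    """Find a record via the sparse index: pick the block, then scan it."""
    pos = bisect_right(index_keys, key) - 1   # last block whose first key <= key
    if pos < 0:
        return None                           # key is smaller than every entry
    for k, rec in blocks[pos]:
        if k == key:
            return rec
    return None


found = lookup(7)      # index points to the second block, scan finds the record
missing = lookup(4)    # falls in the first block's range but is not stored
```

The sparse index here holds 3 entries against 8 for the dense one, at the cost of a scan inside the selected block.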
B-Trees
When data volume is large and does not fit in memory, an extension
of the binary search tree to disk based environment is the B-tree.
In fact, since the B-tree is always balanced (all leaf nodes appear
at the same level), it is an extension of the balanced binary search
tree.
The problem which the B-tree aims to solve is: given a large
collection of objects, each having a key and a value, design a disk
based index structure which efficiently supports query and update.
A B-tree of order p, when used as an access structure on a key field
to search for records in a data file, can be defined as follows
1. Each internal node in the B-tree is of the form
<P_1, <K_1, Pr_1>, P_2, <K_2, Pr_2>, ..., <K_(q-1), Pr_(q-1)>, P_q>
where q ≤ p.
Each P_i is a tree pointer to another node in the B-tree.
Each Pr_j is a data pointer to the record whose search key field
value is equal to K_j.
2. Within each node, K_1 < K_2 < ... < K_(q-1).
3. Each node has at most p tree pointers.
4. Each node, except the root and leaf nodes, has at least ⌈p/2⌉
tree pointers.
5. A node with q tree pointers, q ≤ p, has q - 1 search key field
values (and hence has q - 1 data pointers).
e.g., A B-tree of order p = 3. The values were inserted in the
order 8, 5, 1, 7, 3, 12, 9, 6.
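The figure for this example is not reproduced here. Assuming the usual insertion procedure (insert into a leaf, split an overflowing node and promote its middle key), the insertions above lead to a root holding 5 and 8 with leaves (1, 3), (6, 7) and (9, 12). The sketch below builds that tree by hand, searches it, and checks rules 2 to 5. The Node class and function names are illustrative, data pointers are omitted, and the check does not verify that all leaves sit at the same level.

```python
import math


class Node:
    def __init__(self, keys, children=None):
        self.keys = keys                  # sorted search key values
        self.children = children or []    # tree pointers; empty for a leaf


def search(node, key):
    """Return True if key is stored somewhere in the subtree under node."""
    while True:
        i = 0
        while i < len(node.keys) and key > node.keys[i]:
            i += 1
        if i < len(node.keys) and node.keys[i] == key:
            return True
        if not node.children:
            return False
        node = node.children[i]           # follow the i-th tree pointer


def satisfies_rules(node, p, is_root=True):
    """Check key ordering and pointer counts for a B-tree of order p."""
    ordered = all(a < b for a, b in zip(node.keys, node.keys[1:]))
    if not node.children:                 # leaf: at most p - 1 keys
        return ordered and len(node.keys) <= p - 1
    q = len(node.children)
    ok = ordered and q <= p and len(node.keys) == q - 1
    if not is_root:
        ok = ok and q >= math.ceil(p / 2)
    return ok and all(satisfies_rules(c, p, False) for c in node.children)


# Assumed result of inserting 8, 5, 1, 7, 3, 12, 9, 6 with order p = 3
root = Node([5, 8], [Node([1, 3]), Node([6, 7]), Node([9, 12])])

has_7 = search(root, 7)
has_10 = search(root, 10)
valid = satisfies_rules(root, 3)
```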
B+ Trees
It is a variation of the B-tree data structure.