DATABASE SYSTEMS

WACHIRA DAVIS

INTRODUCTION

Achievements in database research underpin fundamental advances in communications systems,
transportation and logistics, financial management, knowledge-based systems, accessibility to scientific
literature, and a host of other civilian and defense applications. They also serve as the foundation
for considerable progress in the basic science fields ranging from computing to biology.

The importance of database systems has increased with significant developments in hardware capability,
hardware capacity, and communications, including the emergence of the Internet, electronic commerce,
business intelligence, mobile communications, and grid computing.

The database is now an integral part of our day-to-day life. For instance, when you make a purchase with
your credit card, there is a database somewhere that records information about the purchase and a
database that is used to check the credit card details. A database is a collection of related data, and the
Database Management System (DBMS) is the software that manages and controls access to the database. A
database application is simply a program that interacts with the database at some point in its execution.

The predecessor of the database system is the file-based system. A file-based system is a collection of
application programs that perform services for the end-users, such as the production of reports. Each
program defines and manages its own data.

Limitations of the File-Based Approach

• Separation and isolation of data
• Duplication of data
• Data dependence
• Incompatible file formats
• Fixed queries/proliferation of application programs.

Despite the limitations of file-based systems, there are circumstances in which they may be preferred to
database management systems.

Database management systems may involve unnecessary overhead costs that would not be incurred in
traditional file processing. The overhead costs of using a database management system arise because
of:
• High initial hardware, software and training investment
• The requirement of database management systems to provide generality for defining and
processing data
• The need to provide for security, concurrency control, recovery and integrity functions.

Problems may also arise if the database or its applications are not well designed or implemented.

In view of these costs, it may be more desirable to use a file-based system under the following
circumstances:
• Well-defined applications that are not expected to change
• Requirements for stringent, real-time functions. Such requirements would suffer under a database
management system, whose generality can make it slower.
• There is no multiple-user access that would warrant database system support.

Some applications, such as CAD, therefore use proprietary files and data management software for
efficient processing.

Database Approach

The limitations of the file-based approach can be attributed to two factors:

• the definition of the data is embedded in the application programs, rather than being
stored separately and independently;
• there is no control over the access and manipulation of data beyond that imposed by
the application programs.

A new, more effective approach was required. What emerged were the database and the
Database Management System (DBMS).

Database

A database is a shared collection of logically related data, and a description of this data, designed to
meet the information needs of an organization. The database holds not only the organization's
operational data but also a description of this data. For this reason, a database is also defined as a self-
describing collection of integrated records. The description of the data is known as the system catalog
(or data dictionary or metadata - the 'data about data'). It is the self-describing nature of a database that
provides program-data independence.

The approach taken with database systems, where the definition of data is separated from the
application programs, is similar to the approach taken in modern software development, where an
internal definition of an object and a separate external definition are provided. The users of an object
see only the external definition and are unaware of how the object is defined and how it functions.

One advantage of this approach, known as data abstraction, is that we can change the internal definition
of an object without affecting the users of the object, provided the external definition remains the same.
(Data abstraction refers to the suppression of details of data organization and storage and the
highlighting of the essential features for an improved understanding of data.) In the same way, the
database approach separates the structure of the data from the application programs and stores it in
the database. If new data structures are added or existing structures are modified then the application
programs are unaffected, provided they do not directly depend upon what has been modified. For
example, if we add a new field to a record or create a new file, existing applications are unaffected.
However, if we remove a field from a file that an application program uses, then that application
program is affected by this change and must be modified accordingly.
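
The effect described above can be sketched with Python's built-in sqlite3 module standing in for a DBMS. The table staff, its columns, and the function staff_names are illustrative assumptions, not part of the original notes. The "existing application" depends only on two fields, so adding a field leaves it untouched, while removing a field it uses breaks it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 30000.0)")


def staff_names(db):
    """A hypothetical existing application that uses only staff_no and name."""
    return db.execute("SELECT staff_no, name FROM staff ORDER BY staff_no").fetchall()


before = staff_names(conn)

# Adding a new field: the existing application is unaffected.
conn.execute("ALTER TABLE staff ADD COLUMN branch_no TEXT")
after_add = staff_names(conn)

# Removing a field the application uses: rebuild the table without 'name'.
conn.execute("ALTER TABLE staff RENAME TO staff_old")
conn.execute("CREATE TABLE staff AS SELECT staff_no, salary, branch_no FROM staff_old")
try:
    staff_names(conn)
    broken = False
except sqlite3.OperationalError:
    broken = True  # the query now refers to a column that no longer exists
```

In this sketch the application only has to change in the second case, which matches the rule stated in the text.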

The data in a database is 'logically related': the database represents the entities, the attributes, and
the logical relationships between the entities. When we analyze the information needs of an organization,
we attempt to identify entities, attributes, and relationships. An entity is a distinct object (a person,
place, thing, concept, or event, e.g. a branch or a member of staff) in the organization that is to be
represented in the database. An attribute is a property that describes some aspect of the object that we
wish to record (e.g. branch number, staff number, property number), and a relationship is an association
between entities (e.g. X has Y, G offers H). These can be represented using an Entity-Relationship (ER)
diagram.
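
As a rough illustration, the sketch below maps two entities (branch and staff), a few attributes, and one relationship ("a branch has staff") onto tables, using Python's sqlite3 module. The table names, column names, and sample rows are assumptions made for the example only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the relationship
conn.executescript("""
CREATE TABLE branch (
    branch_no TEXT PRIMARY KEY,
    city      TEXT
);
CREATE TABLE staff (
    staff_no  TEXT PRIMARY KEY,
    name      TEXT,
    branch_no TEXT REFERENCES branch(branch_no)
);
INSERT INTO branch VALUES ('B1', 'Nairobi');
INSERT INTO staff VALUES ('S1', 'Ann', 'B1');
INSERT INTO staff VALUES ('S2', 'Ben', 'B1');
""")

# Follow the relationship: which staff belong to branch B1?
staff_at_b1 = conn.execute(
    "SELECT s.name FROM staff s "
    "JOIN branch b ON s.branch_no = b.branch_no "
    "WHERE b.branch_no = 'B1' ORDER BY s.name"
).fetchall()
```

Here each table plays the role of an entity, each column an attribute, and the branch_no reference in staff records the relationship.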

DBMS

A DBMS is a software system that enables users to define, create, maintain, and control access to the database.
The DBMS is the software that interacts with the users' application programs and the database.
Typically, a DBMS provides the following facilities:

It allows users to define the database, usually through a Data Definition Language (DDL). The DDL allows
users to specify the data types and structures and the constraints on the data to be stored in the
database.
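
A minimal DDL sketch, assuming SQLite (through Python's sqlite3 module) as the DBMS and a hypothetical staff table: the CREATE TABLE statement names the structure, the data types, and two constraints, and the DBMS then rejects data that violates them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL: structure, data types, and constraints in one definition.
conn.execute("""
    CREATE TABLE staff (
        staff_no TEXT PRIMARY KEY,
        name     TEXT NOT NULL,
        salary   REAL CHECK (salary >= 0)
    )
""")

conn.execute("INSERT INTO staff VALUES ('S1', 'Ann', 30000.0)")
try:
    conn.execute("INSERT INTO staff VALUES ('S2', 'Ben', -5.0)")  # negative salary
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the CHECK constraint refused the row

row_count = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```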

It allows users to insert, update, delete, and retrieve data from the database, usually through a Data
Manipulation Language (DML). Having a central repository for all data and data descriptions allows the
DML to provide a general inquiry facility to this data, called a query language. The provision of a query
language alleviates the problems with file-based systems where the user has to work with a fixed set of
queries or there is a proliferation of programs, giving major software management problems. The
most common query language is the Structured Query Language (SQL, pronounced 'S-Q-L', or sometimes
'See-Quel'), which is now both the formal and de-facto standard language for relational DBMSs.
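
The four DML operations can be sketched in SQL as follows, again assuming SQLite via Python's sqlite3 module and an illustrative staff table. The final SELECT is an ad hoc query: changing the threshold or the condition needs no new program, which is the point made above about query languages.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)")

# Insert
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", 30000.0), ("S2", "Ben", 18000.0), ("S3", "Cy", 24000.0)],
)
# Update
conn.execute("UPDATE staff SET salary = salary + 1000 WHERE staff_no = ?", ("S2",))
# Delete
conn.execute("DELETE FROM staff WHERE staff_no = ?", ("S3",))
# Retrieve (query)
well_paid = conn.execute(
    "SELECT name, salary FROM staff WHERE salary > ? ORDER BY name", (20000,)
).fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM staff").fetchone()[0]
```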

It provides controlled access to the database. For example, it may provide:

• a security system, which prevents unauthorized users accessing the database;
• an integrity system, which maintains the consistency of stored data;
• a concurrency control system, which allows shared access of the database;
• a recovery control system, which restores the database to a previous consistent state following
a hardware or software failure;
• a user-accessible catalog, which contains descriptions of the data in the database.
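
Two of these facilities can be sketched with SQLite through Python's sqlite3 module; the staff table and its values are assumptions for illustration. A transaction that would leave the data inconsistent is rolled back to the previous consistent state, and the catalog can be queried like any other data. SQLite exposes its catalog as the sqlite_master table; other DBMSs use different catalog names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, "
    "salary REAL CHECK (salary >= 0))"
)
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", 30000.0), ("S2", "Ben", 18000.0)],
)
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("UPDATE staff SET salary = salary + 500 WHERE staff_no = 'S2'")
        # This update would make a salary negative, so the integrity check fails.
        conn.execute("UPDATE staff SET salary = salary - 50000 WHERE staff_no = 'S1'")
except sqlite3.IntegrityError:
    pass  # both updates have been undone

salaries = conn.execute("SELECT staff_no, salary FROM staff ORDER BY staff_no").fetchall()

# The catalog describes the data stored in the database.
catalog = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```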

Users interact with the database through a number of application programs that are used to create and
maintain the database and to generate information. These programs can be conventional batch
applications or, more typically nowadays, they will be online applications.

Views

A DBMS is an extremely powerful and useful tool. However, as the end-users are not too interested in how
complex or easy a task is for the system, it could be argued that the DBMS has made things more
complex because they now can see more data than they actually need or want. In recognition of this
problem, a DBMS provides another facility known as a view mechanism, which allows each user to have
his or her own view of the database (a view is in essence some subset of the database).

Benefits of views

• Views reduce complexity by letting users see the data in the way they want to see it.
• Views provide a level of security. Views can be set up to exclude data that some users should not
see. For example, we could create a view that allows a branch manager and the Payroll Department
to see all staff data, including salary details, and we could create a second view that other staff
would use that excludes salary details.
• Views provide a mechanism to customize the appearance of the database. For example,
the Contracts Department may wish to call the monthly rent field (rent) by the more
obvious name, Monthly Rent.
• A view can present a consistent, unchanging picture of the structure of the database, even if the
underlying database is changed (for example, fields added or removed, relationships changed, files
split, restructured, or renamed). If fields are added or removed from a file, and these fields are not
required by the view, the view is not affected by this change. Thus, a view helps provide
program-data independence.
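
The salary and Monthly Rent examples above can be sketched as SQL views, assuming SQLite via Python's sqlite3 module. The table names, view names, and sample rows are hypothetical. SQLite has no user accounts, so this sketch shows only what each view exposes; in a multi-user DBMS, access rights would be granted on the views rather than on the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL);
CREATE TABLE property (property_no TEXT PRIMARY KEY, rent REAL);
INSERT INTO staff VALUES ('S1', 'Ann', 30000.0);
INSERT INTO property VALUES ('P1', 450.0);

CREATE VIEW staff_public AS
    SELECT staff_no, name FROM staff;

CREATE VIEW contracts_property AS
    SELECT property_no, rent AS "Monthly Rent" FROM property;
""")

# View for general staff: salary details are excluded.
cur = conn.execute("SELECT * FROM staff_public")
public_columns = [d[0] for d in cur.description]
public_rows = cur.fetchall()

# View for the Contracts Department: rent appears under a friendlier name.
cur = conn.execute("SELECT * FROM contracts_property")
contract_columns = [d[0] for d in cur.description]
```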

The actual level of functionality offered by a DBMS differs from product to product. For example, a
DBMS for a personal computer may not support concurrent shared access, and it may provide only
limited security, integrity, and recovery control. However, modern, large multi-user DBMS products offer
all the above functions and much more. Modern systems are extremely complex pieces of software
consisting of millions of lines of code, with documentation comprising many volumes.

Components of the DBMS Environment

We can identify five major components in the DBMS environment: hardware, software, data,
procedures, and people.

The predecessor to the DBMS was the file-based system. In fact, the file-based system still exists in
specific areas. It has been suggested that the DBMS has its roots in the 1960s Apollo moon-landing
project, which was initiated in response to President Kennedy's objective of landing a man on the moon
by the end of that decade. At that time there was no system available that would be able to handle and
manage the vast amounts of information that the project would generate.

Advantages of DBMSs:
• Control of data redundancy
• Data consistency
• More information from the same amount of data
• Sharing of data
• Data integrity
• Improved security
• Enforcement of standards
• Economy of scale
• Balance of conflicting requirements
• Improved data accessibility and responsiveness
• Increased productivity
• Improved maintenance through data independence
• Increased concurrency
• Improved backup and recovery services.

Database environment and ANSI-SPARC

A major aim of a database system is to provide users with an abstract view of data, hiding certain details
of how data is stored and manipulated. Furthermore, since a database is a shared resource, each user
may require a different view of the data held in the database. To satisfy these needs, the architecture of
most commercial DBMSs available today is based to some extent on the so-called ANSI-SPARC
architecture, named after the American National Standards Institute (ANSI) Standards Planning and
Requirements Committee (SPARC).

In ANSI-SPARC, we identify three levels of abstraction, that is, three distinct levels at which data items
can be described. The levels form a three-level architecture comprising an external, a conceptual, and an
internal level as the diagram below shows.

[Figure: The ANSI-SPARC three-level architecture. Source: Connolly, T. and Begg, C.: Database Systems:
A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).]

The way users perceive the data is called the external level. The way the DBMS and the operating
system perceive the data is the internal level. This is where the data is actually stored using the data
structures and file organizations. The conceptual level provides both the mapping and the desired
independence between the external and internal levels.

The objective of the three-level architecture is to separate each user's view of the database from the
way the database is physically represented. There are several reasons why this separation is desirable:

• Each user should be able to access the same data, but have a different customized view of the data.
Each user should be able to change the way he or she views the data, and this change should not
affect other users.
• Users should not have to deal directly with physical database storage details, such as indexing or
hashing. A user's interaction with the database should be independent of storage considerations.
• The Database Administrator (DBA) should be able to change the database storage structures
without affecting the users' views.
• The internal structure of the database should be unaffected by changes to the physical aspects of
storage, such as the changeover to a new storage device.
• The DBA should be able to change the conceptual structure of the database without
affecting all users.

The differences between the three levels can be illustrated using the following diagram.

[Figure: Differences between the three levels. Source: Connolly, T. and Begg, C.: Database Systems:
A Practical Approach to Design, Implementation and Management (Addison Wesley, N.Y., 2005).]

External level

This is the users' view of the database. This level describes that part of the database that is relevant to
each user.

The external level consists of a number of different external views of the database. Each user has a view
of the 'real world' represented in a form that is familiar to that user. The external view includes only
those entities, attributes, and relationships in the 'real world' that the user is interested in. Other
entities, attributes, or relationships that are not of interest may be represented in the database, but the
user will be unaware of them. In addition, different views may have different representations of the
same data. For example, one user may view dates in the form (day, month, year), while another may
view dates as (year, month, day). Some views might include derived or calculated data: data not actually
stored in the database as such, but created when needed, such as the age of a member of staff. Ages
need not be stored, as this data would have to be updated daily, but can be calculated from the staff
member's date of birth. Views may even include data combined or derived from several entities.
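
A small sketch of derived data and of two representations of the same stored date, written in plain Python. The function age_on and the sample date of birth are assumptions for illustration; a DBMS would typically compute such values inside a view definition.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Derive an age in whole years from a stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not been reached yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


dob = date(1990, 6, 15)  # the only value actually stored

# Two external views of the same stored date.
view_dmy = (dob.day, dob.month, dob.year)
view_ymd = (dob.year, dob.month, dob.day)

age_before_birthday = age_on(dob, date(2024, 6, 14))
age_on_birthday = age_on(dob, date(2024, 6, 15))
```

Because the age is computed on demand, nothing in the stored data has to be updated as time passes.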

Conceptual level

This is the community view of the database. This level describes what data is stored in the database and
the relationships among the data. It is the middle level in the three-level architecture. This level
contains the logical structure of the entire database as seen by the DBA. It is a complete view of the data
requirements of the organization that is independent of any storage considerations.

The conceptual level represents:


• all entities, their attributes, and their relationships;
• the constraints on the data;
• semantic information about the data;
• security and integrity information.

The conceptual level supports each external view, in that any data available to a user must be contained
in or derivable from, the conceptual level. However, this level must not contain any storage-dependent
details. For instance, the description of an entity should contain only data types of attributes (for
example, integer, real, character) and their length (such as the maximum number of digits or
characters), but not any storage considerations, such as the number of bytes occupied.

Internal level

It is the physical representation of the database on the computer. This level describes how the data is
stored in the database. The internal level covers the physical implementation of the database to achieve
optimal runtime performance and storage space utilization. It covers the data structures and file
organizations used to store data on storage devices. It interfaces with the operating system access
methods (file management techniques for storing and retrieving data records) to place the data on the
storage devices, build the indexes, retrieve the data, and so on.

The internal level is concerned with such things as:


• storage space allocation for data and indexes;
• record descriptions for storage (with stored sizes for data items);
• record placement;
• data compression and data encryption techniques.

Physical level

Below the internal level there is a physical level that may be managed by the operating system under
the direction of the DBMS. However, the functions of the DBMS and the operating system at the
physical level are not clear-cut and vary from system to system. Some DBMSs take advantage of many of
the operating system access methods, while others use only the most basic ones and create their own
file organizations. The physical level below the DBMS consists of items only the operating system knows,
such as exactly how the sequencing is implemented and whether the fields of internal records are stored
as contiguous bytes on the disk.

Schemas, Mappings, and Instances

The overall description of the database is called the database schema. There are three different types of
schema in the database and these are defined according to the levels of abstraction of the three-level
architecture.

At the highest level, we have multiple external schemas (also called subschemas) that correspond to
different views of the data. At the conceptual level, we have the conceptual schema, which describes all
the entities, attributes, and relationships together with integrity constraints. At the lowest level of
abstraction we have the internal schema, which is a complete description of the internal model,
containing the definitions of stored records, the methods of representation, the data fields, and the
indexes and storage structures used. There is only one conceptual schema and one internal schema per
database.

The DBMS is responsible for mapping between these three types of schema. It must also check the
schemas for consistency; in other words, the DBMS must check that each external schema is derivable
from the conceptual schema, and it must use the information in the conceptual schema to map between
each external schema and the internal schema. The conceptual schema is related to the internal schema
through a conceptual/internal mapping. This enables the DBMS to find the actual record or combination
of records in physical storage that constitute a logical record in the conceptual schema, together with
any constraints to be enforced on the operations for that logical record. It also allows any differences in
entity names, attribute names, attribute order, data types, and so on, to be resolved. Finally, each
external schema is related to the conceptual schema by the external/conceptual mapping. This enables
the DBMS to map names in the user's view on to the relevant part of the conceptual schema.
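
The two mappings can be pictured as a pair of lookup tables. The sketch below is a deliberately simplified model in plain Python, not how any particular DBMS stores its mappings; every name in it (the dictionaries, the field names, the record type staff_rec) is hypothetical.

```python
# External/conceptual mapping: names in one user's view -> conceptual attributes.
EXTERNAL_TO_CONCEPTUAL = {"Staff Number": "staff_no", "Full Name": "name"}

# Conceptual/internal mapping: conceptual attribute -> (stored record type, position).
CONCEPTUAL_TO_INTERNAL = {
    "staff_no": ("staff_rec", 0),
    "name": ("staff_rec", 1),
    "salary": ("staff_rec", 2),
}

# Stand-in for physical storage: stored records grouped by record type.
STORED_RECORDS = {"staff_rec": [("S1", "Ann", 30000.0)]}


def is_derivable(external: dict, conceptual: dict) -> bool:
    """Consistency check: every external field must map to a conceptual attribute."""
    return all(attribute in conceptual for attribute in external.values())


def read_external(field_name: str, row: int):
    """Resolve a name in the user's view down to the stored value."""
    attribute = EXTERNAL_TO_CONCEPTUAL[field_name]
    record_type, position = CONCEPTUAL_TO_INTERNAL[attribute]
    return STORED_RECORDS[record_type][row][position]


consistent = is_derivable(EXTERNAL_TO_CONCEPTUAL, CONCEPTUAL_TO_INTERNAL)
full_name = read_external("Full Name", 0)
```

Note that the salary attribute exists at the conceptual level but is not part of this external view, so this user cannot reach it.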

Source: Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation
and Management (Addison Wesley, N.Y., 2005).

It is important to distinguish between the description of the database and the database itself. The
description of the database is the database schema. The schema is specified during the database design
process and is not expected to change frequently. However, the actual data in the database may change
frequently; for example, it changes every time we insert details of a new member of staff or a new
property. The data in the database at any particular point in time is called a database instance.
Therefore, many database instances can correspond to the same database schema. The schema is
sometimes called the intension of the database, while an instance is called an extension (or state) of the
database.
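
The distinction can be observed directly, assuming SQLite via Python's sqlite3 module and an illustrative staff table: inserting a row changes the instance but leaves the schema, as recorded in the catalog, exactly as it was.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT)")


def schema(db):
    """The description of the table, as held in SQLite's catalog."""
    return db.execute("SELECT sql FROM sqlite_master WHERE name = 'staff'").fetchone()[0]


def instance(db):
    """The data in the table at this point in time."""
    return db.execute("SELECT * FROM staff ORDER BY staff_no").fetchall()


schema_before, instance_before = schema(conn), instance(conn)
conn.execute("INSERT INTO staff VALUES ('S1', 'Ann')")
schema_after, instance_after = schema(conn), instance(conn)
```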

Data Independence

A major objective for the three-level architecture is to provide data independence, which means that
upper levels are unaffected by changes to lower levels. There are two kinds of data independence:
logical and physical.

Logical data independence refers to the immunity of the external schemas to changes in the conceptual
schema. Changes to the conceptual schema, such as the addition or removal of new entities, attributes,
or relationships, should be possible without having to change existing external schemas or having to
rewrite application programs. Clearly, the users for whom the changes have been made need to be
aware of them, but the other users should not be.

Physical data independence refers to the immunity of the conceptual schema to changes in the internal
schema. Changes to the internal schema, such as using different file organizations or storage structures,
using different storage devices, modifying indexes, or hashing algorithms, should be possible without
having to change the conceptual or external schemas. From the users' point of view, the only effect that
may be noticed is a change in performance. In fact, deterioration in performance is the most common
reason for internal schema changes.
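
Physical data independence can be sketched by changing a storage structure and re-running the same query. The example assumes SQLite via Python's sqlite3 module; the table, the index name idx_staff_salary, and the data are illustrative. Adding the index may change how the rows are found, but not which rows are returned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(f"S{i}", f"name{i}", 1000.0 * i) for i in range(1, 6)],
)

QUERY = "SELECT staff_no FROM staff WHERE salary >= 3000 ORDER BY staff_no"

before = conn.execute(QUERY).fetchall()

# Internal-level change: add an index on salary.
conn.execute("CREATE INDEX idx_staff_salary ON staff (salary)")

after = conn.execute(QUERY).fetchall()
```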

The two-stage mapping in the ANSI-SPARC architecture may be inefficient, but provides greater data
independence. However, for more efficient mapping, the ANSI-SPARC model allows the direct mapping
of external schemas on to the internal schema, thus by-passing the conceptual schema. This, of course,
reduces data independence, so that every time the internal schema changes, the external schema, and
any dependent application programs may also have to change.

Data Models and Conceptual Modeling

A schema is written using a data definition language. In fact, it is written in the data definition language
of a particular DBMS. This type of language is too low level to describe the data requirements of an
organization in a way that is readily understandable by a variety of users. What is required is a higher-
level description of the schema: that is, a data model.

A data model is an integrated collection of concepts for describing and manipulating data, relationships
between data, and constraints on the data in an organization. It should provide the basic concepts and
notations that will allow database designers and end-users unambiguously and accurately to
communicate their understanding of the organizational data.

A data model can be thought of as comprising three components:

• a structural part, consisting of a set of rules according to which databases can be constructed;
• a manipulative part, defining the types of operation that are allowed on the data (this
includes the operations that are used for updating or retrieving data from the database
and for changing the structure of the database);
• possibly a set of integrity constraints, which ensures that the data is accurate.

There have been many data models proposed in the literature. They fall into three broad categories:
object-based, record-based, and physical data models. The first two are used to describe data at the
conceptual and external levels; the last is used to describe data at the internal level.

Object-Based Data Models

Object-oriented and record-based models are representational (or implementation) data models. They
provide concepts that may be understood by end-users but that are not too far removed from the way
data is organized within the computer. Representational data models hide some details of data storage
but can be implemented on a computer system directly. Object-based data models use concepts such as
entities, attributes, and relationships. An entity is a distinct object (a person, place, thing, concept,
event) in the organization that is to be represented in the database. An attribute is a property that
describes some aspect of the object that we wish to record, and a relationship is an association between
entities. Some of the more common types of object-based data model are:

• Entity-Relationship
• Semantic
• Functional
• Object-Oriented.

The Entity-Relationship model has emerged as one of the main techniques for database design. The
object-oriented data model extends the definition of an entity to include not only the attributes that
describe the state of the object but also the actions that are associated with the object, that is, its
behavior. The object is said to encapsulate both state and behavior.
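
Encapsulation of state and behavior can be sketched with an ordinary Python class. The class Staff, its attributes, and the method give_raise are illustrative assumptions; an object-oriented DBMS would store such objects persistently, which this sketch does not attempt.

```python
from dataclasses import dataclass


@dataclass
class Staff:
    # State: the attributes that describe the object.
    staff_no: str
    name: str
    salary: float

    # Behavior: an action associated with the object.
    def give_raise(self, percent: float) -> None:
        """Increase the salary by the given percentage."""
        self.salary += self.salary * percent / 100


ann = Staff("S1", "Ann", 30000.0)
ann.give_raise(10)
```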

Record-Based Data Models

In a record-based model, the database consists of a number of fixed-format records possibly of differing
types. Each record type defines a fixed number of fields, each typically of a fixed length. There are three
principal types of record-based logical data model: the relational data model, the network data model,
and the hierarchical data model. The hierarchical and network data models were developed almost a
decade before the relational data model, so their links to traditional file processing concepts are more
evident.

Record-based (logical) data models are used to specify the overall structure of the database and a
higher-level description of the implementation. Their main drawback lies in the fact that they do not
provide adequate facilities for explicitly specifying constraints on the data, whereas the object-based
data models lack the means of logical structure specification but provide more semantic substance by
allowing the user to specify constraints on the data.

The majority of modern commercial systems are based on the relational paradigm, whereas the early
database systems were based on either the network or hierarchical data models. The latter two models
require the user to have knowledge of the physical database being accessed, whereas the former
provides a substantial amount of data independence. Hence, while relational systems adopt a
declarative approach to database processing (that is, they specify what data is to be retrieved), network
and hierarchical systems adopt a navigational approach (that is, they specify how the data is to be
retrieved).
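
The contrast can be sketched side by side. The navigational half below imitates linked records with plain Python dictionaries, a simplification rather than a real network or hierarchical DBMS; the declarative half uses SQL through Python's sqlite3 module. All names and data are hypothetical.

```python
import sqlite3

# Navigational: the program follows explicit links from record to record.
branches = {"B1": {"first_staff": "S1"}}
staff_records = {
    "S1": {"name": "Ann", "next": "S2"},
    "S2": {"name": "Ben", "next": None},
}


def staff_of_branch_navigational(branch_no):
    names = []
    current = branches[branch_no]["first_staff"]
    while current is not None:  # the program states HOW to reach each record
        record = staff_records[current]
        names.append(record["name"])
        current = record["next"]
    return names


# Declarative: the program states WHAT it wants; the DBMS decides how to get it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE staff (staff_no TEXT PRIMARY KEY, name TEXT, branch_no TEXT)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [("S1", "Ann", "B1"), ("S2", "Ben", "B1")],
)


def staff_of_branch_declarative(branch_no):
    rows = conn.execute(
        "SELECT name FROM staff WHERE branch_no = ? ORDER BY staff_no", (branch_no,)
    )
    return [row[0] for row in rows]


navigational_result = staff_of_branch_navigational("B1")
declarative_result = staff_of_branch_declarative("B1")
```

If the links between records were reorganized, the navigational function would have to be rewritten, whereas the SQL query would stay the same.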

Physical Data Models

Physical data models are low-level and describe how data is stored in the computer, representing
information such as record structures, record orderings, and access paths. There are not as many
physical data models as logical data models, the most common ones being the unifying model and the
frame memory. They are generally meant for computer specialists.

Conceptual Modeling

The conceptual schema is the 'heart' of the database. It supports all the external views and is, in turn,
supported by the internal schema. However, the internal schema is merely the physical implementation
of the conceptual schema. The conceptual schema should be a complete and accurate representation of
the data requirements of the enterprise. If this is not the case, some information about the enterprise
will be missing or incorrectly represented and we will have difficulty fully implementing one or more of
the external views.

Conceptual modeling, or conceptual database design, is the process of constructing a model of the
information use in an enterprise that is independent of implementation details, such as the target
DBMS, application programs, programming languages, or any other physical considerations. It is a
high-level description of the database. This model is called a conceptual data model. Conceptual models are
also referred to as logical models in the literature. However, the conceptual model is independent of all
implementation details, whereas the logical model assumes knowledge of the underlying data model of
the target DBMS. High-level or conceptual data models are based on entities and relationships and they
provide concepts that are close to the way many users perceive data.

Reference

Connolly, T. and Begg, C.: Database Systems: A Practical Approach to Design, Implementation
and Management (Addison Wesley, N.Y., 2005).
