Database

A database, sometimes referred to as an electronic database, is an organized collection of logically related data that is stored efficiently so that it can be easily accessed, managed, and updated.
Let's break this definition into parts to make it easier to understand.
Organized Collection
Data should be arranged in such a way that the user can easily process the data when required.
Logically Related Data

Logically related data means that the data should be relevant in some context.
Example: If we build a database for customers, it may include the customer's name, contact number, etc. All of this information is in the context of the customer. But information such as the customer's number of siblings is out of context and not logically related to the customer database.
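The customer example can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, column names, and sample values are assumptions and do not come from the notes above.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    "customer_id INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "contact_number TEXT)"
)

# Only data that is logically related to the customer is stored.
conn.execute(
    "INSERT INTO customer (name, contact_number) VALUES (?, ?)",
    ("Asha Rao", "555-0101"),
)
conn.commit()

customer_rows = conn.execute("SELECT name, contact_number FROM customer").fetchall()
print(customer_rows)
```

Every column describes the customer, so the data stays organized and logically related.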
DBMS - DataBase Management System

The software used to manage a database is called a Database Management System (DBMS). It provides an interface, or tool, for performing operations such as:
* creating the database
* manipulating the database
* storing and retrieving data from the database
* deleting data from the database, etc.
Changes to the database must follow certain rules, and these rules are defined in the DBMS itself.
* A DBMS can limit what data the end user sees.
* It provides multiple views of the same database, depending on the user's access level.
* It can grant read and write access to the database.
Some popular DBMS software: MySQL, Oracle, SQLite, PostgreSQL, MariaDB, etc.
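The basic operations a DBMS offers (creating, storing, manipulating, deleting, retrieving) can be tried with SQLite through Python's sqlite3 module. This is a minimal sketch; the product table and its values are made up for illustration.

```python
import sqlite3

# Creating the database (kept in memory for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT, price REAL)")

# Storing data.
conn.executemany(
    "INSERT INTO product (name, price) VALUES (?, ?)",
    [("pen", 1.5), ("notebook", 4.0), ("stapler", 9.0)],
)

# Manipulating data.
conn.execute("UPDATE product SET price = ? WHERE name = ?", (2.0, "pen"))

# Deleting data.
conn.execute("DELETE FROM product WHERE name = ?", ("stapler",))
conn.commit()

# Retrieving data.
remaining_products = conn.execute(
    "SELECT name, price FROM product ORDER BY id"
).fetchall()
print(remaining_products)
```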
Characteristics of DBMS
* Real-World Entity: A DBMS uses real-world entities (objects) to design its architecture. Example: A customer database uses the customer as an entity and the customer's phone number as an attribute.
* Relation-based Tables: Using a DBMS, we can form tables based on the relations between various entities. Example: In a university database, we can have students and colleges as entities.
-We can have a relation between student and college, i.e., a student studies in a college.
-Using this, we can form two tables, one for the entity student and another for the entity college.
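The student and college example can be sketched as two tables linked by a key column. The column names and sample rows are assumptions chosen for illustration, and the join at the end reads the "studies in" relation back out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE college (
        college_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        college_id INTEGER REFERENCES college(college_id)
    );
    INSERT INTO college (college_id, name) VALUES (1, 'City College');
    INSERT INTO student (name, college_id) VALUES ('Ravi', 1), ('Meena', 1);
    """
)

# Each student row points at the college the student studies in.
studies_in = conn.execute(
    "SELECT s.name, c.name "
    "FROM student AS s JOIN college AS c ON s.college_id = c.college_id "
    "ORDER BY s.student_id"
).fetchall()
print(studies_in)
```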
* Query Language: A DBMS comes equipped with a query language that allows users to store and retrieve data. We can apply as many filters as required to get specific results.
* Multiple Views: It provides multiple views of the same data, depending on the user.
Example: In a university database, the accountant will have a different view of the data than a student. The accountant may have access to teachers' salaries, but students won't.
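The accountant and student example can be sketched with SQL views. The teacher table, the view names, and the data are assumptions. SQLite has no user accounts, so this only shows the two views; a server DBMS would normally combine such views with per-user permissions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE teacher (
        teacher_id INTEGER PRIMARY KEY,
        name TEXT,
        subject TEXT,
        salary REAL
    );
    INSERT INTO teacher (name, subject, salary) VALUES ('Dr. Iyer', 'Physics', 5200.0);

    -- The accountant's view includes salaries; the student's view does not.
    CREATE VIEW accountant_view AS SELECT name, salary FROM teacher;
    CREATE VIEW student_view AS SELECT name, subject FROM teacher;
    """
)

accountant_rows = conn.execute("SELECT * FROM accountant_view").fetchall()
student_rows = conn.execute("SELECT * FROM student_view").fetchall()
print(accountant_rows, student_rows)
```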
* Multiple Users: A DBMS allows multiple users to access the data at the same time and work on it in parallel.
* ACID Properties: A transaction (a group of tasks) in a DBMS follows the ACID properties:
-Atomicity: A transaction either happens completely or does not happen at all. If any operation is performed on the data, it must be executed completely or not executed at all; it must not stop partway or execute partially.
-Consistency: The database is in a consistent state both before and after the transaction.
-Isolation: Concurrent transactions do not interfere with one another.
-Durability: Once a transaction is committed, its changes persist even if the system fails or other errors occur afterwards.
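Atomicity can be seen in a small money-transfer sketch. The account table, the names, and the transfer function are hypothetical. The example relies on the sqlite3 connection acting as a context manager, which commits if the block succeeds and rolls back if it raises.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, sender, receiver, amount):
    """Move money between accounts as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?",
                (amount, receiver),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit is undone too.
        return False


ok = transfer(conn, "alice", "bob", 30)
failed = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM account"))
print(ok, failed, balances)
```

The second transfer fails on the debit step, so the credit that was already applied to the receiver is rolled back and the balances stay as they were after the first transfer.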
Users in a DBMS
-A DBMS provides an interface for many users to access and retrieve data.
-The type of access depends on the user's software skills.
Types of users in a DBMS, based on their software skills and expertise:
1. Application Programmers: They write software programs for managing the database.
2. Database Administrator (DBA): The person responsible for managing the entire database system.
3. End Users: The people who use DBMS software to perform operations such as retrieving, deleting, and inserting data.
Advantages of DBMS
* Data Abstraction: It shows users only the data that is useful to them and hides the complexity of the data from end users.
* Controlled Data Redundancy: It prevents the database from storing multiple copies of the same data.
* Minimized Data Inconsistency: The DBMS checks that if the value of an object is present in two different files, both values are the same.
* Easy Data Manipulation: In a DBMS the data is centralized, so we can modify the data in one place and the change is reflected everywhere the data is used.
* Concurrent Access: Multiple users can access the data at the same time.
* Backup and Recovery: We can make copies of our data, and data can be recovered after a system failure by applying recovery techniques.
Disadvantages of DBMS
* Increased Cost: The cost of the software, hardware, and personnel needed to operate and maintain the DBMS can be very high.
* Increased Complexity: Since most DBMSs combine many different technologies, users need training, and often only specialized personnel can operate them.
* Frequent Updates: New technologies appear all the time, so the system must be kept up to date. These upgrades, and training the database users and the DBA on the changes, increase costs for the company.
* Higher Impact of Failure: The database in a DBMS is centralized, which increases the vulnerability of the system. The failure of any component or the corruption of a storage device can bring the whole system to a halt.
Why Is Database Design Important?
Database Design: Database design is the process of creating a structured plan for a database that outlines how data will be stored, organized, and accessed.
-It involves designing the database schema.
-Database design is an important aspect of database management, as it determines the efficiency and effectiveness of the database in storing and retrieving data.
-A well-designed database can improve performance and scalability, while a poorly designed database can lead to problems such as data redundancy, data inconsistency, and slow query performance.
(Extra)
Database Schema
What is a Schema?
-The skeleton of the database is formed by its attributes (titles/headings), and this skeleton is called the schema.
-The schema specifies logical constraints such as tables, primary keys, etc.
-The schema does not represent the data type of the attributes.
[Figure: Details of a Customer]
[Figure: Schema of Customer]
Database Schema
-A database schema is a logical representation of data that shows how the data in a database should be stored logically.
-It shows how the data is organized and the relationships between the tables.
-A database schema contains tables, fields, views, and the relationships between keys such as primary keys and foreign keys.
Process of Database Design
The database design process involves creating a structured plan for a database that outlines how data will be stored, organized, and accessed. It typically involves the following steps:
1. Data requirements: Identify the types of data that need to be stored and the relationships between different data elements.
2. Data normalization: Divide the data into smaller, related tables to eliminate redundancy (unneeded duplication) and improve data integrity (quality).
3. Data types: Choose appropriate data types for each data element to ensure efficient storage and retrieval of data.
4. Create indexes: Determine which data elements need to be indexed to improve query performance. An index is a data structure that helps the database system locate data more quickly.
5. Test and refine the design: Test the database design to ensure that it meets the needs of the organization and makes efficient use of resources. This might involve running performance tests and making adjustments to the design as needed.
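The indexing and testing steps can be tried out on a small scale. The sketch below uses SQLite's EXPLAIN QUERY PLAN to compare how a lookup is planned before and after an index is created. The customer table, the index name idx_customer_email, and the generated rows are assumptions for illustration, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, email TEXT)"
)
conn.executemany(
    "INSERT INTO customer (name, email) VALUES (?, ?)",
    [(f"user{i}", f"user{i}@example.com") for i in range(1000)],
)


def query_plan(conn):
    """Return SQLite's description of how it would run the email lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT name FROM customer WHERE email = ?",
        ("user500@example.com",),
    ).fetchall()
    # The last column of each row is a human-readable description of the step.
    return " ".join(row[-1] for row in rows)


plan_before = query_plan(conn)  # no index yet, so the table has to be scanned
conn.execute("CREATE INDEX idx_customer_email ON customer (email)")
plan_after = query_plan(conn)  # the planner can now search the index

print(plan_before)
print(plan_after)
```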
What is a Good Database Design?
-A good database design is one that is well-structured, efficient, and flexible.
-It meets the needs of the organization and supports the efficient storage and retrieval of data.
-A good database design is normalized, which means that the data is divided into smaller, related tables
and there is minimal redundancy.
Summary
Normalization helps to eliminate redundancy and improve the integrity and efficiency of the data. A good
database design is also efficient, which means that it makes effective use of resources such as storage
and processing power, and ensures that queries are fast and efficient. It is flexible, which means that it
can adapt to changing requirements and needs, and is able to handle new types of data and new
relationships between data elements without requiring major redesigns.
Additionally, a good database design is scalable, which means that it can handle increasing amounts of
data and queries without degrading performance, and is secure, with appropriate measures in place to
protect the data from unauthorized access and manipulation.
What is Data Normalization?
Data normalization is the process of organizing a database in a way that reduces redundancy and dependency. Redundancy can lead to inconsistencies and errors in the data and can make the database more difficult to update and maintain. By normalizing the data, organizations can reduce redundancy and improve the integrity of the data.
It involves dividing the data into smaller, related tables and establishing relationships between those tables.
There are several levels of data normalization, each with its own set of rules. The most common levels of normalization are:
* First normal form (1NF): In 1NF, data is divided into tables with unique primary keys, every column holds atomic (indivisible) values, and there are no repeating groups of data within a table.
* Second normal form (2NF): In 2NF, data is further normalized by removing partial dependencies on the primary key. This means that non-key attributes are dependent on the entire primary key, rather than just a part of it.
* Third normal form (3NF): In 3NF, data is further normalized by removing transitive dependencies. This means that non-key attributes are dependent only on the primary key, and not on other non-key attributes.
Data normalization is a process that helps to eliminate redundancy and improve the integrity and
efficiency of a database. By normalizing the data, organizations can design more effective and efficient
databases that are easier to update and maintain.
Normal Form | Description
1NF | A relation is in 1NF if every attribute contains only atomic values.
2NF | A relation is in 2NF if it is in 1NF and all non-key attributes are fully functionally dependent on the primary key.
3NF | A relation is in 3NF if it is in 2NF and no transitive dependency exists.
BCNF | A stronger definition of 3NF, known as Boyce-Codd normal form.
4NF | A relation is in 4NF if it is in Boyce-Codd normal form and has no multi-valued dependency.
5NF | A relation is in 5NF if it is in 4NF, does not contain any join dependency, and joining is lossless.
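A small worked example may make normalization more concrete. In the flat table below, the customer's city depends on the customer rather than on the order, which is the kind of transitive dependency that 3NF removes. All table names, column names, and rows are invented for this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Unnormalized: the customer's city is repeated on every order row.
    CREATE TABLE order_flat (
        order_id INTEGER PRIMARY KEY,
        customer_name TEXT,
        customer_city TEXT,
        item TEXT
    );
    INSERT INTO order_flat VALUES
        (1, 'Asha', 'Pune', 'pen'),
        (2, 'Asha', 'Pune', 'ink'),
        (3, 'Ravi', 'Delhi', 'pad');

    -- Normalized: customer facts live in one place; orders refer to them by key.
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name TEXT UNIQUE,
        city TEXT
    );
    CREATE TABLE customer_order (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(customer_id),
        item TEXT
    );
    INSERT INTO customer (name, city)
        SELECT DISTINCT customer_name, customer_city FROM order_flat;
    INSERT INTO customer_order (order_id, customer_id, item)
        SELECT f.order_id, c.customer_id, f.item
        FROM order_flat AS f JOIN customer AS c ON c.name = f.customer_name;
    """
)

# A change of city is now a single-row update instead of one update per order.
cursor = conn.execute("UPDATE customer SET city = 'Mumbai' WHERE name = 'Asha'")
rows_updated = cursor.rowcount
customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
asha_orders = conn.execute(
    "SELECT o.item, c.city "
    "FROM customer_order AS o JOIN customer AS c USING (customer_id) "
    "WHERE c.name = 'Asha' ORDER BY o.order_id"
).fetchall()
print(rows_updated, customer_count, asha_orders)
```

Because the city is stored once per customer, both of Asha's orders show the new city after a single update, and the two copies can never disagree.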
Types of Database Schemas:

A database schema is a visual representation of a database that shows the tables, columns, and relationships between different data elements. There are several types of database schemas, each with its own characteristics and uses:
* Conceptual schema: A conceptual database schema gives a high-level view of what your database will contain and how different pieces of information relate to each other, without offering real-world details, i.e., how everything is stored.
* Logical database schema: Logical schemas flesh out conceptual schemas with more concrete details about the objects that will be contained within them, such as names, tables, views, and integrity constraints.
* Physical database schema: A physical schema is an actual design for a relational database. It includes all the technical and contextual information needed for the schema and is created with a specific physical data system in mind.
* User schema: A user schema is a view of the data that is specific to a particular user or group of users. It represents the data that is relevant to the user and the way that the user wants to access and view the data.
Why Database Design Matters:
Database design shapes how efficiently a database stores and retrieves data. A good design boosts
performance and scalability, while a poor one can cause issues like redundant data, inconsistencies, and
slow queries.
Key Reasons:
* Data Integrity: Ensures accurate, consistent, and error-free data. Poor designs may lead to redundancy and inconsistencies, eroding trust in query results.
* Query Performance: Enhances speed and ease of data retrieval, crucial for frequently accessed or large databases. Poor designs can result in sluggish query performance, impacting organizational efficiency.
* Scalability: Allows databases to handle growing data and queries without a drop in performance. Vital for databases expected to expand, ensuring continued support for organizational needs.
* Cost Savings: Well-designed databases are more cost-effective, reducing the need for extra hardware and software. Poor designs may require more resources, leading to higher maintenance costs.
In summary, good database design improves performance, maintains data integrity, and saves costs, all of which are essential for effective data management.
Database Maintenance
Good database maintenance is essential for ensuring the accuracy, reliability, and performance of a
database. Proper maintenance can help to prevent data loss, corruption, and other issues that can occur
over time, and it can also help to optimize the performance of the database.
There are several key benefits to good database maintenance:
* Data accuracy: Good database maintenance helps to ensure that the data in the database is accurate and up to date. This is important because errors in the data can lead to incorrect results and decision-making, which can have serious consequences for an organization.
* Data integrity: Good database maintenance helps to maintain the integrity of the data, which means that the data is consistent and follows the rules and constraints that have been set for it. This is important because data integrity is essential for the reliability of the database.
* Performance: Good database maintenance can help to optimize the performance of the database. This is important because a poorly performing database can result in slow query times and other issues, which can have a negative impact on the efficiency and productivity of an organization.
* Data security: Good database maintenance includes measures to protect the data from unauthorized access and manipulation. This is important because data breaches and other security incidents can have serious consequences for an organization, including legal and regulatory penalties, damage to reputation, and financial losses.
There are several key activities that are involved in good database maintenance. These include:
1. Backups: Regular backups of the database are essential to ensure that the data can be restored in case of a disaster or other data loss event.
2. Indexing: Indexing helps to improve the performance of the database by creating structures that allow the database to locate data more quickly.
3. Data cleansing: Data cleansing involves identifying and correcting errors or inconsistencies in the data. This is important because data errors can lead to incorrect results and decision-making.
4. Optimization: Optimization involves identifying and addressing performance issues in the database. This can include activities such as index optimization, query optimization, and hardware optimization.
5. Security: Good database maintenance includes measures to protect the data from unauthorized access and manipulation. This can include activities such as password management, access control, and security audits.
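As one illustration of the backup activity, Python's sqlite3 module offers Connection.backup (Python 3.7 and later), which copies a live database into another connection. The note table and its contents are made up for this sketch. A real backup would normally target a file on separate storage rather than a second in-memory database.

```python
import sqlite3

# Source database with a little sample data (assumed for this sketch).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE note (id INTEGER PRIMARY KEY, body TEXT)")
source.execute("INSERT INTO note (body) VALUES ('remember to back up')")
source.commit()

# Copy the whole database into a second connection.
backup_copy = sqlite3.connect(":memory:")
source.backup(backup_copy)

# Verify that the copy is readable and structurally sound.
restored_rows = backup_copy.execute("SELECT body FROM note").fetchall()
integrity = backup_copy.execute("PRAGMA integrity_check").fetchone()[0]
print(restored_rows, integrity)
```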
Key activities involved in good database maintenance include backups, indexing, data cleansing,
optimization, and security. By investing in good database maintenance, organizations can improve the
quality and value of their data, and increase the efficiency and productivity of their operations.
Files and the File System in a DBMS
In a Database Management System (DBMS), the concept of files and the file system is crucial for organizing and managing data. Here's an overview:
File:
In the context of a DBMS, a file is a collection of related records.
A file represents a table in a relational database or an entity in other types of databases.
Each record in the file corresponds to a row in a table, and each field in the record corresponds to a
column in a table.
File Organization:
Files in a DBMS can be organized in different ways based on the requirements of the application and the
efficiency of data retrieval.
Common file organizations include sequential, random, and hashed.
Sequential File Organization:

Records are stored in sequential order based on a primary key or some other field.
Suitable for applications that require sequential processing of records.
It may not be efficient for direct access or searching.
Random (or Direct) File Organization:
Records can be accessed directly without having to read through the preceding records.
Requires an index structure to facilitate direct access based on a key field.
Suitable for applications that require quick retrieval of specific records.
Hashed File Organization:
Uses a hash function to determine the storage location of records.
Provides fast access to records, especially in scenarios where a unique key is used.
Suitable for applications where quick access to specific records is crucial.
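Hashed file organization can be illustrated with a toy in-memory structure. This is a sketch, not how any particular DBMS lays out its files: the HashedFile class, the modulo hash function, and the bucket count are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]


class HashedFile:
    """Toy hashed file: a fixed number of buckets holding (key, record) pairs."""

    def __init__(self, bucket_count: int = 8) -> None:
        self.buckets: List[List[Tuple[int, Record]]] = [[] for _ in range(bucket_count)]

    def _bucket_index(self, key: int) -> int:
        # Hash function: key modulo the bucket count (an assumption for this sketch).
        return key % len(self.buckets)

    def insert(self, key: int, record: Record) -> None:
        bucket = self.buckets[self._bucket_index(key)]
        for position, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[position] = (key, record)  # overwrite an existing key
                return
        bucket.append((key, record))

    def lookup(self, key: int) -> Optional[Record]:
        # Only one bucket is searched, never the whole file.
        for existing_key, record in self.buckets[self._bucket_index(key)]:
            if existing_key == key:
                return record
        return None


hashed_file = HashedFile(bucket_count=4)
hashed_file.insert(101, {"name": "Asha"})
hashed_file.insert(105, {"name": "Ravi"})  # 105 % 4 == 1, the same bucket as 101
found = hashed_file.lookup(105)
missing = hashed_file.lookup(999)
print(found, missing)
```

Keys 101 and 105 collide in bucket 1, so the lookup scans only that short bucket. This is why hashing gives fast access by key but does not help with reading records in sorted order.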
File System:
The file system in a DBMS is a mechanism for organizing and managing files.
It includes components like data dictionary, data catalog, and index files.
The data dictionary contains metadata about the structure of the database, such as information about
tables, fields, and relationships.
The data catalog stores information about the data stored in the database, including the location of files
and indexes.
Index files are used to speed up data retrieval by providing a quick reference to the location of specific
records.
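SQLite exposes this kind of metadata through its sqlite_master table and the table_info pragma, which gives a rough feel for what a data dictionary holds. The customer table and the index name below are assumptions for illustration. Other DBMSs expose their catalogs differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute("CREATE INDEX idx_customer_name ON customer (name)")

# sqlite_master lists every table and index the database knows about.
catalog_entries = conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY name"
).fetchall()

# table_info describes the columns of one table; position 1 is the column name.
column_names = [row[1] for row in conn.execute("PRAGMA table_info(customer)")]

print(catalog_entries)
print(column_names)
```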
Data Integrity and Security:

The file system in a DBMS also manages data integrity and security.
It enforces constraints to maintain the consistency and accuracy of data.
Access control mechanisms are implemented to ensure that only authorized users can access and
modify data.
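Constraint enforcement can be sketched with NOT NULL, UNIQUE, and CHECK constraints. The employee table and the rule that age must be at least 18 are assumptions made up for this example. Access control is not shown, since SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, "
    "email TEXT NOT NULL UNIQUE, "
    "age INTEGER CHECK (age >= 18))"
)
conn.execute("INSERT INTO employee (email, age) VALUES ('a@example.com', 30)")

# Each of these rows breaks one rule: duplicate email, missing email, age too low.
bad_rows = [("a@example.com", 41), (None, 25), ("c@example.com", 12)]
rejected = []
for email, age in bad_rows:
    try:
        conn.execute("INSERT INTO employee (email, age) VALUES (?, ?)", (email, age))
    except sqlite3.IntegrityError as error:
        rejected.append(str(error))

row_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(rejected)
print(row_count)
```

The database itself rejects all three invalid rows, so the rules do not have to be repeated in every application that writes to the table.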
Transaction Management:
The file system plays a role in managing transactions, ensuring that multiple operations on the database
occur atomically (all or nothing) and maintain consistency.
In modern database systems, the file system is often abstracted away, and databases use sophisticated
data structures and algorithms to manage data efficiently. Relational database management systems
(RDBMS) like MySQL, PostgreSQL, and Oracle, for example, provide a high-level interface for users and
applications to interact with data without dealing directly with file organization details.
Problems with File System Data Management
While the use of file systems for data management in databases was prevalent in early database
systems, it had several limitations and problems. Here are some of the key issues associated with using a
file system for data management in a Database Management System (DBMS):
Data Redundancy:
In a file system-based approach, data redundancy is a common problem. The same data may be
duplicated in multiple files, leading to inconsistencies and wastage of storage space.

Data Inconsistency:
The decentralized nature of file systems makes it challenging to maintain data consistency. Updates and
modifications to data may result in inconsistencies, especially when multiple applications access the
same data.
Data Isolation:
Data isolation refers to the situation where each application has its own set of files, and changes made
by one application may not be immediately visible to other applications. This lack of data sharing can
lead to inefficient use of data and difficulties in maintaining a unified and coherent view of the data.
Difficulty in Access and Retrieval:
Retrieving specific data from a file system can be inefficient, especially when dealing with large datasets.
File systems may not provide efficient mechanisms for searching, sorting, and filtering data.
Limited Concurrent Access:
File systems may not handle concurrent access by multiple users or applications well. This can result in
issues such as data corruption or the inability to perform certain operations when data is being accessed
or modified by others.
Security Concerns:
File systems often lack robust security features. It may be challenging to implement access controls,
encryption, and other security measures to protect sensitive data adequately.
Lack of Data Integrity Constraints:
Maintaining data integrity (ensuring that data satisfies certain consistency constraints) can be difficult in
a file system. Without the enforcement of constraints, there is a higher risk of introducing errors and
inconsistencies in the data.

Scalability Issues:
As the volume of data increases, file systems may struggle to scale efficiently. Performance degradation
and increased complexity can be significant challenges in managing large datasets.
Limited Data Relationships:
File systems do not inherently support the establishment and enforcement of relationships between
different sets of data. In contrast, relational database management systems excel in managing
relationships between tables, ensuring data integrity.
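A foreign key is one way a relational DBMS enforces such relationships. In the sketch below, the department and employee tables are invented for illustration. Note that SQLite checks foreign keys only after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE department (
        department_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name TEXT NOT NULL,
        department_id INTEGER NOT NULL REFERENCES department(department_id)
    );
    INSERT INTO department (department_id, name) VALUES (1, 'Accounts');
    INSERT INTO employee (name, department_id) VALUES ('Meena', 1);
    """
)

# Department 99 does not exist, so the DBMS refuses to create an orphan row.
try:
    conn.execute("INSERT INTO employee (name, department_id) VALUES ('Ravi', 99)")
    orphan_inserted = True
except sqlite3.IntegrityError:
    orphan_inserted = False

employee_count = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
print(orphan_inserted, employee_count)
```

With plain files, nothing would stop an application from writing an employee record that points to a department that does not exist.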
Maintenance Challenges:
Maintenance tasks, such as data backup, recovery, and optimization, can be more complex in a file
system-based approach compared to modern database systems.
In response to these challenges, relational database management systems (RDBMS) emerged as a more structured and efficient way to manage data, providing features like data integrity, normalization, and transaction management. RDBMSs have largely replaced file systems for data management in modern applications because they address these issues effectively.
