Database Management System Answer Key - Activity 1


INFORMATION MANAGEMENT - ASSIGNMENT

PART 1:
1. Define each of the following terms:
a. Data – These are “raw” facts such as a telephone number, a birth date, a customer
name, and a year-to-date (YTD) sales value. Data have little meaning unless they have
been organized in some logical manner.
b. Field – A character or group of characters (alphabetical or numeric) that has a specific
meaning. A field is used to define and store data.
c. Record – A logically connected set of one or more fields that describes a person, place,
or thing. For example, the fields that constitute a record for a customer might consist of
the customer’s name, address, phone number, date of birth, credit limit and unpaid
balance.
d. File – It is a collection of related records. For example, a file might contain data about
the students currently enrolled at Gigantic University.

2. What is data redundancy, and which characteristics of the file system can lead to it?
Data redundancy exists when the same data are stored unnecessarily in different places.
For example, a customer’s telephone number may be found in the customer file, in the sales agent file,
and in the invoice file. Data redundancy is symptomatic of a (computer) file system, given its inability to
represent and manage data relationships: each file is created and maintained separately, so the same
data end up stored in several files. Data redundancy in turn leads to the following problems.
First, poor data security: having multiple copies of data increases the chances that
a copy of the data will be exposed to unauthorized access. Next, data inconsistency
exists when different and conflicting versions of the same data appear in different places. Last,
data anomalies are abnormalities that develop when not all of the required changes in the
redundant data are made successfully.

3. What is data independence, and why is it lacking in file systems?


Data independence is a condition in which the programs that access data are not
dependent on the data storage characteristics of the data. Systems that lack data independence
are said to exhibit data dependence. File systems exhibit data dependence because file access
is dependent on a file's data characteristics. Therefore, any time the file data characteristics are
changed, the programs that access the data within those files must be modified.
Data independence exists when changes in the data characteristics don't require
changes in the programs that access those data. File systems lack data independence because
all data access programs are subject to change when any of the file system’s data storage
characteristics - such as changing a data type - change.
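
The data dependence described above can be illustrated with a minimal Python sketch. The fixed-width layout, the field widths, and the sample name and phone number are all assumptions made up for illustration; they do not come from the assignment.

```python
# Hypothetical fixed-width record: a 10-character name field followed by an
# 8-character phone field. The layout is an assumption for illustration.
RECORD_V1 = "Ann Lee".ljust(10) + "555-1234"

def read_phone_v1(record: str) -> str:
    # The storage layout (columns 10 to 17) is hard-coded into the program.
    return record[10:18].strip()

# The file's data characteristics change: the name field grows to 15 characters.
RECORD_V2 = "Ann Lee".ljust(15) + "555-1234"

phone_ok = read_phone_v1(RECORD_V1)      # read with the layout the program expects
phone_broken = read_phone_v1(RECORD_V2)  # same program, changed layout

print(phone_ok)
print(phone_broken)
```

The program that reads the first layout correctly returns a truncated value for the second layout, so it would have to be modified whenever the file's storage characteristics change.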
4. What is a DBMS, and what are its functions?
A database management system (DBMS) is a collection of programs that manages the
database structure and controls access to the data stored in the database. In a sense, a
database resembles a very well-organized electronic filing cabinet in which powerful software,
known as a database management system, helps manage the cabinet’s content. The DBMS
performs the following functions:
a. Data Dictionary Management – the DBMS stores definitions of the data elements and
their relationships (metadata) in a data dictionary. The DBMS uses the data dictionary to look up the
required data component structures and relationships, thus relieving the programmer from
having to code such complex relationships in each program.
b. Data Storage Management – the DBMS creates and manages the complex structures
required for data storage, thus relieving the programmer from the difficult task of defining and
programming the physical data characteristics. This is also important for database
performance tuning. Performance tuning relates to the activities that make the database
perform more efficiently in terms of storage and access speed.
c. Data Transformation and Presentation – the DBMS transforms entered data to
conform to required data structures. It relieves the person of the chore of making a
distinction between the logical data format and the physical data format. The DBMS
formats the physically retrieved data to make it conform to the user’s logical
expectations.
d. Security Management – the DBMS creates a security system that enforces user
security and data privacy. Security rules determine which users can access the
database, which data items each user can access, and which data operations (read,
add, delete, or modify) the user can perform.
e. Multiuser access control – to provide data integrity and data consistency, the DBMS
uses sophisticated algorithms to ensure that multiple users can access the database
concurrently without compromising the integrity of the database.
f. Backup and Recovery Management – the DBMS provides backup and data recovery
to ensure data safety and integrity. Recovery management deals with the recovery of the
database after a failure, such as a bad sector in the disk or a power failure.
g. Database access languages and application programming interfaces – the DBMS
provides data access through a query language. A query language is a nonprocedural
language – one that lets the user specify what must be done without having to specify
how it is to be done. Structured Query Language (SQL) is the de facto query language
and data access standard supported by the majority of DBMS vendors. The DBMS also
provides application programming interfaces to procedural languages such as COBOL,
C, Java, Visual Basic.Net, and C#.
h. Database Communication Interfaces – current-generation DBMSs accept end-user
requests via multiple, different network environments. For example, a DBMS might provide
access to the database via the Internet through the use of Web browsers such as
Mozilla Firefox or Microsoft Internet Explorer. Communication can be accomplished in
several ways: end users can generate answers to queries by filling in screen forms
through their preferred Web browser, the DBMS can automatically publish predefined
reports on a website, and the DBMS can connect to third-party systems to distribute
information via e-mail or other productivity applications.
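
To illustrate function (g), the sketch below contrasts a nonprocedural SQL query with a procedural loop that produces the same answer. It uses Python's built-in sqlite3 module as a stand-in DBMS; the table name, column names, and sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database used as a stand-in DBMS; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_balance REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("Ann", 120.0), ("Bob", 0.0), ("Cy", 75.5)],
)

# Nonprocedural: state WHAT is wanted; the DBMS decides how to find it.
sql_names = [
    row[0]
    for row in conn.execute(
        "SELECT cus_name FROM customer WHERE cus_balance > 0 ORDER BY cus_name"
    )
]

# Procedural: the program spells out HOW to filter and sort the rows.
all_rows = conn.execute("SELECT cus_name, cus_balance FROM customer").fetchall()
loop_names = sorted(name for name, balance in all_rows if balance > 0)

print(sql_names, loop_names)
```

Both approaches return the customers with an unpaid balance, but only the procedural version has to describe the filtering and sorting steps itself.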

5. What is structural independence, and why is it important?


Structural independence exists when data access programs are not subject to change
when the file’s structural characteristics, such as the number or order of the columns in a table,
change. Structural independence is important because it substantially decreases programming
effort and program maintenance costs.
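
A minimal sketch of structural independence, using Python's built-in sqlite3 module as a stand-in DBMS. The table, the column names, and the sample row are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database; the schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (cus_name TEXT, cus_phone TEXT)")
conn.execute("INSERT INTO customer VALUES ('Ann Lee', '555-1234')")

def phone_of(name):
    # The program refers to columns by name, not by their position in a file.
    row = conn.execute(
        "SELECT cus_phone FROM customer WHERE cus_name = ?", (name,)
    ).fetchone()
    return row[0]

before = phone_of("Ann Lee")

# Structural change: a new column is added to the table.
conn.execute("ALTER TABLE customer ADD COLUMN cus_birth_date TEXT")

after = phone_of("Ann Lee")  # the same, unmodified function still works
print(before, after)
```

The access code is not touched when the column is added, which is the saving in programming effort and maintenance that the answer refers to.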
6. Explain the difference between data and information.


Data are raw facts. The word raw indicates that the facts have not yet been processed to reveal
their meaning. Raw data must be properly formatted for storage, processing, and presentation.
Meanwhile, information is the result of processing raw data to reveal its meaning. Data
processing can be as simple as organizing data to reveal patterns or as complex as making
forecasts or drawing inferences using statistical modeling. To reveal meaning, information
requires context. The production of accurate, relevant, and timely
information is the key to good decision making. In turn, good decision making is the key to
business survival in a global market.

7. What is the role of a DBMS, and what are its advantages? What are its disadvantages?
The role of a DBMS is to serve as the intermediary between the user and the database.
The database structure itself is stored as a collection of files, and the only way to access the
data in those files is through the DBMS.
Advantages:
a. Having a DBMS enables the data in the database to be shared among multiple
applications or users.
b. Integrates the many different users’ views of the data into a single all-encompassing
data repository.
c. Improved data sharing. The DBMS helps create an environment in which the end
users have better access to more and better-managed data. Such access makes it
possible for end users to respond quickly to changes in their environment.
d. Improved data security. The more users access the data, the greater the risks of data
security breaches. A DBMS provides a framework for better enforcement of data privacy
and security policies.
e. Better Data Integration. Wider access to well-managed data promotes an integrated
view of the organization’s operations and a clearer view of the big picture. It becomes
much easier to see how actions in one segment of the company affect other segments.
f. Minimized data inconsistency. Data Inconsistency is when different versions of the
same data appear in different places.
g. Improved data access. The DBMS makes it possible to produce quick answers to ad
hoc queries.
h. Improved decision making. Better-managed data and improved data access make it
possible to generate better-quality information, on which better decisions are based.
i. Increased end-user productivity. The availability of the data, combined with the tools
that transform data into usable information, empowers end users to make quick,
informed decisions that can make the difference between success and failure in the
global economy.
Disadvantages:
a. Increased costs – Database systems require sophisticated hardware and software and
highly skilled personnel. The cost of maintaining the hardware, software, and personnel
required to operate and manage a database system can be substantial. Training,
licensing, and regulation compliance costs are often overlooked when database systems
are implemented.
b. Management Complexity – Database systems interface with many different
technologies and have a significant impact on a company’s resources and culture.
Changes must be properly managed to ensure that the adoption of the database system
helps advance the company’s objectives. Security of the company’s data must be
assessed constantly.
c. Maintaining currency – to maximize the efficiency of the database system, the system
must be kept current. Therefore, the company must perform frequent updates and apply the latest
patches and security measures to all components. Because technology advances rapidly, personnel
training costs tend to be significant.
d. Vendor dependence – Companies might be reluctant to change database vendors. As
a consequence, vendors are less likely to offer pricing point advantages to existing
customers, and those customers might be limited in their choice of database system
components.
e. Frequent upgrade/replacement cycles – DBMS vendors frequently upgrade their
products by adding new functionality. New features often come bundled in new
upgrade versions of the software. Some of these versions require hardware upgrades.
Upgrades themselves cost money, but it also costs money to train database users and
administrators to properly use and manage the new features.

8. List and describe the different types of databases.


Based on users:
a. Single-user database – supports only one user at a time. If user A is using the
database, users B and C must wait until user A is done. A single-user database that runs
on a personal computer is called a desktop database.
b. Multiuser database – supports multiple users at the same time. If the multiuser
database supports a relatively small number of users (usually fewer than 50) or a specific
department within an organization, it is called a workgroup database. And when the
database is used by the entire organization and supports many users (more than 50,
usually hundreds), across many departments, the database becomes an enterprise
database.
Based on locations:
a. Centralized database – it supports data located at a single site.
b. Distributed database – it supports data distributed across several different sites.
Based on the extent of use:
a. Operational database (transactional or production database) – it is designed
primarily to support a company’s day-to-day operations. Transactions such as product or
service sales, payments, and supply purchases reflect critical day-to-day operations.
Such operations must be recorded accurately and immediately.
b. Data warehouse – focuses primarily on storing data used to generate information
required to make tactical or strategic decisions. Such decisions require extensive “data
massaging” or data manipulation to extract information to formulate pricing decisions,
sales forecasts, market positioning, and so on. Most decision support data are based on
data obtained from operational databases over time and stored in data warehouses.
Based on the degree to which the data are structured:
a. Unstructured Data – these are data that exist in their original (raw) state, that is, in the
format in which they were collected. Therefore, unstructured data exist in a format that
does not lend itself to the processing that yields information.
b. Structured Data – are the results of taking unstructured data and formatting
(structuring) such data to facilitate storage, use, and the generation of information.

9. What are the main components of a database system?


Database system refers to an organization of components that define and regulate the
collection, storage, management, and use of data within a database environment.
a. Hardware – it refers to all of the system’s physical devices. For example, computers
(PCs, workstations, servers, and supercomputers), storage devices, printers, network
devices (hubs, switches, routers, fiber optics), and other devices (automated teller
machines, ID readers, and so on).
b. Software – the DBMS is the most readily identified software, but three types of software
are needed for the database system to function fully.
1. Operating System software – manages all hardware components and makes it
possible for all other software to run on the computers. Examples of OS software
include Microsoft Windows, Linux, Mac OS, UNIX, and MVS.
2. Database Management System Software – manages the database within the
database system. Some of the DBMS examples include Microsoft SQL Server,
Oracle Corporation’s Oracle, Sun’s MySQL, and IBM’s DB2.
3. Application program and utility software – it is used to access and manipulate
data in the DBMS and to manage the computer environment in which data access
and manipulation take place. Application programs are commonly used to access data
found within the database to generate reports, tabulations, and other information to
facilitate decision making. On the other hand, utilities are software tools used to help
manage the database system’s computer components. For example, DBMS vendors
provide graphical user interfaces (GUIs) to help create database structures, control
database access, and monitor database operations.
c. People – users of the database system. There are five types of users.
1. System Administrators – oversee the database system’s general operations.
2. Database administrators (known as DBA) – manage the DBMS and ensure that
the database is functioning properly.
3. Database designers (in effect, Database Architects) – design the database
structure.
4. System Analysts and programmers – design and implement the application
programs. They design and create the data entry screens, reports, and
procedures through which end users access and manipulate the database’s
data.
5. End users – these are people who use the application programs to run the
organization’s daily operations. High-level end users employ the information
obtained from the database to make tactical and strategic business decisions.
Example: salesclerks, supervisors, managers, and directors.
d. Procedures – these are instructions and rules that govern the design and use of the
database system. Procedures play an important role in a company because they enforce
the standards by which business is conducted within the organization and with
customers. Procedures are also used to ensure that there is an organized way to
monitor and audit both the data that enter the database and the information that is
generated through the use of those data.
e. Data – it covers the collection of facts stored in the database. Data are the raw material
from which information is generated. Determining what data are to be entered
into the database and how those data are to be organized is a vital part of the database
designer’s job.

10. What are metadata?


Metadata, or data about data, are the means through which the end-user data are integrated and
managed. Metadata describe the data and give them context, which helps to organize, find, and
understand the data. In a relational database, metadata are stored in a structure called the data
dictionary or system catalog. It holds information about tables, columns, data types, table
relationships, constraints, and so on.
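
As a concrete illustration, the sketch below reads metadata back from SQLite's catalog using Python's built-in sqlite3 module. SQLite is only one example of a DBMS, and the STUDENT table and its columns are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the STUDENT table is a made-up example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "stu_id INTEGER PRIMARY KEY, stu_name TEXT NOT NULL, stu_gpa REAL)"
)

# sqlite_master is SQLite's catalog table: it describes the stored objects.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [
    (row[1], row[2], bool(row[3]))
    for row in conn.execute("PRAGMA table_info(student)")
]

print(tables)
print(columns)
```

No student data have been entered at this point; everything printed is data about the data.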

11. Explain why database design is important.


Database design refers to the activities that focus on the design of the database
structure that will be used to store and manage end-user data. The structure of the database
must be designed carefully. Design is a crucial aspect of working with databases because even a
good DBMS will perform poorly with a badly designed database. Proper database design requires the
designer to identify precisely the database’s expected use. For example, designing a transactional
database emphasizes accurate and consistent data and operational speed. Meanwhile,
designing a data warehouse database emphasizes the use of historical and aggregated data.
Other approaches are implemented in centralized, single-user, multiuser, and distributed
environments.
A well-designed database facilitates data management and generates accurate and
valuable information. A poorly designed database is likely to become a breeding ground for
difficult-to-trace errors that may lead to bad decision making, and bad decision making can lead
to the failure of an organization.

12. What are the potential costs of implementing a database system?


a. Sophisticated hardware and software, trained personnel
b. Training, licensing and regulation compliance costs
c. Vendor dependence - vendors are less likely to offer pricing point advantages to existing
customers
d. Updating of hardware and software; additional training

13. Use examples to compare and contrast unstructured and structured data. Which type is
more prevalent in a typical business environment?
Unstructured data are simply data that have not been processed to yield information.
An invoice can serve as an example of both types. If one were to take an invoice and simply scan
it into a graphic, it would be unstructured data. In contrast, if it were processed and put into a
database (subsequently becoming structured data), employees could eventually find the
monthly averages, amounts owed, etc. from various invoices. While both are prevalent, I would
think semistructured data would be the most common in a typical business. Some data are
stored but not processed (unstructured data such as memos), and some are stored in
databases (such as invoices), but most data are processed only to a certain extent: they are
displayed in a prearranged format but cannot yield all of the information contained within.

14. What are some basic database functions that a spreadsheet cannot perform?
A spreadsheet allows the creation of multiple tables, but it does not support even the most
basic database functionality such as support for self-documentation through metadata,
enforcement of data types or domains to ensure consistency of data within a column, defined
relationships among tables, or constraints to ensure consistency of data access across related
tables. Most users lack the necessary training to recognize the limitations of spreadsheets for
these types of tasks.
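
The sketch below shows two of these functions, domain enforcement within a column and a defined relationship between tables, using Python's built-in sqlite3 module as a stand-in DBMS. The CUSTOMER and INVOICE tables and their contents are hypothetical.

```python
import sqlite3

# In-memory SQLite database; the schema and rows are made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute(
    "CREATE TABLE customer (cus_id INTEGER PRIMARY KEY, cus_name TEXT NOT NULL)"
)
conn.execute(
    "CREATE TABLE invoice ("
    "inv_id INTEGER PRIMARY KEY, "
    "cus_id INTEGER NOT NULL REFERENCES customer (cus_id), "
    "inv_total REAL NOT NULL CHECK (inv_total >= 0))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee')")
conn.execute("INSERT INTO invoice VALUES (100, 1, 59.90)")

# Rows that break a rule are rejected by the DBMS itself.
rejected = []
bad_rows = [
    ("negative total", (101, 1, -5.0)),     # violates the CHECK constraint
    ("unknown customer", (102, 99, 10.0)),  # violates the foreign key
]
for label, row in bad_rows:
    try:
        conn.execute("INSERT INTO invoice VALUES (?, ?, ?)", row)
    except sqlite3.IntegrityError:
        rejected.append(label)

invoice_count = conn.execute("SELECT COUNT(*) FROM invoice").fetchone()[0]
print(rejected, invoice_count)
```

A spreadsheet would accept both of the bad rows unless the user built and maintained the checks by hand.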

15. What common problems does a collection of spreadsheets created by end users share with
the typical file system?
a. Lengthy development times
b. Difficulty of getting quick answers
c. Complex system administration
d. Lack of security and limited data sharing
e. Extensive programming

16. Explain the significance of the loss of direct, hands-on access to business data that end
users experienced with the advent of computerized data repositories.
Users lost direct, hands-on access to the business data when computerized data
repositories were developed because the IT skills necessary to directly access and manipulate
the data were beyond the average user’s abilities and because security precautions restricted
access to the shared data. This was significant because it removed users from the direct
manipulation of data and introduced significant time delays for data access. When users need
answers to business questions from the data, necessity often does not give them the luxury of
time to wait days, weeks, or even months for the required reports. The desire to return hands-on
access to the data to the users, among other drivers, helped to propel the development of
database systems. While database systems have greatly improved the ability of users to directly
access data, the need to quickly manipulate data for themselves has led to the problems of
spreadsheets being used when databases are needed.

PART 2:
1. How many records does the file contain? How many fields are there per record?
The file contains 7 records (21-5Z through 31-7P) and each of the records is composed
of 5 fields (PROJECT_CODE through PROJECT_BID_PRICE).
2. What problem would you encounter if you wanted to produce a listing by city? How would you
solve this problem by altering the file structure?
The city names are contained within the MANAGER_ADDRESS attribute and
decomposing this character (string) field at the application level is cumbersome at best.
(Queries become much more difficult to write and take longer to execute when internal string
searches must be conducted). If the ability to produce city listings is important, it is best to store
the city name as a separate attribute.

3. If you wanted to produce a listing of the file contents by last name, area code, city, state, or
zip code, how would you alter the file structure?
The more we divide the address into its component parts, the greater its information
capabilities. For example, by dividing MANAGER_ADDRESS into its component parts
(MGR_STREET, MGR_CITY, MGR_STATE, and MGR_ZIP), we gain the ability to easily select
records on the basis of zip codes, city names, and states. Similarly, by subdividing the
MANAGER name into its components MGR_LASTNAME, MGR_FIRSTNAME, and
MGR_INITIAL, we gain the ability to produce more efficient searches and listings. For example,
creating a phone directory is easy when you can sort by last name, first name, and initial.
Finally, separating the area code and the phone number will yield the ability to efficiently group
data by area codes. Thus, MGR_PHONE might be decomposed into MGR_AREA_CODE and
MGR_PHONE. The more you decompose the data into their component parts, the greater the
search flexibility. Data that are decomposed into their most basic components are said to be
atomic.
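
A minimal Python sketch of this decomposition follows. The project codes and addresses are invented for illustration and are not the values in the figure, and the splitting function assumes every address has exactly the layout "street, city, ST zip".

```python
# Hypothetical rows: the project codes and addresses are invented for illustration.
combined_rows = [
    {"PROJECT_CODE": "P-1", "MANAGER_ADDRESS": "12 Oak St., Springfield, IL 62701"},
    {"PROJECT_CODE": "P-2", "MANAGER_ADDRESS": "9 Elm Ave., Austin, TX 73301"},
    {"PROJECT_CODE": "P-3", "MANAGER_ADDRESS": "40 Pine Rd., Denver, CO 80202"},
]

def split_address(address):
    """Split 'street, city, ST zip' into atomic fields (assumes exactly that layout)."""
    street, city, state_zip = [part.strip() for part in address.split(",")]
    state, zip_code = state_zip.split()
    return {
        "MGR_STREET": street,
        "MGR_CITY": city,
        "MGR_STATE": state,
        "MGR_ZIP": zip_code,
    }

# One-time restructuring of the file into atomic attributes.
atomic_rows = [
    {"PROJECT_CODE": row["PROJECT_CODE"], **split_address(row["MANAGER_ADDRESS"])}
    for row in combined_rows
]

# With atomic attributes, listings by city or state need no string searching.
cities_in_order = [
    row["MGR_CITY"] for row in sorted(atomic_rows, key=lambda r: r["MGR_CITY"])
]
texas_projects = [row["PROJECT_CODE"] for row in atomic_rows if row["MGR_STATE"] == "TX"]

print(cities_in_order, texas_projects)
```

The string parsing is done once, when the structure is changed, instead of inside every program that needs a listing by city, state, or zip code.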

4. What data redundancies do you detect? How could those redundancies lead to anomalies?
Note that the manager named Holly B. Parker occurs 3 times, indicating that she
manages 3 projects coded 21-5Z, 25-9T, and 29-2D, respectively. (The occurrences indicate
that there is a 1:M relationship between MANAGER and PROJECT: each project is managed by
only one manager but, apparently, a manager may manage more than one project). Ms. Parker's
phone number and address also occur three times. If Ms. Parker moves and/or changes her
phone number, these changes must be made more than once and they must all be made
correctly, without missing a single occurrence. If any occurrence is missed during the change,
the data are different for the same person. After some time, it may become difficult to
determine what the correct data are. In addition, multiple occurrences invite misspellings and
digit transpositions, thus producing the same anomalies. The same problems exist for the
multiple occurrences of George F. Dorts.
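
The update anomaly can be sketched in a few lines of Python. The manager name and project codes are taken from the answer above; the phone numbers and the manager key "M1" are hypothetical.

```python
# Redundant structure: the manager's phone is repeated in every project row.
# Phone numbers are invented for illustration.
rows = [
    {"PROJECT_CODE": "21-5Z", "MANAGER": "Holly B. Parker", "MGR_PHONE": "555-0100"},
    {"PROJECT_CODE": "25-9T", "MANAGER": "Holly B. Parker", "MGR_PHONE": "555-0100"},
    {"PROJECT_CODE": "29-2D", "MANAGER": "Holly B. Parker", "MGR_PHONE": "555-0100"},
]

# An update that misses one occurrence (only the first two rows are changed).
for row in rows[:2]:
    row["MGR_PHONE"] = "555-0199"

phones = {row["MGR_PHONE"] for row in rows if row["MANAGER"] == "Holly B. Parker"}
inconsistent = len(phones) > 1  # two different numbers for the same person

# Alternative structure: manager data stored once, projects refer to it by key.
managers = {"M1": {"MANAGER": "Holly B. Parker", "MGR_PHONE": "555-0100"}}
projects = {"21-5Z": "M1", "25-9T": "M1", "29-2D": "M1"}

managers["M1"]["MGR_PHONE"] = "555-0199"  # a single change
phones_normalized = {managers[key]["MGR_PHONE"] for key in projects.values()}

print(inconsistent, phones_normalized)
```

When the phone number is stored only once, there is no second copy that can be missed or mistyped.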
5. Identify and discuss the serious data redundancy problems exhibited by the file structure
shown in Figure P1.5.
Given the file’s poor structure, the stage is set for multiple anomalies. For example, if the
charge for JOB_CODE = EE changes from $85.00 to $90.00, that change must be made twice.
Also, if employee June H. Sattlemeir is deleted from the file, you also lose information about the
existence of her JOB_CODE = EE, its hourly charge of $85.00, and the PROJ_HOURS = 17.5.
The loss of the PROJ_HOURS value will ultimately mean that the Coast project costs are not
being charged properly, thus causing a loss of PROJ_HOURS × JOB_CHG_HOUR = 17.5 ×
$85.00 = $1,487.50 to the company.
Incidentally, note that the file contains different JOB_CHG_HOUR values for the same
CT job code, thus illustrating the effect of changes in the hourly charge rate over time. The file
structure appears to represent transactions that charge project hours to each project. However,
the structure of this file makes it difficult to avoid update anomalies and it is not possible to
determine whether a charge change is accurately reflected in each record. Ideally, a change in
the hourly rate would be made in only one place and this change would then be passed on to the
transaction based on the hourly charge. Such a structural change would ensure the historical
accuracy of the transactions. The recommended changes require a lot of work in a file system.
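
The arithmetic and the "one place" idea can be checked with a short Python sketch. The job code, rate, hours, and the first employee name come from the answer above; the second employee, the 2-hour charge, and the helper function are hypothetical.

```python
from decimal import Decimal

# The hourly rate for each job code is kept in exactly one place.
job_rates = {"EE": Decimal("85.00")}
charges = []

def record_charge(employee, job_code, hours):
    # Hypothetical helper: copy the rate in force at the time of the transaction,
    # so that later rate changes do not rewrite history.
    rate = job_rates[job_code]
    charges.append(
        {
            "EMP_NAME": employee,
            "JOB_CODE": job_code,
            "PROJ_HOURS": hours,
            "JOB_CHG_HOUR": rate,
            "AMOUNT": hours * rate,
        }
    )

record_charge("June H. Sattlemeir", "EE", Decimal("17.5"))

# The rate changes from $85.00 to $90.00: a single update.
job_rates["EE"] = Decimal("90.00")
record_charge("Hypothetical Employee", "EE", Decimal("2"))

for charge in charges:
    print(charge["EMP_NAME"], charge["JOB_CHG_HOUR"], charge["AMOUNT"])
```

The first charge keeps its original amount after the rate change, while the later charge uses the new rate.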

6. Looking at the EMP_NAME and EMP_PHONE contents in Figure P1.5, what change(s)
would you recommend?
A good recommendation would be to make the data more atomic. That is, break up the
data components whenever possible. For example, separate the EMP_NAME into its
components EMP_FNAME, EMP_INITIAL, EMP_LNAME. This change will make it much easier
to organize employee data through the employee’s name component. Similarly, the
EMP_PHONE data should be decomposed into EMP_AREACODE and EMP_PHONE. For
example, breaking up phone number 653-234-3245 into the area code 653 and the phone
number 234-3245 will make it much easier to organize the phone numbers by area code. (If you
want to print an employee phone directory, the more atomic employee name data will make the
job much easier.)

7. Identify the various data sources in the file you examined in Problem 5.
a. Employee data such as names and phone numbers.
b. Project data such as project names. If you start with an EMPLOYEE file, the project
names clearly do not belong in that file. (Project names are clearly not employee
characteristics.)
c. Job data such as the job charge per hour. If you start with an EMPLOYEE file, the job
charge per hour clearly does not belong in that file. (Hourly charges are clearly not
employee characteristics.)
d. The project hours, which are most likely the hours worked by the employee for that
project. (Such hours are associated with a work product, not the employee per se.)

8. Given your answer to Problem 7, what new files should you create to help eliminate the data
redundancies found in the file shown in Figure P1.5?
The data sources are probably PROJECT, EMPLOYEE, JOB, and CHARGE. The
PROJECT file should contain project characteristics such as the project name, the project
manager/coordinator, the project budget, and so on. The EMPLOYEE file might contain the
employee names, phone number, address, and so on. The JOB file would contain the billing
charge per hour for each of the job types – a database designer, an application developer, and
an accountant would generate different billing charges per hour. The CHARGE file would be
used to keep track of the number of hours by job type that will be billed for each employee who
worked on the project.
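
One possible layout for these four files is sketched below as tables in an in-memory SQLite database, using Python's built-in sqlite3 module. The key columns (proj_num, emp_num), the link from EMPLOYEE to JOB, and the numeric key values are assumptions made for illustration; the project name, employee name, job code, rate, and hours repeat the values discussed in Problem 5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript(
    """
    CREATE TABLE project (proj_num INTEGER PRIMARY KEY, proj_name TEXT NOT NULL);
    CREATE TABLE job (job_code TEXT PRIMARY KEY, job_chg_hour REAL NOT NULL);
    CREATE TABLE employee (
        emp_num  INTEGER PRIMARY KEY,
        emp_name TEXT NOT NULL,
        job_code TEXT NOT NULL REFERENCES job (job_code)
    );
    CREATE TABLE charge (
        proj_num   INTEGER NOT NULL REFERENCES project (proj_num),
        emp_num    INTEGER NOT NULL REFERENCES employee (emp_num),
        proj_hours REAL NOT NULL,
        PRIMARY KEY (proj_num, emp_num)
    );
    INSERT INTO project VALUES (1, 'Coast');
    INSERT INTO job VALUES ('EE', 85.00);
    INSERT INTO employee VALUES (10, 'June H. Sattlemeir', 'EE');
    INSERT INTO charge VALUES (1, 10, 17.5);
    """
)

# Each fact is stored once; a billing report is assembled by joining the tables.
report = conn.execute(
    """
    SELECT p.proj_name, e.emp_name, c.proj_hours * j.job_chg_hour
    FROM charge AS c
    JOIN project  AS p ON p.proj_num = c.proj_num
    JOIN employee AS e ON e.emp_num  = c.emp_num
    JOIN job      AS j ON j.job_code = e.job_code
    """
).fetchall()

print(report)
```

In this layout the hourly charge for a job code appears in a single JOB row, so it no longer has to be repeated for every employee and project.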

9. Identify and discuss the serious data redundancy problems exhibited by the file structure
shown in Figure P1.9. (The file is meant to be used as a teacher class assignment schedule.
One of the many problems with data redundancy is the likely occurrence of data inconsistencies
—two different initials have been entered for the teacher named Maria Cordoza.)
Note that the teacher characteristics occur multiple times in this file. For example, the
teacher named Maria Cordoza’s first name, last name, and initial occur 3 times. If changes must
be made for any given teacher, those changes must be made multiple times. All it takes is one
incorrect entry or one forgotten change to create data inconsistencies. Redundant data are not a
luxury you can afford in a data environment.

10. Given the file structure shown in Figure P1.9, what problem(s) might you encounter if
building KOM were deleted?
You would lose all the time assignment data about teachers Williston, Cordoza, and
Hawkins, as well as the KOM rooms 204E, 123, and 34. Here is yet another good reason for
keeping data about specific entities in their own tables. This kind of anomaly is called a
deletion anomaly.
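
The deletion anomaly can be sketched in Python as follows. The building, room codes, and teacher names come from the answer above, but the pairing of teachers with rooms is an assumption, and the last row (building "XYZ", teacher "Smith") is entirely hypothetical.

```python
# Single-file structure: teacher facts exist only inside the schedule rows.
schedule = [
    {"BUILDING_CODE": "KOM", "ROOM_CODE": "204E", "TEACHER_LNAME": "Williston"},
    {"BUILDING_CODE": "KOM", "ROOM_CODE": "123", "TEACHER_LNAME": "Cordoza"},
    {"BUILDING_CODE": "KOM", "ROOM_CODE": "34", "TEACHER_LNAME": "Hawkins"},
    {"BUILDING_CODE": "XYZ", "ROOM_CODE": "1", "TEACHER_LNAME": "Smith"},  # hypothetical
]

# Deleting building KOM removes every row that mentions it.
after_delete = [row for row in schedule if row["BUILDING_CODE"] != "KOM"]

teachers_before = {row["TEACHER_LNAME"] for row in schedule}
teachers_after = {row["TEACHER_LNAME"] for row in after_delete}
teachers_lost = teachers_before - teachers_after

# Separate TEACHER table: deleting a building no longer deletes teacher facts.
teacher_table = set(teachers_before)
still_known = teacher_table  # untouched by the deletion of schedule rows

print(sorted(teachers_lost), sorted(still_known))
```

With teacher data held in its own table, removing building KOM would delete only the assignments made in that building.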
