Imt 37 Oracle
(ORACLE)
ASSIGNMENTS
PART A
Q1. a. Describe what metadata are and what value they provide to the database system.
b. What are the advantages of having the DBMS between the end users' applications and the
database?
c. Discuss some considerations when designing a database.
ANS a) Metadata is "data about data". The term is ambiguous, as it is used for two
fundamentally different concepts (types). Structural metadata is about the design and
specification of data structures and is more properly called "data about the containers of
data"; descriptive metadata, on the other hand, is about individual instances of
application data, the data content. Metadata are traditionally found in the card
catalogs of libraries. As information has become increasingly digital, metadata are also
used to describe digital data using metadata standards specific to a particular discipline.
By describing the contents and context of data files, the quality of the original data/files is
greatly increased. For example, a webpage may include metadata specifying what
language it is written in, what tools were used to create it, and where to go for more on
the subject, allowing browsers to automatically improve the experience of users.
Metadata (metacontent) are defined as the data providing information about one or more
aspects of the data, such as:
Standards used
For example, a digital image may include metadata that describe how large the picture
is, the color depth, the image resolution, when the image was created, and other data.
[1]
A text document's metadata may contain information about how long the document is,
who the author is, when the document was written, and a short summary of the
document.
Metadata are data. As such, metadata can be stored and managed in a database, often
called a Metadata registry or Metadata repository.[2] However, without context and a
point of reference, it might be impossible to identify metadata just by looking at them.
[3]
For example: by itself, a database containing several numbers, all 13 digits long could
be the results of calculations or a list of numbers to plug into an equation - without any
other context, the numbers themselves can be perceived as the data. But if given the
context that this database is a log of a book collection, those 13-digit numbers may now
be identified as ISBNs - information that refers to the book, but is not itself the
information within the book.
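Inside a database system, structural metadata of this kind is kept in the catalog (data dictionary), and the DBMS itself reads it to know what each column means. The following is a minimal sketch, assuming an in-memory SQLite database as a stand-in for any relational DBMS; the `books` table and its sample row are made up for illustration. Oracle exposes comparable information through dictionary views such as `USER_TAB_COLUMNS`.

```python
import sqlite3

# In-memory SQLite database used as a stand-in DBMS (an assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (isbn TEXT PRIMARY KEY, title TEXT NOT NULL)")
conn.execute("INSERT INTO books VALUES ('9780306406157', 'Example title')")

# Structural metadata: column names and declared types, read from the catalog.
# PRAGMA table_info returns (cid, name, type, notnull, dflt_value, pk) per column.
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA table_info(books)")]
print(columns)

# The data itself: without the metadata above, this is just a 13-digit string.
print(conn.execute("SELECT isbn FROM books").fetchone()[0])
```

The catalog query is what tells a reader (or the DBMS) that the 13-digit value is an `isbn` column of a `books` table rather than an arbitrary number.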
The term "metadata" was coined in 1968 by Philip Bagley, in his book "Extension of
programming language concepts" where it is clear that he uses the term in the ISO
11179 "traditional" sense, which is "structural metadata" i.e. "data about the containers of
data"; rather than the alternate sense "content about individual instances of data
content" or metacontent, the type of data usually found in library catalogues. [4][5] Since
then the fields of information management, information science, information technology,
librarianship and GIS have widely adopted the term. In these fields the
word metadata is defined as "data about data".[6] While this is the generally accepted
definition, various disciplines have adopted their own more specific explanation and uses
of the term.
b.) A database management system (DBMS) is a collection of programs
Personnel Advantages
The purpose of the database and how it affects the design. Create a
database plan to fit your purpose.
Maintenance.
Q2. a. List and briefly describe the different types of database maintenance activities.
b. Database backups can be performed at different levels. List and describe these.
c. What are the classical approaches to database design?
b.) The focus in Oracle backup and recovery is generally on the physical backup
of database files, which permit the full reconstruction of your database. The files
protected by the backup and recovery facilities built into Enterprise Manager
include datafiles, control files, server parameter files (SPFILEs), and archived redo
log files. With these your database can be reconstructed. The backup mechanisms
that work at the physical level protect against damage at the file level, such as the
accidental deletion of a datafile or the failure of a disk drive.
c.) There are two approaches for developing any database, the top-down method
and the bottom-up method. While these approaches appear radically different, they
share the common goal of uniting a system by describing all of the interaction
between the processes.
Top down design method
The top-down design method starts from the general and moves to the specific. In
other words, you start with a general idea of what is needed for the system and then
work your way down to the more specific details of how the system will interact. This
process involves the identification of different entity types and the definition of each
entity's attributes.
Two general approaches (top down and bottom up) to the design of the
databases can be heavily influenced by factors like scope, size of the system, the
organization's management style, and the organization's structure. Depending on
such factors, the design of the database might use two very different approaches,
centralized design and decentralized design.
Centralized design
Centralized design is most productive when the data component is composed of a
moderately small number of objects and procedures. The design can be carried out
and represented in a somewhat simple database. Centralized design is typical of a
simple or small database and can be successfully done by a single database
administrator or by a small design team. This person or team will define the
problems, create the conceptual design, verify the conceptual design with the user
views, and define system processes and data constraints to ensure that the design
complies with the organization's goals. That being said, the centralized design is not
limited to small companies. Even large companies can operate within the simple
database environment.
Decentralized design
Decentralized design might best be used when the data component of the system
has a large number of entities and complex relations upon which complex operations
are performed. This is also likely to be used when the problem itself is spread across
many operational sites and the elements are a subset of the entire data set. In large
and complex projects a team of carefully selected designers is employed to get the
job done. This is commonly accomplished by several teams that work on different
subsets or modules of the system. Conceptual models are created by these teams
and compared to the user views, processes, and constraints for each module. Once
all the teams have completed their modules, the modules are aggregated into one
large conceptual model.
Q3. a. Explain the differences between a centralized and decentralized approach to database design.
b. Explain how database designers design and normalize databases.
c. Explain the BCNF. How is it related to other normal forms?
Ans a.) The issue of centralization versus decentralization of computer resources is not
a new one; it has been widely discussed and hotly debated for at least two decades now.
The interest in this issue was originally motivated by the feeling that the computer, a
costly expense in terms of investment and operating budget, should be used to the fullest
possible potential. Interest also grew because it was felt that within a corporation, a
large measure of political
b). Database design is the process of producing a detailed data model of a database.
This logical data model contains all the needed logical and physical design choices and
physical storage parameters needed to generate a design in a Data Definition
Language, which can then be used to create a database. A fully attributed data model
contains detailed attributes for each entity.
The term database design can be used to describe many different parts of the design of
an overall database system. Principally, and most correctly, it can be thought of as the
logical design of the base data structures used to store the data. In the relational
model these are the tables and views. In an object database the entities and
relationships map directly to object classes and named relationships. However, the term
database design could also be used to apply to the overall process of designing, not just
the base data structures, but also the forms and queries used as part of the overall
database application within the database management system (DBMS).[1]
The process of doing database design generally consists of a number of steps which will
be carried out by the database designer. Usually, the designer must:
Superimpose a logical structure upon the data on the basis of these relationships
c.) Boyce-Codd normal form (or BCNF or 3.5NF) is a normal form used in database
normalization. It is a slightly stronger version of the third normal form (3NF). BCNF was
developed in 1974 by Raymond F. Boyce and Edgar F. Codd to address certain types of
anomaly not dealt with by 3NF as originally defined. [1]
Chris Date has pointed out that a definition of what we now know as BCNF appeared in
a paper by Ian Heath in 1971.[2] Date writes:
"Since that definition predated Boyce and Codd's own definition by some three years, it
seems to me that BCNF ought by rights to be called Heath normal form. But it isn't."[3]
Edgar F. Codd released his original paper 'A Relational Model of Data for Large Shared
Data Banks' in June 1970. This was the first time the notion of a relational database was
published. All work after this, including the Boyce-Codd normal form, was based
on this relational model.
Q4. a. What is a schema? How many schemas can be used in one database?
b. What command is used to save changes to the database? What is the syntax for this
command? How do you delete a table from the database? Provide an example.
c. What is a subquery? When is it used? Does the RDBMS deal with subqueries any differently
from normal queries?
Schemas are generally stored in a data dictionary. Although a schema is defined in text
database language, the term is often used to refer to a graphical depiction of the
database structure. In other words, schema is the structure of the database that defines
the objects in the database.
In an Oracle Database system, the term "schema" has a slightly different connotation.
b. ) A transaction is a unit of work that is performed against a database. Transactions are units
or sequences of work accomplished in a logical order, whether in a manual fashion by a user or
automatically by some sort of a database program.
A transaction is the propagation of one or more changes to the database. For example, if you are
creating a record or updating a record or deleting a record from the table, then you are
performing transaction on the table. It is important to control transactions to ensure data integrity
and to handle database errors.
Practically, you will club many SQL queries into a group and you will execute all of them together
as a part of a transaction.
Properties of Transactions:
Transactions have the following four standard properties, usually referred to by the acronym
ACID:
Atomicity: ensures that all operations within the work unit are completed successfully;
otherwise, the transaction is aborted at the point of failure, and previous operations are rolled
back to their former state.
Consistency: ensures that the database properly changes states upon a successfully
committed transaction.
Isolation: enables transactions to operate independently of and transparent to each
other.
Durability: ensures that the result or effect of a committed transaction persists in case of
a system failure.
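The atomicity property in particular is easy to see in code. The following is a minimal sketch, assuming an in-memory SQLite database driven from Python; the `accounts` table, the `transfer` helper and the amounts are hypothetical and not part of the original answer. In Oracle the same effect is obtained with the `COMMIT` and `ROLLBACK` statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES (1, 100), (2, 50)")
conn.commit()  # make the initial rows permanent


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic unit of work (hypothetical helper)."""
    try:
        conn.execute("UPDATE accounts SET balance = balance + ? WHERE id = ?", (amount, dst))
        conn.execute("UPDATE accounts SET balance = balance - ? WHERE id = ?", (amount, src))
        conn.commit()      # both updates become permanent together
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # undo the first update as well
        return False


ok = transfer(conn, 1, 2, 30)       # succeeds
failed = transfer(conn, 1, 2, 500)  # violates the CHECK constraint, so it is rolled back
balances = dict(conn.execute("SELECT id, balance FROM accounts"))
print(ok, failed, balances)
```

The second transfer credits account 2 before the debit fails; the rollback removes that credit too, so the database never shows a half-finished transfer.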
Transaction Control:
The following commands are used to control transactions:
examines data in the database and uses it to produce a report or update the
database. One of the simplest queries is a list of records in a database table. It
looks like the following SQL statement:
SELECT * FROM customers;
This query produces an unsorted list of all the information in the customers
table, record by record. By using the powerful WHERE clause, you can create
selective queries which evaluate the data and list only those records matching
the clause's criteria:
SELECT * FROM customers WHERE state = 'CA';
This query lists only customers from California. The WHERE clause
accommodates very complex conditions, including the results of correlated
sub-queries, for selecting only the data you want.
Sub-Queries
A sub-query is a query in which the WHERE clause itself has its own
query. This is a convenient way to combine information from different
database tables to produce more sophisticated results. The following query
produces a list of only those customers who have placed orders in 2011:
SELECT * FROM customers WHERE customer_code IN (SELECT
customer_code FROM orders WHERE order_date BETWEEN DATE '2011-01-01' AND
DATE '2011-12-31');
Notice that this is a query inside a query. The SELECT statement inside the
parentheses generates a list of customer codes from the orders table. The
outer query uses the customer codes to produce a list of customer names,
addresses and other information. This is a sub-query but not a correlated
sub-query; though the outer query depends on the inner one, a correlated
sub-query also has an inner query that depends on the outer one.
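The difference between the two kinds of sub-query can be tried out directly. The following is a minimal sketch, assuming an in-memory SQLite database run from Python; the sample rows are made up, and dates are stored as ISO-format text because SQLite has no native date type (Oracle would use `DATE` columns instead).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_code INTEGER PRIMARY KEY, name TEXT, state TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_code INTEGER, order_date TEXT);
    INSERT INTO customers VALUES (1, 'Ann', 'CA'), (2, 'Bob', 'NY'), (3, 'Cy', 'CA');
    INSERT INTO orders VALUES (10, 1, '2011-03-15'), (11, 2, '2010-07-01'), (12, 3, '2011-12-31');
""")

# Plain sub-query: the inner SELECT can be evaluated once, independently of the outer rows.
in_2011 = [r[0] for r in conn.execute("""
    SELECT name FROM customers
    WHERE customer_code IN (SELECT customer_code FROM orders
                            WHERE order_date BETWEEN '2011-01-01' AND '2011-12-31')
    ORDER BY name""")]

# Correlated sub-query: the inner SELECT refers to the outer row (c.customer_code).
correlated = [r[0] for r in conn.execute("""
    SELECT c.name FROM customers c
    WHERE EXISTS (SELECT 1 FROM orders o
                  WHERE o.customer_code = c.customer_code
                    AND o.order_date BETWEEN '2011-01-01' AND '2011-12-31')
    ORDER BY c.name""")]

print(in_2011, correlated)
```

Both forms return the same customers here; they differ only in whether the inner query depends on the row currently being examined by the outer query.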
Correlated Sub-Queries
normalization
1NF: This normal form states that there must not be any duplicate rows or repeating
groups in the tables that we use. In other words, every table must have a primary key
defined and every column must hold a single value.
2NF: This normal form states that data redundancy can be reduced if attributes
that depend on only one part of a composite primary key are moved to a
separate table. Not only does this reduce data redundancy, it also helps retain
data when a delete is done. For example, consider a table that has the
following columns: Part Id, State, City, and Country. Here, assume Part Id and Country
form the composite primary key. The attributes State and City depend only on Country.
2NF states that in such a case the table should be split into two tables: one with Part Id
and Country as the columns, the other with Country, State and City as the columns. In the
original single table, deleting all the rows with Part Id = X would also lose the
country-related data. With the two-table design this would not happen.
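The split described for 2NF can be sketched as follows. This is an illustration only, assuming an in-memory SQLite database run from Python; the table names `part_country` and `country_location` and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Table 1: only the composite key columns remain.
    CREATE TABLE part_country (part_id TEXT, country TEXT, PRIMARY KEY (part_id, country));
    -- Table 2: attributes that depend on Country alone.
    CREATE TABLE country_location (country TEXT PRIMARY KEY, state TEXT, city TEXT);
    INSERT INTO part_country VALUES ('X', 'India'), ('Y', 'USA');
    INSERT INTO country_location VALUES ('India', 'Maharashtra', 'Mumbai'),
                                        ('USA', 'California', 'Fresno');
""")

# Deleting every row for part X no longer removes what is known about the country.
conn.execute("DELETE FROM part_country WHERE part_id = 'X'")
remaining = [r[0] for r in conn.execute("SELECT country FROM country_location ORDER BY country")]
print(remaining)
```

After the delete, the row for the country that only part X referred to is still present in `country_location`, which is the retention benefit the text describes.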
b.) A backup, or the process of backing up, refers to the copying and archiving of
computer data so it may be used to restore the original after a data loss event. The verb
form is to back up in two words, whereas the noun is backup.[1]
Backups have two distinct purposes. The primary purpose is to recover data after its
loss, be it by data deletion or corruption. Data loss can be a common experience of
computer users. A 2008 survey found that 66% of respondents had lost files on their
home PC.[2] The secondary purpose of backups is to recover data from an earlier time,
according to a user-defined data retention policy, typically configured within a backup
application for how long copies of data are required. Though backups popularly
represent a simple form of disaster recovery, and should be part of a disaster recovery
plan, backups by themselves should not be considered disaster recovery.[3] One
reason for this is that not all backup systems or backup applications are able to
reconstitute a computer system or other complex configurations such as a computer
cluster, active directory servers, or a database server, by restoring only data from a
backup.
Since a backup system contains at least one copy of all data worth saving,
the data storage requirements can be significant. Organizing this storage space and
managing the backup process can be a complicated undertaking. A data repository
model can be used to provide structure to the storage. Nowadays, there are many
different types of data storage devices that are useful for making backups. There are
also many different ways in which these devices can be arranged to provide geographic
redundancy, data security, and portability.
Before data is sent to its storage location, it is
selected, extracted, and manipulated. Many different techniques have been developed
to optimize the backup procedure. These include optimizations for dealing with open files
and live data sources as well as compression, encryption, and de-duplication, among
others. Every backup scheme should include dry runs that validate the reliability of the
data being backed up. It is important to recognize the limitations and human factors
involved in any backup scheme.
c.). In a database environment such as Adabas, the same data is used by many
PART B
Q1. a. Explain heterogeneous distributed database systems.
b. A fully distributed database management system must perform all of the functions of a
centralized DBMS. Do you agree? Why or why not?
c. Describe the five types of users identified in a database system.
Technical heterogeneity
Different file formats, access protocols, query languages etc. Often called syntactic
heterogeneity from the point of view of data.
Semantic heterogeneity
Data across constituent databases may be related but different. Perhaps a database
system must be able to integrate genomic and proteomic data. They are related (a gene
may have several protein products) but the data are different (nucleotide sequences
and amino acid sequences, or hydrophilic or hydrophobic amino acid sequences and positively
or negatively charged amino acids). There may be many ways of looking at semantically
similar, but distinct, datasets.
b.) A distributed database is a database in which storage devices are not all attached
to a common processing unit such as the CPU,[1] controlled by a distributed database
management system (together sometimes called a distributed database system). It
may be stored in multiple computers, located in the same physical location; or may be
c.) Software refers to the collection of programs used within the database
system. It includes the operating system, DBMS Software, and application
programs and utilities.
Operating System
DBMS Software
Application Programs and Utilities
The operating system manages all the hardware components and makes it
possible for all other software to run on the computers. UNIX, Linux and
Microsoft Windows are popular operating systems used in database
environments.
DBMS software manages the database within the database system. Oracle
Corporation's Oracle Database and MySQL, IBM's DB2, and Microsoft's Access and
SQL Server are popular DBMS (RDBMS) products used in database
environments.
Application programs and utilities software are used to access and
manipulate the data in the database and to manage the operating
environment of the database.
People in a Database System Environment
The people component includes all users associated with the database system. On
the basis of primary job function we can identify five types of users in a
database system: System Administrators, Database Administrators, Data
Modelers, System Analysts and Programmers, and End Users.
System Administrators
Data Modelers
Database Administrators
System Analysts and Programmers
End Users
System Administrators oversee the database system's general operations.
database design, implementation, and use. This central control and coordination is
the role of the database administrator (DBA).
This part of the DBA documentation describes the roles of the DBA, the authority
and responsibility the DBA might have, the skills needed, the procedures,
standards, and contacts the DBA may need to create and maintain.
In the context of this documentation, the DBA is a single person; however, large
organizations may divide DBA responsibilities among a team of personnel, each
with specific skills and areas of responsibility such as database design, tuning, or
problem resolution. The ability of the database administrator (DBA) to work
effectively depends on the skill and knowledge the DBA brings to the task, and the
role the DBA has on the overall Information Systems (IS) operation. This section
describes how best to define the DBA role, discusses the relationship of the DBA
to the IS organization, and makes suggestions for taking advantage of that
relationship.
Position of the DBA in the Organization
The DBA should be placed high enough in the organization to exercise the
necessary degree of control over the use of the database and to communicate at the
appropriate level within user departments. However, the DBA should not be remote
b. ) A deadlock is a situation in which two or more competing actions are each waiting
for the other to finish, and thus neither ever does.
In computer science a deadly embrace is a deadlock involving exactly two competing
actions. It is a term more commonly used in Europe.
In a transactional database, a deadlock happens when two processes,
each within its own transaction, update two rows of information but in the opposite order.
For example, process A updates row 1 then row 2 in the exact timeframe that process B
updates row 2 then row 1. Process A can't finish updating row 2 until process B is
finished, but process B cannot finish updating row 1 until process A finishes. No matter how much
time is allowed to pass, this situation will never resolve itself and because of
this database management systems will typically kill the transaction of the process that
has done the least amount of work.
In an operating system, a deadlock is a situation which occurs when
a process or thread enters a waiting state because a resource requested is being held
by another waiting process, which in turn is waiting for another resource. If a process is
unable to change its state indefinitely because the resources requested by it are being
used by another waiting process, then the system is said to be in a deadlock. [1]
Deadlock is a common problem in multiprocessing systems, parallel
computing and distributed systems, where software and hardware locks are used to
handle shared resources and implement process synchronization.
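One common way for a system to notice such a situation is to look for a cycle in a "wait-for" graph, in which each process points at the processes it is waiting on. The following is a minimal sketch of that idea, not a description of how any particular DBMS implements it; the function name `find_deadlock` and the dictionary representation of the graph are assumptions made for illustration.

```python
def find_deadlock(wait_for):
    """Return a list of processes forming a cycle in the wait-for graph, or None.

    `wait_for` maps each process to the processes it is waiting on
    (hypothetical representation chosen for this sketch).
    """
    visiting, done = set(), set()

    def visit(node, path):
        if node in done:          # already fully explored, no cycle through here
            return None
        if node in visiting:      # reached a node on the current path: a cycle
            return path[path.index(node):]
        visiting.add(node)
        path.append(node)
        for nxt in wait_for.get(node, ()):
            cycle = visit(nxt, path)
            if cycle:
                return cycle
        path.pop()
        visiting.discard(node)
        done.add(node)
        return None

    for start in wait_for:
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


# Process A waits for B (row 2) while B waits for A (row 1): a deadlock.
cycle = find_deadlock({"A": ["B"], "B": ["A"]})
# A waits for B, but B is not waiting on anything: no deadlock.
no_cycle = find_deadlock({"A": ["B"], "B": []})
print(cycle, no_cycle)
```

Once a cycle is found, the system can break it by aborting one of the transactions in the cycle, as the text describes.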
c.)
slightly later, and this apparent disadvantage is insignificant and disappears next to the
advantages of SS2PL.
Thus, the importance of the general Two-phase locking (2PL) is historic only,
while Strong strict two-phase locking (SS2PL) is practically the important mechanism
and resulting schedule property. A lock is a system object associated with a shared
resource such as a data item of an elementary type, a row in a database, or a page of
memory. In a database, a lock on a database object (a data-access lock) may need to
be acquired by a transaction before accessing the object. Correct use of locks prevents
undesired, incorrect or inconsistent operations on shared resources by other concurrent
transactions. When a database object with an existing lock acquired by one transaction
needs to be accessed by another transaction, the existing lock for the object and the
type of the intended access are checked by the system. If the existing lock type does not
allow this specific attempted concurrent access type, the transaction attempting access
is blocked (according to a predefined agreement/scheme). In practice a lock on an
object does not directly block a transaction's operation upon the object, but rather blocks
that transaction from acquiring another lock on the same object, needed to be
held/owned by the transaction before performing this operation. Thus, with a locking
mechanism, needed operation blocking is controlled by a proper lock blocking scheme,
which indicates which lock type blocks which lock type.
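A lock blocking scheme of this kind can be written down as a small compatibility table. The sketch below assumes the two usual lock modes, shared (read) and exclusive (write); the mode names, the `COMPATIBLE` table and the `can_grant` helper are illustrative and are not taken from the text above.

```python
# COMPATIBLE[(held, requested)] is True when `requested` can be granted to one
# transaction while another transaction already holds `held` on the same object.
COMPATIBLE = {
    ("shared", "shared"): True,        # many readers may coexist
    ("shared", "exclusive"): False,    # a writer must wait for readers
    ("exclusive", "shared"): False,    # readers must wait for a writer
    ("exclusive", "exclusive"): False, # only one writer at a time
}


def can_grant(held_modes, requested):
    """Grant a request only if it is compatible with every lock already held."""
    return all(COMPATIBLE[(held, requested)] for held in held_modes)


print(can_grant([], "exclusive"))                  # no locks held
print(can_grant(["shared", "shared"], "shared"))   # another reader joins
print(can_grant(["shared"], "exclusive"))          # writer blocked by a reader
```

A transaction whose request cannot be granted is blocked until the conflicting locks are released.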
Two major types of locks are utilized:
The common interactions between these lock types are defined by blocking behavior as
follows:
Q3. a. Describe a conceptual model and its advantages. What is the most widely used conceptual
model?
b. What is a key and why is it important in the relational model? Describe the use of nulls in a database.
c. Explain single-valued attributes and provide an example. Explain the difference between simple
and composite attributes. Provide at least one example of each.
Ans. a.) In the most general sense, a model is anything used in any way to represent
anything else. Some models are physical objects, for instance, a toy model which may
be assembled, and may even be made to work like the object it represents. A conceptual
model, by contrast, is a model made of the composition of concepts, and thus exists
only in the mind. Conceptual models are used to help us know, understand,
or simulate the subject matter they represent.
The term conceptual model may be used to refer to models which are formed after a
conceptualization process in the mind. Conceptual models represent human intentions
or semantics. Conceptualization from observation of physical existence and
conceptual modeling are the necessary means humans employ to think and solve
problems. Concepts are used to convey semantics during various natural-language-based
communication. Since a concept might map to multiple
semantics by itself, an explicit formalization is usually required for identifying and
locating the intended semantics among several candidates, to avoid misunderstandings and
confusion in conceptual models. The term "conceptual model" is ambiguous. It could
mean a model of concept or it could mean a model that is conceptual. A distinction can
be made between what models are and what models are models of. With the exception
of iconic models, such as a scale model of Winchester Cathedral, most models are
concepts. But they are, mostly, intended to be models of real world states of affairs. The
value of a model is usually directly proportional to how well it corresponds to a past,
present, future, actual or potential state of affairs. A model of a concept is quite different
because in order to be a good model it need not have this real world correspondence.
[2]
Models of concepts are usually built by analysts who are not primarily concerned
about the truth or falsity of the concepts being modeled. For example, in management
problem structuring, Conceptual Models of human activity systems are used in Soft
systems methodology to explore the viewpoints of stakeholders in the client
organization. In artificial intelligence, conceptual models and conceptual graphs are used
for building expert systems and knowledge-based systems; here the analysts are
concerned to represent expert opinion on what is true, not their own ideas on what is
true.
b.) The relational model for database management is a database model based
on first-order predicate logic, first formulated and proposed in 1969 by Edgar F. Codd.[1]
[2]
In the relational model of a database, all data is represented in terms of tuples,
grouped into relations. A database organized in terms of the relational model is
a relational database.
In the relational model, related records are linked together with a "key".
The purpose of the relational model is to provide a declarative method for specifying
data and queries: users directly state what information the database contains and what
information they want from it, and let the database management system software take
care of describing data structures for storing the data and retrieval procedures for
answering queries.
Most relational databases use the SQL data definition and query language; these
systems implement what can be regarded as an engineering approximation to the
relational model. A table in an SQL database schema corresponds to a predicate
variable; the contents of a table to a relation; key constraints, other constraints, and SQL
queries correspond to predicates. However, SQL databases, including DB2, deviate from
the relational model in many details, and Codd fiercely argued against deviations that
compromise the original principles.
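How a key links related records, and how a null behaves in such a link, can be shown with a small experiment. The following is a minimal sketch, assuming an in-memory SQLite database run from Python; the `department` and `employee` tables and their rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
    CREATE TABLE department (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL,
                           dept_id INTEGER REFERENCES department(dept_id));
    INSERT INTO department VALUES (1, 'Sales');
    INSERT INTO employee VALUES (100, 'Ann', 1), (101, 'Bob', NULL);
""")

# The key value 1 links Ann's record to the Sales record.
joined = conn.execute("""
    SELECT e.name, d.name FROM employee e
    JOIN department d ON d.dept_id = e.dept_id""").fetchall()

# A null means "no value": it is never equal to anything, so it must be tested with IS NULL.
equals_null = conn.execute("SELECT COUNT(*) FROM employee WHERE dept_id = NULL").fetchone()[0]
is_null = conn.execute("SELECT COUNT(*) FROM employee WHERE dept_id IS NULL").fetchone()[0]
print(joined, equals_null, is_null)
```

Bob has no department, so his row drops out of the join, and only the `IS NULL` test finds him.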
c.) For
Source: http://cnx.org/content/m28250/latest/
TYPES OF ENTITIES
EmployeePhone(EID, Phone)
ATTRIBUTES
[Figure: specialization lattice of Employee with subclasses Secretary, Technician, Engineer, Manager, Salaried_Emp and Hourly_Emp; Engineering_Manager is a shared subclass]
For example, in the figure above, Engineer is a subclass of Employee, but also
a super class of Engineering Manager.
This means that every Engineering Manager must also be an Engineer.
Specialization Hierarchy has the constraint that every subclass
participates as a subclass in only one class/subclass relationship, i.e. that
each subclass has only one parent. This results in a tree structure.
Specialization Lattice allows a subclass to participate as a
subclass in more than one class/subclass relationship. The figure shown
above is a specialization lattice, because Engineering_Manager
has more than one parent class.
In a lattice or hierarchy, the subclass inherits the attributes not only of the
direct superclass, but also all of the predecessor super classes all the way to
the root.
A subclass with more than one super class is called a shared subclass. This
leads to multiple inheritance, where a subclass inherits attributes from
multiple classes.
In a lattice, a subclass may inherit attributes from more than one
superclass, and some attributes may be inherited more than once via different
paths (e.g. Engineer, Manager and Salaried Employee all inherit attributes from
Employee, which are then inherited by Engineering Manager).
In this situation, the attributes are included only once in the subclass.
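The same situation arises with multiple inheritance in a programming language. As a loose analogy only (an EER diagram is not a class hierarchy in code), the following Python sketch shows a shared subclass reaching the root class by three paths while still getting its attributes once; the class names follow the example and the `name` attribute is made up.

```python
class Employee:
    def __init__(self, name):
        self.name = name  # attribute defined once, at the root of the lattice


class Engineer(Employee):
    pass


class Manager(Employee):
    pass


class SalariedEmp(Employee):
    pass


class EngineeringManager(Engineer, Manager, SalariedEmp):
    """Shared subclass: three superclasses, one common root."""


em = EngineeringManager("Dana")
# Method resolution order: Employee is reachable via three paths but listed once.
mro_names = [cls.__name__ for cls in EngineeringManager.__mro__]
print(mro_names)
print(em.name)
```

Every `EngineeringManager` object is at once an `Engineer`, a `Manager` and a `SalariedEmp`, yet it carries a single `name` attribute.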
There are situations when you would like to model a relationship where a
single subclass has more than one super class, and where each super class
represents a different entity type.
The subclass will represent a collection of objects that is a subset of the
UNION of the distinct entity types.
This type of subclass is a union type, or category subclass.
See Text Example, page 99.
A category has two or more super classes that may be distinct entity types,
where other super class/subclass relationships have only one super class.
If we compare the Engineering Manager subclass, it is a sub class of each of
the three super classes, Engineer, Manager and Salaried employee, and
inherits the attributes of all three. An entity that exists in Engineering
Manager exists in all three super classes. This represents the constraint that
an Engineering Manager must be an Engineer, a Manager, AND a Salaried
Employee. It can be thought of as an AND condition.
By contrast, a category is a subset of the union of its super classes. This means that an
entity that is a member of a category exists in ONLY ONE of the super classes. An
owner may be a Company, OR a Bank OR a PERSON, but not more than one.
A category can be partial or total. A total category holds the union of ALL its
super classes, where a partial category can hold a subset of the union.
If a category is total, it can also be represented as a total specialization.
[Figure: two diagrams. 1. Vehicle with subclasses Car and Truck. 2. Registered_Vehicle as a category of Car and Truck.]
The first example implies that every car and truck is also a vehicle. In the
second example, a registered vehicle can be a car or a truck, but not every car
or truck is a registered vehicle.
Other examples:
University (Researcher)
Example 4.21
Design a database to keep track of information for an art museum. Assume that
the following requirements were collected.
The museum has a collection of ART_OBJECTS. Each art object has a unique
IDNo, an Artist (if known), a Year (when created, if known), a Title and a
Description. The art objects are categorized in several ways, as discussed
below.
ART_OBJECTS are categorized based on types. There are three main types,
Painting, Sculpture and Statue, plus an Other category for those that don't
fit into one of the categories above.
A PAINTING has a PaintType (oil, watercolor, etc) a material on which it is
CrawnOn (paper, canvas, wood) and Style (modern, abstract etc)
A SCULPTURE or a STATUE has a Material from which it was created (wood,
stone, etc) Height, Weight and Style.
An art object in the OTHER category has a Type(print, photo, etc) and Style.
ART_OBJECTS are also categorized as PERMANENT_COLLECTION, which are
owned by the museum (DateAcquired, whether it is OnDisplay or Stored and
Cost) or BORROWED, which has information on the Collection (where it was
borrowed from), DateBorrowed, and DateReturned.
ART_OBJECTS also have information describing their country/culture using
information on country/culture of Origin (Italian, Egyptian, American, Indian,
etc.) and Period (Renaissance, Modern, Ancient).
The museum keeps track of ARTISTS' information, if known: Name,
DateBorn, DateDied, CountryOfOrigin, Period, MainStyle and Description. The
name is assumed unique.
Different EXHIBITIONS occur, each having a Name, StartDate and EndDate.
EXHIBITIONS are related to all the art objects that were on display during the
exhibition.
Information is kept on other COLLECTIONS with which the museum interacts,
including Name (unique), Type (museum, personal, etc.), Description,
Address, Phone and ContactPerson.
b.) A composite table represents the result of accessing one or more tables in a query.
If a query contains a single table, only one composite table exists. If one or more joins
are involved, an outer composite table consists of the intermediate result rows from the
previous join step. This intermediate result might, or might not, be materialized into a
work file.
The new table (or inner table) in a join operation is the table that is newly accessed in
the step.
A join operation can involve more than two tables. In these cases, the operation is
carried out in a series of steps. For non-star joins, each step joins only two tables.
Sometimes DB2 has to materialize a result table when an outer join is used in
conjunction with other joins, views, or nested table expressions. You can tell when this
happens by looking at the TABLE_TYPE and TNAME columns of the plan table. When
materialization occurs, TABLE_TYPE contains a W, and TNAME shows the name of the
c.) Stored procedures are generally non-portable, meaning they are specific to a
particular RDBMS. As a matter of fact, stored procedures tend to be specific to a
particular VERSION of a particular RDBMS.
The development tools for the lifecycle of stored procedures tend to be very limited
compared to the tools available for general programming languages/platforms. The
tools are lacking in contextual help, in storage of the code, in debugging, in
refactoring, etc.
The languages for writing stored procedures tend to be very limited compared to
general programming languages/platforms. They tend to be procedural, lack many
operations, lack most common APIs, and lack many syntax advances (classes,
scope, etc.). This has changed somewhat with the introduction of Java into Oracle
and .NET into SQL Server.
So, as a general rule, avoid writing stored procedures; writing your code in a general
programming environment is more desirable. Use stored procedures when you need
their particular advantages, which mainly means high-performance and/or
tightly-isolated data processing. A typical system will then have maybe a stored procedure
or two, but definitely not dozens to hundreds.
EDIT: Clarification...
Please note that I am addressing enterprise-class development in-the-large. If you
have a tiny application and a few toy stored procedures, then you can probably
ignore everyone's advice. I am assuming that the question is being asked for
non-trivial scenarios.
I have dealt with every significant RDBMS over a period of nearly twenty years. I
have dealt with databases up to 138 TBs, and individual tables of 8 TBs. I have
worked with systems exceeding one thousand SPs. I have converted such
databases across major versions and across major vendors. I am an architect, DBA,
and just a programmer. If you want the benefit of such experience, then here it is. If
not, fair enough.
EDIT: Expounding...
Nearly everything done in a stored procedure can be done by issuing comparable
SQL statements from an application, particularly including anonymous procedure
blocks (the guts of an SP without the name and permanence). Doing it well can
avoid the problems and limitations of stored procedures while still retaining most of
the benefits.
However, don't forget that bad code can be written in any language, so it is just as
possible to write bad SP code as it is to write bad application code. Indeed, based on
history and reports of observations in the wild, it seems even more likely to write bad
SP code.
EDIT: @Chris Lively: regarding putting database code where the DBA can apply his
tools...
Crippling your application development by using the DBA's limited tools is not an
advantage or a step forward, nor is it even necessary.
Besides that, having been a senior DBA/architect for about twenty years, I am not
generally impressed with what most DBAs do with database code in the applications
that they support. I have mentored a lot of DBAs and programmers regarding
database code, so please let me describe what I encourage them to do.
Every DBA should know how to make the database engine show them every SQL
statement that is executed, regardless of source (inside or outside the engine), and
they should know how to analyze that SQL's performance characteristics. I
recommend that every programmer learn to do the same. If you can do this, then it
no longer matters where the SQL originated, so Chris' recommendation to put the
SQL in a SP is null and void.
If the performance of your system matters, such as when several million customers
depend on it every day, then you should be checking the performance of every piece
of SQL before it gets deployed to production. I recommend doing so as part of the
automated tests that can be run as a part of the automated build for the system.
For example, it is very easy to configure an Ant build script to issue each piece of
SQL to the database engine for an execution plan analysis. I like to save each
execution plan to a text file and commit it to source control, where I can readily see a
history of changes. I also make the build script check the execution plan against
some simple criteria to ensure that SQL changes have not altered or compromised
the performance.
Likewise, I check all my SQL into source control, and I make it easily available both
to my application (for execution) and to my build script (for verification). At a
minimum, my build script for the database can recreate the entire structure from
scratch, and I often make it capable of loading or transferring data as well.
Obviously, I can handle stored procedures, but they are just one tool among many. It
is a mistake (an antipattern) to treat SPs as a Golden Hammer.
On the other hand, when the performance really matters, a stored procedure can
often be the best and even the only option. For example, when I redesigned a
database recently for a major telecommunications provider, a stored procedure was
an essential part of the strategy. I was loading forty thousand data files per day,
totaling forty million rows, into a single database table (8 TB) that was growing past
two billion rows of current data. A public-facing web site accessed that data via a
web service, which required pulling a handful of rows from those two billion within
just a few seconds. This was done using Oracle 10g, a custom C application,
external tables, some bulk data loading, and a stored procedure. However, most of
the database code was still in the C application and the stored procedure handled
just one specific, performance-intensive piece.
The quickest way to retrieve data from a table is to have a column in the table
whose data uniquely identifies a row. By using this column and a specific value in
the WHERE condition of a SELECT statement, the Oracle engine can identify
and retrieve the row quickly.
To achieve this, a constraint is attached to a specific column in the table that ensures
that the column is never left blank and that the data in the column are unique. Because
data entry is done by people, duplicate values are quite likely to be entered.
If the value to be entered is machine generated, it will always fulfill the constraints
and the row will always be accepted for storage. Sequences therefore play an important
role in generating unique values.
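The idea can be tried out with a small sketch. It assumes Python's built-in sqlite3 module as a stand-in for Oracle, so the key generator is SQLite's AUTOINCREMENT rather than an Oracle sequence (in Oracle the comparable tools are CREATE SEQUENCE and seq_name.NEXTVAL). The customer table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The primary key enforces "never blank and always unique" on the id column.
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL)"
)

# Machine-generated keys: no id is supplied, so the engine assigns one.
conn.execute("INSERT INTO customer (name) VALUES ('Asha')")
conn.execute("INSERT INTO customer (name) VALUES ('Ravi')")
ids = [row[0] for row in conn.execute("SELECT id FROM customer ORDER BY id")]

# A hand-typed key that duplicates an existing one is rejected by the constraint.
try:
    conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# The unique column lets one specific row be picked out directly.
row = conn.execute("SELECT name FROM customer WHERE id = ?", (2,)).fetchone()
```

The generated keys never collide, while the manually entered duplicate is refused, which is the point the paragraph above makes.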
Features
Triggers are similar to stored procedures, discussed in Chapter 14, "Procedures and
Packages". A trigger can include SQL and PL/SQL statements to execute as a unit
and can invoke stored procedures. However, procedures and triggers differ in the
way that they are invoked. While a procedure is explicitly executed by a user,
application, or trigger, one or more triggers are implicitly fired (executed) by
Oracle when a triggering INSERT, UPDATE, or DELETE statement is issued, no
matter which user is connected or which application is being used.
For example, Figure 15 - 1 shows a database application with some SQL
statements that implicitly fire several triggers stored in the database.
Figure 15 - 1. Triggers
Notice that triggers are stored in the database separately from their associated
tables.
Triggers can be defined only on tables, not on views. However, triggers on the base
table(s) of a view are fired if an INSERT, UPDATE, or DELETE statement is
issued against a view.
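As a rough illustration of implicit firing, the sketch below assumes Python's sqlite3 module in place of Oracle. The trigger syntax is SQLite's (Oracle writes the row references as :OLD and :NEW and uses PL/SQL in the body), and the emp and sal_audit tables are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE emp (empno INTEGER PRIMARY KEY, sal REAL);
CREATE TABLE sal_audit (empno INTEGER, old_sal REAL, new_sal REAL);

-- The trigger is stored in the database, separately from the table it watches.
CREATE TRIGGER emp_sal_audit
AFTER UPDATE OF sal ON emp
FOR EACH ROW
BEGIN
    INSERT INTO sal_audit VALUES (OLD.empno, OLD.sal, NEW.sal);
END;

INSERT INTO emp VALUES (1, 1000), (2, 2000);
""")

# The application only issues an UPDATE; it never calls the trigger by name.
conn.execute("UPDATE emp SET sal = sal * 2 WHERE empno = 1")

# The audit row was written implicitly by the trigger.
audit_rows = conn.execute(
    "SELECT empno, old_sal, new_sal FROM sal_audit"
).fetchall()
```

A stored procedure would have needed an explicit call to produce the same audit row; the trigger fires no matter which user or application issues the UPDATE.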
How Triggers Are Used
Oracle Forms triggers are part of an Oracle Forms application and are fired only
when a specific trigger point is executed within a specific Oracle Forms
application. SQL statements within an Oracle Forms application, as with any
database application, can implicitly cause the firing of any associated database
trigger. For more information about Oracle Forms and Oracle Forms triggers, see
the Oracle Forms User's Guide.
Triggers vs. Declarative Integrity Constraints
hr.employees
WHERE department_id IN
(SELECT department_id
FROM hr.departments
WHERE location_id = 1800);
PART C
Q1. a.) Explain ORDER BY and GROUP BY clause with an example.
b.) What is the difference between a nonprocedural language and a procedural language? Give an
example of each.
Ans a.) The GROUP BY clause will gather all of the rows together that
contain data in the specified column(s) and will allow aggregate functions
to be performed on one or more columns. This can best be explained
by an example:
GROUP BY clause syntax:
SELECT column1,
SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list";
Let's say you would like to retrieve the highest salary in
each dept:
SELECT max(salary), dept
FROM employee
GROUP BY dept;
This statement will select the maximum salary for the people in each
unique department. Basically, the salary for the person who makes the
most in each department will be displayed. Their salary and their
department will be returned.
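A runnable sketch of the same query, assuming Python's sqlite3 module and a few made-up employee rows. An ORDER BY clause is added to sort the grouped result, since the question asks about both clauses; the alias top_salary is an assumption of this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")

# Hypothetical sample data.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [
        ("Anil", "Sales", 30000),
        ("Bina", "Sales", 45000),
        ("Chetan", "IT", 60000),
        ("Divya", "IT", 52000),
        ("Esha", "HR", 38000),
    ],
)

# GROUP BY forms one group per department; MAX is evaluated within each group.
# ORDER BY then sorts the grouped rows, highest salary first.
rows = conn.execute(
    """
    SELECT dept, MAX(salary) AS top_salary
    FROM employee
    GROUP BY dept
    ORDER BY top_salary DESC
    """
).fetchall()
```

GROUP BY decides which rows are combined, while ORDER BY only decides the order in which the final rows are returned.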
Multiple Grouping Columns - What if I wanted to display their lastname
too?
For example, take a look at the items_ordered table. Let's say you want
to group everything of quantity 1 together, everything of quantity 2
together, everything of quantity 3 together, etc. If you would like to
determine what the largest cost item is for each grouped quantity (all
quantity 1's, all quantity 2's, all quantity 3's, etc.), you would enter:
SELECT quantity, max(price)
FROM items_ordered
GROUP BY quantity;
Enter the statement above, and take a look at the results to see if it
returned what you were expecting. Verify that the maximum price in each
Quantity Group is really the maximum price.
Review Exercises
1. How many people are in each unique state in the customers table? Select
the state and display the number of people in each. Hint: count is used to
count rows in a column, sum works on numeric data only.
2. From the items_ordered table, select the item, maximum price, and
minimum price for each specific item in the table. Hint: The items will
need to be broken up into separate groups.
3. How many orders did each customer make? Use the items_ordered table.
Select the customerid, number of orders they made, and the sum of their
orders.
Q2. a.)
What are views? Most database management systems support the creation of views. Give
reasons.
b.) How can you use the COMMIT, SAVEPOINT and ROLLBACK commands to support transactions?
Ans a). A database is an organized collection of data. The data are typically
organized to model relevant aspects of reality in a way that supports processes requiring
this information. For example, modeling the availability of rooms in hotels in a way that
supports finding a hotel with vacancies.
Database management systems (DBMSs) are specially designed applications that
interact with the user, other applications, and the database itself to capture and analyze
data. A general-purpose database management system (DBMS) is a software system
designed to allow the definition, creation, querying, update, and administration of
databases. Well-known DBMSs
include MySQL, MariaDB, PostgreSQL, SQLite, Microsoft SQL
Server, Oracle, SAP, dBASE, FoxPro, IBM DB2, LibreOffice Base and FileMaker Pro. A
database is not generally portable across different DBMSs, but different DBMSs can
interoperate by using standards such as SQL and ODBC or JDBC to allow a single
application to work with more than one database.
b.) If a JDBC Connection is in auto-commit mode, which it is by default, then every SQL statement
is committed to the database upon its completion.
That may be fine for simple applications, but there are three reasons why you may want to turn
off auto-commit and manage your own transactions:
To increase performance
If you pass a boolean false to setAutoCommit( ), you turn off auto-commit. You can pass a boolean
true to turn it back on again.
For example, if you have a Connection object named conn, code the following to turn off
auto-commit:
conn.setAutoCommit(false);
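Once auto-commit is off, COMMIT, SAVEPOINT and ROLLBACK decide what becomes permanent. The sketch below is not JDBC: it assumes Python's sqlite3 module, opened with isolation_level=None so that the transaction statements are issued by hand. The account table and the savepoint name after_debit are hypothetical.

```python
import sqlite3

# isolation_level=None stops the module from opening transactions on its own,
# so BEGIN / SAVEPOINT / ROLLBACK / COMMIT below are fully under our control.
conn = sqlite3.connect(":memory:", isolation_level=None)
cur = conn.cursor()
cur.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
cur.execute("INSERT INTO account VALUES (1, 100), (2, 50)")

cur.execute("BEGIN")
cur.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")

# Mark a point inside the transaction that we can return to.
cur.execute("SAVEPOINT after_debit")
cur.execute("UPDATE account SET balance = balance + 999 WHERE id = 2")  # wrong amount

# Undo only the work done since the savepoint; the debit above is kept.
cur.execute("ROLLBACK TO SAVEPOINT after_debit")
cur.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")

# Make the whole transaction permanent.
cur.execute("COMMIT")

balances = cur.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
```

A plain ROLLBACK in place of the COMMIT would have discarded the debit as well, leaving both balances unchanged.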
Q3. a.)
What is an index? What are the disadvantages of using an index?
b. )Describe the format for the UPDATE command.
Ans a.) There are tradeoffs to almost any feature in computer programming, and
indexes are no exception. While indexes provide a substantial performance benefit to
searches, there is also a downside to indexing. Let's talk about some of those
drawbacks now.
Indexes and Disk Space
Indexes are stored on the disk, and the amount of space required will depend on the
size of the table, and the number and types of columns used in the index. Disk
space is generally cheap enough to trade for application performance, particularly
when a database serves a large number of users. To see the space required for a
table, use the sp_spaceused system stored procedure in a query window.
EXEC sp_spaceused Orders
Given a table name (Orders), the procedure will return the amount of space used by
the data and all indexes associated with the table, like so:
Name     rows   reserved   data     index_size   unused
-------  -----  ---------  -------  -----------  -------
Orders   830    504 KB     160 KB   320 KB       24 KB
According to the output above, the table data uses 160 kilobytes, while the table
indexes use twice as much, or 320 kilobytes. The ratio of index size to table size can
vary greatly, depending on the columns, data types, and number of indexes on a
table.
Indexes and Data Modification
b. ) To create a PROC SQL table from a query result, use a CREATE TABLE statement,
and place it before the SELECT statement. When a table is created this way, its data is
derived from the table or view that is referenced in the query's FROM clause. The new
table's column names are as specified in the query's SELECT clause list. The column
attributes (the type, length, informat, and format) are the same as those of the selected
source columns.
The following CREATE TABLE statement creates the DENSITIES table from the
COUNTRIES table. The newly created table is not displayed in SAS output unless you
query the table. Note the use of the OUTOBS option, which limits the size of the
DENSITIES table to 10 rows.
proc sql outobs=10;
title 'Densities of Countries';
create table sql.densities as
select Name 'Country' format $15.,
Population format=comma10.0,
Area as SquareMiles,
Population/Area format=6.2 as Density
from sql.countries;
The following DESCRIBE TABLE statement writes a CREATE TABLE statement to the
SAS log:
proc sql;
describe table sql.densities;
In this form of the CREATE TABLE statement, assigning an alias to a column renames
the column, while assigning a label does not. In this example, the Area column has been
renamed to SquareMiles, and the calculated column has been named Density.
However, the Name column retains its name, and its display label is Country.
Q4. a. )
Describe the format of the ALTER TABLE command to add a new column.
Ans a. ) When declaring string or binary column types the maximum size must be
specified. The following example declares a string column that can grow to a
maximum of 100 characters,
CREATE TABLE Table ( str_col VARCHAR(100) )
When handling strings the database will only allocate as much storage space as the
string uses up. If a 10 character string is stored in str_col then only space for 10
characters will be allocated in the database. So if you need a column that can store
a string of any size, use an arbitrarily large number when declaring the column.
Mckoi SQL Database does not use a fixed size storage mechanism when storing
variable length column data.
JAVA_OBJECT is a column type that can contain serializable Java objects.
The JAVA_OBJECT type has an optional Java class definition that is used for runtime
class constraint checking. The following example demonstrates creating
a JAVA_OBJECT column.
If the Java class is not specified the column defaults to java.lang.Object which
effectively means any type of serializable Java object can be kept in the column.
String types may have a COLLATE clause that changes the collation ordering of the
string based on a language. For example, the following statement creates a string that
can store and order Japanese text:
CREATE TABLE InternationalTable (
japanese_text VARCHAR(4000) COLLATE 'jaJP')
The 'jaJP' is an ISO localization code for the Japanese language in Japan. Other
locale codes can be found in the documentation to java.text.Collator.
Unique, primary/foreign key and check integrity constraints can be defined in
the CREATE TABLE statement. The following is an example of defining a table with
integrity constraints.
CREATE TABLE Customer (
number VARCHAR(40) NOT NULL,
name VARCHAR(100) NOT NULL,
ssn VARCHAR(50) NOT NULL,
age INTEGER NOT NULL,
Operator                        Operation
+, -                            identity, negation
*, /                            multiplication, division
+, -, ||                        addition, subtraction, concatenation
comparison operators, IN        comparison
NOT                             logical negation
AND                             conjunction
OR                              disjunction
Precedence Example
In the following expression, multiplication has a higher precedence than addition,
so Oracle first multiplies 2 by 3 and then adds the result to 1.
1+2*3
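The rule can be checked directly. The sketch assumes Python's sqlite3 module instead of Oracle; for these arithmetic operators the precedence is the same, and the variable names are just for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The first expression relies on precedence: 2*3 is evaluated before the addition.
# The second uses parentheses to force the addition to happen first.
default_order, forced_order = conn.execute(
    "SELECT 1+2*3, (1+2)*3"
).fetchone()
```

Parentheses are the usual way to override precedence when the default order is not the one intended.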
Operator: + -
Example:
SELECT * FROM orders
WHERE qtysold = -1;
SELECT * FROM emp
WHERE -sal < 0;
Operator: * /
Example:
UPDATE emp
SET sal = sal * 1.1;
Do not use two consecutive minus signs (--) in arithmetic expressions to indicate
double negation or the subtraction of a negative value. The characters -- are used to
begin comments within SQL statements. You should separate consecutive minus
signs with a space or a parenthesis.
See Also: "Comments" for more information on comments within SQL
statements
Concatenation Operator
The concatenation operator manipulates character strings. Table 3-3 describes the
concatenation operator.
Table 3-3 Concatenation Operator
Operator    Purpose
||          Concatenates character strings.
The result of concatenating two character strings is another character string. If both
character strings are of datatype CHAR, the result has datatype CHAR and is limited to
2000 characters. If either string is of datatype VARCHAR2, the result has
datatype VARCHAR2 and is limited to 4000 characters. Trailing blanks in character
strings are preserved by concatenation, regardless of the strings' datatypes.
On most platforms, the concatenation operator is two solid vertical bars, as shown
in Table 3-3. However, some IBM platforms use broken vertical bars for this
operator. When moving SQL script files between systems having different
character sets, such as between ASCII and EBCDIC, vertical bars might not be
translated into the vertical bar required by the target Oracle environment. Oracle
provides the CONCAT character function as an alternative to the vertical bar operator
for cases when it is difficult or impossible to control translation performed by
operating system or network utilities. Use this function in applications that will be
moved between environments with differing character sets.
Although Oracle treats zero-length character strings as nulls, concatenating a zero-length
character string with another operand always results in the other operand, so
null can result only from the concatenation of two null strings. However, this may
not continue to be true in future versions of Oracle. To concatenate an expression
that might be null, use the NVL function to explicitly convert the expression to a
zero-length string.
Q5. a.) Should a user be allowed to enter null values for the primary key?
Give reasons for your answer.
b.)
What is Data Independence? Explain Logical data independence and Physical data independence.
b.) Data independence is the type of data transparency that matters for a
centralized DBMS. It refers to the immunity of user applications to changes in the
definition and organization of data.
Data independence and operation independence together give the feature of data
abstraction. The physical structure of the data is referred to as the "physical data
description". Physical data independence deals with hiding the details of the storage
structure from user applications. The application should not be involved with these
issues since, conceptually, there is no difference in the operations carried out against
the data. There are two types of data independence:
1. Logical data independence: The ability to change the logical (conceptual) schema
without changing the External schema (User View) is called logical data
independence. For example, the addition or removal of new entities, attributes, or
relationships to the conceptual schema should be possible without having to
change existing external schemas or having to rewrite existing application
programs.
2. Physical data independence: The ability to change the physical schema without
changing the logical schema is called physical data independence. For example,
a change to the internal schema, such as using different file organization or
storage structures, storage devices, or indexing strategy, should be possible
without having to change the conceptual or external schemas.
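Both kinds of independence can be sketched in a few lines. The example assumes Python's sqlite3 module; the employee table, the it_staff view (standing in for an external schema) and the helper app_query are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, dept TEXT);
INSERT INTO employee VALUES (1, 'Asha', 'IT'), (2, 'Ravi', 'Sales');

-- The view plays the role of the external schema (user view).
CREATE VIEW it_staff AS
    SELECT emp_no, name FROM employee WHERE dept = 'IT';
""")


def app_query():
    """The application's query; it is never edited in this example."""
    return conn.execute(
        "SELECT emp_no, name FROM it_staff ORDER BY emp_no"
    ).fetchall()


before = app_query()

# Logical change: a new attribute is added to the conceptual schema.
conn.execute("ALTER TABLE employee ADD COLUMN phone TEXT")
after_logical = app_query()

# Physical change: a new index alters how the data is accessed on disk.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")
after_physical = app_query()
```

The application query returns the same rows after each change, which is what the two definitions above require.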
Physical Independence
The logical schema stays unchanged even though the storage space or type of some
data is changed for reasons of optimization or reorganization. The external schema
does not change either; only the internal schema changes when the physical storage
is reorganized. Physical data independence is present in most databases and file
environments, in which details such as the hardware storage encoding, the exact
location of data on disk, and the merging of records are hidden from the user.
One of the biggest advantages of a database is data independence. It means we can
change the schema at one level without affecting the schema at the next higher level, so
the structure of a database can change without affecting the data required by
users and programs. This feature was not available in the file-oriented approach. There are
two types of data independence:
1. Physical data independence
2. Logical data independence
CASE STUDY - I
Questions:
Q1. Design the database system for Laxmi Cycles.
Ans A dynamic and RESULTS oriented Talent Manager with over Four years of
recruiting experience across Information Technology. High energy with great
relationship, team building, leadership, and communication skills.
My focus is to understand both the client and candidate's unique requirements and
leverage their expertise and knowledge to align their search to specific corporate
cultures. I find the right professional talent for my client's business, as I understand,
anticipate and fulfill both sides of the business transaction.
Q2. Draw the corresponding ER Diagram for the above
Ans Draw the corresponding ERD for the following data structure:
at the point of capture (the data source). Clinical documentation also plays a key
role in data quality. Clinical documentation practices need to be developed and
standardized to facilitate accurate data capture and encoding. In an EHR, it is
imperative these content standards are built into the fiber of decision making
screens, templates, drop-down lists and other tools for documentation.
Additionally, establishing consistent data models will assure the integrity and
quality of the data maintained in the EHR. Standardization of data definitions and
structure for clinical content (including smart text) and quality checkpoints, along
with traditional auditing procedures, help ensure quality data is captured.
Productivity and effectiveness of new tools such as natural language processing
(NLP) and computer-assisted coding (CAC) can be enhanced when these controls
are in place.
CASE STUDY-II
Questions:
Q1. Design a database that stores the Cab service company's
information. Identify the entities
of interest and show their attributes.
Ans Data analysis is concerned with the NATURE and USE of data. It
involves the identification of the data elements which are needed to support
the data processing system of the organization, the placing of these
elements into logical groups and the definition of the relationships between
the resulting groups.
Other approaches, e.g. DFDs and flowcharts, have been concerned with
the flow of data (dataflow methodologies). Data analysis is one of several
data-structure-based methodologies; Jackson SP/D is another.
data. Entity types fall into five classes: roles, events, locations, tangible
things or concepts. E.g. employee, payment, campus, book. Specific
examples of an entity are called instances. E.g. the employee John Jones,
Mary Smith's payment, etc.
Relationship
A data relationship is a natural association that exists between one or more
entities. E.g. Employees process payments. Cardinality defines the
number of occurrences of one entity for a single occurrence of the related
entity. E.g. an employee may process many payments but might not
process any payments depending on the nature of her job.
Attribute
An attribute (or set of attributes) that uniquely identifies
one and only one instance of an entity is called a primary key or identifier.
E.g. Employee Number is a primary key for Employee.
4. Fill in Cardinality
Identify the data attribute(s) that uniquely identify one and only one occurrence of
each entity.
7. Identify Attributes
Name the information details (fields) which are essential to the system under
development.
8. Map Attributes
For each attribute, match it with exactly one entity that it describes.
Adjust the ERD from step 6 to account for entities or relationships discovered
in step 8.
Does the final Entity Relationship Diagram accurately depict the system data?
Ans The ability to share electronic health information both internally and
access the data, also called database system, or simply database. The
primary goal of such a system is to provide an environment that is both
convenient and efficient to use in retrieving and storing information.
A database management system (DBMS) is designed to manage a large body
of information. Data management involves both defining structures for
storing information and providing mechanisms for manipulating the
information. In addition, the database system must provide for the safety of
the stored information, despite system crashes or attempts at unauthorized
access. If data are to be shared among several users, the system must avoid
possible anomalous results due to multiple users concurrently accessing the
same data.
Examples of the use of database systems include airline reservation systems,
company payroll and employee information systems, banking systems, credit
card processing systems, and sales and order tracking systems.
A major purpose of a database system is to provide users with an abstract
view of the data. That is, the system hides certain details of how the data are
stored and maintained. Thereby, data can be stored in complex data
structures that permit efficient retrieval, yet users see a simplified and easy-to-use
view of the data. The lowest level of abstraction, the physical level,
describes how the data are actually stored and details the data structures.
The next-higher level of abstraction, the logical level, describes what data
are stored, and what relationships exist among those data. The highest level
of abstraction, the view level, describes parts of the database that are
relevant to each user; application programs used to access a database form
part of the view level.