UNIT II Database & Data Warehouse
Classification
One way to classify databases involves the type of their
contents, for example: bibliographic, document-text,
statistical, or multimedia objects. Another way is by their
application area, for example: accounting, music compositions,
movies, banking, manufacturing, or insurance. A third way is by
some technical aspect, such as the database structure or
interface type. This section lists a few of the adjectives used
to characterize different kinds of databases.
DATA WAREHOUSING
The term data warehouse was first used by William Inmon in the early
1980s. He defined a data warehouse to be a set of data that supports
decision support systems (DSS) and is "subject-oriented, integrated,
time-variant and nonvolatile". With data warehousing, corporate-wide
data (current & historical) are merged into a single repository.
Traditional databases contain operational data that represent the
day-to-day needs of a company. Traditional business data processing
(such as billing, inventory control, payroll, and manufacturing) supports
online transaction processing and batch reporting applications. A data
warehouse, however, contains informational data, which are used to
support other functions such as planning and forecasting. Although
much of the content is similar between the operational and
informational data, much is different. A data warehouse is a data
repository used to support decision support systems.
A data warehousing system includes data migration, the warehouse, and
access tools. The data are extracted from operational systems, but
must be reformatted, cleansed, integrated, and summarized before
being placed in the warehouse. Much of the operational data are not
needed in the warehouse and are removed during this conversion
process. This migration process is similar to that needed for data
mining applications, except that data mining applications need not
necessarily be performed on summarized or business-wide data.
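The migration steps described above (reformat, cleanse, integrate, summarize) can be illustrated with a minimal Python sketch. The two source systems, their field names, the sample rows, and the helper functions `reformat`, `cleanse`, and `summarize` are all hypothetical and chosen only for illustration; a real warehouse load would typically use dedicated ETL tooling.

```python
from collections import defaultdict

# Hypothetical rows extracted from two operational systems (illustrative data only).
billing_rows = [
    {"cust": " Alice ", "amount": "120.50", "date": "2024-01-15", "clerk_id": "c7"},
    {"cust": "bob", "amount": "80", "date": "2024-01-20", "clerk_id": "c2"},
    {"cust": "bob", "amount": "n/a", "date": "2024-02-03", "clerk_id": "c2"},
]
order_rows = [
    {"customer_name": "ALICE", "total": "30.00", "order_date": "2024-02-11"},
]


def reformat(row, name_key, amount_key, date_key):
    """Map a source row onto one shared layout; unneeded fields (e.g. clerk_id) are dropped."""
    return {
        "customer": row[name_key].strip().lower(),
        "amount": row[amount_key],
        "month": row[date_key][:7],  # keep only the year-month part
    }


def cleanse(rows):
    """Discard rows whose amount cannot be parsed as a number."""
    clean = []
    for row in rows:
        try:
            clean.append({**row, "amount": float(row["amount"])})
        except ValueError:
            continue
    return clean


def summarize(rows):
    """Total the amounts per (customer, month)."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["customer"], row["month"])] += row["amount"]
    return dict(totals)


# Integrate: rows from both systems are merged into a single list in the shared layout.
integrated = (
    [reformat(r, "cust", "amount", "date") for r in billing_rows]
    + [reformat(r, "customer_name", "total", "order_date") for r in order_rows]
)
warehouse = summarize(cleanse(integrated))
print(warehouse)
```

In this sketch the warehouse holds only monthly totals per customer, which reflects the point that much operational detail is removed during conversion.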
Goals
Knowledge representation
Planning
Learning
Perception
Social intelligence
General intelligence
Approaches
Components of ANNs
Learning Process
Advantages
Challenges
Overfitting: The model may perform well on training data but
poorly on unseen data.
Training Time: Can require significant computational resources.
Interpretability: Often seen as a "black box," making it hard to
understand how decisions are made.
Detailed Components of ANNs
1. Neurons:
o Each neuron receives inputs, processes them, and sends an
output to the next layer.
o The processing typically involves calculating a weighted sum
of inputs followed by an activation function.
2. Weights and Biases:
o Weights determine the importance of each input. During
training, these are adjusted to minimize error.
o Biases provide an additional parameter that allows the
model to fit the data better by shifting the activation
function.
3. Activation Functions:
o Sigmoid: Outputs values between 0 and 1. Good for binary
classification but can suffer from vanishing gradients.
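The three components listed above can be sketched as a single neuron in plain Python. The input values, weights, and bias are arbitrary numbers picked for illustration, and the function names are assumptions rather than part of any library. The last two lines hint at why the sigmoid can suffer from vanishing gradients: its slope is at most 0.25 and shrinks toward zero for large inputs.

```python
import math


def sigmoid(z):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through the activation."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)


def sigmoid_gradient(z):
    """Derivative of the sigmoid, which is sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


# Arbitrary illustrative values, not taken from any real model.
inputs = [0.5, -1.0, 2.0]
weights = [0.4, 0.3, -0.2]
bias = 0.1

print(neuron_output(inputs, weights, bias))  # a value between 0 and 1
print(sigmoid_gradient(0.0))                 # 0.25, the largest possible slope
print(sigmoid_gradient(10.0))                # close to zero for a large input
```

During training, the weights and bias would be adjusted to reduce error; this sketch covers only the forward computation.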
Non-Scalable Data:
Data analysis
Data requirements
Data cleaning
Once processed and organized, the data may be incomplete,
contain duplicates, or contain errors. The need for data cleaning
arises from problems in the way that the data are entered and
stored. Data cleaning is the process of preventing and correcting these
errors. Common tasks include record matching, identifying inaccurate
data, assessing the overall quality of existing data, deduplication,
and column segmentation.
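A few of these tasks can be illustrated with a short Python sketch. The sample records, the choice of a normalized email address as the matching key, and the function name `clean` are assumptions made for illustration only; real record matching is usually more involved.

```python
# Hypothetical raw records with a duplicate and an incomplete entry.
records = [
    {"name": "Ada Lovelace", "email": "ADA@example.com "},
    {"name": "ada  lovelace", "email": "ada@example.com"},
    {"name": "Alan Turing", "email": ""},
]


def clean(records):
    """Return (cleaned, rejected) lists after simple cleaning steps."""
    seen = set()
    cleaned, rejected = [], []
    for record in records:
        # Normalize whitespace and letter case so equivalent values compare equal.
        name = " ".join(record["name"].split()).title()
        email = record["email"].strip().lower()
        if not email:
            rejected.append(record)  # incomplete record, set aside for review
            continue
        if email in seen:
            continue  # record matching: same email means a duplicate
        seen.add(email)
        # Column segmentation: split the single name column into two.
        first, _, last = name.partition(" ")
        cleaned.append({"first_name": first, "last_name": last, "email": email})
    return cleaned, rejected


cleaned, rejected = clean(records)
print(cleaned)
print(rejected)
```

Incomplete records are set aside rather than silently dropped, so that they can be corrected at the source.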
Scientific hypothesis
People refer to a trial solution to a problem as a hypothesis, often
called an "educated guess" because it provides a suggested outcome
based on the evidence. However, some scientists reject the term
"educated guess" as incorrect. Experimenters may test and reject
several hypotheses before solving the problem.
Working hypothesis
A working hypothesis is a hypothesis that is provisionally accepted as a
basis for further research in the hope that a tenable theory will be
produced, even if the hypothesis ultimately fails.[18] Like all
hypotheses, a working hypothesis is constructed as a statement of
expectations, which can be linked to the exploratory research purpose
in empirical investigation.