Lecture 2
The top-down approach starts with the overall design and planning. It is
useful in cases where the technology is mature and well known, and
where the business problems that must be solved are clear and well
understood.
The bottom-up approach starts with experiments and prototypes. This is
useful in the early stage of business modeling and technology
development. It allows an organization to move forward at considerably
less expense and to evaluate the benefits of the technology before
making significant commitments.
In the combined approach, an organization can exploit the planned and
strategic nature of the top-down approach while retaining the rapid
implementation and opportunistic application of the bottom-up
approach.
Tier-1:
Tier-2:
The middle tier is an OLAP server that is typically implemented using either
a relational OLAP (ROLAP) model or a multidimensional OLAP (MOLAP)
model. A ROLAP model is an extended relational DBMS that maps operations
on multidimensional data to standard relational operations. A MOLAP model
is a special-purpose server that directly implements multidimensional data
and operations.
Tier-3:
The top tier is a front-end client layer, which contains query and reporting
tools, analysis tools, and/or data mining tools (e.g., trend analysis,
prediction, and so on).
Metadata are data about data. When used in a data warehouse, metadata are
the data that define warehouse objects. Metadata are created for the data
names and definitions of the given warehouse. Additional metadata are
created and captured for timestamping any extracted data, the source of the
extracted data, and missing fields that have been added by data cleaning or
integration processes.
Consolidation (Roll-Up)
Drill-Down
Slicing and Dicing
Slicing and dicing is a feature whereby users can take out (slicing) a
specific set of data from the OLAP cube and view (dicing) the slices from
different viewpoints.
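A minimal sketch of this idea, assuming a toy cube held in a Python dictionary.
The dimension names, the sample cells, and the helper names slice_cube and
view_by are hypothetical and chosen only for illustration. Slicing fixes one
dimension to a value; the slice is then totalled along different dimensions to
view it from different viewpoints.

```python
from collections import defaultdict

# Hypothetical toy cube: each cell is keyed by (region, product, year).
DIMENSIONS = ("region", "product", "year")
cube = {
    ("East", "Pen", 2023): 100,
    ("East", "Ink", 2023): 50,
    ("West", "Pen", 2023): 70,
    ("West", "Pen", 2024): 30,
}


def slice_cube(cells, dimension, value):
    """Keep only the cells whose coordinate on `dimension` equals `value`."""
    axis = DIMENSIONS.index(dimension)
    return {key: amount for key, amount in cells.items() if key[axis] == value}


def view_by(cells, dimension):
    """Total the cells along one dimension, giving one viewpoint on the data."""
    axis = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, amount in cells.items():
        totals[key[axis]] += amount
    return dict(totals)


# Take out the 2023 slice, then look at it by region and by product.
sales_2023 = slice_cube(cube, "year", 2023)
by_region = view_by(sales_2023, "region")
by_product = view_by(sales_2023, "product")
```

The same slice yields two different summaries depending on the dimension
chosen for the view, which is the point of the feature.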
ROLAP works directly with relational databases. The base data and the
dimension tables are stored as relational tables and new tables are created to
hold the aggregated information. It depends on a specialized schema design.
This methodology relies on manipulating the data stored in the relational
database to give the appearance of traditional OLAP's slicing and dicing
functionality. In essence, each action of slicing and dicing is equivalent to
adding a "WHERE" clause in the SQL statement. ROLAP tools do not use
pre-calculated data cubes but instead pose the query to the standard
relational database and its tables in order to bring back the data required to
answer the question. ROLAP tools let users ask any question because the
methodology is not limited to the contents of a cube. ROLAP can also
drill down to the lowest level of detail in the database.
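The claim that each slice or dice action adds a "WHERE" condition can be
sketched with Python's built-in sqlite3 module. This is an illustration under
stated assumptions, not how any particular ROLAP product works: the sales
table, its columns, the sample rows, and the helper total_by_region are all
hypothetical.

```python
import sqlite3

# Hypothetical fact table held in an in-memory relational database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (region TEXT, product TEXT, year INTEGER, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("East", "Pen", 2023, 100.0),
        ("East", "Ink", 2023, 50.0),
        ("West", "Pen", 2023, 70.0),
        ("West", "Pen", 2024, 30.0),
        ("East", "Pen", 2024, 20.0),
    ],
)

ALLOWED_COLUMNS = {"region", "product", "year"}


def total_by_region(filters=None):
    """Sum amount per region; every filter becomes one WHERE condition."""
    filters = filters or {}
    for column in filters:
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown dimension: {column}")
    sql = "SELECT region, SUM(amount) FROM sales"
    if filters:
        sql += " WHERE " + " AND ".join(f"{column} = ?" for column in filters)
    sql += " GROUP BY region ORDER BY region"
    return conn.execute(sql, list(filters.values())).fetchall()


full = total_by_region()                                  # no WHERE clause
slice_2023 = total_by_region({"year": 2023})              # one condition
dice = total_by_region({"year": 2023, "product": "Pen"})  # two conditions
```

No pre-calculated cube is involved: every call poses a fresh query to the
relational table, so the result is computed from the detail rows each time.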
MOLAP tools have a very fast response time and the ability to quickly write
back data into the data set.
How can the data analyst or the computer be sure that customer id in one
database and customer number in another refer to the same attribute?
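One common way to approach this question is to record the correspondence
explicitly as metadata. The sketch below assumes such a mapping already
exists; the source names (crm, billing), the attribute names, and the helpers
same_attribute and to_canonical are hypothetical. It does not discover matches
automatically, it only shows how a recorded mapping could be consulted.

```python
# Hypothetical metadata: (source, local attribute name) -> shared canonical name.
ATTRIBUTE_MAP = {
    ("crm", "customer_id"): "customer_key",
    ("billing", "cust_number"): "customer_key",
    ("billing", "amount_due"): "balance",
}


def same_attribute(source_a, attr_a, source_b, attr_b):
    """True only if both attributes are mapped to the same canonical name."""
    name_a = ATTRIBUTE_MAP.get((source_a, attr_a))
    name_b = ATTRIBUTE_MAP.get((source_b, attr_b))
    return name_a is not None and name_a == name_b


def to_canonical(source, record):
    """Rename mapped fields; unmapped fields keep their original names."""
    return {ATTRIBUTE_MAP.get((source, key), key): value
            for key, value in record.items()}


match = same_attribute("crm", "customer_id", "billing", "cust_number")
unified = to_canonical("billing", {"cust_number": 7, "amount_due": 12.5})
```

An attribute with no entry in the mapping is never reported as a match, so
unknown correspondences stay visible instead of being silently assumed.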
2. Redundancy:
For the same real-world entity, attribute values from different sources may
differ.
3. Data Transformation:
4. Data Reduction