Data Warehousing and Data Mining
Dr Shivani Thapliyal
Data Warehouse
A data warehouse is a subject-oriented, integrated,
time-variant and non-volatile collection of data in
support of management's decision making process.
E(Extract): Data is extracted from the external data
source.
T(Transform): Data is transformed into the standard
format.
L(Load): Data is loaded into the data warehouse after
transformation.
2. Data Discrimination:
It compares common features of class which is under
5. Correlation Analysis:
heterogeneous sources
⇓
converted in accordance with the needs of the
decision support system
⇓
stored in the warehouse
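The flow above (heterogeneous sources, conversion, storage in the warehouse) can be illustrated with a minimal sketch. The two sources, their field layouts, and the function names `extract`, `transform`, and `load` are all hypothetical and chosen only for illustration. A real warehouse would read from databases or files rather than in-memory lists.

```python
from datetime import datetime

# Hypothetical records from two heterogeneous sources with different formats.
crm_rows = [{"customer": "Asha", "amount": "120.50", "date": "2023-01-15"}]
erp_rows = [("RAVI", 80, "15/02/2023")]


def extract():
    """Extract: pull the raw records from each source unchanged."""
    return crm_rows, erp_rows


def transform(crm, erp):
    """Transform: convert both sources into one standard format."""
    standard = []
    for row in crm:
        standard.append({
            "customer": row["customer"].title(),
            "amount": float(row["amount"]),
            "date": datetime.strptime(row["date"], "%Y-%m-%d").date().isoformat(),
        })
    for name, amount, date in erp:
        standard.append({
            "customer": name.title(),
            "amount": float(amount),
            "date": datetime.strptime(date, "%d/%m/%Y").date().isoformat(),
        })
    return standard


def load(rows, warehouse):
    """Load: append the standardized rows to the warehouse store."""
    warehouse.extend(rows)
    return warehouse


# The warehouse is modeled as a plain list for this sketch.
warehouse = load(transform(*extract()), [])
```

After the run, every row in `warehouse` has the same keys, a numeric amount, and an ISO-formatted date, regardless of which source it came from.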
OLAP tools in data warehouse
OLAP (Online Analytical Processing) in a data warehouse is a
computing technology that lets users query data and
analyze it from different perspectives. It suits
business analysts who need pre-aggregated and
pre-calculated data for fast analysis.
OLAP supports complex calculations;
OLAP provides a multidimensional view of the data.
There are three major OLAP
models in a data warehouse:
ROLAP or Relational OLAP: a system in which
users query data from a relational
database or from their own local tables. Thus,
the number of potential questions is not limited.
MOLAP or Multidimensional OLAP: a system that
stores the data in a multidimensional database,
which provides high-speed calculations.
HOLAP or Hybrid OLAP: a mix of the two systems
above. Pre-computed aggregates and the cube
structure are stored in a multidimensional
database, while the detailed data remains in the
relational database.
OLAP Operations
Here is the list of OLAP operations −
Roll-up
Drill-down
Slice and dice
Pivot (rotate)
1. Roll-up
Roll-up performs aggregation
on a data cube in any of the
following ways −
•By climbing up a concept
hierarchy for a dimension
•By dimension reduction
Roll-up is performed by climbing up a concept
hierarchy for the dimension location.
Initially the concept hierarchy was "street < city
< province < country".
On rolling up, the data is aggregated by
ascending the location hierarchy from the level
of city to the level of country.
The data is grouped into countries rather than
cities.
When roll-up is performed by dimension reduction,
one or more dimensions are removed from the data cube.
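Both forms of roll-up can be sketched on a tiny fact table. The sales figures, city names, and function names below are hypothetical and serve only to illustrate the two cases: climbing the location hierarchy from city to country, and removing the location dimension altogether.

```python
from collections import defaultdict

# Hypothetical sales facts: (city, country, quarter, units_sold).
sales = [
    ("Toronto", "Canada", "Q1", 100),
    ("Vancouver", "Canada", "Q1", 150),
    ("Chicago", "USA", "Q1", 200),
    ("New York", "USA", "Q1", 300),
    ("Toronto", "Canada", "Q2", 120),
]


def roll_up_city_to_country(facts):
    """Climb the location hierarchy: aggregate cities into countries."""
    totals = defaultdict(int)
    for _city, country, quarter, units in facts:
        totals[(country, quarter)] += units
    return dict(totals)


def roll_up_drop_location(facts):
    """Dimension reduction: remove location and keep only time."""
    totals = defaultdict(int)
    for _city, _country, quarter, units in facts:
        totals[quarter] += units
    return dict(totals)


by_country = roll_up_city_to_country(sales)
by_quarter = roll_up_drop_location(sales)
```

In `by_country` the city level has disappeared and each cell holds a country total. In `by_quarter` the location dimension is gone entirely, so the cube has one dimension fewer.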
2. Drill-down
Drill-down is the reverse operation of roll-up. It is
performed in either of the following ways −
•By stepping down a concept hierarchy for a dimension
•By introducing a new dimension
Drill-down is performed by stepping down a
concept hierarchy for the dimension time.
Initially the concept hierarchy was "day <
month < quarter < year".
On drilling down, the time dimension is
descended from the level of quarter to the level
of month.
When drill-down is performed by introducing a new
dimension, one or more dimensions are added to the data cube.
It navigates the data from less detailed data to
highly detailed data.
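Drilling down from quarter to month can be sketched in the same style. The monthly figures, the `MONTH_TO_QUARTER` mapping, and the function names are hypothetical. The sketch assumes the month-level facts are still available, since drill-down can only reveal detail that has been stored.

```python
from collections import defaultdict

# Hypothetical sales facts stored at month level: (month, units_sold).
monthly = [
    ("Jan", 40), ("Feb", 35), ("Mar", 25),
    ("Apr", 50), ("May", 30), ("Jun", 40),
]

# Hypothetical fragment of the time hierarchy: month < quarter.
MONTH_TO_QUARTER = {
    "Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
    "Apr": "Q2", "May": "Q2", "Jun": "Q2",
}


def aggregate_by_quarter(facts):
    """Less detailed view: one total per quarter."""
    totals = defaultdict(int)
    for month, units in facts:
        totals[MONTH_TO_QUARTER[month]] += units
    return dict(totals)


def drill_down(facts, quarter):
    """More detailed view: the monthly values behind one quarter."""
    return {
        month: units
        for month, units in facts
        if MONTH_TO_QUARTER[month] == quarter
    }


quarterly = aggregate_by_quarter(monthly)
q1_detail = drill_down(monthly, "Q1")
```

The values in `q1_detail` add up to the Q1 entry in `quarterly`, which shows that drill-down exposes the detail behind an aggregate rather than changing the underlying data.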
3. Slice