DWM Unit-I Notes

The document covers key concepts of data warehousing: 1) the definition of a data warehouse as a subject-oriented, integrated, time-variant, non-volatile collection of data that supports management decisions; 2) the four defining features of a data warehouse (subject-oriented, integrated, time-variant, non-volatile); 3) the differences between a data warehouse and a data mart, and the typical three-tier architecture with its bottom, middle, and top tiers.

Uploaded by

Ankita Pawar


UNIT-I

INTRODUCTION TO DATA WAREHOUSING

Evolution of decision support systems, Failure of past decision
support system, Operational v/s decision support systems, Data
warehousing lifecycle, Architecture, Building blocks, Components of
DW, Data Marts and Metadata

Prof. Jayant S. Rohankar
Tulsiramji Gaikwad-Patil College of Engineering & Technology,
Nagpur
Department of Information Technology
Subject Notes
Academic Session: 2022 – 2023

Subject: Data Warehousing & Mining Semester: VII

Unit I

Data Warehouse:

1. A data warehouse is a subject-oriented, integrated, nonvolatile, and
time-variant collection of data in support of management’s decisions.
2. A data warehouse is a semantically consistent data store that serves as a
physical implementation of a decision support data model and stores the
information an enterprise needs to make strategic decisions. A
data warehouse is also often viewed as an architecture, constructed by
integrating data from multiple heterogeneous sources to support
structured and/or ad hoc queries, analytical reporting, and decision
making.

Features of Data warehouse:

Subject-oriented:

• A data warehouse is organized around major subjects, such as customer,
supplier, product, and sales. Rather than concentrating on the day-to-day
operations and transaction processing of an organization, a data
warehouse focuses on the modeling and analysis of data for decision
makers.
• Hence, data warehouses typically provide a simple and concise view
around particular subject issues by excluding data that are not useful in
the decision support process.

Integrated:

• A data warehouse is usually constructed by integrating multiple
heterogeneous sources, such as relational databases, flat files, and
online transaction records.
• Data cleaning and data integration techniques are applied to ensure
consistency in naming conventions, encoding structures, attribute
measures, and so on.
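
The cleaning and integration step can be sketched in a few lines. This is only an illustration under stated assumptions: the two source feeds, their field names, the gender codes, and the unit conversion are all hypothetical and do not come from any particular system.

```python
# Hypothetical records from two operational sources with inconsistent
# naming conventions, encodings, and units of measure.
crm_rows = [{"cust_name": "Asha Rao", "sex": "F", "height_in": 64.0}]
billing_rows = [{"CustomerName": "ravi kumar", "gender": "male", "height_cm": 175.0}]

# Assumed warehouse-wide encoding for gender.
GENDER_CODES = {"f": "F", "female": "F", "m": "M", "male": "M"}


def from_crm(row):
    """Map a CRM record to the unified warehouse schema."""
    return {
        "customer_name": row["cust_name"].strip().title(),
        "gender": GENDER_CODES[row["sex"].strip().lower()],
        "height_cm": round(row["height_in"] * 2.54, 1),  # inches -> centimetres
    }


def from_billing(row):
    """Map a billing record to the unified warehouse schema."""
    return {
        "customer_name": row["CustomerName"].strip().title(),
        "gender": GENDER_CODES[row["gender"].strip().lower()],
        "height_cm": round(row["height_cm"], 1),
    }


def integrate(crm, billing):
    """Combine both sources into one list of consistently shaped records."""
    return [from_crm(r) for r in crm] + [from_billing(r) for r in billing]


unified = integrate(crm_rows, billing_rows)
```

After integration every record uses the same field names, the same gender codes, and the same unit, whichever source it came from.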

Time-variant:

• Data are stored to provide information from a historical perspective (e.g.,
the past 5–10 years). Every key structure in the data warehouse
contains, either implicitly or explicitly, an element of time.

Nonvolatile:

• A data warehouse is always a physically separate store of data
transformed from the application data found in the operational
environment.
• Due to this separation, a data warehouse does not require transaction
processing, recovery, and concurrency control mechanisms. It usually
requires only two operations in data accessing: initial loading of
data and access of data.
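
The time-variant and nonvolatile properties can be illustrated together with a small sketch. The class name, the key layout, and the sample data are assumptions made for illustration. The store exposes only the two operations named above (loading and access), and every key carries an explicit time element.

```python
from datetime import date


class WarehouseTable:
    """Append-only store: rows can be loaded and read, never updated or deleted."""

    def __init__(self):
        self._rows = {}  # (business_key, snapshot_date) -> row

    def load(self, business_key, snapshot_date, row):
        """Load one snapshot; the snapshot date is part of the key."""
        key = (business_key, snapshot_date)
        if key in self._rows:
            raise ValueError(f"snapshot already loaded: {key}")
        self._rows[key] = dict(row)

    def history(self, business_key):
        """Return all snapshots for one business key, oldest first."""
        items = [(k[1], v) for k, v in self._rows.items() if k[0] == business_key]
        return sorted(items, key=lambda item: item[0])


table = WarehouseTable()
table.load("C001", date(2022, 1, 31), {"city": "Nagpur"})
table.load("C001", date(2023, 1, 31), {"city": "Pune"})
```

Because a change in the source arrives as a new snapshot rather than an update, `table.history("C001")` keeps both the old and the new city, which is what makes historical analysis possible.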

Data Warehouse and Data Mart:

DATA WAREHOUSE                            DATA MART
Corporate/enterprise-wide                 Departmental
Union of all data marts                   A single business process
Takes time to build                       Faster and easier implementation
High risk of failure                      Low risk of failure
Structure for corporate view of data      Structure to suit the departmental view of data
Data received from staging area           Data organized as star joins (facts and dimensions)
Well structured and architected           Each data mart has its own narrow view of data
Queries on presentation resource
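
As a rough illustration of how the two relate, the sketch below treats a data mart as the slice of warehouse data that belongs to one subject area. The rows, the field names, and the `build_data_mart` helper are hypothetical.

```python
# Hypothetical enterprise-wide warehouse rows, tagged by subject area.
warehouse = [
    {"subject": "sales", "region": "West", "amount": 120.0},
    {"subject": "sales", "region": "East", "amount": 80.0},
    {"subject": "inventory", "item": "pen", "on_hand": 500},
]


def build_data_mart(rows, subject):
    """Return the narrow, department-level view for one subject area."""
    return [row for row in rows if row["subject"] == subject]


sales_mart = build_data_mart(warehouse, "sales")
inventory_mart = build_data_mart(warehouse, "inventory")
```

Each mart sees only its own subject, while the marts taken together cover the same rows as the warehouse.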

Architecture of Data Warehouse:

Data warehouses often adopt a three-tier architecture, as presented in Figure.

Bottom Tier:

• The bottom tier is a warehouse database server that is almost always
a relational database system.
• Back-end tools and utilities are used to feed data into the bottom tier
from operational databases or other external sources (such as
customer profile information provided by external consultants).
• These tools and utilities perform data extraction, cleaning, and
transformation (e.g., to merge similar data from different sources into
a unified format), as well as load and refresh functions to update the
data warehouse.
• The data are extracted using application program interfaces known as
gateways. A gateway is supported by the underlying DBMS and allows
client programs to generate SQL code to be executed at a server.
Examples of gateways include ODBC and OLE DB (Object Linking and
Embedding, Database) by Microsoft, and JDBC.
• This tier also contains a metadata repository, which stores
information about the data warehouse and its contents.
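
The extract, clean/transform, and load functions of the back-end tools can be sketched as follows. This is a minimal illustration, not a production ETL tool: Python's built-in `sqlite3` module stands in for a gateway such as ODBC or JDBC, and the table names, columns, and conversion rates are made up.

```python
import sqlite3


def extract(source_conn):
    """Extract raw order rows from the operational database via SQL."""
    return source_conn.execute(
        "SELECT order_id, amount, currency FROM orders"
    ).fetchall()


def transform(rows, usd_per_unit):
    """Clean and unify: drop rows with missing amounts, convert all to USD."""
    cleaned = []
    for order_id, amount, currency in rows:
        if amount is None:
            continue  # cleaning: skip incomplete records
        cleaned.append((order_id, round(amount * usd_per_unit[currency], 2)))
    return cleaned


def load(warehouse_conn, rows):
    """Load (or refresh) the transformed rows into the warehouse table."""
    warehouse_conn.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(order_id INTEGER PRIMARY KEY, amount_usd REAL)"
    )
    warehouse_conn.executemany(
        "INSERT OR REPLACE INTO fact_orders VALUES (?, ?)", rows
    )
    warehouse_conn.commit()


# Hypothetical operational source with one incomplete record.
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, currency TEXT)")
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 100.0, "USD"), (2, None, "USD"), (3, 50.0, "EUR")],
)

warehouse = sqlite3.connect(":memory:")
rates = {"USD": 1.0, "EUR": 2.0}  # made-up rates, for illustration only
load(warehouse, transform(extract(source), rates))
loaded = warehouse.execute(
    "SELECT order_id, amount_usd FROM fact_orders ORDER BY order_id"
).fetchall()
```

The source and the warehouse are separate connections, which mirrors the physical separation between the operational environment and the warehouse.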

Middle Tier:

• The middle tier is an OLAP server that is typically implemented using
either
o a relational OLAP (ROLAP) model, that is, an extended relational
DBMS that maps operations on multidimensional data to
standard relational operations; or
o a multidimensional OLAP (MOLAP) model, that is, a special-purpose
server that directly implements multidimensional data
and operations.
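
To make the ROLAP idea concrete, the sketch below maps one multidimensional operation, a roll-up, onto a standard relational `GROUP BY` query. The `sales` table, its dimensions, and the `roll_up` helper are hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (year INTEGER, region TEXT, product TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        (2022, "West", "pen", 10.0),
        (2022, "West", "ink", 5.0),
        (2022, "East", "pen", 7.0),
        (2023, "West", "pen", 12.0),
    ],
)

DIMENSIONS = {"year", "region", "product"}


def roll_up(conn, keep_dims):
    """Aggregate the amount measure over every dimension not in keep_dims."""
    if not keep_dims:
        raise ValueError("keep at least one dimension")
    unknown = set(keep_dims) - DIMENSIONS
    if unknown:
        raise ValueError(f"unknown dimensions: {sorted(unknown)}")
    # Column names are checked against DIMENSIONS above, so building the
    # SQL text from them is safe here.
    cols = ", ".join(keep_dims)
    sql = f"SELECT {cols}, SUM(amount) FROM sales GROUP BY {cols} ORDER BY {cols}"
    return conn.execute(sql).fetchall()


# Roll up the product dimension: total sales per (year, region).
by_year_region = roll_up(conn, ["year", "region"])
```

A MOLAP server would instead keep such aggregates in a dedicated multidimensional array structure rather than computing them through SQL.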

Top Tier:

• The top tier is a front-end client layer, which contains query and
reporting tools, analysis tools, and/or data mining tools (e.g., trend
analysis, prediction, and so on).

The KDD Process (Lifecycle of Data Warehousing):

Knowledge discovery is an iterative process that consists of the following steps:

1. Data cleaning: to remove noise and inconsistent data
2. Data integration: where multiple data sources may be combined
3. Data selection: where data relevant to the analysis task are retrieved from the database
4. Data transformation: where data are transformed or consolidated into forms appropriate
for mining by performing summary or aggregation operations.

5. Data mining: an essential process where intelligent methods are applied in order to
extract data patterns.
6. Pattern evaluation: to identify the truly interesting patterns representing knowledge, based
on some interestingness measures.
7. Knowledge presentation: where visualization and knowledge representation techniques
are used to present the mined knowledge to the user.

Steps 1 to 4 are different forms of data preprocessing, in which the data are prepared for mining.
The data mining step may interact with the user or a knowledge base.

The interesting patterns are presented to the user and may be stored as new knowledge in the
knowledge base. Data mining is therefore only one step in the knowledge discovery process, but
an essential one, because it uncovers hidden patterns for evaluation.
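
The seven steps can be sketched as a chain of small functions. This is a toy illustration only: the transaction data are invented, and the "mining" step is a deliberately trivial support count rather than a real mining algorithm.

```python
from collections import Counter

# Hypothetical transaction logs from two sources; None marks a noisy record.
store_a = [{"customer": "c1", "item": "pen"}, {"customer": "c1", "item": None}]
store_b = [
    {"customer": "c2", "item": "pen"},
    {"customer": "c2", "item": "ink"},
    {"customer": "c3", "item": "pen"},
]


def clean(rows):
    """Step 1: remove noisy records (missing item)."""
    return [r for r in rows if r["item"] is not None]


def integrate(*sources):
    """Step 2: combine multiple data sources."""
    return [r for src in sources for r in src]


def select(rows, fields):
    """Step 3: keep only the fields relevant to the analysis task."""
    return [{f: r[f] for f in fields} for r in rows]


def transform(rows):
    """Step 4: consolidate by aggregation into per-item counts."""
    return Counter(r["item"] for r in rows)


def mine(counts, total):
    """Step 5: extract patterns, here just the support of each item."""
    return {item: n / total for item, n in counts.items()}


def evaluate(patterns, min_support):
    """Step 6: keep patterns that pass the interestingness measure."""
    return {p: s for p, s in patterns.items() if s >= min_support}


def present(patterns):
    """Step 7: present the mined knowledge in readable form."""
    return [f"{item}: support {s:.2f}" for item, s in sorted(patterns.items())]


rows = select(integrate(clean(store_a), clean(store_b)), ["item"])
patterns = mine(transform(rows), total=len(rows))
report = present(evaluate(patterns, min_support=0.5))
```

Only `mine` does any pattern extraction. Every function before it prepares data and every function after it filters or displays results, which is why data mining is one step among several.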

Prof. Jayant S. Rohankar
Subject In-charge
