04 Data Warehouse and Data Mart


DATA WAREHOUSE, DATA MART
&
ETL (Extract, Transform, Load)
Data Warehouse
Definition :
A data warehouse is a subject-oriented,
integrated, time-variant and non-volatile
collection of data in support of management's
decision making process.

Explanation :
• Subject-Oriented: A data warehouse can be
used to analyze a particular subject area. For
example, "sales" can be a particular subject.

• Integrated: A data warehouse integrates data
from multiple data sources. For example, source A
and source B may have different ways of
identifying a product, but in a data warehouse,
there will be only a single way of identifying a
product.
• Time-Variant: Historical data is kept in a data
warehouse. For example, one can retrieve data from 3
months, 6 months, or 12 months ago, or even older data,
from a data warehouse. This contrasts with a transaction
system, where often only the most recent data is kept. For
example, a transaction system may hold only the most recent
address of a customer, whereas a data warehouse can hold
all addresses associated with that customer.

• Non-volatile: Once data is in the data warehouse, it
will not change. So, historical data in a data warehouse
should never be altered.
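
The address example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names (TransactionSystem, WarehouseAddressHistory), the customer id, and the addresses are all hypothetical and stand in for real database tables. It contrasts a store that overwrites the current address with an append-only store that keeps every address.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a stored row cannot be modified afterwards
class AddressRecord:
    customer_id: int
    address: str
    valid_from: date


class TransactionSystem:
    """Hypothetical operational store: keeps only the latest address."""

    def __init__(self):
        self._current = {}

    def update_address(self, customer_id, address, when):
        # Overwrites whatever address was stored before.
        self._current[customer_id] = AddressRecord(customer_id, address, when)

    def addresses(self, customer_id):
        record = self._current.get(customer_id)
        return [record.address] if record else []


class WarehouseAddressHistory:
    """Hypothetical warehouse table: rows are only ever appended."""

    def __init__(self):
        self._rows = []

    def load(self, customer_id, address, when):
        # Time-variant and non-volatile: add a new row, never edit an old one.
        self._rows.append(AddressRecord(customer_id, address, when))

    def addresses(self, customer_id):
        rows = [r for r in self._rows if r.customer_id == customer_id]
        return [r.address for r in sorted(rows, key=lambda r: r.valid_from)]


oltp = TransactionSystem()
dw = WarehouseAddressHistory()
moves = [("12 Oak St", date(2020, 1, 5)), ("7 Elm Ave", date(2023, 6, 1))]
for address, when in moves:
    oltp.update_address(1, address, when)
    dw.load(1, address, when)

print(oltp.addresses(1))  # ['7 Elm Ave']
print(dw.addresses(1))    # ['12 Oak St', '7 Elm Ave']
```

The transaction system answers "where does the customer live now?", while the warehouse can also answer "where did the customer live in 2021?".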

Benefits of a Data Warehouse
• A Data Warehouse Delivers Enhanced
Business Intelligence

By providing data from various sources, managers and
executives will no longer need to make business
decisions based on limited data or their gut. In addition,
“data warehouses and related BI can be applied directly
to business processes including marketing segmentation,
inventory management, financial management, and
sales.”

• A Data Warehouse Saves Time

Since business users can quickly access critical data
from a number of sources, all in one place, they can
rapidly make informed decisions on key initiatives.

• A Data Warehouse Enhances Data Quality
and Consistency

A data warehouse implementation includes the
conversion of data from numerous source systems
into a common format.
Since the data from the various departments is
standardized, each department will produce results
that are in line with all the other departments.
• A Data Warehouse Provides Historical
Intelligence

A data warehouse stores large amounts of historical data so
you can analyze different time periods and trends in order to
make future predictions. Such data typically cannot be
stored in a transactional database or used to generate
reports from a transactional system.

• A Data Warehouse Generates a High ROI

Finally, the pièce de résistance: return on investment.
Companies that have implemented data warehouses and
complementary BI systems have generated more revenue and
saved more money than companies that haven’t invested in
BI systems and data warehouses.


Data Mart
Definition :
A data mart is a simple form of a data warehouse that
is focused on a single subject (or functional area), such
as Sales, Finance, or Marketing. Data marts are often
built and controlled by a single department within an
organization.

Data Mart
Definition :
A data mart is an access layer that is used to get data
out to the users. It is presented as an alternative to a
large data warehouse because it takes less time and money
to build. However, there is no standard definition of a
data mart; it differs from person to person.

In simple terms, a data mart is a subsidiary of a data
warehouse. It holds a partition of the data that is
created for a specific group of users.

Data marts can be created in the same database as
the data warehouse or in a physically separate database.
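
As a rough sketch of the "same database" option, a data mart can be exposed as a view over a warehouse table. The example below uses Python's built-in sqlite3 module with an in-memory database; the table name warehouse_facts, the view name sales_mart, and the sample rows are assumptions made up for illustration, not part of the original slides.

```python
import sqlite3

# In-memory database standing in for the warehouse (illustrative only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouse_facts (
        department TEXT,
        item       TEXT,
        amount     REAL
    );
    INSERT INTO warehouse_facts VALUES
        ('Sales',   'Widget',    120.0),
        ('Sales',   'Gadget',     80.0),
        ('Finance', 'Audit fee', 500.0);

    -- The data mart: a single-subject slice for one group of users,
    -- living in the same database as the warehouse table.
    CREATE VIEW sales_mart AS
        SELECT item, amount
        FROM warehouse_facts
        WHERE department = 'Sales';
""")

sales_rows = conn.execute(
    "SELECT item, amount FROM sales_mart ORDER BY item"
).fetchall()
print(sales_rows)  # [('Gadget', 80.0), ('Widget', 120.0)]
```

A physically separate data mart would instead copy the same slice into its own database, which trades extra storage and a load step for isolation from the warehouse.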
Differences Between
a Data Warehouse and a Data Mart

Category               Data Warehouse     Data Mart
Scope                  Corporate          Line of Business (LOB)
Subject                Multiple           Single subject
Data Sources           Many               Few
Size (typical)         100 GB-TB+         < 100 GB
Implementation Time    Months to years    Months
ETL (Extract, Transform and Load)
Definition :
ETL stands for extract, transform, load, three
database functions that are combined into one
tool to pull data out of one database and place
it into another database.

Explanation :
• Extract means to get data from the
source system as efficiently as possible.

• Transform means to perform
calculations on the data.

• Load is the process of writing the data into
the target database.
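
The three steps above can be sketched as a tiny pipeline. This is a minimal illustration, not how any particular ETL tool works: the two in-memory source lists, their field names (sku, product_code, and so on), and the target table sales are hypothetical. The transform step also shows the "integrated" idea from earlier, where two sources identify the same product differently and the warehouse ends up with a single identifier.

```python
import sqlite3

# Hypothetical source systems that identify the same product differently.
SOURCE_A = [{"sku": "A-100", "qty": 2, "unit_price": 9.5}]
SOURCE_B = [{"product_code": "100", "quantity": 3, "price": 9.5}]


def extract():
    """Extract: read the rows from each source system."""
    return list(SOURCE_A), list(SOURCE_B)


def transform(rows_a, rows_b):
    """Transform: unify product ids and calculate a total per row."""
    unified = []
    for row in rows_a:
        product_id = row["sku"].split("-", 1)[1]  # 'A-100' -> '100'
        unified.append((product_id, row["qty"], row["qty"] * row["unit_price"]))
    for row in rows_b:
        unified.append(
            (row["product_code"], row["quantity"], row["quantity"] * row["price"])
        )
    return unified


def load(rows, conn):
    """Load: write the transformed rows into the target database."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (product_id TEXT, qty INTEGER, total REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


target = sqlite3.connect(":memory:")  # stands in for the warehouse database
load(transform(*extract()), target)

summary = target.execute(
    "SELECT product_id, SUM(qty), SUM(total) FROM sales GROUP BY product_id"
).fetchall()
print(summary)  # [('100', 5, 47.5)]
```

Keeping the three steps as separate functions mirrors the definition: each one can be swapped out (a different source, a different calculation, a different target) without touching the others.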

ETL Tools
At present the most popular and widely used ETL tools and
applications on the market are:
• IBM WebSphere DataStage (formerly known as Ascential
DataStage and Ardent DataStage)
• Informatica PowerCenter
• Oracle ETL
• Ab Initio
• Pentaho Data Integration - Kettle Project (open
source ETL)
• SAS ETL Studio
• Cognos DecisionStream
• Business Objects Data Integrator (BODI)
• Microsoft SQL Server Integration Services (SSIS)
ETL Workflow
Thank You…
Have a Nice Day…!
