Business Intelligence/Data Integration/ETL/Integration: An Introduction
Presented by: Chandrashekar P


What is Business Intelligence

Business Intelligence (BI) encompasses the processes, tools, and technologies required to transform enterprise data into information, and information into knowledge that can be used to enhance decision-making and to create actionable plans that drive effective business activity.

• BI can be used to acquire:
– Tactical insight to optimize business processes by identifying trends, anomalies, and behaviors that require management action.
– Strategic insight to align multiple business processes with key business objectives through integrated performance management and analysis.
What is Business Intelligence

Business Intelligence (BI) is about getting the right information to the right decision makers at the right time.
BI is an enterprise-wide platform that supports reporting, analysis and decision making.
BI leads to:
• fact-based decision making
• a “single version of the truth”
BI includes reporting and analytics.
BI is not a single computer system, but a framework for leveraging data for tactical and strategic use.

How BI Works Together
[Figure: data input from disparate OLTP data sources (AIMSPC, TIMS, RECBASS, ATRRS; other possible data sources: RATSS, RFMSS) passes through Extract, Transform, Load into a single repository (DW), which is used for real-time dashboards, static and ad-hoc reporting, and graphical data analysis.]
Components of BI

• Data Integration (Informatica, DataStage)
• Data Reporting (Cognos, Business Objects)


Data Integration

• Data integration involves combining data residing in different sources and providing users with a unified view of these data. This process becomes significant in a variety of situations, both commercial (when two similar companies need to merge their databases) and scientific (combining research results from different bioinformatics repositories, for example).

• Data integration arises with increasing frequency as the volume of data and the need to share existing data grow. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. In management circles, people frequently refer to data integration as "Enterprise Information Integration" (EII).
How to enable Data Integration

USING ETL PROCESS


ETL (Extract, Transform, Load)

• ETL stands for extract, transform and load, the processes that enable companies to move data from multiple sources, reformat and cleanse it, and load it into another database, a data mart or a data warehouse for analysis, or onto another operational system to support a business process.
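
The three steps can be illustrated with a minimal Python sketch. This is not how any particular ETL tool works internally; the sample CSV data, the `sales` table, and the function names are all assumptions made up for illustration, and an in-memory SQLite database stands in for the target.

```python
import csv
import io
import sqlite3

# Hypothetical source data: a CSV export with inconsistent formatting.
SOURCE_CSV = """customer_id,name,country,amount
1, Alice ,us,100.50
2,Bob,US,
3,carol,de,75
"""

def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse and reformat rows; skip rows with no amount."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # reject incomplete records
        cleaned.append((
            int(row["customer_id"]),
            row["name"].strip().title(),
            row["country"].strip().upper(),
            float(amount),
        ))
    return cleaned

def load(rows, conn):
    """Load: write the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(customer_id INTEGER, name TEXT, country TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
    conn.commit()

def run_etl():
    """Run the three steps against an in-memory target database."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(SOURCE_CSV)), conn)
    return conn.execute("SELECT * FROM sales ORDER BY customer_id").fetchall()
```

Calling `run_etl()` returns the cleansed rows that reached the target; the record with a missing amount is rejected during the transform step rather than loaded.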
ETL (Extract, Transform, Load)
“A properly designed ETL system extracts data from the source systems, enforces data quality and consistency standards, conforms data so that separate sources can be used together, and finally delivers data in a presentation-ready format so that application developers can build applications and end users can make decisions… ETL makes or breaks the data warehouse…” (Ralph Kimball)
ETL – Process Flow
ETL Glossary
• Source System
A database, application, file, or other storage facility from which the data in a data
warehouse is derived.
• Mapping
The definition of the relationship and data flow between source and target objects.
• Metadata
Data that describes data and other structures, such as objects, business rules, and
processes. For example, the schema design of a data warehouse is typically stored in a
repository as metadata, which is used to generate scripts used to build and populate the
data warehouse. A repository contains metadata.
• Staging Area
A place where data is processed before entering the warehouse.
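
To make the terms above concrete, here is a small Python sketch in which a mapping is itself stored as metadata (a dictionary describing how source columns flow to target columns) and mapped rows are held in a staging list. The column names, the `MAPPING` structure, and the helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical mapping metadata: target column -> (source column, conversion).
MAPPING = {
    "customer_key": ("CUST_ID", int),
    "full_name": ("CUST_NM", str.strip),
    "signup_year": ("SIGNUP_DT", lambda d: int(d[:4])),
}

def apply_mapping(source_row, mapping):
    """Build one target row by following the mapping definition."""
    return {
        target_col: convert(source_row[source_col])
        for target_col, (source_col, convert) in mapping.items()
    }

def stage(source_rows, mapping):
    """Hold mapped rows in a staging list before they would be loaded."""
    return [apply_mapping(row, mapping) for row in source_rows]

# One row as it might arrive from a source system.
source_rows = [
    {"CUST_ID": "42", "CUST_NM": "  Ada Lovelace ", "SIGNUP_DT": "2009-03-15"},
]
staging_area = stage(source_rows, MAPPING)
```

Because the mapping is data rather than code, it could be stored in a repository and changed without touching the functions that apply it.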
ETL Glossary
• Cleansing
The process of resolving inconsistencies and fixing the anomalies in source data,
typically as part of the ETL process.
• Transformation
The process of manipulating data. Any manipulation beyond copying is a
transformation. Examples include cleansing, aggregating, and integrating data from
multiple sources.
• Transportation
The process of moving copied or transformed data from a source to a data warehouse.
• Target System
A database, application, file, or other storage facility, such as a data warehouse, into which the transformed source data is loaded.
ETL Tools
Informatica 8.6 – What It Is and How It Works

• What is Informatica 8.6?

– Informatica is an ETL tool that delivers an open, scalable data integration solution addressing the complete life cycle for data warehouse and analytic application development.

– Informatica provides an environment that can extract data from multiple sources, transform the data according to the business logic that is built in the Informatica Client application, and load the transformed data into files or relational targets.
Informatica 8.6 – PowerCenter

PowerCenter provides an environment that allows you to load data into a centralized location, such as a data warehouse or operational data store (ODS). You can extract data from multiple sources, transform the data according to business logic you build in the client application, and load the transformed data into file and relational targets.
Informatica Architecture 8.6
Informatica Architecture 8.6 – Data Flow
Informatica Architecture 8.6 – Components
PowerCenter – Components
PowerCenter – Domain
PowerCenter – Admin Console
PowerCenter – Application Services
Informatica PowerCenter – Repository Service
Informatica PowerCenter – Integration Service
PowerCenter – Client Components

The Informatica Client is used to manage users, define sources and targets, build mappings and mapplets with the transformation logic, and create sessions to run the mapping logic.

The Informatica Client has the following main applications:

• Repository Manager
• Designer
• Workflow Manager
• Workflow Monitor
PowerCenter – Repository
PowerCenter – Client Components

Repository Manager: This is used to create and administer the metadata repository.

• The repository users and groups are created through the Repository Manager.

• Assigning privileges and permissions, managing folders in the repository, and managing locks on the mappings are also done through the Repository Manager.
Informatica/PowerCenter Client Components

Designer: The Designer has five tools that are used to analyze sources, design target schemas and build the source-to-target mappings. These are:

1. Source Analyzer: This is used to either import or create the source definitions.

2. Target Designer: This is used to import or create target definitions.

3. Mapping Designer: This is used to create mappings that will be run by the Integration Service to extract, transform and load data.

4. Transformation Developer: This is used to develop reusable transformations that can be used in mappings.

5. Mapplet Designer: This is used to create sets of transformations, referred to as mapplets, which can be used across mappings.
Informatica/PowerCenter Client Components

• What is the Workflow Manager?

– It is a tool where you define a set of instructions, called a workflow, to execute mappings you build in the Designer.

• What are the Workflow Manager tools?

– It consists of three tools to help you develop a workflow:
• Task Developer. Use the Task Developer to create tasks you want to execute in the workflow.
• Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
• Worklet Designer. Use the Worklet Designer to create a worklet.
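
The idea of a workflow as tasks connected by links can be sketched conceptually in Python. This is an analogy only and does not use or reproduce any Informatica API; the `run_workflow` function and the task names are hypothetical, and the sketch assumes Python 3.9 or later for the standard-library `graphlib` module.

```python
from graphlib import TopologicalSorter

def run_workflow(tasks, links):
    """Run tasks so that every link's source finishes before its target.

    tasks: dict mapping a task name to a zero-argument callable.
    links: list of (from_task, to_task) pairs.
    """
    sorter = TopologicalSorter({name: set() for name in tasks})
    for upstream, downstream in links:
        # The downstream task depends on the upstream task.
        sorter.add(downstream, upstream)
    order = list(sorter.static_order())
    for name in order:
        tasks[name]()
    return order

# Hypothetical tasks that only record that they ran.
log = []
tasks = {
    "start": lambda: log.append("start"),
    "s_load_customers": lambda: log.append("s_load_customers"),
    "s_load_orders": lambda: log.append("s_load_orders"),
}
links = [("start", "s_load_customers"), ("s_load_customers", "s_load_orders")]
order = run_workflow(tasks, links)
```

Here the links form a simple chain, so the tasks run one after another in link order; a link that created a cycle would make `static_order()` raise `graphlib.CycleError`.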
Load Design Process

1. Create Source definition(s)
2. Create Target definition(s)
3. Create a Mapping
4. Create a Session Task
5. Create a Workflow from Task components
6. Run the Workflow and verify the results
Informatica Transformations
• In Informatica, transformations transform the source data according to the requirements of the target system and help ensure the quality of the data being loaded into the target.

• Transformations available in Informatica include:

• Aggregator Transformation
• Expression Transformation
• Filter Transformation
• Joiner Transformation
• Lookup Transformation
• Normalizer Transformation
• Rank Transformation
• Router Transformation
• Sequence Generator Transformation
• Sorter Transformation
• Update Strategy Transformation
Informatica Transformations
• Aggregator Transformation
– The Aggregator transformation is an Active and Connected transformation. It is useful for performing calculations such as averages and sums.

• Expression Transformation
– The Expression transformation is a Passive and Connected transformation. It can be used to calculate values in a single row before writing to the target.

• Filter Transformation
– The Filter transformation is an Active and Connected transformation. It can be used to filter out rows in a mapping that do not meet the condition.

• Joiner Transformation
– The Joiner transformation is an Active and Connected transformation. It can be used to join two sources coming from two different locations or from the same location.

• Rank Transformation
– The Rank transformation is an Active and Connected transformation. It is used to select the top or bottom rank of data.
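
The behavior of these five transformations can be approximated in plain Python on lists of dictionaries. This is a rough analogy under stated assumptions, not Informatica code: the sample data and function names are hypothetical. It also reflects the usual meaning of the labels above, where an active transformation can change the number of rows passing through it and a passive one cannot.

```python
from collections import defaultdict

# Hypothetical sample sources.
orders = [
    {"order_id": 1, "cust_id": 10, "qty": 2, "price": 5.0},
    {"order_id": 2, "cust_id": 11, "qty": 1, "price": 30.0},
    {"order_id": 3, "cust_id": 10, "qty": 4, "price": 2.5},
]
customers = [{"cust_id": 10, "name": "Acme"}, {"cust_id": 11, "name": "Globex"}]

def expression(rows):
    """Passive: one output row per input row, with a derived value added."""
    return [{**r, "total": r["qty"] * r["price"]} for r in rows]

def filter_rows(rows, condition):
    """Active: rows that fail the condition are dropped."""
    return [r for r in rows if condition(r)]

def joiner(detail, master, key):
    """Active: inner join of two sources on a shared key."""
    index = {m[key]: m for m in master}
    return [{**d, **index[d[key]]} for d in detail if d[key] in index]

def aggregator(rows, group_key, value_key):
    """Active: one output row per group, carrying the group's sum."""
    sums = defaultdict(float)
    for r in rows:
        sums[r[group_key]] += r[value_key]
    return [{group_key: k, "sum_" + value_key: v} for k, v in sums.items()]

def rank(rows, value_key, top_n):
    """Active: keep only the top_n rows by value_key."""
    return sorted(rows, key=lambda r: r[value_key], reverse=True)[:top_n]

with_totals = expression(orders)
joined = joiner(with_totals, customers, "cust_id")
big_orders = filter_rows(joined, lambda r: r["total"] > 10.0)
per_customer = aggregator(joined, "name", "total")
top_customer = rank(per_customer, "sum_total", 1)
```

In this sketch, `expression` returns exactly as many rows as it receives, while `filter_rows`, `aggregator`, and `rank` may each return fewer rows than they were given.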
Any Suggestions?
